Host gone stale in the fleet: find out why

The host exists in your fleet view but its data stopped updating. Start with the last-seen timestamp — it tells you which class of failure you have.

What this looks like

Unlike a host that never connected, a stale host enrolled successfully and reported for a while — then its charts flatlined and its last-seen timestamp stopped advancing. Something changed: the agent process, the host itself, the network path, or the credentials.

One calibration first: the agent reports on a heartbeat cadence, so gaps of a couple of minutes are normal and self-heal. Treat a host as genuinely stale only when last-seen is well beyond the heartbeat interval and not recovering.
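
As a rough illustration of that rule, the function below flags a host only when the gap exceeds a multiple of the heartbeat interval. The 60-second interval and the 3x grace factor are assumptions made for this sketch, not documented AllStak values; substitute your agent's actual cadence.

```shell
# Sketch: decide whether a last-seen gap is worth investigating.
# Arguments: last-seen and current time as epoch seconds, then an
# optional heartbeat interval and grace multiplier (both assumed values).
staleness() {
  last_seen=$1
  now=$2
  heartbeat=${3:-60}   # assumed interval in seconds, not an AllStak default
  grace=${4:-3}        # missed heartbeats tolerated before calling it stale
  gap=$((now - last_seen))
  if [ "$gap" -gt $((heartbeat * grace)) ]; then
    echo "stale"
  else
    echo "ok"
  fi
}
```

A single check cannot tell whether the host is recovering, so run it twice a few minutes apart: a host that reports "stale" both times, with the same last-seen value, is the case this guide covers.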

Common root causes

Agent service stopped or crashed

An OS upgrade, an out-of-memory kill, or a manual systemctl stop leaves the agent down while the host runs fine — the most common cause of a stale fleet entry.
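
To tell these cases apart on the host, you can read the unit's state and result from systemd. The sketch below maps the two values to a plain explanation; it assumes a reasonably recent systemd that reports an oom-kill result, and the wording of the explanations is illustrative.

```shell
# Sketch: interpret the unit state reported by systemd.
# Typical call on the host:
#   explain_unit_state \
#     "$(systemctl show allstak-agent -p ActiveState --value)" \
#     "$(systemctl show allstak-agent -p Result --value)"
explain_unit_state() {
  active_state=$1
  result=$2
  case "$active_state:$result" in
    active:*)         echo "running" ;;
    *:oom-kill)       echo "killed by the OOM killer" ;;
    failed:*)         echo "crashed or failed to start ($result)" ;;
    inactive:success) echo "stopped cleanly" ;;
    *)                echo "not running ($active_state, $result)" ;;
  esac
}
```

A clean stop usually means a person or a maintenance script stopped the unit, while a failed state is worth following up in the journal.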

Host offline or rebuilt

If the machine itself is down, terminated, or was re-imaged without re-installing the agent, the fleet entry goes stale by definition.

Network or firewall egress change

A new firewall rule, security group change, or proxy policy that blocks outbound HTTPS to the ingest host silences the agent without stopping it.
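
One way to confirm this from the affected machine is to probe the ingest host with curl and read its exit code. In the sketch below the hostname passed to probe_egress is whatever ingest host your agent is configured with (this guide does not name it), and the explanations are a simplified reading of curl's documented exit codes.

```shell
# Sketch: translate curl's exit code into a likely egress problem.
explain_curl_exit() {
  case "$1" in
    0)     echo "egress ok" ;;
    5)     echo "proxy could not be resolved" ;;
    6)     echo "DNS resolution failed" ;;
    7)     echo "connection refused or blocked" ;;
    28)    echo "timed out" ;;
    35|60) echo "TLS failure" ;;
    *)     echo "curl failed with exit code $1" ;;
  esac
}

# Probe outbound HTTPS to the given host. Without -f, curl exits 0 even on
# an HTTP error status, which is what we want: the network path works.
probe_egress() {
  curl -sS -o /dev/null --connect-timeout 5 --max-time 10 "https://$1/"
  explain_curl_exit "$?"
}
```

A timeout often indicates a firewall silently dropping packets, while a TLS failure can indicate a proxy intercepting the connection.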

API key rotated

If the project key was rotated after the agent was installed, every send starts failing auth from that moment — visible as auth errors in the agent's journal.

Step-by-step diagnosis

Read the dashboard first, then confirm on the host. The last-seen timestamp quickly narrows down which of the causes above applies.

  1. Check the last-seen timestamp

    In the fleet view, note exactly when the host last reported. A gap of a minute or two is normal heartbeat cadence; for a longer gap, correlate the time reporting stopped with deploys, network changes, or maintenance in that window.

  2. Confirm the host itself is up

    Ping the machine or SSH into it. If the host is down or was terminated, the fleet entry is correct: restore the host or remove the entry deliberately.

  3. Check and restart the agent service

    Run systemctl status allstak-agent. If the unit is stopped or failed, run systemctl restart allstak-agent and watch the fleet view; a healthy agent reappears within the heartbeat interval.

  4. Read the journal for the failure pattern

    Run journalctl -u allstak-agent --since "-1h". Auth errors point to a rotated key; connection timeouts or TLS failures point to a changed network path; a clean stop means someone or something stopped the unit.

  5. Verify egress and re-install if the key changed

    Curl the ingest host over HTTPS from the machine to confirm egress. If the journal shows auth failures, re-run the install.sh one-liner with the current project key to re-enroll with fresh credentials.
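
The journal triage in step 4 can be sketched as a small filter that buckets recent log lines into the three patterns. The grep patterns are guesses at typical wording, not the agent's actual log format, so adjust them after looking at real journal output.

```shell
# Sketch: classify agent journal output read from stdin.
# Auth is checked first because a rotated key is the cheapest thing to fix.
# All patterns are assumed wording, not AllStak's documented log messages.
classify_journal() {
  logs=$(cat)
  if printf '%s\n' "$logs" | grep -Eiq 'unauthorized|forbidden|invalid (api )?key|auth(entication)? (failed|error)'; then
    echo "auth"
  elif printf '%s\n' "$logs" | grep -Eiq 'timed? ?out|connection refused|no route to host|tls|x509|certificate'; then
    echo "network"
  elif printf '%s\n' "$logs" | grep -Eiq 'stopped|stopping|deactivated'; then
    echo "stopped"
  else
    echo "unknown"
  fi
}

# Typical call on the host:
#   journalctl -u allstak-agent --since "-1h" --no-pager | classify_journal
```

An "unknown" result only means none of the assumed patterns matched; read the journal directly before drawing conclusions.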

Prevent it from recurring

  • Scan the fleet view's status column regularly — stale hosts stand out at a glance.
  • Route infrastructure events to Slack with a notification rule so silence gets noticed early.
  • Include the agent in your post-maintenance checklist after OS upgrades and reboots.
  • Treat key rotation as a fleet-wide operation: rotate, then re-enroll every host in the same window.

Still stuck?

If the agent runs cleanly on the host, egress works, and the key is current but last-seen still does not advance, check the AllStak status page, then email [email protected] with the hostname, project name, and the time the host went stale — we can check ingest from our side.

Frequently asked questions

How long a gap in last-seen counts as normal?

The agent reports on a heartbeat cadence, so gaps of a minute or two are routine — brief network blips self-heal. Investigate when the gap clearly exceeds the heartbeat interval and keeps growing.

The host was rebuilt from an image — why is it stale?

A re-imaged machine no longer has the agent installed (or has stale credentials). Re-run the install.sh one-liner with your project key; better, bake it into the image or provisioning step.

Can I get alerted when a host stops reporting?

Route infrastructure and server events to Slack or another channel with a notification rule, so fleet problems surface in the channels your team already watches.

Is a stale host different from a decommissioned one?

Yes. A decommissioned host was deliberately removed and its agent stops on purpose — that is expected. A stale host should be reporting and is not; that is what this guide diagnoses.
