Host gone stale in the fleet: find out why
The host exists in your fleet view but its data stopped updating. Start with the last-seen timestamp — it tells you which class of failure you have.
What this looks like
Unlike a host that never connected, a stale host enrolled successfully and reported for a while — then its charts flatlined and its last-seen timestamp stopped advancing. Something changed: the agent process, the host itself, the network path, or the credentials.
One calibration first: the agent reports on a heartbeat cadence, so gaps of a couple of minutes are normal and self-heal. Treat a host as genuinely stale only when last-seen is well beyond the heartbeat interval and not recovering.
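The calibration above can be sketched as a small shell helper. This is an illustration only: the function name is_stale, the 60-second heartbeat in the tests, and the default multiplier of 5 are assumptions, not AllStak defaults. Substitute the heartbeat interval your agent actually uses.

```shell
# Sketch: flag a last-seen gap only when it is well beyond the heartbeat.
# The heartbeat length and the multiplier are assumptions; this guide does
# not state the agent's real interval.

# is_stale LAST_SEEN_EPOCH NOW_EPOCH HEARTBEAT_SECONDS [MULTIPLIER]
# Prints "stale" when the gap exceeds HEARTBEAT_SECONDS * MULTIPLIER, else "ok".
is_stale() {
  gap=$(( $2 - $1 ))
  threshold=$(( $3 * ${4:-5} ))
  if [ "$gap" -gt "$threshold" ]; then
    echo "stale"
  else
    echo "ok"
  fi
}
```

For example, is_stale 1714557600 "$(date +%s)" 60 compares a last-seen time copied from the fleet view (as a Unix timestamp) with the current time. A single check cannot tell whether the gap is recovering, so run it again a few heartbeats later before treating the host as stale.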
Common root causes
Agent service stopped or crashed
An OS upgrade, an out-of-memory kill, or a manual systemctl stop leaves the agent down while the host runs fine — the most common cause of a stale fleet entry.
Host offline or rebuilt
If the machine itself is down, terminated, or was re-imaged without re-installing the agent, the fleet entry goes stale by definition.
Network or firewall egress change
A new firewall rule, security group change, or proxy policy that blocks outbound HTTPS to the ingest host silences the agent without stopping it.
API key rotated
If the project key was rotated after the agent was installed, every send starts failing auth from that moment — visible as auth errors in the agent's journal.
Step-by-step diagnosis
Read the dashboard first, then confirm on the host — the last-seen timestamp narrows it fast.
1. Check the last-seen timestamp
In the fleet view, note exactly when the host last reported. A gap of a minute or two is normal heartbeat cadence; for a longer gap, correlate the cutoff time with deploys, network changes, or maintenance in that window.
2. Confirm the host itself is up
Ping the machine or SSH into it. If the host is down or was terminated, the fleet entry is correct: restore the host or remove the entry deliberately.
3. Check and restart the agent service
Run systemctl status allstak-agent. If the unit is stopped or failed, run systemctl restart allstak-agent and watch the fleet view; a healthy agent reappears within the heartbeat interval.
4. Read the journal for the failure pattern
Run journalctl -u allstak-agent --since "-1h". Auth errors point to a rotated key; connection timeouts or TLS failures point to a changed network path; a clean stop means someone or something stopped the unit.
5. Verify egress and re-install if the key changed
From the machine, curl the ingest host over HTTPS to confirm egress. If the journal shows auth failures, re-run the install.sh one-liner with the current project key to re-enroll with fresh credentials.
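The journal and egress checks in steps 4 and 5 can be sketched as two shell helpers. Treat this as a rough aid rather than the agent's documented behavior: the function names are hypothetical, and the grep patterns are guesses at how auth, network, and stop events might be worded, since this guide does not show the agent's exact log lines. The ingest URL is left as an argument because the hostname is not given here.

```shell
# Sketch of steps 4 and 5. The unit name comes from this guide; the log
# patterns below are ASSUMPTIONS about wording, so adjust them to what your
# journal actually shows.

# classify_journal: reads agent journal lines on stdin, prints one verdict.
# Checks run in order (auth, network, stop), so the first match wins.
classify_journal() {
  logs=$(cat)
  if printf '%s\n' "$logs" | grep -Eiq '(^|[^0-9])40[13]([^0-9]|$)|unauthorized|forbidden|invalid api key'; then
    echo "auth: key likely rotated"
  elif printf '%s\n' "$logs" | grep -Eiq 'timed out|timeout|connection refused|tls|certificate'; then
    echo "network: egress path likely changed"
  elif printf '%s\n' "$logs" | grep -Eiq 'stopped|stopping'; then
    echo "stopped: unit was stopped"
  else
    echo "unknown"
  fi
}

# check_egress URL: any HTTP response proves DNS, TCP, and TLS worked.
# Without -f, curl exits non-zero only on DNS, connect, TLS, or timeout errors.
check_egress() {
  if curl -sS -o /dev/null --max-time 10 "$1"; then
    echo "egress ok"
  else
    echo "egress blocked"
  fi
}
```

On the host, this might be used as journalctl -u allstak-agent --since "-1h" --no-pager | classify_journal, followed by check_egress with your ingest URL. An "unknown" verdict only means none of the assumed patterns matched; read the journal directly in that case.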
Prevent it from recurring
- Scan the fleet view's status column regularly — stale hosts stand out at a glance.
- Route infrastructure events to Slack with a notification rule so silence gets noticed early.
- Include the agent in your post-maintenance checklist after OS upgrades and reboots.
- Treat key rotation as a fleet-wide operation: rotate, then re-enroll every host in the same window.
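For the post-maintenance item in the list above, a check along these lines could be added to a runbook. It is a minimal sketch: the function name post_maintenance_check is hypothetical, and it assumes the agent runs as the systemd unit allstak-agent used elsewhere in this guide. It also checks that the unit is enabled, so the agent comes back after the next reboot.

```shell
# Sketch: post-maintenance check for the agent unit (default: allstak-agent).
# Prints one line describing whether the unit is running and enabled at boot.
post_maintenance_check() {
  unit=${1:-allstak-agent}
  if ! systemctl is-active --quiet "$unit"; then
    echo "$unit is not running"
  elif ! systemctl is-enabled --quiet "$unit"; then
    echo "$unit is running but not enabled at boot"
  else
    echo "$unit is running and enabled"
  fi
}
```

Run post_maintenance_check on each host after an OS upgrade or reboot; anything other than "running and enabled" is worth fixing before the host drops out of the fleet view.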
Still stuck?
If the agent runs cleanly on the host, egress works, and the key is current but last-seen still does not advance, check the AllStak status page, then email [email protected] with the hostname, project name, and the time the host went stale — we can check ingest from our side.
Frequently asked questions
How long can the last-seen gap be before it is a problem?
The agent reports on a heartbeat cadence, so gaps of a minute or two are routine — brief network blips self-heal. Investigate when the gap clearly exceeds the heartbeat interval and keeps growing.
The host was rebuilt from an image — why is it stale?
A re-imaged machine no longer has the agent installed (or has stale credentials). Re-run the install.sh one-liner with your project key; better, bake it into the image or provisioning step.
Can I get alerted when a host stops reporting?
Route infrastructure and server events to Slack or another channel with a notification rule, so fleet problems surface in the channels your team already watches.
Is a stale host different from a decommissioned one?
Yes. A decommissioned host was deliberately removed and its agent stops on purpose — that is expected. A stale host should be reporting and is not; that is what this guide diagnoses.