Recover an Agent Breach
This tutorial walks through revoking a compromised agent token and re-enrolling the agent, with minimal data loss.
Scenario
You suspect that an agent's JWT or configuration file was leaked. For example, the config file was exposed on a compromised host or accidentally committed to a repository. The token is still technically valid, so anyone holding it could impersonate the agent.
Goal: Revoke the token immediately, issue fresh credentials, and restore the agent to normal operation, without restarting the Finch service.
How revocation works
Finch validates every inbound agent request by checking the JWT's embedded resource ID (RID) against its database. When you deregister an agent, its record is deleted from the database, so the next request using the old token is rejected with an authentication error. No service restart is required.
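The toy model below illustrates why deleting the record is enough to revoke a token. It is a sketch under stated assumptions, not Finch's implementation: the "database" is a plain text file with one RID per line, and `register_agent`, `deregister_agent`, and `validate_request` are hypothetical helper names.

```shell
# Toy model only; this is not Finch's implementation.
# Assumption: the "database" is a text file holding one registered RID per line.
db_file="$(mktemp)"

# Add an agent record to the toy database.
register_agent() {
  printf '%s\n' "$1" >> "$db_file"
}

# Delete the agent record. Nothing is restarted or reloaded.
deregister_agent() {
  tmp_file="$(mktemp)"
  # grep exits non-zero when no lines remain, which is fine here.
  grep -Fxv -- "$1" "$db_file" > "$tmp_file" || true
  mv "$tmp_file" "$db_file"
}

# Per-request check: accept only if the token's RID still has a record.
validate_request() {
  grep -Fxq -- "$1" "$db_file"
}

# Usage: the same RID is accepted before deregistration and rejected after.
rid="rid:finch:8d134b24c2541730:agent:59ddbb5d-73b2-45bf-95d3-5520dcf37618"
register_agent "$rid"
validate_request "$rid" && echo "accepted"
deregister_agent "$rid"
validate_request "$rid" || echo "rejected"
```

Because the lookup happens on every request, the effect of the deletion is visible on the very next call, which matches the behavior described above.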
During the window between deregister and re-deploy, Alloy's WAL buffers telemetry locally. Short gaps (minutes) typically result in no data loss once the agent reconnects.
Step 1 - Find the agent's resource ID
List all agents to find the one you want to revoke:
finchctl agent list finch.example.com
If you know the hostname, use describe to confirm:
finchctl agent describe --agent.rid rid:finch:8d134b24c2541730:agent:59ddbb5d-73b2-45bf-95d3-5520dcf37618 finch.example.com
Note the RID value; you will need it in the next step.
Step 2 - Deregister the agent
finchctl agent deregister --agent.rid rid:finch:8d134b24c2541730:agent:59ddbb5d-73b2-45bf-95d3-5520dcf37618 finch.example.com
The agent record is deleted from the database. Any request using the old token is immediately rejected. Alloy on the agent host starts buffering data in its WAL.
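Because deregistering deletes the record, it can help to sanity-check the RID before running the command, for example to catch a truncated copy-paste. The sketch below wraps the same `finchctl agent deregister` invocation shown in this step. The helper names `looks_like_agent_rid` and `revoke_agent` are hypothetical, and the RID pattern is an assumption inferred only from the example RID in this tutorial.

```shell
# Sketch only. Assumption: agent RIDs follow the shape of the example in this
# tutorial (rid:finch:<hex>:agent:<UUID>); adjust the pattern if yours differ.
looks_like_agent_rid() {
  printf '%s\n' "$1" |
    grep -Eq '^rid:finch:[0-9a-f]+:agent:[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$'
}

# revoke_agent is a hypothetical helper, not part of finchctl.
# Usage: revoke_agent <rid> <service-name>
revoke_agent() {
  if ! looks_like_agent_rid "$1"; then
    echo "refusing to deregister: '$1' does not look like an agent RID" >&2
    return 1
  fi
  if [ -z "$2" ]; then
    echo "usage: revoke_agent <rid> <service-name>" >&2
    return 1
  fi
  # Same invocation as in this step, with the values passed as arguments.
  finchctl agent deregister --agent.rid "$1" "$2"
}
```

With these functions loaded in your shell, `revoke_agent "$RID" finch.example.com` runs the deregister command only when the RID passes the shape check.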
Optional - Full teardown and redeploy
If you suspect the agent host itself is compromised, perform a full teardown before redeploying to ensure no residual state (old config, cached credentials) remains. Teardown is supported on Linux, FreeBSD, and macOS.
Tear down the existing Alloy installation with agent teardown. This stops the service and purges the service definition, binary, and config.
Teardown removes:

| Platform | Service management | Binary | Config | Data |
|---|---|---|---|---|
| Linux | systemctl stop/disable, removes /etc/systemd/system/alloy.service | /usr/bin/alloy | /etc/alloy | /var/lib/alloy |
| FreeBSD | service alloy stop, sysrc -x alloy_enable, removes /etc/rc.d/alloy | /usr/bin/alloy | /etc/alloy | /var/lib/alloy |
| macOS | launchctl bootout, removes plist from /Library/LaunchDaemons/ | /usr/local/bin/alloy | /etc/alloy | /var/lib/alloy |
WAL data is lost on teardown
Tearing down removes /var/lib/alloy, including buffered WAL data. Any
telemetry collected since the last successful flush will not be recovered.
Step 3 - Re-enroll the agent
Register a fresh agent and deploy the new configuration to the host by
following Re-enroll an Agent. Because the agent was deregistered, use
agent register to get a new configuration, RID, and JWT, then agent deploy to
push the config.
Summary
| Step | Command | Effect |
|---|---|---|
| List agents | agent list service-name | Find the compromised agent's RID |
| Revoke token | agent deregister --agent.rid rid:... service-name | Immediate rejection of old token |
| Re-enroll | see Re-enroll an Agent | New RID + JWT, agent reconnects, WAL flushes |
| Full clean | agent teardown then agent deploy | Removes all residual state |