Security¶
Finch uses four distinct security layers, each scoped to a specific trust boundary:
| Layer | Protocol | Used for | Who holds the credential |
|---|---|---|---|
| SSH | TCP/22 | Deployment operations only | Operator (SSH key) |
| HTTPS / TLS | TCP/443 | All runtime traffic | Traefik (server cert) |
| mTLS | gRPC over HTTPS | finchctl → Finch control channel | finchctl client (cert + key) |
| JWT | Bearer token over HTTPS | Agent data ingestion; dashboard access | Alloy agent; dashboard user |
SSH - Deployment channel¶
SSH is used exclusively during deployment and maintenance operations:
service deploy, service update, service register, service deregister,
service rotate-certificate, service rotate-secret, agent deploy,
agent teardown.
All SSH operations are standard remote-command execution and secure file copy. There are no persistent SSH sessions and no SSH involvement in runtime traffic. Once a stack is deployed, SSH is only needed again when something changes.
HTTPS - Transport security¶
Traefik is the TLS-terminating reverse proxy for all Finch services. It listens on ports 80 and 443; port 80 permanently redirects to HTTPS.
Three TLS modes are available:
| Mode | How | Flag |
|---|---|---|
| Self-signed | Certificate + key generated during service deploy | (default) |
| Let's Encrypt | ACME HTTP challenge | --service.letsencrypt / --service-letsencrypt.email |
| Custom | Operator-supplied certificate + key copied to server | --service.customtls / --service.customtls-cert / --service.customtls-key |
Traefik routes inbound requests to backend services based on path prefix and
configures RequestClientCert globally, which causes Traefik to forward any
presented client certificate to the backend as the X-Forwarded-Tls-Client-Cert
header. This is the foundation of the mTLS layer below.
mTLS - finchctl identity¶
Every finchctl command that communicates with the Finch service uses an mTLS
client certificate. The full flow:
- finchctl presents its client certificate during the TLS handshake with Traefik.
- Traefik forwards the certificate PEM to the Finch gRPC backend via the X-Forwarded-Tls-Client-Cert header.
- Finch reads the header and validates the client certificate against the related CA file.
Certificate architecture¶
All mTLS assets are generated client-side by finchctl using ECDSA P-256:
- A CA certificate and key are generated locally. The CA key is discarded immediately after signing; it is never stored.
- A client certificate and key are signed by the CA and stored in ~/.config/finch.json (base64-encoded). They never leave the operator's machine.
- Only the CA certificate is uploaded to the server, one file per registered finchctl client.
Finch validates each incoming request by checking the presented client certificate against the related CA. A certificate is valid only if it was signed by its own CA and passes a full X.509 chain check with extended key usage for client auth.
Certificate validity and rotation¶
Certificates are valid for 90 days. Rotate them proactively with
service rotate-certificate (see Rotate mTLS Certificates).
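To make "proactively" concrete, the following is a minimal sketch of a rotation reminder. The 90-day validity comes from the text above; the 14-day safety margin and the function names are illustrative assumptions, not part of finchctl.

```python
from datetime import datetime, timedelta, timezone

CERT_VALIDITY = timedelta(days=90)    # validity period stated in this section
ROTATION_MARGIN = timedelta(days=14)  # assumption: rotate two weeks before expiry


def expires_at(issued_at: datetime) -> datetime:
    """Return the expiry time of a certificate issued at `issued_at`."""
    return issued_at + CERT_VALIDITY


def rotation_due(issued_at: datetime, now: datetime,
                 margin: timedelta = ROTATION_MARGIN) -> bool:
    """True once `now` falls inside the safety margin before expiry."""
    return now >= expires_at(issued_at) - margin


issued = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(expires_at(issued).date())                                        # 2025-04-01
print(rotation_due(issued, datetime(2025, 3, 1, tzinfo=timezone.utc)))  # False
print(rotation_due(issued, datetime(2025, 3, 20, tzinfo=timezone.utc))) # True
```

A check like this could run from a scheduled job on the operator's machine and prompt for service rotate-certificate when it returns True.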
Why no certificate revocation list (CRL/OCSP) is needed¶
Revoking a finchctl client's access is a one-step operation: deregister the client with service deregister.
This removes the client's CA file from the server immediately. The next request from that client fails validation because its certificate can no longer be verified against any CA. No restart, no CRL distribution point, no OCSP responder.
Agent JWT - Data ingestion authentication¶
Enrolled agents (Alloy) authenticate to the data endpoints (Loki, Mimir, Pyroscope) using JWT Bearer tokens.
Flow¶
- Alloy sends Authorization: Bearer <token> with every write request to /loki, /mimir, or /pyroscope.
- Traefik intercepts the request and performs an auth call to Finch before forwarding to the backend.
- The Finch auth server extracts the Bearer token and checks:
  - HMAC-SHA256 signature using the stack secret
  - iss claim = finch, sub claim = agent
  - Token expiry (exp claim)
  - RID lookup in the database - the token's rid claim must match a registered agent
- If all checks pass, Finch returns HTTP 200 and Traefik forwards the request. Any failure returns HTTP 401 and the write is rejected.
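The checks in this flow can be sketched in a few lines of standard-library Python. This is an illustration of the described logic, not Finch's actual implementation: the function names, the in-memory set standing in for the agent database, and the order of the checks are all assumptions.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    # JWT segments drop base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, secret: bytes) -> str:
    """Build an HS256 JWT. Used here only to produce demo input."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode("ascii"), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(sig)}"


def check_agent_token(auth_header: str, secret: bytes, registered_rids: set,
                      now: float = None) -> int:
    """Return 200 if every check passes, otherwise 401 (hypothetical helper)."""
    now = time.time() if now is None else now
    if not auth_header.startswith("Bearer "):
        return 401
    parts = auth_header[len("Bearer "):].split(".")
    if len(parts) != 3:
        return 401
    header_b64, payload_b64, sig_b64 = parts
    try:
        header = json.loads(b64url_decode(header_b64))
        claims = json.loads(b64url_decode(payload_b64))
        sig = b64url_decode(sig_b64)
    except ValueError:  # covers bad base64, bad JSON, bad UTF-8
        return 401
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return 401
    # 1. HMAC-SHA256 signature using the stack secret (constant-time compare).
    if header.get("alg") != "HS256":
        return 401
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode("ascii"),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(expected, sig):
        return 401
    # 2. Issuer and subject claims.
    if claims.get("iss") != "finch" or claims.get("sub") != "agent":
        return 401
    # 3. Token expiry.
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp:
        return 401
    # 4. RID lookup: the rid claim must belong to a registered agent.
    if claims.get("rid") not in registered_rids:
        return 401
    return 200


secret = bytes(32)  # placeholder; a real stack secret is 32 random bytes
issued = 1_700_000_000
token = sign_token({"iss": "finch", "sub": "agent", "rid": "agent-1",
                    "iat": issued, "exp": issued + 365 * 86400}, secret)
print(check_agent_token(f"Bearer {token}", secret, {"agent-1"}, now=issued))  # 200
print(check_agent_token(f"Bearer {token}", secret, set(), now=issued))        # 401
```

The second call shows a token whose signature is still valid being rejected because its RID is no longer in the registry.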
Token properties¶
| Property | Value |
|---|---|
| Algorithm | HMAC-SHA256 (HS256) |
| Validity | 365 days |
| Claims | iss=finch, sub=agent, rid=<resource-id>, iat, exp |
| Signing key | 32-byte random secret in finch.json |
RID-gated validation as revocation¶
The RID check means that deregistering an agent is equivalent to immediately revoking its token - no token blacklist required. The database record is deleted and the next validation returns "unknown agent" for that RID, regardless of whether the token signature is still valid.
To rotate all tokens simultaneously (break-glass scenario), use
service rotate-secret (see Rotate the Signing Secret).
Dashboard JWT - Operator access¶
The Finch dashboard is accessed via a short-lived JWT issued by
finchctl service dashboard. The token is passed as a URL parameter to the
dashboard login endpoint and grants scoped, role-limited access.
Token properties¶
| Property | Value |
|---|---|
| Algorithm | HMAC-SHA256 (HS256) |
| Validity | Configurable session timeout (default 1800 s = 30 min) |
| Claims | iss=finch, sub=dashboard, role, scope, iat, exp |
| Signing key | Same 32-byte secret as agent tokens |
RBAC roles¶
Three roles are available: admin, operator, and viewer (default). For
the full permission matrix, see Dashboard: RBAC roles.
Scope¶
The scope claim is a list of agent RIDs or hostnames. When non-empty, the
dashboard user can only see agents whose RID or hostname appears in the list.
An empty scope means access to all agents.
# Grant read-only access to two specific agents for 1 hour
finchctl service dashboard \
--permission.role viewer \
--permission.scope rid:finch:8d134b24c2541730:agent:59ddbb5d-73b2-45bf-95d3-5520dcf37618 \
--permission.scope web-02 \
--permission.session-timeout 3600 \
finch.example.com
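The scope rule described above amounts to a simple filter. The sketch below assumes a hypothetical Agent record with rid and hostname fields; how the dashboard actually stores and matches agents is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    # Hypothetical record; field names are assumptions for illustration.
    rid: str
    hostname: str


def visible_agents(agents: list, scope: list) -> list:
    """Apply the scope claim: an empty scope grants access to all agents."""
    if not scope:
        return list(agents)
    allowed = set(scope)
    # An agent is visible if either its RID or its hostname is listed.
    return [a for a in agents if a.rid in allowed or a.hostname in allowed]


agents = [
    Agent("rid:example:agent:1", "web-01"),
    Agent("rid:example:agent:2", "web-02"),
    Agent("rid:example:agent:3", "db-01"),
]
scope = ["rid:example:agent:1", "web-02"]
print([a.hostname for a in visible_agents(agents, scope)])  # ['web-01', 'web-02']
print(len(visible_agents(agents, [])))                      # 3
```

Mixing RIDs and hostnames in one list, as in the command above, works because each entry is compared against both fields.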
Short session lifetimes (30 min by default) mean dashboard tokens expire quickly without any active revocation mechanism.
Design philosophy¶
Finch deliberately omits several security features found in enterprise-grade platforms:
| Not included | Reason |
|---|---|
| Certificate revocation lists (CRL) | Deregistering removes the CA file - revocation is instant without a distribution mechanism |
| OCSP stapling | Same reasoning; short 90-day certs limit exposure further |
| Fine-grained JWT revocation | RID-gated DB check provides per-token revocation without a blacklist |
| Multi-tenancy | Single operator, single secret - simplicity is a design goal |
| Rate limiting | Out of scope; add at the network/firewall layer if needed |
The result is a security model that is easy to reason about, easy to operate, and whose failure modes (compromised token, expiring / compromised cert) have clear, single-command recovery paths.