Security

Finch uses four distinct security layers, each scoped to a specific trust boundary:

Layer | Protocol | Used for | Who holds the credential
SSH | TCP/22 | Deployment operations only | Operator (SSH key)
HTTPS / TLS | TCP/443 | All runtime traffic | Traefik (server cert)
mTLS | gRPC over HTTPS | finchctl → Finch control channel | finchctl client (cert + key)
JWT | Bearer token over HTTPS | Agent data ingestion; dashboard access | Alloy agent; dashboard user

SSH - Deployment channel

SSH is used exclusively during deployment and maintenance operations: service deploy, service update, service register, service deregister, service rotate-certificate, service rotate-secret, agent deploy, agent teardown.

All SSH operations are standard remote-command execution and secure file copy. There are no persistent SSH sessions and no SSH involvement in runtime traffic. Once a stack is deployed, SSH is only needed again when something changes.


HTTPS - Transport security

Traefik is the TLS-terminating reverse proxy for all Finch services. It listens on ports 80 and 443; port 80 permanently redirects to HTTPS.

Three TLS modes are available:

Mode | How | Flag
Self-signed | Certificate + key generated during service deploy | (default)
Let's Encrypt | ACME HTTP challenge | --service.letsencrypt / --service-letsencrypt.email
Custom | Operator-supplied certificate + key copied to server | --service.customtls / --service.customtls-cert / --service.customtls-key

Traefik routes inbound requests to backend services based on path prefix and configures RequestClientCert globally, which causes Traefik to forward any presented client certificate to the backend as the X-Forwarded-Tls-Client-Cert header. This is the foundation of the mTLS layer below.


mTLS - finchctl identity

Every finchctl command that communicates with the Finch service authenticates with an mTLS client certificate. The full flow:

  1. finchctl presents its client certificate during the TLS handshake with Traefik.
  2. Traefik forwards the certificate PEM to the Finch gRPC backend via the X-Forwarded-Tls-Client-Cert header.
  3. Finch reads the header and validates the client certificate against the related CA file.
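The header-reading step can be illustrated with a minimal sketch. It assumes the header value carries base64 certificate bodies without PEM delimiters, optionally URL-escaped and comma-separated when several certificates are present; this page does not confirm that encoding, and `client_certs_from_header` is a hypothetical helper, not part of Finch.

```python
import base64
from typing import List
from urllib.parse import unquote


def client_certs_from_header(header_value: str) -> List[bytes]:
    """Decode a forwarded client-certificate header into DER byte strings.

    Assumed encoding (not confirmed by this page): base64 certificate
    bodies without PEM delimiters, optionally URL-escaped, comma-separated.
    """
    certs = []
    for part in header_value.split(","):
        # unquote() leaves the text unchanged if it was never URL-escaped.
        body = unquote(part.strip())
        if body:
            # validate=True rejects characters outside the base64 alphabet.
            certs.append(base64.b64decode(body, validate=True))
    return certs
```

The returned DER bytes would then be handed to an X.509 library for the chain check against the client's CA; that step is outside this sketch.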

Certificate architecture

All mTLS assets are generated client-side by finchctl using ECDSA P-256:

  • A CA certificate and key are generated locally. The CA key is discarded immediately after signing; it is never stored.
  • A client certificate and key are signed by the CA and stored in ~/.config/finch.json (base64-encoded). They never leave the operator's machine.
  • Only the CA certificate is uploaded to the server, one file per registered finchctl client.

Finch validates each incoming request by checking the presented client certificate against the related CA. A certificate is valid only if it was signed by its own CA and passes a full x509 chain check with extended key usage for client auth.

Certificate validity and rotation

Certificates are valid for 90 days. Rotate them proactively with service rotate-certificate (see Rotate mTLS Certificates).
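As a rough illustration of rotating proactively, the sketch below flags a certificate once it comes within a safety margin of its 90-day expiry. The 14-day margin and the `rotation_due` helper are assumptions made for this example; Finch does not prescribe either.

```python
from datetime import datetime, timedelta, timezone

CERT_LIFETIME = timedelta(days=90)  # validity period stated above
ROTATE_BEFORE = timedelta(days=14)  # hypothetical safety margin


def rotation_due(issued_at: datetime, now: datetime) -> bool:
    """Return True once the certificate is within ROTATE_BEFORE of expiry."""
    expires_at = issued_at + CERT_LIFETIME
    return now >= expires_at - ROTATE_BEFORE


issued = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(rotation_due(issued, issued + timedelta(days=80)))
```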

Why no certificate revocation list (CRL/OCSP) is needed

Revoking a finchctl client's access is a one-step operation:

finchctl service deregister root@10.19.80.100

This removes the client's CA file from the server immediately. The next request from that client fails validation because its certificate can no longer be verified against any CA. No restart, no CRL distribution point, no OCSP responder.


Agent JWT - Data ingestion authentication

Enrolled agents (Alloy) authenticate to the data endpoints (Loki, Mimir, Pyroscope) using JWT Bearer tokens.

Flow

  1. Alloy sends Authorization: Bearer <token> with every write request to /loki, /mimir, or /pyroscope.
  2. Traefik intercepts the request and performs an auth call to Finch before forwarding to the backend.
  3. The Finch auth server extracts the Bearer token and checks:
       • HMAC-SHA256 signature using the stack secret
       • iss claim = finch, sub claim = agent
       • Token expiry (exp claim)
       • RID lookup in the database - the token's rid claim must match a registered agent
  4. If all checks pass, Finch returns HTTP 200 and Traefik forwards the request. Any failure returns HTTP 401 and the write is rejected.
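The checks in the flow above can be sketched with the Python standard library. This is an illustration under stated assumptions, not Finch's implementation: `check_agent_token` is a hypothetical function, and a plain set of RIDs stands in for the database lookup.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def check_agent_token(token, secret, registered_rids, now=None):
    """Return 200 if every check passes, otherwise 401 (hypothetical helper)."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
    except ValueError:  # wrong segment count, bad base64, or bad JSON
        return 401
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return 401
    # Pin the algorithm so a token cannot select a weaker one.
    if header.get("alg") != "HS256":
        return 401
    # 1. HMAC-SHA256 signature over "<header>.<payload>", constant-time compare.
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return 401
    # 2. Issuer and subject claims.
    if claims.get("iss") != "finch" or claims.get("sub") != "agent":
        return 401
    # 3. Expiry.
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp:
        return 401
    # 4. RID lookup: a set stands in for the database of registered agents.
    if claims.get("rid") not in registered_rids:
        return 401
    return 200
```

Because the RID lookup runs last and independently of the signature, removing an RID from the registered set rejects a token that is otherwise still valid.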

Token properties

Property | Value
Algorithm | HMAC-SHA256 (HS256)
Validity | 365 days
Claims | iss=finch, sub=agent, rid=<resource-id>, iat, exp
Signing key | 32-byte random secret in finch.json
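To make the token properties concrete, here is a minimal sketch of how a token with these claims could be assembled using only the Python standard library. `mint_agent_token` and the example RID are hypothetical; how finchctl actually builds and stores tokens is not described on this page.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time
from typing import Optional

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # 365-day validity from the table


def _b64url(data: bytes) -> str:
    # JWT uses base64url with the trailing padding removed.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def mint_agent_token(secret: bytes, rid: str, now: Optional[int] = None) -> str:
    """Build an HS256 JWT carrying the agent claims (hypothetical helper)."""
    iat = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": "finch",
        "sub": "agent",
        "rid": rid,
        "iat": iat,
        "exp": iat + ONE_YEAR_SECONDS,
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


secret = secrets.token_bytes(32)  # 32-byte random signing key
token = mint_agent_token(secret, "example-rid")  # placeholder RID
```

Since every token is signed with the same secret, replacing that secret invalidates all previously issued tokens at once.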

RID-gated validation as revocation

The RID check means that deregistering an agent is equivalent to immediately revoking its token - no token blacklist required. The database record is deleted and the next validation returns "unknown agent" for that RID, regardless of whether the token signature is still valid.

To rotate all tokens simultaneously (break-glass scenario), use service rotate-secret (see Rotate the Signing Secret).


Dashboard JWT - Operator access

The Finch dashboard is accessed via a short-lived JWT issued by finchctl service dashboard. The token is passed as a URL parameter to the dashboard login endpoint and grants scoped, role-limited access.

Token properties

Property | Value
Algorithm | HMAC-SHA256 (HS256)
Validity | Configurable session timeout (default 1800 s = 30 min)
Claims | iss=finch, sub=dashboard, role, scope, iat, exp
Signing key | Same 32-byte secret as agent tokens

RBAC roles

Three roles are available: admin, operator, and viewer (default). For the full permission matrix, see Dashboard: RBAC roles.

Scope

The scope claim is a list of agent RIDs or hostnames. When non-empty, the dashboard user can only see agents whose RID or hostname appears in the list. An empty scope means access to all agents.

# Grant read-only access to two specific agents for 1 hour
finchctl service dashboard \
  --permission.role viewer \
  --permission.scope rid:finch:8d134b24c2541730:agent:59ddbb5d-73b2-45bf-95d3-5520dcf37618 \
  --permission.scope web-02 \
  --permission.session-timeout 3600 \
  finch.example.com
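The scope rule described above (an empty list means all agents, otherwise match on RID or hostname) can be sketched as a simple filter. The `Agent` record and `visible_agents` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Agent:
    """Hypothetical agent record with the two fields the scope can match."""
    rid: str
    hostname: str


def visible_agents(agents: Iterable[Agent], scope: Iterable[str]) -> List[Agent]:
    """Return the agents a dashboard user with this scope claim may see."""
    allowed = set(scope)
    if not allowed:
        # An empty scope grants access to all agents.
        return list(agents)
    return [a for a in agents if a.rid in allowed or a.hostname in allowed]


fleet = [Agent("rid-a", "web-01"), Agent("rid-b", "web-02"), Agent("rid-c", "db-01")]
print(visible_agents(fleet, ["rid-a", "web-02"]))
```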

Short session lifetimes (30 min by default) mean dashboard tokens expire quickly without any active revocation mechanism.


Design philosophy

Finch deliberately omits several security features found in enterprise-grade platforms:

Not included | Reason
Certificate revocation lists (CRL) | Deregistering removes the CA file - revocation is instant without a distribution mechanism
OCSP stapling | Same reasoning; short 90-day certs limit exposure further
Fine-grained JWT revocation | RID-gated DB check provides per-token revocation without a blacklist
Multi-tenancy | Single operator, single secret - simplicity is a design goal
Rate limiting | Out of scope; add at the network/firewall layer if needed

The result is a security model that is easy to reason about, easy to operate, and whose failure modes (compromised token, expiring / compromised cert) have clear, single-command recovery paths.