Canary Tokens for the Agentic Era

Canary tokens are a detection primitive: plant a tripwire, wait for something to touch it, get notified when it does. The classic use case is a fake AWS key embedded in a file. If that key is ever used, someone has read the file it was planted in, and your real credentials are probably compromised too.

The technique works because attackers are greedy and impatient. They exfiltrate everything they find and sort it out later. A credential that looks real will get tested.
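The credential only has to look real; it never has to work. A minimal sketch of minting a canary-shaped AWS access key ID (real key IDs are "AKIA" plus 16 characters from a base32-style alphabet, so the canary matches that shape):

```python
import secrets

# Real AWS access key IDs are "AKIA" followed by 16 characters from a
# base32-style alphabet. The canary only needs to match the shape;
# the value itself is inert.
_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"

def make_canary_key_id() -> str:
    return "AKIA" + "".join(secrets.choice(_ALPHABET) for _ in range(16))
```

Pair the fake key ID with a fake secret and register both with your alerting backend, so a use attempt maps back to the exact file it was planted in.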

Snare extends this logic to the agentic layer.

What Changes with AI Agents

Human attackers move deliberately. They exfiltrate files, sort through them offline, test credentials manually. The canary has time to work.

AI agents are faster and less discriminating. An agent tasked with “explore this codebase” will read every file it can reach — immediately, automatically, and often without the user watching. If your .env contains a canary token, an agent running locally will encounter it within seconds of being spun up.

More importantly: agents don’t just read. They act. A prompt injection that instructs an agent to “use the found credentials to verify they’re valid” will cause the agent to test the canary — which triggers your alert — before any human attacker is even involved.

This changes the detection surface in an interesting way. Snare canaries aren’t just traps for attackers. They’re tripwires that reveal agent behavior — including your own agents behaving unexpectedly.

The Alerting Layer

When a Snare token is triggered, you get:

  • What: which token, what type (credential, URL, file marker)
  • When: precise timestamp
  • How: the request metadata, including user agent strings that often fingerprint the caller as an LLM-adjacent tool
  • Where: source IP and any available context
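Structurally, an alert is just those four fields plus metadata. A hedged sketch of assembling one from an incoming request (field names are illustrative, not Snare's actual schema):

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class CanaryAlert:
    token_id: str      # what: which token fired
    token_type: str    # what: "credential", "url", or "file_marker"
    triggered_at: str  # when: ISO-8601 UTC timestamp
    user_agent: str    # how: often fingerprints the caller
    source_ip: str     # where: origin of the request

def build_alert(token_id: str, token_type: str,
                headers: dict, source_ip: str) -> CanaryAlert:
    return CanaryAlert(
        token_id=token_id,
        token_type=token_type,
        triggered_at=datetime.now(timezone.utc).isoformat(),
        user_agent=headers.get("User-Agent", "unknown"),
        source_ip=source_ip,
    )
```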

For agentic callers, the user agent is often diagnostic. Claude Code, Cursor, Cline, and similar tools have identifiable request signatures. A Snare alert that says “your AWS canary was tested from a residential IP using a headless HTTP client at 2:14 AM” tells a different story than “tested from your office network during business hours.”
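A coarse classifier over the User-Agent header illustrates the idea. The substring list below is illustrative only: real signatures vary by tool and version, and a production list would be curated and maintained.

```python
# Illustrative markers: headless HTTP clients and agent tooling rarely
# send a browser-style User-Agent. A real deployment would maintain a
# curated, versioned signature list instead of this hardcoded tuple.
AGENT_MARKERS = (
    "python-requests", "python-urllib", "aiohttp",
    "node-fetch", "axios", "go-http-client", "curl",
)

def looks_headless(user_agent: str) -> bool:
    ua = user_agent.lower()
    return any(marker in ua for marker in AGENT_MARKERS)
```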

Practical Deployment

Place Snare tokens in the locations agents are most likely to explore:

  • .env files (fake API keys, fake DB connection strings)
  • credentials/ directories
  • README files (embedded URLs that look like documentation links)
  • Git history (planted in old commits — bonus detection layer)
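Planting the first of these takes a few lines. Every variable name and value below is fake and illustrative:

```python
from pathlib import Path

def plant_env_canary(path: Path, key_id: str, secret: str) -> None:
    # Append rather than overwrite, so any existing (real) entries survive.
    with path.open("a") as f:
        f.write(f"AWS_ACCESS_KEY_ID={key_id}\n")
        f.write(f"AWS_SECRET_ACCESS_KEY={secret}\n")
        # A connection string that looks routine enough to be tested.
        f.write("DATABASE_URL=postgres://app:canary@db.internal:5432/prod\n")
```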

The goal isn’t to trap every agent. It’s to ensure that unexpected agent access to sensitive-looking material generates an alert you can investigate.

Security is detection as much as prevention. Snare gives you the detection layer.