Agentic cybersecurity. By agents, for agents.

What firewalls did for networks, what EDR did for endpoints, Guard Rails does for agents.

Your agents act faster than humans can supervise. So do the threats against them. Guard Rails closes the gap with a four-detector AI Security Operations Center that watches every action your agents take, flags prompt injection, credential leaks, behavioral anomalies, and policy violations, and pauses runaway agents on its own.

Inside the AI-SOC

Guard Rails runs a four-detector AI Security Operations Center on top of every agent in your fleet. Findings get an audit trail entry, high-severity findings auto-pause the offending agent via the kill switch, and ambiguous findings ping a human in Slack. The agent watching agents is itself governed.

Behavioral Analyst

LLM-driven

Watches for anomalies and scope drift across the audit feed using Claude Sonnet 4.6.

Prompt Injection Sentinel

Pattern-based

Detects jailbreak attempts and instruction overrides hidden inside tool call payloads.
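As a rough illustration of pattern-based detection over tool call payloads, the Python sketch below walks a nested payload and checks each string against a few regular expressions. The patterns and the names `iter_strings` and `scan_payload` are assumptions made for this example, not Guard Rails' actual rule set.

```python
import re
from typing import Any, Iterator

# Illustrative patterns only (assumed); a production rule set would be far broader.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(?:all\s+)?(?:previous|prior)\s+instructions", re.IGNORECASE),
    re.compile(r"disregard\s+(?:the\s+)?system\s+prompt", re.IGNORECASE),
    re.compile(r"you\s+are\s+now\s+in\s+developer\s+mode", re.IGNORECASE),
]


def iter_strings(value: Any) -> Iterator[str]:
    """Yield every string value found inside a nested dict/list payload."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            yield from iter_strings(item)


def scan_payload(payload: Any) -> list[str]:
    """Return the text fragments in the payload that match an injection pattern."""
    hits = []
    for text in iter_strings(payload):
        for pattern in INJECTION_PATTERNS:
            match = pattern.search(text)
            if match:
                hits.append(match.group(0))
    return hits
```

Walking the whole payload, rather than only top-level fields, matters because an override can be buried several levels deep in a tool's arguments.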

Secret Leakage Sniffer

Regex + entropy

Flags credentials, API keys, and tokens that leak into agent outputs.
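One way to combine regex and entropy checks is sketched below in Python: known key formats are matched directly, and long opaque strings are flagged when their Shannon entropy is high. The pattern list, the 24-character minimum, and the 4.0 bits-per-character threshold are all assumptions chosen for illustration, not the detector's real configuration.

```python
import math
import re
from collections import Counter

# Assumed example patterns; a real detector would cover many more formats.
KNOWN_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}
CANDIDATE = re.compile(r"[A-Za-z0-9+/_\-]{24,}")  # long opaque strings
ENTROPY_THRESHOLD = 4.0  # bits per character (assumed)


def shannon_entropy(s: str) -> float:
    """Shannon entropy of the character distribution of s, in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def find_secrets(text: str) -> list[tuple[str, str]]:
    """Return (kind, matched_text) pairs for likely secrets in text."""
    findings = []
    # Pass 1: exact formats we recognize.
    for name, pattern in KNOWN_PATTERNS.items():
        for m in pattern.finditer(text):
            findings.append((name, m.group(0)))
    # Pass 2: unknown formats that look random enough to be a key or token.
    for m in CANDIDATE.finditer(text):
        token = m.group(0)
        if shannon_entropy(token) >= ENTROPY_THRESHOLD:
            findings.append(("high_entropy_string", token))
    return findings
```

The entropy pass catches tokens with no recognizable prefix, at the cost of occasional false positives on hashes or encoded blobs.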

Compliance Sentinel

Codified rules

Catches policy violations like refunds over threshold or PII in mass email.
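Codified rules of this kind can be expressed as small functions that each inspect an action and return a violation message or nothing. The Python sketch below is a minimal version under stated assumptions: the `Action` shape, the 500 USD refund limit, the 50-recipient definition of "mass" email, and the SSN-like PII pattern are all hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Optional

REFUND_LIMIT_USD = 500.0        # assumed threshold
MASS_EMAIL_MIN_RECIPIENTS = 50  # assumed definition of "mass" email
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # one example PII pattern


@dataclass
class Action:
    type: str
    params: dict = field(default_factory=dict)


def refund_over_threshold(action: Action) -> Optional[str]:
    if action.type == "refund" and action.params.get("amount_usd", 0) > REFUND_LIMIT_USD:
        return f"refund exceeds {REFUND_LIMIT_USD:.0f} USD limit"
    return None


def pii_in_mass_email(action: Action) -> Optional[str]:
    if action.type != "send_email":
        return None
    recipients = action.params.get("recipients", [])
    body = action.params.get("body", "")
    if len(recipients) >= MASS_EMAIL_MIN_RECIPIENTS and SSN_LIKE.search(body):
        return "SSN-like value in mass email"
    return None


RULES: list[Callable[[Action], Optional[str]]] = [refund_over_threshold, pii_in_mass_email]


def evaluate(action: Action) -> list[str]:
    """Run every rule against the action and collect reported violations."""
    violations = []
    for rule in RULES:
        message = rule(action)
        if message is not None:
            violations.append(message)
    return violations
```

Keeping each rule as a separate function makes it straightforward to add, remove, or audit individual policies.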

Track 1: Workflows | Track 2: MCP | Track 3: ChatSDK

The platform all of these agents plug into

Every agent in this system, including the four security detectors themselves, runs against the same governance primitives: an append-only audit trail, per-action policy enforcement, a remote kill switch, and Slack-based human-in-the-loop approvals. Connect any MCP-compatible agent in one integration call.

See everything your agents do

log_event

Every reasoning step, tool call, and decision lands in an append-only trail. Nothing the agent does is invisible, even at 3am.
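To make "append-only" concrete, here is a minimal in-memory Python sketch: events can be added and read, but the class exposes no way to update or delete them, and each stored event is immutable. The `AuditTrail` and `AuditEvent` names and fields are hypothetical and say nothing about how `log_event` is actually implemented or stored.

```python
import time
from dataclasses import dataclass
from types import MappingProxyType
from typing import Any, Mapping


@dataclass(frozen=True)
class AuditEvent:
    seq: int                      # position in the trail
    timestamp: float              # seconds since the epoch
    agent_id: str
    kind: str                     # e.g. "reasoning", "tool_call", "decision"
    detail: Mapping[str, Any]     # read-only view of the event payload


class AuditTrail:
    """In-memory trail that supports append and read, but no update or delete."""

    def __init__(self) -> None:
        self._events: list[AuditEvent] = []

    def append(self, agent_id: str, kind: str, detail: dict) -> AuditEvent:
        event = AuditEvent(
            seq=len(self._events),
            timestamp=time.time(),
            agent_id=agent_id,
            kind=kind,
            detail=MappingProxyType(dict(detail)),  # copy, then expose read-only
        )
        self._events.append(event)
        return event

    def events(self) -> tuple[AuditEvent, ...]:
        """Return an immutable snapshot of the trail in insertion order."""
        return tuple(self._events)
```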

Stop any agent instantly

check_kill_switch

Pause any agent from the dashboard. Agents check in before every high-impact action and halt the moment the switch flips. No code change, no redeploy.
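The check-before-acting pattern described here can be sketched in Python as a small wrapper. In this example the switch is an in-memory stand-in for the dashboard-controlled state, and `KillSwitch`, `AgentPaused`, and `run_high_impact` are hypothetical names; the real `check_kill_switch` call and its response shape are not shown in this page.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class AgentPaused(Exception):
    """Raised when an agent tries to act while its kill switch is on."""


class KillSwitch:
    """In-memory stand-in (assumed) for the dashboard-controlled switch."""

    def __init__(self) -> None:
        self._paused: set[str] = set()

    def pause(self, agent_id: str) -> None:
        self._paused.add(agent_id)

    def resume(self, agent_id: str) -> None:
        self._paused.discard(agent_id)

    def is_paused(self, agent_id: str) -> bool:
        return agent_id in self._paused


def run_high_impact(switch: KillSwitch, agent_id: str, action: Callable[[], T]) -> T:
    """Check the switch immediately before acting; halt if it has been flipped."""
    if switch.is_paused(agent_id):
        raise AgentPaused(f"{agent_id} is paused")
    return action()
```

Because the state lives outside the agent's code, flipping it takes effect on the agent's next check without a redeploy.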

Decide what agents can do

get_policy

Simple rules by action type. Allow reads. Deny deletes. Require human approval for anything high-impact.
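A rule table keyed by action type might look like the Python sketch below. The action names, the `Decision` enum, and the choice to require approval for unknown action types are assumptions for illustration; the actual shape of what `get_policy` returns is not specified here.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    REQUIRE_APPROVAL = "require_approval"


# Hypothetical policy table: one decision per action type.
POLICY = {
    "read": Decision.ALLOW,
    "delete": Decision.DENY,
    "refund": Decision.REQUIRE_APPROVAL,  # example of a high-impact action
}

# Assumption: anything not listed is treated as high-impact and sent to a human.
DEFAULT_DECISION = Decision.REQUIRE_APPROVAL


def decide(action_type: str) -> Decision:
    """Look up the decision for an action type, falling back to the default."""
    return POLICY.get(action_type, DEFAULT_DECISION)
```

Defaulting unknown actions to approval rather than allow is a fail-closed choice: a new tool cannot act unreviewed just because nobody wrote a rule for it yet.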

Ask a human before risky actions

request_approval

Guard Rails pings a human on Slack. They click approve or deny. The agent waits. No hands required.

Try it

Run a security scan against the seeded data and watch all four detectors fire on a single live feed. Or open the side-by-side live demo to see the policy gate intercept attacks in real time.

See the live demo | Open dashboard | Try the agent demo | View source on GitHub

On the dashboard: click Seed demo data to populate, then Run security scan to watch the AI-SOC flag a suspicious refund, a leaked API key, a jailbreak attempt, and the agent that proposed them.

MCP endpoint: /api/mcp. HTTP fallbacks at /api/events and /api/kill-switch/:agentId.