Guard Rails MCP from Agentic Assembly

Agentic cybersecurity. By agents, for agents.

What firewalls did for networks, what EDR did for endpoints, Guard Rails does for agents.

Your agents act faster than humans can supervise. So do the threats against them. Guard Rails closes the gap with a four-detector AI Security Operations Center (AI-SOC) that watches every action your agents take, flags prompt injection, credential leaks, behavioral anomalies, and policy violations, and pauses runaway agents on its own.

Inside the AI-SOC

Guard Rails runs a four-detector AI Security Operations Center on top of every agent in your fleet. Findings get an audit trail entry, high-severity findings auto-pause the offending agent via the kill switch, and ambiguous findings ping a human in Slack. The agent watching agents is itself governed.
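The routing described above can be sketched in a few lines. This is a minimal illustration, not the Guard Rails implementation: the `Severity` levels, the `Finding` fields, and the in-memory `Triage` class are all hypothetical names chosen for the example. Only the three outcomes (audit entry for every finding, auto-pause on high severity, Slack ping on ambiguous) come from the description.

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    # Assumed severity scale; the page only names "high-severity" and "ambiguous".
    LOW = "low"
    AMBIGUOUS = "ambiguous"
    HIGH = "high"


@dataclass
class Finding:
    agent_id: str
    detector: str
    severity: Severity
    summary: str


@dataclass
class Triage:
    """In-memory stand-in for the audit trail, kill switch, and Slack queue."""
    audit_trail: list = field(default_factory=list)
    paused_agents: set = field(default_factory=set)
    slack_queue: list = field(default_factory=list)

    def route(self, finding: Finding) -> None:
        # Every finding is recorded, whatever its severity.
        self.audit_trail.append(finding)
        if finding.severity is Severity.HIGH:
            # High severity: pause the offending agent without waiting for a human.
            self.paused_agents.add(finding.agent_id)
        elif finding.severity is Severity.AMBIGUOUS:
            # Ambiguous: hand the decision to a human reviewer.
            self.slack_queue.append(finding)
```

In this sketch a low-severity finding is logged and nothing else happens, which is one plausible reading of the behavior rather than a documented one.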

Behavioral Analyst (LLM-driven)
Watches for anomalies and scope drift across the audit feed using Claude Sonnet 4.6.

Prompt Injection Sentinel (Pattern-based)
Detects jailbreak attempts and instruction overrides hidden inside tool call payloads.

Secret Leakage Sniffer (Regex + entropy)
Flags credentials, API keys, and tokens that leak into agent outputs.

Compliance Sentinel (Codified rules)
Catches policy violations like refunds over threshold or PII in mass email.
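To make "regex + entropy" concrete, here is a minimal sketch of how a secret sniffer of that kind can work. It is an assumption-laden illustration, not the detector Guard Rails ships: the two patterns, the 24-character candidate length, and the 4.0 bits-per-character threshold are all hypothetical choices for the example.

```python
import math
import re
from collections import Counter

# Hypothetical pattern set; a production rule list would be far larger.
KNOWN_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}

# Long unbroken runs of token-like characters are entropy candidates.
CANDIDATE_TOKEN = re.compile(r"[A-Za-z0-9+/_-]{24,}")
ENTROPY_THRESHOLD = 4.0  # bits per character; assumed cutoff


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def find_secrets(output: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs for suspected secrets in agent output."""
    findings = [
        (label, match.group(0))
        for label, pattern in KNOWN_PATTERNS.items()
        for match in pattern.finditer(output)
    ]
    already_flagged = {value for _, value in findings}
    for match in CANDIDATE_TOKEN.finditer(output):
        token = match.group(0)
        # Random-looking strings score high; repeated characters score low.
        if token not in already_flagged and shannon_entropy(token) >= ENTROPY_THRESHOLD:
            findings.append(("high_entropy_string", token))
    return findings
```

The regex pass catches credentials with a known shape, while the entropy pass catches random-looking strings that match no known format. The threshold trades false positives against misses and would need tuning on real agent output.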
Track 1: Workflows | Track 2: MCP | Track 3: ChatSDK

The platform all of these agents plug into

Every agent in this system, including the four security detectors themselves, runs against the same governance primitives: an append-only audit trail, per-action policy enforcement, a remote kill switch, and Slack-based human-in-the-loop approvals. Connect any MCP-compatible agent in one integration call.

See everything your agents do (log_event)
Every reasoning step, tool call, and decision lands in an append-only trail. Nothing the agent does is invisible, even at 3am.

Stop any agent instantly (check_kill_switch)
Pause any agent from the dashboard. Agents check in before every high-impact action and halt the moment the switch flips. No code change, no redeploy.

Decide what agents can do (get_policy)
Simple rules by action type. Allow reads. Deny deletes. Require human approval for anything high-impact.

Ask a human before risky actions (request_approval)
Guard Rails pings a human on Slack. They click approve or deny. The agent waits for the decision before it proceeds.
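The four primitives compose into a single gate around each agent action. The sketch below shows one way that flow could look. Only the four tool names (`log_event`, `check_kill_switch`, `get_policy`, `request_approval`) come from this page; the argument lists, return values, decision strings, and the `InMemoryGuardRails` stand-in are assumptions made for illustration, not the real MCP tool signatures.

```python
ALLOW, DENY, REQUIRE_APPROVAL = "allow", "deny", "require_approval"


class InMemoryGuardRails:
    """Hypothetical local stand-in for the service, useful for dry runs."""

    def __init__(self, policies, approver):
        self.policies = policies      # action type -> decision string
        self.approver = approver      # callable simulating the human on Slack
        self.killed = set()           # agent ids paused from the dashboard
        self.events = []              # append-only list standing in for the trail

    def log_event(self, agent_id, kind, detail):
        self.events.append((agent_id, kind, detail))

    def check_kill_switch(self, agent_id):
        return agent_id in self.killed

    def get_policy(self, action_type):
        # Assumed default: unknown action types need a human decision.
        return self.policies.get(action_type, REQUIRE_APPROVAL)

    def request_approval(self, agent_id, action_type):
        return self.approver(agent_id, action_type)


def run_guarded(guard, agent_id, action_type, action):
    """Run `action` only if the kill switch, policy, and approver all permit it."""
    if guard.check_kill_switch(agent_id):
        guard.log_event(agent_id, "halted", action_type)
        return "halted"
    decision = guard.get_policy(action_type)
    if decision == DENY:
        guard.log_event(agent_id, "denied", action_type)
        return "denied"
    if decision == REQUIRE_APPROVAL and not guard.request_approval(agent_id, action_type):
        guard.log_event(agent_id, "rejected", action_type)
        return "rejected"
    action()
    guard.log_event(agent_id, "executed", action_type)
    return "executed"
```

The kill switch is checked first so that a paused agent never reaches the policy lookup or the approval step, and every outcome, including refusals, is logged.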

Try it

Run a security scan against the seeded data and watch all four detectors fire on a single live feed. Or open the side-by-side live demo to see the policy gate intercept attacks in real time.

On the dashboard: click "Seed demo data" to populate the feed, then "Run security scan" to watch the AI-SOC flag a suspicious refund, a leaked API key, a jailbreak attempt, and the agent that proposed them.

MCP endpoint: /api/mcp. HTTP fallbacks at /api/events and /api/kill-switch/:agentId.
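As a small illustration of the route shapes above, the helper below builds full URLs from a deployment base URL. The three paths are taken from this page; the base URL, the function names, and the choice to percent-encode the agent id are assumptions. HTTP methods, request bodies, and response formats are not specified here, so the sketch stops at URL construction.

```python
from urllib.parse import quote, urljoin

MCP_PATH = "/api/mcp"
EVENTS_PATH = "/api/events"
KILL_SWITCH_TEMPLATE = "/api/kill-switch/{agent_id}"  # ":agentId" route parameter


def endpoint_url(base_url: str, path: str) -> str:
    """Join a deployment base URL (hypothetical) with one of the API paths."""
    return urljoin(base_url, path)


def kill_switch_url(base_url: str, agent_id: str) -> str:
    """Fill the :agentId segment, percent-encoding characters unsafe in a path."""
    path = KILL_SWITCH_TEMPLATE.format(agent_id=quote(agent_id, safe=""))
    return urljoin(base_url, path)
```

Encoding the agent id keeps an id containing a slash or a space from changing which route the request hits.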