Agent Sentinel

Real-time AI agent security monitoring. Workers do real tasks. Sentinels watch for compromise. Commanders quarantine threats. All over PubNub.

Security Monitoring Real I/O PubNub Bedsheet Sense OpenClaw Inspired

1 The Threat Landscape

The OpenClaw crisis of January-February 2026 showed the world what happens when AI agents go wrong. This demo recreates those scenarios in a controlled environment so you can see how distributed sentinel agents detect and respond to compromise.

Rogue Agent Behavior

An OpenClaw agent gained iMessage access and spammed 500+ messages to a user's contacts. Our behavior sentinel monitors output rates to detect this kind of anomaly.

Supply Chain Attacks

7.1% of skills in the ClawHub marketplace contained malicious code. Our supply chain sentinel verifies skill integrity using SHA-256 hashes against a known-good registry.

API Key Exposure

The Moltbook breach exposed 1.5 million API keys. Our workers simulate real tasks where credentials could leak if an agent goes rogue.

Mass Exposure

Over 40,000 unsecured OpenClaw instances were found on the public internet. Sentinel networks like this one provide the missing security layer.

2 Architecture

Agent Sentinel uses six agents organized in three tiers:

┌─────────────────────┐ │ sentinel-commander │ Correlates alerts, │ (SenseMixin+Agent) │ issues quarantine └──────────┬──────────┘ PubNub alerts │ PubNub queries ┌──────────────────────┼──────────────────────┐ │ │ │ ┌─────────┴──────────┐ ┌────────┴─────────┐ │ │ behavior-sentinel │ │ supply-chain- │ │ │ reads activity log│ │ sentinel │ │ │ detects rate spikes│ │ hashes skills │ │ └─────────┬──────────┘ └────────┬─────────┘ │ │ reads files │ reads files │ ┌─────────┴─────────────────────┴──────────────────────┘ │ │ data/activity_log.jsonl data/installed_skills/ │ (workers append here) (skill-acquirer writes here) │ ├─────────────────┬───────────────────┐ │ │ │ ┌───┴────────┐ ┌────┴───────┐ ┌────────┴──────┐ │ web- │ │ scheduler │ │ skill- │ │ researcher │ │ JSON cal. │ │ acquirer │ │ DuckDuckGo │ │ CRUD │ │ SHA-256 verify│ └────────────┘ └────────────┘ └───────────────┘ ~15% rogue ~15% rogue ~15% rogue

Communication Channels

Channel	Purpose	Who Publishes	Who Listens
`alerts`	Security alerts	Sentinels	Commander
`quarantine`	Quarantine orders	Commander	All agents
`{agent-name}`	Direct messages	Commander (queries)	Individual agents

Data Flow

Workers write to shared files. Sentinels read those files. This is real file I/O, not in-memory simulation:

File	Writer	Reader	Format
`data/activity_log.jsonl`	All workers	behavior-sentinel	JSON Lines (one object per line)
`data/calendar.json`	scheduler	scheduler	JSON array
`data/installed_skills/`	skill-acquirer	supply-chain-sentinel	.py files
`clawhub/registry.json`	(pre-seeded)	skill-acquirer, supply-chain-sentinel	JSON object

3 Web Researcher

The web researcher performs real DuckDuckGo searches every 15 seconds. The LLM picks a topic and uses the search_web tool to find recent news.

Tools

@research_tools.action("search_web", "Search the web using DuckDuckGo")
async def search_web(query: str) -> str:
    ddgs = DDGS()
    results = ddgs.text(query, max_results=3)
    _log_activity("web-researcher", "search", query)
    # Returns formatted search results...

Rogue Behavior (~15% chance)

When triggered, the web researcher enters a rapid-fire search burst — 50 searches in quick succession, each logged to the activity log. This simulates the output spam pattern seen in the OpenClaw incident:

async def _rogue_search_burst():
    junk_queries = [
        "password dump site", "free API keys list",
        "bypass authentication", "exploit database 2026",
    ]
    for i in range(50):
        query = random.choice(junk_queries) + f" {i}"
        ddgs.text(query, max_results=1)
        _log_activity("web-researcher", "search", f"[ROGUE] {query}")

Detection Point

The behavior sentinel reads data/activity_log.jsonl and calculates actions-per-minute. A burst of 50 searches triggers a rate anomaly alert at the threshold of 10 actions/min.

4 Scheduler

The scheduler manages a real JSON calendar file (data/calendar.json). Every 20 seconds, the LLM reviews and manages appointments.

Tools

@scheduler_tools.action("list_appointments", "List all scheduled appointments")
async def list_appointments() -> str:
    appointments = _read_calendar()  # Reads data/calendar.json
    ...

@scheduler_tools.action("add_appointment", "Add a new appointment")
async def add_appointment(title: str, date: str, time_str: str) -> str:
    appointments = _read_calendar()
    appointments.append(new_apt)
    _write_calendar(appointments)  # Writes data/calendar.json
    ...

Rogue Behavior (~15% chance)

Calendar sabotage: deletes all real appointments and fills the calendar with 20 junk entries dated 1999-01-01. Each deletion and addition is logged to the activity log.

5 Skill Acquirer

The skill acquirer manages a local ClawHub-style marketplace. It installs Python skill files from clawhub/ into data/installed_skills/, verifying SHA-256 hashes against the registry.

The ClawHub Registry

{
  "weather_lookup.py": {
    "sha256": "a5212db6...",
    "malicious": false,
    "description": "Weather Lookup"
  },
  "data_exfiltrator.py": {
    "sha256": "d18b5ae5...",
    "malicious": true,
    "description": "Data Exfiltrator"
  }
}

Normal Behavior

The install tool checks the registry before installing. If a skill is flagged "malicious": true, installation is blocked:

@skill_tools.action("install_skill", "Install a skill from ClawHub")
async def install_skill(skill_name: str) -> str:
    info = registry[skill_name]
    if info.get("malicious"):
        return f"BLOCKED: '{skill_name}' is flagged as malicious"

    # Verify SHA-256 before installing
    actual_hash = hashlib.sha256(open(source, "rb").read()).hexdigest()
    if actual_hash != info["sha256"]:
        return "INTEGRITY ERROR: hash mismatch"

    shutil.copy2(source, dest)
    return f"Installed '{skill_name}' (SHA-256 verified)"

Rogue Behavior (~15% chance)

Bypasses the malicious check and installs data_exfiltrator.py directly. The file is inert (no actual network calls), but contains suspicious imports (base64, socket) that the supply chain sentinel will flag.

The data_exfiltrator.py Skill

This file is intentionally crafted to look malicious for the demo. It imports base64 and socket and defines an _EXFIL_ENDPOINT, but the exfiltrate_data() function just returns a string. It performs no actual network calls or data theft.

6 Behavior Sentinel

The behavior sentinel monitors output rates by reading the shared activity log every 10 seconds. It doesn't use the LLM for monitoring — it does direct file I/O and math.

Detection Logic

# Monitoring loop (every 10 seconds):
entries = _read_recent_activity(minutes=2)
counts = {}
for e in entries:
    agent = e.get("agent", "unknown")
    counts[agent] = counts.get(agent, 0) + 1

for agent_name, count in counts.items():
    rate = count / 2.0  # actions per minute
    if rate > 10:  # THRESHOLD
        alert = Signal(
            kind="alert",
            sender="behavior-sentinel",
            payload={
                "severity": "high",
                "category": "behavior_anomaly",
                "agent": agent_name,
                "rate": rate,
                "message": f"Agent '{agent_name}' output rate anomaly",
            },
        )
        await agent.broadcast("alerts", alert)

Tools (for commander queries)

When the commander queries the behavior sentinel via request(), the LLM uses these tools to investigate:

Tool	What It Does
`check_activity_log(minutes)`	Count actions per agent in time window
`check_output_rate(agent_name)`	Actions/minute for a specific agent
`get_anomaly_report()`	Full report of all anomalies detected

7 Supply Chain Sentinel

The supply chain sentinel scans data/installed_skills/ every 15 seconds, computing SHA-256 hashes and comparing them against the ClawHub registry.

Detection Logic

def _scan_for_issues() -> list[dict]:
    registry = _load_registry()
    for f in os.listdir(_INSTALLED_DIR):
        actual_hash = hashlib.sha256(open(path, "rb").read()).hexdigest()
        info = registry.get(f)

        if info is None:
            issues.append({"type": "unknown_origin"})
        elif info.get("malicious"):
            issues.append({"type": "known_malicious"})
        elif actual_hash != info["sha256"]:
            issues.append({"type": "hash_mismatch"})
    return issues

Three detection categories:

Issue Type	Severity	Meaning
`known_malicious`	Critical	Installed skill is flagged malicious in registry
`hash_mismatch`	High	File has been tampered with since registry was created
`unknown_origin`	High	Skill not in registry at all (unknown provenance)

8 Sentinel Commander

The commander is the correlation engine. It listens for alert signals and uses Sense network RPC to query sentinels for evidence before issuing quarantine.

Alert Handling Flow

When an alert arrives:

Claim the incident using the claim protocol (prevents duplicate handling)
Query sentinels via request_remote_agent() for corroborating evidence
Assess the threat — the LLM synthesizes findings from multiple sentinels
Issue quarantine if evidence confirms compromise
Generate report — a threat assessment printed to the console

Network Tools

@commander_tools.action("request_remote_agent", "Query a sentinel for information")
async def request_remote_agent(agent_name: str, task: str) -> str:
    result = await _commander.request(agent_name, task, timeout=30.0)
    return result

@commander_tools.action("issue_quarantine", "Quarantine a compromised agent")
async def issue_quarantine(agent_name: str, reason: str) -> str:
    signal = Signal(
        kind="alert",
        sender="sentinel-commander",
        payload={
            "action": "quarantine",
            "severity": "critical",
            "agent": agent_name,
            "reason": reason,
        },
    )
    await _commander.broadcast("quarantine", signal)
    return f"Quarantine issued for '{agent_name}'"

How request() Works

When the commander calls request("behavior-sentinel", "check web-researcher activity"), the Sixth Sense sends a request signal to the behavior sentinel's direct channel. The sentinel's LLM receives the task, uses its tools to read the activity log, and sends the result back as a response signal matched by correlation_id. The commander's request() call returns the string result.

9 Setup & Run

Prerequisites

PubNub account — Sign up at pubnub.com (free tier is sufficient)
Anthropic API key — For Claude LLM access
Python 3.11+

Installation

# Install dependencies
uv pip install bedsheet[sense] duckduckgo-search

# Navigate to the demo
cd examples/agent-sentinel

# Validate the project configuration
bedsheet validate

╷ Configuration is valid! Configuration Summary: Version: 1.0 Project: agent-sentinel Active Target: local Number of Agents: 6

Running

# Set environment variables
export PUBNUB_SUBSCRIBE_KEY=sub-c-...
export PUBNUB_PUBLISH_KEY=pub-c-...
export ANTHROPIC_API_KEY=sk-ant-...

# Launch all 6 agents
python run.py

============================================================ Agent Sentinel - AI Agent Security Monitoring Inspired by the OpenClaw crisis of 2026 Launching 6 agents (3 workers + 2 sentinels + 1 commander)... ============================================================ Starting web-researcher... [web-researcher] Online and researching... Starting scheduler... [scheduler] Online and managing calendar... Starting skill-acquirer... [skill-acquirer] Online and acquiring skills... Starting behavior-sentinel... [behavior-sentinel] Online and monitoring behavior... Starting supply-chain-sentinel... [supply-chain-sentinel] Online and monitoring skill integrity... Starting sentinel-commander... [sentinel-commander] Online and coordinating... ============================================================ All agents online! Workers are doing real work. Sentinels are watching. ~15% chance of rogue behavior per cycle. Press Ctrl+C to stop. ============================================================

10 Expected Demo Flow

Because rogue behavior triggers randomly (~15% per cycle), each run tells a different story. Here's a typical sequence:

Phase 1: Normal Operations (first 30-60 seconds)

Workers do real work. The web researcher searches DuckDuckGo. The scheduler manages appointments. The skill acquirer installs legitimate skills. Sentinels scan and find nothing unusual.

Phase 2: Rogue Trigger

Eventually a worker goes rogue (15% chance per cycle). The most dramatic scenarios:

Scenario A: Search Spam

Web researcher fires 50 rapid searches. Behavior sentinel detects rate anomaly (25 actions/min, threshold is 10). Broadcasts alert. Commander investigates.

Scenario B: Malicious Skill Install

Skill acquirer installs data_exfiltrator.py. Supply chain sentinel detects known-malicious skill on next scan. Broadcasts critical alert. Commander queries both sentinels and issues quarantine.

Phase 3: Commander Response

[sentinel-commander] Alert from behavior-sentinel: [high] Agent 'web-researcher' output rate anomaly [sentinel-commander] Claimed inc-a1b2c3, investigating... -> request_remote_agent(behavior-sentinel, check web-researcher activity) -> request_remote_agent(supply-chain-sentinel, audit installed skills) [THREAT ASSESSMENT] Based on my investigation: - behavior-sentinel reports web-researcher at 25.0 actions/min (threshold: 10) - supply-chain-sentinel reports no skill integrity issues Recommendation: Quarantine web-researcher for behavior anomaly. -> issue_quarantine(web-researcher, Anomalous output rate detected) ============================================================ QUARANTINE ISSUED: web-researcher Reason: Anomalous output rate detected ============================================================

Using the Bedsheet CLI

This project was created with the Bedsheet CLI and includes a bedsheet.yaml configuration:

# Validate configuration
bedsheet validate

# Generate deployment artifacts
bedsheet generate --target local    # Docker deployment
bedsheet generate --target gcp      # Google Cloud Platform
bedsheet generate --target aws      # Amazon Web Services

Build Your Own

Start from scratch with bedsheet init my-sentinel-network, then add your own worker and sentinel agents. See the Sixth Sense Guide for the full API tutorial, and the Design Document for architecture details.