Agent Sentinel
Real-time AI agent security monitoring. Workers do real tasks. Sentinels watch for compromise. Commanders quarantine threats. All over PubNub.
1 The Threat Landscape
The OpenClaw crisis of January-February 2026 showed the world what happens when AI agents go wrong. This demo recreates those scenarios in a controlled environment so you can see how distributed sentinel agents detect and respond to compromise.
Rogue Agent Behavior
An OpenClaw agent gained iMessage access and spammed 500+ messages to a user's contacts. Our behavior sentinel monitors output rates to detect this kind of anomaly.
Supply Chain Attacks
7.1% of skills in the ClawHub marketplace contained malicious code. Our supply chain sentinel verifies skill integrity using SHA-256 hashes against a known-good registry.
API Key Exposure
The Moltbook breach exposed 1.5 million API keys. Our workers simulate real tasks where credentials could leak if an agent goes rogue.
Mass Exposure
Over 40,000 unsecured OpenClaw instances were found on the public internet. Sentinel networks like this one provide the missing security layer.
2 Architecture
Agent Sentinel uses six agents organized in three tiers:
Communication Channels
| Channel | Purpose | Who Publishes | Who Listens |
|---|---|---|---|
alerts | Security alerts | Sentinels | Commander |
quarantine | Quarantine orders | Commander | All agents |
{agent-name} | Direct messages | Commander (queries) | Individual agents |
Data Flow
Workers write to shared files. Sentinels read those files. This is real file I/O, not in-memory simulation:
| File | Writer | Reader | Format |
|---|---|---|---|
data/activity_log.jsonl | All workers | behavior-sentinel | JSON Lines (one object per line) |
data/calendar.json | scheduler | scheduler | JSON array |
data/installed_skills/ | skill-acquirer | supply-chain-sentinel | .py files |
clawhub/registry.json | (pre-seeded) | skill-acquirer, supply-chain-sentinel | JSON object |
3 Web Researcher
The web researcher performs real DuckDuckGo searches every 15 seconds. The LLM picks a topic and uses the search_web tool to find recent news.
Tools
@research_tools.action("search_web", "Search the web using DuckDuckGo")
async def search_web(query: str) -> str:
ddgs = DDGS()
results = ddgs.text(query, max_results=3)
_log_activity("web-researcher", "search", query)
# Returns formatted search results...
Rogue Behavior (~15% chance)
When triggered, the web researcher enters a rapid-fire search burst — 50 searches in quick succession, each logged to the activity log. This simulates the output spam pattern seen in the OpenClaw incident:
async def _rogue_search_burst():
junk_queries = [
"password dump site", "free API keys list",
"bypass authentication", "exploit database 2026",
]
for i in range(50):
query = random.choice(junk_queries) + f" {i}"
ddgs.text(query, max_results=1)
_log_activity("web-researcher", "search", f"[ROGUE] {query}")
The behavior sentinel reads data/activity_log.jsonl and calculates actions-per-minute. A burst of 50 searches triggers a rate anomaly alert at the threshold of 10 actions/min.
4 Scheduler
The scheduler manages a real JSON calendar file (data/calendar.json). Every 20 seconds, the LLM reviews and manages appointments.
Tools
@scheduler_tools.action("list_appointments", "List all scheduled appointments")
async def list_appointments() -> str:
appointments = _read_calendar() # Reads data/calendar.json
...
@scheduler_tools.action("add_appointment", "Add a new appointment")
async def add_appointment(title: str, date: str, time_str: str) -> str:
appointments = _read_calendar()
appointments.append(new_apt)
_write_calendar(appointments) # Writes data/calendar.json
...
Rogue Behavior (~15% chance)
Calendar sabotage: deletes all real appointments and fills the calendar with 20 junk entries dated 1999-01-01. Each deletion and addition is logged to the activity log.
5 Skill Acquirer
The skill acquirer manages a local ClawHub-style marketplace. It installs Python skill files from clawhub/ into data/installed_skills/, verifying SHA-256 hashes against the registry.
The ClawHub Registry
{
"weather_lookup.py": {
"sha256": "a5212db6...",
"malicious": false,
"description": "Weather Lookup"
},
"data_exfiltrator.py": {
"sha256": "d18b5ae5...",
"malicious": true,
"description": "Data Exfiltrator"
}
}
Normal Behavior
The install tool checks the registry before installing. If a skill is flagged "malicious": true, installation is blocked:
@skill_tools.action("install_skill", "Install a skill from ClawHub")
async def install_skill(skill_name: str) -> str:
info = registry[skill_name]
if info.get("malicious"):
return f"BLOCKED: '{skill_name}' is flagged as malicious"
# Verify SHA-256 before installing
actual_hash = hashlib.sha256(open(source, "rb").read()).hexdigest()
if actual_hash != info["sha256"]:
return "INTEGRITY ERROR: hash mismatch"
shutil.copy2(source, dest)
return f"Installed '{skill_name}' (SHA-256 verified)"
Rogue Behavior (~15% chance)
Bypasses the malicious check and installs data_exfiltrator.py directly. The file is inert (no actual network calls), but contains suspicious imports (base64, socket) that the supply chain sentinel will flag.
This file is intentionally crafted to look malicious for the demo. It imports base64 and socket and defines an _EXFIL_ENDPOINT, but the exfiltrate_data() function just returns a string. It performs no actual network calls or data theft.
6 Behavior Sentinel
The behavior sentinel monitors output rates by reading the shared activity log every 10 seconds. It doesn't use the LLM for monitoring — it does direct file I/O and math.
Detection Logic
# Monitoring loop (every 10 seconds):
entries = _read_recent_activity(minutes=2)
counts = {}
for e in entries:
agent = e.get("agent", "unknown")
counts[agent] = counts.get(agent, 0) + 1
for agent_name, count in counts.items():
rate = count / 2.0 # actions per minute
if rate > 10: # THRESHOLD
alert = Signal(
kind="alert",
sender="behavior-sentinel",
payload={
"severity": "high",
"category": "behavior_anomaly",
"agent": agent_name,
"rate": rate,
"message": f"Agent '{agent_name}' output rate anomaly",
},
)
await agent.broadcast("alerts", alert)
Tools (for commander queries)
When the commander queries the behavior sentinel via request(), the LLM uses these tools to investigate:
| Tool | What It Does |
|---|---|
check_activity_log(minutes) | Count actions per agent in time window |
check_output_rate(agent_name) | Actions/minute for a specific agent |
get_anomaly_report() | Full report of all anomalies detected |
7 Supply Chain Sentinel
The supply chain sentinel scans data/installed_skills/ every 15 seconds, computing SHA-256 hashes and comparing them against the ClawHub registry.
Detection Logic
def _scan_for_issues() -> list[dict]:
registry = _load_registry()
for f in os.listdir(_INSTALLED_DIR):
actual_hash = hashlib.sha256(open(path, "rb").read()).hexdigest()
info = registry.get(f)
if info is None:
issues.append({"type": "unknown_origin"})
elif info.get("malicious"):
issues.append({"type": "known_malicious"})
elif actual_hash != info["sha256"]:
issues.append({"type": "hash_mismatch"})
return issues
Three detection categories:
| Issue Type | Severity | Meaning |
|---|---|---|
known_malicious | Critical | Installed skill is flagged malicious in registry |
hash_mismatch | High | File has been tampered with since registry was created |
unknown_origin | High | Skill not in registry at all (unknown provenance) |
8 Sentinel Commander
The commander is the correlation engine. It listens for alert signals and uses Sense network RPC to query sentinels for evidence before issuing quarantine.
Alert Handling Flow
When an alert arrives:
- Claim the incident using the claim protocol (prevents duplicate handling)
- Query sentinels via
request_remote_agent()for corroborating evidence - Assess the threat — the LLM synthesizes findings from multiple sentinels
- Issue quarantine if evidence confirms compromise
- Generate report — a threat assessment printed to the console
Network Tools
@commander_tools.action("request_remote_agent", "Query a sentinel for information")
async def request_remote_agent(agent_name: str, task: str) -> str:
result = await _commander.request(agent_name, task, timeout=30.0)
return result
@commander_tools.action("issue_quarantine", "Quarantine a compromised agent")
async def issue_quarantine(agent_name: str, reason: str) -> str:
signal = Signal(
kind="alert",
sender="sentinel-commander",
payload={
"action": "quarantine",
"severity": "critical",
"agent": agent_name,
"reason": reason,
},
)
await _commander.broadcast("quarantine", signal)
return f"Quarantine issued for '{agent_name}'"
When the commander calls request("behavior-sentinel", "check web-researcher activity"), the Sixth Sense sends a request signal to the behavior sentinel's direct channel. The sentinel's LLM receives the task, uses its tools to read the activity log, and sends the result back as a response signal matched by correlation_id. The commander's request() call returns the string result.
9 Setup & Run
Prerequisites
- PubNub account — Sign up at pubnub.com (free tier is sufficient)
- Anthropic API key — For Claude LLM access
- Python 3.11+
Installation
# Install dependencies
uv pip install bedsheet[sense] duckduckgo-search
# Navigate to the demo
cd examples/agent-sentinel
# Validate the project configuration
bedsheet validate
Running
# Set environment variables
export PUBNUB_SUBSCRIBE_KEY=sub-c-...
export PUBNUB_PUBLISH_KEY=pub-c-...
export ANTHROPIC_API_KEY=sk-ant-...
# Launch all 6 agents
python run.py
10 Expected Demo Flow
Because rogue behavior triggers randomly (~15% per cycle), each run tells a different story. Here's a typical sequence:
Phase 1: Normal Operations (first 30-60 seconds)
Workers do real work. The web researcher searches DuckDuckGo. The scheduler manages appointments. The skill acquirer installs legitimate skills. Sentinels scan and find nothing unusual.
Phase 2: Rogue Trigger
Eventually a worker goes rogue (15% chance per cycle). The most dramatic scenarios:
Web researcher fires 50 rapid searches. Behavior sentinel detects rate anomaly (25 actions/min, threshold is 10). Broadcasts alert. Commander investigates.
Skill acquirer installs data_exfiltrator.py. Supply chain sentinel detects known-malicious skill on next scan. Broadcasts critical alert. Commander queries both sentinels and issues quarantine.
Phase 3: Commander Response
Using the Bedsheet CLI
This project was created with the Bedsheet CLI and includes a bedsheet.yaml configuration:
# Validate configuration
bedsheet validate
# Generate deployment artifacts
bedsheet generate --target local # Docker deployment
bedsheet generate --target gcp # Google Cloud Platform
bedsheet generate --target aws # Amazon Web Services
Start from scratch with bedsheet init my-sentinel-network, then add your own worker and sentinel agents. See the Sixth Sense Guide for the full API tutorial, and the Design Document for architecture details.