A scoring framework that evaluates production AI agents across three security dimensions and places each on a shared risk quadrant — so buyers, security teams, and vendors speak the same language about agent risk.
AIRQ is developed and maintained by AI security researchers and practitioners from across the industry.
AIRQ promotes healthy AI risk appetite and rewards vendor transparency. Built on a rigorous, data-driven methodology aligned with established industry standards, it enables risk quantification where existing frameworks stop at guidance — and works on its own for AI agent selection, threat modeling, and security hardening.
The methodology mimics what a real security team can do, so the questions it raises are ones you can put directly to your AI agent vendor. Over time, AIRQ also lets you track how your security posture evolves as agentic platforms update.
The AIRQ framework uses CoSAI’s three security principles — Human-governed & Accountable, Bounded & Resilient, and Transparent & Verifiable — as a qualitative checkpoint for agents near quadrant boundaries, and draws on CoSAI’s agentic governance and supply-chain workstreams to calibrate its defense scoring tiers.
The methodology doc is a big step up in rigor over other public AI-security scoring docs.
Together, MAESTRO + the Lethal Trifecta + AIVSS + AST10 give this methodology the structural depth, quantitative rigor, and operational specificity needed to produce risk rankings that are actually actionable for practitioners.
This is a real framework with a genuinely good taxonomy. Giving each class its own attack model instead of lumping everything under ‘AI agent’ is exactly the right instinct, and the per-class profiles are where the report is at its best. The report’s sharpest points about security architecture are what a busy reader should walk away with.
Every evaluated agent is placed on a two-axis risk map — Defense Controls vs. Attack Surface — with Blast Radius encoded as bubble size. The result is four risk positions.
Every agent is scored on its documented default configuration using public evidence — vendor docs, published CVEs, and independent research. Three axes feed into the quadrant and the composite AIRQ Score.
Attack Surface: how easily the agent can be compromised
Blast Radius: how much damage a compromised agent can cause
Defense Controls: how effectively defenses reduce raw risk
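To make the two-axis placement concrete, here is a minimal sketch of how an agent's scores might be mapped to a quadrant position and a bubble size. Everything in it is an illustrative assumption rather than AIRQ's published method: the 0-100 scale, the midpoint boundary of 50, the descriptive quadrant labels, and the names `AgentScores`, `quadrant`, and `bubble_size` are all hypothetical. The composite AIRQ Score is deliberately left out because its formula is not given here.

```python
from dataclasses import dataclass

# Assumption: each axis is normalized to 0-100 and the quadrant boundary sits
# at the midpoint. The real AIRQ scale and thresholds are not stated here.
MIDPOINT = 50.0


@dataclass(frozen=True)
class AgentScores:
    """Hypothetical container for the three axis scores of one agent."""
    name: str
    attack_surface: float    # how easily the agent can be compromised
    blast_radius: float      # how much damage a compromised agent can cause
    defense_controls: float  # how effectively defenses reduce raw risk


def quadrant(agent: AgentScores, midpoint: float = MIDPOINT) -> str:
    """Return a descriptive (not official) label for the agent's risk position."""
    surface = "high-surface" if agent.attack_surface >= midpoint else "low-surface"
    defense = "strong-defense" if agent.defense_controls >= midpoint else "weak-defense"
    return f"{surface}/{defense}"


def bubble_size(agent: AgentScores, min_pt: float = 10.0, max_pt: float = 60.0) -> float:
    """Map blast radius linearly onto a marker size for plotting."""
    clamped = min(max(agent.blast_radius, 0.0), 100.0)
    return min_pt + (max_pt - min_pt) * clamped / 100.0


if __name__ == "__main__":
    # Made-up scores for illustration only; not a real evaluated agent.
    example = AgentScores("example-agent", attack_surface=72,
                          blast_radius=40, defense_controls=35)
    print(example.name, quadrant(example), bubble_size(example))
```

Agents whose scores fall close to the assumed midpoint are the ones a qualitative check, such as the CoSAI principles mentioned earlier, would be expected to arbitrate.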