AOM Specification

This directory contains the complete JSON Schema definitions for the Agent Object Model (AOM)™ protocol. To validate surfaces and outputs, use the CLI from the repo root (aom.py / aom.mjs).

Schemas

aom-input-schema.json (core surface schema)

What it defines: The structure of an AOM surface (screen, modal, panel, widget, drawer).

Validates: *.aom.json files in examples/ (for example, examples/v0.1.0/).

Key sections:

Use when: Generating or consuming AOM from web/mobile screens.

aom-output-schema.json (agent output schema)

What it defines: The structure of an agent’s response to an AOM surface.

Validates: *.output.json files in Examples/ or examples/ (typically under each surface’s outputs/ folder).

Key sections:

Use when: Building agents that operate on AOM surfaces.

Secure/signed payloads are out of scope for this spec; any standard for signed envelopes or verification will be defined elsewhere.

AOM Contracts

AOM defines two machine-readable contracts. Both are required for AOM-compliant agent systems:

Contract  Schema                  File Extension  Purpose
Surface   aom-input-schema.json   *.aom.json      Describes what the agent sees (UI state, available actions, entities)
Output    aom-output-schema.json  *.output.json   Describes what the agent does (thought, action, result, confidence)
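
The extension-to-schema mapping in the table can be applied mechanically when picking a schema for a file. The following is a minimal sketch, assuming the schemas live under spec/v0.1.0/ as in the validation commands in this README. The helper name schema_for is hypothetical and not part of the AOM tooling.

```python
from pathlib import PurePosixPath

# File-extension -> schema mapping taken from the contracts table.
SCHEMA_BY_SUFFIX = {
    ".aom.json": "aom-input-schema.json",      # Surface contract
    ".output.json": "aom-output-schema.json",  # Output contract
}


def schema_for(path: str, spec_dir: str = "spec/v0.1.0") -> str:
    """Return the schema path that should validate `path` (hypothetical helper)."""
    name = PurePosixPath(path).name
    for suffix, schema in SCHEMA_BY_SUFFIX.items():
        if name.endswith(suffix):
            return f"{spec_dir}/{schema}"
    raise ValueError(f"not an AOM surface or output file: {path}")


print(schema_for("examples/v0.1.0/login-single/login.aom.json"))
# spec/v0.1.0/aom-input-schema.json
```

Files with any other extension are rejected rather than guessed at, so a misnamed example fails loudly instead of being validated against the wrong contract.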

Versioning

Both schemas use semantic versioning:

Each AOM surface must declare its version:

{
  "aom_version": "0.1.0",
  ...
}
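
A consumer may want to check the declared version before doing anything else with a surface. The sketch below assumes aom_version is a plain MAJOR.MINOR.PATCH string with no pre-release or build suffix; the schema itself may be stricter or looser than this pattern, and parse_aom_version is a hypothetical helper name.

```python
import re

# MAJOR.MINOR.PATCH without pre-release/build suffixes (an assumption,
# not a pattern taken from the AOM schema).
_SEMVER = re.compile(r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def parse_aom_version(surface: dict) -> tuple[int, int, int]:
    """Return (major, minor, patch) from a surface's aom_version field."""
    version = surface.get("aom_version")
    if not isinstance(version, str):
        raise ValueError("surface must declare aom_version as a string")
    match = _SEMVER.fullmatch(version)
    if match is None:
        raise ValueError(f"aom_version is not MAJOR.MINOR.PATCH: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch)


print(parse_aom_version({"aom_version": "0.1.0"}))  # (0, 1, 0)
```

Returning a tuple of integers lets callers compare versions with ordinary tuple ordering, for example to refuse surfaces newer than the runtime understands.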

Key principles (quick)

These mirror the design intent throughout the spec:

  1. Task-centric — organized around user goals, not UI layout.
  2. Entity-driven — data structures are explicit and typed.
  3. Action-oriented — what the agent can do is enumerated and validated.
  4. State-aware — workflows and session context can be represented explicitly.
  5. Layout-free — no CSS, coordinates, or presentation noise.
  6. Semantic-only — meaning first; avoid DOM coupling.
  7. Automation guardrails — forbidden / allowed / open control how agents may use the surface.

Design Principles

1. Task-Centric, Not DOM-Centric

AOM describes what users can accomplish (tasks, actions, entities), not the HTML structure.

Why: Agents reason about goals, not CSS selectors.

2. Entity-Driven

Domain objects (Product, Order, User) are first-class citizens with schemas, runtime validations, and current values.

Why: Agents operate on structured data, not unstructured text.

3. Declarative Actions

Actions declare their inputs, outputs, effects, priorities, and preconditions.

Why: Agents can plan, validate, and execute actions safely.

4. Production Intelligence

AOM natively supports automated testing via signals.test_cases and runtime escalation gating via meta.confidence.

Why: Agents require strict validation and human-in-the-loop fallback paths for enterprise reliability.

5. Mode Flexibility

AOM supports both single-shot mode (one action → done) and flow mode (multi-step workflows).

Why: Different tasks have different execution patterns.

6. Runtime-Agnostic

AOM is JSON, so it works with any agent framework, LLM, or automation tool.

Why: Interoperability across ecosystems.

JSON Schema Details

Both schemas use JSON Schema Draft 2020-12:

Required vs Optional Fields

Input/core schema (aom-input-schema.json):

Output schema (aom-output-schema.json):


Extending AOM

Custom Fields

Both schemas allow additionalProperties: true in specific sections:

Example:

{
  "context": {
    "app_name": "MyApp",
    "locale": "en-US",
    "custom_tenant_id": "acme-corp",
    "custom_feature_flags": ["beta_ui", "dark_mode"]
  }
}

Custom Entity Types

Entity schemas support arbitrary field types and custom validation rules:

{
  "entities": {
    "CustomWidget": {
      "schema": {
        "widget_id": {"type": "string", "required": true},
        "config": {"type": "object", "required": false}
      },
      "current": {
        "widget_id": "w123",
        "config": {"color": "blue", "size": "large"}
      }
    }
  }
}
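
To illustrate how a runtime might use an entity's schema, here is a minimal sketch that checks the current values against the per-field type and required flags shown in the example. It is an assumption about how such a check could look, not the behavior of the AOM validators. Only the string and object type names come from the example, and check_entity is a hypothetical helper.

```python
# Type names seen in the example above; any other type name is skipped.
PY_TYPES = {"string": str, "object": dict}


def check_entity(entity: dict) -> list[str]:
    """Return a list of problems found in entity["current"] (empty if none)."""
    schema = entity.get("schema", {})
    current = entity.get("current", {})
    problems = []
    for field, spec in schema.items():
        if field not in current:
            if spec.get("required", False):
                problems.append(f"missing required field: {field}")
            continue
        expected = PY_TYPES.get(spec.get("type"))
        # Unknown type names are ignored rather than rejected.
        if expected is not None and not isinstance(current[field], expected):
            problems.append(f"{field}: expected {spec['type']}")
    return problems


widget = {
    "schema": {
        "widget_id": {"type": "string", "required": True},
        "config": {"type": "object", "required": False},
    },
    "current": {"widget_id": "w123", "config": {"color": "blue", "size": "large"}},
}
print(check_entity(widget))  # []
```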

Optional: binds_to (agents.json Integration)

Actions can optionally reference external API/tool definitions via the binds_to field:

{
  "actions": [
    {
      "id": "submit_checkout",
      "label": "Place order",
      "category": "mutation",
      "description": "Submit checkout and create order.",
      "input_entities": ["CheckoutIntent"],
      "output_entities": ["OrderConfirmation"],
      "effects": [
        "entities.OrderConfirmation.current = shop_api.place(...)",
        "state.workflow.step_id = 'order_placed'"
      ],
      "binds_to": {
        "type": "agent.workflow_step",
        "ref": "place_order_confirm_checkout",
        "optional": true
      }
    }
  ]
}

When to use:

Your runtime has an external tool registry (e.g., agents.json, MCP tools, OpenAPI specs)

You want agents to call real APIs instead of simulating via effects

When binds_to is present:

  1. The runtime tries to resolve binds_to.ref from the external registry.
  2. If found → use the external tool schema (parameters, authentication, etc.).
  3. If not found AND optional: true → fall back to AOM’s inline effects.
  4. If not found AND optional: false → fail with a clear error.

Schema:

type (string) — Namespace/type of external binding (e.g., “agent.workflow_step”, “mcp.tool”, “openapi.operation”)

ref (string) — External identifier (tool name, operation ID, etc.)

optional (boolean, default false) — Whether the runtime may fall back to inline effects when the binding cannot be resolved

Default behavior: If binds_to is omitted, the runtime executes the action using AOM’s inline effects only.
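
The resolution rules above can be sketched as a single function. This is an illustration under stated assumptions: the registry is modeled as a plain dict keyed by ref, and resolve_action and its return shape are hypothetical names chosen for the example, not part of any AOM runtime.

```python
def resolve_action(action: dict, registry: dict) -> dict:
    """Decide how to execute an action, following the binds_to rules.

    `registry` is a hypothetical mapping of external refs to tool definitions.
    """
    binding = action.get("binds_to")
    if binding is None:
        # binds_to omitted: run the action through its inline effects only.
        return {"via": "effects", "effects": action.get("effects", [])}

    tool = registry.get(binding["ref"])
    if tool is not None:
        # Found: use the external tool definition.
        return {"via": "external", "type": binding["type"], "tool": tool}

    if binding.get("optional", False):
        # Not found, but the binding is optional: fall back to inline effects.
        return {"via": "effects", "effects": action.get("effects", [])}

    raise LookupError(
        f"action {action.get('id')!r}: required binding {binding['ref']!r} "
        f"({binding['type']}) not found in registry"
    )


action = {
    "id": "submit_checkout",
    "effects": ["state.workflow.step_id = 'order_placed'"],
    "binds_to": {
        "type": "agent.workflow_step",
        "ref": "place_order_confirm_checkout",
        "optional": True,
    },
}
print(resolve_action(action, registry={})["via"])  # effects
```

With an empty registry the optional binding falls back to effects; setting optional to false in the same call would raise LookupError instead.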

Roadmap: Auto-resolution from common tool registries.


Optional: A2H (Agent-to-Human) Integration

AOM natively supports the industry-standard A2H protocol for safe Human-in-the-Loop (HITL) escalations. This allows the surface to dictate when an agent must pause and ask a human for approval or data.

1. Defining the Policy in the Surface (aom-input-schema)

The surface defines which actions require human intervention via the a2h_policy object on an action:

{
  "actions": [
    {
      "id": "delete_database",
      "label": "Delete Production DB",
      "category": "mutation",
      "a2h_policy": {
        "requires_authorization": true,
        "escalation_channel": "in_app"
      }
    }
  ]
}

2. Executing the Intent in the Output (aom-output-schema)

When the agent realizes it needs to escalate (either due to the surface’s a2h_policy or low internal confidence), it outputs an a2h_intent inside the meta block:

{
  "mode": "flow",
  "action": { "action_id": "none" },
  "meta": {
    "done": false,
    "confidence": 0.4,
    "a2h_intent": {
      "type": "AUTHORIZE",
      "message": "I am about to delete the production database. Do I have your approval to proceed?"
    }
  }
}
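
The two escalation triggers described above (a surface policy or low confidence) can be combined in one check. The sketch below is an assumption-laden illustration: the 0.7 threshold is invented for the example because the text does not fix one, maybe_escalate is a hypothetical helper, and AUTHORIZE is used because it is the only intent type shown in this document.

```python
# Assumed cut-off; the spec text does not define a confidence threshold.
CONFIDENCE_THRESHOLD = 0.7


def maybe_escalate(action: dict, confidence: float, message: str):
    """Return an escalation output if the action needs a human, else None."""
    policy = action.get("a2h_policy", {})
    needs_human = (
        policy.get("requires_authorization", False)
        or confidence < CONFIDENCE_THRESHOLD
    )
    if not needs_human:
        return None
    return {
        "mode": "flow",
        "action": {"action_id": "none"},  # no surface action is taken yet
        "meta": {
            "done": False,
            "confidence": confidence,
            "a2h_intent": {"type": "AUTHORIZE", "message": message},
        },
    }


delete_db = {
    "id": "delete_database",
    "a2h_policy": {"requires_authorization": True, "escalation_channel": "in_app"},
}
output = maybe_escalate(delete_db, 0.95, "Approve deleting the production database?")
print(output["meta"]["a2h_intent"]["type"])  # AUTHORIZE
```

Note that the policy wins even at high confidence: an action marked requires_authorization always escalates, while unmarked actions escalate only when confidence drops below the assumed threshold.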

Supported A2H Intents:

Validation Tools

Tools are organized by language under Tools/ so you can use only Python or only Node. See Tools/README.md.

Python

# From repo root (pip install -r Tools/python/validate/requirements.txt first)
python Tools/python/validate/validate.py spec/v0.1.0/aom-input-schema.json examples/v0.1.0/login-single/login.aom.json
python Tools/python/validate/validate_all.py
python Tools/python/validate/validate_all.py v0.1.0/ecom-flow

Node

# From repo root (npm install in Tools/node/validate first)
node Tools/node/validate/validate.js spec/v0.1.0/aom-input-schema.json examples/v0.1.0/login-single/login.aom.json
node Tools/node/validate/validate_all.js
node Tools/node/validate/validate_all.js v0.1.0/ecom-flow

Generating output files for testing

Golden *.output.json files under each example’s outputs/ folder are generated by the create-outputs tools. From repo root:

python Tools/python/create-outputs/create_outputs.py
# or
node Tools/node/create-outputs/create_outputs.js

Schema Changelog

v0.1.0 (2026-02-26)

Initial public release (current)

Roadmap: v0.2.0 (not yet released)

Future versions MAY introduce:


FAQ

Q: Why separate surface and output schemas?
A: Surfaces describe what’s available (input to the agent); outputs describe what the agent decided (output from the agent). They have different lifecycles and different consumers.

Q: Can I use AOM with non-LLM agents?
A: Yes. AOM is JSON. Any system that can parse JSON and make decisions can consume AOM.

Q: Does AOM require specific UI frameworks?
A: No. AOM is framework-agnostic. Generate it from React, Vue, mobile apps, or even server-rendered HTML.

Q: What about authentication/security?
A: AOM surfaces can include state.session.authenticated and state.session.user_id. Authorization logic lives in your runtime, not the schema.

Q: Can AOM represent native mobile screens?
A: Yes. surface_kind supports screens, modals, panels, drawers, and widgets. The abstraction works for both web and mobile.


Contributing

Found an issue or have a suggestion?

  1. Validate your examples against the schemas first
  2. Open an issue with concrete examples
  3. Propose changes with before/after JSON snippets

Schema improvements should maintain backward compatibility when possible.