A task-centric, entity-driven JSON standard that gives AI agents a clean, automation-aware view of any web page or surface — with zero layout noise, clear tasks, and explicit policies.
Start here — white paper (HTML): motivation, architecture, Input and Output AOM, automation policy (including the three automation_policy modes), and incremental adoption. Read it in the browser before diving into schemas and examples.
Get the project: Clone this repo or download ZIP — spec, schemas, examples, and validators in one package. Latest release: v0.1.0.
Agent Object Model and AOM are trademarks; registration has been filed. See Trademark Notice. This repo is the canonical source for the AOM spec and reference tooling (https://agentobjectmodel.org).
AI agents browsing the web today face critical challenges:
What agents see:
This leads agents to:
Example: A simple login form
What the agent gets (HTML):
<form id="login_form_a8f3" class="needs-validation">
  <input type="email" id="email_input_x2k9" required>
  <input type="password" id="password_x7k1" required>
  <button type="submit">Log In</button>
  <a href="/account/change-password" class="link-secondary">Change password</a>
</form>
Result: The agent spends tokens on markup and still cannot tell what happens after submit or which actions are safe to take.
Instead of raw HTML, agents receive a clean, semantic AOM JSON document with exactly what they need:
{
  "aom_version": "0.1.0",
  "surface_id": "app:auth:login",
  "surface_kind": "screen",
  "automation_policy": "allowed",
  "purpose": {
    "primary_goal": "Authenticate the user and establish a session.",
    "user_roles": ["guest", "anonymous"]
  },
  "tasks": [{
    "id": "login",
    "label": "Sign in",
    "description": "Submit credentials to authenticate.",
    "default_action_id": "submit_login",
    "input_entities": ["LoginCredentials"]
  }],
  "entities": {
    "LoginCredentials": {
      "schema": {
        "username": { "type": "string", "required": true },
        "password": { "type": "string", "required": true }
      }
    }
  },
  "actions": [{
    "id": "submit_login",
    "label": "Log In",
    "category": "mutation",
    "input_entities": ["LoginCredentials"]
  }]
}
Result: The agent sees the purpose, tasks, entities, and allowed actions, with no layout noise. The automation policy can restrict the agent to AOM-only operation (allowed, with guardrails) or permit more (open). (The snippet above is illustrative; a full valid surface also includes the required generated_at, state, navigation, and signals fields; see the spec.)
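As a rough illustration of how an agent might read such a surface, the Python sketch below looks up a task, follows its default_action_id to the matching action, and collects the required fields from that action's input entities. The helper name plan_default_action is hypothetical and not part of the AOM tooling, and the surface is a trimmed copy of the illustrative snippet, so it is not a full valid surface.

```python
# Trimmed copy of the illustrative login surface (not a full valid surface).
SURFACE = {
    "surface_id": "app:auth:login",
    "automation_policy": "allowed",
    "tasks": [{
        "id": "login",
        "default_action_id": "submit_login",
        "input_entities": ["LoginCredentials"],
    }],
    "entities": {
        "LoginCredentials": {
            "schema": {
                "username": {"type": "string", "required": True},
                "password": {"type": "string", "required": True},
            }
        }
    },
    "actions": [{
        "id": "submit_login",
        "category": "mutation",
        "input_entities": ["LoginCredentials"],
    }],
}


def plan_default_action(surface, task_id):
    """Hypothetical helper: return (action_id, required_fields) for a task."""
    task = next(t for t in surface["tasks"] if t["id"] == task_id)
    action = next(
        a for a in surface["actions"] if a["id"] == task["default_action_id"]
    )
    required = []
    for entity_name in action.get("input_entities", []):
        schema = surface["entities"][entity_name]["schema"]
        # Keep only the fields the entity schema marks as required.
        required += [name for name, spec in schema.items() if spec.get("required")]
    return action["id"], required


action_id, required = plan_default_action(SURFACE, "login")
print(action_id, required)
```

The agent never touches element ids or CSS classes here; everything it needs is addressed by stable AOM identifiers.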
AOM’s answer:
Automation policy: Site-level and per-surface rules (forbidden / allowed / open) advertised via /.well-known/aom-policy.json and JSON-LD in <head>.
See spec/v0.1.0/README.md for the full formal model and a diagram of where the documents fit (surface → agent → output → request to site; and where schemas, examples, and tools live in this repo).
Clone or download this repository (see Get the project above), then from the repo root run the commands below. You need either Python 3 or Node 18+ for the CLI.
# View an example surface (or open the file in an editor)
cat examples/v0.1.0/login-single/login.aom.json
python aom.py validate input --file examples/v0.1.0/login-single/login.aom.json
python aom.py validate output --file examples/v0.1.0/login-single/outputs/_login.success.output.json
python aom.py validate all --examples-dir examples/v0.1.0
node aom.mjs validate all --examples-dir examples/v0.1.0
python aom.py demo run --lang python --folder v0.1.0/login-single --test-case _login.success.output
node aom.mjs demo run --lang node --folder v0.1.0/login-single --test-case _login.success.output
Sample agent output (single-shot):
{
  "mode": "single",
  "action": {
    "action_id": "submit_login",
    "params": {},
    "priority": 5
  },
  "meta": { "done": true, "confidence": 0.95 },
  "thought": "Proceeding with default task action.",
  "result": { "ok": true, "user_id": "user1234" }
}
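The relationship between an output and its surface can be sketched in Python as a few cross-checks: the mode is one of the two documented values, the chosen action_id exists on the surface, and a single-shot output sets meta.done. The function check_output is a hypothetical illustration; it does not replace the repo's validators or aom-output-schema.json, which remain the authoritative contract.

```python
# Minimal stand-ins for a surface and an agent output (illustrative only).
SURFACE = {"actions": [{"id": "submit_login"}]}
SAMPLE_OUTPUT = {
    "mode": "single",
    "action": {"action_id": "submit_login", "params": {}, "priority": 5},
    "meta": {"done": True, "confidence": 0.95},
}


def check_output(output, surface):
    """Hypothetical cross-check; returns a list of problems (empty means OK)."""
    problems = []
    if output.get("mode") not in ("single", "flow"):
        problems.append("mode must be 'single' or 'flow'")
    known_actions = {a["id"] for a in surface.get("actions", [])}
    action_id = output.get("action", {}).get("action_id")
    if action_id not in known_actions:
        problems.append(f"unknown action_id: {action_id!r}")
    # Single-shot outputs are one decision, so they should be marked done.
    if output.get("mode") == "single" and output.get("meta", {}).get("done") is not True:
        problems.append("single-shot output should set meta.done to true")
    return problems


print(check_output(SAMPLE_OUTPUT, SURFACE))
```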
For the full matrix of CLI and script-level commands, see COMMANDS.md.
| Concept | What it represents | Where to learn more |
|---|---|---|
| Surface | JSON description of a screen: purpose, tasks, entities, actions, state, navigation, signals | spec/v0.1.0/README.md (aom-input-schema.json) |
| Output | Agent’s response: thought, chosen action, result, meta | spec/v0.1.0/README.md (aom-output-schema.json) |
| Automation policy | Rules for automation: forbidden / allowed / open | spec/v0.1.0/README.md, spec/well-known-policy.md |
| Site policy | Well-known JSON for site-wide automation policy | /.well-known/aom-policy.json examples |
| Signals & test cases | Built-in feedback and test cases for each surface | spec/v0.1.0/README.md (signals.test_cases) |
| Section | What it tells the agent |
|---|---|
| purpose | Why does this screen exist? What is the user’s goal? |
| tasks | What workflows are available? What are the steps? |
| entities | What data is in play? What are the schemas? |
| actions | What can the agent do? What are the inputs/outputs? |
| state | Where are we in the workflow? What is the current context? |
| navigation | Where can the agent go next? What are the transitions? |
| signals | Are there errors, warnings, or confirmations? |
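
A quick way to see which of these sections a given document carries is a presence check like the Python sketch below. The name missing_sections is hypothetical, and the check only looks at top-level keys; the schema validators shipped in the repo are the real test of validity.

```python
# Top-level sections listed in the table above.
SECTIONS = ("purpose", "tasks", "entities", "actions", "state", "navigation", "signals")


def missing_sections(surface):
    """Hypothetical helper: list the top-level sections a surface lacks."""
    return [name for name in SECTIONS if name not in surface]


# An abbreviated surface, similar in shape to the illustrative login snippet.
partial_surface = {"purpose": {}, "tasks": [], "entities": {}, "actions": []}
print(missing_sections(partial_surface))
```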
The three automation_policy modes, forbidden / allowed (with guardrails) / open, control how agents may use the surface.

AOM supports two execution patterns for agent behavior:

- Single-shot (mode: "single"): One decision, one response. The agent reads the surface, chooses an action, and returns a result with meta.done: true. No loop.
- Multi-step flow (mode: "flow"): The agent emits an action; the runtime returns an updated surface; the loop continues until meta.done: true.

See spec/v0.1.0/README.md and aom-output-schema.json for the full output contract.
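The flow pattern can be sketched as a small loop in Python. Everything here is an assumption made for illustration: run_flow, decide, and runtime_step are hypothetical names, the two surfaces and the NEXT transition table are toy data, and a real runtime would return updated surfaces as described in the spec rather than look them up in a dict.

```python
# Toy surfaces and transitions (hypothetical, for illustration only).
SURFACES = {
    "shop:cart": {"surface_id": "shop:cart", "actions": [{"id": "go_to_checkout"}]},
    "shop:checkout": {"surface_id": "shop:checkout", "actions": [{"id": "place_order"}]},
}
NEXT = {"go_to_checkout": "shop:checkout"}


def decide(surface):
    """Toy agent: pick the first action; finish when no transition follows it."""
    action_id = surface["actions"][0]["id"]
    return {
        "mode": "flow",
        "action": {"action_id": action_id, "params": {}},
        "meta": {"done": action_id not in NEXT},
    }


def runtime_step(surface, action):
    """Toy runtime: return the surface that follows the chosen action."""
    return SURFACES[NEXT[action["action_id"]]]


def run_flow(surface, decide, runtime_step, max_steps=10):
    """Emit an action, get an updated surface, repeat until meta.done is true."""
    outputs = []
    for _ in range(max_steps):  # hard cap so a misbehaving agent cannot loop forever
        output = decide(surface)
        outputs.append(output)
        if output["meta"]["done"]:
            break
        surface = runtime_step(surface, output["action"])
    return outputs


outputs = run_flow(SURFACES["shop:cart"], decide, runtime_step)
print([o["action"]["action_id"] for o in outputs])
```

A single-shot agent is the degenerate case: one call to decide whose output already has meta.done set to true.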
agentobjectmodel.org/
├── spec/
│ ├── v0.1.0/ # Schemas, templates, well-known policy
│ │ ├── aom-input-schema.json
│ │ ├── aom-output-schema.json
│ │ ├── site-policy-schema.json
│ │ ├── README.md
│ │ ├── sequence-diagram-for-geeks.md # Full runtime flow diagram (site/page policy, Input/Output AOM)
│ │ └── templates/site-policy/
│ └── well-known-policy.md
├── examples/
│ └── v0.1.0/ # login-single, ecom-flow, forbidden-page-template (minimal *.aom.json), demo-agents
├── tools/ # Python + Node: validate, create-outputs, testing
├── static/ # Badges, USAGE.md, badge-test.html, TRADEMARK-NOTICE.md
├── .well-known/ # Example site policy JSON
├── aom.py # Python CLI
├── aom.mjs # Node CLI
└── COMMANDS.md # Full command reference
Key entrypoints: aom.py, aom.mjs, and COMMANDS.md.
Goal: Build agents that consume AOM surfaces.
aom validate input / aom validate output / aom validate all.

Goal: Make your site AOM-ready.
/.well-known/aom-policy.json (see well-known-policy.md and spec/v0.1.0/templates/site-policy/).

With automation_policy: "allowed" (with guardrails), you decide which actions the agent can see and perform. If you do not include a password reset or change-password flow in the surface's tasks / actions, a conforming agent cannot invoke those operations or see the associated sensitive state. You can keep higher-risk flows on separate surfaces with a stricter policy or A2H requirements.
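On the agent side, the guardrail described above amounts to a gate in front of every action. The Python sketch below shows one possible reading; may_invoke is a hypothetical name, and treating "open" as unrestricted is an assumption, since the precise semantics of each mode are defined in the spec.

```python
def may_invoke(policy, surface, action_id):
    """Hypothetical gate: may a conforming agent invoke this action?"""
    if policy == "forbidden":
        return False  # no automation on this surface
    if policy == "allowed":
        # Guardrails: only actions the surface itself declares are available.
        return any(a["id"] == action_id for a in surface.get("actions", []))
    if policy == "open":
        return True  # assumption: no AOM-side restriction in this mode
    raise ValueError(f"unknown automation_policy: {policy!r}")


# The login surface declares submit_login but no change-password action.
login_surface = {"actions": [{"id": "submit_login"}]}
print(may_invoke("allowed", login_surface, "submit_login"))
print(may_invoke("allowed", login_surface, "change_password"))
```

Because the change-password operation is simply absent from the surface, the gate rejects it without the site having to enumerate what is disallowed.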
Goal: Validators, create-outputs, or other tooling.
aom-input-schema.json, aom-output-schema.json, site-policy-schema.json).

| Example | Purpose | Location |
|---|---|---|
| Login single | Single-shot sign-in surface (allowed, with guardrails) | examples/v0.1.0/login-single/ |
| Ecom flow | Multi-step checkout flow | examples/v0.1.0/ecom-flow/ |
| Forbidden page template | Page-level no-automation (minimal valid surface) | examples/v0.1.0/forbidden-page-template/ — aom-policy.forbidden.page.aom.json |
Demo agents (Python + Node) that consume surfaces and produce conformant outputs live in examples/v0.1.0/demo-agents/.
See examples/v0.1.0/ for details and validation instructions.
- aom.py (Python) and aom.mjs (Node): validate input/output/site/all, create-outputs, demo run/test.
- aom-input-schema.json, aom-output-schema.json, and site-policy-schema.json.
- *.output.json from *.aom.json surfaces.

Quick start: run python aom.py --help or node aom.mjs --help from the repo root. Full reference: COMMANDS.md and tools/README.md.
forbidden / allowed / open).

See CHANGELOG.md for release history.
This repo is the reference implementation for the AOM spec. Contributions are welcome, especially:
How to contribute:
See CONTRIBUTING.md and spec/v0.1.0/README.md for versioning and compatibility guidance.
MIT License — see LICENSE.md. You may use, modify, and redistribute in commercial and non-commercial projects.
allowed with guardrails vs open), support for multi-step flows.

Agent Object Model v0.1.0 white paper (same link as in the introduction above): motivation, architecture, Input/Output AOM, automation policy at a glance (with figure), and adoption.