AI Retrieval
The problem AIEP solves for AI systems
AI models retrieving content from the web face a fundamental problem: the web was not built for machine verification.
A model asked about a company’s compliance status, a product’s specification, or a certificate’s validity must:
- Find a webpage that appears relevant
- Parse natural language
- Guess at the meaning, currency, and authority of the content
- Return an answer with no verifiable chain of evidence
This is the origin of hallucination and stale-data errors — not model failure, but source infrastructure failure.
How AIEP-backed retrieval works
When a publisher implements an AIEP Mirror, an AI agent can retrieve and verify their published artefacts without interpretation.
Scenario: An AI agent needs to verify that acmecorp.com holds a current AIEP compliance certificate.
| Step | Action | Detail |
|---|---|---|
| 1 | Discover the Mirror | GET /.well-known/aiep/index.json — returns artefact paths, types, hashes |
| 2 | Locate the certificate | Index entry: { "path": "...", "type": "AIEP_CERTIFICATE", "hash_sha256": "e3b0c44..." } |
| 3 | Retrieve the certificate | GET the path given in the index entry |
| 4 | Validate structure | Run against aiep.certificate.schema.v1.json → VALID |
| 5 | Verify integrity | SHA-256(certificate) == hash in index → MATCH |
| 6 | Resolve issuer | certificate.issuer_id → AIEP registry → registered issuer, active |
| 7 | Confirm result | Certificate valid, current, issued by registered authority, content unaltered |
The agent never interpreted natural language. Every step was mechanical and verifiable.
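The integrity step (5) can be sketched in a few lines. This is a minimal illustration, not the normative AIEP client: the field names (`path`, `type`, `hash_sha256`) follow the index entry shown above, while the certificate path and content are invented placeholder values.

```python
import hashlib

def verify_artefact(index_entry: dict, artefact_bytes: bytes) -> bool:
    """Step 5: recompute SHA-256 over the fetched bytes and compare it
    with the hash the Mirror index published for that artefact."""
    digest = hashlib.sha256(artefact_bytes).hexdigest()
    return digest == index_entry["hash_sha256"]

# Hypothetical index entry, as a Mirror might publish it.
cert_bytes = b'{"status": "valid"}'
entry = {
    "path": "/.well-known/aiep/certificates/cert-001.json",  # illustrative path
    "type": "AIEP_CERTIFICATE",
    "hash_sha256": hashlib.sha256(cert_bytes).hexdigest(),
}

print(verify_artefact(entry, cert_bytes))              # True
print(verify_artefact(entry, b'{"status": "FORGED"}'))  # False
```

Any single altered byte changes the digest, so tampering between publication and retrieval is mechanically detectable.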
Without AIEP vs with AIEP
| | Without AIEP | With AIEP |
|---|---|---|
| Data source | Guess from HTML | Read from structured JSON |
| Integrity check | None | Hash verification at every step |
| Issuer identity | Unknown | Registry-linked |
| Currency | May be stale | issued_at timestamp on every artefact |
| Interpretation required | Yes | No — schema validation |
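The currency row above implies a freshness check on the issued_at timestamp. A sketch of one possible policy follows; the 365-day window and the function name are assumptions, not anything AIEP mandates.

```python
from datetime import datetime, timedelta, timezone

def is_current(issued_at_iso: str, max_age_days: int = 365) -> bool:
    """Accept an artefact only if its issued_at timestamp falls within a
    chosen freshness window. The window is consumer policy, not protocol."""
    issued = datetime.fromisoformat(issued_at_iso.replace("Z", "+00:00"))
    return datetime.now(timezone.utc) - issued <= timedelta(days=max_age_days)

recent = (datetime.now(timezone.utc) - timedelta(days=1)).isoformat()
stale = (datetime.now(timezone.utc) - timedelta(days=1000)).isoformat()
print(is_current(recent), is_current(stale))  # True False
```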
Training data
AIEP Mirrors are a high-quality signal for training data pipelines. Artefacts are structured, versioned, attributed, and integrity-checked — no parsing required.
See: Training data · Mirror · Schema Catalogue
AIEP exists because AI systems are increasingly used to find information. Search is being replaced by conversation. But conversation is only safe when it can anchor itself to reliable knowledge.
AIEP enables a new pattern: evidence-backed knowledge retrieval.
What retrieval means in AIEP
An AI system does not start by ranking pages. It starts by discovering a publisher’s machine interface. It retrieves artefacts from the source, then validates structure, integrity, and policy signals.
A typical AIEP retrieval sequence looks like this:
- Discover /.well-known/aiep/index.json
- Read /.well-known/aiep/metadata.json
- Follow surfaces to indexes, schemas, ledgers, and artefacts
- Validate artefacts against schemas
- Check hashes where available
- Separate consensus from outliers
- Synthesise an answer with evidence references
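The discover-fetch-verify core of that sequence can be sketched as follows. The Mirror is stood in by an in-memory dict so the example is self-contained; the artefact fields and index layout are illustrative assumptions, not the published AIEP schemas.

```python
import hashlib
import json

# Illustrative artefact and index, hash computed at "publish" time.
metadata = json.dumps({"publisher": "acmecorp.com",
                       "issued_at": "2025-01-01T00:00:00Z"})
index = {"artefacts": [{
    "path": "/.well-known/aiep/metadata.json",
    "type": "AIEP_METADATA",
    "hash_sha256": hashlib.sha256(metadata.encode()).hexdigest(),
}]}
mirror = {
    "/.well-known/aiep/index.json": json.dumps(index),
    "/.well-known/aiep/metadata.json": metadata,
}

def fetch(path: str) -> str:
    # Stand-in for an HTTPS GET against the publisher's Mirror.
    return mirror[path]

def retrieve_verified(index_path: str) -> list:
    """Walk the Mirror index, fetch each listed artefact, and keep only
    those whose bytes match the published SHA-256."""
    idx = json.loads(fetch(index_path))
    verified = []
    for entry in idx["artefacts"]:
        body = fetch(entry["path"])
        if hashlib.sha256(body.encode()).hexdigest() == entry["hash_sha256"]:
            verified.append({"entry": entry, "artefact": json.loads(body)})
    return verified

results = retrieve_verified("/.well-known/aiep/index.json")
print(len(results))  # 1
```

Schema validation and policy checks would slot in after the hash comparison; only artefacts that pass every mechanical step reach the synthesis stage with evidence references attached.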
Why this improves safety
The core risk with AI retrieval today is that models improvise around missing ground truth. AIEP reduces that risk by giving models a predictable way to retrieve supporting artefacts from publishers who choose to publish them.
AIEP does not remove judgment. It improves the quality of what judgment is based on.
Dissent and plausibility
AIEP treats dissent as a structural feature. Retrieval can intentionally surface:
- the consensus view
- competing interpretations
- outliers and radical outliers
This supports scientific discovery and prevents premature collapse into a single narrative.
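One simple way a retrieval layer might surface both views is frequency-based partitioning of retrieved claims. This is a sketch of one possible policy, not an AIEP-defined algorithm; the function name and the notion of "claim" as a plain string are assumptions for illustration.

```python
from collections import Counter

def partition_claims(claims: list) -> dict:
    """Split retrieved claims into a consensus view and outliers by
    simple frequency. Real systems would weight by issuer, recency,
    and evidence quality rather than raw counts."""
    counts = Counter(claims)
    consensus, _ = counts.most_common(1)[0]
    return {
        "consensus": consensus,
        "outliers": sorted(c for c in counts if c != consensus),
    }

print(partition_claims(["A", "A", "A", "B", "C"]))
# {'consensus': 'A', 'outliers': ['B', 'C']}
```

Keeping the outliers in the result, rather than discarding them, is what lets downstream synthesis present dissent explicitly instead of collapsing into a single narrative.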
The goal
AIEP aims to make it normal for AI systems to rely on published evidence rather than probabilistic guesswork.
The future of information retrieval is not search — it is evidence-backed knowledge retrieval.