
P146 — AIEP — Multi-Modal Evidence Normalisation Protocol

Publication Date: 2026-03-27
Status: Open Source Prior Art Disclosure
Licence: Apache License 2.0
Author/Organisation: Phatfella Ltd
Schema: AIEP_OS_SPEC_TEMPLATE v1.0.1 — https://aiep.dev/schemas/aiep-os-spec-template/v1.0.1


Framework Context

[0001] This disclosure operates within an Architected Instruction and Evidence Protocol (AIEP) environment as defined in United Kingdom patent application number GB2519711.2, filed 20 November 2025, the entire contents of which are incorporated herein by reference.

[0002] The present disclosure extends the Deterministic Normalisation Engine defined in P10 to encompass non-textual evidence modalities, defining governed normalisation pipelines for structured tables, code blocks, image metadata records, sensor readings, and audio transcript artefacts such that all evidence artefact types reach the AIEP evidence rail in a canonical, hash-verifiable, jurisdiction-tagged form.


Field of the Disclosure

[0003] This disclosure relates to multi-modal evidence normalisation protocols for governed artificial intelligence reasoning systems.

[0004] More particularly, the disclosure concerns a modal dispatch layer that classifies incoming evidence artefacts by modality type and routes each to a modality-specific normalisation pipeline, with all output artefacts conforming to the EvidenceNode schema defined in P133 regardless of input modality.


Background

[0005] The AIEP core normalisation engine (P10) defines a deterministic normalisation pipeline for textual evidence artefacts: extracting source URL, retrieval timestamp, content hash, jurisdiction, and classification tags. This pipeline is well-suited to HTML pages, PDF documents, and plain-text sources, but does not address the increasing proportion of AI-relevant evidence arriving as structured tables, executable code, image content, sensor data streams, or audio recordings.

[0006] Non-textual evidence modalities present distinct normalisation challenges. Structured tables require row and column extraction, schema inference, and numeric value canonicalisation. Code blocks require language detection, AST extraction, and executable-content flagging. Image evidence requires metadata extraction from EXIF and embedded descriptors rather than visual content analysis. Sensor readings require unit normalisation and temporal alignment with the evidence timeline.

[0007] Without a modal dispatch layer and modality-specific normalisation pipelines, AIEP-governed reasoning systems either reject non-textual evidence — limiting their evidence base — or ingest it without proper canonicalisation, producing evidence artefacts that cannot be deterministically verified by other nodes.


Summary of the Disclosure

[0008] A ModalityClassifier inspects each incoming evidence artefact and assigns a modality field from the enumeration TEXT, TABLE, CODE, IMAGE_METADATA, SENSOR, and AUDIO_TRANSCRIPT. The classifier operates on the HTTP Content-Type header, the file extension, and a lightweight structural heuristic applied to the content body. Artefacts that cannot be confidently classified are assigned MIXED and processed by the text pipeline with a mixed_modality_warning flag appended.
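The dispatch logic of [0008] can be sketched as follows. This is a minimal illustration: the Content-Type and extension maps here are assumptions standing in for whatever registry a conforming implementation would ship, and the structural probe shown (treating a JSON array of flat objects as tabular) is one example heuristic, not a normative rule.

```python
import json

# Illustrative maps; a real implementation would carry a fuller registry.
CONTENT_TYPE_MAP = {
    "text/csv": "TABLE",
    "text/html": "TEXT",
    "image/jpeg": "IMAGE_METADATA",
    "image/png": "IMAGE_METADATA",
    "audio/mpeg": "AUDIO_TRANSCRIPT",
}
EXTENSION_MAP = {
    ".csv": "TABLE", ".py": "CODE", ".js": "CODE",
    ".jpg": "IMAGE_METADATA", ".wav": "AUDIO_TRANSCRIPT",
}

def classify(content_type, filename, body):
    """Return a modality label for an incoming evidence artefact."""
    # 1. HTTP Content-Type header takes precedence.
    if content_type in CONTENT_TYPE_MAP:
        return CONTENT_TYPE_MAP[content_type]
    # 2. File extension.
    for ext, modality in EXTENSION_MAP.items():
        if filename.endswith(ext):
            return modality
    # 3. Lightweight structural probe: a JSON array of flat objects
    #    is treated as tabular content.
    try:
        parsed = json.loads(body)
        if isinstance(parsed, list) and parsed and all(
                isinstance(row, dict) for row in parsed):
            return "TABLE"
    except (ValueError, TypeError):
        pass
    # 4. Non-empty unrecognised content falls through to TEXT;
    #    otherwise MIXED (routed to the text pipeline with a warning).
    return "TEXT" if body.strip() else "MIXED"
```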

[0009] Text Pipeline (P10): Unchanged. Processes TEXT and MIXED artefacts through the existing deterministic normalisation engine. All other pipelines produce output that extends the EvidenceNode with a modality field and modal_metadata object.

[0010] Table Pipeline: Extracts rows, columns, and a detected schema from structured table content (HTML <table>, CSV, or JSON array). Computes content_hash over the canonical JSON serialisation of the extracted table (rows sorted, column names lowercased, numeric values rounded to six significant figures). Adds modal_metadata.row_count, modal_metadata.column_names, and modal_metadata.detected_schema to the EvidenceNode.
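The table canonicalisation rules of [0010] (rows sorted, column names lowercased, numeric values rounded to six significant figures, SHA-256 over the canonical JSON) can be sketched for a table already extracted as a list of row dictionaries; the row-sort key used here is an assumption, since the disclosure does not fix one:

```python
import hashlib
import json

def _round_sig(x, sig=6):
    # Round a numeric value to `sig` significant figures.
    return float(f"{x:.{sig}g}")

def canonical_table_hash(rows):
    """SHA-256 over the canonical JSON serialisation of an extracted table."""
    canon = []
    for row in rows:
        canon.append({
            k.lower(): (_round_sig(v)
                        if isinstance(v, (int, float)) and not isinstance(v, bool)
                        else v)
            for k, v in row.items()
        })
    # Sort rows by their own canonical serialisation (assumed sort key).
    canon.sort(key=lambda r: json.dumps(r, sort_keys=True))
    payload = json.dumps(canon, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Under these rules, two tables that differ only in column-name case, row order, or sub-significant numeric noise hash identically, which is what the deduplication behaviour in [0017] relies on.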

[0011] Code Pipeline: Detects programming language via extension and content heuristics. Extracts a normalised abstract syntax tree (AST) schema where supported; falls back to line-count and top-level structure summary where full AST extraction is not available. Sets modal_metadata.language, modal_metadata.line_count, and modal_metadata.executable_content = true to flag that content may be runnable, requiring governance review before use in automated reasoning chains.
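A minimal sketch of the code pipeline metadata in [0011], with AST extraction shown for Python only (via the standard-library ast module) and the line-count fallback for everything else; the extension-to-language map is illustrative, not normative:

```python
import ast

EXT_LANG = {".py": "python", ".js": "javascript", ".sh": "shell"}

def normalise_code(filename, source):
    """Produce modal_metadata for a CODE artefact."""
    language = next((lang for ext, lang in EXT_LANG.items()
                     if filename.endswith(ext)), "unknown")
    meta = {
        "language": language,
        "line_count": len(source.splitlines()),
        # Governance hook: code artefacts are always flagged as
        # potentially runnable, per the disclosure.
        "executable_content": True,
    }
    if language == "python":
        try:
            tree = ast.parse(source)
            # Top-level structure summary: node types of module body.
            meta["top_level"] = [type(node).__name__ for node in tree.body]
        except SyntaxError:
            meta["top_level"] = None  # fall back when no AST is available
    return meta
```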

[0012] Image Metadata Pipeline: Extracts structured metadata from EXIF, IPTC, and XMP headers embedded in image files. Does NOT process visual content. Computes content_hash over the canonical JSON serialisation of extracted metadata fields. Sets modal_metadata.image_format, modal_metadata.capture_timestamp (if present), modal_metadata.gps_coordinates (if present, rounded to four decimal places), and modal_metadata.camera_make_model (if present).
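The canonicalisation step of [0012] can be sketched on fields already extracted from EXIF/IPTC/XMP headers (the extraction itself would use an image library and is out of scope here): GPS coordinates are rounded to four decimal places and the content hash is computed over the canonical JSON serialisation of the metadata, never over pixel data.

```python
import hashlib
import json

def canonical_image_metadata(exif_fields):
    """Canonicalise extracted image metadata and hash its JSON form."""
    meta = dict(exif_fields)
    if "gps_coordinates" in meta:
        lat, lon = meta["gps_coordinates"]
        # Four decimal places, per the disclosure (~11 m precision).
        meta["gps_coordinates"] = [round(lat, 4), round(lon, 4)]
    payload = json.dumps(meta, sort_keys=True, separators=(",", ":"))
    return meta, hashlib.sha256(payload.encode("utf-8")).hexdigest()
```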

[0013] Sensor Pipeline: Normalises time-series sensor readings to a canonical unit system per the AIEP sensor unit registry (SI units by default, with declared deviations for domain-specific units). Computes content_hash over the canonical JSON serialisation of the normalised reading array. Sets modal_metadata.sensor_type, modal_metadata.unit, modal_metadata.sample_count, and modal_metadata.temporal_range.
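The sensor normalisation of [0013] can be sketched with a toy fragment of the unit registry; the registry entries and the six-decimal rounding applied after conversion are assumptions for illustration, with SI canonical units as the disclosure specifies:

```python
import hashlib
import json

# Hypothetical fragment of the AIEP sensor unit registry:
# source unit -> (canonical SI unit, conversion function).
UNIT_REGISTRY = {
    "celsius": ("kelvin", lambda v: v + 273.15),
    "kelvin": ("kelvin", lambda v: v),
    "mph": ("m/s", lambda v: v * 0.44704),
    "m/s": ("m/s", lambda v: v),
}

def normalise_readings(sensor_type, unit, samples):
    """Normalise [(timestamp, value), ...] and hash the canonical array."""
    canonical_unit, convert = UNIT_REGISTRY[unit]
    normalised = [[t, round(convert(v), 6)] for t, v in samples]
    payload = json.dumps(normalised, separators=(",", ":"))
    return {
        "sensor_type": sensor_type,
        "unit": canonical_unit,
        "sample_count": len(normalised),
        "temporal_range": [normalised[0][0], normalised[-1][0]],
        "content_hash": hashlib.sha256(payload.encode("utf-8")).hexdigest(),
    }
```

Note that the same physical readings reported in different source units converge on the same content hash after normalisation.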

[0014] Audio Transcript Pipeline: Accepts audio artefacts for which a text transcript has been produced by a separate transcription process. Treats the transcript as a TEXT artefact processed through the P10 pipeline. Adds modal_metadata.audio_source_url, modal_metadata.transcript_confidence, and modal_metadata.language to the EvidenceNode. The audio file itself is not stored as an evidence artefact; only the verified transcript enters the evidence rail.
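The transcript wrapping of [0014] reduces to routing the transcript through the text pipeline and attaching the audio-specific metadata; here the P10 engine is stood in for by a callable argument, which is purely an assumption of this sketch:

```python
def wrap_transcript(transcript_text, audio_url, confidence, language,
                    text_pipeline):
    """Admit a transcript (not the audio file) to the evidence rail."""
    # The transcript is processed as a TEXT artefact by the P10 engine
    # (represented here by the injected text_pipeline callable).
    node = text_pipeline(transcript_text)
    node["modality"] = "AUDIO_TRANSCRIPT"
    node["modal_metadata"] = {
        "audio_source_url": audio_url,   # provenance reference only
        "transcript_confidence": confidence,
        "language": language,
    }
    return node
```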

[0015] All pipeline outputs are validated against the EvidenceNode schema (P133) before admission to the evidence rail. Any EvidenceNode failing schema validation is rejected with a NormalisationRejectionRecord identifying the modality, failure field, and rejection reason.
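The validation gate and NormalisationRejectionRecord of [0015] can be sketched as below; the required-field list is an assumption standing in for the full P133 EvidenceNode schema, and the rejection record reports the first missing field in sorted order for determinism:

```python
# Assumed stand-in for the P133 EvidenceNode required fields.
REQUIRED_FIELDS = {"source_url", "retrieval_timestamp", "content_hash",
                   "jurisdiction", "modality", "modal_metadata"}

def validate_evidence_node(node):
    """Admit a node, or return a NormalisationRejectionRecord-style dict."""
    missing = REQUIRED_FIELDS - node.keys()
    if missing:
        return {
            "admitted": False,
            "modality": node.get("modality", "UNKNOWN"),
            "failure_field": sorted(missing)[0],
            "rejection_reason": "missing required field",
        }
    return {"admitted": True}
```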


Technical Effect

[0016] Modal dispatch with modality-specific pipelines enables AIEP-governed reasoning systems to consume structured table, code, image metadata, sensor, and audio transcript evidence without discarding non-textual artefacts or admitting them in un-normalised form.

[0017] Content hash computation over canonical serialisations of each modality — rather than over raw content bytes — ensures that semantically equivalent artefacts that differ in whitespace, byte order, or encoding are detected as duplicates by the Semantic Evidence Deduplication Protocol (P148), while semantically distinct artefacts produce distinct hashes.

[0018] The executable content flag on code artefacts provides a governance hook for systems that require review before executable evidence is used in automated reasoning chains, without prohibiting code evidence from entering the evidence rail.

[0019] Publishing the multi-modal normalisation pipelines as open prior art prevents any party from claiming proprietary rights over the modal dispatch or modality-specific canonicalisation layer, ensuring interoperability across independent AIEP node implementations.


Claims

  1. A multi-modal evidence normalisation protocol for a governed AI reasoning system, the protocol comprising: a ModalityClassifier assigning an incoming evidence artefact to a modality from an enumeration including TEXT, TABLE, CODE, IMAGE_METADATA, SENSOR, and AUDIO_TRANSCRIPT; a set of modality-specific normalisation pipelines each computing a content hash over a canonical serialisation of the extracted modal content; and a schema validation gate confirming that each pipeline output conforms to the EvidenceNode schema before evidence rail admission.

  2. The protocol of claim 1, wherein the TABLE pipeline computes a content hash over a canonical JSON serialisation of the extracted table with rows sorted, column names lowercased, and numeric values rounded to six significant figures, producing hash-equivalent output for semantically identical tables regardless of source encoding differences.

  3. The protocol of claim 1, wherein the CODE pipeline sets an executable_content flag on artefacts identified as runnable code, providing a governance hook for systems requiring human or automated review before executable evidence is admitted to active reasoning chains.

  4. The protocol of claim 1, wherein the IMAGE_METADATA pipeline extracts EXIF, IPTC, and XMP metadata from image files without processing visual content, and computes a content hash over the canonical JSON serialisation of extracted metadata fields, enabling image-derived evidence to enter the evidence rail without visual content analysis.

  5. The protocol of claim 1, wherein the AUDIO_TRANSCRIPT pipeline admits the produced text transcript to the evidence rail rather than the audio file itself, attaching transcript confidence and language metadata, and wherein the audio source URL is preserved as a provenance reference in the EvidenceNode.


Brief Description of the Drawing

FIG. 1 — Modal dispatch flow: incoming artefact → ModalityClassifier → modality-specific pipeline → EvidenceNode schema validation → evidence rail.

FIG. 2 — EvidenceNode schema extension showing modality field and modal_metadata object alongside base EvidenceNode fields from P133.

FIG. 3 — Table pipeline canonical serialisation process: raw table input → row extraction → schema inference → canonical JSON → SHA-256 hash.


Abstract

A multi-modal evidence normalisation protocol extends the AIEP deterministic normalisation engine to non-textual evidence modalities. A ModalityClassifier assigns each incoming artefact to a modality type: TEXT, TABLE, CODE, IMAGE_METADATA, SENSOR, or AUDIO_TRANSCRIPT. Modality-specific normalisation pipelines extract structured content, compute content hashes over canonical serialisations, and produce EvidenceNode-conformant artefacts regardless of input modality. The CODE pipeline flags executable content for governance review. The IMAGE_METADATA pipeline operates on embedded metadata only. A schema validation gate rejects artefacts that do not conform to the EvidenceNode schema. All pipeline implementations are published as open prior art.


Licence

Apache License 2.0 — https://www.apache.org/licenses/LICENSE-2.0

Copyright 2026 Phatfella Ltd. Licensed under the Apache License, Version 2.0. You may use this specification in compliance with the Licence.

Dependencies