P63 — AIEP — Content Hash Binding Protocol
Publication Date: 2026-02-26
Status: Open Source Prior Art Disclosure
Licence: Apache License 2.0
Author/Organisation: Phatfella Ltd
Schema: AIEP_OS_SPEC_TEMPLATE v1.0.1 (https://aiep.dev/schemas/aiep-os-spec-template/v1.0.1)
Field of the Invention
[0001] The disclosure relates to deterministic canonicalisation and cryptographic hash binding of web content artefacts.
[0002] More particularly, the disclosure concerns a content hash binding protocol defining canonical serialisation rules for JSON-structured web artefacts and hash algorithm requirements, enabling reproducible content-addressable identity across distributed systems.
Framework Context
[0003] This invention operates within an Architected Instruction and Evidence Protocol (AIEP) environment as defined in United Kingdom patent application number GB2519711.2, filed 20 November 2025, the entire contents of which are incorporated herein by reference.
[0004] The present invention extends deterministic canonicalisation, governance, and execution integrity mechanisms defined in the AIEP environment while remaining independently implementable as described herein.
Background
[0005] JSON serialisation is not deterministic by default. Different implementations may produce different byte sequences for semantically identical objects depending on key ordering, whitespace handling, number formatting, and encoding.
[0006] Non-deterministic serialisation produces different hash values for semantically identical content, preventing reproducible content-addressable identity and breaking cross-node hash equivalence verification.
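The divergence described in [0005] and [0006] can be reproduced with a few lines of Python. The sketch below is illustrative only: the sample object and all variable names are assumptions, not part of the disclosure. It serialises one object two ways with the standard library and hashes each byte sequence.

```python
import hashlib
import json

# One logical object, serialised twice with different key order and whitespace.
doc = {"b": 1, "a": [1.0, 2]}
compact = json.dumps(doc, separators=(",", ":"))
pretty = json.dumps(doc, indent=2, sort_keys=True)

hash_compact = hashlib.sha256(compact.encode("utf-8")).hexdigest()
hash_pretty = hashlib.sha256(pretty.encode("utf-8")).hexdigest()

# Both texts parse back to the same object, yet their hashes differ.
same_object = json.loads(compact) == json.loads(pretty)
same_hash = hash_compact == hash_pretty
print(same_object, same_hash)
```

Both serialisations are valid JSON for the same object, so the difference in hash values arises purely from serialisation choices.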
[0007] Existing JSON canonicalisation standards address key ordering but do not combine number formatting, whitespace handling, and hash algorithm versioning into a single unified protocol for AIEP web artefacts.
[0008] There exists a need for a unified content hash binding protocol defining all canonicalisation rules and hash algorithm requirements as a versioned specification.
Summary of the Disclosure
[0011] A content hash binding protocol is disclosed defining canonicalisation rules and hash algorithm requirements for AIEP web artefacts.
[0012] Canonicalisation rules are:
(a) UTF-8 encoding for all string values;
(b) alphabetical sorting of JSON object keys at all nesting levels;
(c) removal of all insignificant whitespace between tokens; and
(d) deterministic number formatting using minimal decimal representation without trailing zeros.
[0013] The default hash algorithm is SHA-256. The resulting hash is encoded as sha256:<hex-digest>.
[0014] Future hash algorithms must be declared as versioned entries in the AIEP manifest schema, enabling algorithm agility without breaking existing implementations.
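One way a consumer might use the algorithm prefix for algorithm agility is to split a ContentHash into its algorithm label and digest before verification. The sketch below is illustrative: the function name parse_content_hash and the restriction to lowercase labels and lowercase hexadecimal digits are assumptions that the disclosure does not state.

```python
import re

# Assumed shape: "<algorithm-label>:<lowercase-hex-digest>".
_CONTENT_HASH = re.compile(r"([a-z0-9-]+):([0-9a-f]+)")


def parse_content_hash(value: str) -> tuple[str, str]:
    """Split a ContentHash string into (algorithm label, hex digest)."""
    match = _CONTENT_HASH.fullmatch(value)
    if match is None:
        raise ValueError(f"not a well-formed ContentHash: {value!r}")
    return match.group(1), match.group(2)


algorithm, digest = parse_content_hash("sha256:" + "ab" * 32)
```

A verifier can then look up the returned label among the declared algorithms and reject any label it does not recognise.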
[0015] A ContentHash is computed as:
ContentHash = H_algorithm(CanonicalSerialisation)
where H_algorithm is the declared hash function and CanonicalSerialisation is the byte sequence produced by applying all canonicalisation rules.
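The formula above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name content_hash and the HASHLIB_NAMES mapping are assumptions, and the label sha3-256 is taken from Figure 3 as an example of a declared algorithm.

```python
import hashlib

# Assumed mapping from manifest labels to Python hashlib algorithm names.
HASHLIB_NAMES = {"sha256": "sha256", "sha3-256": "sha3_256"}


def content_hash(canonical_serialisation: bytes, algorithm: str = "sha256") -> str:
    """Return '<algorithm>:<hex-digest>' for already-canonicalised bytes."""
    if algorithm not in HASHLIB_NAMES:
        raise ValueError(f"undeclared hash algorithm: {algorithm!r}")
    digest = hashlib.new(HASHLIB_NAMES[algorithm], canonical_serialisation).hexdigest()
    return f"{algorithm}:{digest}"


print(content_hash(b'{"a":1}'))
```

The function deliberately accepts bytes rather than a parsed object, so that canonicalisation remains a separate, mandatory step before hashing.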
[0016] Any system computing a hash over an AIEP web artefact without applying all canonicalisation rules produces a non-conforming hash, which must not be used for admissibility determination or cross-node equivalence verification.
[0017] The technical effect is the provision of a deterministic, reproducible content-addressable identity mechanism for AIEP web artefacts, enabling cross-node hash equivalence verification.
Brief Description of the Drawings
[0018] Figure 1 illustrates the canonicalisation pipeline from raw JSON to canonical byte sequence.
[0019] Figure 2 illustrates the hash computation step and output format.
[0020] Figure 3 illustrates algorithm versioning in the manifest schema.
[0021] Figure 4 illustrates cross-node equivalence verification using ContentHash.
ASCII Drawings
Figure 1 — Canonicalisation Pipeline
  Raw JSON Object
         |
         v
+------------------+
| Step 1:          |
| UTF-8 encode all |
| string values    |
+--------+---------+
         |
         v
+--------+---------+
| Step 2:          |
| Sort all object  |
| keys             |
| alphabetically   |
| at all levels    |
+--------+---------+
         |
         v
+--------+---------+
| Step 3:          |
| Remove all       |
| insignificant    |
| whitespace       |
+--------+---------+
         |
         v
+--------+---------+
| Step 4:          |
| Deterministic    |
| number format:   |
| minimal decimal, |
| no trailing zeros|
+--------+---------+
         |
         v
CanonicalSerialisation
(deterministic byte seq)
Figure 2 — Hash Computation and Output Format
CanonicalSerialisation
         |
         v
+------------------+
| SHA-256 (default)|
| or declared alg  |
+--------+---------+
         |
         v
ContentHash = "sha256:<hex-digest>"
Figure 3 — Algorithm Versioning in Manifest Schema
+------------------------------------------+
| AIEP Manifest Schema                     |
|------------------------------------------|
| hash_algorithms: {                       |
|   "sha256": {                            |
|     "status": "active",                  |
|     "version": "1"                       |
|   },                                     |
|   "sha3-256": {                          |
|     "status": "declared",                |
|     "version": "2"                       |
|   }                                      |
| }                                        |
+------------------------------------------+
New algorithms added as versioned entries.
Existing algorithm entries remain immutable.
Figure 4 — Cross-Node Equivalence Verification
      Node A                          Node B
+------------------+            +------------------+
| Raw JSON (same)  |            | Raw JSON (same)  |
+--------+---------+            +--------+---------+
         |                               |
         v                               v
    Canonicalise                    Canonicalise
 (identical rules)               (identical rules)
         |                               |
         v                               v
   ContentHash_A                   ContentHash_B
         |                               |
         +---------------+---------------+
                         |
                         v
          ContentHash_A == ContentHash_B
              (equivalence verified)
Detailed Description
1. Canonicalisation Rules
[0022] All AIEP web artefact JSON must be canonicalised before hashing using the following rules applied in order:
(a) all string values encoded in UTF-8;
(b) all object keys sorted alphabetically at all nesting levels;
(c) all insignificant whitespace removed between tokens; and
(d) all numbers formatted using minimal decimal representation without trailing zeros.
[0023] These rules produce a deterministic byte sequence for any semantically identical JSON object regardless of the implementation producing it.
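A minimal Python sketch of the four rules follows. It is not a normative implementation, and several choices in it are assumptions that the disclosure does not fix: keys are ordered by Unicode code point, numbers are written in positional notation without an exponent, negative zero is written as 0, and string escaping follows Python's json.dumps. The names canonicalise, _emit, and _number are hypothetical.

```python
import json
import math
from decimal import Decimal


def _number(value) -> str:
    """Rule (d): minimal decimal form, no trailing zeros (no exponent: assumed)."""
    if isinstance(value, int):
        return str(value)
    if not math.isfinite(value):
        raise ValueError("NaN and infinities are not valid JSON numbers")
    dec = Decimal(repr(value)).normalize()  # normalize() strips trailing zeros
    return "0" if dec == 0 else format(dec, "f")


def _emit(value) -> str:
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before int: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return _number(value)
    if isinstance(value, str):
        return json.dumps(value, ensure_ascii=False)
    if isinstance(value, list):
        return "[" + ",".join(_emit(item) for item in value) + "]"
    if isinstance(value, dict):
        # Rule (b): sorted() orders str keys by code point (assumed ordering).
        # Rule (c): no whitespace is emitted between tokens.
        members = (
            json.dumps(key, ensure_ascii=False) + ":" + _emit(value[key])
            for key in sorted(value)
        )
        return "{" + ",".join(members) + "}"
    raise TypeError(f"not a JSON value: {type(value).__name__}")


def canonicalise(value) -> bytes:
    """Rule (a): the canonical text is encoded as UTF-8 bytes."""
    return _emit(value).encode("utf-8")


raw = '{ "b": 1.50, "a": [1.0, 2, true, null], "c": {"z": "é", "y": 100.0} }'
print(canonicalise(json.loads(raw)).decode("utf-8"))
```

The input is expected to be a value parsed from JSON, so object keys are always strings. An implementation that must interoperate across languages would also need to agree on the exact key ordering and string escaping, which this sketch only assumes.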
2. Hash Algorithm
[0024] The default hash algorithm is SHA-256. The ContentHash is formatted as sha256:<hex-digest>.
[0025] Future hash algorithms must be declared as versioned entries in the AIEP manifest schema before use. Algorithm declarations are append-only and existing entries are immutable.
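The append-only and immutability requirements can be illustrated with a small in-memory registry. This is a sketch under stated assumptions: the class name AlgorithmRegistry and its methods are hypothetical, and the entry fields version and status are borrowed from Figure 3. A real manifest would be a persisted, versioned document rather than a Python object.

```python
from types import MappingProxyType


class AlgorithmRegistry:
    """Append-only registry: entries may be added but never changed or removed."""

    def __init__(self):
        self._entries = {}

    def declare(self, name: str, version: str, status: str) -> None:
        if name in self._entries:
            raise ValueError(f"algorithm {name!r} is already declared and immutable")
        # MappingProxyType exposes a read-only view of the entry.
        self._entries[name] = MappingProxyType({"version": version, "status": status})

    def lookup(self, name: str):
        if name not in self._entries:
            raise KeyError(f"algorithm {name!r} has not been declared")
        return self._entries[name]


registry = AlgorithmRegistry()
registry.declare("sha256", version="1", status="active")
registry.declare("sha3-256", version="2", status="declared")
```

Rejecting redeclaration, rather than overwriting, ensures that a hash produced under an earlier declaration keeps the meaning it had when it was computed.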
3. Non-Conforming Hashes
[0026] Any hash computed over an AIEP web artefact without applying all four canonicalisation rules is non-conforming and must not be used for admissibility determination or cross-node equivalence verification.
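A receiving node can detect a non-conforming hash by recomputing the ContentHash from the artefact and comparing it with the claimed value. The sketch below is illustrative: is_conforming and canonical_bytes are hypothetical names, and canonical_bytes is a simplified stand-in that covers rules (a) to (c) but satisfies rule (d) only for integer values.

```python
import hashlib
import hmac
import json


def canonical_bytes(artefact) -> bytes:
    # Simplified stand-in: sorted keys, no whitespace, UTF-8 output.
    # Rule (d) is only honoured for integers here (assumption for brevity).
    text = json.dumps(artefact, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def is_conforming(artefact, claimed_hash: str) -> bool:
    """Recompute the SHA-256 ContentHash and compare it with the claimed value."""
    expected = "sha256:" + hashlib.sha256(canonical_bytes(artefact)).hexdigest()
    return hmac.compare_digest(expected.encode("utf-8"), claimed_hash.encode("utf-8"))


artefact = {"b": 2, "a": 1}
conforming = "sha256:" + hashlib.sha256(b'{"a":1,"b":2}').hexdigest()
non_conforming = "sha256:" + hashlib.sha256(b'{"b": 2, "a": 1}').hexdigest()
print(is_conforming(artefact, conforming), is_conforming(artefact, non_conforming))
```

The second hash was computed over a non-canonical serialisation of the same object, so recomputation exposes it as non-conforming even though the underlying content is identical.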
Claims
1. A content hash binding protocol for AIEP web artefacts comprising: (a) canonicalisation rules requiring UTF-8 encoding, alphabetical key sorting at all levels, removal of insignificant whitespace, and deterministic number formatting; (b) a default hash algorithm of SHA-256 with output formatted as sha256:<hex-digest>; (c) a ContentHash computed as H_algorithm(CanonicalSerialisation) where CanonicalSerialisation is the byte sequence produced by applying all canonicalisation rules; (d) a requirement that future hash algorithms be declared as versioned entries in the AIEP manifest schema before use; and (e) a prohibition on use of non-conforming hashes for admissibility determination or cross-node equivalence verification.
2. The protocol of claim 1 wherein algorithm version entries in the manifest schema are append-only and existing entries are immutable.
3. A computing system configured to compute ContentHash values conforming to the protocol of any of claims 1 to 2.
4. A non-transitory computer-readable medium storing instructions which, when executed, produce ContentHash values conforming to the protocol of any of claims 1 to 2.
Licence
Any person is granted a perpetual, irrevocable, worldwide, royalty-free licence to make, use, implement, modify, or distribute any system or method described in this disclosure for any purpose, without restriction, under the Apache License 2.0.
A copy of the Apache License 2.0 is available at https://www.apache.org/licenses/LICENSE-2.0
Abstract
A content hash binding protocol is disclosed defining canonicalisation rules and hash algorithm requirements for AIEP web artefacts. Canonicalisation requires UTF-8 encoding, alphabetical key sorting at all nesting levels, removal of insignificant whitespace, and deterministic number formatting. The default hash algorithm is SHA-256 producing output formatted as sha256:<hex-digest>. Future algorithms must be declared as versioned manifest schema entries before use. A ContentHash is computed as H_algorithm(CanonicalSerialisation). Non-conforming hashes computed without applying all canonicalisation rules must not be used for admissibility determination or cross-node equivalence verification.