
P164 — AIEP — Evidence Statistical Sampling and Validity Protocol

Publication Date: 2026-03-27
Status: Open Source Prior Art Disclosure
Licence: Apache License 2.0
Author/Organisation: Phatfella Ltd
Schema: AIEP_OS_SPEC_TEMPLATE v1.0.1 (https://aiep.dev/schemas/aiep-os-spec-template/v1.0.1)

Framework Context

[0001] This disclosure operates within an Architected Instruction and Evidence Protocol (AIEP) environment as defined in United Kingdom patent application number GB2519711.2, filed 20 November 2025, the entire contents of which are incorporated herein by reference.

[0002] The present disclosure defines a protocol for producing statistically valid samples of large AIEP evidence corpora for use in reasoning chains where full-corpus ingestion is impractical, comprising a SampleSpec schema, a StratifiedSampler generating representative samples across taxonomy dimensions (P160), a ValidityCertificate encoding confidence intervals and sampling methodology, and a governed interface through which reasoning chains may declare their use of sampled evidence and expose the sampling parameters in their reasoning outputs.

Field of the Disclosure

[0003] This disclosure relates to evidence statistical sampling and validity protocols for governed artificial intelligence systems that reason over large evidence corpora.

[0004] More particularly, the disclosure concerns stratified random sampling across evidence taxonomy dimensions, sample size calculation for configurable confidence levels and margins of error, sampling without replacement from the distributed evidence index (P133), a ValidityCertificate schema expressing the statistical validity bounds of a drawn sample, and integration with the AIEP reasoning output schema to allow downstream parties to evaluate the statistical basis of corpus-sampled reasoning conclusions.

Background

[0005] AIEP-governed reasoning systems may have access to corpora containing millions of evidence artefacts. Reasoning chains that must access the full corpus for a broad query — for example, to characterise the state of evidence on a large scientific topic — cannot always complete within acceptable latency bounds if every artefact must be retrieved and processed. A statistically valid sampling protocol enables reasoning systems to draw a representative subset of the corpus with known confidence bounds, reason over that subset, and transparently express the sampling parameters in their outputs.

[0006] Uniform random sampling is insufficient for heterogeneous evidence corpora: when a corpus is dominated by a single subject domain or language, a uniformly drawn subset can by chance under-represent, or omit entirely, minority domains and languages. Stratified sampling, which draws proportionally from defined strata, produces samples that reflect the composition of the full corpus across all stratification dimensions.

[0007] The statistical validity of a corpus-sampled reasoning conclusion depends on the sample size, the sampling methodology, the variability of the property being estimated across the corpus, and whether the sampling was with or without replacement. These parameters must be expressed in a machine-readable format attached to the reasoning output, enabling any consuming system to evaluate the statistical reliability of the conclusion.

Summary of the Disclosure

[0008] SampleSpec Schema: A SampleSpec defines the parameters of a sampling operation before it is executed:

corpus_query — a TaxonomyQuery (P160) or DEID list defining the target corpus
target_confidence — desired confidence level (e.g. 0.95 for 95%)
target_margin_of_error — desired margin of error as a proportion (e.g. 0.05 for ±5%)
stratification_dimensions — list of ClassificationVector dimensions (P160) to stratify by (e.g. ["subject_domain", "evidence_quality_tier"])
sampling_method — STRATIFIED_RANDOM (default) or SYSTEMATIC or CLUSTER
replacement — boolean; whether sampling is with replacement (default: false)
seed — integer random seed for reproducibility (null = non-reproducible)
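
The field list above can be sketched as a typed record. This is a minimal illustration only: the class layout, the use of a tuple for stratification_dimensions, the open typing of corpus_query, and the range checks are assumptions, not part of the schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional, Tuple


class SamplingMethod(Enum):
    STRATIFIED_RANDOM = "STRATIFIED_RANDOM"
    SYSTEMATIC = "SYSTEMATIC"
    CLUSTER = "CLUSTER"


@dataclass(frozen=True)
class SampleSpec:
    # TaxonomyQuery (P160) or a list of DEIDs; left untyped because the
    # concrete TaxonomyQuery type is defined elsewhere.
    corpus_query: Any
    target_confidence: float            # e.g. 0.95 for 95%
    target_margin_of_error: float       # e.g. 0.05 for +/-5%
    stratification_dimensions: Tuple[str, ...]
    sampling_method: SamplingMethod = SamplingMethod.STRATIFIED_RANDOM
    replacement: bool = False           # default per the schema
    seed: Optional[int] = None          # None means non-reproducible

    def __post_init__(self) -> None:
        # Assumed validation: both values are proportions strictly inside (0, 1).
        if not 0.0 < self.target_confidence < 1.0:
            raise ValueError("target_confidence must be in (0, 1)")
        if not 0.0 < self.target_margin_of_error < 1.0:
            raise ValueError("target_margin_of_error must be in (0, 1)")
```

Freezing the record keeps the spec immutable once issued, which fits its later embedding in the ValidityCertificate.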

[0009] Sample Size Calculation: Before drawing a sample, the SampleSizer computes the required sample size n using the Cochran formula:

n₀ = (Z² × p × (1 - p)) / e²
n  = n₀ / (1 + ((n₀ - 1) / N))

where Z is the two-sided Z-score corresponding to target_confidence (approximately 1.96 for 95%), p = 0.5 (the value that maximises variance, used when the population proportion is unknown), e = target_margin_of_error, and N is the total size of the target corpus. The computed n, rounded up to the next integer, is the minimum required sample size. The SampleSizer may round up further to the nearest multiple of the number of strata to ease allocation across strata.
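
As a worked illustration of the formula above, the following sketch computes n for a given confidence level, margin of error, and corpus size. The function names are hypothetical, and deriving Z from the standard normal inverse CDF (rather than a lookup table) is an assumption.

```python
import math
from statistics import NormalDist


def z_score(confidence: float) -> float:
    """Two-sided Z-score for a confidence level (0.95 gives roughly 1.96)."""
    return NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)


def required_sample_size(confidence: float, margin_of_error: float,
                         corpus_size: int, p: float = 0.5) -> int:
    """Cochran sample size with finite population correction, rounded up."""
    z = z_score(confidence)
    n0 = (z ** 2) * p * (1.0 - p) / margin_of_error ** 2   # infinite-population size
    n = n0 / (1.0 + (n0 - 1.0) / corpus_size)              # finite population correction
    return math.ceil(n)


if __name__ == "__main__":
    for corpus_size in (500, 10_000, 1_000_000):
        print(corpus_size, required_sample_size(0.95, 0.05, corpus_size))
```

The corrected n grows slowly with N and approaches n₀ (about 384 at 95% confidence and ±5%) for very large corpora, which is why sampling stays cheap even at millions of artefacts.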

[0010] StratifiedSampler: The StratifiedSampler divides the target corpus into strata according to the stratification_dimensions specified in the SampleSpec. Each stratum corresponds to a unique combination of dimension values. Allocation across strata is proportional: the number of artefacts drawn from each stratum equals n × (stratum_size / N), rounded to the nearest integer, with any rounding remainder redistributed to the largest strata so that the allocations sum to n. Within each stratum, artefacts are selected by simple random sampling using the specified seed value (or a cryptographically random seed if seed is null).
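
One possible reading of the allocation and selection steps is sketched below. The function names, the round-half-up rule, the largest-first redistribution order, and sorting DEIDs before drawing are all assumptions made so that the example is deterministic; the disclosure does not fix these details.

```python
import math
import random
import secrets
from typing import Dict, List, Optional, Tuple


def allocate(stratum_sizes: Dict[str, int], n: int) -> Dict[str, int]:
    """Proportional allocation of n draws across strata."""
    total = sum(stratum_sizes.values())
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the corpus size")
    if n == 0:
        return {k: 0 for k in stratum_sizes}
    # Round half up; Python's built-in round() rounds halves to even.
    alloc = {k: math.floor(n * size / total + 0.5)
             for k, size in stratum_sizes.items()}
    largest_first = sorted(stratum_sizes, key=lambda k: (-stratum_sizes[k], k))
    diff = n - sum(alloc.values())
    # Hand the rounding remainder (positive or negative) to the largest strata.
    while diff != 0:
        step = 1 if diff > 0 else -1
        for k in largest_first:
            if diff == 0:
                break
            if 0 <= alloc[k] + step <= stratum_sizes[k]:
                alloc[k] += step
                diff -= step
    return alloc


def draw_stratified(strata: Dict[str, List[str]], n: int,
                    seed: Optional[int] = None
                    ) -> Tuple[Dict[str, List[str]], int]:
    """Simple random sample without replacement inside each stratum."""
    # A null seed is replaced by a cryptographically random one and reported back.
    seed_used = seed if seed is not None else secrets.randbits(64)
    rng = random.Random(seed_used)
    alloc = allocate({k: len(v) for k, v in strata.items()}, n)
    sample = {k: rng.sample(sorted(strata[k]), alloc[k]) for k in sorted(strata)}
    return sample, seed_used
```

Returning seed_used alongside the sample mirrors the seed_used field of the ValidityCertificate, so a generated seed is never lost.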

[0011] Sampling Without Replacement: When replacement = false (default), the StratifiedSampler maintains a SampledSet of already-selected DEID values and excludes them from subsequent selections. Selected DEIDs are resolved via the Distributed Evidence Identity Protocol (P162) to retrieve the corresponding EvidenceNodes. If a DEID cannot be resolved within a configurable timeout, it is replaced by the next available artefact in the same stratum and the resolution failure is logged in the ValidityCertificate.
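
The exclusion and fallback behaviour described above might look like the following sketch for a single stratum. The resolve callback is a hypothetical stand-in for a P162 resolver that returns None on timeout, and treating "next available artefact" as the next DEID in the shuffled order is an assumption.

```python
import random
from typing import Callable, List, Optional, Set, Tuple


def sample_stratum(candidates: List[str], k: int,
                   resolve: Callable[[str], Optional[object]],
                   rng: random.Random,
                   sampled_set: Set[str]) -> Tuple[List[str], int]:
    """Draw up to k resolvable DEIDs from one stratum without replacement.

    Returns the selected DEIDs and the number of resolution failures, which
    would feed unresolved_deid_count in the ValidityCertificate.
    """
    # Exclude anything already selected (or already tried) in earlier draws.
    order = [d for d in candidates if d not in sampled_set]
    rng.shuffle(order)
    selected: List[str] = []
    unresolved = 0
    for deid in order:
        if len(selected) == k:
            break
        sampled_set.add(deid)          # never reconsider this DEID
        if resolve(deid) is None:      # hypothetical timeout signal
            unresolved += 1
            continue                   # fall through to the next artefact
        selected.append(deid)
    return selected, unresolved
```

If a stratum runs out of resolvable artefacts, this sketch returns fewer than k DEIDs; how a conforming implementation should report that shortfall is not specified here.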

[0012] ValidityCertificate Schema: A ValidityCertificate is produced after sampling completes:

certificate_id — SHA-256 of canonical serialisation of all other fields
sample_spec — the SampleSpec that governed this sampling operation
corpus_size_n — total artefact count in the target corpus at time of sampling
sample_size_drawn — actual number of artefacts drawn
achieved_confidence — confidence level achieved (may differ from target if corpus is smaller than required sample size)
achieved_margin_of_error — margin of error achieved
stratum_counts — map of stratum identifier to count of artefacts drawn from that stratum
unresolved_deid_count — count of DEIDs that could not be resolved and were replaced
sampling_timestamp — ISO 8601 timestamp of sampling execution
evidence_deid_list — ordered list of DEIDs of all artefacts in the sample
seed_used — the random seed actually used (including the generated seed if input was null)
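
The certificate_id derivation can be illustrated as follows. The disclosure does not define the canonical serialisation, so this sketch assumes JSON with sorted keys, compact separators, and UTF-8 encoding; the function name is hypothetical.

```python
import hashlib
import json
from typing import Any, Dict


def compute_certificate_id(fields: Dict[str, Any]) -> str:
    """SHA-256 hex digest over every certificate field except certificate_id."""
    body = {k: v for k, v in fields.items() if k != "certificate_id"}
    # Assumed canonical form: sorted keys, no insignificant whitespace, UTF-8.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because the identifier is content-derived, any change to a recorded field (for example seed_used or evidence_deid_list) yields a different certificate_id, so a consumer can detect alteration by recomputing it.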

[0013] Reasoning Output Integration: Where a reasoning chain uses a sampled corpus, the ReasoningOutput schema includes a sampling_certificate_id field referencing the ValidityCertificate stored in the ledger (P80). The reasoning output additionally includes sampling_validity_notice — a required human-readable statement of the form: “This conclusion is based on a stratified random sample of [n] artefacts drawn from a corpus of [N], with [X]% confidence and a margin of error of ±[Y]%. Full sampling parameters are available via certificate [certificate_id].”
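
A small sketch of how the sampling_validity_notice could be rendered from certificate values. The function name and the percentage formatting (trailing zeros dropped) are assumptions; only the sentence template comes from the paragraph above.

```python
def sampling_validity_notice(sample_size: int, corpus_size: int,
                             confidence: float, margin_of_error: float,
                             certificate_id: str) -> str:
    """Fill the notice template with values taken from a ValidityCertificate."""
    return (
        f"This conclusion is based on a stratified random sample of "
        f"{sample_size} artefacts drawn from a corpus of {corpus_size}, "
        f"with {confidence * 100:g}% confidence and a margin of error of "
        f"±{margin_of_error * 100:g}%. Full sampling parameters are "
        f"available via certificate {certificate_id}."
    )
```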

ASCII Architecture

Corpus Query (TaxonomyQuery or DEID list)
        │
        ▼
┌────────────────────────┐
│  SampleSizer           │
│  Cochran formula       │
│  n = f(Z, p, e, N)     │
└──────────┬─────────────┘
           │ required n
           ▼
┌────────────────────────┐
│  StratifiedSampler     │◀── stratification_dimensions
│  proportional by strata│    (P160 ClassificationVector)
│  random seed applied   │
└──────────┬─────────────┘
           │ sampled DEID list
           ▼
┌────────────────────────┐     ┌──────────────────┐
│  DEID Resolver (P162)  │────▶│  EvidenceNodes   │
│  fetch each artefact   │     │  (sample set)    │
└──────────┬─────────────┘     └──────────────────┘
           │
           ▼
┌────────────────────────┐
│  ValidityCertificate   │──▶ Ledger (P80)
│  (confidence, MoE,     │    ReasoningOutput.sampling_certificate_id
│   stratum_counts,      │
│   seed_used, DEIDs)    │
└────────────────────────┘

Operational Detail

[0014] Small Corpus Handling: Where the required sample size exceeds the target corpus size N, the StratifiedSampler draws all N artefacts (a census rather than a sample). Because the finite population correction in [0009] never yields n greater than N, this case arises only after the required size has been rounded up for stratum allocation. The ValidityCertificate records achieved_confidence = 1.0 and achieved_margin_of_error = 0.0 to reflect a full census, and the sampling_validity_notice states “Full corpus census performed (N < minimum required sample size).”
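
The census rule, together with one plausible way to fill achieved_margin_of_error for an ordinary sample, is sketched below. The disclosure does not give a formula for the achieved margin; inverting the Cochran expression of [0009] is an assumption, as are the function names and the choice to treat n equal to N as a census.

```python
import math
from statistics import NormalDist
from typing import Dict, Union


def achieved_margin_of_error(n: int, corpus_size: int, confidence: float,
                             p: float = 0.5) -> float:
    """Margin of error implied by drawing n of corpus_size artefacts.

    Obtained by solving the formula in [0009] for e:
    e = Z * sqrt(p(1-p)/n * (N-n)/(N-1)).
    """
    if n >= corpus_size:
        return 0.0
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    fpc = (corpus_size - n) / (corpus_size - 1)
    return z * math.sqrt(p * (1.0 - p) / n * fpc)


def certificate_stats(required_n: int, corpus_size: int,
                      target_confidence: float) -> Dict[str, Union[int, float, bool]]:
    """Values recorded in the ValidityCertificate for a sample or a census."""
    if required_n >= corpus_size:
        # Every artefact is drawn, so there is no sampling error.
        return {"sample_size_drawn": corpus_size, "achieved_confidence": 1.0,
                "achieved_margin_of_error": 0.0, "census": True}
    return {"sample_size_drawn": required_n,
            "achieved_confidence": target_confidence,
            "achieved_margin_of_error": achieved_margin_of_error(
                required_n, corpus_size, target_confidence),
            "census": False}
```

Under this inversion the margin falls to exactly zero as n reaches N, which agrees with the census values recorded above.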

[0015] Reproducibility: Where seed is specified in the SampleSpec, re-execution of the sampling operation with the same SampleSpec and the same corpus state produces an identical sample. This enables peer review of corpus-sampled reasoning conclusions by independent parties who can reproduce the exact sample from the ValidityCertificate parameters and the corpus state at sampling_timestamp.
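
An independent reviewer's check could look like the sketch below, simplified to a single stratum. The certificate is modelled as a plain dictionary using field names from [0012]; redraw is a hypothetical stand-in for the original sampler, and reproduction only succeeds if the reviewer runs the same algorithm over the same corpus state.

```python
import random
from typing import Any, Dict, Iterable, List


def redraw(corpus_deids: Iterable[str], sample_size: int, seed: int) -> List[str]:
    """Re-execute a seeded draw over the corpus state at sampling time."""
    rng = random.Random(seed)
    # Sorting gives a canonical order so the draw does not depend on how the
    # corpus happened to be enumerated.
    return rng.sample(sorted(corpus_deids), sample_size)


def verify_certificate(cert: Dict[str, Any], corpus_deids: Iterable[str]) -> bool:
    """True if re-drawing with seed_used reproduces evidence_deid_list exactly."""
    reproduced = redraw(corpus_deids, cert["sample_size_drawn"], cert["seed_used"])
    return reproduced == cert["evidence_deid_list"]
```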

[0016] Sequential Sampling: For continuously updated corpora (high-volatility artefacts, P163), a SequentialSampler variant supports incremental sampling: the SampleSpec specifies a snapshot_timestamp and the sampler draws only from artefacts whose retrieval_timestamp predates the snapshot, ensuring a stable corpus boundary even for live feeds.
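
The snapshot boundary can be illustrated with a simple filter. Artefacts are modelled here as dictionaries with hypothetical deid and retrieval_timestamp keys, and timestamps are assumed to be ISO 8601 strings with explicit UTC offsets.

```python
from datetime import datetime
from typing import Dict, List


def snapshot_corpus(artefacts: List[Dict[str, str]],
                    snapshot_timestamp: str) -> List[str]:
    """DEIDs of artefacts retrieved strictly before the snapshot."""
    cutoff = datetime.fromisoformat(snapshot_timestamp)
    return sorted(
        a["deid"] for a in artefacts
        if datetime.fromisoformat(a["retrieval_timestamp"]) < cutoff
    )


if __name__ == "__main__":
    feed = [
        {"deid": "deid-a", "retrieval_timestamp": "2026-03-01T09:00:00+00:00"},
        {"deid": "deid-b", "retrieval_timestamp": "2026-03-02T09:00:00+00:00"},
        {"deid": "deid-c", "retrieval_timestamp": "2026-03-03T09:00:00+00:00"},
    ]
    print(snapshot_corpus(feed, "2026-03-02T12:00:00+00:00"))
```

Artefacts that arrive after the snapshot are simply invisible to the sampler, so N and the strata sizes stay fixed for the duration of the draw.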

Claims-Exclusion Notice

This specification is published as open-source prior art. No patent claims are asserted by the author in respect of the mechanisms described. Any third party seeking to patent mechanisms substantially equivalent to those described herein is placed on notice of this prior art disclosure.