
P195 — AIEP — Evidence Counterfactual Test Protocol

Publication Date: 2026-03-27
Status: Open Source Prior Art Disclosure
Licence: Apache License 2.0
Author/Organisation: Phatfella Ltd
Schema: AIEP_OS_SPEC_TEMPLATE v1.0.1 — https://aiep.dev/schemas/aiep-os-spec-template/v1.0.1


Framework Context

[0001] This disclosure operates within an Architected Instruction and Evidence Protocol (AIEP) environment as defined in United Kingdom patent application number GB2519711.2, filed 20 November 2025, the entire contents of which are incorporated herein by reference.

[0002] The present disclosure defines a protocol for subjecting evidence artefacts to CounterfactualTests: structured challenges that probe whether a reasoning chain’s conclusion remains stable when specific evidence artefacts are hypothetically removed, replaced, or modified. The protocol enables corpus operators and reasoning chain auditors to distinguish fragile conclusions (dependent on a small number of critical evidence artefacts) from robust conclusions (stable across a range of evidence perturbations).


Field of the Disclosure

[0003] This disclosure relates to evidence robustness testing and counterfactual sensitivity analysis protocols for artificial intelligence reasoning systems.

[0004] More particularly, the disclosure concerns: a CounterfactualTestCase schema; counterfactual test types; a CounterfactualTestResult schema; sensitivity score computation; integration with the evidence dependency graph (P170) and synthesis protocol (P185); and the use of counterfactual testing in governance audit workflows (P89).


Background

[0005] A reasoning chain that draws a strong conclusion from a small number of critical evidence artefacts is vulnerable: if any of those artefacts is subsequently quarantined, retracted, or found to be unreliable, the conclusion may no longer hold. Counterfactual testing, which asks “what would the reasoning conclude if artefact X were absent?”, probes this fragility before the conclusion is acted upon.

[0006] Counterfactual analysis is standard in scientific practice (sensitivity analyses, leave-one-out analyses in meta-analyses) and legal reasoning (testing whether a conclusion depends on disputed evidence). AIEP formalises this as a structured protocol attached to the evidence corpus rather than leaving it to individual reasoning chain implementations.

[0007] The Evidence Dependency Graph (P170) identifies which artefacts a SynthesisNode depends on — providing ready-made candidates for counterfactual perturbation. Counterfactual tests can be systematically applied to all SynthesisNodes (P185) to assess their robustness before they are admitted as first-class corpus artefacts.


Summary of the Disclosure

[0008] CounterfactualTestCase Schema:

  • test_case_id — SHA-256 of canonical serialisation
  • subject_deid — DEID (P162) of the artefact or SynthesisNode (P185) being tested
  • test_type — one of:
    • REMOVAL — hypothetically remove the specified input artefact(s) and re-evaluate
    • REPLACEMENT — hypothetically replace an input artefact with an alternative DEID
    • TRUST_DOWNGRADE — hypothetically apply a lower trust score (P124) to an input and re-evaluate
    • CORROBORATION_REDUCTION — hypothetically remove one or more corroborators (P184) and re-evaluate
  • perturbation_target_deids — the DEIDs being perturbed (removed, replaced, or downgraded)
  • replacement_deids — (for REPLACEMENT type) the DEIDs to substitute in place of perturbation_target_deids
  • original_conclusion_summary — brief statement of the conclusion from the unperturbed evidence base
  • requested_by — node fingerprint (P46) or governance entity identifier
  • requested_at — ISO 8601 timestamp
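The schema above can be sketched as a small immutable record. This is a minimal illustration, not a normative encoding: the disclosure does not define the canonical serialisation, so compact JSON with sorted keys is assumed here, and the DEID and node fingerprint strings in the usage lines are hypothetical placeholders.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Tuple

TEST_TYPES = ("REMOVAL", "REPLACEMENT", "TRUST_DOWNGRADE", "CORROBORATION_REDUCTION")


@dataclass(frozen=True)
class CounterfactualTestCase:
    subject_deid: str
    test_type: str
    perturbation_target_deids: Tuple[str, ...]
    original_conclusion_summary: str
    requested_by: str
    requested_at: str  # ISO 8601 timestamp
    replacement_deids: Tuple[str, ...] = ()

    def __post_init__(self) -> None:
        if self.test_type not in TEST_TYPES:
            raise ValueError(f"unknown test_type: {self.test_type}")
        if not self.perturbation_target_deids:
            raise ValueError("at least one perturbation target is required")
        if self.test_type == "REPLACEMENT" and not self.replacement_deids:
            raise ValueError("REPLACEMENT requires replacement_deids")

    @property
    def test_case_id(self) -> str:
        # ASSUMPTION: canonical serialisation = compact JSON, sorted keys, UTF-8.
        canonical = json.dumps(
            asdict(self), sort_keys=True, separators=(",", ":"), ensure_ascii=False
        )
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Usage with hypothetical identifiers.
case = CounterfactualTestCase(
    subject_deid="deid:synthesis-node-1",
    test_type="REMOVAL",
    perturbation_target_deids=("deid:study-a",),
    original_conclusion_summary="Treatment shows a positive effect.",
    requested_by="node:auditor-1",
    requested_at="2026-03-27T00:00:00Z",
)
print(case.test_case_id)
```

Deriving the identifier from the content means two requests with identical fields share one test_case_id, which makes duplicate requests easy to detect.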

[0009] CounterfactualTestResult Schema:

  • result_id — SHA-256 of canonical serialisation
  • test_case_id — reference to the originating CounterfactualTestCase
  • perturbed_conclusion_summary — the conclusion reached with the perturbation applied
  • conclusion_stability — one of: STABLE (conclusion unchanged), MODIFIED (conclusion changed in degree but not direction), REVERSED (conclusion reversed), NULL (no conclusion reachable from remaining evidence)
  • stability_score — a value in [0.0, 1.0] representing the degree of stability: 1.0 = fully stable (identical conclusion); 0.0 = completely reversed or null
  • critical_artefacts — list of DEIDs whose individual removal or perturbation produced stability_score < 0.5 — these are the evidence artefacts the conclusion is most dependent on
  • test_completed_at — ISO 8601 timestamp
  • result_signature — signed by the testing node

[0010] Sensitivity Score: The SensitivityScore for an evidence base is computed as: sensitivity_score = 1.0 - mean(stability_score across all CounterfactualTestCases for the subject)

A high sensitivity score indicates the conclusion is fragile; a low sensitivity score indicates robustness. Sensitivity scores are included in the corpus quality report (P172) for SynthesisNodes.
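As a worked illustration of the formula in [0010], the following sketch computes the score from a list of stability scores. Rejecting an empty list and out-of-range values is an assumption; the disclosure does not say how those cases are handled.

```python
from statistics import mean
from typing import Iterable


def sensitivity_score(stability_scores: Iterable[float]) -> float:
    """Return 1.0 - mean(stability_score) over all test cases for one subject."""
    scores = list(stability_scores)
    if not scores:
        # ASSUMPTION: a subject with no test cases has no defined score.
        raise ValueError("at least one stability score is required")
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("stability scores must lie in [0.0, 1.0]")
    return 1.0 - mean(scores)


# Four test cases: two stable, one partly stable, one reversed.
# mean = (1.0 + 1.0 + 0.5 + 0.0) / 4 = 0.625, so sensitivity = 0.375.
print(sensitivity_score([1.0, 1.0, 0.5, 0.0]))
```

With the default threshold of 0.5, this example subject would fall on the low-sensitivity side.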

[0011] Automated Counterfactual Testing for SynthesisNodes: When a SynthesisNode (P185) is admitted to the corpus, the pre-synthesis verification optionally triggers an automated CounterfactualTestSuite — a set of REMOVAL test cases, one per source artefact — to compute the sensitivity score before the SynthesisNode is made accessible to reasoning chains. If sensitivity_score > configurable_threshold (default: 0.5), the SynthesisNode is annotated (P176) with a CAVEAT annotation noting high sensitivity to specific source artefacts.
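The admission-time flow in [0011] might look like the sketch below. It assumes a caller-supplied run_test function (hypothetical) that executes one test case and returns its stability_score; the dictionary layout of the test cases and of the CAVEAT annotation is illustrative only and is not taken from P176.

```python
from typing import Callable, Dict, List, Optional

DEFAULT_THRESHOLD = 0.5


def build_removal_suite(subject_deid: str, source_deids: List[str]) -> List[dict]:
    """One REMOVAL test case per source artefact of the SynthesisNode."""
    return [
        {
            "subject_deid": subject_deid,
            "test_type": "REMOVAL",
            "perturbation_target_deids": [deid],
        }
        for deid in source_deids
    ]


def admission_check(
    subject_deid: str,
    source_deids: List[str],
    run_test: Callable[[dict], float],  # hypothetical: returns a stability_score
    threshold: float = DEFAULT_THRESHOLD,
) -> Dict[str, object]:
    if not source_deids:
        raise ValueError("a SynthesisNode needs at least one source artefact")
    suite = build_removal_suite(subject_deid, source_deids)
    scores = {case["perturbation_target_deids"][0]: run_test(case) for case in suite}
    sensitivity = 1.0 - sum(scores.values()) / len(scores)
    # Artefacts whose individual removal gave stability_score < 0.5.
    critical = sorted(d for d, s in scores.items() if s < 0.5)
    caveat: Optional[dict] = None
    if sensitivity > threshold:
        caveat = {
            "annotation_type": "CAVEAT",
            "subject_deid": subject_deid,
            "sensitive_to": critical,
        }
    return {"sensitivity_score": sensitivity, "critical_artefacts": critical, "caveat": caveat}


# Usage with a stubbed engine: removing study-a or study-b destabilises the conclusion.
stub_scores = {"deid:study-a": 0.0, "deid:study-b": 0.25, "deid:study-c": 1.0}
report = admission_check(
    "deid:synthesis-node-1",
    list(stub_scores),
    run_test=lambda case: stub_scores[case["perturbation_target_deids"][0]],
)
print(report["caveat"])
```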

[0012] Governance Integration: Governance auditors (P89) may request CounterfactualTestCases for any evidence artefact or SynthesisNode as part of an evidence quality audit. The results are recorded to the AuditLog (P171) and may be used to require additional corroboration (see P184) before a conclusion generated from fragile evidence is acted upon.


ASCII Architecture

CounterfactualTestCase
  (subject_deid, test_type, perturbation_target_deids)
        │
        ▼
CounterfactualTestEngine
  - apply perturbation hypothetically (no corpus modification)
  - re-evaluate subject evidence base without/with modified artefacts
  - compare conclusion before and after perturbation
        │
        ▼
CounterfactualTestResult
  (conclusion_stability, stability_score, critical_artefacts)
        │
        ├──▶ SensitivityScore = 1 - mean(stability_scores)
        │
        ├── HIGH SENSITIVITY (score > 0.5):
        │   CAVEAT annotation (P176) on SynthesisNode
        │   Governance audit record (P89, P171)
        │
        └── LOW SENSITIVITY (score ≤ 0.5):
            conclusion marked ROBUST in evidence package (P174)

Automated CounterfactualTestSuite:
  SynthesisNode admission → REMOVAL tests per source artefact
  → high sensitivity → CAVEAT annotation before release

Operational Detail

[0013] Hypothetical Perturbation Principle: Counterfactual tests never modify the corpus. The perturbation is applied within the testing engine’s working context only; the actual evidence corpus remains unchanged. The test produces information about the robustness of the conclusion, not a change to the underlying evidence.
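One way to honour this principle is to perturb a deep copy of the subject's inputs and leave the source mapping untouched. The sketch below assumes a hypothetical in-memory shape (DEID mapped to a record with a trust_score field) and covers only the REMOVAL and TRUST_DOWNGRADE types.

```python
import copy
from typing import Dict, Iterable

Evidence = Dict[str, dict]  # hypothetical shape: DEID -> {"trust_score": float, ...}


def apply_perturbation(
    evidence: Evidence,
    test_type: str,
    targets: Iterable[str],
    downgraded_trust: float = 0.0,
) -> Evidence:
    """Return a perturbed working copy; the input mapping is never mutated."""
    working = copy.deepcopy(evidence)
    for deid in targets:
        if deid not in working:
            raise KeyError(f"{deid} is not an input of the subject")
        if test_type == "REMOVAL":
            del working[deid]
        elif test_type == "TRUST_DOWNGRADE":
            # Never raise trust: keep the lower of the current and downgraded value.
            current = working[deid]["trust_score"]
            working[deid]["trust_score"] = min(current, downgraded_trust)
        else:
            raise NotImplementedError(f"{test_type} is not covered by this sketch")
    return working


corpus_view = {"deid:a": {"trust_score": 0.9}, "deid:b": {"trust_score": 0.8}}
without_a = apply_perturbation(corpus_view, "REMOVAL", ["deid:a"])
print(sorted(without_a), sorted(corpus_view))
```

The deep copy matters: a shallow copy would share the nested records, so a trust downgrade in the working context would leak back into the original mapping.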

[0014] Leave-One-Out for Meta-Analysis: For SynthesisNodes of type META_ANALYSIS (P185), the automated CounterfactualTestSuite generates a full leave-one-out analysis: one REMOVAL test per source artefact. This mirrors standard meta-analytic sensitivity practice and produces a comprehensive set of stability scores identifying any single study the meta-analytic conclusion is disproportionately dependent on.
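A toy leave-one-out run can make this concrete. The disclosure does not specify how a conclusion is compared or how stability_score is derived, so everything about the scoring below is an assumption made for illustration: the "conclusion" is an unweighted mean of per-study effect sizes, a sign flip counts as REVERSED, a relative change within 10% counts as STABLE, and a MODIFIED result scores 1 minus the relative change.

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple


def classify(original: float, perturbed: Optional[float], tolerance: float = 0.1) -> Tuple[str, float]:
    """Map a perturbed pooled effect to (conclusion_stability, stability_score)."""
    if original == 0:
        raise ValueError("this toy rule needs a non-zero original effect")
    if perturbed is None:
        return "NULL", 0.0  # no evidence left to conclude from
    if original * perturbed < 0:
        return "REVERSED", 0.0  # direction of the effect flipped
    relative_change = abs(perturbed - original) / abs(original)
    if relative_change <= tolerance:
        return "STABLE", 1.0
    return "MODIFIED", max(0.0, 1.0 - relative_change)


def leave_one_out(effects: Dict[str, float]) -> Dict[str, Tuple[str, float]]:
    """One REMOVAL test per source artefact, keyed by the removed DEID."""
    original = mean(list(effects.values()))
    results = {}
    for removed in effects:
        rest = [v for deid, v in effects.items() if deid != removed]
        results[removed] = classify(original, mean(rest) if rest else None)
    return results


def critical_artefacts(results: Dict[str, Tuple[str, float]]) -> List[str]:
    return sorted(deid for deid, (_, score) in results.items() if score < 0.5)


# Hypothetical effect sizes: one study dominates the pooled estimate.
effects = {"deid:s1": 0.2, "deid:s2": 0.25, "deid:s3": 0.15, "deid:s4": 2.0}
results = leave_one_out(effects)
print(critical_artefacts(results))
```

In this example only the removal of deid:s4 drops the stability score below 0.5, which is the kind of single-study dependence the leave-one-out suite is meant to surface.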

[0015] Integration with Provenance Challenge: When a ProvenanceChallenge (P178) is opened against an artefact that is a dependency of a SynthesisNode, a TRUST_DOWNGRADE CounterfactualTestCase is automatically triggered for the SynthesisNode: the challenged artefact’s trust score is hypothetically set to 0.0 and the resulting sensitivity is computed. This gives an immediate indication of the SynthesisNode’s robustness to the unresolved challenge before the verdict is returned.
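The trigger described in [0015] could be sketched as a lookup over the dependency graph. The graph shape (SynthesisNode DEID mapped to its dependency DEIDs) and the hypothetical_trust_score field are assumptions for illustration; neither appears in the CounterfactualTestCase schema or in P170 as quoted here.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional


def trust_downgrade_cases(
    challenged_deid: str,
    dependency_graph: Dict[str, List[str]],  # hypothetical: SynthesisNode -> dependencies
    requested_by: str,
    now: Optional[datetime] = None,
) -> List[dict]:
    """Build one TRUST_DOWNGRADE case per SynthesisNode that depends on the challenged artefact."""
    requested_at = (now or datetime.now(timezone.utc)).isoformat()
    cases = []
    for node_deid in sorted(dependency_graph):
        if challenged_deid in dependency_graph[node_deid]:
            cases.append(
                {
                    "subject_deid": node_deid,
                    "test_type": "TRUST_DOWNGRADE",
                    "perturbation_target_deids": [challenged_deid],
                    "hypothetical_trust_score": 0.0,  # illustrative field, not in the schema
                    "requested_by": requested_by,
                    "requested_at": requested_at,
                }
            )
    return cases


graph = {
    "deid:syn-1": ["deid:a", "deid:b"],
    "deid:syn-2": ["deid:c"],
    "deid:syn-3": ["deid:a"],
}
for case in trust_downgrade_cases("deid:a", graph, requested_by="node:challenge-handler"):
    print(case["subject_deid"], case["test_type"])
```

Only direct dependencies are checked in this sketch; whether transitive dependants should also be tested is left open by the text.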


Claims-Exclusion Notice

This specification is published as open-source prior art. No patent claims are asserted by the author in respect of the mechanisms described. Any third party seeking to patent mechanisms substantially equivalent to those described herein is placed on notice of this prior art disclosure.