◎ OS PUB Apache 2.0 ← All specifications

P186 — AIEP — Evidence Scoring Aggregation Protocol

Publication Date: 2026-03-27 Status: Open Source Prior Art Disclosure Licence: Apache License 2.0 Author/Organisation: Phatfella Ltd Schema: AIEP_OS_SPEC_TEMPLATE v1.0.1 — https://aiep.dev/schemas/aiep-os-spec-template/v1.0.1


Framework Context

[0001] This disclosure operates within an Architected Instruction and Evidence Protocol (AIEP) environment as defined in United Kingdom patent application number GB2519711.2, filed 20 November 2025, the entire contents of which are incorporated herein by reference.

[0002] The present disclosure defines a protocol for producing a single composite AggregateEvidenceScore for a given reasoning query or evidence corpus scope — combining the trust score (P124), freshness score (P147), corroboration score (P184), and coverage density (P179) dimensions into a query-level or scope-level summary score that a reasoning chain can use as a pre-reasoning confidence calibration signal.


Field of the Disclosure

[0003] This disclosure relates to multi-dimensional evidence scoring aggregation protocols for artificial intelligence reasoning systems.

[0004] More particularly, the disclosure concerns: an AggregateEvidenceScore schema; a dimension weighting model; aggregation over a DEID list or taxonomy scope; a confidence band representation; and the integration of aggregate scores with reasoning chain pre-reasoning calibration.


Background

[0005] An AIEP reasoning chain operating over a corpus of evidence artefacts must evaluate evidence quality across multiple orthogonal dimensions simultaneously: source trust, temporal freshness, independent corroboration, and corpus coverage. Evaluating these dimensions individually for each artefact in the chain’s evidence base is computationally intensive and cognitively demanding for the reasoning agent. A pre-computed aggregate score covering the evidence base as a whole provides a rapid calibration signal before detailed per-artefact evaluation.

[0006] This aggregate score is not a replacement for per-artefact evaluation but a filter: if the aggregate score for a domain is very low, the reasoning chain is alerted to treat conclusions from that domain with high uncertainty before it begins detailed reasoning. If the aggregate score is high, the chain can proceed with higher initial confidence while still performing per-artefact evaluation.


Summary of the Disclosure

[0007] AggregateEvidenceScore Schema:

  • score_id — SHA-256 of canonical serialisation
  • scopeDEID_LIST:{deid_list_hash} or TAXONOMY_SCOPE:{taxonomy_query_hash}
  • artefact_count — number of artefacts in the scope
  • mean_trust_score — arithmetic mean of trust scores (P124) across all artefacts in scope
  • mean_freshness_score — arithmetic mean of freshness scores (P147)
  • mean_corroboration_score — arithmetic mean of corroboration scores (P184)
  • coverage_density — the CoverageCell (P179) coverage_density for the taxonomy scope (null for DEID_LIST scopes)
  • conflict_fraction — fraction of artefacts in scope with conflict_status ≠ CLEAR (P161)
  • governance_warning_fraction — fraction of artefacts in scope carrying an active GOVERNANCE_WARNING annotation (P176)
  • aggregate_score — the composite weighted score in [0.0, 1.0] (see [0008])
  • confidence_bandHIGH (≥ 0.75), MODERATE (0.5–0.74), LOW (0.25–0.49), VERY_LOW (< 0.25)
  • computed_at — ISO 8601 timestamp
  • score_signature — cryptographic signature

[0008] Aggregate Score Computation: The aggregate_score is a weighted composite:

  • trust_component = mean_trust_score — weight: 0.35
  • freshness_component = mean_freshness_score — weight: 0.25
  • corroboration_component = mean_corroboration_score — weight: 0.25
  • coverage_component = coverage_density (or 1.0 for DEID_LIST scopes) — weight: 0.15
  • Penalty deductions: subtract 0.1 × conflict_fraction and 0.15 × governance_warning_fraction (clamped to 0)
  • aggregate_score = max(0.0, (0.35 × trust + 0.25 × freshness + 0.25 × corroboration + 0.15 × coverage) - conflict_penalty - warning_penalty)

[0009] Aggregation API: The scoring aggregation API is available at POST /evidence/aggregate-score accepting a ScopeRequest with either a deid_list or a taxonomy_query plus optional include_details flag. With include_details: true the response includes the per-dimension component scores alongside the composite; with false (default) only the composite score and confidence band are returned.

[0010] Pre-Reasoning Integration: Evidence packages (P174) include an optional aggregate_score field in the EvidencePackageManifest when the package’s corpus_scope maps to a supported aggregation scope. Reasoning chains receiving a package with an aggregate_score can use the confidence_band to calibrate initial reasoning uncertainty before per-artefact evaluation.

[0011] Score Caching: AggregateEvidenceScores for common taxonomy scopes are cached for a configurable period (default: 1 hour) and reused for repeat queries within the cache window. The cache is invalidated when any artefact in the scope is updated, quarantined, or receives a new governance warning.


ASCII Architecture

ScopeRequest (deid_list | taxonomy_query)


AggregateScoreComputer
  - retrieve trust/freshness/corroboration per artefact
  - retrieve coverage density (P179)
  - compute per-dimension means
  - apply conflict/warning penalties
  - compute weighted composite


AggregateEvidenceScore
  (aggregate_score, confidence_band, per-dimension components)

        ├──▶ Reasoning Chain (pre-reasoning calibration)
        ├──▶ EvidencePackage Manifest (P174)
        └──▶ Cache (1h, per taxonomy scope)

Operational Detail

[0012] Corpus-Wide Aggregate: The AggregateScoreComputer also provides a corpus-wide aggregate score — the aggregation over all active artefacts in the corpus — published daily alongside the DomainCoverageMap (P179) as part of the standard corpus health report (P172).

[0013] Score Interpretation Guidance: An aggregate_score reflects the overall epistemic quality of the evidence base for a given scope. It does not guarantee that specific artefacts within the scope are correct; it indicates the collective signal strength available. Reasoning chains should treat VERY_LOW confidence band scores as an instruction to seek additional evidence before drawing conclusions, not as a reason to refuse to reason.


Claims-Exclusion Notice

This specification is published as open-source prior art. No patent claims are asserted by the author in respect of the mechanisms described. Any third party seeking to patent mechanisms substantially equivalent to those described herein is placed on notice of this prior art disclosure.