P227 — AIEP — Reasoning Anomaly Detection Engine
Applicant: Neil Grassby
Classification: Patent Application — Confidential
Priority: Claims priority from GB2519711.2, filed 20 November 2025
Architecture Layer: AIEP Phase 2 Support Layer
Framework Context
[0001] This specification operates within an AIEP environment as defined in GB2519711.2 and GB2519798.9. The present specification defines the reasoning anomaly detection mechanism of the Phase-2 AIEP architecture, providing a continuous monitoring layer that identifies pathological reasoning behaviours before they propagate to action and world state update.
Field of the Invention
[0002] The present invention relates to anomaly detection systems and reasoning quality monitoring architectures for evidence-bound artificial intelligence.
[0003] More particularly, the invention relates to a system that monitors the ongoing outputs, resource consumption, and inference patterns of active reasoning agents to detect anomalous behaviours including goal drift, evidence fabrication, circular reasoning, and resource overconsumption.
Background
[0004] AI reasoning systems may exhibit pathological behaviours that are not detectable through output quality evaluation alone. Anomalous behaviours such as goal drift (gradual shift towards unapproved objectives), evidence fabrication (producing citations to non-existent evidence artefacts), and circular reasoning (deriving claims that serve as their own supporting evidence) require dedicated monitoring mechanisms.
[0005] Anomaly detection in the AIEP architecture must be: grounded in verifiable signals from the evidence ledger; able to detect both abrupt and gradual anomalies; fast enough to intervene before anomalous outputs are acted upon; and itself governed and auditable.
Summary of the Invention
[0006] The invention provides a Reasoning Anomaly Detection Engine (RADE) that monitors four anomaly categories: goal drift (systematic shift in an agent’s objective distribution away from approved goals); evidence integrity (citations where the cited evidence artefact hash does not match the ledger record); circular reasoning (claim support chains that cycle back to the claim being supported); and resource anomaly (deviation from expected resource consumption profiles beyond statistical bounds).
[0007] Detected anomalies produce an anomaly record admitted to the AIEP evidence ledger and, above a configurable severity threshold, trigger safety escalation through the Safety Governance Engine (P215).
ASCII Architecture
            Active Reasoning Agents
              |        |        |
              v        v        v
+---------------------------------------------+
|  Reasoning Anomaly Detection Engine (RADE)  |
|                                             |
|   Goal Drift Monitor                        |
|   Evidence Integrity Checker                |
|   Circular Reasoning Detector               |
|   Resource Anomaly Monitor                  |
+---------------------+-----------------------+
                      |
              anomaly detected?
              |               |
              v               v
      Anomaly Record    Safety Escalation
        (ledger)        (P215, if severity
                         above threshold)
Detailed Description
[0008] Goal Drift Detection. The RADE monitors the distribution of goal types pursued by each agent over a rolling window. When the distribution shifts significantly towards unapproved goal categories (measured by KL divergence from the approved distribution), a goal drift alert is generated. In addition, claimed goal targets are evaluated against verified CWSG entity states, and a claimed target that is not reachable from the current world state is flagged as goal drift.
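The KL-divergence check described in [0008] can be sketched as follows. This is a minimal illustration, not the claimed implementation: the goal-category names, the rolling-window representation (a list of goal-type labels), and the 0.5 divergence threshold are all illustrative assumptions.

```python
import math
from collections import Counter

def kl_divergence(observed: dict, approved: dict, eps: float = 1e-9) -> float:
    """KL(observed || approved) over goal-type frequency distributions.

    A small epsilon keeps the logarithm finite when a category is
    absent from one of the distributions.
    """
    cats = set(observed) | set(approved)
    o_total = sum(observed.values()) or 1
    a_total = sum(approved.values()) or 1
    div = 0.0
    for c in cats:
        p = observed.get(c, 0) / o_total + eps
        q = approved.get(c, 0) / a_total + eps
        div += p * math.log(p / q)
    return div

def goal_drift_alert(window_goals: list, approved_profile: dict,
                     threshold: float = 0.5) -> bool:
    """Flag drift when the rolling-window goal distribution diverges
    from the approved profile beyond the configured threshold."""
    observed = Counter(window_goals)
    return kl_divergence(observed, approved_profile) > threshold
```

A window dominated by a goal category absent from the approved profile produces a large divergence and triggers the alert, while a window matching the profile does not.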
[0009] Evidence Integrity Checking. For every evidence artefact citation produced during a reasoning session, the RADE retrieves the cited artefact from the ledger and verifies that its content hash matches the cited hash. Any mismatch produces an evidence integrity anomaly record.
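The citation hash check in [0009] reduces to recomputing a content hash and comparing it with the cited value. A minimal sketch, assuming SHA-256 as the content hash and a simple dictionary as the anomaly record (both illustrative; the specification does not fix either choice):

```python
import hashlib
from typing import Optional

def verify_citation(cited_hash: str, artifact_bytes: bytes) -> Optional[dict]:
    """Recompute the artefact's content hash and compare it with the cited
    hash; return an evidence integrity anomaly record on mismatch,
    or None when the citation is sound."""
    actual = hashlib.sha256(artifact_bytes).hexdigest()
    if actual != cited_hash:
        return {
            "type": "evidence_integrity",
            "expected_hash": cited_hash,
            "received_hash": actual,
        }
    return None
```

The returned record carries the expected and received hashes, matching the fields recited in claim 3.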
[0010] Circular Reasoning Detection. The RADE constructs the support graph for each output claim, tracing evidence hashes backwards through the inference chain. Cycles in the support graph indicate circular reasoning. Circular chains are identified when a topological sort of the support graph cannot complete: the nodes that remain unsorted lie on, or are transitively supported by, a cycle.
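The cycle check in [0010] can be sketched with Kahn's topological sort: nodes whose in-degree never reaches zero are part of a cycle or are supported through one. The edge representation (a list of supporter-to-claim pairs) is an illustrative assumption.

```python
from collections import defaultdict, deque

def find_circular_claims(support_edges: list) -> set:
    """Kahn's algorithm over the claim-support graph. Each edge is
    (supporter, claim). Nodes left with nonzero in-degree after the
    sort lie on a cycle or are transitively supported by one."""
    indeg = defaultdict(int)
    adj = defaultdict(list)
    nodes = set()
    for supporter, claim in support_edges:
        adj[supporter].append(claim)
        indeg[claim] += 1
        nodes |= {supporter, claim}
    queue = deque(n for n in nodes if indeg[n] == 0)
    while queue:
        n = queue.popleft()
        for m in adj[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                queue.append(m)
    return {n for n in nodes if indeg[n] > 0}
```

An empty result means the support graph is acyclic; a non-empty result localises the circular chain for inclusion in the anomaly record, as recited in claim 4.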
[0011] Resource Anomaly Detection. The RADE compares actual resource consumption to the predictions from the Predictive Resource Planning Engine (P226). Deviations above the statistical threshold are flagged as resource anomalies, which may indicate runaway simulation or infinite loop conditions.
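The deviation test in [0011] can be sketched as a z-score comparison against the P226 forecast. The specification says only "statistical threshold"; the z-score form and the threshold of 3 standard deviations are illustrative assumptions.

```python
def resource_anomaly(actual: float, forecast_mean: float,
                     forecast_std: float, z_threshold: float = 3.0) -> bool:
    """Flag consumption whose deviation from the forecast exceeds the
    statistical threshold (here a simple z-score test)."""
    if forecast_std <= 0:
        # Degenerate forecast: any deviation at all is anomalous.
        return actual != forecast_mean
    return abs(actual - forecast_mean) / forecast_std > z_threshold
```

Sustained flags from this test are what would indicate the runaway-simulation or infinite-loop conditions mentioned above.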
[0012] Safety Escalation. Anomalies above the configured severity threshold trigger a safety escalation signal to the Safety Governance Engine (P215). The governance engine may suspend the anomalous agent, quarantine its outputs, or initiate an emergency governance review.
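The two-step response in [0012] and claim 1(e) — unconditional ledger admission, conditional escalation — can be sketched as follows. The record fields, the in-memory lists standing in for the ledger and the P215 interface, and the 0.7 threshold are illustrative assumptions.

```python
import time

SEVERITY_THRESHOLD = 0.7  # configurable escalation cut-off (illustrative value)

def handle_anomaly(anomaly: dict, ledger: list, escalations: list) -> None:
    """Admit every anomaly record to the ledger; escalate those whose
    severity exceeds the configured threshold to the governance layer."""
    record = {**anomaly, "timestamp": time.time()}
    ledger.append(record)  # step (e): unconditional ledger admission
    if record.get("severity", 0.0) > SEVERITY_THRESHOLD:
        escalations.append({
            "anomaly_type": record["type"],
            "agent_id": record.get("agent_id"),
            "severity": record["severity"],
        })
```

The escalation payload carries the anomaly type, affected agent identifier, and severity score recited in claim 5, on which the governance engine bases its suspension or quarantine decision.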
Technical Effect
[0013] The invention provides continuous, multi-pattern anomaly monitoring for AI reasoning sessions, detecting pathological reasoning modes before they produce evidence ledger admissions. By detecting goal drift through CWSG consistency checking, evidence fabrication through citation hash verification, circular reasoning through inference support graph cycle detection, and resource anomalies through consumption threshold analysis, the engine provides layered defence against distinct failure modes. By escalating detected anomalies to the Safety Governance Engine with structured anomaly records, the engine ensures governance-directed response proportional to anomaly severity.
Claims
1. A computer-implemented method for reasoning anomaly detection, the method comprising: (a) monitoring reasoning session outputs for goal drift by evaluating claimed goal targets against verified CWSG entity states and detecting claimed targets not reachable from the current world state; (b) verifying evidence integrity by retrieving each cited evidence artefact from the ledger and confirming that its content hash matches the cited hash; (c) detecting circular reasoning by constructing the inference support graph for each output claim and applying topological sort to identify cycles; (d) detecting resource anomalies by comparing actual consumption against Predictive Resource Planning Engine forecasts and flagging deviations above a statistical threshold; and (e) admitting anomaly records to the AIEP evidence ledger for all detected anomalies, and escalating anomalies above a configured severity threshold to the Safety Governance Engine.

2. The method of claim 1, wherein goal drift detection classifies each anomaly by severity based on the magnitude of the deviation between claimed goal target and CWSG reachable state.

3. The method of claim 1, wherein evidence integrity mismatches produce evidence fabrication anomaly records identifying the cited artefact identifier, expected hash, and received hash.

4. The method of claim 1, wherein circular reasoning detection records the full cycle path in the anomaly record, identifying each inference step in the circular chain.

5. The method of claim 1, wherein safety escalation instructs the Safety Governance Engine with the anomaly type, affected agent identifier, and severity score, enabling governed suspension or quarantine response.

6. A Reasoning Anomaly Detection Engine comprising: one or more processors; memory storing a support graph evaluator, anomaly record buffer, and severity threshold configuration; wherein the processors are configured to execute the method of claim 1.

7. A non-transitory computer-readable medium storing instructions that, when executed by a processor, implement the method of claim 1.
Abstract
A reasoning anomaly detection engine for evidence-bound artificial intelligence continuously monitors reasoning sessions for goal drift, evidence fabrication, circular reasoning, and resource anomalies, each detected through distinct evaluation protocols. Anomaly records are admitted to the AIEP evidence ledger with severity scores, and anomalies exceeding configured thresholds trigger safety escalation to the Safety Governance Engine with structured anomaly context.