◎ OS PUB Apache 2.0

P212 — AIEP — Self-Model and Capability Awareness Engine

Applicant: Neil Grassby
Classification: Patent Application - Confidential
Priority: Claims priority from GB2519711.2 filed 20 November 2025
Architecture Layer: AIEP AGI Cognition Layer - Phase 2


Framework Context

[0001] This specification operates within an AIEP environment as defined in GB2519711.2 and GB2519798.9. The present specification defines the self-modelling capability of the Phase-2 AIEP cognition architecture, enabling the reasoning system to maintain and query an evidence-based model of its own capabilities, limitations, and resource state.


Field of the Invention

[0002] The present invention relates to self-modelling architectures and capability awareness systems for evidence-bound artificial intelligence.

[0003] More particularly, the invention relates to a system in which an AI reasoning engine maintains a structured model of its own capabilities, operational constraints, and historical performance record, derived from evidence artefacts, that it may query during goal planning and task selection.


Background

[0004] AI systems that select tasks, plan workflows, and allocate resources must have accurate knowledge of their own capabilities. Conventional systems use static capability declarations or parametric self-assessment lacking verifiable provenance. An evidence-bound architecture requires capability claims to be grounded in observable performance records.


Summary of the Invention

[0005] The invention provides a Self-Model and Capability Awareness Engine (SMCAE) that maintains a structured self-model comprising: a registered capability catalogue (tools, reasoning strategies, knowledge domains); a historical performance record bound to task execution evidence artefacts; a current resource state record (compute budget, memory utilisation, active workflow count); and a capability confidence score for each registered capability derived from historical performance.

[0006] The SMCAE exposes a capability query interface that the Goal Formation Engine (P210) and Workflow Construction Engine (P211) may invoke to verify whether a proposed action is within current operational capability before committing to execution.


ASCII Architecture

Task Execution Evidence Artefacts
(outcomes, accuracy deltas, resource usage)
            |
            v
+------------------------------------------+
|   Self-Model & Capability Awareness      |
|                                          |
|  Capability Catalogue                    |
|    tool_X: confidence=0.92               |
|    tool_Y: confidence=0.61               |
|  Resource State                          |
|    compute_budget: 74% remaining         |
|    memory_utilisation: 52%               |
|    active_workflows: 3                   |
|  Performance History                     |
|    (evidence-bound per task)             |
+-------------------+----------------------+
                    |
        +-----------+-----------+
        |                       |
        v                       v
  Capability Query           Update Cycle
  Interface                  (on new evidence
  (P210, P211, P214)          artefact admission)

Definitions

[0007] Self-Model and Capability Awareness Engine (SMCAE): The subsystem that maintains a structured, evidence-bound model of the system’s own capabilities, operational constraints, and resource state, and exposes a query interface for use by planning and reasoning subsystems.

[0008] Capability Catalogue: A structured registry of tool and reasoning strategy capabilities, each entry comprising a capability identifier, a capability confidence score, a performance history reference, and an active/inactive status flag.

[0009] Capability Confidence Score: A numeric score in the range [0.0, 1.0] representing the historical success rate of a capability across executed tasks, derived exclusively from evidence artefact performance records.

[0010] Performance History Record: An immutable evidence artefact recording the outcome of a single capability application: task identifier, capability identifier, outcome category (SUCCESS, PARTIAL, FAILURE), accuracy delta, and resource consumed.

[0011] Resource State Snapshot: A point-in-time record of current operational resource quantities: compute budget remaining, memory utilisation percentage, and count of active concurrent workflows.


Detailed Description

Capability Registration. [0012] The SMCAE maintains a capability catalogue initialised at system startup with all registered tools and reasoning strategies. Each entry specifies the capability’s declared preconditions and postconditions, together with an initial confidence score of 0.5 (neutral prior) if no performance history exists. As performance history accumulates, confidence scores are updated using a governance-policy-defined weighting function that weights recent performance records more heavily than older ones. Capability catalogue entries are version-controlled; each update produces a new catalogue version with an associated hash.
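The recency-weighted update described above can be sketched as follows. This is a minimal illustration only: the specification leaves the weighting function to governance policy, so the exponential decay, the `decay` parameter, the function name `confidence_score`, and the half credit given to PARTIAL outcomes are all assumptions made for the example.

```python
# Credit assigned to each outcome category. The 0.5 credit for PARTIAL is an
# assumption for illustration; the specification does not define it.
OUTCOME_CREDIT = {"SUCCESS": 1.0, "PARTIAL": 0.5, "FAILURE": 0.0}

NEUTRAL_PRIOR = 0.5  # initial score when no performance history exists


def confidence_score(outcomes, decay=0.98):
    """Recency-weighted success rate.

    `outcomes` is ordered oldest to newest. The newest record has weight 1.0
    and each older record is down-weighted by a further factor of `decay`
    (a hypothetical policy parameter).
    """
    if not outcomes:
        return NEUTRAL_PRIOR
    n = len(outcomes)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    weighted_credit = sum(w * OUTCOME_CREDIT[o] for w, o in zip(weights, outcomes))
    return weighted_credit / sum(weights)


if __name__ == "__main__":
    print(confidence_score([]))
    print(confidence_score(["FAILURE", "SUCCESS", "SUCCESS"]))
```

Under this choice, a recent failure lowers the score more than an equally old success raises it, which is the behaviour the paragraph asks of the weighting function.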

Performance Record Ingestion. [0013] On each task execution completion, the Action Execution Engine (P206) and Outcome Learning Engine (P207) emit performance history records as evidence artefacts. The SMCAE subscribes to these artefacts and, on receipt, updates the relevant capability’s performance history store and recalculates its confidence score. Recalculation is bounded by a policy-defined maximum history window (default: last 200 execution records per capability). Performance history stores are cryptographically append-only; no record may be modified or deleted.
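One possible reading of a cryptographically append-only store is a hash chain, in which each entry's hash covers the hash of the entry before it. The sketch below assumes that reading; the class name `PerformanceHistoryStore`, the record fields, and the use of SHA-256 over canonical JSON are illustrative choices, not taken from the specification.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash preceding the first entry (assumption)


class PerformanceHistoryStore:
    """Append-only log in which each entry's hash chains to the previous one."""

    def __init__(self):
        self._entries = []  # list of (record, chain_hash) pairs

    @staticmethod
    def _link(prev_hash, record):
        payload = json.dumps(record, sort_keys=True)
        return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

    def append(self, record):
        """Add a record and return its chain hash. No update or delete exists."""
        prev_hash = self._entries[-1][1] if self._entries else GENESIS
        chain_hash = self._link(prev_hash, record)
        self._entries.append((dict(record), chain_hash))
        return chain_hash

    def window(self, max_records=200):
        """Return copies of the most recent records (assumes max_records >= 1)."""
        return [dict(record) for record, _ in self._entries[-max_records:]]

    def verify(self):
        """Recompute the chain; any altered record breaks every later hash."""
        prev_hash = GENESIS
        for record, chain_hash in self._entries:
            if self._link(prev_hash, record) != chain_hash:
                return False
            prev_hash = chain_hash
        return True
```

Confidence recalculation would then read from `window()`, so the bounded history window limits the cost of each update while the full chain remains available for audit.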

Resource State Tracking. [0014] The SMCAE maintains a real-time resource state snapshot updated by the Resource Allocation Engine (P213) on each allocation or release event. The snapshot records compute budget remaining, memory utilisation percentage, and the count of active concurrent workflows. Resource state snapshots are not admitted to the evidence ledger individually; instead, a resource state summary is appended to each workflow execution record as an immutable annotation.
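The event-driven update can be illustrated with an immutable snapshot that is replaced, never edited, on each allocation or release. The field names, the `apply_event` function, and the rule that released workflows return memory but not compute budget are assumptions for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ResourceStateSnapshot:
    compute_budget_remaining: float  # fraction of total budget, 0.0 to 1.0
    memory_utilisation_pct: float    # percentage, 0 to 100
    active_workflows: int


def apply_event(snapshot, kind, compute=0.0, memory_pct=0.0):
    """Return a new snapshot reflecting an 'allocate' or 'release' event."""
    if kind == "allocate":
        return replace(
            snapshot,
            compute_budget_remaining=snapshot.compute_budget_remaining - compute,
            memory_utilisation_pct=snapshot.memory_utilisation_pct + memory_pct,
            active_workflows=snapshot.active_workflows + 1,
        )
    if kind == "release":
        # Assumption: compute already consumed is not refunded on release.
        return replace(
            snapshot,
            memory_utilisation_pct=snapshot.memory_utilisation_pct - memory_pct,
            active_workflows=snapshot.active_workflows - 1,
        )
    raise ValueError(f"unknown event kind: {kind!r}")


if __name__ == "__main__":
    # Starting values mirror the architecture diagram.
    snap = ResourceStateSnapshot(0.74, 52.0, 3)
    snap = apply_event(snap, "allocate", compute=0.04, memory_pct=8.0)
    print(snap)
```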

Capability Query Interface. [0015] The Goal Formation Engine (P210), Workflow Construction Engine (P211), and Meta-Reasoning Engine (P214) may invoke the capability query interface to: retrieve the current confidence score for a specified capability; identify the lowest-confidence capability in a proposed workflow; and query whether the total resource cost of a proposed workflow falls within the current resource budget. Query responses include the capability catalogue version hash, enabling callers to detect catalogue updates that may invalidate a prior query result.
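A workflow-level query combining the three operations might look like the sketch below. The function names `catalogue_hash` and `query_workflow`, the response field names, and the representation of the catalogue as a mapping from capability identifier to confidence score are hypothetical.

```python
import hashlib
import json


def catalogue_hash(catalogue):
    """Version hash over a canonical JSON rendering of the catalogue."""
    canonical = json.dumps(catalogue, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def query_workflow(catalogue, capabilities, workflow_cost, budget_remaining):
    """Answer a feasibility query for a proposed workflow.

    `catalogue` maps capability identifier to confidence score;
    `capabilities` lists the identifiers the workflow depends on.
    """
    weakest = min(capabilities, key=lambda cap: catalogue[cap])
    return {
        "min_confidence_capability": weakest,
        "min_confidence": catalogue[weakest],
        "within_budget": workflow_cost <= budget_remaining,
        "catalogue_version": catalogue_hash(catalogue),
    }


if __name__ == "__main__":
    catalogue = {"tool_X": 0.92, "tool_Y": 0.61}
    print(query_workflow(catalogue, ["tool_X", "tool_Y"], 0.20, 0.74))
```

A caller that stores `catalogue_version` alongside its plan can compare it with the hash in a later response and re-query if the two differ.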

Low-Confidence Flagging. [0016] Capabilities whose confidence score falls below a policy-defined minimum threshold (default: 0.4) are flagged as LOW-CONFIDENCE in the catalogue. Workflows depending on flagged capabilities are not automatically rejected but are annotated with a low-confidence warning artefact. The Meta-Reasoning Engine (P214) uses low-confidence annotations as an input to quality scoring.
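The annotate-rather-than-reject behaviour can be sketched as follows. The workflow and warning structures, the function name `annotate_workflow`, and the example capability `tool_Z` are illustrative assumptions; only the default threshold of 0.4 comes from the paragraph above.

```python
LOW_CONFIDENCE_THRESHOLD = 0.4  # policy default stated in the specification


def annotate_workflow(workflow, catalogue, threshold=LOW_CONFIDENCE_THRESHOLD):
    """Return a copy of `workflow` with a warning listing flagged capabilities.

    The workflow is never rejected here; the warning is an input for
    downstream quality scoring.
    """
    flagged = sorted(
        cap for cap in workflow["capabilities"] if catalogue[cap] < threshold
    )
    annotated = dict(workflow)
    if flagged:
        annotated["warnings"] = [
            {"type": "LOW-CONFIDENCE", "capabilities": flagged}
        ]
    return annotated


if __name__ == "__main__":
    catalogue = {"tool_X": 0.92, "tool_Z": 0.31}  # tool_Z is hypothetical
    workflow = {"id": "wf-1", "capabilities": ["tool_X", "tool_Z"]}
    print(annotate_workflow(workflow, catalogue))
```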


Technical Effect

[0017] The invention provides an evidence-grounded, continuously updated self-model for AI reasoning systems. By deriving capability confidence scores exclusively from historical performance evidence artefacts, the system avoids reliance on static or parametric self-assessments. By exposing the self-model through a typed query interface, planning subsystems can make informed decisions about task feasibility without accessing the evidence ledger directly. By maintaining append-only performance history stores, the full operational record of the system’s capability evolution is preserved for audit.


Claims

  1. A computer-implemented method for self-modelling and capability awareness, the method comprising: (a) maintaining a capability catalogue comprising entries for each registered tool and reasoning strategy, each entry including a capability identifier, a capability confidence score, and a performance history reference; (b) ingesting performance history records from task execution evidence artefacts and updating capability confidence scores using a policy-defined weighting function applied to the bounded performance history window; (c) maintaining a real-time resource state snapshot comprising compute budget remaining, memory utilisation, and active workflow count, updated on each allocation or release event; (d) exposing a capability query interface returning confidence scores, minimum-confidence identification, and resource budget feasibility assessments for proposed workflows; and (e) flagging capabilities whose confidence score falls below a policy minimum threshold as LOW-CONFIDENCE, and annotating dependent workflow records with a low-confidence warning artefact.

  2. The method of claim 1, wherein performance history stores are cryptographically append-only and no performance record may be modified or deleted.

  3. The method of claim 1, wherein capability confidence score recalculation is applied over a policy-defined maximum history window, weighting recent records more heavily.

  4. The method of claim 1, wherein query responses include the current capability catalogue version hash, enabling callers to detect catalogue updates that may invalidate a prior query result.

  5. The method of claim 1, wherein the capability catalogue is version-controlled, each update producing a new catalogue version with an associated hash recorded in downstream workflow execution artefacts.

  6. A Self-Model and Capability Awareness Engine comprising: one or more processors; memory storing a capability catalogue, performance history stores, and a resource state snapshot; wherein the processors are configured to execute the method of claim 1.

  7. A non-transitory computer-readable medium storing instructions that, when executed by a processor, implement the method of claim 1.


Abstract

A self-model and capability awareness engine for evidence-bound artificial intelligence maintains a capability catalogue with confidence scores derived exclusively from historical performance evidence artefacts. The engine continuously updates confidence scores as task execution records are ingested, flags low-confidence capabilities, and exposes a query interface enabling planning subsystems to assess task feasibility against current capability and resource state. All performance history stores are cryptographically append-only, preserving the full operational record of the system’s capability evolution for audit.


---

## Detailed Description

[0007] **Capability Confidence Score.** For each registered capability, the SMCAE computes a confidence score from the historical performance record: the fraction of prior invocations that produced outputs assessed as correct by the Outcome Learning Engine (P207). Scores are computed over a configurable rolling window of recent invocations.

[0008] **Capability Uncertainty Flag.** When fewer than the minimum required invocations are available to compute a reliable confidence score, the capability is flagged as uncertain. Goal plans that depend on uncertain capabilities are escalated to the Meta-Reasoning Engine (P214) for additional scrutiny.
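A minimal sketch of the rolling-window score and the uncertainty flag follows. The minimum invocation count is not given in the specification, so `min_invocations=10` is an assumed placeholder; the 200-record window and the 0.5 neutral prior are borrowed from the defaults stated elsewhere in this specification, and the function name is hypothetical.

```python
def assess_capability(correct_flags, window=200, min_invocations=10):
    """Compute a rolling-window confidence score and an uncertainty flag.

    `correct_flags` holds one boolean per prior invocation, oldest first,
    True where the output was assessed as correct.
    """
    recent = correct_flags[-window:]
    if recent:
        confidence = sum(recent) / len(recent)
    else:
        confidence = 0.5  # neutral prior when no history exists
    return {
        "confidence": confidence,
        "uncertain": len(recent) < min_invocations,
    }


if __name__ == "__main__":
    print(assess_capability([True, True, True, False]))
```

A plan that depends on any capability with `uncertain` set would then be routed for additional scrutiny rather than accepted on the score alone.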

[0009] **Resource State Snapshot.** The resource state record is updated on each reasoning cycle. The record includes: current compute budget remaining; estimated time-to-budget-exhaustion at current consumption rate; active workflow count; and memory pressure indicator.
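The time-to-budget-exhaustion estimate is, in its simplest form, remaining budget divided by the current consumption rate. The sketch below assumes that linear model; the function name and the choice of units are illustrative.

```python
import math


def time_to_exhaustion(budget_remaining, consumption_rate):
    """Estimate reasoning cycles until the compute budget runs out.

    `budget_remaining` is in budget units and `consumption_rate` in budget
    units per cycle. A rate of zero or less means the budget is not being
    drawn down, so the estimate is unbounded.
    """
    if consumption_rate <= 0:
        return math.inf
    return budget_remaining / consumption_rate


if __name__ == "__main__":
    print(time_to_exhaustion(74.0, 2.0))
```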

[0010] **Self-Model Evidence Binding.** Capability confidence scores are computed only from task outcomes that have been admitted as evidence artefacts to the AIEP ledger. Self-assessment is therefore fully traceable to observable performance records.

---

## Claims

1. A self-model and capability awareness engine for an evidence-bound reasoning architecture wherein capability confidence scores are derived from evidence-bound performance records.
2. The engine of claim 1 wherein capabilities with insufficient performance history are flagged as uncertain.
3. The engine of claim 1 wherein resource state is tracked on each reasoning cycle and exposed through a standard query interface.
4. The engine of claim 1 wherein a capability query interface is available to goal planning and workflow construction subsystems.

Dependencies