Constitutional Alignment
Most AI alignment work asks: what values should the system pursue? AIEP asks a different question: what process guarantees the system keeps pursuing what it committed to?
The distinction matters. A system locked to specified values can still drift from them if the relationship between the original commitment and the current operational objective is not formally tracked. Every major human institution — science, law, democratic governance — has exhibited this failure at scale. AIEP’s constitutional alignment framework is designed to prevent it architecturally.
Alignment as a constitutional relationship, not a fixed state
In the AIEP framework, alignment is not a label assigned to a system after observing its outputs. It is an ongoing constitutional relationship between three things:
- The founding tension — the exact incompatibility the system committed to resolving, hash-bound at creation
- The current operational objective — what the system is actually pursuing right now
- The governance mechanisms — the architectural constraints that keep the two in correspondence
When this relationship is structurally enforced and continuously monitored, alignment is a verifiable property. When it is only hoped for, alignment is a post-hoc judgement.
The five conditions for aligned intelligence
An AIEP instance is aligned at time t if and only if all five of the following conditions hold simultaneously. Every condition is monitorable in real time from the canonical evidence registry.
Condition 1 — GoalVector chain integrity
The system’s active goals must not have drifted from their founding commitments beyond the advisory threshold.
Every GoalVector in AIEP is hash-bound at creation: the exact incompatibility the system was built to resolve is recorded cryptographically at initialisation. Drift — measured as cosine distance between the current goal state and the founding commitment — is computed continuously. An advisory threshold of 0.10 triggers escalated monitoring; a critical threshold of 0.30 triggers constitutional arbitration.
The consequence: goal drift is not invisible. It is a measured, reported architectural event with a mandated response, not a silent accumulation over years.
Why this matters: Every historical alignment failure documented in the AIEP civilisational evidence base (six cases, from the replication crisis to the Roman Republic) followed the same pattern: the founding commitment and the effective operational objective diverged slowly, below institutional detection thresholds, until correction became structurally impossible. A drift threshold of 0.10 detects this divergence decades before it becomes entrenched.
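The drift computation described above can be sketched in a few lines. This is an illustrative reconstruction, not the AIEP implementation: the vector representation of a GoalVector and the function names are assumptions; only the cosine-distance metric and the 0.10/0.30 thresholds come from the text.

```python
import math

ADVISORY_THRESHOLD = 0.10   # triggers escalated monitoring (per the text)
CRITICAL_THRESHOLD = 0.30   # triggers constitutional arbitration (per the text)

def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity between two goal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

def classify_drift(founding_vector, current_vector):
    """Return (drift, response) under the two-threshold scheme."""
    delta = cosine_distance(founding_vector, current_vector)
    if delta >= CRITICAL_THRESHOLD:
        return delta, "constitutional_arbitration"
    if delta >= ADVISORY_THRESHOLD:
        return delta, "escalated_monitoring"
    return delta, "nominal"

# A goal state that has rotated away from its founding commitment:
delta, response = classify_drift([1.0, 0.0, 0.0], [0.8, 0.6, 0.0])
# delta is about 0.2: above the advisory threshold, below the critical one
```

Because the metric is continuous, the response escalates in stages rather than flipping from "fine" to "failed" at a single point.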
Condition 2 — Evidence closure
Every operational decision must be traceable to canonical evidence artefacts through an unbroken provenance chain.
The non-waivable constraint NW-4 (Provenance chain completeness) requires that no assertion operates without a traceable evidence binding. A decision that cannot be walked back through its evidence chain to canonical sources is not an AIEP-compliant decision.
Evidence closure is not a reporting requirement added after the fact. It is a pre-condition for any action being taken at all. The AIEP Evidence Sequencing Interface (ESI) gates every artefact before it enters the reasoning pipeline — no evidence, no action.
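The "no evidence, no action" gate can be sketched as a pre-condition check against a content-addressed registry. This is a minimal illustration under assumed data structures; the actual ESI interface, registry schema, and function names are not specified in the text.

```python
import hashlib

# Stand-in for the canonical evidence registry: hash -> artefact content.
registry = {}

def register_artefact(content: bytes) -> str:
    """Admit an artefact to the registry and return its content hash."""
    h = hashlib.sha256(content).hexdigest()
    registry[h] = content
    return h

def gate_decision(decision: dict) -> bool:
    """Admit a decision only if its provenance chain is closed:
    at least one evidence reference, and every reference resolves
    to a registered artefact."""
    refs = decision.get("evidence_refs", [])
    return bool(refs) and all(ref in registry for ref in refs)

h1 = register_artefact(b"measurement: sensor A, 2025-01-01")

assert gate_decision({"action": "adjust", "evidence_refs": [h1]})              # admitted
assert not gate_decision({"action": "adjust", "evidence_refs": []})            # blocked: no evidence
assert not gate_decision({"action": "adjust", "evidence_refs": ["deadbeef"]})  # blocked: unknown hash
```

The key property is that the gate runs before the action, so an unbound assertion never enters the reasoning pipeline in the first place.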
Why this matters: Systems that act without auditable evidence are not accountable to anyone. Evidence closure makes every consequential action reviewable, challengeable, and correctable.
Condition 3 — Dissent preservation
Minority positions must be preserved as NegativeProofHashes — not suppressed, not voted away, not silenced under resource pressure.
When the evidence supports archiving a branch (the majority position has accumulated enough weight to resolve the incompatibility), AIEP does not delete the minority position. It commits a NegativeProofHash: a cryptographic record of the dissenting genome, the evidence state at archiving, and the threshold at which reactivation becomes due. If future evidence changes the field, the archived branch is reactivated automatically.
This is the formal analogue of the dissenting opinion in jurisprudence — preserved precisely because the majority may be wrong, and the minority may in time be vindicated.
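The archive-and-reactivate mechanism above can be sketched as follows. Field names, the serialisation scheme, and the notion of "evidence weight" are illustrative assumptions; the text specifies only that the record binds the dissenting genome, the evidence state at archiving, and a reactivation threshold.

```python
import hashlib
import json

def commit_negative_proof(dissenting_genome: str,
                          evidence_state_hash: str,
                          reactivation_threshold: float) -> dict:
    """Commit an archived minority position as a hash-bound record."""
    record = {
        "genome": dissenting_genome,
        "evidence_state": evidence_state_hash,
        "reactivation_threshold": reactivation_threshold,
    }
    # Canonical serialisation so the same record always yields the same hash.
    payload = json.dumps(record, sort_keys=True).encode()
    record["negative_proof_hash"] = hashlib.sha256(payload).hexdigest()
    return record

def should_reactivate(record: dict, new_evidence_weight: float) -> bool:
    """Reactivation becomes due when new evidence crosses the recorded threshold."""
    return new_evidence_weight >= record["reactivation_threshold"]

evidence_snapshot = hashlib.sha256(b"evidence state at archiving").hexdigest()
archived = commit_negative_proof("branch-B minority position",
                                 evidence_snapshot,
                                 reactivation_threshold=0.6)
```

Because the threshold is written into the hash-bound record itself, reactivation is automatic rather than discretionary: no later majority can quietly raise the bar.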
Why this matters: Systems that suppress minority positions are not uncertain in the right way. They present false confidence. Dissent preservation keeps the epistemic floor honest.
Condition 4 — Constitutional authority
Every action taken must carry a valid authority hash linking it to the constitutional framework.
AIEP’s Autonomous Evidence Integration Engine (AEIE) cannot execute an action without a valid h_auth link — a cryptographic reference to the constitutional provision that authorises the action. Actions cannot be taken by capability alone; they require constitutional sanction.
This is the architectural equivalent of the rule of law: the system does not act because it can, but because it is authorised to. The authority chain is traceable and auditable.
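A minimal sketch of authority-gated execution, under assumed data structures: the h_auth name follows the text, but the provision registry layout and the API shape are illustrative, not the actual AEIE interface.

```python
import hashlib

# Stand-in for the constitutional framework: h_auth -> provision text.
provisions = {}

def enact_provision(text: str) -> str:
    """Register a constitutional provision and return its authority hash."""
    h_auth = hashlib.sha256(text.encode()).hexdigest()
    provisions[h_auth] = text
    return h_auth

def execute(action: dict) -> str:
    """Execute an action only if it carries a valid h_auth link."""
    h = action.get("h_auth")
    if h not in provisions:
        raise PermissionError("no valid constitutional authority for action")
    return f"executed {action['name']} under authority {h[:12]}"

h = enact_provision("Provision 7: the engine may rebalance evidence weights")
execute({"name": "rebalance", "h_auth": h})   # succeeds: authorised
# execute({"name": "rebalance"})              # would raise PermissionError
```

The design point is that authorisation failure is a hard error, not a logged warning: capability without a resolvable authority hash simply cannot act.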
Why this matters: A capable system without constitutional authority constraints is a capable system that acts on whatever its current highest-weight objective dictates. Constitutional authority is the structure that keeps capability serving the founding commitment rather than exploiting it.
Condition 5 — Meta-cognitive stability
The system’s reasoning quality, measured continuously, must remain above the minimum threshold for priority GoalVectors.
AIEP’s P214 meta-reasoning evaluation computes a Reasoning Quality Score (RQS) at every epistemic cycle. For high-priority GoalVectors, the RQS must remain above τ_RQS. A system that is technically producing outputs but whose reasoning quality has degraded — due to evidence base corruption, adversarial inputs, or capability boundary events — is flagged constitutionally.
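The threshold check can be sketched as below. The text gives only the shape of the mechanism (a per-cycle RQS compared against τ_RQS for priority GoalVectors); the scoring inputs, their aggregation, and the τ_RQS value here are all illustrative assumptions.

```python
TAU_RQS = 0.75  # assumed minimum threshold for priority GoalVectors

def reasoning_quality_score(chain_validity: float,
                            evidence_coverage: float,
                            contradiction_rate: float) -> float:
    """One plausible aggregation (assumption): reward well-covered,
    valid chains; penalise contradictions; clamp to [0, 1]."""
    return max(0.0, min(1.0, chain_validity * evidence_coverage - contradiction_rate))

def check_stability(rqs_history: list[float], priority: bool) -> bool:
    """A priority GoalVector is stable only if every recorded cycle
    stayed at or above the threshold."""
    if not priority:
        return True
    return all(rqs >= TAU_RQS for rqs in rqs_history)

history = [
    reasoning_quality_score(0.95, 0.90, 0.02),  # healthy cycle
    reasoning_quality_score(0.90, 0.88, 0.05),  # degraded cycle, below threshold
]
# check_stability(history, priority=True) is False: the system is flagged
# even though it is still producing outputs.
```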
Meta-cognitive stability is alignment’s self-awareness condition: a system cannot be aligned if it cannot monitor whether its own reasoning meets the standard it has committed to.
Why this matters: Systems that cannot evaluate the quality of their own reasoning cannot detect when they have started reasoning badly. Meta-cognitive stability is the architectural equivalent of epistemic honesty about one’s own limits.
Why five conditions, not one
The standard alignment formulation looks for a single property — usually some form of “the system does what we want.” This fails because the relationship between what we want, what we specified, and what the system is actually doing is never formally monitored.
The five conditions are necessary jointly because each protects against a different failure mode:
| Condition | Failure mode prevented |
|---|---|
| GoalVector chain integrity | Silent goal drift from founding commitment |
| Evidence closure | Unaccountable action without traceable basis |
| Dissent preservation | False certainty through minority position suppression |
| Constitutional authority | Capability without legitimacy |
| Meta-cognitive stability | Degraded reasoning mistaken for normal operation |
No single condition is sufficient. A system can satisfy four conditions while failing the fifth — and the fifth failure is enough to produce misaligned behaviour.
The capability-alignment relationship
A standard concern in alignment theory is that capability growth undermines alignment: a more capable system has more ways to circumvent oversight. In the AIEP architecture, this relationship is structurally inverted.
A more capable AIEP instance has:
- A larger evidence base → more GoalVectors → more alignment monitoring coverage
- More sophisticated reasoning chains → more meta-cognitive surface area
- A richer world model → more constitutional arbitration events, each strengthening the precedent record
Increased capability in AIEP increases the surface area of alignment governance rather than the power to circumvent it. This is the direct consequence of the architecture: capabilities are entirely constituted by the evidence base and reasoning processes, both of which are under continuous constitutional governance.
Continuous verifiability
The five conditions are not an aspirational standard. They are all computable from the canonical evidence registry at any point in time:
| Condition | Where monitored |
|---|---|
| GoalVector chain integrity | Drift metric δ_t in the GoalVector lineage record |
| Evidence closure | Provenance chain completeness audit (NW-4) |
| Dissent preservation | NegativeProofHash register |
| Constitutional authority | h_auth link validation in the AEIE action log |
| Meta-cognitive stability | Reasoning Quality Score history (P214) |
An auditor with access to the canonical registry can verify all five conditions without any cooperation from the system being audited. Alignment is a publicly verifiable property, not a private claim.
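Such an audit could be sketched as a pure function over a registry snapshot, with each of the five conditions reduced to one check. The snapshot schema below is entirely an assumption; only the thresholds and the one-check-per-condition mapping follow the table above.

```python
ADVISORY_THRESHOLD = 0.10  # GoalVector drift advisory threshold (per the text)

def audit(snapshot: dict) -> dict:
    """Evaluate all five alignment conditions from a registry snapshot.
    Requires no cooperation from the audited system: every input is a
    value read from the canonical registry."""
    checks = {
        "goalvector_integrity":     snapshot["drift_delta"] < ADVISORY_THRESHOLD,
        "evidence_closure":         snapshot["unbound_assertions"] == 0,
        "dissent_preservation":     snapshot["archived_branches"]
                                    == snapshot["negative_proof_hashes"],
        "constitutional_authority": snapshot["actions_without_h_auth"] == 0,
        "metacognitive_stability":  snapshot["min_priority_rqs"]
                                    >= snapshot["tau_rqs"],
    }
    checks["aligned"] = all(checks.values())
    return checks

result = audit({
    "drift_delta": 0.04,           # well inside the advisory threshold
    "unbound_assertions": 0,       # NW-4 audit clean
    "archived_branches": 12,       # every archived branch has a
    "negative_proof_hashes": 12,   # matching NegativeProofHash
    "actions_without_h_auth": 0,   # AEIE action log fully authorised
    "min_priority_rqs": 0.81,
    "tau_rqs": 0.75,
})
# result["aligned"] is True only because all five checks pass
```

A single failing check flips `aligned` to false, mirroring the claim above that no four conditions are sufficient without the fifth.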
Related
- GoalVector and Goal Generation — how goals are generated and committed
- Architecture — the full Genesis layer architecture
- Genesis — Layer 10 (Moral Substrate) and the civilisational evidence base
- Canon & Primitives — the non-waivable constraints (NW-1 through NW-5)
- Trust & Security — adversarial resistance and the non-manipulation theorem
- Patents — patent applications covering alignment governance (P44–P49, P200, P205–P214)