Context-Graph Substrate

The computational backbone

A22 · Prose proof · The data structure that implements the sheaf condition computationally.

How beautiful the world would be if there were a procedure for moving through labyrinths.
— Umberto Eco, The Name of the Rose (1980)

This chapter formalizes the context-graph substrate as Anchor A22: a typed data model with five node types (Context, Claim, Witness, Constraint, Equivalence), five edge types, and three operations (glue, restrict, transport) that together constitute the minimal storage model required by the Third Mode. A22 reifies the machinery developed in Parts II through IV into a concrete specification that builders can implement. The reader seeking the narrative motivation for this substrate should consult Vol I, Chapter 7 (The Witness Protocol), where the data-modeling problem is introduced through worked examples rather than formal schema.

The Buzzword Problem

"Context graph" has become an overloaded phrase. Every vendor claims one. It appears in slide decks about knowledge management, entity resolution, recommendation engines, and fraud detection. It has come to mean "a graph that is somehow relevant to context," which is to say, nothing precise at all. We will use it in a specific sense.

This chapter gives the term content by specifying exactly what a substrate for the Third Mode must represent. The question is not "what is a context graph?" but "what is the minimal data model that supports governed predicate invention, sheaf-style gluing, and cost-aware coherence?"

The answer is a typed data model with five node types, five edge types, and three operations. Anything smaller loses a capability the Third Mode requires. Anything larger adds complexity without structural necessity.

What the Substrate Must Represent

Part IV defined operations: predicate invention (A17), invariant checking (A17b, A18), proposal and certification (A19, A19b), search (A20), and cost accounting (A21). These are formal specifications. A builder asks: what do I actually store?

The answer cannot be:

Just facts (loses locality)
Just graphs (loses witnesses and constraints)
Just triples (loses typing and logic)
Knowledge graphs as typically understood (no gluing, no typed witnesses, no cost model)

Each capability from Parts II–IV implies a storage requirement:

Capability	Storage Requirement
Local truth	Claims scoped to contexts
Typed verification	Witnesses with class labels
Site structure	Contexts with refinement morphisms
Transport	Equivalences with certificates
Conflict detection	Contradictions with proofs
Cost tracking	Operations annotated with budgets

The substrate is the reification of this machinery.

Anchor A22: Context-Graph Substrate

NODES (five types):

Context(U): A view with signature $\Sigma_U$, invariants $I_U$, logic $L_U$, and absence policy $P_U$ per A12
Claim(p): A proposition with content, status (asserted | retracted | pending), and timestamp
Witness($\pi$): Typed per A2c as decidable | probabilistic | attested
Constraint(c): Typed per A18 as integrity | semantic | computational | authority; enforced at assertion-time and glue-time. (We use $c$ to avoid collision with $I_U$, the invariant set of a context.)
Equivalence(e): Witnessed sameness per A10 with kind $K$, scope $S$, and a transport certificate per A16

EDGES (five types):

supports($\pi$, p): Witness $\pi$ supports claim $p$
scoped_to(p, U): Claim $p$ is asserted in context $U$
refines(U, V): Context $U$ refines $V$. We write $U \to V$ when $U$ refines $V$ ($U$ is finer than $V$; claims in $V$ can be restricted to $U$).
transports(e, p, p′): Equivalence $e$ transports claim $p$ to $p'$
contradicts($\delta$, p, q): Proof witness $\delta$ establishes that $p$ and $q$ are inconsistent

Note on hyperedges: transports and contradicts are ternary relations (hyperedges). In storage, they are reified as junction tables or edge-nodes linking three objects.

Note on entities and overlaps: Entities (the objects claims are about) are opaque tokens inside claim content. The substrate maintains an entity incidence index: for each entity term $x$, record which claims mention $x$ and in which contexts. Incidence yields candidate overlaps: two contexts potentially overlap on $x$ iff both contain claims mentioning $x$. Whether those mentions co-refer is determined by Equivalence witnesses (A10) and becomes binding only after identity maintenance (Chapter 21). The index is not a node type because entity tokens are supplied by the host system; the Third Mode supplies equivalence, transport, and identity maintenance over those tokens.
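The incidence index described in this note can be sketched in a few lines of Python. This is a minimal illustration, not part of A22: the class name EntityIncidenceIndex and its methods are hypothetical, and claims and contexts are represented by plain string ids.

```python
from collections import defaultdict
from itertools import combinations


class EntityIncidenceIndex:
    """Maps each entity token to the (claim, context) pairs that mention it."""

    def __init__(self):
        # entity token -> set of (claim_id, context_id)
        self._incidence = defaultdict(set)

    def record(self, entity, claim_id, context_id):
        self._incidence[entity].add((claim_id, context_id))

    def contexts_mentioning(self, entity):
        # .get avoids creating an empty entry for unknown entities
        return {ctx for _, ctx in self._incidence.get(entity, set())}

    def candidate_overlaps(self, entity):
        """Unordered context pairs that both contain claims mentioning the entity."""
        contexts = sorted(self.contexts_mentioning(entity))
        return list(combinations(contexts, 2))


index = EntityIncidenceIndex()
index.record("dress_X", "p_A", "U_brand_A")
index.record("dress_X", "p_B", "U_brand_B")
index.record("dress_Y", "p_C", "U_brand_A")

print(index.candidate_overlaps("dress_X"))  # [('U_brand_A', 'U_brand_B')]
print(index.candidate_overlaps("dress_Y"))  # []
```

The pairs returned are only candidates. Whether the two mentions of dress_X co-refer is still a question for an Equivalence witness, which this sketch does not model.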

OPERATIONS:

glue(cover, target): Attempts the sheaf condition. Returns GlobalClaim($p$, $U$, $\gamma$) where $\gamma$ is a GluingWitness, or ObstructionWitness(failing_overlaps, minimal_unsat_core).
restrict($U \to V$, p): Restricts claim $p$ from the coarser context $V$ to the finer context $U$ (functorial action).
transport(e, p, footprint): Transports claim along equivalence, checking property footprint per A16.

Why Contexts Are Nodes

In most graph databases, context is ambient or encoded as edge metadata. In the context graph, contexts are first-class nodes because you need to store multiple logics, signatures, and invariants simultaneously and move claims between them via explicit morphisms.

A claim in the FDA regulatory view and a claim in the EU regulatory view are not the same claim with different labels. They are claims in different contexts with different proof logics and different absence policies. The substrate must represent this difference as structure, not annotation.

When you transport a claim from one context to another, you are traversing a morphism in the site. That morphism has a source, a target, and constraints on what properties survive the transport. Without contexts as nodes, this structure has nowhere to live.

The Five Node Types

Context (U)

A context is a view with typed payload:

Signature $\Sigma_U$ : the vocabulary available in this context
Invariants $I_U$ : the constraints enforced in this context
Logic $L_U$ : the proof rules valid in this context
Absence policy $P_U$ : how this context treats missing data (CWA, OWA, or explicit unknown)

Contexts form a site per A12. The refinement relation U refines V means U is more specific: claims valid in V can be restricted to U, but not necessarily vice versa.

Claim (p)

A claim is a proposition with metadata:

Content: the assertion itself (including the entity terms it mentions)
Status: asserted (active), retracted (withdrawn), or pending (under review)
Timestamp: when the claim was made or last modified

Claims have no intrinsic "scope" field. Scoping is purely structural: every claim is linked to at least one context via scoped_to edges. A claim may be asserted in multiple contexts with different witnesses in each. The entity incidence index tracks which entities each claim mentions.

Witness (π)

A witness is evidence that supports a claim, typed per A2c:

Decidable: verification terminates with ok | fail
Probabilistic: verification terminates with (ok, p, bounds) | (fail, p)
Attested: verification checks provenance, returns ok_if_trusted(authority) | fail

A claim without a witness has no standing. This is the discipline that separates the Third Mode from systems that store assertions without evidence.
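One way to picture the three witness classes is a verifier that dispatches on class and returns a result shaped like the outcomes listed above. The sketch below is hypothetical throughout: the dictionary layout, the 0.95 acceptance threshold for probabilistic witnesses, and the helper has_standing are assumptions made for illustration, not definitions from A2c.

```python
def verify(witness, threshold=0.95):
    """Dispatch on witness class; return a tuple shaped like the listed outcomes."""
    kind = witness["class"]
    if kind == "decidable":
        # Terminates with ok | fail.
        return ("ok",) if witness["check"]() else ("fail",)
    if kind == "probabilistic":
        # Terminates with (ok, p, bounds) | (fail, p). Threshold is an assumption.
        p, bounds = witness["estimate"]()
        return ("ok", p, bounds) if p >= threshold else ("fail", p)
    if kind == "attested":
        # Checks provenance only; trusting the authority is left to the caller.
        if witness["provenance_valid"]:
            return ("ok_if_trusted", witness["authority"])
        return ("fail",)
    raise ValueError(f"unknown witness class: {kind}")


def has_standing(claim_id, supports_edges):
    """A claim has standing only if some supports(witness, claim) edge exists."""
    return any(claim == claim_id for _, claim in supports_edges)


# Hypothetical witnesses: a measured volume ratio of 1.8 against a 1.5 cutoff,
# and a merchandiser attestation whose provenance has been checked.
measurement = {"class": "decidable", "check": lambda: 1.8 >= 1.5}
attestation = {"class": "attested", "authority": "merchandiser_B",
               "provenance_valid": True}

print(verify(measurement))   # ('ok',)
print(verify(attestation))   # ('ok_if_trusted', 'merchandiser_B')
print(has_standing("p_A", [("pi_A", "p_A")]))  # True
print(has_standing("p_C", [("pi_A", "p_A")]))  # False
```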

Constraint (c)

A constraint is a rule that claims must satisfy, typed per A18:

Integrity: structural validity (types, cardinalities, referential integrity)
Semantic: domain-level rules (a dress cannot be both minimalist and maximalist)
Computational: resource bounds (query must complete in 100ms)
Authority: who is allowed to assert what (only certified labs can attest purity)

Constraints are scoped: different contexts may enforce different constraints on the same predicate.

Equivalence (e)

An equivalence is a witnessed sameness per A10:

Kind $K$ : what type of sameness (identity, isomorphism, approximation)
Scope $S$ : where the sameness holds
Transport certificate: what properties can be moved along this equivalence per A16

Equivalences are themselves witnessed claims about identity across contexts. The same discipline applies: an equivalence without a witness has no standing. Without this node type, the "e" in transport operations has no home. Equivalences are not implicit; they are first-class objects that license specific operations.
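The five node types could be written down as typed records along the following lines. This is a sketch under stated assumptions: the field names, the use of string ids, and the representation of signatures and certificates as frozensets are choices made here for illustration and are not prescribed by A22.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class ClaimStatus(Enum):
    ASSERTED = "asserted"
    RETRACTED = "retracted"
    PENDING = "pending"


class WitnessClass(Enum):
    DECIDABLE = "decidable"
    PROBABILISTIC = "probabilistic"
    ATTESTED = "attested"


class ConstraintKind(Enum):
    INTEGRITY = "integrity"
    SEMANTIC = "semantic"
    COMPUTATIONAL = "computational"
    AUTHORITY = "authority"


@dataclass(frozen=True)
class Context:
    id: str
    signature: frozenset      # Sigma_U: vocabulary available in this context
    invariants: frozenset     # I_U: ids of constraints enforced here
    logic: str                # L_U: name of the proof rules in force
    absence_policy: str       # P_U: "CWA", "OWA", or "explicit_unknown"


@dataclass(frozen=True)
class Claim:
    id: str
    content: str              # includes the entity terms it mentions
    status: ClaimStatus
    timestamp: datetime       # no scope field: scoping lives on scoped_to edges


@dataclass(frozen=True)
class Witness:
    id: str
    witness_class: WitnessClass


@dataclass(frozen=True)
class Constraint:
    id: str
    kind: ConstraintKind


@dataclass(frozen=True)
class Equivalence:
    id: str
    kind: str                 # K: "identity", "isomorphism", or "approximation"
    scope: str                # S: id of the context where the sameness holds
    witness_id: str           # an equivalence without a witness has no standing
    certificate: frozenset    # properties that may be transported per A16


# Illustrative instances; the absence policy "OWA" is an arbitrary choice here.
u_a = Context("U_brand_A", frozenset({"puffy"}), frozenset(), "measurement", "OWA")
p_a = Claim("p_A", "puffy(dress_X)", ClaimStatus.ASSERTED, datetime.now(timezone.utc))
```

Keeping Claim free of any context field is deliberate in this sketch: it forces scoping through edges, which is what lets one claim be asserted in several contexts with different witnesses.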

The Five Edge Types

supports(π, p)

Links a witness to the claim it supports. The edge carries no information beyond the linkage; typing is on the witness node.

scoped_to(p, U)

Links a claim to a context where it is asserted. A claim may be scoped to multiple contexts (with different witnesses in each). Note: scoped_to is a structural relation; restrict is an operation that produces a restricted claim.

refines(U, V)

Encodes site structure. U refines V means U is a more specific view. Claims in V can be restricted to U. This edge is the morphism in the site category.

transports(e, p, p′)

Links an equivalence to the claims it connects. The edge records which properties were checked (the footprint). Per A16, transport requires a certificate, which is stored on the Equivalence node.

contradicts(δ, p, q)

Links a proof witness δ to two claims p and q that are inconsistent. The same discipline applies: a contradiction without a proof witness has no standing. Conflicts are explicit, computed, and stored.
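A small in-memory store shows how the fixed edge schema and the reified hyperedges might be enforced together. The names EDGE_ARITY, Edge, and EdgeStore are hypothetical, and node ids are plain strings; a production store would more likely use junction tables as noted in A22.

```python
from collections import namedtuple

# Edge name -> arity. transports and contradicts are ternary (hyperedges).
EDGE_ARITY = {
    "supports": 2,      # (witness, claim)
    "scoped_to": 2,     # (claim, context)
    "refines": 2,       # (finer context, coarser context)
    "transports": 3,    # (equivalence, claim, transported claim)
    "contradicts": 3,   # (proof witness, claim, claim)
}

Edge = namedtuple("Edge", ["kind", "members"])


class EdgeStore:
    """Stores every edge as a reified record, so hyperedges need no special case."""

    def __init__(self):
        self.edges = []

    def add(self, kind, *members):
        # Schema discipline: only the five edge types, each with its fixed arity.
        if kind not in EDGE_ARITY:
            raise ValueError(f"unknown edge type: {kind}")
        if len(members) != EDGE_ARITY[kind]:
            raise ValueError(
                f"{kind} expects {EDGE_ARITY[kind]} members, got {len(members)}"
            )
        edge = Edge(kind, tuple(members))
        self.edges.append(edge)
        return edge

    def find(self, kind, member):
        """All edges of the given kind that mention the given node id."""
        return [e for e in self.edges if e.kind == kind and member in e.members]


store = EdgeStore()
store.add("supports", "pi_A", "p_A")
store.add("scoped_to", "p_A", "U_brand_A")
store.add("refines", "U_brand_A", "U_global")
store.add("contradicts", "delta", "p_sustainable_A", "p_not_sustainable_B")

print(store.find("contradicts", "p_sustainable_A"))
```

Because a contradicts edge cannot be added without its first member, the rule that a contradiction without a proof witness has no standing is enforced by arity alone in this sketch.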

The Three Operations

glue(cover, target)

The central operation. Takes a cover of local claims and attempts to produce a global claim in the target context.

Input:

A cover $\{U_i \to U\}$: contexts that jointly describe the target
Claims $p_i$ in each $U_i$ with witnesses $\pi_i$

Procedure:

Check coverage: do the $U_i$ cover $U$?
Compute overlaps: derive $U_i \cap U_j$ from shared entity incidence
Check overlap agreement: for all overlaps, do $p_i|_{\text{overlap}}$ and $p_j|_{\text{overlap}}$ agree?
If yes: produce GlobalClaim($p$, $U$, $\gamma$) where $\gamma$ is a GluingWitness
If no: produce ObstructionWitness(failing_overlaps, minimal_unsat_core)

Output:

Success: GlobalClaim with a GluingWitness (amalgamation certificate). The witness is typed as decidable per A2c. Decidable here refers to the procedure's outcome (it terminates with success/failure), not the claim's truth status in every logic.
Failure: ObstructionWitness with the overlaps that failed and a minimal set of claims that cannot be reconciled

Cost: Charged per A21 model. Scales with number of overlaps and witness verification cost.
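The procedure can be sketched as a toy gluing routine. Several simplifications are assumptions of this sketch rather than part of A22: each local claim is reduced to an entity-to-value mapping, agreement on an overlap is plain equality (standing in for witness reconciliation or calibration), a coverage failure raises an error, and the minimal unsatisfiable core is taken to be the first conflicting pair found. Cost accounting is omitted.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class GlobalClaim:
    content: dict            # entity -> value, the amalgamated section
    context: str
    gluing_witness: tuple    # ids of the local contexts that were glued


@dataclass(frozen=True)
class ObstructionWitness:
    failing_overlaps: tuple      # (context_i, context_j, entity) triples
    minimal_unsat_core: tuple    # one conflicting pair of (context, entity, value)


def glue(cover, target, target_entities):
    """cover: {context_id: {entity: value}}. Returns GlobalClaim or ObstructionWitness."""
    # Step 1: coverage. Every target entity must appear in some local section.
    covered = set().union(*[set(section) for section in cover.values()])
    missing = set(target_entities) - covered
    if missing:
        raise ValueError(f"not a cover, missing: {sorted(missing)}")

    # Steps 2-3: derive overlaps from shared entities, then check agreement.
    failing, core = [], None
    for (ci, si), (cj, sj) in combinations(sorted(cover.items()), 2):
        for entity in sorted(set(si) & set(sj)):
            if si[entity] != sj[entity]:
                failing.append((ci, cj, entity))
                if core is None:
                    core = ((ci, entity, si[entity]), (cj, entity, sj[entity]))

    if failing:
        return ObstructionWitness(tuple(failing), core)

    # Agreement holds on every overlap, so the union of sections is well defined.
    merged = {}
    for section in cover.values():
        merged.update(section)
    return GlobalClaim(merged, target, tuple(sorted(cover)))


ok = glue({"U_brand_A": {"dress_X": "puffy"}, "U_brand_B": {"dress_X": "puffy"}},
          "U_global", ["dress_X"])
bad = glue({"U_brand_A": {"dress_X": "puffy"}, "U_brand_B": {"dress_X": "not_puffy"}},
           "U_global", ["dress_X"])

print(type(ok).__name__, ok.content)            # GlobalClaim {'dress_X': 'puffy'}
print(type(bad).__name__, bad.failing_overlaps)
```

Both return values are ordinary objects that can be stored, which is the point of the operation: failure yields a receipt rather than an exception or a silent null.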

restrict(U → V, p)

Restricts a claim from a coarser context to a finer one. Under the convention above, $U \to V$ means U refines V, so p lives in the coarser context V and is restricted to the finer context U. This is a functorial action: it always succeeds (you can always ask "what does p mean in U?"), but the status may change. A claim that is true in V may be unknown in U if U uses a different logic or has stricter invariants.
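A minimal sketch of restriction, following the convention that refines(U, V) makes U the finer context, so the claim starts in the coarser context and is restricted to the finer one. The types ContextInfo and ScopedClaim, the truth field, and the rule that a change of logic weakens truth to "unknown" are all assumptions made for illustration. The logic assigned to U_global is likewise invented for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ContextInfo:
    logic: str
    signature: frozenset


@dataclass(frozen=True)
class ScopedClaim:
    predicate: str
    entity: str
    context: str
    truth: str   # "true" or "unknown"; hypothetical field for this sketch


def restrict(contexts, refines, finer, coarser, claim):
    """Restrict `claim` from `coarser` to `finer` along refines(finer, coarser)."""
    if (finer, coarser) not in refines:
        raise ValueError(f"no refinement morphism {finer} -> {coarser}")
    if claim.context != coarser:
        raise ValueError("claim is not scoped to the coarser context")
    src, dst = contexts[coarser], contexts[finer]
    # The restricted claim always exists; only its truth status may weaken.
    same_rules = src.logic == dst.logic and claim.predicate in dst.signature
    truth = claim.truth if same_rules else "unknown"
    return replace(claim, context=finer, truth=truth)


contexts = {
    "U_global": ContextInfo("measurement", frozenset({"puffy"})),   # assumed logic
    "U_brand_A": ContextInfo("measurement", frozenset({"puffy"})),
    "U_brand_B": ContextInfo("attestation", frozenset({"puffy"})),
}
refines = {("U_brand_A", "U_global"), ("U_brand_B", "U_global")}
p = ScopedClaim("puffy", "dress_X", "U_global", "true")

in_a = restrict(contexts, refines, "U_brand_A", "U_global", p)
in_b = restrict(contexts, refines, "U_brand_B", "U_global", p)
print(in_a.truth, in_b.truth)  # true unknown
```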

transport(e, p, footprint)

Transports a claim along an equivalence, checking that the properties in the footprint survive.

Input:

Equivalence $e$ connecting entities $a$ and $b$
Claim $p$ about $a$
Property footprint: which properties must be preserved

Output:

Success: Claim $p'$ about $b$ with transport certificate
Failure: Reason why the footprint could not be preserved

Per A16, transport without a certificate for the required properties is not transport; it is speculation.
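The footprint check at the heart of transport reduces to a subset test against the certificate. The sketch below assumes a certificate is a mapping from property name to a boolean, represents claims as strings, and rewrites the entity token by simple string replacement; the result types and the rename parameter are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransportSuccess:
    claim: str               # the transported claim p'
    certified: frozenset     # properties the certificate covered


@dataclass(frozen=True)
class TransportFailure:
    reason: str
    requested: frozenset
    available: frozenset


def transport(certificate, claim, footprint, rename):
    """certificate: {property: bool}; rename: (source_entity, target_entity)."""
    available = frozenset(prop for prop, ok in certificate.items() if ok)
    requested = frozenset(footprint)
    # Transport is licensed only if every requested property is certified.
    if not requested <= available:
        return TransportFailure("footprint_not_covered", requested, available)
    source, target = rename
    return TransportSuccess(claim.replace(source, target), requested)


certificate = {"physical_properties": True, "pricing": False}
rename = ("brandA:dress_123", "brandB:dress_9F2")

moved = transport(certificate, "puffy(brandA:dress_123)",
                  {"physical_properties"}, rename)
blocked = transport(certificate, "p_price(brandA:dress_123)", {"pricing"}, rename)

print(moved.claim)      # puffy(brandB:dress_9F2)
print(blocked.reason)   # footprint_not_covered
```

Returning the requested and available sets on failure gives the caller enough to choose between obtaining a broader certificate and narrowing the footprint.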

What the Substrate Is Not

Not a visualization. The context graph is a data model, not a UI pattern. You may visualize it, but that is not its purpose.

Not a knowledge graph (in the common sense). Knowledge graphs as typically understood (Krötzsch 2014) are collections of (entity, relation, entity) triples. The context graph has typed witnesses, scoped claims, computable gluing, and explicit conflict detection. Different species entirely.

Consider what a triple store can represent: "dress_X is puffy." Now consider what it cannot represent: who said it, how it was decided, what scope it holds in, what logic governs its negation, what it costs to keep consistent with other views, and what precisely failed when two sources disagree. The context graph makes all of these first-class, queryable objects with typed verification and computable glue outcomes. That is the difference between a knowledge graph and a Third Mode substrate.

Not a property graph. Property graphs allow arbitrary key-value properties on nodes and edges. The context graph has a fixed schema (five node types, five edge types) with typed payloads. The schema is the discipline. Arbitrary properties are metadata, not structure.

Not an ontology. Ontologies describe domain concepts. The context graph is a substrate for representing any domain's claims, witnesses, and coherence structure. It is domain-agnostic infrastructure.

Example: Fashion Catalog Substrate

Instantiate A22 for the running fashion example.

Contexts:

U_brand_A: Brand A's catalog view (signature includes puffy, logic is measurement-based)
U_brand_B: Brand B's catalog view (signature includes puffy, logic is merchandiser attestation)
U_global: Company-wide catalog (union signature, requires agreement on overlaps)

Claims and witnesses: Brand A claims dress_X is puffy, witnessed by measurement (volume ratio against a reference set). Brand B claims the same dress is puffy, witnessed by merchandiser attestation. Each claim is scoped to its own brand context, and both brand contexts refine U_global.

Glue attempt: The system attempts to compose these local claims into a global claim on U_global. Coverage passes — both brands' contexts jointly cover the item. Overlap is non-empty — dress_X appears in both. The question reduces to agreement: can the measurement witness and the attestation witness be reconciled?

Scenario A: Calibration exists. A calibration witness maps between the two vocabularies — showing, for instance, that scores above a threshold in the measurement scheme correspond to "puffy" in the merchandiser vocabulary. Agreement holds. The system produces a GlobalClaim backed by a GluingWitness that records the method (calibration), the local witnesses, and the calibration reference. The global claim is earned, not asserted.

Scenario B: Calibration missing or fails. No calibration exists, or the mapping shows disagreement — the measurement score falls in a borderline range that the merchandiser vocabulary classifies differently. The system produces an ObstructionWitness: the failing overlap, the nature of the disagreement, the minimal conflicting core, and the resolution options — obtain a calibration, restrict scope so no global claim is made for this item, or fork the predicate into puffy_measured and puffy_attested.

Both outcomes are first-class artifacts. Success produces a claim with provenance. Failure produces a receipt that explains what went wrong and what options exist. Neither is silent.

Scenario C: Transport failure.

Brand A uses a local identifier brandA:dress_123. Brand B uses brandB:dress_9F2. An Equivalence node exists:

Equivalence(
  e_X,
  kind: identity,
  scope: U_global,
  witness: MatchingDossier(SKU_alignment_2026Q1),
  certificate: {physical_properties: yes, pricing: no}
)

A downstream system attempts to transport pricing information along e_X:

transport(e_X, p_price, footprint={pricing})

Result: Failure. The certificate on e_X does not include pricing. The transport operation returns:

TransportFailure(
  reason: footprint_not_covered,
  requested: {pricing},
  available: {physical_properties},
  resolution: "obtain pricing certificate or restrict footprint"
)

Scenario D: Contradiction witness.

Brand A asserts sustainable(dress_X) with a supplier certificate. Brand B asserts ¬sustainable(dress_X) based on an audit finding. The substrate computes the inconsistency and stores it explicitly:

Witness(δ, class=decidable, method=logical_contradiction)
contradicts(δ, p_sustainable_A, p_not_sustainable_B)

The contradiction is not inferred silently at query time. It is computed, witnessed, and stored. Downstream queries can ask: "what contradictions exist for dress_X?" and receive a structured answer with provenance on both sides.

Implementation Considerations

Storage options:

Relational: contexts, claims, witnesses, constraints, equivalences as tables; edges as junction tables with foreign keys
Graph database: natural fit for traversal, but must enforce schema discipline (no arbitrary edge types)
Hybrid: claims in relational for query performance, edges in graph for traversal

Indexes to maintain:

By context: all claims scoped to U
By claim: all witnesses supporting p
By entity: all claims about entity x across contexts
By overlap: all claims that cross U ∩ V (computed from refines + entity incidence)

Cost tracking:

Operations record estimated cost before execution and actual cost after
Budget enforcement gates glue and transport operations
Over-budget operations return early with budget_exceeded reason
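Budget gating of this kind might look like the following. The Budget class, its abstract cost units, and the convention that an operation returns a (result, actual_cost) pair are assumptions of this sketch; it gates on the estimate before execution and does not interrupt an operation that is already running.

```python
class Budget:
    """Gates operations on a spending limit and logs estimated vs. actual cost."""

    def __init__(self, limit):
        self.limit = limit
        self.spent = 0.0
        self.log = []   # (operation name, estimated cost, actual cost or None)

    def run(self, name, estimated, operation):
        # Return early if the estimate would push spending past the limit.
        if self.spent + estimated > self.limit:
            self.log.append((name, estimated, None))
            return {"status": "budget_exceeded", "operation": name}
        result, actual = operation()   # operation reports its own actual cost
        self.spent += actual
        self.log.append((name, estimated, actual))
        return {"status": "ok", "result": result, "actual_cost": actual}


budget = Budget(limit=10)
first = budget.run("glue", 6, lambda: ("GlobalClaim", 7))
second = budget.run("transport", 5, lambda: ("p_prime", 4))

print(first["status"])    # ok
print(second["status"])   # budget_exceeded
print(budget.log)         # [('glue', 6, 7), ('transport', 5, None)]
```

Logging the estimate next to the actual cost makes estimation error itself queryable, which is useful when tuning the cost model.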

Versioning:

Claims carry timestamps
Witnesses may expire (attested witnesses have authority validity periods)
Predicate versions per A17b are tracked via claim metadata
Migration witnesses link old and new versions per A16

Consequence

The substrate is specified. It is minimal: five node types, five edge types, three operations. It is typed: witnesses have classes, constraints have kinds, contexts have logics. It is operational: gluing is a computation that succeeds or fails with receipts.

Everything in Part V builds on this substrate. Chapter 21 asks how identity emerges from witness networks rather than brittle keys. Chapter 22 asks what a predicate must carry to be accepted into the substrate. Chapter 23 asks what a query promises when it runs against this structure. Chapter 24 asks how the system survives time as predicates evolve and contexts change.

The substrate is the foundation. The architecture follows.