Context-Graph Substrate: The computational backbone

A22 · Prose proof · The data structure that implements the sheaf condition computationally.

How beautiful the world would be if there were a procedure for moving through labyrinths.

Umberto Eco, The Name of the Rose (1980)

This chapter formalizes the context-graph substrate as Anchor A22: a typed data model with five node types (Context, Claim, Witness, Constraint, Equivalence), five edge types, and three operations (glue, restrict, transport) that together constitute the minimal storage model required by the Third Mode. A22 reifies the machinery developed in Parts II through IV into a concrete specification that builders can implement. The reader seeking the narrative motivation for this substrate should consult Vol I, Chapter 7 (The Witness Protocol), where the data-modeling problem is introduced through worked examples rather than formal schema.

The Buzzword Problem

"Context graph" has become an overloaded phrase. Every vendor claims one. It appears in slide decks about knowledge management, entity resolution, recommendation engines, and fraud detection. It has come to mean "a graph that is somehow relevant to context," which is to say, nothing precise at all. We will use it in a specific sense.

This chapter gives the term content by specifying exactly what a substrate for the Third Mode must represent. The question is not "what is a context graph?" but "what is the minimal data model that supports governed predicate invention, sheaf-style gluing, and cost-aware coherence?"

The answer is a typed data model with five node types, five edge types, and three operations. Anything smaller loses a capability the Third Mode requires. Anything larger adds complexity without structural necessity.

What the Substrate Must Represent

Part IV defined operations: predicate invention (A17), invariant checking (A17b, A18), proposal and certification (A19, A19b), search (A20), and cost accounting (A21). These are formal specifications. A builder asks: what do I actually store?

The answer cannot be:

  • Just facts (loses locality)
  • Just graphs (loses witnesses and constraints)
  • Just triples (loses typing and logic)
  • Knowledge graphs as typically understood (no gluing, no typed witnesses, no cost model)

Each capability from Parts II–IV implies a storage requirement:

| Capability | Storage requirement |
|---|---|
| Local truth | Claims scoped to contexts |
| Typed verification | Witnesses with class labels |
| Site structure | Contexts with refinement morphisms |
| Transport | Equivalences with certificates |
| Conflict detection | Contradictions with proofs |
| Cost tracking | Operations annotated with budgets |

The substrate is the reification of this machinery.

Anchor A22: Context-Graph Substrate

NODES (five types):

  • Context(U): A view with signature $\Sigma_U$, invariants $I_U$, logic $L_U$, absence policy $P_U$ per A12
  • Claim(p): A proposition with content, status (asserted | retracted | pending), and timestamp
  • Witness($\pi$): Typed per A2c as decidable | probabilistic | attested
  • Constraint(c): Typed per A18 as integrity | semantic | computational | authority; enforced at assertion-time and glue-time. (We use $c$ to avoid collision with $I_U$, the invariant set of a context.)
  • Equivalence(e): Witnessed sameness per A10 with kind $K$, scope $S$, transport certificate per A16

EDGES (five types):

  • supports($\pi$, p): Witness $\pi$ supports claim $p$
  • scoped_to(p, U): Claim $p$ is asserted in context $U$
  • refines(U, V): Context $U$ refines $V$. We write $U \to V$ when U refines V (U is finer than V; claims in V can be restricted to U).
  • transports(e, p, p′): Equivalence $e$ transports claim $p$ to $p'$
  • contradicts($\delta$, p, q): Proof witness $\delta$ establishes that $p$ and $q$ are inconsistent

Note on hyperedges: transports and contradicts are ternary relations (hyperedges). In storage, they are reified as junction tables or edge-nodes linking three objects.

Note on entities and overlaps: Entities (the objects claims are about) are opaque tokens inside claim content. The substrate maintains an entity incidence index: for each entity term $x$, record which claims mention $x$ and in which contexts. Incidence yields candidate overlaps: two contexts potentially overlap on $x$ iff both contain claims mentioning $x$. Whether those mentions co-refer is determined by Equivalence witnesses (A10) and becomes binding only after identity maintenance (Chapter 21). The index is not a node type because entity tokens are supplied by the host system; the Third Mode supplies equivalence, transport, and identity maintenance over those tokens.
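The incidence index described in this note can be sketched in a few lines of Python. This is an illustration under assumptions, not part of A22: the class name EntityIncidenceIndex and its methods are hypothetical, and entity tokens, claim ids, and context ids are modeled as plain strings.

```python
from collections import defaultdict
from itertools import combinations


class EntityIncidenceIndex:
    """Hypothetical sketch: entity token -> set of (claim_id, context_id) mentions."""

    def __init__(self):
        self._mentions = defaultdict(set)

    def record(self, entity, claim_id, context_id):
        # Register that `claim_id`, scoped to `context_id`, mentions `entity`.
        self._mentions[entity].add((claim_id, context_id))

    def contexts_mentioning(self, entity):
        return {ctx for _, ctx in self._mentions.get(entity, ())}

    def candidate_overlaps(self, entity):
        # Unordered pairs of contexts that both mention `entity`.
        # These are candidates only: whether the mentions co-refer is
        # decided elsewhere, by Equivalence witnesses.
        return list(combinations(sorted(self.contexts_mentioning(entity)), 2))


index = EntityIncidenceIndex()
index.record("dress_X", "p1", "U_brand_A")
index.record("dress_X", "p2", "U_brand_B")
index.record("dress_Y", "p3", "U_brand_A")
overlaps = index.candidate_overlaps("dress_X")
```

Here `overlaps` holds the single pair of brand contexts, while `dress_Y` yields no candidate overlap because only one context mentions it.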

OPERATIONS:

  • glue(cover, target): Attempts sheaf condition. Returns GlobalClaim($p$, $U$, $\gamma$) where $\gamma$ is a GluingWitness, or ObstructionWitness(failing_overlaps, minimal_unsat_core).
  • restrict($U \to V$, p): Restricts claim from coarser to finer context (functorial action).
  • transport(e, p, footprint): Transports claim along equivalence, checking property footprint per A16.

Why Contexts Are Nodes

In most graph databases, context is ambient or encoded as edge metadata. In the context graph, contexts are first-class nodes because you need to store multiple logics, signatures, and invariants simultaneously and move claims between them via explicit morphisms.

A claim in the FDA regulatory view and a claim in the EU regulatory view are not the same claim with different labels. They are claims in different contexts with different proof logics and different absence policies. The substrate must represent this difference as structure, not annotation.

When you transport a claim from one context to another, you are traversing a morphism in the site. That morphism has a source, a target, and constraints on what properties survive the transport. Without contexts as nodes, this structure has nowhere to live.

The Five Node Types

Context (U)

A context is a view with typed payload:

  • Signature $\Sigma_U$: the vocabulary available in this context
  • Invariants $I_U$: the constraints enforced in this context
  • Logic $L_U$: the proof rules valid in this context
  • Absence policy $P_U$: how this context treats missing data (CWA, OWA, or explicit unknown)

Contexts form a site per A12. The refinement relation U refines V means U is more specific: claims valid in V can be restricted to U, but not necessarily vice versa.

Claim (p)

A claim is a proposition with metadata:

  • Content: the assertion itself (including the entity terms it mentions)
  • Status: asserted (active), retracted (withdrawn), or pending (under review)
  • Timestamp: when the claim was made or last modified

Claims have no intrinsic "scope" field. Scoping is purely structural: every claim is linked to at least one context via scoped_to edges. A claim may be asserted in multiple contexts with different witnesses in each. The entity incidence index tracks which entities each claim mentions.

Witness (π)

A witness is evidence that supports a claim, typed per A2c:

  • Decidable: verification terminates with ok | fail
  • Probabilistic: verification terminates with (ok, p, bounds) | (fail, p)
  • Attested: verification checks provenance, returns ok_if_trusted(authority) | fail

A claim without a witness has no standing. This is the discipline that separates the Third Mode from systems that store assertions without evidence.
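One way to picture the three witness classes and the "no witness, no standing" rule is the sketch below. It is an assumption-laden illustration: the payload keys (check_passed, p, bounds, authority), the acceptance threshold for probabilistic witnesses, and the function names are hypothetical and are not specified by A2c.

```python
from dataclasses import dataclass
from enum import Enum


class WitnessClass(Enum):
    DECIDABLE = "decidable"
    PROBABILISTIC = "probabilistic"
    ATTESTED = "attested"


@dataclass(frozen=True)
class Witness:
    witness_id: str
    witness_class: WitnessClass
    payload: dict  # hypothetical layout, see the lead-in


def verify(witness, trusted_authorities=frozenset(), threshold=0.9):
    """Return a tuple shaped like the outcome listed for each witness class."""
    if witness.witness_class is WitnessClass.DECIDABLE:
        return ("ok",) if witness.payload.get("check_passed") else ("fail",)
    if witness.witness_class is WitnessClass.PROBABILISTIC:
        p = witness.payload["p"]
        if p >= threshold:  # the threshold is an assumption of this sketch
            return ("ok", p, witness.payload["bounds"])
        return ("fail", p)
    # Attested: only provenance is checked, against a set of trusted authorities.
    authority = witness.payload["authority"]
    if authority in trusted_authorities:
        return ("ok_if_trusted", authority)
    return ("fail",)


def has_standing(claim_id, supports_edges):
    """A claim has standing only if at least one supports(witness, claim) edge exists."""
    return any(cid == claim_id for _, cid in supports_edges)


measured = Witness("pi_1", WitnessClass.DECIDABLE, {"check_passed": True})
scored = Witness("pi_2", WitnessClass.PROBABILISTIC, {"p": 0.95, "bounds": (0.9, 0.99)})
attested = Witness("pi_3", WitnessClass.ATTESTED, {"authority": "merchandiser_7"})
supports_edges = [("pi_1", "p_puffy_A"), ("pi_3", "p_puffy_B")]
```

Keeping the class on the witness rather than on the supports edge mirrors the statement that typing lives on the witness node.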

Constraint (c)

A constraint is a rule that claims must satisfy, typed per A18:

  • Integrity: structural validity (types, cardinalities, referential integrity)
  • Semantic: domain-level rules (a dress cannot be both minimalist and maximalist)
  • Computational: resource bounds (query must complete in 100ms)
  • Authority: who is allowed to assert what (only certified labs can attest purity)

Constraints are scoped: different contexts may enforce different constraints on the same predicate.
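A small sketch of context-scoped constraints checked at assertion time follows. The Constraint record, the holds callable, and the claim layout are hypothetical choices for illustration; A18 itself is not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Constraint:
    constraint_id: str
    kind: str                      # integrity | semantic | computational | authority
    context: str                   # the context that enforces this rule
    holds: Callable[[dict], bool]  # True when the claim content satisfies the rule


def violations(claim_content, context, constraints):
    """Ids of the constraints scoped to `context` that the claim content violates."""
    return [
        c.constraint_id
        for c in constraints
        if c.context == context and not c.holds(claim_content)
    ]


constraints = [
    # Semantic rule: a dress cannot be both minimalist and maximalist.
    Constraint(
        "c_style", "semantic", "U_brand_A",
        lambda claim: not {"minimalist", "maximalist"} <= set(claim["styles"]),
    ),
    # Authority rule: only certified labs may attest purity.
    Constraint(
        "c_purity", "authority", "U_brand_B",
        lambda claim: claim["predicate"] != "purity"
        or claim["asserted_by"] in {"certified_lab_1"},
    ),
]

claim = {
    "predicate": "style",
    "styles": ["minimalist", "maximalist"],
    "asserted_by": "merchandiser_7",
}
in_brand_a = violations(claim, "U_brand_A", constraints)
in_brand_b = violations(claim, "U_brand_B", constraints)
```

The same claim content is rejected in one context and accepted in the other, which is what scoping of constraints amounts to in practice.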

Equivalence (e)

An equivalence is a witnessed sameness per A10:

  • Kind $K$: what type of sameness (identity, isomorphism, approximation)
  • Scope $S$: where the sameness holds
  • Transport certificate: what properties can be moved along this equivalence per A16

Equivalences are themselves witnessed claims about identity across contexts. The same discipline applies: an equivalence without a witness has no standing. Without this node type, the "e" in transport operations has no home. Equivalences are not implicit; they are first-class objects that license specific operations.

The Five Edge Types

supports(π, p)

Links a witness to the claim it supports. The edge carries no information beyond the linkage; typing is on the witness node.

scoped_to(p, U)

Links a claim to a context where it is asserted. A claim may be scoped to multiple contexts (with different witnesses in each). Note: scoped_to is a structural relation; restrict is an operation that produces a restricted claim.

refines(U, V)

Encodes site structure. U refines V means U is a more specific view. Claims in V can be restricted to U. This edge is the morphism in the site category.

transports(e, p, p′)

Links an equivalence to the claims it connects. The edge records which properties were checked (the footprint). Per A16, transport requires a certificate, which is stored on the Equivalence node.

contradicts(δ, p, q)

Links a proof witness δ to two claims p and q that are inconsistent. The same discipline applies: a contradiction without a proof witness has no standing. Conflicts are explicit, computed, and stored.

The Three Operations

glue(cover, target)

The central operation. Takes a cover of local claims and attempts to produce a global claim in the target context.

Input:

  • A cover {U_i → U}: contexts that jointly describe the target
  • Claims $p_i$ in each $U_i$ with witnesses $\pi_i$

Procedure:

  1. Check coverage: do the $U_i$ cover $U$?
  2. Compute overlaps: derive $U_i \cap U_j$ from shared entity incidence
  3. Check overlap agreement: for all overlaps, do $p_i|_{\text{overlap}}$ and $p_j|_{\text{overlap}}$ agree?
  4. If yes: produce GlobalClaim($p$, $U$, $\gamma$) where $\gamma$ is a GluingWitness
  5. If no: produce ObstructionWitness(failing_overlaps, minimal_unsat_core)

Output:

  • Success: GlobalClaim with a GluingWitness (amalgamation certificate). The witness is typed as decidable per A2c. Decidable here refers to the procedure's outcome (it terminates with success/failure), not the claim's truth status in every logic.
  • Failure: ObstructionWitness with the overlaps that failed and a minimal set of claims that cannot be reconciled

Cost: Charged per A21 model. Scales with number of overlaps and witness verification cost.
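The glue procedure can be sketched as follows, under strong simplifying assumptions: each local claim is modeled as a mapping from entity token to a value, overlaps are the shared entity tokens, agreement is plain equality, and the reported core is simply the first disagreeing pair (a pair of claims that disagree is trivially a minimal irreconcilable set). Witness verification and A21 cost accounting are omitted, and raising on a coverage failure is a choice of this sketch, not something the specification states.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class GlobalClaim:
    content: dict          # entity -> value on the target context
    context: str
    gluing_witness: tuple  # names of the local contexts that were amalgamated


@dataclass(frozen=True)
class ObstructionWitness:
    failing_overlaps: tuple    # (context_i, context_j, entity) triples
    minimal_unsat_core: tuple  # one pair of local claims that cannot be reconciled


def glue(cover, target_name, target_entities):
    """`cover` maps a context name to its local claim, modeled as {entity: value}."""
    # Step 1: coverage. The cover must mention every entity of the target.
    covered = set().union(*(local.keys() for local in cover.values()))
    if not set(target_entities) <= covered:
        raise ValueError("cover does not cover the target")
    # Steps 2 and 3: overlaps from shared entities, then agreement on each overlap.
    failing = []
    for (ci, li), (cj, lj) in combinations(sorted(cover.items()), 2):
        for entity in sorted(li.keys() & lj.keys()):
            if li[entity] != lj[entity]:
                failing.append((ci, cj, entity))
    # Step 5: disagreement yields an obstruction, never a silent failure.
    if failing:
        ci, cj, entity = failing[0]
        core = ((ci, entity, cover[ci][entity]), (cj, entity, cover[cj][entity]))
        return ObstructionWitness(tuple(failing), core)
    # Step 4: agreement everywhere, so the local claims amalgamate.
    merged = {}
    for local in cover.values():
        merged.update({e: v for e, v in local.items() if e in target_entities})
    return GlobalClaim(merged, target_name, tuple(sorted(cover)))


agreeing = glue(
    {"U_brand_A": {"dress_X": "puffy"}, "U_brand_B": {"dress_X": "puffy"}},
    "U_global", {"dress_X"},
)
conflicting = glue(
    {"U_brand_A": {"dress_X": "puffy"}, "U_brand_B": {"dress_X": "not_puffy"}},
    "U_global", {"dress_X"},
)
```

Both return values are ordinary objects that can be stored and queried, which is the point of treating success and failure as first-class artifacts.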

restrict(U → V, p)

Restricts a claim from a coarser context to a finer one. Under the convention for refines (U → V means U refines V), the claim lives in V and is restricted to U. This is a functorial action: it always succeeds (you can always ask "what does p mean in U?"), but the status may change. A claim that is true in V may be unknown in U if U uses a different logic or has stricter invariants.
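A minimal sketch of restrict, following the convention stated for refines (the finer context refines the coarser one, and claims move from coarser to finer). The standing labels "holds" and "unknown", the modeling of invariants as Python callables, and the argument order are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    name: str
    invariants: tuple = ()  # callables: claim content -> bool


@dataclass(frozen=True)
class ScopedClaim:
    content: dict
    context: str
    standing: str  # "holds" or "unknown" (hypothetical labels)


def restrict(claim, finer, coarser, refines):
    """Restrict `claim` from `coarser` to `finer` along a stored refines edge."""
    if (finer.name, coarser.name) not in refines:
        raise ValueError(f"{finer.name} does not refine {coarser.name}")
    if claim.context != coarser.name:
        raise ValueError("claim is not scoped to the coarser context")
    # Restriction always yields a claim in the finer context; only its standing
    # can degrade when the finer context enforces stricter invariants.
    survives = claim.standing == "holds" and all(
        invariant(claim.content) for invariant in finer.invariants
    )
    return ScopedClaim(claim.content, finer.name, "holds" if survives else "unknown")


def requires_measurement(content):
    # Hypothetical invariant of a measurement-based context.
    return content.get("volume_ratio") is not None


u_global = Context("U_global")
u_brand_a = Context("U_brand_A", (requires_measurement,))
u_brand_b = Context("U_brand_B")
refines = {("U_brand_A", "U_global"), ("U_brand_B", "U_global")}

p = ScopedClaim({"entity": "dress_X", "predicate": "puffy"}, "U_global", "holds")
in_a = restrict(p, u_brand_a, u_global, refines)
in_b = restrict(p, u_brand_b, u_global, refines)
```

The operation never fails for a valid refinement edge; what changes is the standing of the restricted claim.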

transport(e, p, footprint)

Transports a claim along an equivalence, checking that the properties in the footprint survive.

Input:

  • Equivalence $e$ connecting entities $a$ and $b$
  • Claim $p$ about $a$
  • Property footprint: which properties must be preserved

Output:

  • Success: Claim $p'$ about $b$ with transport certificate
  • Failure: Reason why the footprint could not be preserved

Per A16, transport without a certificate for the required properties is not transport; it is speculation.

What the Substrate Is Not

Not a visualization. The context graph is a data model, not a UI pattern. You may visualize it, but that is not its purpose.

Not a knowledge graph (in the common sense). Knowledge graphs as typically understood (Vrandečić and Krötzsch 2014) are (entity, relation, entity) triples. The context graph has typed witnesses, scoped claims, computable gluing, and explicit conflict detection. Different species entirely. [Denny Vrandečić and Markus Krötzsch, "Wikidata: A Free Collaborative Knowledgebase," Communications of the ACM 57, no. 10 (2014): 78–85.]

Consider what a triple store can represent: "dress_X is puffy." Now consider what it cannot represent: who said it, how it was decided, what scope it holds in, what logic governs its negation, what it costs to keep consistent with other views, and what precisely failed when two sources disagree. The context graph makes all of these first-class, queryable objects with typed verification and computable glue outcomes. That is the difference between a knowledge graph and a Third Mode substrate.

Not a property graph. Property graphs allow arbitrary key-value properties on nodes and edges. The context graph has a fixed schema (five node types, five edge types) with typed payloads. The schema is the discipline. Arbitrary properties are metadata, not structure.

Not an ontology. Ontologies describe domain concepts. The context graph is a substrate for representing any domain's claims, witnesses, and coherence structure. It is domain-agnostic infrastructure.

Example: Fashion Catalog Substrate

Instantiate A22 for the running fashion example.

Contexts:

  • U_brand_A: Brand A's catalog view (signature includes puffy, logic is measurement-based)
  • U_brand_B: Brand B's catalog view (signature includes puffy, logic is merchandiser attestation)
  • U_global: Company-wide catalog (union signature, requires agreement on overlaps)

Claims and witnesses: Brand A claims dress_X is puffy, witnessed by measurement (volume ratio against a reference set). Brand B claims the same dress is puffy, witnessed by merchandiser attestation. Each claim is scoped to its own brand context, and both brand contexts refine the global view.

Glue attempt: The system attempts to compose these local claims into a global claim on U_global. Coverage passes — both brands' contexts jointly cover the item. Overlap is non-empty — dress_X appears in both. The question reduces to agreement: can the measurement witness and the attestation witness be reconciled?

Scenario A: Calibration exists. A calibration witness maps between the two vocabularies — showing, for instance, that scores above a threshold in the measurement scheme correspond to "puffy" in the merchandiser vocabulary. Agreement holds. The system produces a GlobalClaim backed by a GluingWitness that records the method (calibration), the local witnesses, and the calibration reference. The global claim is earned, not asserted.

Scenario B: Calibration missing or fails. No calibration exists, or the mapping shows disagreement — the measurement score falls in a borderline range that the merchandiser vocabulary classifies differently. The system produces an ObstructionWitness: the failing overlap, the nature of the disagreement, the minimal conflicting core, and the resolution options — obtain a calibration, restrict scope so no global claim is made for this item, or fork the predicate into puffy_measured and puffy_attested.

Both outcomes are first-class artifacts. Success produces a claim with provenance. Failure produces a receipt that explains what went wrong and what options exist. Neither is silent.

Scenario C: Transport failure.

Brand A uses a local identifier brandA:dress_123. Brand B uses brandB:dress_9F2. An Equivalence node exists:

Equivalence(
  e_X,
  kind: identity,
  scope: U_global,
  witness: MatchingDossier(SKU_alignment_2026Q1),
  certificate: {physical_properties: yes, pricing: no}
)

A downstream system attempts to transport pricing information along e_X:

transport(e_X, p_price, footprint={pricing})

Result: Failure. The certificate on e_X does not include pricing. The transport operation returns:

TransportFailure(
  reason: footprint_not_covered,
  requested: {pricing},
  available: {physical_properties},
  resolution: "obtain pricing certificate or restrict footprint"
)
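Scenario C can be replayed with a short sketch. The assumptions are labeled: the certificate is modeled as the set of properties marked "yes", claims are plain dictionaries, and the class and field names mirror the notation above rather than any fixed API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Equivalence:
    name: str
    kind: str
    scope: str
    witness: str
    certificate: frozenset  # properties certified as transportable


@dataclass(frozen=True)
class TransportFailure:
    reason: str
    requested: frozenset
    available: frozenset
    resolution: str


@dataclass(frozen=True)
class TransportedClaim:
    content: dict
    certificate: frozenset  # the part of the certificate this transport relied on


def transport(e, claim, footprint, target_entity):
    """Move `claim` along `e` only if the certificate covers the whole footprint."""
    footprint = frozenset(footprint)
    missing = footprint - e.certificate
    if missing:
        return TransportFailure(
            reason="footprint_not_covered",
            requested=footprint,
            available=e.certificate,
            resolution=f"obtain {', '.join(sorted(missing))} certificate or restrict footprint",
        )
    return TransportedClaim({**claim, "entity": target_entity}, footprint)


e_X = Equivalence(
    "e_X", "identity", "U_global",
    "MatchingDossier(SKU_alignment_2026Q1)",
    frozenset({"physical_properties"}),
)
p_price = {"entity": "brandA:dress_123", "predicate": "price"}
p_puffy = {"entity": "brandA:dress_123", "predicate": "puffy"}

failed = transport(e_X, p_price, {"pricing"}, "brandB:dress_9F2")
moved = transport(e_X, p_puffy, {"physical_properties"}, "brandB:dress_9F2")
```

The pricing request fails with the same fields as the TransportFailure above, while a claim whose footprint stays within physical_properties is moved to the Brand B identifier.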

Scenario D: Contradiction witness.

Brand A asserts sustainable(dress_X) with a supplier certificate. Brand B asserts ¬sustainable(dress_X) based on an audit finding. The substrate computes the inconsistency and stores it explicitly:

Witness(δ, class=decidable, method=logical_contradiction)
contradicts(δ, p_sustainable_A, p_not_sustainable_B)

The contradiction is not inferred silently at query time. It is computed, witnessed, and stored. Downstream queries can ask: "what contradictions exist for dress_X?" and receive a structured answer with provenance on both sides.
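A sketch of how such contradictions might be computed once and then queried. It assumes claims carry an explicit predicate, entity, and polarity, and that both claims already use the same entity token; in general, cross-context identity would first go through an Equivalence. All names are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Claim:
    claim_id: str
    context: str
    predicate: str
    entity: str
    positive: bool  # False encodes the negated predicate


def find_contradictions(claims):
    """Return contradicts(delta, p, q) records for opposite-polarity claim pairs."""
    edges = []
    for a, b in combinations(claims, 2):
        same_subject = a.predicate == b.predicate and a.entity == b.entity
        if same_subject and a.positive != b.positive:
            delta = {
                "id": f"delta_{a.claim_id}_{b.claim_id}",
                "class": "decidable",
                "method": "logical_contradiction",
            }
            edges.append((delta, a.claim_id, b.claim_id))
    return edges


def contradictions_for(entity, claims, edges):
    """Answer 'what contradictions exist for this entity?' from stored edges."""
    by_id = {c.claim_id: c for c in claims}
    return [
        (delta["id"], by_id[p].context, by_id[q].context)
        for delta, p, q in edges
        if by_id[p].entity == entity
    ]


claims = [
    Claim("p_sustainable_A", "U_brand_A", "sustainable", "dress_X", True),
    Claim("p_not_sustainable_B", "U_brand_B", "sustainable", "dress_X", False),
    Claim("p_puffy_A", "U_brand_A", "puffy", "dress_X", True),
]
stored_edges = find_contradictions(claims)
answer = contradictions_for("dress_X", claims, stored_edges)
```

The query reads only stored edges, so the answer carries the proof witness id and the contexts on both sides without recomputing anything at query time.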

Implementation Considerations

Storage options:

  • Relational: contexts, claims, witnesses, constraints, equivalences as tables; edges as junction tables with foreign keys
  • Graph database: natural fit for traversal, but must enforce schema discipline (no arbitrary edge types)
  • Hybrid: claims in relational for query performance, edges in graph for traversal
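For the relational option, one possible layout is sketched below using Python's built-in sqlite3 module. Table and column names are hypothetical (the constraint table is called constraint_rule because CONSTRAINT is an SQL keyword), payloads are stored as opaque text, and the inserted rows are illustrative only.

```python
import sqlite3

SCHEMA = """
CREATE TABLE context (id TEXT PRIMARY KEY, signature TEXT, invariants TEXT,
                      logic TEXT, absence_policy TEXT);
CREATE TABLE claim (id TEXT PRIMARY KEY, content TEXT NOT NULL,
    status TEXT NOT NULL CHECK (status IN ('asserted', 'retracted', 'pending')),
    ts TEXT NOT NULL);
CREATE TABLE witness (id TEXT PRIMARY KEY,
    witness_class TEXT NOT NULL
        CHECK (witness_class IN ('decidable', 'probabilistic', 'attested')));
CREATE TABLE constraint_rule (id TEXT PRIMARY KEY,
    kind TEXT NOT NULL
        CHECK (kind IN ('integrity', 'semantic', 'computational', 'authority')));
CREATE TABLE equivalence (id TEXT PRIMARY KEY, kind TEXT NOT NULL,
    scope TEXT NOT NULL REFERENCES context(id), certificate TEXT NOT NULL);

-- Binary edges as junction tables.
CREATE TABLE supports (witness_id TEXT NOT NULL REFERENCES witness(id),
    claim_id TEXT NOT NULL REFERENCES claim(id),
    PRIMARY KEY (witness_id, claim_id));
CREATE TABLE scoped_to (claim_id TEXT NOT NULL REFERENCES claim(id),
    context_id TEXT NOT NULL REFERENCES context(id),
    PRIMARY KEY (claim_id, context_id));
CREATE TABLE refines (finer_id TEXT NOT NULL REFERENCES context(id),
    coarser_id TEXT NOT NULL REFERENCES context(id),
    PRIMARY KEY (finer_id, coarser_id));

-- Ternary relations (hyperedges) reified as three-column junction tables.
CREATE TABLE transports (equivalence_id TEXT NOT NULL REFERENCES equivalence(id),
    source_claim_id TEXT NOT NULL REFERENCES claim(id),
    target_claim_id TEXT NOT NULL REFERENCES claim(id),
    footprint TEXT NOT NULL);
CREATE TABLE contradicts (proof_witness_id TEXT NOT NULL REFERENCES witness(id),
    claim_a_id TEXT NOT NULL REFERENCES claim(id),
    claim_b_id TEXT NOT NULL REFERENCES claim(id));
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(SCHEMA)

# Illustrative rows: two claims and a witnessed contradiction between them.
conn.execute("INSERT INTO claim VALUES ('p_A', 'sustainable(dress_X)', 'asserted', '2026-01-01')")
conn.execute("INSERT INTO claim VALUES ('p_B', 'not sustainable(dress_X)', 'asserted', '2026-01-02')")
conn.execute("INSERT INTO witness VALUES ('delta', 'decidable')")
conn.execute("INSERT INTO contradicts VALUES ('delta', 'p_A', 'p_B')")

stored = conn.execute(
    "SELECT proof_witness_id, claim_a_id, claim_b_id FROM contradicts"
).fetchall()
```

With foreign keys enabled, a contradicts row that names a nonexistent proof witness is rejected by the database, which gives the "no witness, no standing" discipline a storage-level backstop. The CHECK clauses play the same role for the closed vocabularies of status, witness class, and constraint kind.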

Indexes to maintain:

  • By context: all claims scoped to U
  • By claim: all witnesses supporting p
  • By entity: all claims about entity x across contexts
  • By overlap: all claims that cross U ∩ V (computed from refines + entity incidence)

Cost tracking:

  • Operations record estimated cost before execution and actual cost after
  • Budget enforcement gates glue and transport operations
  • Over-budget operations return early with budget_exceeded reason
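The cost-tracking bullets can be illustrated with a small budget gate. This sketch assumes abstract cost units, gates on the estimate before execution, and expects the wrapped operation to report its own actual cost; none of these details come from A21, and the names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CostRecord:
    operation: str
    estimated: float
    actual: Optional[float] = None
    outcome: str = "pending"


class Budget:
    """Hypothetical gate: refuse to run an operation whose estimate exceeds what remains."""

    def __init__(self, limit):
        self.remaining = limit
        self.ledger = []

    def run(self, operation, estimated, fn):
        # Record the estimate before execution.
        record = CostRecord(operation, estimated)
        self.ledger.append(record)
        if estimated > self.remaining:
            # Return early instead of starting work that cannot be paid for.
            record.outcome = "budget_exceeded"
            return None, record
        result, actual = fn()  # fn returns (result, actual_cost)
        # Record the actual cost after execution and charge it.
        record.actual = actual
        record.outcome = "done"
        self.remaining -= actual
        return result, record


budget = Budget(limit=10.0)
glued, glue_record = budget.run("glue", estimated=4.0, fn=lambda: ("global_claim", 3.5))
moved, transport_record = budget.run("transport", estimated=8.0, fn=lambda: ("claim", 7.0))
```

After the first call 6.5 units remain, so the second operation, estimated at 8.0, is refused and recorded with a budget_exceeded outcome.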

Versioning:

  • Claims carry timestamps
  • Witnesses may expire (attested witnesses have authority validity periods)
  • Predicate versions per A17b are tracked via claim metadata
  • Migration witnesses link old and new versions per A16
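Witness expiry can be handled with a validity check at read time. The sketch below assumes each witness record carries an optional valid_until timestamp; the field name and the dates are hypothetical.

```python
from datetime import datetime, timezone


def live_witnesses(witnesses, now):
    """Ids of witnesses that have no expiry or have not yet expired at `now`."""
    return [
        w["id"]
        for w in witnesses
        if w["valid_until"] is None or now < w["valid_until"]
    ]


witnesses = [
    # A measurement witness with no validity period.
    {"id": "pi_measure", "class": "decidable", "valid_until": None},
    # An attested witness bounded by the authority's validity period.
    {
        "id": "pi_attest",
        "class": "attested",
        "valid_until": datetime(2026, 3, 31, tzinfo=timezone.utc),
    },
]

before_expiry = live_witnesses(witnesses, datetime(2026, 1, 15, tzinfo=timezone.utc))
after_expiry = live_witnesses(witnesses, datetime(2026, 6, 1, tzinfo=timezone.utc))
```

A claim supported only by the attested witness would lose its standing once the validity period ends, while the measured claim is unaffected.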

Consequence

The substrate is specified. It is minimal: five node types, five edge types, three operations. It is typed: witnesses have classes, constraints have kinds, contexts have logics. It is operational: gluing is a computation that succeeds or fails with receipts.

Everything in Part V builds on this substrate. Chapter 21 asks how identity emerges from witness networks rather than brittle keys. Chapter 22 asks what a predicate must carry to be accepted into the substrate. Chapter 23 asks what a query promises when it runs against this structure. Chapter 24 asks how the system survives time as predicates evolve and contexts change.

The substrate is the foundation. The architecture follows.