The Coherence Topos and Vocabulary Evolution

Appendix L


This appendix addresses a specific problem: how autonomous computational agents might invent new concepts, certify them against existing commitments, transport them across institutional boundaries, and account for the cost — within a mathematically rigorous structure.

Existing approaches address fragments of this problem. Retrieval-augmented generation retrieves without coherence guarantees. Multi-agent frameworks compose outputs without gluing conditions. Knowledge graphs structure without vocabulary invention. Schema systems enforce without evolution. The composition of these fragments under formal guarantees remains open.

Parts I–VI of The Proofs assembled the components: commitment sets (A1), witnessed equivalence (A10), context sites (A12b), the sheaf condition (A13), fibrations (A14), transport discipline (A16), predicate invention (A17), conservative extension (A17b), and the coherence cost model (A21). This appendix states the structural consequence that these components jointly entail, develops several results that are (to our knowledge) novel, positions the work explicitly against the existing landscape, and identifies concrete research programs for the mathematical community.

The topos theorem (L.2) is a consequence, not a contribution — it follows from Giraud's theorem applied to the context site. We state it because its corollaries are the contribution: the internal logic subsumes the logic-selection machinery of A15, the subobject classifier provides the multi-valued truth that A4 reached for, and the monadic structure of predicate invention (L.6) gives a formal theory of vocabulary evolution that has not, to our knowledge, been developed elsewhere.

L.1 What This Appendix Claims

We distinguish three levels of novelty:

Standard results applied to a new domain (L.2, L.3): The coherence topos theorem and its internal-logic corollary are instances of known mathematics (Giraud, Mac Lane–Moerdijk). We claim only that the instantiation is well-formed and that the corollaries are operationally significant for distributed systems.

Novel results (L.4–L.6): The Obstruction Cohomology computation, the acyclicity theorem for hierarchical sites, the $H^2$ meta-obstruction for federated sites, the non-composability of overlap agreement, and the characterization of the predicate invention monad as a quotient of a free monad are (to our knowledge) new. They are provable within standard sheaf theory and model theory but have not appeared in the literature because the combination (sheaf-theoretic coherence applied to vocabulary evolution under conservative extension constraints) has not been studied.

Landscape comparison (L.7): We position this work explicitly against Spivak's functorial data migration, Goguen's sheaf semantics, Abramsky's sheaf-theoretic contextuality, and Caramello's bridge program. The comparison identifies what is shared, what is new, and where the framework extends existing work.

Open problems (L.8): Seven precisely stated problems for the mathematical community, including two new problems motivated by the results of this appendix (the Eilenberg-Moore category of $\mathcal{I}$ and persistent cohomology of evolving sites).

Notation

Symbols from Parts I–VI (commitment sets, anchors, etc.) follow the conventions in Appendix H. The following notation is specific to this appendix or used here with specialized meaning. Standard category-theoretic and sheaf-theoretic notation follows Mac Lane & Moerdijk, Sheaves in Geometry and Logic: A First Introduction to Topos Theory (New York: Springer-Verlag, 1992).

| Symbol | Meaning | Introduced |
|---|---|---|
| $(\mathbf{Ctx}, J)$ | Context site: category $\mathbf{Ctx}$ with Grothendieck topology $J$ | A12b |
| $\mathbf{PSh}(\mathbf{Ctx})$ | Presheaf category $[\mathbf{Ctx}^{\mathrm{op}}, \mathbf{Set}]$ | L.2 |
| $\mathbf{Sh}(\mathbf{Ctx}, J)$ | Sheaf category (the coherence topos) | L.2 |
| $\Omega$ | Subobject classifier; $\Omega(U)$ = $J$-closed sieves on $U$ | L.2 |
| $a$ | Sheafification: left exact left adjoint to inclusion $i : \mathbf{Sh} \hookrightarrow \mathbf{PSh}$ | L.2 |
| $\mathcal{W}^{\mathcal{P}}$ | Exponential sheaf: certification space from proposals to witnesses | L.3 |
| $\check{C}^n(\mathcal{U}, F)$ | Čech $n$-cochains of presheaf $F$ with respect to cover $\mathcal{U}$ | L.4 |
| $\check{H}^n(\mathcal{U}, F)$ | Čech $n$-th cohomology group | L.4 |
| $\delta^n$ | Čech coboundary map $\check{C}^n \to \check{C}^{n+1}$ | L.4 |
| $C_i \times_U C_j$ | Overlap (fiber product) of $C_i$ and $C_j$ over $U$ | L.4.1 |
| $E_2^{p,q}$ | Second page of the Čech-to-derived-functor spectral sequence | L.4.1 |
| $\underline{H}^q(F)$ | Presheaf of local cohomology groups | L.4.1 |
| $\mathbf{Sig}$ | Category of signatures with inclusion morphisms | L.5 |
| $\Sigma, \Sigma'$ | Signatures (finite sets of typed predicate/function symbols) | A17 |
| $P$ | Proposal endofunctor: $P(\Sigma)$ = single-predicate extension proposals | L.6.1 |
| $P^*$ | Free monad on $P$: finite sequences of proposals | L.6.1 |
| $\mathcal{I}$ | Predicate invention monad: admissible extensions of $\Sigma$ | L.6.2 |
| $\pi : P^* \twoheadrightarrow \mathcal{I}$ | Quotient monad morphism (surjective) | L.6.2 |
| $\eta, \mu$ | Monad unit and multiplication | L.6.2 |
| $\mathbf{Sig}_{\mathcal{I}}$ | Kleisli category: vocabulary evolution paths | L.6.2 |
| $\ker \pi$ | Kernel of the quotient: inadmissible proposal combinations | L.6.3 |

L.2 The Coherence Topos

Throughout this appendix, $(\mathbf{Ctx}, J)$ is the context site from A12b and $\mathbf{Ctx}$ is assumed essentially small.

Theorem (Coherence Topos)

$\mathbf{Sh}(\mathbf{Ctx}, J)$ is a Grothendieck topos (Artin, Grothendieck, and Verdier, Théorie des Topos et Cohomologie Étale des Schémas (SGA 4), Berlin: Springer-Verlag, 1972–1973; Mac Lane & Moerdijk 1992, ch. III, §4). It has all finite limits, all small colimits, exponentials, a subobject classifier $\Omega$, and the inclusion $i : \mathbf{Sh}(\mathbf{Ctx}, J) \hookrightarrow \mathbf{PSh}(\mathbf{Ctx})$ has a left exact left adjoint $a$ (sheafification).

Proof

By Giraud's theorem (Artin, Grothendieck, and Verdier 1972–1973; Mac Lane & Moerdijk 1992, ch. III, Theorem 1). The topology $J$ determines a Lawvere-Tierney operator $j : \Omega_{\mathrm{PSh}} \to \Omega_{\mathrm{PSh}}$ via $j(S) = \{f : V \to U \mid f^*(S) \in J(V)\}$, which is idempotent, preserves top, and preserves meets. The $j$-sheaves are the $J$-sheaves, and the category of $j$-sheaves in a topos is a topos (Mac Lane & Moerdijk, Ch. V, Theorem 1).

Corollary: The Subobject Classifier and Epistemic Status

Truth Values in the Coherence Topos

The subobject classifier is $\Omega(U) = \{S \mid S \text{ is a } J\text{-closed sieve on } U\}$. Truth values are not $\{0, 1\}$ but $J$-closed sieves: families of contexts in which a claim holds, closed under the covering relation.

| A4 Epistemic Status | Topos Interpretation |
|---|---|
| True in $U$ | The maximal sieve $\uparrow\! U$ (all refinements) |
| False in $U$ | The empty sieve $\emptyset$ |
| Undetermined in $U$ | A proper non-empty $J$-closed sieve |
| Conflict at $U$ | Both $\chi_\varphi$ and $\chi_{\neg\varphi}$ are non-empty, proper sieves |

The internal logic is intuitionistic. Excluded middle holds at $U$ iff the restricted topology is discrete (every sieve covers). This is the closed-world assumption. Non-discrete topologies yield open-world reasoning natively, with no adapter required.

Theorem (Logic Selection as Topos Relativization)

The indexed logic selection of A15 is a special case of relativizing to sub-topologies. Specifically: $U \Vdash \varphi \lor \neg\varphi$ for all $\varphi$ iff $J$ restricted to the sieve below $U$ is the discrete topology. The CWA/OWA distinction is not an engineering parameter but a structural property of the topology over each context.

Proof

A Grothendieck topos is Boolean iff $\neg\neg = \mathrm{id}_\Omega$, which holds iff every $J$-closed sieve is maximal or empty, i.e. the discrete topology (Mac Lane & Moerdijk 1992, ch. VI, §6). Restricting to the slice $\mathbf{Sh}/U$ yields a topos whose Booleanness depends on the induced topology on the over-category $\mathbf{Ctx}/U$ (the contexts below $U$).

L.3 Exponentials and Certification

The topos has exponentials. For sheaves $\mathcal{P}$ (proposals, per A19) and $\mathcal{W}$ (witnesses, per A2c):

$$\mathcal{W}^{\mathcal{P}}(U) \cong \mathrm{Hom}_{\mathbf{Sh}/U}(\mathcal{P}|_U, \mathcal{W}|_U)$$

A certification contract (A19b) is a global section $c \in \Gamma(\mathcal{W}^{\mathcal{P}})$: a natural transformation $\mathcal{P} \Rightarrow \mathcal{W}$ that commutes with all restriction maps. The topos guarantees the space of certifications is a well-defined sheaf. Coherence of certification across contexts is naturality. Whether a particular certification exists is the engineering problem; the topos provides the space in which to search.
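To make "coherence is naturality" concrete, here is a minimal sketch of the naturality check for a toy model with two contexts, where V refines U. The section names (`p1`, `w1v`, and so on), the dictionaries, and the helper `is_natural` are hypothetical and chosen only for illustration; they are not part of A19b.

```python
# Hypothetical finite model: V is a refinement of U.
# Sections of the proposal sheaf P and the witness sheaf W over each context.
P = {"U": {"p1", "p2"}, "V": {"p1v", "p2v"}}
W = {"U": {"w1", "w2"}, "V": {"w1v", "w2v"}}

# Restriction maps along V -> U (assumed for illustration).
res_P = {"p1": "p1v", "p2": "p2v"}   # P(U) -> P(V)
res_W = {"w1": "w1v", "w2": "w2v"}   # W(U) -> W(V)


def is_natural(c_U, c_V):
    """Check that certifying then restricting equals restricting then certifying.

    c_U : P(U) -> W(U) and c_V : P(V) -> W(V) are the components of a
    candidate certification. Naturality requires
    c_V(res_P(p)) == res_W(c_U(p)) for every proposal p over U.
    """
    return all(c_V[res_P[p]] == res_W[c_U[p]] for p in P["U"])


# A candidate whose components commute with restriction.
coherent = is_natural(
    {"p1": "w1", "p2": "w2"},
    {"p1v": "w1v", "p2v": "w2v"},
)

# A candidate that certifies differently after restriction.
incoherent = is_natural(
    {"p1": "w1", "p2": "w2"},
    {"p1v": "w2v", "p2v": "w1v"},
)
```

In this toy model the first candidate is a section of the certification space and the second is not: it assigns witnesses locally but fails to commute with restriction.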

L.4 Obstruction Cohomology: A Worked Computation

This section contains what we believe to be novel: an explicit computation of the first sheaf cohomology group $H^1$ for a concrete context site arising in data integration, and its interpretation as classifying ambiguous identity resolution. The companion paper Predicate Invention Under Sheaf Constraints (SCPI) proves that the same $H^1$ classifies obstructions to predicate invention across heterogeneous agent contexts, formalizing the descent problem that A17's three obligations address. The SHEAF Protocol extends this diagnostic to a distributed setting with mechanism-design enforcement.

The Setup: Three-Merchant Catalog

Let $\mathbf{Ctx}$ be the poset category with objects $\{U, A, B, C, A \wedge B, A \wedge C, B \wedge C\}$ where $A, B, C$ are merchant contexts covering the catalog context $U$, and the $\wedge$-objects are pairwise overlaps. Morphisms are inclusions (each overlap refines both parents).

The topology $J$ declares $\{A \to U, B \to U, C \to U\}$ as a cover.

Let $F$ be the presheaf of product identifiers:

  • $F(A) = \{a_1, a_2, a_3\}$ (merchant A's products)
  • $F(B) = \{b_1, b_2, b_3\}$ (merchant B's products)
  • $F(C) = \{c_1, c_2\}$ (merchant C's products)

On overlaps, restriction identifies shared products:

  • $F(A \wedge B)$: products $a_2$ and $b_1$ are "the same item", but the identification is ambiguous (two possible matchings exist)
  • $F(A \wedge C)$: products $a_3$ and $c_1$ are unambiguously identified
  • $F(B \wedge C)$: no shared products

The Čech Complex

The Čech cohomology of $F$ with respect to the cover $\mathcal{U} = \{A, B, C\}$ is computed from the cochain complex:

$$\check{C}^0(\mathcal{U}, F) \xrightarrow{\delta^0} \check{C}^1(\mathcal{U}, F) \xrightarrow{\delta^1} \check{C}^2(\mathcal{U}, F)$$

where:

  • $\check{C}^0 = F(A) \times F(B) \times F(C)$: local sections (one per merchant)
  • $\check{C}^1 = F(A \wedge B) \times F(A \wedge C) \times F(B \wedge C)$: comparison on overlaps
  • $\check{C}^2 = F(A \wedge B \wedge C)$: triple overlaps (empty here)

The coboundary $\delta^0$ sends a local section $(s_A, s_B, s_C)$ to the tuple of restrictions: $(\rho_A(s_A) - \rho_B(s_B),\; \rho_A(s_A) - \rho_C(s_C),\; \rho_B(s_B) - \rho_C(s_C))$.

Computing $H^0$ and $H^1$

$H^0(\mathcal{U}, F) = \ker \delta^0$: the global sections. These are the tuples of local products that agree on all overlaps: the coherent global catalog. If the identifications on overlaps are consistent, $H^0$ is the glued catalog.

$H^1(\mathcal{U}, F) = \ker \delta^1 / \mathrm{im}\, \delta^0$: the ambiguity group.

Theorem ($H^1$ Classifies Ambiguous Identity Resolution)

For the three-merchant site above, $H^1(\mathcal{U}, F) \neq 0$ whenever the overlap $A \wedge B$ admits multiple consistent identifications of shared products. Concretely: if $a_2$ could match either $b_1$ or $b_2$ (both matchings are consistent with the restriction maps), then $H^1$ has order $\geq 2$, and its elements correspond bijectively to the distinct global catalogs that could be assembled from the same local data.

Proof

A 1-cocycle $\sigma \in \ker \delta^1$ assigns to each overlap an identification that satisfies the cocycle condition on triple overlaps (vacuously here, since $A \wedge B \wedge C = \emptyset$). Two 1-cocycles are cohomologous if they differ by a coboundary: a relabeling of local products that induces the identification difference.

When $A \wedge B$ admits two matchings $m_1 : a_2 \leftrightarrow b_1$ and $m_2 : a_2 \leftrightarrow b_2$, these define distinct 1-cocycles. They are cohomologous iff there exists a relabeling of $F(A)$ or $F(B)$ that transforms one matching into the other. If $b_1 \neq b_2$ and neither is in the image of any other identification, no such relabeling exists, and the cocycles represent distinct cohomology classes.

Each class corresponds to a distinct global catalog: the "same" local data assembled into different global pictures depending on which identification is chosen.
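The last point can be illustrated with a small sketch that glues the three merchants' product lists under each candidate matching. It does not compute a cohomology group; it only shows that the two matchings produce different global catalogs from the same local data. The helper `glue` and the string identifiers are hypothetical.

```python
def glue(products, identifications):
    """Quotient the disjoint union of local products by the identifications.

    Returns the glued catalog as a frozenset of equivalence classes.
    A tiny union-find keeps the sketch self-contained.
    """
    parent = {p: p for p in products}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for x, y in identifications:
        parent[find(x)] = find(y)

    classes = {}
    for p in products:
        classes.setdefault(find(p), set()).add(p)
    return frozenset(frozenset(c) for c in classes.values())


# Local data from the three-merchant setup.
products = ["a1", "a2", "a3", "b1", "b2", "b3", "c1", "c2"]

# The unambiguous identification on the overlap of A and C.
fixed = [("a3", "c1")]

# The two candidate matchings on the overlap of A and B.
catalog_m1 = glue(products, fixed + [("a2", "b1")])
catalog_m2 = glue(products, fixed + [("a2", "b2")])

same_catalog = catalog_m1 == catalog_m2
```

Both catalogs have six entries, but they are different quotients of the same eight local products, which is the ambiguity that the choice of cocycle representative resolves.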

Remark

This is the formal version of a problem every data integration practitioner knows: two sources share some entities, the matching is ambiguous, and different matchings produce different downstream results. $H^1 \neq 0$ is the mathematical name for this ambiguity. The group structure tells you how many distinct resolutions exist and how they relate. This is not metaphor: it is computable for finite context sites.

For the agentic substrate specifically: when two AI agents operating in different contexts propose identity claims about shared entities, $H^1$ measures the irreducible ambiguity in reconciling those claims. No amount of embedding similarity resolves it; only an explicit choice of cocycle representative (a witnessed identification) does.

L.4.1 Acyclicity of Hierarchical Sites

The three-merchant example has non-trivial $H^1$ because the overlap structure admits ambiguity. A natural question: for which site structures does ambiguity vanish? The answer connects organizational topology to coherence cost.

Hierarchical Context Site

A context site $(\mathbf{Ctx}, J)$ is hierarchical if:

  1. $\mathbf{Ctx}$ is a finite rooted tree (poset where every element except the root has exactly one immediate predecessor)
  2. The topology $J$ is generated by parent-children families: for each non-leaf node $U$ with children $\{C_1, \ldots, C_k\}$, the family $\{C_i \to U\}_{i=1}^k$ is a cover
  3. For distinct siblings $C_i, C_j$ (children of the same parent), the overlap $C_i \times_U C_j$ is the initial object $\emptyset$ (no shared sub-context between different branches)

Theorem (Acyclicity of Hierarchical Sites)

Let $(\mathbf{Ctx}, J)$ be a hierarchical context site. For any abelian presheaf $F$ on $\mathbf{Ctx}$ and any cover $\mathcal{U}$ in $J$:

$$\check{H}^n(\mathcal{U}, F) = 0 \quad \text{for all } n \geq 1$$

In particular, $H^1 = 0$: there is no ambiguity in identity resolution for hierarchical organizations.

Proof

We prove this by analyzing the Čech complex directly.

Step 1: Structure of overlaps in a tree.

Let $U$ be a node with children $\{C_1, \ldots, C_k\}$ forming a cover. For $i \neq j$, the overlap $C_i \times_U C_j = \emptyset$ by the tree condition (distinct branches share no sub-context). Therefore for any presheaf $F$:

$$F(C_i \times_U C_j) = F(\emptyset) = \{*\} \quad \text{(terminal, for an abelian presheaf: the zero object)}$$

Step 2: Collapse of the Čech complex.

The Čech complex for cover $\mathcal{U} = \{C_1, \ldots, C_k\}$ of $U$ is:

$$\check{C}^0 = \prod_{i} F(C_i) \xrightarrow{\delta^0} \check{C}^1 = \prod_{i < j} F(C_i \times_U C_j) \xrightarrow{\delta^1} \check{C}^2 = \prod_{i < j < l} F(C_i \times_U C_j \times_U C_l) \to \cdots$$

Since $C_i \times_U C_j = \emptyset$ for all $i \neq j$, every term $\check{C}^n = 0$ for $n \geq 1$. The complex is:

$$\prod_i F(C_i) \to 0 \to 0 \to \cdots$$

Therefore $\check{H}^n = 0$ for all $n \geq 1$.
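The hypothesis doing the work in Steps 1 and 2 is that siblings share no sub-context. A minimal sketch of that check follows, under the assumption that a finite context poset is given as a hypothetical `children` mapping and that the overlap of two contexts is modeled as their set of common sub-contexts. The function names are illustrative only.

```python
def descendants(children, node):
    """Return `node` together with every context reachable below it."""
    seen, stack = set(), [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(children.get(n, []))
    return seen


def overlapping_siblings(children, node):
    """List pairs of children of `node` that share at least one sub-context.

    An empty result means every pairwise overlap is empty, so the Cech
    complex of the parent-children cover has no terms in degree >= 1.
    """
    kids = children.get(node, [])
    return [
        (a, b)
        for i, a in enumerate(kids)
        for b in kids[i + 1:]
        if descendants(children, a) & descendants(children, b)
    ]


# A tree: every non-root context has exactly one parent.
tree = {"U": ["C1", "C2", "C3"], "C1": ["D1"], "C2": ["D2"]}

# Not a tree: the context AB sits below both A and B.
shared = {"U": ["A", "B"], "A": ["AB"], "B": ["AB"]}

tree_overlaps = overlapping_siblings(tree, "U")
shared_overlaps = overlapping_siblings(shared, "U")
```

For the tree the list of overlapping siblings is empty, matching the collapse of the complex; for the second poset the pair (A, B) has a shared sub-context, so the degree-1 term need not vanish.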

Step 3: Extension to composite covers.

For a cover of a non-root node, the same argument applies locally: each non-leaf is covered by its children, which are pairwise disjoint. By the Čech-to-derived-functor spectral sequence (or directly by Leray's theorem applied to the refinement of any cover by the canonical parent-children covers), the vanishing extends to all covers in $J$, not just the generating ones.

Step 4: Recursive argument for depth $> 1$.

For a tree of depth $d$, consider the cover of the root by its children, then each child by its children, etc. The Čech-to-sheaf cohomology spectral sequence for this iterated cover has:

$$E_2^{p,q} = \check{H}^p(\mathcal{U}, \underline{H}^q(F))$$

where $\underline{H}^q$ is the presheaf of local cohomology. By induction on depth: $\underline{H}^q = 0$ for $q \geq 1$ (each sub-tree is acyclic by the inductive hypothesis), so $E_2^{p,q} = 0$ for $q \geq 1$. And $E_2^{p,0} = \check{H}^p(\mathcal{U}, F) = 0$ for $p \geq 1$ by Step 2. Therefore the spectral sequence degenerates and $H^n(\mathbf{Ctx}, F) = 0$ for all $n \geq 1$.

Remark

This theorem has a precise operational meaning: hierarchical organizations have no identity ambiguity. When contexts are organized as a tree — a corporate hierarchy, a taxonomic classification, a file system — the hierarchy itself resolves all identity questions. Two items in different branches are either identified by a common ancestor's decree or they are not. There is no room for multiple consistent identifications because distinct branches share no sub-context on which to disagree.

This explains a familiar phenomenon: hierarchical organizations are easy to integrate. Corporate mergers between divisions that shared no operations succeed trivially. Taxonomies with strict inclusion are unambiguous. File systems never have merge conflicts within a single tree.

The price of acyclicity is rigidity. A tree cannot express "A and B share some context but neither subsumes the other." Peer-to-peer and federated structures can, and they pay for it with non-trivial cohomology.

L.4.2 Higher Obstructions in Federated Sites

Federated structures are the opposite extreme from hierarchies: multiple overlapping authorities, no single root, non-trivial shared contexts. We show that federated sites can have non-trivial $H^2$, which classifies meta-conflicts: disagreements not about identity itself but about how to resolve identity disagreements.

Federated Context Site

A context site $(\mathbf{Ctx}, J)$ is federated if:

  1. $\mathbf{Ctx}$ contains a set of federation nodes $\{F_1, \ldots, F_m\}$ and member nodes $\{M_1, \ldots, M_n\}$
  2. Each member belongs to at least one federation: for each $M_j$, there exists $F_i$ with a morphism $M_j \to F_i$ (membership)
  3. The topology $J$ includes the cover $\{M_j \to F_i \mid M_j \in F_i\}$ for each federation $F_i$
  4. Members of distinct federations may share non-trivial overlaps: $M_j \times_{F_i} M_k$ need not be initial
  5. There exists a global context $G$ covered by $\{F_1, \ldots, F_m\}$

Theorem (Non-Trivial $H^2$ in Federated Sites)

There exists a federated context site $(\mathbf{Ctx}, J)$ and presheaf $F$ such that $H^2(\mathcal{U}, F) \neq 0$ for a cover $\mathcal{U}$ of the global context. Elements of $H^2$ classify meta-obstructions: situations where pairwise identity resolutions exist but no globally consistent resolution strategy exists.

Proof

Construction. Let $\mathbf{Ctx}$ have:

  • Global context $G$
  • Three federation nodes $F_1, F_2, F_3$ covering $G$
  • Three member nodes $M_{ij}$ for $1 \leq i < j \leq 3$, where $M_{ij}$ belongs to both $F_i$ and $F_j$ (the shared member between federations $i$ and $j$)
  • Triple overlap $M_{123}$ belonging to all three federations

The cover of $G$ is $\mathcal{U} = \{F_1, F_2, F_3\}$. The pairwise overlaps are $F_i \times_G F_j = M_{ij}$. The triple overlap is $F_1 \times_G F_2 \times_G F_3 = M_{123}$.

Let $F$ be a presheaf of identification protocols (an abelian group, for concreteness $\mathbb{Z}/2\mathbb{Z}$-valued):

  • $F(F_i) = \mathbb{Z}/2\mathbb{Z}$ for each federation (two possible identity conventions: "match by name" vs "match by code")
  • $F(M_{ij}) = \mathbb{Z}/2\mathbb{Z}$ (the agreed convention on the shared member)
  • $F(M_{123}) = \mathbb{Z}/2\mathbb{Z}$

The restriction maps $\rho_i : F(F_i) \to F(M_{ij})$ are the identity (each federation imposes its convention on its shared members).

The Čech complex:

$$\check{C}^0 = (\mathbb{Z}/2)^3 \xrightarrow{\delta^0} \check{C}^1 = (\mathbb{Z}/2)^3 \xrightarrow{\delta^1} \check{C}^2 = \mathbb{Z}/2$$

The coboundary $\delta^0(a_1, a_2, a_3) = (a_1 - a_2, a_1 - a_3, a_2 - a_3)$.

The coboundary $\delta^1(b_{12}, b_{13}, b_{23}) = b_{12} - b_{13} + b_{23}$ (the alternating sum on the triple overlap).

Computing $\check{H}^1$ first. For $\ker \delta^1$ we need $b_{12} - b_{13} + b_{23} = 0$ in $\mathbb{Z}/2$, i.e., $b_{12} + b_{13} + b_{23} = 0$. This kernel has order $4$ (any two of the three values determine the third).

For $\mathrm{im}\, \delta^0$: the image consists of $(a_1-a_2, a_1-a_3, a_2-a_3)$. Over $\mathbb{Z}/2$, this gives vectors $(a_1+a_2, a_1+a_3, a_2+a_3)$. When $(a_1,a_2,a_3)$ ranges over $(\mathbb{Z}/2)^3$, the image has order $4$ (one can verify: the map has kernel $\{(0,0,0), (1,1,1)\}$, so the image has $8/2 = 4$ elements).

Both subgroups have order $4$, so $\check{H}^1 = \ker\delta^1 / \mathrm{im}\,\delta^0 = (\mathbb{Z}/2)^2 / (\mathbb{Z}/2)^2$ should vanish. The explicit enumeration that follows confirms that the two subgroups coincide.

$\mathrm{im}\,\delta^0$: with $a = (a_1,a_2,a_3) \in (\mathbb{Z}/2)^3$:

  • $(0,0,0) \mapsto (0,0,0)$
  • $(1,0,0) \mapsto (1,1,0)$
  • $(0,1,0) \mapsto (1,0,1)$
  • $(0,0,1) \mapsto (0,1,1)$
  • $(1,1,0) \mapsto (0,1,1)$
  • $(1,0,1) \mapsto (1,0,1)$
  • $(0,1,1) \mapsto (1,1,0)$
  • $(1,1,1) \mapsto (0,0,0)$

So $\mathrm{im}\,\delta^0 = \{(0,0,0), (1,1,0), (1,0,1), (0,1,1)\}$, which has order 4.

$\ker\delta^1$: we need $b_{12} + b_{13} + b_{23} = 0 \pmod{2}$: $(0,0,0), (1,1,0), (1,0,1), (0,1,1)$, also of order 4.

So $\check{H}^1 = \ker\delta^1/\mathrm{im}\,\delta^0 = 0$ in this case.
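The enumeration above is small enough to check mechanically. The following sketch, with function names chosen here for illustration, applies the two coboundary formulas from the construction to every element of $(\mathbb{Z}/2)^3$ and compares the image of $\delta^0$ with the kernel of $\delta^1$.

```python
from itertools import product


def delta0(a):
    """delta^0(a1, a2, a3) = (a1 - a2, a1 - a3, a2 - a3), reduced mod 2."""
    a1, a2, a3 = a
    return ((a1 - a2) % 2, (a1 - a3) % 2, (a2 - a3) % 2)


def delta1(b):
    """delta^1(b12, b13, b23) = b12 - b13 + b23, reduced mod 2."""
    b12, b13, b23 = b
    return (b12 - b13 + b23) % 2


# All eight elements of (Z/2)^3, used for both C^0 and C^1.
cochains = list(product((0, 1), repeat=3))

image_d0 = {delta0(a) for a in cochains}
kernel_d1 = {b for b in cochains if delta1(b) == 0}

# The quotient order is meaningful because the image lies inside the kernel.
image_in_kernel = image_d0 <= kernel_d1
h1_order = len(kernel_d1) // len(image_d0)
```

The image and kernel are the same four-element set, so the quotient has order 1, in agreement with the hand computation.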

Now $\check{H}^2 = \check{C}^2 / \mathrm{im}\,\delta^1 = (\mathbb{Z}/2) / \mathrm{im}\,\delta^1$.

$\mathrm{im}\,\delta^1$: $\delta^1(b_{12},b_{13},b_{23}) = b_{12}+b_{13}+b_{23}$. Since $(1,0,0) \mapsto 1$, the image is all of $\mathbb{Z}/2$.

So $\check{H}^2 = 0$ here as well. This is because the nerve of this cover is the 2-simplex $\Delta^2$, which is contractible.

The non-trivial case requires a cover whose nerve has non-trivial $H^2$. We modify the construction: let $G$ be covered by four federations $F_1, F_2, F_3, F_4$ with pairwise overlaps $M_{ij}$ for all $i < j$, triple overlaps $M_{ijk}$ for all $i < j < k$, but no quadruple overlap ($M_{1234} = \emptyset$). The nerve is the boundary of a 3-simplex $\partial\Delta^3 \cong S^2$, which has $H^2(S^2, \mathbb{Z}/2) = \mathbb{Z}/2 \neq 0$.

Concretely, the Čech complex becomes:

$$(\mathbb{Z}/2)^4 \xrightarrow{\delta^0} (\mathbb{Z}/2)^6 \xrightarrow{\delta^1} (\mathbb{Z}/2)^4 \xrightarrow{\delta^2} 0$$

(The last term is $0$ because $\check{C}^3 = F(M_{1234}) = F(\emptyset) = 0$.)

The standard computation gives $\check{H}^2 = \mathbb{Z}/2$. A non-trivial 2-cocycle assigns values to each triple overlap such that the alternating sum condition is satisfied, but these values cannot be decomposed as coboundaries from pairwise overlaps. This is a meta-obstruction: each pair of federations can resolve their identity disagreements, and each triple of federations can find a consistent resolution, but there is no single global resolution strategy compatible with all four federations simultaneously.
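The "standard computation" can be sketched in a few lines for the case used here: constant $\mathbb{Z}/2$ coefficients with identity restriction maps, so that the Čech complex is the simplicial cochain complex of the nerve. The functions `rank_gf2` and `cech_betti_gf2` are hypothetical helpers written for this illustration. Ranks are computed over GF(2), and $\dim \check{H}^p = \dim \check{C}^p - \operatorname{rank}\delta^p - \operatorname{rank}\delta^{p-1}$.

```python
from itertools import combinations


def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are integer bitmasks."""
    basis = {}  # leading bit position -> reduced row with that leading bit
    for row in rows:
        while row:
            top = row.bit_length() - 1
            if top not in basis:
                basis[top] = row
                break
            row ^= basis[top]  # eliminate the leading bit and keep reducing
    return len(basis)


def cech_betti_gf2(n_vertices, max_dim):
    """Dimensions of H^0..H^max_dim over Z/2 for the nerve that contains
    every simplex of dimension <= max_dim on n_vertices vertices."""
    simplices = [
        list(combinations(range(n_vertices), p + 1)) for p in range(max_dim + 1)
    ]
    ranks = []
    for p in range(max_dim):
        index = {s: i for i, s in enumerate(simplices[p])}
        # One row per (p+1)-simplex: the bitmask of its p-dimensional faces.
        rows = [
            sum(1 << index[face] for face in combinations(t, p + 1))
            for t in simplices[p + 1]
        ]
        ranks.append(rank_gf2(rows))
    # Pad with rank(delta^{-1}) = 0 and rank(delta^{max_dim}) = 0 (zero target).
    ranks = [0] + ranks + [0]
    return [
        len(simplices[p]) - ranks[p + 1] - ranks[p] for p in range(max_dim + 1)
    ]


# Three federations with a triple overlap: the nerve is the full 2-simplex.
three_federations = cech_betti_gf2(3, 2)

# Four federations, all triple overlaps, no quadruple overlap: the nerve is
# the boundary of the 3-simplex.
four_federations = cech_betti_gf2(4, 2)
```

For three federations the result is `[1, 0, 0]`; for four federations it is `[1, 0, 1]`, the one-dimensional $\check{H}^2$ over $\mathbb{Z}/2$ claimed above. The sketch covers only constant coefficients; a presheaf with non-identity restrictions would need its own coboundary matrices.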

Remark

The $H^2$ meta-obstruction has a vivid operational interpretation. Consider four regulatory bodies ($F_1, \ldots, F_4$) each overseeing a set of financial institutions. Any two regulators can agree on how to identify shared entities. Any three can find a consistent protocol. But when all four try to federate, a global obstruction emerges: the pairwise agreements, though locally consistent in triples, cannot be simultaneously satisfied. This is a higher-order coordination failure: not a conflict about data but a conflict about conflict-resolution strategies.

For the agentic substrate: $H^2 \neq 0$ means that even if every pair of AI agents can resolve their identity disputes, and every triple can coordinate, the system as a whole may still lack a globally consistent identity protocol. The obstruction is structural, residing in the topology of the federation, not in any particular data disagreement.

The nerve of the cover is the key invariant: when it has non-trivial higher homotopy, higher cohomology obstructions emerge. This connects the formal theory to classical algebraic topology in a precise and computable way.

L.4.3 The Cohomological Hierarchy: A Classification

The results of L.4, L.4.1, and L.4.2 fit into a single classification:

| Site Structure | Nerve Topology | $H^0$ | $H^1$ | $H^2$ | Operational Meaning |
|---|---|---|---|---|---|
| Hierarchical (tree) | Contractible | Global sections | $0$ | $0$ | No ambiguity; hierarchy resolves all |
| Flat peer-to-peer | $\bigvee S^1$ (wedge of circles) | Partial globals | Non-trivial | $0$ | Identity ambiguity; finitely many resolutions |
| Federated (overlapping authorities) | $S^2$ or higher | Partial globals | May be non-trivial | Non-trivial | Meta-obstruction; coordination strategy conflict |
| Fully connected | Contractible ($\Delta^{n-1}$) | Global sections | $0$ | $0$ | Total overlap; everyone sees everything |

Remark

The fully connected case is as acyclic as the hierarchical case, but for the opposite reason: in a tree, siblings share nothing; in a complete graph, everyone shares everything. Both extremes are cohomologically trivial. The interesting (and realistic) cases lie between these extremes — partial overlap, partial authority, partial sharing. These are exactly the structures that arise in multi-agent AI systems, federated databases, and inter-organizational data sharing.

The cohomological hierarchy provides a quantitative topology of organizational coherence cost. An architect choosing between a hierarchical and federated design is choosing a point in this hierarchy, with precise consequences for the complexity of identity resolution.

L.5 Vocabulary Evolution: Composability and Its Limits

Signature Category

$\mathbf{Sig}$ is the category of signatures (finite sets of typed predicate/function symbols) with morphisms the signature inclusions $\Sigma \hookrightarrow \Sigma'$.

Theorem (Composability of Conservative Extensions)

If $\Sigma \hookrightarrow \Sigma'$ and $\Sigma' \hookrightarrow \Sigma''$ are both conservative extensions (A17b), then $\Sigma \hookrightarrow \Sigma''$ is conservative.

Proof

Let $\varphi$ be a $\Sigma$-sentence with $(\Sigma'', I'', L) \vdash \varphi$. Since $\varphi$ is also a $\Sigma'$-sentence, conservativity of $\Sigma' \hookrightarrow \Sigma''$ yields $(\Sigma', I', L) \vdash \varphi$. Conservativity of $\Sigma \hookrightarrow \Sigma'$ then yields $(\Sigma, I, L) \vdash \varphi$. The converse is monotonicity.

This composability is what makes incremental vocabulary evolution safe. A chain of conservative extensions is conservative. You verify each step; the chain is automatic.

But overlap agreement does not compose. This is the central tension in the theory of vocabulary evolution, and we state it as a theorem:

Theorem (Non-Composability of Overlap Agreement)

There exist admissible extensions $\Sigma \hookrightarrow \Sigma_1 = \Sigma \cup \{q_1\}$ and $\Sigma \hookrightarrow \Sigma_2 = \Sigma \cup \{q_2\}$, each satisfying all three obligations of A17, such that $\Sigma \hookrightarrow \Sigma_{12} = \Sigma \cup \{q_1, q_2\}$ fails Obligation 2 (overlap agreement).

Proof

Construction. Let $\mathbf{Ctx}$ have three objects: $U$, $V$, and $U \wedge V$. Let $\Sigma$ contain a sort $D$ (dresses).

Define $q_1 : D \to [0,1]$ (a scoring predicate) with:

  • In $U$: $q_1(d) = \text{material\_quality}(d)$
  • In $V$: $q_1(d) = \text{material\_quality}(d)$
  • On overlap $U \wedge V$: agreement holds (same definition).

Define $q_2 : D \to \{\text{true}, \text{false}\}$ with:

  • In $U$: $q_2(d) = [q_1(d) > 0.7]$ (thresholded from $q_1$)
  • In $V$: $q_2(d) = [\text{certified\_sustainable}(d)]$ (independent of $q_1$)
  • On overlap $U \wedge V$: agreement holds, since both views happen to agree on the items in the overlap.

Individually, $q_1$ passes Obligation 2 (same definition in both views), and $q_2$ passes Obligation 2 (agreement on overlap for the current items).

Now add both. The compound predicate $q_3(d) = q_2(d) \wedge [q_1(d) > 0.5]$ is derivable in $\Sigma_{12}$. In $U$, this means $[\text{material\_quality}(d) > 0.7] \wedge [\text{material\_quality}(d) > 0.5]$, which simplifies to $q_1(d) > 0.7$. In $V$, this means $[\text{certified\_sustainable}(d)] \wedge [\text{material\_quality}(d) > 0.5]$. On the overlap, these may disagree: an item with $\text{material\_quality} = 0.6$ and $\text{certified\_sustainable} = \text{true}$ satisfies the $V$-version but not the $U$-version.

The interaction between $q_1$ and $q_2$ creates a derived predicate that fails overlap agreement, even though each individually passed.
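The witness item from the construction can be evaluated directly. This sketch encodes the two local definitions as plain functions. The dictionary keys and function names are hypothetical stand-ins for the grounding definitions, and the item is assumed to lie outside the set of current items on which $q_2$ was originally checked, since the two definitions of $q_2$ also differ on it.

```python
def q1(item):
    """Scoring predicate: the same definition in both views."""
    return item["material_quality"]


def q2_U(item):
    """q2 in view U: thresholded from q1."""
    return q1(item) > 0.7


def q2_V(item):
    """q2 in view V: independent of q1."""
    return item["certified_sustainable"]


def q3_U(item):
    """Derived predicate q3 = q2 and [q1 > 0.5], read in view U."""
    return q2_U(item) and q1(item) > 0.5


def q3_V(item):
    """Derived predicate q3 = q2 and [q1 > 0.5], read in view V."""
    return q2_V(item) and q1(item) > 0.5


# The witness from the proof: moderate quality, certified sustainable.
witness = {"material_quality": 0.6, "certified_sustainable": True}

disagreement = q3_U(witness) != q3_V(witness)
```

On the witness, the U-reading of $q_3$ is false and the V-reading is true, so a check of Obligation 2 that covers this item fails for the combined signature.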

Remark

This non-composability theorem is the formal reason why the coherence cost model (A21) exhibits quadratic scaling. Each new predicate must be checked against all existing predicates on all overlaps, not just in isolation. The monad multiplication — composing two rounds of predicate invention — requires a full re-verification of Obligation 2 for the composite.

For the agentic substrate: this means that autonomous agents cannot safely invent vocabulary in parallel and then merge the results. Vocabulary invention is inherently sequential at the overlap-checking stage. An agentic system that invents predicates concurrently must synchronize at the point of overlap verification. This is a structural limit, not an engineering deficiency.

L.6 The Predicate Invention Monad

Despite the non-composability of Obligation 2, predicate invention has a well-defined algebraic structure when the full A17 pipeline (including re-verification) is included. We develop this structure in three stages: the free monad of unconstrained proposals, the quotient that enforces admissibility, and the resulting algebraic characterization.

L.6.1 The Proposal Endofunctor

Proposal Endofunctor

Define the proposal endofunctor $P : \mathbf{Sig} \to \mathbf{Sig}$ by:

$$P(\Sigma) = \{(\Sigma \cup \{q\}, \delta_q) \mid q \notin \Sigma,\; \delta_q \text{ is a grounding definition for } q\}$$

where $\delta_q$ specifies the sort, arity, and local definition of $q$ in each context. $P$ sends a signature to the set of all single-predicate extension proposals (without checking admissibility). On morphisms: an inclusion $\Sigma \hookrightarrow \Sigma'$ maps a $\Sigma$-proposal $(\Sigma \cup \{q\}, \delta_q)$ to the $\Sigma'$-proposal $(\Sigma' \cup \{q\}, \delta_q)$ when $q \notin \Sigma'$, and discards it otherwise (the proposed predicate already exists).
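The action on morphisms can be sketched in a few lines. This is an illustrative sketch only: signatures are modelled as frozen sets of predicate names, the grounding definition is an opaque string, and the names `Proposal`, `propose`, and `transport` are hypothetical.

```python
from typing import FrozenSet, NamedTuple, Optional


class Proposal(NamedTuple):
    signature: FrozenSet[str]  # the extended signature, Sigma plus the new symbol
    predicate: str             # the new symbol q
    grounding: str             # opaque stand-in for the grounding definition


def propose(sigma: FrozenSet[str], q: str, grounding: str) -> Proposal:
    """Build a single-predicate extension proposal; q must be new to sigma."""
    if q in sigma:
        raise ValueError(f"{q!r} already belongs to the signature")
    return Proposal(frozenset(sigma) | {q}, q, grounding)


def transport(p: Proposal, sigma: FrozenSet[str],
              sigma_prime: FrozenSet[str]) -> Optional[Proposal]:
    """Push a sigma-proposal along the inclusion of sigma into sigma_prime.

    Returns None when the proposed predicate already exists in sigma_prime,
    mirroring the 'discards it otherwise' clause.
    """
    if not sigma <= sigma_prime:
        raise ValueError("sigma must be included in sigma_prime")
    if p.predicate in sigma_prime:
        return None
    return Proposal(frozenset(sigma_prime) | {p.predicate}, p.predicate, p.grounding)


sigma = frozenset({"material_quality"})
p = propose(sigma, "sustainable", "threshold on material_quality")
print(transport(p, sigma, frozenset({"material_quality", "certified"})))
```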

Free Monad on Proposals

The free monad $P^*$ on the endofunctor $P$ is defined by:

$$P^*(\Sigma) = \coprod_{n \geq 0} P^n(\Sigma) = \Sigma + P(\Sigma) + P(P(\Sigma)) + \cdots$$

An element of $P^*(\Sigma)$ is a finite sequence of extension proposals $(q_1, \delta_1), \ldots, (q_n, \delta_n)$ applied to $\Sigma$. The monadic structure:

  • Unit $\eta : \mathrm{Id} \Rightarrow P^*$ embeds $\Sigma$ as the empty sequence of proposals.
  • Multiplication $\mu : P^* \circ P^* \Rightarrow P^*$ flattens a sequence-of-sequences into a single sequence by concatenation.

$P^*$ is the free monad on $P$ in the sense of the universal property: for any monad $T$ and natural transformation $\alpha : P \Rightarrow T$, there exists a unique monad morphism $\bar{\alpha} : P^* \to T$ extending $\alpha$.
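Following the description of elements as finite sequences, the unit and multiplication can be sketched with tuples. This is a toy model under that reading, not a general free-monad construction; `unit`, `flatten`, and `apply_sequence` are illustrative names.

```python
from typing import FrozenSet, Tuple

Proposal = Tuple[str, str]            # (predicate name, grounding definition)
ProposalSeq = Tuple[Proposal, ...]    # a finite sequence of proposals


def unit() -> ProposalSeq:
    """The empty sequence: extend the signature by nothing."""
    return ()


def flatten(nested: Tuple[ProposalSeq, ...]) -> ProposalSeq:
    """Concatenate a sequence of sequences into a single sequence."""
    out: ProposalSeq = ()
    for inner in nested:
        out += tuple(inner)
    return out


def apply_sequence(sigma: FrozenSet[str], seq: ProposalSeq) -> FrozenSet[str]:
    """Apply proposals in order; each proposed predicate must be new."""
    result = set(sigma)
    for q, _grounding in seq:
        if q in result:
            raise ValueError(f"{q!r} already present")
        result.add(q)
    return frozenset(result)


s = (("q1", "raw score"), ("q2", "threshold on q1"))
print(apply_sequence(frozenset({"material_quality"}), flatten((s, unit()))))
```

In this model the unit laws say that wrapping a sequence in a one-element outer sequence, or wrapping each proposal in its own singleton sequence, and then flattening returns the original sequence.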

L.6.2 The Admissibility Quotient

The free monad $P^*$ allows any sequence of proposals. The predicate invention monad $\mathcal{I}$ is the quotient that enforces the three obligations of A17.

Predicate Invention Monad

Define the admissibility relation $\sim$ on $P^*(\Sigma)$: two proposal sequences are equivalent if they yield the same final signature and both pass (or both fail) the A17 admissibility check. Define:

$$\mathcal{I}(\Sigma) = \{\Sigma' \supseteq \Sigma \mid \Sigma \hookrightarrow \Sigma' \text{ passes A17}\}$$

ordered by inclusion. There is a surjective monad morphism $\pi : P^* \twoheadrightarrow \mathcal{I}$ that sends each proposal sequence to its composite extension (if admissible) or discards it (if not). The monadic structure:

  • Unit $\eta_\Sigma : \Sigma \hookrightarrow \mathcal{I}(\Sigma)$: the identity extension (always admissible).
  • Multiplication $\mu_\Sigma : \mathcal{I}(\mathcal{I}(\Sigma)) \to \mathcal{I}(\Sigma)$: compose extensions and re-verify Obligation 2 for the composite. $\mu$ is well-defined because conservative extension composes (L.5) and Obligations 1 and 3 are monotone in signature; only Obligation 2 requires re-checking.

The Kleisli category $\mathbf{Sig}_{\mathcal{I}}$ has:

  • Objects: signatures
  • Morphisms $\Sigma \to \Sigma'$: admissible extensions
  • Composition: extension-then-re-verify

This is the category of vocabulary evolution paths. A morphism in $\mathbf{Sig}_{\mathcal{I}}$ is a certified route from one vocabulary to another.
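The extension-then-re-verify composition can be sketched as follows. The admissibility check here is a deliberately crude stand-in: a hypothetical table `CONFLICTS` lists pairs of predicates assumed to fail Obligation 2 together, whereas a real check would run all three A17 obligations against the actual overlaps.

```python
from typing import FrozenSet, Optional, Set

Signature = FrozenSet[str]

# Hypothetical stand-in for Obligation 2 failures: pairs of predicates whose
# derived consequences are assumed to disagree on some overlap.
CONFLICTS: Set[FrozenSet[str]] = {frozenset({"q1", "q2"})}


def passes_a17(base: Signature, extended: Signature) -> bool:
    """Toy admissibility check depending only on the final signature."""
    if not base <= extended:
        return False
    return not any(pair <= extended for pair in CONFLICTS)


def compose(sigma: Signature, first: Signature,
            second: Signature) -> Optional[Signature]:
    """Sequential composition: extend, extend again, then re-verify the composite."""
    if not passes_a17(sigma, first) or not passes_a17(first, second):
        return None
    return second if passes_a17(sigma, second) else None


def merge_parallel(sigma: Signature, ext_a: Signature,
                   ext_b: Signature) -> Optional[Signature]:
    """Merge two independently invented extensions; re-verification may reject."""
    if not (passes_a17(sigma, ext_a) and passes_a17(sigma, ext_b)):
        return None
    combined = ext_a | ext_b
    return combined if passes_a17(sigma, combined) else None


base = frozenset({"material_quality"})
print(compose(base, base | {"q1"}, base | {"q1", "q3"}))
print(merge_parallel(base, base | {"q1"}, base | {"q2"}))
```

In this toy model the two parallel extensions are each admissible, yet their merge is rejected, which is the shape of failure that forces synchronization at the overlap check.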

Theorem (Predicate Invention as Quotient of Free Monad)

$\mathcal{I}$ is a quotient monad of $P^*$. Specifically, there is a surjective monad morphism $\pi : P^* \twoheadrightarrow \mathcal{I}$ whose kernel is the congruence generated by two relations:

  1. Path independence: $(q_1, \delta_1), (q_2, \delta_2) \sim (q_2, \delta_2), (q_1, \delta_1)$ when both orderings yield the same composite extension
  2. Admissibility filtering: $(q_1, \delta_1), \ldots, (q_n, \delta_n) \sim \bot$ when the composite $\Sigma \cup \{q_1, \ldots, q_n\}$ fails any obligation of A17

Consequently, the category of $\mathcal{I}$-algebras is a reflective subcategory of $P^*$-algebras, consisting of those $P^*$-algebras where the Obligation 2 equations hold.

Proof

That $\pi$ is a monad morphism: We must show $\pi$ commutes with unit and multiplication. For the unit: $\pi(\eta_{P^*}(\Sigma)) = \pi(\Sigma, \text{empty sequence}) = \Sigma = \eta_{\mathcal{I}}(\Sigma)$. For multiplication: let $s$ be an element of $P^*(P^*(\Sigma))$, that is, a sequence of sequences of proposals. Then $\pi(\mu_{P^*}(s))$ is the composite of the flattened sequence, and $\mu_{\mathcal{I}}(\pi(\pi(s)))$ is the composite of the composites. Since extension composition is associative (signature union is associative), these agree when both are admissible. When either is inadmissible, both map to $\bot$.

Surjectivity: Every admissible extension $\Sigma \hookrightarrow \Sigma'$ with $\Sigma' = \Sigma \cup \{q_1, \ldots, q_n\}$ is the image of the proposal sequence $(q_1, \delta_1), \ldots, (q_n, \delta_n)$ under $\pi$.

Kernel characterization: Two proposal sequences have the same image under $\pi$ iff they yield the same composite signature (path independence) or both are inadmissible (admissibility filtering). These generate a congruence on $P^*$ because both relations are compatible with the monad multiplication (re-verification depends only on the composite, not the path).

Reflective subcategory: An $\mathcal{I}$-algebra is a signature $\Sigma$ equipped with an action $\alpha : \mathcal{I}(\Sigma) \to \Sigma$, a way to "absorb" admissible extensions. This is a $P^*$-algebra that additionally satisfies: whenever two proposal sequences yield extensions that individually pass A17 but whose composite fails Obligation 2, the algebra's action must reject the composite. The reflector is the functor that takes a $P^*$-algebra and quotients by the Obligation 2 relations.
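In standard notation, the two conditions verified in the first step of the proof are the monad-morphism equations. The horizontal composite $\pi \ast \pi$ is our notation for what the proof writes as $\pi(\pi(s))$:

$$\pi \circ \eta^{P^*} = \eta^{\mathcal{I}}, \qquad \pi \circ \mu^{P^*} = \mu^{\mathcal{I}} \circ (\pi \ast \pi), \qquad \text{where } \pi \ast \pi = \pi_{\mathcal{I}} \circ P^*(\pi) = \mathcal{I}(\pi) \circ \pi_{P^*}$$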

Remark

The monad laws hold:

  • Left unit: $\mu \circ \eta_{\mathcal{I}} = \mathrm{id}$ (extending by nothing, then composing, is identity).
  • Right unit: $\mu \circ \mathcal{I}(\eta) = \mathrm{id}$ (composing with the identity extension is identity).
  • Associativity: $\mu \circ \mu_{\mathcal{I}} = \mu \circ \mathcal{I}(\mu)$. This holds because re-verification of Obligation 2 for the composite is independent of the order in which we compose three extensions. The overlap structure depends only on the final signature, not on the path taken to reach it.

The last point is significant: the cost of re-verification may depend on the path (some orderings may allow caching), but the result does not. The monad captures what is invariant (the admissibility condition); the cost model (A21) captures what varies (the verification effort).

L.6.3 The Algebraic Content of Non-Composability

The quotient structure $\pi : P^* \twoheadrightarrow \mathcal{I}$ makes the non-composability theorem (L.5) algebraically precise.

Theorem (Non-Composability as Non-Freeness)

$\mathcal{I}$ is not a free monad on any endofunctor. Concretely: the kernel of $\pi : P^* \twoheadrightarrow \mathcal{I}$ is non-trivial; it contains proposal sequences that are admissible individually but inadmissible in combination.

Proof

If $\mathcal{I}$ were free on some endofunctor $Q$, then every $\mathcal{I}$-algebra would be determined by a $Q$-action, with no additional equations. But the Obligation 2 constraint imposes equations that depend on pairs of proposals and their interaction on overlaps, and these equations cannot be captured by the structure map of a single endofunctor. Specifically: the non-composability theorem (L.5) exhibits two proposals $q_1, q_2$ such that $\mathcal{I}(\Sigma) \ni \Sigma \cup \{q_1\}$ and $\mathcal{I}(\Sigma) \ni \Sigma \cup \{q_2\}$, but $\Sigma \cup \{q_1, q_2\} \notin \mathcal{I}(\Sigma)$.

In a free monad $Q^*$ on an endofunctor $Q$, if $x \in Q^*(\Sigma)$ and $y \in Q^*(\Sigma)$, then $\mu(x, y) \in Q^*(\Sigma)$ (the monad multiplication is total). The predicate invention monad's multiplication is partial on the underlying set: not every pair of admissible extensions composes to an admissible extension. This partiality, formalized as a non-trivial kernel of $\pi$, is the algebraic signature of non-freeness.

The precise obstruction: $\mathcal{I}$ is presented by the generators $P$ and the relations $R$ (Obligation 2 failures), making it a quotient $P^*/R$ rather than a free monad. This is analogous to how a group presented by generators and relations is not a free group unless the relations are trivial.

Remark

This characterization resolves a question implicit in the earlier formulation: why can't we "just" invent predicates in parallel? The answer is algebraic: $\mathcal{I}$ is not free, and the non-freeness comes precisely from the inter-predicate constraints of Obligation 2. A free monad would allow unrestricted parallel composition. The quotient structure forces sequential verification at the overlap boundary.

This also explains the cost model: the coherence budget (A21) in effect measures the size of the kernel of $\pi$, restricted to a given overlap structure. A larger kernel means more inadmissible combinations, hence more verification work per predicate added. The quadratic scaling of Obligation 2 checking is a consequence of the kernel growing quadratically with signature size.

L.7 Relation to Existing Frameworks

The coherence topos framework occupies a specific position in the landscape of categorical approaches to data integration and distributed systems. We make the comparisons explicit to identify precisely what is shared, what is new, and what remains open.

L.7.1 Spivak's Functorial Data Migration

Spivak's program (David I. Spivak, "Functorial Data Migration," Information and Computation 217 (2012): 31–51) models databases as functors $I : \mathbf{C} \to \mathbf{Set}$ from a schema category $\mathbf{C}$ (encoding tables, columns, and foreign keys) to $\mathbf{Set}$ (the actual data). Data migration between schemas $\mathbf{C}$ and $\mathbf{D}$ is a functor $F : \mathbf{C} \to \mathbf{D}$ inducing three adjoint operations:

$$\Sigma_F \dashv \Delta_F \dashv \Pi_F$$

where $\Delta_F$ is pullback along $F$ (precomposition with $F$, taking $\mathbf{D}$-instances to $\mathbf{C}$-instances), $\Sigma_F$ is left Kan extension (existential migration), and $\Pi_F$ is right Kan extension (universal migration).
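For intuition, the three operations can be written out in the simplest possible case. The sketch below assumes discrete schemas (tables only, no foreign keys), where the Kan extensions reduce to a disjoint union and a product over the tables mapped to each target table; it is not Spivak's general construction, and the function names are ours.

```python
from itertools import product
from typing import Dict, List

Instance = Dict[str, List]  # table name -> list of rows


def delta_F(F: Dict[str, str], inst_d: Instance) -> Instance:
    """Pullback: each source table c receives the rows of its target table F(c)."""
    return {c: list(inst_d[d]) for c, d in F.items()}


def sigma_F(F: Dict[str, str], inst_c: Instance, tables_d: List[str]) -> Instance:
    """Left Kan extension (discrete case): tagged disjoint union over each fibre."""
    return {
        d: [(c, row) for c in sorted(F) if F[c] == d for row in inst_c[c]]
        for d in tables_d
    }


def pi_F(F: Dict[str, str], inst_c: Instance, tables_d: List[str]) -> Instance:
    """Right Kan extension (discrete case): product over each fibre."""
    out: Instance = {}
    for d in tables_d:
        fibre = [c for c in sorted(F) if F[c] == d]
        combos = product(*(inst_c[c] for c in fibre))
        out[d] = [dict(zip(fibre, combo)) for combo in combos]
    return out


F = {"Employee": "Person", "Customer": "Person"}
inst_c = {"Employee": ["ann", "bob"], "Customer": ["bob", "cy"]}
print(sigma_F(F, inst_c, ["Person", "Org"]))
print(pi_F(F, inst_c, ["Person", "Org"]))
```

A target table with an empty fibre receives no rows under `sigma_F` and exactly one empty row under `pi_F`, matching the empty coproduct and empty product respectively.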

What the coherence topos shares with Spivak: Both use category theory to formalize data integration. Both treat schemas as categories and data as functors. The restriction maps of our presheaves correspond to Spivak's pullback functors $\Delta_F$.

What the coherence topos adds that Spivak does not:

  1. Vocabulary invention. Spivak's framework migrates data between fixed schemas. The functor $F : \mathbf{C} \to \mathbf{D}$ exists before migration begins. In our framework, the signature $\Sigma$ itself evolves: agents invent new predicates, and the admissibility of the invention is the central question. Spivak has no analog of Obligation 2 (overlap agreement for invented predicates) because his schemas do not grow during operation.

  2. Scoped truth and non-Boolean logic. Spivak's instances are $\mathbf{Set}$-valued functors: a row either exists or does not. Our sheaves carry epistemic status (A4): claims can be true, false, undetermined, or in conflict, with the logic varying by context (A15). The subobject classifier $\Omega$ of the coherence topos (L.2) subsumes this; Spivak's $\mathbf{Set}$-valued model does not.

  3. Cohomological obstruction theory. Spivak does not develop obstruction theory for migration. When $\Delta_F$ fails (the pullback does not exist or is trivial), the failure is unstructured. Our $H^1$ computation (L.4) provides a classification of the distinct ways migration can fail, with a group structure on the failure modes. The acyclicity theorem (L.4.1) and the $H^2$ meta-obstruction (L.4.2) have no analogs in Spivak's work.

  4. Cost accounting. Spivak's adjunctions are "free" — there is no cost model for migration. Our coherence budget (A21) makes the cost of maintaining sheaf conditions explicit, and the quadratic scaling of Obligation 2 checking (a consequence of L.5's non-composability) quantifies the engineering tradeoff.

Remark

Spivak's framework is the right foundation for structural data migration: moving data between known schemas with known relationships. The coherence topos is designed for the harder problem: semantic data integration where the schemas themselves are evolving, the relationships are being discovered (not given), and the correctness of the discovery must be certified against formal obligations.

A precise connection: the Kleisli category $\mathbf{Sig}_{\mathcal{I}}$ of the predicate invention monad (L.6) can be viewed as a category of schemas with certified evolution paths. Spivak's functors $F : \mathbf{C} \to \mathbf{D}$ correspond to morphisms in $\mathbf{Sig}_{\mathcal{I}}$ where the evolution is a single-step conservative extension. The framework developed here extends Spivak's to the setting where schemas evolve under formal governance.

L.7.2 Goguen's Sheaf Semantics

Goguen (Joseph A. Goguen and Rod M. Burstall, "Institutions: Abstract Model Theory for Specification and Programming," Journal of the ACM 39, no. 1 (1992): 95–146) proposed sheaves as a semantics for concurrent interacting objects, where each object has a local state and objects interact by sharing state on overlaps. This is the closest ancestor to our use of sheaves.

What we share with Goguen: The core insight, that sheaves formalize when local information composes into global information, is Goguen's. Our site structure $(\mathbf{Ctx}, J)$ is a descendant of his interaction sites.

What we add: Goguen's sheaves are on fixed interaction structures. He does not develop: predicate invention (the site's presheaf growing during operation), the non-composability of overlap agreement (L.5), obstruction cohomology as a classification of integration failures (L.4), or the monad structure of vocabulary evolution (L.6). Goguen also does not develop the connection to model-theoretic conservativity (A17b), which is essential for the safety guarantees of predicate invention.

L.7.3 Abramsky's Sheaf-Theoretic Contextuality

Abramsky and Brandenburger (Abramsky 2011) use sheaf theory to formalize contextuality in quantum mechanics: a family of local measurements is contextual if it has no global section, that is, if the presheaf fails the sheaf condition. Their Čech cohomology detects contextuality, with $H^1 \neq 0$ implying strong contextuality.

What we share with Abramsky: The Čech cohomology machinery and the interpretation of $H^1$ as measuring obstruction to global consistency. Our $H^1$ computation (L.4) follows the same pattern.

What differs: Abramsky's presheaves are empirical models: probability distributions on measurement outcomes. Ours are data claims: assertions by computational agents about shared entities. The obstruction in Abramsky is physical (no hidden-variable model exists); ours is semantic (no consistent global identity assignment exists). The mathematics is the same; the domain and operational consequences are different. Critically, we develop the higher cohomology ($H^2$, L.4.2) and the structural classification (L.4.3), which Abramsky does not pursue in the same setting.

L.7.4 Caramello's Toposes as Bridges

Caramello's program (Olivia Caramello, Theories, Sites, Toposes: Relating and Studying Mathematical Theories through Topos-Theoretic 'Bridges' (Oxford: Oxford University Press, 2018)) uses Morita equivalence of toposes as a tool for transferring results between mathematical theories. Two theories are "Morita equivalent" if they classify the same topos, and the topos serves as a "bridge" for transferring invariants.

Connection to our work: Problem 1 in L.8 asks for the geometric theory classified by the coherence topos. If this theory can be identified, Caramello's bridge technique would immediately transfer invariants from other Morita-equivalent theories, potentially connecting coherent vocabulary evolution to problems in algebraic geometry, logic, or topology that have been studied independently.

What we add: Caramello's program is a meta-mathematical tool — it relates theories via their classifying toposes. We provide a specific instantiation: the coherence topos, with its specific site, specific presheaves, and specific theorems (acyclicity, non-composability, the monad characterization). Our work provides a concrete object for Caramello's program to analyze.

L.7.5 Capabilities Comparison

The landscape can be summarized in a table of capabilities:

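| Capability | Spivak | Goguen | Abramsky | Caramello | This Work |
| --- | --- | --- | --- | --- | --- |
| Sheaf-theoretic coherence | Implicit | Yes | Yes | Meta-level | Yes |
| Vocabulary invention | No | No | No | No | Yes (A17, L.6) |
| Obstruction cohomology | No | No | $H^1$ only | No | $H^0$ through $H^2$ (L.4) |
| Non-composability theorem | No | No | No | No | Yes (L.5) |
| Monad characterization (quotient of free monad) | No | No | No | No | Yes (L.6) |
| Cost accounting | No | No | No | No | Yes (A21) |
| Scoped non-Boolean logic | No | No | Implicit | Yes | Yes (A15, L.2) |
| Hierarchical acyclicity | N/A | No | No | No | Yes (L.4.1) |
| Multi-agent operational semantics | No | Partial | No | No | Yes (L.9) |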

The gap is not that sheaf theory is unapplied to data integration — Goguen applied it in 1992. The gap is that no existing framework addresses the full lifecycle of vocabulary in a distributed system: invention, certification, transport, versioning, cost, and the algebraic structure of the evolution process. Each prior framework addresses a fragment. This work addresses the composition of these fragments under formal guarantees.

L.8 Open Problems for the Mathematical Community

The following problems are precisely stated and, we believe, tractable for researchers in topos theory, HoTT, and categorical logic. They are not speculative — each connects to concrete phenomena in distributed data systems and multi-agent AI.

Problem 1: Classify the geometric theory of the coherence topos. Every Grothendieck topos classifies a geometric theory $\mathbb{T}$ such that models of $\mathbb{T}$ in any topos $\mathcal{E}$ correspond to geometric morphisms $\mathcal{E} \to \mathbf{Sh}(\mathbf{Ctx}, J)$. What is $\mathbb{T}$ for the coherence topos? This theory would axiomatize exactly those structures admitting coherent vocabulary evolution. Connection: Caramello's "bridge" program (Caramello 2018).

Problem 2 (Partially resolved): Cohomology of structured context sites. Section L.4.1 proved $H^n = 0$ for hierarchical sites, confirming the tree-acyclicity conjecture. Section L.4.2 constructed a federated site with $H^2 \neq 0$, confirming the meta-obstruction conjecture. Remaining open: (a) Compute $H^n$ for random context sites (Erdős–Rényi overlap graphs) and determine the threshold for vanishing. (b) For sites arising from real organizational structures, characterize the relationship between the Betti numbers of the nerve and the operational cost of coherence maintenance. (c) Determine whether the Čech cohomology equals the derived-functor cohomology for context sites with non-Hausdorff nerve (this holds for paracompact nerves by Leray's theorem but may fail in general).

Problem 3: Extend to $(\infty,1)$-toposes. Witnesses (A10) carry structure: kinds, composition, coherence conditions. The correct categorical home may be an $(\infty,1)$-topos where witnesses are 1-morphisms and witness-equivalences are 2-morphisms. Does the coherence topos extend to an $\infty$-topos? Does the resulting type theory validate a scoped univalence axiom? Connection: Lurie (Lurie 2009), Shulman.

Problem 4: Morita equivalence of context sites. When do two context sites $(\mathbf{Ctx}_1, J_1)$ and $(\mathbf{Ctx}_2, J_2)$ produce equivalent sheaf categories? This would formalize when two institutional arrangements (different organizations, different view decompositions) provide the same coherence guarantees. A Morita equivalence theorem for context sites would be a formal version of "organizational isomorphism from the coherence perspective."

Problem 5: Decidability frontier for the predicate invention monad. For which fragments of the ambient logic $L$ is the admissibility check for predicate invention (A17) decidable? The conservativity check is decidable for propositional and equality fragments, semi-decidable for first-order, undecidable for higher-order (see Appendix K, §K.1.1). What is the precise decidability frontier when overlap agreement (Obligation 2) is included? This connects to classical questions in mathematical logic but in a new setting where the signature itself is evolving.

Problem 6 (New): Eilenberg-Moore category of the predicate invention monad. Characterize the category of $\mathcal{I}$-algebras (L.6.2). An $\mathcal{I}$-algebra is a signature equipped with a "vocabulary absorption" operation satisfying the monad laws. What are the free $\mathcal{I}$-algebras? Can the category of $\mathcal{I}$-algebras be described as a variety of algebras (in the sense of universal algebra) with explicit equational axioms? The non-freeness theorem (L.6.3) implies the equational theory is non-trivial; its explicit description would connect to Birkhoff's HSP theorem and the theory of algebraic theories (Saunders Mac Lane, Categories for the Working Mathematician (New York: Springer-Verlag, 1971), ch. VI).

Problem 7 (New): Persistent cohomology of evolving context sites. As vocabulary evolves (new predicates are added, overlaps change), the context site $(\mathbf{Ctx}, J)$ changes and its cohomology groups evolve. Does the sequence $\{H^n_t\}_{t \geq 0}$ of cohomology groups over time form a persistence module in the sense of topological data analysis? If so, the persistence diagram would classify the lifetime of obstructions: some ambiguities are transient (resolved by adding a predicate that disambiguates), others are persistent (structural, arising from the federation topology). The barcode of this persistence module would be a novel invariant of vocabulary evolution paths.
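A toy illustration of what Problems 2(b) and 7 would track: the first Betti number of the overlap graph across a sequence of snapshots. This is only a proxy under strong assumptions (contexts as vertices, pairwise overlaps as edges with no repeated edges, no triple overlaps, constant coefficients); it is not the sheaf cohomology of L.4, and `first_betti` is an illustrative name.

```python
from typing import List, Tuple


def first_betti(num_contexts: int, overlaps: List[Tuple[int, int]]) -> int:
    """b1 = E - V + C for the overlap graph, with components counted by union-find."""
    parent = list(range(num_contexts))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = num_contexts
    for a, b in overlaps:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            components -= 1
    return len(overlaps) - num_contexts + components


# Four contexts: a hierarchy (star), then two cross-links added, then one removed.
snapshots = [
    [(0, 1), (0, 2), (0, 3)],
    [(0, 1), (0, 2), (0, 3), (1, 2)],
    [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3)],
    [(0, 1), (0, 2), (0, 3), (2, 3)],
]
print([first_betti(4, edges) for edges in snapshots])
```

In this proxy the hierarchical snapshot has no independent cycles, each cross-link adds one, and removing a cross-link removes one, which is the kind of birth and death a persistence diagram would record.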

L.9 Connection to Agentic Systems

This section connects the mathematical framework to a concrete open problem in multi-agent AI: coherent vocabulary evolution in distributed computational agents.

Current multi-agent AI systems compose outputs (text, actions, tool calls) without composing meaning. Agent A proposes "this product is sustainable." Agent B proposes "this product is eco-friendly." Are these the same predicate? Are they consistent? If Agent C needs to act on both claims, what guarantees does it have?

The coherence topos provides the mathematical infrastructure for answering these questions:

  • Predicate invention (A17, L.6): An agent can propose a new concept. The proposal carries obligations. The monad structure (L.6.1–L.6.3) ensures that sequential inventions compose safely (conservative extension), while the non-composability theorem (L.5) and the non-freeness theorem (L.6.3) identify exactly where synchronization is required. The algebraic content is precise: the kernel of $\pi : P^* \twoheadrightarrow \mathcal{I}$ is the set of inadmissible combinations, and its growth rate determines the cost of coordination.

  • Obstruction cohomology (L.4): When agents in different contexts make identity claims about shared entities, $H^1$ measures the irreducible ambiguity. The acyclicity theorem (L.4.1) says hierarchical agent organizations are free of this ambiguity. The $H^2$ meta-obstruction (L.4.2) says federated agent systems face a qualitatively harder problem: not just conflicts, but conflicts about conflict-resolution strategies. The cohomological hierarchy (L.4.3) gives a system architect a precise menu of tradeoffs.

  • Scoped transport (A16, Theorem K.3): An agent's claim is valid within its scope. Transporting that claim to another agent's scope requires a certificate. The topos provides the space of possible certifications (L.3); the engineering problem is constructing one.

  • Cost accounting (A21, Theorem K.4): Coherence has a price. The cost model makes the price explicit. The scope boundary in the coherence budget is the system's declaration of how far it will pay for meaning to compose.

  • Coordination without consensus. The framework does not require agents to share objectives, adopt common logics, or trust one another. It requires only overlap discipline: where two agents' domains intersect, their assertions on the intersection must agree (A13). This is a weaker assumption than shared values, and it is computationally verifiable — the only kind of constraint agents can enforce on each other. Conservative extension (A17b) serves each agent's self-interest: it protects prior commitments. The coherence budget (A21) prices coordination without moralizing it. The sheaf condition is a structural consequence of wanting local outputs to compose globally, not a norm imposed from outside.

    The enforcement layer lies outside The Proofs. The conservative extension condition (A17b) is a mathematical specification; Factor Prime (Vol II, Ch 17) provides an enforcement mechanism — collateralized bonds whose thermodynamic cost makes defection expensive without requiring trust; The Sovereign Syntax (Vol III, Epilogue) provides the verification artifact — the receipt that gives affected parties standing to contest.

  • Landscape position (L.7): This is not a reimagining of Spivak's functorial data migration or Goguen's sheaf semantics. It is their extension to the setting where schemas evolve, agents invent vocabulary, and the cost of coherence is a first-class citizen. The comparison table (L.7.5) makes the precise contribution explicit.
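The overlap discipline named in the list above is easy to state as a check. The sketch below assumes each agent's assertions are a mapping from (entity, predicate) pairs to truth values and that the two agents already share predicate names; both are simplifying assumptions, and `overlap_conflicts` is a hypothetical helper rather than an interface defined by A13.

```python
from typing import Dict, Tuple

Claim = Tuple[str, str]            # (entity id, predicate name)
Assertions = Dict[Claim, bool]


def overlap_conflicts(claims_a: Assertions,
                      claims_b: Assertions) -> Dict[Claim, Tuple[bool, bool]]:
    """Return the claims in the intersection of both domains where the agents disagree."""
    shared = claims_a.keys() & claims_b.keys()
    return {
        claim: (claims_a[claim], claims_b[claim])
        for claim in sorted(shared)
        if claims_a[claim] != claims_b[claim]
    }


agent_a = {("sku-1", "sustainable"): True, ("sku-2", "sustainable"): False}
agent_b = {
    ("sku-1", "sustainable"): True,
    ("sku-2", "sustainable"): True,
    ("sku-3", "sustainable"): True,
}
print(overlap_conflicts(agent_a, agent_b))
```

Claims outside the intersection are never compared, which reflects the point that agents need agreement only where their domains meet.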

The target is a substrate where computational agents can invent, certify, transport, and version predicates under formal guarantees — where inter-agent coherence is a checkable property with a computable cost. The mathematics developed here provides a specification. The engineering required to realize it at scale remains substantial.
