The Coherence Topos and Vocabulary Evolution

Appendix L

This appendix addresses a specific problem: how autonomous computational agents might invent new concepts, certify them against existing commitments, transport them across institutional boundaries, and account for the cost — within a mathematically rigorous structure.

Existing approaches address fragments of this problem. Retrieval-augmented generation retrieves without coherence guarantees. Multi-agent frameworks compose outputs without gluing conditions. Knowledge graphs structure without vocabulary invention. Schema systems enforce without evolution. The composition of these fragments under formal guarantees remains open.

Parts I–VI of The Proofs assembled the components: commitment sets (A1), witnessed equivalence (A10), context sites (A12b), the sheaf condition (A13), fibrations (A14), transport discipline (A16), predicate invention (A17), conservative extension (A17b), and the coherence cost model (A21). This appendix states the structural consequence that these components jointly entail, develops several results that are (to our knowledge) novel, positions the work explicitly against the existing landscape, and identifies concrete research programs for the mathematical community.

The topos theorem (L.2) is a consequence, not a contribution — it follows from Giraud's theorem applied to the context site. We state it because its corollaries are the contribution: the internal logic subsumes the logic-selection machinery of A15, the subobject classifier provides the multi-valued truth that A4 reached for, and the monadic structure of predicate invention (L.6) gives a formal theory of vocabulary evolution that has not, to our knowledge, been developed elsewhere.

L.1 What This Appendix Claims

We distinguish three levels of novelty:

Standard results applied to a new domain (L.2, L.3): The coherence topos theorem and its internal-logic corollary are instances of known mathematics (Giraud, Mac Lane–Moerdijk). We claim only that the instantiation is well-formed and that the corollaries are operationally significant for distributed systems.

Novel results (L.4–L.6): The Obstruction Cohomology computation, the acyclicity theorem for hierarchical sites, the $H^2$ meta-obstruction for federated sites, the non-composability of overlap agreement, and the characterization of the predicate invention monad as a quotient of a free monad are (to our knowledge) new. They are provable within standard sheaf theory and model theory but have not appeared in the literature because the combination — sheaf-theoretic coherence applied to vocabulary evolution under conservative extension constraints — has not been studied.

Landscape comparison (L.7): We position this work explicitly against Spivak's functorial data migration, Goguen's sheaf semantics, Abramsky's sheaf-theoretic contextuality, and Caramello's bridge program. The comparison identifies what is shared, what is new, and where the framework extends existing work.

Open problems (L.8): Seven precisely stated problems for the mathematical community, including two new problems motivated by the results of this appendix (the Eilenberg-Moore category of $\mathcal{I}$ and persistent cohomology of evolving sites).

Notation

Symbols from Parts I–VI (commitment sets, anchors, etc.) follow the conventions in Appendix H. The following notation is specific to this appendix or used here with specialized meaning. Standard category-theoretic and sheaf-theoretic notation follows Mac Lane & Moerdijk (Lane 1992).

Symbol	Meaning	Introduced
$(\mathbf{Ctx}, J)$	Context site: category $\mathbf{Ctx}$ with Grothendieck topology $J$	A12b
$\mathbf{PSh}(\mathbf{Ctx})$	Presheaf category $[\mathbf{Ctx}^{\mathrm{op}}, \mathbf{Set}]$	L.2
$\mathbf{Sh}(\mathbf{Ctx}, J)$	Sheaf category (the coherence topos)	L.2
$\Omega$	Subobject classifier; $\Omega(U)$ = $J$ -closed sieves on $U$	L.2
$a$	Sheafification: left exact left adjoint to inclusion $i : \mathbf{Sh} \hookrightarrow \mathbf{PSh}$	L.2
$\mathcal{W}^{\mathcal{P}}$	Exponential sheaf: certification space from proposals to witnesses	L.3
$\check{C}^n(\mathcal{U}, F)$	Čech $n$ -cochains of presheaf $F$ with respect to cover $\mathcal{U}$	L.4
$\check{H}^n(\mathcal{U}, F)$	Čech $n$ -th cohomology group	L.4
$\delta^n$	Čech coboundary map $\check{C}^n \to \check{C}^{n+1}$	L.4
$C_i \times_U C_j$	Overlap (fiber product) of $C_i$ and $C_j$ over $U$	L.4.1
$E_2^{p,q}$	Second page of the Čech-to-derived-functor spectral sequence	L.4.1
$\underline{H}^q(F)$	Presheaf of local cohomology groups	L.4.1
$\mathbf{Sig}$	Category of signatures with inclusion morphisms	L.5
$\Sigma, \Sigma'$	Signatures (finite sets of typed predicate/function symbols)	A17
$P$	Proposal endofunctor: $P(\Sigma)$ = single-predicate extension proposals	L.6.1
$P^*$	Free monad on $P$ : finite sequences of proposals	L.6.1
$\mathcal{I}$	Predicate invention monad: admissible extensions of $\Sigma$	L.6.2
$\pi : P^* \twoheadrightarrow \mathcal{I}$	Quotient monad morphism (surjective)	L.6.2
$\eta, \mu$	Monad unit and multiplication	L.6.2
$\mathbf{Sig}_{\mathcal{I}}$	Kleisli category: vocabulary evolution paths	L.6.2
$\ker \pi$	Kernel of the quotient: inadmissible proposal combinations	L.6.3

L.2 The Coherence Topos

Throughout this appendix, $(\mathbf{Ctx}, J)$ is the context site from A12b and $\mathbf{Ctx}$ is assumed essentially small.

Theorem(Coherence Topos)

$\mathbf{Sh}(\mathbf{Ctx}, J)$ is a Grothendieck topos (Verdier 1972--1973) (Lane 1992, ch. III, §4). It has all finite limits, all small colimits, exponentials, and a subobject classifier $\Omega$, and the inclusion $i : \mathbf{Sh}(\mathbf{Ctx}, J) \hookrightarrow \mathbf{PSh}(\mathbf{Ctx})$ has a left exact left adjoint $a$ (sheafification).

Proof

By Giraud's theorem (Verdier 1972--1973) (Lane 1992, ch. III, Theorem 1). The topology $J$ determines a Lawvere-Tierney operator $j : \Omega_{\mathrm{PSh}} \to \Omega_{\mathrm{PSh}}$ via $j(S) = \{f : V \to U \mid f^*(S) \in J(V)\}$, which is idempotent, preserves top, and preserves meets. The $j$-sheaves are the $J$-sheaves, and the category of $j$-sheaves in a topos is a topos (Mac Lane & Moerdijk, Ch. V, Theorem 1).

∎

Corollary: The Subobject Classifier and Epistemic Status

Truth Values in the Coherence Topos

The subobject classifier $\Omega(U) = \{S \mid S \text{ is a } J\text{-closed sieve on } U\}$ . Truth values are not $\{0, 1\}$ but $J$ -closed sieves: families of contexts in which a claim holds, closed under the covering relation.

A4 Epistemic Status	Topos Interpretation
True in $U$	The maximal sieve $\uparrow\! U$ (all refinements)
False in $U$	The empty sieve $\emptyset$
Undetermined in $U$	A proper non-empty $J$ -closed sieve
Conflict at $U$	Both $\chi_\varphi$ and $\chi_{\neg\varphi}$ are non-empty, proper sieves

The internal logic is intuitionistic. Excluded middle holds at $U$ iff the restricted topology is discrete (every sieve covers). This is the closed-world assumption. Non-discrete topologies yield open-world reasoning natively — no adapter required.
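For a finite site, the truth values $\Omega(U)$ can be enumerated directly. The sketch below is illustrative only: the three-object poset, the hand-listed covering sieves, and the names `below`, `covers`, `sieves`, `is_closed` are assumptions made for this example, not part of A12b. It lists the sieves on $U$ and keeps those that are $J$-closed.

```python
from itertools import combinations

# Hypothetical poset site: A <= U and B <= U, with {A, B} covering U.
below = {"U": {"U", "A", "B"}, "A": {"A"}, "B": {"B"}}  # objects below each context

# Covering sieves, listed by hand for this toy topology (an assumption).
covers = {
    "U": [frozenset({"A", "B"}), frozenset({"U", "A", "B"})],
    "A": [frozenset({"A"})],
    "B": [frozenset({"B"})],
}

def sieves(u):
    """All downward-closed sets of objects below u."""
    objs = sorted(below[u])
    found = []
    for size in range(len(objs) + 1):
        for combo in combinations(objs, size):
            s = frozenset(combo)
            if all(below[x] <= s for x in s):
                found.append(s)
    return found

def pullback(s, v):
    """Restrict a sieve on u to a sieve on v <= u."""
    return s & below[v]

def is_closed(s, u):
    """J-closed: whenever the pullback of s covers v, v already lies in s."""
    return all(v in s for v in below[u] if pullback(s, v) in covers[v])

omega_U = [s for s in sieves("U") if is_closed(s, "U")]
print(len(omega_U))  # 4
```

In this toy site the sieve $\{A, B\}$ is not closed, because it covers $U$ without containing it. The closed sieves $\{A\}$ and $\{B\}$ are proper and non-empty, which is the shape the table labels "Undetermined in $U$".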

Theorem(Logic Selection as Topos Relativization)

The indexed logic selection of A15 is a special case of relativizing to sub-topologies. Specifically: $U \Vdash \varphi \lor \neg\varphi$ for all $\varphi$ iff $J$ restricted to the sieve below $U$ is the discrete topology. The CWA/OWA distinction is not an engineering parameter but a structural property of the topology over each context.

Proof

A Grothendieck topos is Boolean iff $\neg\neg = \mathrm{id}_\Omega$, which holds iff every $J$-closed sieve is maximal or empty, that is, the discrete topology (Lane 1992, ch. VI, §6). Restricting to the slice $\mathbf{Sh}(\mathbf{Ctx}, J)/U$ yields the topos of sheaves on the over-category $\mathbf{Ctx}/U$, whose Booleanness depends on the topology induced there.

∎

L.3 Exponentials and Certification

The topos has exponentials. For sheaves $\mathcal{P}$ (proposals, per A19) and $\mathcal{W}$ (witnesses, per A2c):

\mathcal{W}^{\mathcal{P}}(U) \cong \mathrm{Hom}_{\mathbf{Sh}/U}(\mathcal{P}|_U, \mathcal{W}|_U)

A certification contract (A19b) is a global section $c \in \Gamma(\mathcal{W}^{\mathcal{P}})$ : a natural transformation $\mathcal{P} \Rightarrow \mathcal{W}$ that commutes with all restriction maps. The topos guarantees the space of certifications is a well-defined sheaf. Coherence of certification across contexts is naturality. Whether a particular certification exists is the engineering problem; the topos provides the space in which to search.
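On a finite site, the naturality condition can be checked mechanically. The following is a minimal sketch under stated assumptions: a site with a single inclusion $V \to U$, hand-written restriction tables, and hypothetical names (`restrict_P`, `restrict_W`, `is_natural`) that do not come from A19b.

```python
# Hypothetical site with one inclusion V -> U.
# Restriction maps of the proposal sheaf P and witness sheaf W, as lookup tables.
restrict_P = {"p1": "q1", "p2": "q2"}   # P(U) -> P(V)
restrict_W = {"w1": "v1", "w2": "v2"}   # W(U) -> W(V)

def is_natural(c_U, c_V):
    """Check the naturality square: c_V(restrict_P(p)) == restrict_W(c_U(p))."""
    return all(c_V[restrict_P[p]] == restrict_W[c_U[p]] for p in c_U)

# A candidate certification is one map per context.
c_U = {"p1": "w1", "p2": "w2"}
good_c_V = {"q1": "v1", "q2": "v2"}     # commutes with restriction
bad_c_V = {"q1": "v2", "q2": "v1"}      # certifies differently after restriction

print(is_natural(c_U, good_c_V), is_natural(c_U, bad_c_V))  # True False
```

Only the first pair of maps would qualify as a section of $\mathcal{W}^{\mathcal{P}}$ over this toy site. The second certifies the same proposal differently depending on the context in which it is viewed.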

L.4 Obstruction Cohomology: A Worked Computation

This section contains what we believe to be novel: an explicit computation of the first sheaf cohomology group $H^1$ for a concrete context site arising in data integration, and its interpretation as classifying ambiguous identity resolution. The companion paper Predicate Invention Under Sheaf Constraints (SCPI) proves that the same $H^1$ classifies obstructions to predicate invention across heterogeneous agent contexts, formalizing the descent problem that A17's three obligations address. The SHEAF Protocol extends this diagnostic to a distributed setting with mechanism-design enforcement.

The Setup: Three-Merchant Catalog

Let $\mathbf{Ctx}$ be the poset category with objects $\{U, A, B, C, A \wedge B, A \wedge C, B \wedge C\}$ where $A, B, C$ are merchant contexts covering the catalog context $U$ , and the $\wedge$ -objects are pairwise overlaps. Morphisms are inclusions (each overlap refines both parents).

The topology $J$ declares $\{A \to U, B \to U, C \to U\}$ as a cover.

Let $F$ be the presheaf of product identifiers:

$F(A) = \{a_1, a_2, a_3\}$ (merchant A's products)
$F(B) = \{b_1, b_2, b_3\}$ (merchant B's products)
$F(C) = \{c_1, c_2\}$ (merchant C's products)

On overlaps, restriction identifies shared products:

$F(A \wedge B)$ : product $a_2$ and $b_1$ are "the same item" — but the identification is ambiguous (two possible matchings exist)
$F(A \wedge C)$ : product $a_3$ and $c_1$ are unambiguously identified
$F(B \wedge C)$ : no shared products

The Čech Complex

The Čech cohomology of $F$ with respect to the cover $\mathcal{U} = \{A, B, C\}$ is computed from the cochain complex:

\check{C}^0(\mathcal{U}, F) \xrightarrow{\delta^0} \check{C}^1(\mathcal{U}, F) \xrightarrow{\delta^1} \check{C}^2(\mathcal{U}, F)

where:

$\check{C}^0 = F(A) \times F(B) \times F(C)$ — local sections (one per merchant)
$\check{C}^1 = F(A \wedge B) \times F(A \wedge C) \times F(B \wedge C)$ — comparison on overlaps
$\check{C}^2 = F(A \wedge B \wedge C)$ — triple overlaps (empty here)

The coboundary $\delta^0$ sends a local section $(s_A, s_B, s_C)$ to the tuple of restrictions: $(\rho_A(s_A) - \rho_B(s_B),\; \rho_A(s_A) - \rho_C(s_C),\; \rho_B(s_B) - \rho_C(s_C))$ .

Computing $H^0$ and $H^1$

$H^0(\mathcal{U}, F) = \ker \delta^0$ — the global sections. These are the tuples of local products that agree on all overlaps: the coherent global catalog. If the identifications on overlaps are consistent, $H^0$ is the glued catalog.

$H^1(\mathcal{U}, F) = \ker \delta^1 / \mathrm{im}\, \delta^0$ — the ambiguity group.

Theorem(H^1 Classifies Ambiguous Identity Resolution)

For the three-merchant site above, $H^1(\mathcal{U}, F) \neq 0$ whenever the overlap $A \wedge B$ admits multiple consistent identifications of shared products. Concretely: if $a_2$ could match either $b_1$ or $b_2$ (both matchings are consistent with the restriction maps), then $H^1$ has order $\geq 2$ , and its elements correspond bijectively to the distinct global catalogs that could be assembled from the same local data.

Proof

A 1-cocycle $\sigma \in \ker \delta^1$ assigns to each overlap an identification that satisfies the cocycle condition on triple overlaps (vacuously here, since $A \wedge B \wedge C = \emptyset$ ). Two 1-cocycles are cohomologous if they differ by a coboundary — a relabeling of local products that induces the identification difference.

When $A \wedge B$ admits two matchings $m_1 : a_2 \leftrightarrow b_1$ and $m_2 : a_2 \leftrightarrow b_2$ , these define distinct 1-cocycles. They are cohomologous iff there exists a relabeling of $F(A)$ or $F(B)$ that transforms one matching into the other. If $b_1 \neq b_2$ and neither is in the image of any other identification, no such relabeling exists, and the cocycles represent distinct cohomology classes.

Each class corresponds to a distinct global catalog: the "same" local data assembled into different global pictures depending on which identification is chosen.

∎

Remark

This is the formal version of a problem every data integration practitioner knows: two sources share some entities, the matching is ambiguous, and different matchings produce different downstream results. $H^1 \neq 0$ is the mathematical name for this ambiguity. The group structure tells you how many distinct resolutions exist and how they relate. This is not metaphor — it is computable for finite context sites.

For the agentic substrate specifically: when two AI agents operating in different contexts propose identity claims about shared entities, $H^1$ measures the irreducible ambiguity in reconciling those claims. No amount of embedding similarity resolves it; only an explicit choice of cocycle representative (a witnessed identification) does.

L.4.1 Acyclicity of Hierarchical Sites

The three-merchant example has non-trivial $H^1$ because the overlap structure admits ambiguity. A natural question: for which site structures does ambiguity vanish? The answer connects organizational topology to coherence cost.

Hierarchical Context Site

A context site $(\mathbf{Ctx}, J)$ is hierarchical if:

$\mathbf{Ctx}$ is a finite rooted tree (poset where every element except the root has exactly one immediate predecessor)
The topology $J$ is generated by parent-children families: for each non-leaf node $U$ with children $\{C_1, \ldots, C_k\}$ , the family $\{C_i \to U\}_{i=1}^k$ is a cover
For distinct siblings $C_i, C_j$ (children of the same parent), the overlap $C_i \times_U C_j$ is the initial object $\emptyset$ (no shared sub-context between different branches)

Theorem(Acyclicity of Hierarchical Sites)

Let $(\mathbf{Ctx}, J)$ be a hierarchical context site. For any abelian presheaf $F$ on $\mathbf{Ctx}$ with $F(\emptyset) = 0$ (in particular, any abelian sheaf when the empty family covers $\emptyset$) and any cover $\mathcal{U}$ in $J$:

\check{H}^n(\mathcal{U}, F) = 0 \quad \text{for all } n \geq 1

In particular, $H^1 = 0$ : there is no ambiguity in identity resolution for hierarchical organizations.

Proof

We prove this by analyzing the Čech complex directly.

Step 1: Structure of overlaps in a tree.

Let $U$ be a node with children $\{C_1, \ldots, C_k\}$ forming a cover. For $i \neq j$ , the overlap $C_i \times_U C_j = \emptyset$ by the tree condition (distinct branches share no sub-context). Therefore for any presheaf $F$ :

F(C_i \times_U C_j) = F(\emptyset) = \{*\} \quad \text{(terminal, for an abelian presheaf: the zero object)}

Step 2: Collapse of the Čech complex.

The Čech complex for cover $\mathcal{U} = \{C_1, \ldots, C_k\}$ of $U$ is:

\check{C}^0 = \prod_{i} F(C_i) \xrightarrow{\delta^0} \check{C}^1 = \prod_{i < j} F(C_i \times_U C_j) \xrightarrow{\delta^1} \check{C}^2 = \prod_{i < j < l} F(C_i \times_U C_j \times_U C_l) \to \cdots

Since $C_i \times_U C_j = \emptyset$ for all $i \neq j$ , every term $\check{C}^n = 0$ for $n \geq 1$ . The complex is:

\prod_i F(C_i) \to 0 \to 0 \to \cdots

Therefore $\check{H}^n = 0$ for all $n \geq 1$ .

Step 3: Extension to composite covers.

For a cover of a non-root node, the same argument applies locally: each non-leaf is covered by its children, which are pairwise disjoint. By the Čech-to-derived-functor spectral sequence (or directly by Leray's theorem applied to the refinement of any cover by the canonical parent-children covers), the vanishing extends to all covers in $J$ , not just the generating ones.

Step 4: Recursive argument for depth $> 1$ .

For a tree of depth $d$ , consider the cover of the root by its children, then each child by its children, etc. The Čech-to-sheaf cohomology spectral sequence for this iterated cover has:

E_2^{p,q} = \check{H}^p(\mathcal{U}, \underline{H}^q(F))

where $\underline{H}^q$ is the presheaf of local cohomology. By induction on depth: $\underline{H}^q = 0$ for $q \geq 1$ (each sub-tree is acyclic by the inductive hypothesis), so $E_2^{p,q} = 0$ for $q \geq 1$ . And $E_2^{p,0} = \check{H}^p(\mathcal{U}, F) = 0$ for $p \geq 1$ by Step 2. Therefore the spectral sequence degenerates and $H^n(\mathbf{Ctx}, F) = 0$ for all $n \geq 1$ .

∎

Remark

This theorem has a precise operational meaning: hierarchical organizations have no identity ambiguity. When contexts are organized as a tree — a corporate hierarchy, a taxonomic classification, a file system — the hierarchy itself resolves all identity questions. Two items in different branches are either identified by a common ancestor's decree or they are not. There is no room for multiple consistent identifications because distinct branches share no sub-context on which to disagree.

This explains a familiar phenomenon: hierarchical organizations are easy to integrate. Corporate mergers between divisions that shared no operations succeed trivially. Taxonomies with strict inclusion are unambiguous. File systems never have merge conflicts within a single tree.

The price of acyclicity is rigidity. A tree cannot express "A and B share some context but neither subsumes the other." Peer-to-peer and federated structures can, and they pay for it with non-trivial cohomology.

L.4.2 Higher Obstructions in Federated Sites

Federated structures are the opposite extreme from hierarchies: multiple overlapping authorities, no single root, non-trivial shared contexts. We show that federated sites can have non-trivial $H^2$ , which classifies meta-conflicts — disagreements not about identity itself but about how to resolve identity disagreements.

Federated Context Site

A context site $(\mathbf{Ctx}, J)$ is federated if:

$\mathbf{Ctx}$ contains a set of federation nodes $\{F_1, \ldots, F_m\}$ and member nodes $\{M_1, \ldots, M_n\}$
Each member belongs to at least one federation: for each $M_j$ , there exists $F_i$ with a morphism $M_j \to F_i$ (membership)
The topology $J$ includes the cover $\{M_j \to F_i \mid M_j \in F_i\}$ for each federation $F_i$
Members of distinct federations may share non-trivial overlaps: $M_j \times_{F_i} M_k$ need not be initial
There exists a global context $G$ covered by $\{F_1, \ldots, F_m\}$

Theorem(Non-Trivial H^2 in Federated Sites)

There exists a federated context site $(\mathbf{Ctx}, J)$ and presheaf $F$ such that $H^2(\mathcal{U}, F) \neq 0$ for a cover $\mathcal{U}$ of the global context. Elements of $H^2$ classify meta-obstructions: situations where pairwise identity resolutions exist but no globally consistent resolution strategy exists.

Proof

Construction. Let $\mathbf{Ctx}$ have:

Global context $G$
Three federation nodes $F_1, F_2, F_3$ covering $G$
Three member nodes $M_{ij}$ for $1 \leq i < j \leq 3$, where $M_{ij}$ belongs to both $F_i$ and $F_j$ (the shared member between federations $i$ and $j$)
Triple overlap $M_{123}$ belonging to all three federations

The cover of $G$ is $\mathcal{U} = \{F_1, F_2, F_3\}$ . The pairwise overlaps are $F_i \times_G F_j = M_{ij}$ . The triple overlap is $F_1 \times_G F_2 \times_G F_3 = M_{123}$ .

Let $F$ be a presheaf of identification protocols (an abelian group, for concreteness $\mathbb{Z}/2\mathbb{Z}$ -valued):

$F(F_i) = \mathbb{Z}/2\mathbb{Z}$ for each federation (two possible identity conventions: "match by name" vs "match by code")
$F(M_{ij}) = \mathbb{Z}/2\mathbb{Z}$ (the agreed convention on the shared member)
$F(M_{123}) = \mathbb{Z}/2\mathbb{Z}$

The restriction maps $\rho_i : F(F_i) \to F(M_{ij})$ are the identity (each federation imposes its convention on its shared members).

The Čech complex:

\check{C}^0 = (\mathbb{Z}/2)^3 \xrightarrow{\delta^0} \check{C}^1 = (\mathbb{Z}/2)^3 \xrightarrow{\delta^1} \check{C}^2 = \mathbb{Z}/2

The coboundary $\delta^0(a_1, a_2, a_3) = (a_1 - a_2, a_1 - a_3, a_2 - a_3)$ .

The coboundary $\delta^1(b_{12}, b_{13}, b_{23}) = b_{12} - b_{13} + b_{23}$ (the alternating sum on the triple overlap).

Computing $\check{H}^1$: for $\ker \delta^1$ we need $b_{12} - b_{13} + b_{23} = 0$ in $\mathbb{Z}/2$, i.e., $b_{12} + b_{13} + b_{23} = 0$. This kernel has order $4$ (any two of the three values determine the third).

$\mathrm{im}\, \delta^0$: the image consists of the vectors $(a_1-a_2, a_1-a_3, a_2-a_3)$, which over $\mathbb{Z}/2$ are $(a_1+a_2, a_1+a_3, a_2+a_3)$. As $(a_1,a_2,a_3)$ ranges over $(\mathbb{Z}/2)^3$, the image has order $4$ (one can verify: the map has kernel $\{(0,0,0), (1,1,1)\}$, so the image has $8/2 = 4$ elements).

Since $\mathrm{im}\,\delta^0 \subseteq \ker\delta^1$ and both have order $4$, the quotient $\check{H}^1 = \ker\delta^1 / \mathrm{im}\,\delta^0$ should be trivial. We confirm this by direct enumeration.

$\mathrm{im}\,\delta^0$ : with $a = (a_1,a_2,a_3) \in (\mathbb{Z}/2)^3$ :

$(0,0,0) \mapsto (0,0,0)$
$(1,0,0) \mapsto (1,1,0)$
$(0,1,0) \mapsto (1,0,1)$
$(0,0,1) \mapsto (0,1,1)$
$(1,1,0) \mapsto (0,1,1)$
$(1,0,1) \mapsto (1,0,1)$
$(0,1,1) \mapsto (1,1,0)$
$(1,1,1) \mapsto (0,0,0)$

So $\mathrm{im}\,\delta^0 = \{(0,0,0), (1,1,0), (1,0,1), (0,1,1)\}$ , which has order 4.

$\ker\delta^1$ : we need $b_{12} + b_{13} + b_{23} = 0 \pmod{2}$ : $(0,0,0), (1,1,0), (1,0,1), (0,1,1)$ — also order 4.

So $\check{H}^1 = \ker\delta^1/\mathrm{im}\,\delta^0 = 0$ in this case.

Now $\check{H}^2 = \check{C}^2 / \mathrm{im}\,\delta^1 = \mathbb{Z}/2 / \mathrm{im}\,\delta^1$ .

$\mathrm{im}\,\delta^1$ : $\delta^1(b_{12},b_{13},b_{23}) = b_{12}+b_{13}+b_{23}$ . Since $(1,0,0) \mapsto 1$ , the image is all of $\mathbb{Z}/2$ .

So $\check{H}^2 = 0$ here as well. This is because the nerve of this cover is the 2-simplex $\Delta^2$ , which is contractible.
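The enumeration above is small enough to check by brute force. The sketch below assumes only the two coboundary formulas stated in the proof. The function names `delta0` and `delta1` are ours.

```python
from itertools import product

Z2 = (0, 1)

def delta0(a):
    """delta^0(a1, a2, a3) = (a1 - a2, a1 - a3, a2 - a3), reduced mod 2."""
    a1, a2, a3 = a
    return ((a1 - a2) % 2, (a1 - a3) % 2, (a2 - a3) % 2)

def delta1(b):
    """delta^1(b12, b13, b23) = b12 - b13 + b23, reduced mod 2."""
    b12, b13, b23 = b
    return (b12 - b13 + b23) % 2

image_d0 = {delta0(a) for a in product(Z2, repeat=3)}
kernel_d1 = {b for b in product(Z2, repeat=3) if delta1(b) == 0}
image_d1 = {delta1(b) for b in product(Z2, repeat=3)}

h1_order = len(kernel_d1) // len(image_d0)  # |ker delta^1| / |im delta^0|
h2_order = 2 // len(image_d1)               # |C^2| / |im delta^1|, with C^2 = Z/2

print(h1_order, h2_order)  # 1 1
```

Both quotients have order 1, matching the vanishing of $\check{H}^1$ and $\check{H}^2$ for the three-federation cover.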

The non-trivial case requires a cover whose nerve has non-trivial $H^2$ . We modify the construction: let $G$ be covered by four federations $F_1, F_2, F_3, F_4$ with pairwise overlaps $M_{ij}$ for all $i < j$ , triple overlaps $M_{ijk}$ for all $i < j < k$ , but no quadruple overlap ( $M_{1234} = \emptyset$ ). The nerve is the boundary of a 3-simplex $\partial\Delta^3 \cong S^2$ , which has $H^2(S^2, \mathbb{Z}/2) = \mathbb{Z}/2 \neq 0$ .

Concretely, the Čech complex becomes:

(\mathbb{Z}/2)^4 \xrightarrow{\delta^0} (\mathbb{Z}/2)^6 \xrightarrow{\delta^1} (\mathbb{Z}/2)^4 \xrightarrow{\delta^2} 0

(The last term is $0$ because $\check{C}^3 = F(M_{1234}) = F(\emptyset) = 0$.)

The standard computation gives $\check{H}^2 = \mathbb{Z}/2$ . A non-trivial 2-cocycle assigns values to each triple overlap such that the alternating sum condition is satisfied, but these values cannot be decomposed as coboundaries from pairwise overlaps. This is a meta-obstruction: each pair of federations can resolve their identity disagreements, and each triple of federations can find a consistent resolution, but there is no single global resolution strategy compatible with all four federations simultaneously.

∎
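The "standard computation" for the four-federation cover can be reproduced with linear algebra over $\mathbb{Z}/2$. The sketch below is a minimal illustration, assuming constant $\mathbb{Z}/2$ coefficients with identity restriction maps as in the construction. The helper names (`simplices`, `coboundary`, `rank_gf2`, `cech_dims`) are ours. Signs are irrelevant mod 2, so each coboundary matrix is a face-incidence matrix.

```python
from itertools import combinations

def simplices(vertices, max_dim):
    """Faces of dimension 0..max_dim of the full simplex on the given vertices."""
    return {d: list(combinations(vertices, d + 1)) for d in range(max_dim + 1)}

def coboundary(lower, upper):
    """Matrix over GF(2): one row per upper simplex, one column per lower simplex."""
    return [[1 if set(lo) <= set(up) else 0 for lo in lower] for up in upper]

def rank_gf2(matrix):
    """Rank by Gaussian elimination modulo 2."""
    rows = [row[:] for row in matrix]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                rows[r] = [(x + y) % 2 for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def cech_dims(cx, top):
    """Dimensions over GF(2) of H^0..H^top for the nerve described by cx."""
    ranks = [rank_gf2(coboundary(cx[n], cx[n + 1])) if cx.get(n + 1) else 0
             for n in range(top + 1)]
    dims = []
    for n in range(top + 1):
        incoming = ranks[n - 1] if n > 0 else 0
        dims.append(len(cx[n]) - ranks[n] - incoming)  # dim ker - dim im
    return dims

# Four federations: all pairwise and triple overlaps, no quadruple overlap.
print(cech_dims(simplices((1, 2, 3, 4), max_dim=2), top=2))  # [1, 0, 1]
# Three federations with a triple overlap: the full 2-simplex.
print(cech_dims(simplices((1, 2, 3), max_dim=2), top=2))     # [1, 0, 0]
```

The first line reports a one-dimensional $\check{H}^2$, i.e. $\mathbb{Z}/2$, for the nerve $\partial\Delta^3$. The second reproduces the vanishing found for the three-federation cover.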

Remark

The $H^2$ meta-obstruction has a vivid operational interpretation. Consider four regulatory bodies ( $F_1, \ldots, F_4$ ) each overseeing a set of financial institutions. Any two regulators can agree on how to identify shared entities. Any three can find a consistent protocol. But when all four try to federate, a global obstruction emerges: the pairwise agreements, though locally consistent in triples, cannot be simultaneously satisfied. This is a higher-order coordination failure — not a conflict about data but a conflict about conflict-resolution strategies.

For the agentic substrate: $H^2 \neq 0$ means that even if every pair of AI agents can resolve their identity disputes, and every triple can coordinate, the system as a whole may still lack a globally consistent identity protocol. The obstruction is structural, residing in the topology of the federation, not in any particular data disagreement.

The nerve of the cover is the key invariant: when it has non-trivial higher homotopy, higher cohomology obstructions emerge. This connects the formal theory to classical algebraic topology in a precise and computable way.

L.4.3 The Cohomological Hierarchy: A Classification

The results of L.4, L.4.1, and L.4.2 fit into a single classification:

Site Structure	Nerve Topology	$H^0$	$H^1$	$H^2$	Operational Meaning
Hierarchical (tree)	Contractible	Global sections	$0$	$0$	No ambiguity; hierarchy resolves all
Flat peer-to-peer	$\bigvee S^1$ (wedge of circles)	Partial globals	Non-trivial	$0$	Identity ambiguity; finitely many resolutions
Federated (overlapping authorities)	$S^2$ or higher	Partial globals	May be non-trivial	Non-trivial	Meta-obstruction; coordination strategy conflict
Fully connected	Contractible ($\Delta^{n-1}$)	Global sections	$0$	$0$	Total overlap; everyone sees everything

Remark

The fully connected case is as acyclic as the hierarchical case, but for the opposite reason: in a tree, siblings share nothing; in a complete graph, everyone shares everything. Both extremes are cohomologically trivial. The interesting (and realistic) cases lie between these extremes — partial overlap, partial authority, partial sharing. These are exactly the structures that arise in multi-agent AI systems, federated databases, and inter-organizational data sharing.

The cohomological hierarchy provides a quantitative topology of organizational coherence cost. An architect choosing between a hierarchical and federated design is choosing a point in this hierarchy, with precise consequences for the complexity of identity resolution.

L.5 Vocabulary Evolution: Composability and Its Limits

Signature Category

$\mathbf{Sig}$ is the category of signatures (finite sets of typed predicate/function symbols) with morphisms the signature inclusions $\Sigma \hookrightarrow \Sigma'$ .

Theorem(Composability of Conservative Extensions)

If $\Sigma \hookrightarrow \Sigma'$ and $\Sigma' \hookrightarrow \Sigma''$ are both conservative extensions (A17b), then $\Sigma \hookrightarrow \Sigma''$ is conservative.

Proof

Let $\varphi$ be a $\Sigma$ -sentence with $(\Sigma'', I'', L) \vdash \varphi$ . Since $\varphi$ is also a $\Sigma'$ -sentence, conservativity of $\Sigma' \hookrightarrow \Sigma''$ yields $(\Sigma', I', L) \vdash \varphi$ . Conservativity of $\Sigma \hookrightarrow \Sigma'$ then yields $(\Sigma, I, L) \vdash \varphi$ . The converse is monotonicity.

∎

This composability is what makes incremental vocabulary evolution safe. A chain of conservative extensions is conservative. You verify each step; the chain is automatic.

But overlap agreement does not compose. This is the central tension in the theory of vocabulary evolution, and we state it as a theorem:

Theorem(Non-Composability of Overlap Agreement)

There exist admissible extensions $\Sigma \hookrightarrow \Sigma_1 = \Sigma \cup \{q_1\}$ and $\Sigma \hookrightarrow \Sigma_2 = \Sigma \cup \{q_2\}$ , each satisfying all three obligations of A17, such that $\Sigma \hookrightarrow \Sigma_{12} = \Sigma \cup \{q_1, q_2\}$ fails Obligation 2 (overlap agreement).

Proof

Construction. Let $\mathbf{Ctx}$ have three objects: $U$ , $V$ , and $U \wedge V$ . Let $\Sigma$ contain a sort $D$ (dresses).

Define $q_1 : D \to [0,1]$ (a scoring predicate) with:

In $U$ : $q_1(d) = \text{material\_quality}(d)$
In $V$ : $q_1(d) = \text{material\_quality}(d)$
On overlap $U \wedge V$ : agreement holds (same definition).

Define $q_2 : D \to \{\text{true}, \text{false}\}$ with:

In $U$ : $q_2(d) = [q_1(d) > 0.7]$ (thresholded from $q_1$ )
In $V$ : $q_2(d) = [\text{certified\_sustainable}(d)]$ (independent of $q_1$ )
On overlap $U \wedge V$ : agreement holds — both views happen to agree on the items in the overlap.

Individually, $q_1$ passes Obligation 2 (same definition in both views), and $q_2$ passes Obligation 2 (agreement on overlap for the current items).

Now add both. The compound predicate $q_3(d) = q_2(d) \wedge [q_1(d) > 0.5]$ is derivable in $\Sigma_{12}$ . In $U$ , this means $[\text{material\_quality}(d) > 0.7] \wedge [\text{material\_quality}(d) > 0.5]$ , which simplifies to $q_1(d) > 0.7$ . In $V$ , this means $[\text{certified\_sustainable}(d)] \wedge [\text{material\_quality}(d) > 0.5]$ . On the overlap, these may disagree: an item with $\text{material\_quality} = 0.6$ and $\text{certified\_sustainable} = \text{true}$ satisfies the $V$ -version but not the $U$ -version.

The interaction between $q_1$ and $q_2$ creates a derived predicate that fails overlap agreement, even though each individually passed.

∎
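The disagreement on the derived predicate can be evaluated directly for the item named in the proof. This is a minimal sketch: the dictionary encoding and the function names are hypothetical, and the item is assumed to be one that was not among the overlap items when $q_2$ was originally checked.

```python
# Hypothetical overlap item, using the attribute values from the proof.
item = {"material_quality": 0.6, "certified_sustainable": True}

def q1(d):
    """q1 has the same definition in U and V."""
    return d["material_quality"]

def q2_U(d):
    """U's grounding of q2: thresholded from q1."""
    return q1(d) > 0.7

def q2_V(d):
    """V's grounding of q2: independent certification flag."""
    return d["certified_sustainable"]

def q3(d, q2):
    """Derived predicate q3 = q2 and [q1 > 0.5], for a given grounding of q2."""
    return q2(d) and q1(d) > 0.5

print(q3(item, q2_U), q3(item, q2_V))  # False True
```

The two views assign different truth values to $q_3$ on the same item, which is the overlap failure the proof describes.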

Remark

This non-composability theorem is the formal reason why the coherence cost model (A21) exhibits quadratic scaling. Each new predicate must be checked against all existing predicates on all overlaps, not just in isolation. The monad multiplication — composing two rounds of predicate invention — requires a full re-verification of Obligation 2 for the composite.

For the agentic substrate: this means that autonomous agents cannot safely invent vocabulary in parallel and then merge the results. Vocabulary invention is inherently sequential at the overlap-checking stage. An agentic system that invents predicates concurrently must synchronize at the point of overlap verification. This is a structural limit, not an engineering deficiency.

L.6 The Predicate Invention Monad

Despite the non-composability of Obligation 2, predicate invention has a well-defined algebraic structure when the full A17 pipeline (including re-verification) is included. We develop this structure in three stages: the free monad of unconstrained proposals, the quotient that enforces admissibility, and the resulting algebraic characterization.

L.6.1 The Proposal Endofunctor

Proposal Endofunctor

Define the proposal endofunctor $P : \mathbf{Sig} \to \mathbf{Sig}$ by:

P(\Sigma) = \{(\Sigma \cup \{q\}, \delta_q) \mid q \notin \Sigma,\; \delta_q \text{ is a grounding definition for } q\}

where $\delta_q$ specifies the sort, arity, and local definition of $q$ in each context. $P$ sends a signature to the set of all single-predicate extension proposals (without checking admissibility). On morphisms: an inclusion $\Sigma \hookrightarrow \Sigma'$ maps a $\Sigma$ -proposal $(\Sigma \cup \{q\}, \delta_q)$ to the $\Sigma'$ -proposal $(\Sigma' \cup \{q\}, \delta_q)$ when $q \notin \Sigma'$ , and discards it otherwise (the proposed predicate already exists).

Free Monad on Proposals

The free monad $P^*$ on the endofunctor $P$ is defined by:

P^*(\Sigma) = \coprod_{n \geq 0} P^n(\Sigma) = \Sigma + P(\Sigma) + P(P(\Sigma)) + \cdots

An element of $P^*(\Sigma)$ is a finite sequence of extension proposals $(q_1, \delta_1), \ldots, (q_n, \delta_n)$ applied to $\Sigma$ . The monadic structure:

Unit $\eta : \mathrm{Id} \to P^*$ embeds $\Sigma$ as the empty sequence of proposals.
Multiplication $\mu : P^* \circ P^* \to P^*$ flattens a sequence-of-sequences into a single sequence by concatenation.

$P^*$ is the free monad on $P$ in the sense of the universal property: for any monad $T$ and natural transformation $\alpha : P \Rightarrow T$ , there exists a unique monad morphism $\bar{\alpha} : P^* \to T$ extending $\alpha$ .
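The unit and multiplication of $P^*$ can be sketched with ordinary lists. The encoding below is an assumption made for illustration: a proposal is a `(name, grounding)` pair with the grounding reduced to a placeholder string, and a term of $P^*(\Sigma)$ is a base signature together with a list of proposals.

```python
def unit(sigma):
    """eta: embed a signature with the empty proposal sequence."""
    return (frozenset(sigma), [])

def mu(nested):
    """mu: flatten a sequence of proposal sequences by concatenation."""
    sigma, blocks = nested
    return (sigma, [proposal for block in blocks for proposal in block])

def final_signature(term):
    """The signature reached after applying every proposal in the sequence."""
    sigma, proposals = term
    return sigma | {name for name, _ in proposals}

# Two rounds of proposals over a base signature containing the sort D.
nested = (frozenset({"D"}), [[("q1", "score")], [("q2", "flag"), ("q3", "derived")]])
flat = mu(nested)

print(flat[1])  # [('q1', 'score'), ('q2', 'flag'), ('q3', 'derived')]
print(sorted(final_signature(flat)))  # ['D', 'q1', 'q2', 'q3']
```

No admissibility check appears anywhere in this sketch, which is the point of the free monad: any finite sequence of proposals is a valid term.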

L.6.2 The Admissibility Quotient

The free monad $P^*$ allows any sequence of proposals. The predicate invention monad $\mathcal{I}$ is the quotient that enforces the three obligations of A17.

Predicate Invention Monad

Define the admissibility relation $\sim$ on $P^*(\Sigma)$ : two proposal sequences are equivalent if they yield the same final signature and both pass (or both fail) the A17 admissibility check. Define:

\mathcal{I}(\Sigma) = \{\Sigma' \supseteq \Sigma \mid \Sigma \hookrightarrow \Sigma' \text{ passes A17}\}

ordered by inclusion. There is a surjective monad morphism $\pi : P^* \twoheadrightarrow \mathcal{I}$ that sends each proposal sequence to its composite extension (if admissible) or discards it (if not). The monadic structure:

Unit $\eta_\Sigma : \Sigma \hookrightarrow \mathcal{I}(\Sigma)$ — the identity extension (always admissible).
Multiplication $\mu_\Sigma : \mathcal{I}(\mathcal{I}(\Sigma)) \to \mathcal{I}(\Sigma)$ — compose extensions and re-verify Obligation 2 for the composite. $\mu$ is well-defined because conservative extension composes (L.5) and Obligations 1 and 3 are monotone in signature; only Obligation 2 requires re-checking.

The Kleisli category $\mathbf{Sig}_{\mathcal{I}}$ has:

Objects: signatures
Morphisms $\Sigma \to \Sigma'$ : admissible extensions
Composition: extension-then-re-verify

This is the category of vocabulary evolution paths. A morphism in $\mathbf{Sig}_{\mathcal{I}}$ is a certified route from one vocabulary to another.
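The "extension-then-re-verify" composition can be sketched as follows. The admissibility check is passed in as a parameter, since A17 is not reproduced here; `toy_ok` and `CLASHES` are invented stand-ins used only to make the example run, and `None` plays the role of a discarded (inadmissible) extension.

```python
from typing import Callable, FrozenSet, Optional

Signature = FrozenSet[str]
# Hypothetical stand-in for the A17 check: (base, target) -> passes?
Admissible = Callable[[Signature, Signature], bool]


def extend(base: Signature, new: Signature, ok: Admissible) -> Optional[Signature]:
    """One Kleisli arrow: base -> base | new, or None if the check fails."""
    target = base | new
    return target if ok(base, target) else None


def compose(base: Signature, first: Signature, second: Signature,
            ok: Admissible) -> Optional[Signature]:
    """Extension-then-re-verify: run both steps, then re-check the composite."""
    mid = extend(base, first, ok)
    if mid is None:
        return None
    final = extend(mid, second, ok)
    if final is None:
        return None
    return final if ok(base, final) else None


# Toy check (an assumption): an extension fails if the predicates it adds
# contain a pair declared to clash on some overlap.
CLASHES = {frozenset({"sustainable", "eco_friendly"})}


def toy_ok(base: Signature, target: Signature) -> bool:
    added = target - base
    return not any(pair <= added for pair in CLASHES)


base = frozenset({"product"})
step1 = frozenset({"sustainable"})
step2 = frozenset({"eco_friendly"})
```

With these toy inputs each single step passes, but `compose(base, step1, step2, toy_ok)` returns `None`, because the final re-check sees both new predicates at once.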

Theorem (Predicate Invention as Quotient of Free Monad)

$\mathcal{I}$ is a quotient monad of $P^*$. Specifically, there is a surjective monad morphism $\pi : P^* \twoheadrightarrow \mathcal{I}$ whose kernel is the congruence generated by two relations:

Path independence: $(q_1, \delta_1), (q_2, \delta_2) \sim (q_2, \delta_2), (q_1, \delta_1)$ when both orderings yield the same composite extension
Admissibility filtering: $(q_1, \delta_1), \ldots, (q_n, \delta_n) \sim \bot$ when the composite $\Sigma \cup \{q_1, \ldots, q_n\}$ fails any obligation of A17

Consequently, the category of $\mathcal{I}$-algebras is a reflective subcategory of $P^*$-algebras, consisting of those $P^*$-algebras where the Obligation 2 equations hold.

Proof

That $\pi$ is a monad morphism: We must show that $\pi$ commutes with unit and multiplication. For the unit: $\pi(\eta_{P^*}(\Sigma)) = \pi(\Sigma, \text{empty sequence}) = \Sigma = \eta_{\mathcal{I}}(\Sigma)$. For multiplication: let $s$ be an element of $P^*(P^*(\Sigma))$, that is, a sequence of sequences of proposals. Then $\pi(\mu_{P^*}(s))$ is the composite of the flattened sequence, and $\mu_{\mathcal{I}}(\pi(\pi(s)))$ is the composite of the composites. Since extension composition is associative (signature union is associative), these agree when both are admissible. When either is inadmissible, both map to $\bot$.

Surjectivity: Every admissible extension $\Sigma \hookrightarrow \Sigma'$ with $\Sigma' = \Sigma \cup \{q_1, \ldots, q_n\}$ is the image of the proposal sequence $(q_1, \delta_1), \ldots, (q_n, \delta_n)$ under $\pi$.

Kernel characterization: Two proposal sequences have the same image under $\pi$ iff they yield the same composite signature (path independence) or both are inadmissible (admissibility filtering). These generate a congruence on $P^*$ because both relations are compatible with the monad multiplication (re-verification depends only on the composite, not the path).

Reflective subcategory: An $\mathcal{I}$-algebra is a signature $\Sigma$ equipped with an action $\alpha : \mathcal{I}(\Sigma) \to \Sigma$, a way to "absorb" admissible extensions. This is a $P^*$-algebra that additionally satisfies: whenever two proposal sequences yield extensions that individually pass A17 but whose composite fails Obligation 2, the algebra's action must reject the composite. The reflector is the functor that takes a $P^*$-algebra and quotients by the Obligation 2 relations.

∎

Remark

The monad laws hold:

Left unit: $\mu \circ \eta_{\mathcal{I}} = \mathrm{id}$ (extending by nothing, then composing, is identity).
Right unit: $\mu \circ \mathcal{I}(\eta) = \mathrm{id}$ (composing with the identity extension is identity).
Associativity: $\mu \circ \mu_{\mathcal{I}} = \mu \circ \mathcal{I}(\mu)$ — this holds because re-verification of Obligation 2 for the composite is independent of the order in which we compose three extensions. The overlap structure depends only on the final signature, not on the path taken to reach it.

The last point is significant: the cost of re-verification may depend on the path (some orderings may allow caching), but the result does not. The monad captures what is invariant (the admissibility condition); the cost model (A21) captures what varies (the verification effort).

L.6.3 The Algebraic Content of Non-Composability

The quotient structure $\pi : P^* \twoheadrightarrow \mathcal{I}$ makes the non-composability theorem (L.5) algebraically precise.

Theorem (Non-Composability as Non-Freeness)

$\mathcal{I}$ is not a free monad on any endofunctor. Equivalently: the kernel of $\pi : P^* \twoheadrightarrow \mathcal{I}$ is non-trivial; it contains proposal sequences that are admissible individually but inadmissible in combination.

Proof

If $\mathcal{I}$ were free on some endofunctor $Q$, then every $\mathcal{I}$-algebra would be determined by a $Q$-action, with no additional equations. But the Obligation 2 constraint imposes equations that depend on pairs of proposals and their interaction on overlaps; such equations cannot be captured by the structure map of a single endofunctor. Specifically: the non-composability theorem (L.5) exhibits two proposals $q_1, q_2$ such that $\mathcal{I}(\Sigma) \ni \Sigma \cup \{q_1\}$ and $\mathcal{I}(\Sigma) \ni \Sigma \cup \{q_2\}$, but $\Sigma \cup \{q_1, q_2\} \notin \mathcal{I}(\Sigma)$.

In a free monad $Q^*$ on an endofunctor $Q$, if $x \in Q^*(\Sigma)$ and $y \in Q^*(\Sigma)$, then $\mu(x, y) \in Q^*(\Sigma)$ (the monad multiplication is total). The predicate invention monad's multiplication is partial on the underlying set: not every pair of admissible extensions composes to an admissible extension. This partiality, formalized as a non-trivial kernel in $\pi$, is the algebraic signature of non-freeness.

The precise obstruction: $\mathcal{I}$ is presented by the generators $P$ and the relations $R$ (Obligation 2 failures), making it a quotient $P^*/R$ rather than a free monad. This is analogous to how a group presented by generators and relations is not a free group unless the relations are trivial.

∎
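The group-theoretic analogy used in the proof can be made concrete with a standard example (not taken from the source). Two generators with no relations give the free group $F_2$; imposing the single relation $ab = ba$ gives a group that is no longer free:

```latex
F_2 = \langle a, b \mid \varnothing \rangle \quad \text{(free)},
\qquad
\langle a, b \mid ab = ba \rangle \;\cong\; \mathbb{Z}^2 \quad \text{(not free)}.
```

In the analogy, the generators correspond to $P$ and the relation $ab = ba$ plays the role of $R$ in $P^*/R$.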

Remark

This characterization resolves a question implicit in the earlier formulation: why can't we "just" invent predicates in parallel? The answer is algebraic: $\mathcal{I}$ is not free, and the non-freeness comes precisely from the inter-predicate constraints of Obligation 2. A free monad would allow unrestricted parallel composition. The quotient structure forces sequential verification at the overlap boundary.

This also explains the cost model: the coherence budget (A21) is computing the size of the kernel of $\pi$ , restricted to a given overlap structure. A larger kernel means more inadmissible combinations, hence more verification work per predicate added. The quadratic scaling of Obligation 2 checking is a consequence of the kernel growing quadratically with signature size.
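As a worked count behind the quadratic claim, assume (an assumption of this sketch, not a statement of A21) that Obligation 2 is checked once for each unordered pair of invented predicates. Adding the $k$-th predicate then triggers $k - 1$ pairwise checks, so after $n$ inventions the total is:

```latex
\sum_{k=1}^{n} (k - 1) \;=\; \binom{n}{2} \;=\; \frac{n(n-1)}{2},
\qquad \text{e.g. } n = 10 \Rightarrow 45, \quad n = 100 \Rightarrow 4950.
```

A tenfold increase in vocabulary size thus costs roughly a hundredfold increase in pairwise checks under this assumption.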

L.7 Relation to Existing Frameworks

The coherence topos framework occupies a specific position in the landscape of categorical approaches to data integration and distributed systems. We make the comparisons explicit to identify precisely what is shared, what is new, and what remains open.

L.7.1 Spivak's Functorial Data Migration

Spivak's program (Spivak 2012) models databases as functors $I : \mathbf{C} \to \mathbf{Set}$ from a schema category $\mathbf{C}$ (encoding tables, columns, and foreign keys) to $\mathbf{Set}$ (the actual data). Data migration between schemas $\mathbf{C}$ and $\mathbf{D}$ is a functor $F : \mathbf{C} \to \mathbf{D}$ inducing three adjoint operations:

\Sigma_F \dashv \Delta_F \dashv \Pi_F

where $\Delta_F$ is pullback (precomposition with $F$, carrying $\mathbf{D}$-instances to $\mathbf{C}$-instances), $\Sigma_F$ is left Kan extension (existential migration), and $\Pi_F$ is right Kan extension (universal migration).
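The three operations can be sketched in the simplest possible case: schemas with no foreign keys (discrete categories), where $F$ is just a function on table names. In that special case the left and right Kan extensions reduce to a disjoint union and a product over the tables that $F$ sends to the same target. The function names and the dictionary encoding are assumptions of this sketch, not Spivak's API.

```python
from itertools import product
from typing import Any, Dict, List

Instance = Dict[str, List[Any]]   # table name -> rows
SchemaMap = Dict[str, str]        # F on objects: table of C -> table of D


def delta(F: SchemaMap, inst_d: Instance) -> Instance:
    """Delta_F: precompose with F, so each C-table copies its image in D."""
    return {c: list(inst_d[d]) for c, d in F.items()}


def sigma(F: SchemaMap, inst_c: Instance, d_tables: List[str]) -> Instance:
    """Sigma_F: tagged disjoint union of the C-tables mapped onto each d."""
    return {d: [(c, row) for c in F if F[c] == d for row in inst_c[c]]
            for d in d_tables}


def pi(F: SchemaMap, inst_c: Instance, d_tables: List[str]) -> Instance:
    """Pi_F: product of the C-tables mapped onto each d (one row per choice)."""
    out: Instance = {}
    for d in d_tables:
        fiber = [c for c in F if F[c] == d]
        out[d] = [dict(zip(fiber, combo))
                  for combo in product(*(inst_c[c] for c in fiber))]
    return out


F = {"A": "X", "B": "X"}                     # two source tables merged into X
inst_c = {"A": [1, 2], "B": ["u", "v", "w"]}
inst_d = {"X": [10, 20], "Y": []}
```

For a target table that nothing maps to, `sigma` yields an empty table and `pi` yields a single empty row, which are the empty coproduct and the empty product respectively.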

What the coherence topos shares with Spivak: Both use category theory to formalize data integration. Both treat schemas as categories and data as functors. The restriction maps of our presheaves correspond to Spivak's pullback functors $\Delta_F$.

What the coherence topos adds that Spivak does not:

Vocabulary invention. Spivak's framework migrates data between fixed schemas. The functor $F : \mathbf{C} \to \mathbf{D}$ exists before migration begins. In our framework, the signature $\Sigma$ itself evolves: agents invent new predicates, and the admissibility of the invention is the central question. Spivak has no analog of Obligation 2 (overlap agreement for invented predicates) because his schemas do not grow during operation.
Scoped truth and non-Boolean logic. Spivak's instances are $\mathbf{Set}$-valued functors: a row either exists or does not. Our sheaves carry epistemic status (A4): claims can be true, false, undetermined, or in conflict, with the logic varying by context (A15). The subobject classifier $\Omega$ of the coherence topos (L.2) subsumes this; Spivak's $\mathbf{Set}$-valued model does not.
Cohomological obstruction theory. Spivak does not develop obstruction theory for migration. When $\Delta_F$ fails (the pullback does not exist or is trivial), the failure is unstructured. Our $H^1$ computation (L.4) provides a classification of the distinct ways migration can fail, with a group structure on the failure modes. The acyclicity theorem (L.4.1) and the $H^2$ meta-obstruction (L.4.2) have no analogs in Spivak's work.
Cost accounting. Spivak's adjunctions are "free" — there is no cost model for migration. Our coherence budget (A21) makes the cost of maintaining sheaf conditions explicit, and the quadratic scaling of Obligation 2 checking (a consequence of L.5's non-composability) quantifies the engineering tradeoff.

Remark

Spivak's framework is the right foundation for structural data migration: moving data between known schemas with known relationships. The coherence topos is designed for the harder problem: semantic data integration where the schemas themselves are evolving, the relationships are being discovered (not given), and the correctness of the discovery must be certified against formal obligations.

A precise connection: the Kleisli category $\mathbf{Sig}_{\mathcal{I}}$ of the predicate invention monad (L.6) can be viewed as a category of schemas with certified evolution paths. Spivak's functors $F : \mathbf{C} \to \mathbf{D}$ correspond to morphisms in $\mathbf{Sig}_{\mathcal{I}}$ where the evolution is a single-step conservative extension. The framework developed here extends Spivak's to the setting where schemas evolve under formal governance.

L.7.2 Goguen's Sheaf Semantics

Goguen (Burstall 1992) proposed sheaves as a semantics for concurrent interacting objects, where each object has a local state and objects interact by sharing state on overlaps. This is the closest ancestor to our use of sheaves.

What we share with Goguen: The core insight — sheaves formalize when local information composes into global information — is Goguen's. Our site structure $(\mathbf{Ctx}, J)$ is a descendant of his interaction sites.

What we add: Goguen's sheaves are on fixed interaction structures. He does not develop: predicate invention (the site's presheaf growing during operation), the non-composability of overlap agreement (L.5), obstruction cohomology as a classification of integration failures (L.4), or the monad structure of vocabulary evolution (L.6). Goguen also does not develop the connection to model-theoretic conservativity (A17b), which is essential for the safety guarantees of predicate invention.

L.7.3 Abramsky's Sheaf-Theoretic Contextuality

Abramsky and Brandenburger (Abramsky 2011) use sheaf theory to formalize contextuality in quantum mechanics: an empirical model is contextual if its family of local measurement data admits no global section, that is, the local sections fail to glue. The associated Čech cohomology detects contextuality: a non-vanishing obstruction class in $H^1$ is sufficient to witness it, although the converse can fail.

What we share with Abramsky: The Čech cohomology machinery and the interpretation of $H^1$ as measuring obstruction to global consistency. Our $H^1$ computation (L.4) follows the same pattern.

What differs: Abramsky's presheaves are empirical models, that is, probability distributions on measurement outcomes. Ours are data claims: assertions by computational agents about shared entities. The obstruction in Abramsky is physical (no hidden-variable model exists); ours is semantic (no consistent global identity assignment exists). The mathematics is the same; the domain and operational consequences are different. Critically, we develop the higher cohomology ($H^2$, L.4.2) and the structural classification (L.4.3), which Abramsky does not pursue in the same setting.
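For intuition about what an $H^1$ obstruction counts, consider a deliberately simplified case: constant coefficients in a field, connected pairwise overlaps, and no triple overlaps, so that the nerve of the cover is a graph. Under those assumptions (which are much narrower than the presheaves of L.4), $\dim H^1 = E - V + C$, with $V$ contexts, $E$ pairwise overlaps, and $C$ connected components. The function name `betti1` is hypothetical.

```python
from typing import Iterable, List, Tuple


def betti1(contexts: List[str], overlaps: Iterable[Tuple[str, str]]) -> int:
    """dim H^1 = E - V + C for a graph nerve (each overlap listed once)."""
    parent = {c: c for c in contexts}

    def find(x: str) -> str:
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = 0
    components = len(contexts)
    for u, v in overlaps:
        edges += 1
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v  # merge two components
            components -= 1
    return edges - len(contexts) + components


cycle = betti1(["A", "B", "C"], [("A", "B"), ("B", "C"), ("C", "A")])
chain = betti1(["A", "B", "C"], [("A", "B"), ("B", "C")])
```

Three contexts overlapping in a cycle leave one independent loop around which local agreements can fail to close up; a chain of overlaps leaves none.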

L.7.4 Caramello's Toposes as Bridges

Caramello's program (Caramello 2018) uses Morita equivalence of toposes as a tool for transferring results between mathematical theories. Two theories are "Morita equivalent" if they classify the same topos, and the topos serves as a "bridge" for transferring invariants.

Connection to our work: Problem 1 in L.8 asks for the geometric theory classified by the coherence topos. If this theory can be identified, Caramello's bridge technique would immediately transfer invariants from other Morita-equivalent theories, potentially connecting coherent vocabulary evolution to problems in algebraic geometry, logic, or topology that have been studied independently.

What we add: Caramello's program is a meta-mathematical tool — it relates theories via their classifying toposes. We provide a specific instantiation: the coherence topos, with its specific site, specific presheaves, and specific theorems (acyclicity, non-composability, the monad characterization). Our work provides a concrete object for Caramello's program to analyze.

L.7.5 Capabilities Comparison

The landscape can be summarized in a table of capabilities:

Capability	Spivak	Goguen	Abramsky	Caramello	This Work
Sheaf-theoretic coherence	Implicit	Yes	Yes	Meta-level	Yes
Vocabulary invention	No	No	No	No	Yes (A17, L.6)
Obstruction cohomology	No	No	$H^1$ only	No	$H^0$ through $H^2$ (L.4)
Non-composability theorem	No	No	No	No	Yes (L.5)
Free monad characterization	No	No	No	No	Yes (L.6)
Cost accounting	No	No	No	No	Yes (A21)
Scoped non-Boolean logic	No	No	Implicit	Yes	Yes (A15, L.2)
Hierarchical acyclicity	N/A	No	No	No	Yes (L.4.1)
Multi-agent operational semantics	No	Partial	No	No	Yes (L.9)

The gap is not that sheaf theory is unapplied to data integration — Goguen applied it in 1992. The gap is that no existing framework addresses the full lifecycle of vocabulary in a distributed system: invention, certification, transport, versioning, cost, and the algebraic structure of the evolution process. Each prior framework addresses a fragment. This work addresses the composition of these fragments under formal guarantees.

L.8 Open Problems for the Mathematical Community

The following problems are precisely stated and, we believe, tractable for researchers in topos theory, HoTT, and categorical logic. They are not speculative — each connects to concrete phenomena in distributed data systems and multi-agent AI.

Problem 1: Classify the geometric theory of the coherence topos. Every Grothendieck topos classifies a geometric theory $\mathbb{T}$ such that models of $\mathbb{T}$ in any topos $\mathcal{E}$ correspond to geometric morphisms $\mathcal{E} \to \mathbf{Sh}(\mathbf{Ctx}, J)$. What is $\mathbb{T}$ for the coherence topos? This theory would axiomatize exactly those structures admitting coherent vocabulary evolution. Connection: Caramello's "bridge" program (Caramello 2018).

Problem 2 (Partially resolved): Cohomology of structured context sites. Section L.4.1 proved $H^n = 0$ for $n \geq 1$ on hierarchical sites, confirming the tree-acyclicity conjecture. Section L.4.2 constructed a federated site with $H^2 \neq 0$, confirming the meta-obstruction conjecture. Remaining open: (a) Compute $H^n$ for random context sites (Erdős–Rényi overlap graphs) and determine the threshold for vanishing. (b) For sites arising from real organizational structures, characterize the relationship between the Betti numbers of the nerve and the operational cost of coherence maintenance. (c) Determine whether the Čech cohomology equals the derived-functor cohomology for context sites with non-Hausdorff nerve (this holds for paracompact nerves by Leray's theorem but may fail in general).

Problem 3: Extend to $(\infty,1)$-toposes. Witnesses (A10) carry structure: kinds, composition, coherence conditions. The correct categorical home may be an $(\infty,1)$-topos where witnesses are 1-morphisms and witness-equivalences are 2-morphisms. Does the coherence topos extend to an $\infty$-topos? Does the resulting type theory validate a scoped univalence axiom? Connection: Lurie (Lurie 2009), Shulman.

Problem 4: Morita equivalence of context sites. When do two context sites $(\mathbf{Ctx}_1, J_1)$ and $(\mathbf{Ctx}_2, J_2)$ produce equivalent sheaf categories? This would formalize when two institutional arrangements — different organizations, different view decompositions — provide the same coherence guarantees. A Morita equivalence theorem for context sites would be a formal version of "organizational isomorphism from the coherence perspective."

Problem 5: Decidability frontier for the predicate invention monad. For which fragments of the ambient logic $L$ is the admissibility check for predicate invention (A17) decidable? The conservativity check is decidable for propositional and equality fragments, semi-decidable for first-order, undecidable for higher-order (see Appendix K, §K.1.1). What is the precise decidability frontier when overlap agreement (Obligation 2) is included? This connects to classical questions in mathematical logic but in a new setting where the signature itself is evolving.
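To illustrate why the propositional fragment is decidable, here is a brute-force sketch. For a finite propositional signature, an extension is conservative exactly when every model of the old theory extends to a model of the new axioms, so the check is a finite truth-table search. Formulas are encoded as Python predicates on valuations, which is a convenience of this sketch; `models` and `is_conservative` are hypothetical names, and Obligation 2 is not modeled.

```python
from itertools import product
from typing import Callable, Dict, Iterator, List

Valuation = Dict[str, bool]
Formula = Callable[[Valuation], bool]


def models(atoms: List[str], theory: List[Formula]) -> Iterator[Valuation]:
    """Enumerate all valuations of `atoms` satisfying every axiom."""
    for bits in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, bits))
        if all(axiom(v) for axiom in theory):
            yield v


def is_conservative(atoms: List[str], theory: List[Formula],
                    new_atom: str, new_axioms: List[Formula]) -> bool:
    """True iff every model of `theory` extends to a model of `new_axioms`."""
    for v in models(atoms, theory):
        extends = any(
            all(axiom({**v, new_atom: b}) for axiom in new_axioms)
            for b in (False, True)
        )
        if not extends:
            return False  # this old model is ruled out by the extension
    return True


atoms = ["p", "r"]
theory = [lambda v: v["p"] or v["r"]]
# A definitional extension: q is defined as (p and r).
definitional = [lambda v: v["q"] == (v["p"] and v["r"])]
# A non-conservative extension: q holds, and q implies p, so p is forced.
forcing = [lambda v: v["q"], lambda v: (not v["q"]) or v["p"]]
```

The search is exponential in the number of atoms, so the sketch shows decidability rather than a practical procedure.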

Problem 6 (New): Eilenberg-Moore category of the predicate invention monad. Characterize the category of $\mathcal{I}$-algebras (L.6.2). An $\mathcal{I}$-algebra is a signature equipped with a "vocabulary absorption" operation satisfying the monad laws. What are the free $\mathcal{I}$-algebras? Can the category of $\mathcal{I}$-algebras be described as a variety of algebras (in the sense of universal algebra) with explicit equational axioms? The non-freeness theorem (L.6.3) implies the equational theory is non-trivial; its explicit description would connect to Birkhoff's HSP theorem and the theory of algebraic theories (Mac Lane 1971, ch. VI).

Problem 7 (New): Persistent cohomology of evolving context sites. As vocabulary evolves (new predicates are added, overlaps change), the context site $(\mathbf{Ctx}, J)$ changes and its cohomology groups evolve. Does the sequence $\{H^n_t\}_{t \geq 0}$ of cohomology groups over time form a persistence module in the sense of topological data analysis? If so, the persistence diagram would classify the lifetime of obstructions: some ambiguities are transient (resolved by adding a predicate that disambiguates), others are persistent (structural, arising from the federation topology). The barcode of this persistence module would be a novel invariant of vocabulary evolution paths.

L.9 Connection to Agentic Systems

This section connects the mathematical framework to a concrete open problem in multi-agent AI: coherent vocabulary evolution in distributed computational agents.

Current multi-agent AI systems compose outputs (text, actions, tool calls) without composing meaning. Agent A proposes "this product is sustainable." Agent B proposes "this product is eco-friendly." Are these the same predicate? Are they consistent? If Agent C needs to act on both claims, what guarantees does it have?

The coherence topos provides the mathematical infrastructure for answering these questions:

Predicate invention (A17, L.6): An agent can propose a new concept. The proposal carries obligations. The monad structure (L.6.1–L.6.3) ensures that sequential inventions compose safely (conservative extension), while the non-composability theorem (L.5) and the non-freeness theorem (L.6.3) identify exactly where synchronization is required. The algebraic content is precise: the kernel of $\pi : P^* \twoheadrightarrow \mathcal{I}$ is the set of inadmissible combinations, and its growth rate determines the cost of coordination.
Obstruction cohomology (L.4): When agents in different contexts make identity claims about shared entities, $H^1$ measures the irreducible ambiguity. The acyclicity theorem (L.4.1) says hierarchical agent organizations are free of this ambiguity. The $H^2$ meta-obstruction (L.4.2) says federated agent systems face a qualitatively harder problem: not just conflicts, but conflicts about conflict-resolution strategies. The cohomological hierarchy (L.4.3) gives a system architect a precise menu of tradeoffs.
Scoped transport (A16, Theorem K.3): An agent's claim is valid within its scope. Transporting that claim to another agent's scope requires a certificate. The topos provides the space of possible certifications (L.3); the engineering problem is constructing one.
Cost accounting (A21, Theorem K.4): Coherence has a price. The cost model makes the price explicit. The scope boundary in the coherence budget is the system's declaration of how far it will pay for meaning to compose.
Coordination without consensus. The framework does not require agents to share objectives, adopt common logics, or trust one another. It requires only overlap discipline: where two agents' domains intersect, their assertions on the intersection must agree (A13). This is a weaker assumption than shared values, and it is computationally verifiable — the only kind of constraint agents can enforce on each other. Conservative extension (A17b) serves each agent's self-interest: it protects prior commitments. The coherence budget (A21) prices coordination without moralizing it. The sheaf condition is a structural consequence of wanting local outputs to compose globally, not a norm imposed from outside.

The enforcement layer lies outside The Proofs. The conservative extension condition (A17b) is a mathematical specification; Factor Prime (Vol II, Ch 17) provides an enforcement mechanism — collateralized bonds whose thermodynamic cost makes defection expensive without requiring trust; The Sovereign Syntax (Vol III, Epilogue) provides the verification artifact — the receipt that gives affected parties standing to contest.
Landscape position (L.7): This is not a reimagining of Spivak's functorial data migration or Goguen's sheaf semantics. It is their extension to the setting where schemas evolve, agents invent vocabulary, and the cost of coherence is a first-class citizen. The comparison table (L.7.5) makes the precise contribution explicit.
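The overlap discipline described in the list above is simple to state operationally. The following sketch treats each agent's assertions as a dictionary from (entity, predicate) pairs to truth values and checks agreement on the shared keys before merging. The two-valued encoding and the names `overlap_conflicts` and `glue` are assumptions of the sketch; the epistemic statuses of A4 and the full sheaf condition of A13 are richer than this.

```python
from typing import Dict, Optional, Tuple

Claim = Tuple[str, str]        # (entity, predicate)
Section = Dict[Claim, bool]    # one agent's assertions over its domain


def overlap_conflicts(a: Section, b: Section) -> Dict[Claim, Tuple[bool, bool]]:
    """Claims in both domains on which the two agents disagree."""
    return {k: (a[k], b[k]) for k in a.keys() & b.keys() if a[k] != b[k]}


def glue(a: Section, b: Section) -> Optional[Section]:
    """Merge two local sections if they agree on the overlap, else None."""
    if overlap_conflicts(a, b):
        return None
    return {**a, **b}


agent_a = {("sku-1", "sustainable"): True, ("sku-2", "sustainable"): False}
agent_b = {("sku-2", "sustainable"): False, ("sku-3", "sustainable"): True}
agent_c = {("sku-2", "sustainable"): True}
```

Here `agent_a` and `agent_b` glue to a single section over three products, while `agent_a` and `agent_c` do not, and the conflict report names the exact claim in dispute.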

The target is a substrate where computational agents can invent, certify, transport, and version predicates under formal guarantees — where inter-agent coherence is a checkable property with a computable cost. The mathematics developed here provides a specification. The engineering required to realize it at scale remains substantial.
