Truth Needs Witnesses

The Witness Protocol

In 1944, Jean Leray sat in a prisoner-of-war camp in Austria and thought about topology. Captured at the fall of France, held at Oflag XVII-A near Edelbach, Leray had redirected his research toward topology, away from the fluid dynamics that had military applications, and spent his captivity building tools for a field he had entered partly out of caution and partly out of genuine fascination. Among the tools he built was a framework for understanding how information gathered in overlapping neighborhoods of a space could be assembled into a global picture. He called the structures sheaves, a metaphor from bundling stalks of grain, and the framework turned out to be one of the most powerful ideas in twentieth-century mathematics.

Leray could not have known that seventy years later his sheaf theory would provide the precise diagnostic for a problem facing credit bureaus, interoperable databases, and autonomous computational agents: the problem of making locally coherent verification systems compose into globally coherent ones. Henri Cartan and Jean-Pierre Serre developed the theory further in the 1950s, giving it the algebraic precision that made it applicable far beyond topology. What they formalized was an insight available to any merchant who had tried to reconcile two honest ledgers that disagreed: the failure is not in either book. It is in the space between them.

Begin with the concrete version of that failure.

Marta Alvarez has one life and three ledgers. Her state tax authority holds the number she reported on her return. Her employer’s payroll system holds the number it paid her in wages. Her bank holds the number that actually arrived.

None of these systems is sloppy. Each has passed its audit. Each is internally coherent. Each is correct on its own terms. And yet they disagree about the same year.

The tax filing says $82,000. The payroll registry says $78,000. The bank statement says $85,000. Each number has a provenance: a signed return, a W‑2, a stream of deposits. No two of the three numbers agree.

This mismatch does not live in any one ledger. It lives between them. It is exactly where automated enforcement now operates. A compliance agent does not know Marta. It knows contradictions. A discrepancy large enough to cross a threshold becomes a flag: enhanced review, delayed settlement, a provisional hold on an account used to pay rent and childcare.

Some gaps are innocent. The $4,000 difference between tax and payroll might be weekend freelance income the employer would never record. The $3,000 difference between the tax filing and the deposits might be reimbursements or gifts. But that same $3,000 gap between what Marta told the tax authority and what the bank received could also be unreported income, and the $7,000 gap between payroll and deposits is larger still. Each ledger balances. Together they produce a contradiction that no one ledger can resolve, because no one ledger can see the whole.
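The arithmetic of the three figures can be tabulated directly. The sketch below is only an illustration: the ledger names and the review threshold are hypothetical, chosen to show how a compliance agent that sees nothing but pairwise gaps would decide what to flag.

```python
from itertools import combinations

# The three figures from Marta's ledgers.
ledgers = {"tax_filing": 82_000, "payroll": 78_000, "bank_deposits": 85_000}

# Hypothetical threshold; the essay does not specify one.
REVIEW_THRESHOLD = 5_000


def pairwise_gaps(figures):
    """Absolute difference between every pair of ledgers."""
    return {
        (a, b): abs(figures[a] - figures[b])
        for a, b in combinations(figures, 2)
    }


def flagged(figures, threshold):
    """Pairs whose gap meets or exceeds the review threshold."""
    return [pair for pair, gap in pairwise_gaps(figures).items() if gap >= threshold]


gaps = pairwise_gaps(ledgers)
flags = flagged(ledgers, REVIEW_THRESHOLD)
```

Under this assumed threshold only the payroll and bank pair is flagged. The agent has no field in which to record that the gap decomposes into two smaller, possibly innocent, differences.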

Leray's framework gives this failure a name and a measure. A sheaf assigns data to overlapping regions of a space and specifies a gluing condition: if the data agrees where the regions overlap, it can be combined into a single coherent picture of the larger space. When the data does not agree, the failure to glue is captured by a mathematical object called H^1, the first cohomology group of the covering. If H^1 vanishes, local truth composes into global truth. If H^1 is nontrivial, a structural obstruction exists: a mismatch that no correction to any individual record can fix, because it arises from the topology of how the systems overlap.

In Marta's case, the obstruction can be resolved by introducing bridging information: the freelance income that the tax filing captures and the payroll registry does not, and the reimbursements that the bank statement captures and neither of the other two does. That bridging information is the missing section that allows gluing, and the cost of obtaining it is the irreducible coherence fee for this particular configuration of overlapping jurisdictions. The companion paper on the Bridge Problem demonstrates empirically that systems which pass all bilateral coherence checks can still fail at multilateral composition, and that the failure manifests as a nontrivial element of H^1. The obstruction can be classified and measured, and its resolution cost can be estimated before the resolution is attempted.
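A toy analog may make the bilateral-versus-multilateral distinction concrete. Suppose each pair of ledgers has been reconciled on its own, with an agreed offset explaining the difference between them. Every bilateral check then passes by construction. Composition holds only if the offsets sum to zero around the loop; a nonzero residual is a mismatch that no adjustment to a single ledger can absorb. This is a sketch under stated assumptions, not a computation of sheaf cohomology: the function name is hypothetical, and the second set of offsets is invented for illustration.

```python
def cycle_residual(offsets, cycle):
    """Sum the bilaterally agreed offsets around a closed loop of ledgers.

    offsets[(a, b)] is the agreed amount by which ledger b exceeds ledger a.
    A residual of zero means the pairwise reconciliations compose.
    """
    total = 0
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        total += offsets[(a, b)]
    return total


cycle = ["payroll", "tax", "bank"]

# Offsets matching the innocent explanation: freelance income, then reimbursements.
reconciled = {
    ("payroll", "tax"): 4_000,
    ("tax", "bank"): 3_000,
    ("bank", "payroll"): -7_000,
}

# Hypothetical variant: the bank-to-payroll reconciliation accounts for only 5,000.
partial = {
    ("payroll", "tax"): 4_000,
    ("tax", "bank"): 3_000,
    ("bank", "payroll"): -5_000,
}

ok_residual = cycle_residual(reconciled, cycle)
bad_residual = cycle_residual(partial, cycle)
```

In the second case each pair has signed off on its own reconciliation, yet 2,000 remains unexplained once the loop is closed. That residual belongs to the loop, not to any one ledger.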

You do not need to carry the notation.

In plain language, the sheaf condition says: where your claims overlap with mine, they must not contradict. Not universal agreement. Not shared frameworks or common objectives. Only consistency on the facts that both systems touch.
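Stated as a check, the condition is small. The following is a minimal sketch, assuming each ledger is represented as a mapping from fact names to values; the itemized breakdown of Marta's figures is an assumption consistent with the innocent explanation above, and the function names are hypothetical.

```python
from itertools import combinations


def overlap_conflicts(sections):
    """List every fact that two ledgers both record but with different values."""
    conflicts = []
    for (name_a, a), (name_b, b) in combinations(sections.items(), 2):
        for key in sorted(a.keys() & b.keys()):
            if a[key] != b[key]:
                conflicts.append((name_a, name_b, key, a[key], b[key]))
    return conflicts


def glue(sections):
    """Merge local records into one picture, refusing if any overlap disagrees."""
    conflicts = overlap_conflicts(sections)
    if conflicts:
        raise ValueError(f"cannot glue, overlaps disagree: {conflicts}")
    merged = {}
    for local in sections.values():
        merged.update(local)
    return merged


# Assumed itemization: each ledger records only the facts it can see.
sections = {
    "tax_filing": {"wages": 78_000, "freelance": 4_000},
    "payroll": {"wages": 78_000},
    "bank": {"wages": 78_000, "freelance": 4_000, "reimbursements": 3_000},
}

global_picture = glue(sections)
```

Note what the check does not require: the three ledgers never have to report the same total. They only have to agree on the facts they share.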

The cascade compounds at institutional scale. In 2025, a hospital's discharge record was misread as a death record by Medicare. Medicare transmitted the death to the Social Security Administration per protocol. The SSA posted it to the Death Master File, a feed consumed automatically by the IRS, banks, credit bureaus, and insurers. A sixty-six-year-old Philadelphia woman discovered she had been declared dead when an emergency room turned her away: insurance inactive. Bank accounts froze. Social Security payments stopped. Every downstream system had acted correctly on the data it received. No single system erred within its own logic. The obstruction lived at the boundary between hospital and SSA, and it propagated instantly through every overlap. Correction did not cascade. The SSA issued a letter. The woman contacted each institution individually to reverse what had propagated automatically. Approximately twelve thousand Americans are erroneously declared dead each year. The SSA's error rate is less than a third of one percent, but the coherence fee for each error is paid entirely by the person whose life the composition disrupted.

When Properties Degrade

A bill of exchange that Bruges merchants recognized in Barcelona carried all five witness properties simultaneously, each emerging from specific crises in commercial practice. What happens when individual properties fail is equally characteristic, and the modern world is full of the pathologies.

Binding without conditions: a medieval oath bound the swearer to an obligation, but the terms were often vague enough that a stranger, arriving years later, had no way to evaluate whether the oath had been kept. The swearer was identifiable. What he owed was not, or not in a form that permitted evaluation by someone who had not been present at the swearing. Modern open-data initiatives reproduce the inverse pathology: conditions without stakes. A government that publishes the criteria by which it makes decisions achieves transparency, and a citizen who reads the criteria can determine whether a particular decision conforms. But if no consequence follows from nonconformity, if discovering a violation produces nothing more than a public record of the discovery, the transparency is decorative. Visible rules that no one enforces.

Stakes without recourse produce what the Stasi archive demonstrated: punishment without correction. Subjects of surveillance faced consequences (denied education, denied travel, imprisonment) based on information they could not see, from sources they could not identify, through processes they could not contest. A system that punishes on the basis of uncontestable evidence is structurally indistinguishable from a system that punishes arbitrarily, because the subject has no mechanism for distinguishing the two. And composition without the preceding four: a global aggregator that pulls data from millions of sources without checking whether the sources are attributed, specified, staked, or backed by recourse merely propagates unreliable assertions at planetary scale. Every extension of the chain multiplies the risk of undetected failure.

These pathologies compose in order. Strip away recourse and you get surveillance. Strip away stakes and you get a complaints bureau that accepts grievances and imposes nothing. Strip away conditions and you get a punishment regime whose subjects cannot understand why. Strip away binding and you get a rulebook that no identity is held to. Each degradation produces a recognizable institutional form.

But even all five properties, fully implemented, are insufficient if the systems operate in isolation. Two databases may each satisfy every property internally and still produce contradictions at the seam where their claims meet. The merchants in Bruges, each with a perfect ledger, faced exactly this: the failure was in the space between the books. Solving that problem requires the sheaf condition (agreement on overlaps) and the mathematical framework that Leray built in his Austrian prison camp.

From Physical Witness to Computational Witness

A Florentine notary who drafted a bill of exchange in 1410 occupied a body that persisted through time. If the bill was dishonored, the injured party could find the notary in his office, in his guild, or in the court that had licensed him, and pursue recourse against a person who could not disappear. Identity was underwritten by physics: a body occupies space, accumulates reputation, and answers for its attestations because it persists whether its owner wishes it to or not.

A bank deploys an agent to evaluate mortgage applications. On a Tuesday morning in March, the agent reviews an application from a woman named Diane (employment records, credit history, debt-to-income ratio, property appraisal) and denies it. The agent's process terminates. By the time Diane receives the denial letter, the specific instance that evaluated her application no longer exists. It has no office where she can appear, no guild membership to revoke, no body to compel into a courtroom. The notary's accountability rested on the persistence of his person. The agent's person does not persist.

What persists instead is a receipt: a signed, timestamped record of what the agent attested, what data it examined, what criteria it applied, and what conclusion it reached. The receipt is binding: it links the attestation to the agent's identity through a cryptographic key, surviving the agent's termination the way a notarial seal survived the notary's death. Without it, Diane has a denial and no trail back to the process that produced it.
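A receipt of this kind can be sketched in a few lines. The field names, the agent identifier, and the timestamp below are hypothetical placeholders, and the signing scheme is a stand-in: the sketch uses a shared-secret HMAC from the standard library, whereas a deployed system would more plausibly use an asymmetric signature so that verifiers never hold the signing key.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class Receipt:
    agent_id: str          # identity the attestation is bound to
    issued_at: str         # timestamp of the attestation
    data_examined: tuple   # what the agent looked at
    criteria: str          # what rule it applied
    conclusion: str        # what it decided


def canonical_bytes(receipt):
    """Serialize deterministically so the same receipt always signs the same way."""
    return json.dumps(asdict(receipt), sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(receipt, key):
    """HMAC-SHA256 over the canonical form (stand-in for a real signature)."""
    return hmac.new(key, canonical_bytes(receipt), hashlib.sha256).hexdigest()


def verify(receipt, signature, key):
    """Constant-time comparison against a freshly computed signature."""
    return hmac.compare_digest(sign(receipt, key), signature)


KEY = b"illustrative-key-not-for-production"

receipt = Receipt(
    agent_id="mortgage-agent-example",
    issued_at="2025-03-04T09:30:00Z",
    data_examined=("employment", "credit_history", "debt_to_income", "appraisal"),
    criteria="debt_to_income <= 0.43",
    conclusion="denied",
)
signature = sign(receipt, KEY)

# Any later edit to the record invalidates the signature.
tampered = replace(receipt, conclusion="approved")
```

The agent's process can terminate immediately after signing. What Diane holds is the record and its signature, and both remain checkable without the instance that produced them.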

The receipt also encodes conditions: the terms of the evaluation in a language that other processes can parse. A bill of exchange drafted in a Florentine notary's trained hand was inspectable by literate humans at human speed. Diane's receipt, specified in a formal language, is inspectable by any process that can parse it, at whatever speed the hardware allows. If the terms required human interpretation before they could be checked, verification would operate at human tempo regardless of the medium's capabilities, and the asymmetry between machine-speed decisions and human-speed review would become the central constitutional problem. Machine-readable conditions are what prevent that asymmetry from becoming permanent.

Before the agent evaluated Diane's application, the bank deposited collateral against the attestation: a bond slashable if the evaluation is later proven to have violated its stated criteria. This is stakes made inspectable in advance: Diane can verify, before deciding whether to accept the denial, that sufficient collateral backs the claim. Bonded collateral makes the cost of a false attestation visible before the attestation is relied upon.

Diane disputes the denial. The receipt specifies a mechanism for doing so: a forum, a timeline, a set of procedures she can engage with at human tempo. This is recourse, and the constraints on it are what separate a receipt regime from a surveillance regime. If the dispute is adjudicated at machine speed, producing a result before Diane can read the complaint she filed, the recourse is nominal. If it demands technical expertise she does not possess, the recourse is inaccessible. The forum must operate on terms the affected person can understand, in a timeframe she can act within. The notarial protest, a specific procedure physically enacted that produced a document with legal force, set the template. The computational version must meet the same standard or the receipt is decoration.

Finally, Diane's mortgage denial must be legible to other systems: the tax authority that needs to know her housing status, the insurance company that prices risk based on homeownership, the credit bureau that updates her file. This is composition, and it is where the sheaf condition becomes operational. If the denial receipt agrees with what these other systems record about Diane on their overlapping claims, it glues cleanly into a global picture. If it contradicts (if the bank's records show one income figure and the tax authority's show another), the obstruction is diagnosable, classifiable, and repairable by introducing the bridging information that supplies the missing agreement.

One agent, one transaction, five structural requirements encountered in the order the transaction demands them. The properties are the same ones the Bruges merchants relied on. What has changed is the medium that must guarantee them.

The Cost of Coherence

Between Bruges and Barcelona in 1410, the coherence fee was enormous. Different currencies, different calendars, different legal systems, different measurement standards. Spanning that gap required a notary who could navigate both frames, a correspondent who maintained offices in both cities, a courier who could carry documents across borders safely. The infrastructure was elaborate because the boundary was difficult.

When two domains overlap extensively, the cost of ensuring consistency drops: many points of contact provide many opportunities for checking. When domains barely touch, at a single point where a currency must be translated or a jurisdiction crossed, the cost concentrates at the narrow boundary and can be high relative to the data being composed. Marta's three databases overlap at specific data points (income, payments, deposits), and the coherence fee is the cost of reconciling those points: the compliance department, the reconciliation software, the audit process, the remediation when discrepancies surface.

As verification becomes computationally cheap, this fee shrinks relative to the trust tax. Checking automates. Matching accelerates. What persists is the intermediary's premium for standing at the chokepoint, and the defense shifts from "you cannot check this without me" to "you should not have to check this yourself," a convenience argument rather than a necessity argument. But the topological minimum remains. The sheaf condition must still be satisfied, and its dimensional cost (the structure of H^1 for a given covering) resists reduction by better technology or better intentions. It can only be paid.

The Seam

Marta's three databases were all tables: structured, schema-bound, internally consistent systems that disagreed at their overlaps. The sheaf condition diagnosed the failure precisely: local coherence, global obstruction, a computable cost of repair. But Marta's case describes a world in which every system speaks the same kind of language. Her databases disagreed about numbers, not about what a number is. The contemporary computational landscape has rebuilt the merchants' problem in a more radical form: two regimes that produce fundamentally different kinds of claims, and the seam between them is where the infrastructure is failing.

One regime has conquered the interface. Language models, embedding spaces, generative systems (call them the empire of strings) have solved the problem that Tappan's correspondents solved for nineteenth-century credit: making distant, unstructured information available at the point of decision. They retrieve, interpolate, summarize, propose. A merchant in Philadelphia now needs a prompt where he once needed a correspondent. The fluency is real. So is the void beneath it. A probability distribution assigns likelihood. It does not make promises. It cannot distinguish an observation from a projection, a grounded reference from a plausible one, a commitment honored from a commitment confabulated. This is a theorem about the representation: you cannot train a distribution to honor a contract it cannot express.

The other regime still runs the infrastructure that cannot afford to be wrong. Every bank balance, every flight reservation, every medical record, every entry in Marta's three databases passes through systems that enforce schemas, reject malformed data, and answer queries with provable consistency. Call them the empire of tables. They are rigorous, reliable, and frozen. When the world offers a distinction they lack the vocabulary to express (when a garment is "puffy" and the product schema contains no such attribute, when a borrower's creditworthiness depends on a context the scoring model has no field for), they wait. A human must rewrite the schema, migrate the data, update the constraints. Certification adapts only at the speed of human deliberation.

Each empire is coherent within itself. Each fails at the seam.

And the seam is widening, because strings are increasingly wedging themselves into tabular motifs. A language model's output is parsed as structured data and ingested by a database. An embedding (a point in high-dimensional space representing a word, a sentence, or an image as a vector of floating-point numbers) is stored in a column alongside integers and dates, as though proximity in embedding space were the same kind of fact as an account balance. Generated text is treated as a record. A prediction is treated as an observation. At every point where the empire of strings feeds into the empire of tables, a claim carrying no binding, no conditions, no stakes, and no recourse is being composed into systems that assume all four.

The sheaf condition makes the failure precise. Locally, each system is coherent: the language model's outputs are statistically well-formed, the database's records are schema-compliant. At the overlap, the point where a string-generated claim becomes a table-stored fact, the data carries incompatible truth-conditions. A probabilistic assertion has been glued to a deterministic record as though they shared a ground of validity, and they share nothing of the kind. The obstruction lives at the boundary, exactly where Leray's mathematics predicts.
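One way to picture a guard at that overlap is a record that carries its kind with it. This is a sketch of one possible design, not an established protocol: the class names, the `witness` field, and the quarantine list are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Kind(Enum):
    OBSERVATION = "observation"   # recorded by a system that enforces a schema
    PREDICTION = "prediction"     # produced by a generative or statistical model


@dataclass(frozen=True)
class Claim:
    field: str
    value: str
    kind: Kind
    source: str
    witness: Optional[str] = None  # id of a signed receipt backing the claim, if any


def ingest(record, quarantine, claim):
    """Write a claim into the table only if it is observed or witnessed.

    Unwitnessed predictions are set aside rather than stored as facts.
    """
    if claim.kind is Kind.OBSERVATION or claim.witness is not None:
        record[claim.field] = claim.value
        return True
    quarantine.append(claim)
    return False


record, quarantine = {}, []
ingest(record, quarantine, Claim("price", "129.00", Kind.OBSERVATION, "inventory-scan"))
ingest(record, quarantine, Claim("style", "puffy", Kind.PREDICTION, "language-model"))
```

The design choice is that the distinction between a prediction and an observation travels with the value instead of being erased at the moment of storage. What counts as an adequate witness is left open here.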

The Bruges merchants' ledgers disagreed about exchange rates and calendar dates, facts within a shared ontology of commercial obligation. Strings and tables disagree about what counts as a fact at all. A bill of exchange could cross the boundary between Venetian and Florentine accounting because both sides agreed that an obligation was an obligation, however differently they denominated it. No equivalent agreement governs the boundary between a language model's probabilistic output and a database's schema-enforced record. The notary who bridged the gap between the two merchants had a protocol: the bill, the seal, the chain of endorsements, the conditions of protest. The boundary between strings and tables has no protocol, only ad hoc parsers and brittle pipelines.

Building that protocol (a witness structure for the seam between regimes that define truth differently) is the central engineering problem. It is also where the Quiet Foreclosure acquires its distinctive character in the current era. When foreclosure operates through probability shifts rather than discrete acts, the receipt regime's five fields (designed for table-operations with identifiable provenance) cannot attach to the harm. Each field presupposes a discrete event: an act to name, an authority to cite, bounds to state, a justification to examine, a path through which appeal can proceed. Continuous probability drift furnishes no such event; no individual query adjustment crosses a threshold a receipt could capture. The structural obstacle is not enforcement but ontology: the five fields require a moment of exercise, and continuous foreclosure is defined by the absence of any such moment. The witness structure for the seam is the mechanism by which the receipt regime extends its reach into the empire of strings. It is not the only problem.

The Plausibility Problem

Cases like the following are now routine. A voice cloned from a few seconds of audio passes a bank's phone-based identity verification. A document synthesized with correct formatting, institutional letterhead, and no visible artifacts clears a due-diligence review. A video places a person in a location she has never visited, speaking words she has never said, and circulates for days before forensic analysis establishes the fabrication. Each capability had been demonstrated publicly by 2024; the specific configurations vary, but the structural point is stable: fabrication at a quality that defeats casual inspection is now cheaper than the inspection itself.

Fabrication has become cheaper than verification, and the gap widens with each generation of generative model. The challenge is structural, not primarily a problem of "misinformation" (a term that frames the issue as volume of false claims in public discourse): the five witness properties assume that binding is costly, that attaching a claim to an identity requires expenditure sufficient to make false attribution expensive. An endorser's signature on a bill of exchange was costly to forge because reproducing a specific hand required access to the handwriting specimen and considerable skill. The cost of forgery bounded the risk of forgery, and that bound made the system tolerable.

When fabrication costs collapse, the bound dissolves. When any human attestation can be cheaply counterfeited, social trust loses its evidentiary value. The alternative is structural unforgeability: cryptographic mechanisms whose security rests on mathematical difficulty rather than institutional authority. A cryptographic signature holds because reproducing it requires the private key, and obtaining the key requires either physical theft of a specific device or computation exceeding any adversary's resources. A zero-knowledge proof establishes that a claim is true without disclosing the evidence on which the claim rests, making verification possible without revealing the substrate that would enable counterfeiting. A tamper-evident log records events in a sequence that resists alteration: each entry is cryptographically chained to the one before it, and altering one entry would require recalculating the entire subsequent chain.
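The chaining behind a tamper-evident log is simple enough to sketch. The version below is a minimal illustration using SHA-256 from the standard library; the entry layout and function names are assumptions, and a production log would add signatures and external anchoring that this sketch omits.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, payload):
    """Hash an entry together with the hash of the entry before it."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(log, payload):
    """Add an entry chained to the current tail of the log."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev, "payload": payload, "hash": entry_hash(prev, payload)})


def verify_chain(log):
    """Recompute every link; any edited entry breaks the chain from that point on."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


log = []
append(log, {"event": "application received"})
append(log, {"event": "application denied"})
append(log, {"event": "dispute filed"})
intact = verify_chain(log)

# Rewriting history without recomputing every later hash is detectable.
log[1]["payload"] = {"event": "application approved"}
after_edit = verify_chain(log)
```

An attacker who controls the whole log can still recompute the chain from the altered entry forward. For that reason such logs are usually paired with a signature or with publication of the latest hash somewhere the attacker cannot rewrite.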

These mechanisms do not restore trust. They make trust unnecessary for the specific claims they cover, substituting mathematical structure for social reputation. The bill of exchange rested on the notary's institutional authority: his training, his oath, the legal system that recognized his seal. The cryptographic receipt rests on the difficulty of breaking a mathematical function, immune to bribery, social pressure, or institutional capture.

The merchants are still at the table in Bruges, and the ledgers still disagree. But now the notary's seal can be synthesized from a photograph and a bill of exchange counterfeited for less than the cost of the ink. The social apparatus that once bridged incompatible frames does not survive cheap fabrication. What remains is the structural answer: agreement on overlaps, enforced by mathematics that cannot be bribed.

A system that passes all bilateral coherence checks can still fail at composition. The sheaf condition (agreement on overlaps) is the minimum requirement, and the obstruction to meeting it is computable. If the plausibility problem can be solved without structural binding, this claim is wrong.
