The Empire of Strings
When I say, before the registrar or altar, &c., 'I do,' I am not reporting on a marriage: I am indulging in it.
J. L. Austin, How to Do Things with Words (1962)
The confident machine
Ask a machine a question and watch the answer appear. The cursor blinks, and then the words arrive, sentence by sentence, clause by clause, with the unhurried fluency of a colleague who has read everything and forgotten nothing. The prose is clean. The citations look real. The argument follows a structure that any reader would recognize as competent, perhaps even authoritative. If you did not know how the text was produced, you would assume a person wrote it, and not a careless one.
The machine has generated a string: a sequence of characters, unbounded in length, drawn from a probability distribution over tokens conditioned on the tokens that preceded it. The string may be accurate. It may contain true propositions, correctly attributed, logically ordered. But something is missing, and the absence is a feature of the medium itself, one that no amount of training will remedy.
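In the standard notation (the convention of the field, not of this book), the generative process just described is an autoregressive factorization: the probability of the whole string is the product of per-token conditional probabilities.

```latex
% Autoregressive factorization: a string x_1, ..., x_T is produced by
% sampling each token from a distribution conditioned on its predecessors.
p_\theta(x_1, \dots, x_T) = \prod_{t=1}^{T} p_\theta\!\left(x_t \mid x_1, \dots, x_{t-1}\right)
```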
Two regimes now occupy the space where the notary once stood. The first is the empire of strings: the regime that proposes, interpolates, retrieves, and recombines, producing text indistinguishable from testimony but unable to bind an assertion to the conditions under which it may be relied upon. The second, the empire of tables, is introduced at this chapter's close.
What is missing is not truth. What is missing is commitment.
The empire of plausibility
Give the machine credit for what it does. The transformer architecture, which underlies every large language model in production, has accomplished something that eluded artificial intelligence research for half a century: the production of fluent, contextually appropriate natural language at arbitrary length and on arbitrary topics. The string is the medium, and the medium scales. Any proposition that can be expressed in language can be encoded as a string, and the transformer can generate strings that express propositions with a fidelity that improves, measurably and consistently, as the model grows.
The scaling is not accidental. It reflects a deep property of sequential representation: a string is universal. Any structured object — a table, a tree, a graph, a proof — can be serialized into a string and recovered from one. The string does not care what it carries. It is indifferent to content, domain, and logical structure. This indifference is its power and its limitation, but the power came first, and it is genuine.
The mechanism behind this scaling is worth a moment of attention, not for its technical detail but for what it reveals about the medium. The transformer processes a string by attending to every prior token simultaneously, weighting each by its relevance to the token being predicted. The computation is massive: a single forward pass through a large model involves tens of billions of arithmetic operations, distributed across thousands of processors drawing megawatts of electrical power. The infrastructure behind the blinking cursor — the datacenter, the cooling system, the fiber-optic links to the training data — is as physically imposing as any industrial installation of the twentieth century. The product of all this machinery is a conditional probability: the likelihood that a given token follows the tokens that preceded it, given what the model learned during training.
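The loop that turns those conditional probabilities into text is small enough to sketch. The fragment below assumes a hypothetical model callable that maps a token sequence to one logit per vocabulary entry; it is illustrative, not any particular system's API.

```python
import math
import random

def softmax(logits):
    """Convert raw scores into a probability distribution over the vocabulary."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def generate(model, prompt_tokens, max_new_tokens=50, end_token=None):
    """Autoregressive decoding: each new token is drawn from a distribution
    conditioned on everything generated so far. `model` is a hypothetical
    callable returning one logit per vocabulary entry."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        probs = softmax(model(tokens))   # the conditional distribution p(x_t | x_<t)
        next_token = random.choices(range(len(probs)), weights=probs)[0]
        tokens.append(next_token)
        if next_token == end_token:
            break
    return tokens
```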
What the string produces is plausibility: the property that a text sounds like something a competent author would write. Plausibility is not a trivial achievement. It requires implicit knowledge of syntax, semantics, domain conventions, and pragmatics. A model that can produce a plausible legal brief has, in some operationally meaningful sense, absorbed a large fraction of what a law student learns in three years. A model that can produce a plausible differential diagnosis has absorbed a large fraction of what a medical student learns in four. The string-generating machine is a machine for producing locally coherent text, and locally coherent text is what most readers encounter most of the time.
But local coherence is not global coherence. A paragraph can be internally sound and externally contradictory. A document can flow smoothly from premise to conclusion while silently violating a constraint established three pages earlier. Local truth is cheap. The string excels at producing it. What civilizations pay for — what the bill of exchange, the experimental report, and the chain of custody were built to provide — is the harder thing: coherence that survives the border between one system of truth and another.
A string cannot be broken
The deepest incapacity of the string is the impossibility of binding.
Consider what happens when a merchant in Bruges signs a bill of exchange. The signature is, in one sense, a string: a sequence of marks on parchment. But the signature on the bill does something that no string, however accurate, can do on its own: it creates an obligation. The merchant who endorses the bill has bet his name and his property on the validity of the obligation. If the drawee refuses to pay, the holder can pursue recourse against the endorser. The signature can be broken, not in the sense that the ink can be erased, but in the sense that the commitment it represents can be violated, and the violation has consequences.
A string generated by a machine cannot, given current architectures, be broken in this sense. "I promise to pay Alice one hundred dollars on March 15" is a sequence of fifty-four characters. Typed by a merchant in a contract, it creates a legal obligation. Generated by a language model in response to a prompt, it creates nothing. The characters are identical. The institutional force is absent.
David Kaplan's work on demonstratives (Kaplan 1989) illuminates why the gap is inherent, not contingent. A demonstrative expression such as "I," "here," "now," or "this" refers to something, but what it refers to depends entirely on the context of utterance. "I" in a string has no referent until someone specifies who is speaking. "Here" has no location until someone specifies where. A string containing the sentence "I am in Paris" is not true or false in isolation; it is true or false relative to a context that the string itself does not carry. Strip the context and you strip the reference. Strip the reference and the string becomes an unbound variable: a placeholder that looks like a commitment but isn't one.
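Kaplan's point can be made mechanical. In the sketch below, where the names Context and evaluate_i_am_in_paris are invented for illustration, the string acquires a truth value only when a context supplies a referent for "I."

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Context:
    """A Kaplan-style context of utterance: who speaks, where, and when."""
    speaker: str
    location: str
    time: str

def evaluate_i_am_in_paris(ctx: Optional[Context]) -> Optional[bool]:
    """The string "I am in Paris" is not true or false in isolation; it is
    true or false relative to a context the string itself does not carry."""
    if ctx is None:
        return None   # an unbound variable: no context, no proposition
    return ctx.location == "Paris"

print(evaluate_i_am_in_paris(None))                                     # None
print(evaluate_i_am_in_paris(Context("Alice", "Paris", "2025-03-15")))  # True
print(evaluate_i_am_in_paris(Context("Bob", "Bruges", "1410-06-01")))   # False
```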
Austin (Austin 1962) made the complementary observation from the direction of action rather than reference. Some utterances do not describe the world but change it: "I do" in a marriage ceremony, "I promise" in a contractual negotiation, "I bet you sixpence" in a casual wager. Austin called these performatives: utterances that constitute the acts they describe. But performatives work only under what Austin called "felicity conditions": the right speaker, the right audience, the right institutional context, the right conventions. "I do" spoken by the bride or groom before the registrar or altar constitutes a marriage. "I do" generated by a language model constitutes nothing. The string is the same. The felicity conditions are absent.
The incapacity is inherent in the medium. A string is a representation: a sequence of symbols drawn from an alphabet. Representations describe but do not bind. To bind is to create a relationship between the representation and the world, a relationship that carries consequences when violated. Binding requires something the string does not contain: an institutional context that connects the symbols to stakes, to parties, to conditions of enforcement. The inability to bind is not merely a technical gap; it is the gap in which the trust tax hides, the space where intermediaries insert themselves as the only parties capable of converting representation into obligation. The medieval merchants understood this distinction even if they would not have used these terms. The spoken word was adequate for expression. It was inadequate for commerce. The solution was not better speech. The solution was the notarial instrument.
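The structural difference can be sketched directly. Every name in the fragment below is invented for illustration: the same string either passes through as bare representation or, when the felicity conditions hold, produces an object with parties, stakes, and an enforcing institution.

```python
from dataclasses import dataclass

@dataclass
class Obligation:
    """What binding produces: a relationship with parties, stakes, and
    consequences on breach -- none of which the bare string carries."""
    debtor: str
    creditor: str
    amount: str
    due: str
    enforceable_by: str   # the institution that recognizes the form

def utter(text, speaker=None, witness=None, institution=None):
    """Austin-style performative: the same string creates an obligation only
    when the felicity conditions hold. (The creditor and terms would be
    parsed out of band in a real system; they are hardcoded here.)"""
    felicitous = speaker is not None and witness is not None and institution is not None
    if not felicitous:
        return text       # a bare string: representation without commitment
    return Obligation(debtor=speaker, creditor="Alice",
                      amount="one hundred dollars", due="March 15",
                      enforceable_by=institution)

print(utter("I promise to pay Alice one hundred dollars on March 15"))
# -> the string itself, binding nothing
print(utter("I promise to pay Alice one hundred dollars on March 15",
            speaker="the merchant", witness="the notary", institution="the court"))
# -> Obligation(...), breakable and therefore meaningful
```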
Representation is arbitrary; identity is not
In 1872, the mathematician Felix Klein (Klein 1872) proposed a principle that reorganized geometry. The Erlangen Programme, as it came to be known, classified geometries not by the shapes they studied but by the transformations under which their properties remained invariant. Euclidean geometry preserves distances and angles: rotate a triangle and it remains congruent to itself. Projective geometry preserves incidence: a line through a point remains a line through a point even under perspective distortion. Topology preserves continuity: stretch a circle into an ellipse and they remain topologically identical; tear it and they don't.
Klein's insight was that the representation changes but the invariant persists. The coordinates that describe a triangle in one frame look nothing like the coordinates that describe it in another, but the triangle is the same triangle if and only if the transformation between frames preserves the relevant properties. The representation is arbitrary. The identity is not.
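The Euclidean case in miniature, as a sketch: rotate a triangle and its coordinates change completely, but the sorted side lengths, the invariant, do not.

```python
import math

def rotate(point, angle):
    """Rotate a point about the origin: a change of representation."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)

def pairwise_distances(triangle):
    """The Euclidean invariant: side lengths survive any rotation."""
    a, b, c = triangle
    dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    return sorted(round(dist(p, q), 9) for p, q in [(a, b), (b, c), (c, a)])

triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
rotated = [rotate(p, math.pi / 3) for p in triangle]

print(triangle == rotated)                                          # False: coordinates differ
print(pairwise_distances(triangle) == pairwise_distances(rotated))  # True: same triangle
```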
Applied to information, the principle is immediately clarifying and immediately troubling. "Alice owes Bob one hundred dollars, due March 15" can be encoded as a JSON object, an XML document, a row in a relational database, a sentence in English, a sentence in Mandarin, or a handwritten entry in a merchant's ledger. The six encodings look nothing alike. The curly braces of the JSON, the angle brackets of the XML, the columnar regularity of the database row, the flowing script of the English sentence, the characters of the Mandarin, the ink on the parchment: six strings, six representational conventions, six utterly different arrangements of symbols.
Is the obligation the same?
The question cannot be answered by examining the strings. The JSON may encode the date as "2025-03-15" and the XML as "March 15, 2025." Are these the same date? Presumably, but the strings are different, and a character-level comparison will say they are not equal. The English sentence says "one hundred dollars" and the JSON says "100.00." Are these the same amount? In one currency, yes; but the English doesn't specify the currency, and the JSON may carry a currency field that the English lacks. The Mandarin sentence uses a different calendar convention. The ledger entry uses a different notation for the sum.
Each encoding is internally consistent. Each carries the information needed by the system that produced it. None carries the terms of its equivalence to the others. The equivalence is not a property of the representations; it is a property of the system that establishes their correspondence.
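The point survives contact with code. In the sketch below, the year, the calendar, and the currency are assumptions injected from outside the strings; the comparison succeeds only because the correspondence was supplied by the system, not carried by the representations.

```python
import json
from datetime import date

english = "Alice owes Bob one hundred dollars, due March 15"
record = json.loads('{"debtor": "Alice", "creditor": "Bob", '
                    '"amount": "100.00", "due": "2025-03-15"}')

# Character-level comparison: the encodings are simply different strings.
print(english == json.dumps(record))   # False, and uninformatively so

# Equivalence requires a correspondence that neither string carries:
# a rule for reading dates, plus an assumed year and currency.
ASSUMED_YEAR, ASSUMED_CURRENCY = 2025, "USD"   # assumptions, not data

def english_due_date(text):
    """Crude illustrative parser: maps "March 15" onto a calendar date
    only by injecting the assumed year from outside the string."""
    month, day = text.rsplit("due ", 1)[1].split()
    months = {"March": 3}                       # just enough for the example
    return date(ASSUMED_YEAR, months[month], int(day))

print(english_due_date(english) == date.fromisoformat(record["due"]))  # True --
# but only because we supplied the year, the calendar, and the currency ourselves.
```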
This was exactly the problem on the table in Bruges in 1410. The Venetian's ledger and the Florentine's ledger were two encodings of the same obligation, each internally sound, each useless at the border. The bill of exchange was the object that certified the equivalence, not by containing all the information from both ledgers, but by specifying the terms on which an entry in one could be recognized as the same entry in the other. The bill was a witnessed equivalence: an assertion that two things in two different frames were, for specified purposes, the same, backed by the institutional authority of the notary and the financial liability of the endorsers.
A string can encode information, transmit it, even assert that two pieces of information are equivalent. But the assertion of equivalence is itself another string, inheriting every incapacity of the medium. Who asserted the equivalence? Under what conditions does it hold? What happens if it's wrong? Representation is arbitrary; identity requires a witness.
Two strings can contradict and nothing breaks
The binding problem concerns single assertions; the composition problem concerns what happens when they accumulate.
A system that has generated one million assertions across ten thousand conversations has, in the aggregate, taken on a vast number of implicit commitments. If it asserted in conversation A that the boiling point of water at sea level is 100 degrees Celsius, and in conversation B that the boiling point of water at sea level is 212 degrees Fahrenheit, has it contradicted itself? Logically, no: the two assertions are compatible, expressed in different units. But nothing in the string representation carries the unit conversion. The two assertions are locally plausible and globally unrelated, because the system maintains no record of what it has previously committed to.
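The two assertions are compatible only through a conversion that neither string carries:

```latex
% Celsius-to-Fahrenheit conversion showing the two assertions coincide:
F = \tfrac{9}{5}\,C + 32, \qquad \tfrac{9}{5}\cdot 100 + 32 = 212.
```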
The problem deepens when the assertions do conflict. "This medication is safe for use during pregnancy" and "This medication is contraindicated in pregnant patients" are mutually exclusive, and both have appeared in the outputs of medical question-answering systems. The contradiction is invisible to the generating mechanism, because the mechanism produces each token by predicting the next token from the preceding context. It has no obligation to the assertions it made in previous contexts, no ledger of commitments, no mechanism for checking a new assertion against the body of assertions it has already produced. Each response begins, in effect, from a fresh sheet.
A string-generating system produces local continuations. Consistency is a global property of a set of assertions, the property that they can all be true together, and checking it requires something the string does not carry: a semantics that turns character sequences into propositions and a mechanism that evaluates propositions against each other. The gap between token and proposition is exactly the gap between the Venetian's ledger and the Florentine's: each locally consistent, each unable to compose with the other without an external object that certifies the conditions of equivalence.
A simple test exposes the incapacity. Define four rules: all glints are flerms; no flerm is a zoth; Plex is a glint; Plex is a zoth. The first and third rules entail that Plex is a flerm; combined with the second, that Plex is not a zoth. The fourth rule asserts that Plex is a zoth. The set is inconsistent: the four rules cannot all be true together. A system with commitment discipline would detect the inconsistency and refuse to accept all four. A string-generating system, asked to affirm each rule in sequence, will affirm all four without hesitation, because each affirmation is a local continuation, the right answer to the immediate prompt, and the mechanism carries no obligation to check the local answer against the global set. The words "glint," "flerm," and "zoth" are arbitrary; the inconsistency is logical; and the failure lies in the medium's inability to maintain a ledger of commitments across assertions.
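The test is small enough to run. A brute-force satisfiability check over the three predicates, restricted to Plex since no other individual appears, confirms that the four rules cannot all be true together:

```python
from itertools import product

def consistent(rules):
    """The rules are jointly consistent iff some assignment of the three
    predicates (glint, flerm, zoth) to Plex satisfies all of them."""
    for glint, flerm, zoth in product([False, True], repeat=3):
        if all(rule(glint, flerm, zoth) for rule in rules):
            return True
    return False

rules = [
    lambda g, f, z: (not g) or f,    # all glints are flerms
    lambda g, f, z: not (f and z),   # no flerm is a zoth
    lambda g, f, z: g,               # Plex is a glint
    lambda g, f, z: z,               # Plex is a zoth
]

print(consistent(rules))        # False: no assignment satisfies all four
print(consistent(rules[:3]))    # True: the first three cohere on their own
```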
Researchers measuring this gap directly have found that large language models, when presented with the same question rephrased in different syntactic forms, produce contradictory answers roughly two times out of five. The inconsistency is not random; it is systematic, concentrated in domains where the answer depends on reasoning rather than recall. The models are more consistent on factual questions ("What is the capital of France?") and less consistent on inferential questions ("If all glints are flerms and no flerm is a zoth, can Plex be both a glint and a zoth?"). The reason is that factual recall is a string operation, pattern-matching against the training data, while inference is a commitment operation that requires evaluating propositions against each other. The string-generating machine performs the first well and the second poorly, because the first is a property of the medium and the second is not.
The incapacities compound when evidence travels: when strings from different sources, carrying different institutional contexts, meet in a shared processing space that flattens their distinctions.
A legal research system retrieves two holdings on the enforceability of liquidated damages clauses. The first, from 2019, says they are enforceable if they represent a reasonable forecast of loss at the time of contracting. The second, from 2023, says they are unenforceable if they are imposed through procedurally unconscionable means: a form contract, no negotiation, no meaningful alternative. Both holdings are real. Both were retrieved correctly from authoritative databases. The system synthesizes them into a confident paragraph that presents a unified position on the law of liquidated damages.
The synthesis is wrong, not because either holding is wrong, but because the second holding qualifies the first. The 2023 decision didn't overturn the 2019 principle; it added a procedural condition that narrows the principle's applicability. The relationship between the two holdings is one of scope: the first states the general rule, the second identifies an exception. A lawyer reading both opinions would see the qualification immediately. The system, processing them as strings, cannot see it at all, because the scope relationship is not in the strings. It is in the institutional context that produced them: which court, in what jurisdiction, at what level of authority, and with what precedential relationship to each other.
When evidence loses its address, when a claim is separated from its source, its conditions, and its relationship to other claims, contradiction becomes invisible. Two propositions that appear to agree may in fact disagree in ways that only the institutional context can reveal. Two propositions that appear to contradict may in fact be compatible under conditions that the strings alone do not specify. The system that processes them as flat text, feeding them into a context window where all inputs are equal, has no way to detect the difference. The context window is shared memory with no access controls: once text enters, its provenance vanishes into the undifferentiated stream.
The pattern repeats outside legal research. A fashion retailer integrates product data from three suppliers. Supplier A describes a garment as "silk blend, seventy percent silk, thirty percent polyester." Supplier B describes the same garment, same SKU, same manufacturer, as "one hundred percent polyester." Supplier C calls the fabric "satin," which is a weave, not a fiber, and says nothing about material composition at all. Three strings, each internally coherent, each arriving from a legitimate commercial source. The system must decide what the garment is made of, and the strings give it no basis for deciding, because the disagreement is not in the strings but in the sourcing relationships, the testing protocols, and the regulatory definitions that stand behind them. A human merchandiser would call Supplier A, ask for the test certificate, and resolve the conflict in fifteen minutes. The string-processing system has no test certificates, no phone numbers, and no way to distinguish a typo from a genuine compositional difference.
This is not a failure of retrieval. The correct data were retrieved. It is not a failure of generation. The synthesis was fluent and well-organized. It is a failure of composition: the inability to combine evidence from different sources in a way that preserves the relationships between them. Two strings can be concatenated, interleaved, summarized, or paraphrased, and the result is another string. But the result does not carry the terms under which the original strings were valid, the authorities that produced them, or the scope relationships that determine how they interact. The composition is lossy at exactly the point where losslessness matters.
The bill of exchange solved this problem for a different medium. When a bill traveled from Florence to Bruges, it carried the obligation and the chain of endorsements that certified each transfer. If the drawee refused to pay, the holder could trace the chain backward: each endorser had staked his name on the validity of the transfer, and each endorser remained liable. The chain did not merely transmit the claim; it transmitted the constraints under which the claim could be inspected and challenged at every step. A string that has been copied, summarized, or synthesized carries no such chain. It is evidence without custody.
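The difference can be sketched, with invented names throughout: the bill object accumulates endorsements and can be traced backward on dishonor, while the flat string carries no chain at all.

```python
from dataclasses import dataclass, field

@dataclass
class Endorsement:
    endorser: str       # who staked their name on this transfer
    liable: bool = True

@dataclass
class Bill:
    """A claim that travels with its custody: every transfer appends an
    endorsement, and every endorser remains traceable and liable."""
    claim: str
    endorsements: list = field(default_factory=list)

    def endorse(self, endorser):
        self.endorsements.append(Endorsement(endorser))
        return self

    def recourse(self):
        """On dishonor, trace the chain backward: evidence with custody."""
        return [e.endorser for e in reversed(self.endorsements) if e.liable]

bill = Bill("Pay the bearer one hundred florins in Bruges")
bill.endorse("a merchant of Florence").endorse("a banker of Genoa").endorse("a broker of Bruges")
print(bill.recourse())
# ['a broker of Bruges', 'a banker of Genoa', 'a merchant of Florence']

# Contrast: string composition is lossy at exactly this point.
flat = bill.claim + " (endorsed, trust me)"
# `flat` is just another string: no endorsers, no liability, no chain to trace.
```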
The spoken word and its remedies
Oral testimony labored under every incapacity this chapter has described. A spoken promise was a string in the most literal sense: a sequence of sounds produced by a human voice, subject to the acoustics of the room and the fallibility of the listeners' memory. The promise could not be inspected after the fact. It could not be transmitted to someone who was not present. Two people who heard the same words might remember them differently, and there was no mechanism for resolving the disagreement. If the promisor died, the promise died with him, unless witnesses could be found, and witnesses had their own memories, their own interests, and their own mortality.
Civilizations did not solve this problem by making speech more reliable. They solved it by building institutional structures that wrapped the spoken word in conditions the word itself could not carry.
The Roman stipulatio was the earliest and most austere of these remedies. The scene was intimate and physical: two men standing close enough to hear each other clearly, in the open air or in a room with witnesses present, the words spoken aloud in prescribed Latin regardless of the parties' native tongue. One spoke the question: Spondesne centum dare? — Do you solemnly promise to give one hundred? The other answered with the single word that completed the contract: Spondeo — I solemnly promise. The formula was invariant. No synonym was acceptable. No written substitute was valid. The exchange had to occur in person, aloud, between parties who could see and hear each other. A letter containing the same words, the same string, created no obligation at all. The ritual did not improve the content. It bound the content to a context that the content alone could not carry: identifiable parties who were physically present, a prescribed formula that constrained what counted as a valid promise, the social and legal consequences of breach, the testimony of witnesses who heard the exchange, and the legal system's recognition of the form.
The oath wrapped the spoken word in a different kind of institutional force: divine witness. The perjurer risked not his reputation or his property but his soul: a sanction that could not be evaded, appealed, or settled out of court. The oath's power depended on the parties' belief in the divine witness, which made it effective in communities of shared faith and useless across confessional boundaries. But the structure was the same: the words were wrapped in conditions that the words themselves could not carry.
The written document addressed the most obvious incapacity of the spoken string: its transience. A promise committed to papyrus, parchment, or wax survived the absence, the forgetfulness, and the death of the promisor. The seal, a blob of wax impressed with the sender's signet ring, its pattern unique to the individual, its impression physically inseparable from the document, authenticated the author's identity without requiring his physical presence. And the notarial instrument carried the institutional authority of the notary's publica fides, a legal fiction that gave the notary's attestation the evidentiary weight of two or three ordinary witnesses and made the document enforceable across jurisdictions that recognized the notary's commission.
The large language model produces strings with the same incapacities that afflicted the spoken word: it cannot bind, cannot compose under coherence constraints, and cannot carry provenance. The question is whether computation will need to build equivalent solutions, or whether the string, at sufficient scale, will transcend the limits that every prior medium exhibited.
The truthfulness objection
The strongest version of the opposing case is that the incapacities described above are engineering problems, not inherent ones, and that they are being solved.
Reinforcement learning from human feedback, the technique that transformed raw language models into conversational assistants, demonstrably reduces the rate at which models produce false statements. Constitutional AI embeds behavioral constraints directly into the training process, penalizing outputs that violate specified principles. Direct preference optimization aligns model outputs with human judgments of quality more efficiently than earlier methods. Each technique improves the accuracy of the string. The improvement is measurable, consistent, and ongoing.
The objection has empirical force. A model trained with these techniques produces fewer factual errors, fewer unsupported claims, and fewer logically inconsistent passages than its untrained predecessor. If the trend continues, if hallucination rates fall to negligibly small values, then the practical distinction between a truthful string and a bound commitment may become irrelevant. A machine that never lies is, for all practical purposes, a machine that can be trusted.
But truthfulness is a property of propositions. Bindingness is a property of systems of commitment. The distinction is not a matter of degree, a spectrum from "mostly truthful" to "perfectly truthful" that eventually reaches "bound," but a difference in kind. A true proposition asserts something about the world. A bound commitment creates a relationship between the asserter and the world, a relationship that carries consequences when violated. Training a model to produce true propositions is like training a speaker to be honest: it improves the quality of the string without addressing the deficit.
The stipulatio did not make the Roman merchant more honest. Honest merchants existed before the stipulatio and would have existed without it. What the stipulatio provided was enforceability: the institutional apparatus that made the merchant's promise matter regardless of his honesty, because the consequences of breach were external to the merchant's character and embedded in the legal system's recognition of the form. An honest merchant who spoke the stipulatio formula and a dishonest merchant who spoke the same formula created the same legal obligation. The institution did not care about the merchant's intentions. It cared about the structure of the commitment.
The fashion catalog illuminates the point from a different angle. A customer review reads: "I was hoping this would be flowy, but it's actually quite structured." A string-matching system searching for "flowy" dresses will return this review as a positive match. A more sophisticated model, one trained on enough examples to recognize the syntactic pattern of defeated expectations, will correctly classify the review as negative. But the correction is fragile. "This dress is flowy in a way that reminds me of structured Japanese design" will defeat the improved classifier, because the negation is now semantic rather than syntactic and depends on knowledge of fashion history that the model may or may not possess. Each layer of training addresses one class of errors and exposes another. The recursion does not bottom out at "truthful," because the problem is not the quality of the string. The problem is that the string does not carry the terms on which it may be relied upon, and no amount of training on propositions installs those terms.
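The recursion can be sketched, with both rules invented for illustration: the keyword matcher fails on the syntactic negation, the pattern-aware repair handles it, and the semantic case sails past both.

```python
def keyword_match(review):
    """First-generation string matching: any mention of the term counts."""
    return "flowy" in review.lower()

def pattern_aware(review):
    """One layer of repair: recognize the syntax of defeated expectations.
    Illustrative only -- a stand-in for a trained classifier."""
    text = review.lower()
    defeated = ("hoping" in text or "expected" in text) and " but " in text
    return "flowy" in text and not defeated

review_1 = "I was hoping this would be flowy, but it's actually quite structured."
review_2 = "This dress is flowy in a way that reminds me of structured Japanese design."

print(keyword_match(review_1))   # True  -- false positive
print(pattern_aware(review_1))   # False -- the syntactic repair works here
print(pattern_aware(review_2))   # True  -- but nothing in the rule can tell whether
# "structured Japanese design" cancels "flowy"; that knowledge is semantic, not syntactic.
```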
Toward the empire of tables
If strings cannot bind, compose, or carry provenance, if they are, like the spoken promise, adequate for expression but incapable of commitment, then the natural remedy is what every civilization eventually tried: impose structure. Write it down. Define the categories. Codify the form. Build a schema that constrains what can be said and how it can be said, so that the resulting records are inspectable, comparable, and enforceable.
This is the empire of tables: the regime of databases, schemas, type systems, and formal specifications that enforces structure with the same rigor that the string refuses. Where the string is loose, the table is rigid. Where the string permits contradiction without consequence, the table rejects malformed input at the gate. Where the string loses provenance on copy, the table preserves identity through keys and foreign keys and referential integrity constraints.
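A minimal sketch of the regime, here in SQLite: the schema rejects malformed input at the gate, and the foreign-key constraint preserves identity where the string would have lost it.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")   # referential integrity enforced at the gate

con.execute("""CREATE TABLE parties (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
)""")
con.execute("""CREATE TABLE obligations (
    id       INTEGER PRIMARY KEY,
    debtor   INTEGER NOT NULL REFERENCES parties(id),
    creditor INTEGER NOT NULL REFERENCES parties(id),
    amount   TEXT NOT NULL,
    due      TEXT NOT NULL CHECK (due GLOB '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]')
)""")

con.execute("INSERT INTO parties VALUES (1, 'Alice'), (2, 'Bob')")
con.execute("INSERT INTO obligations VALUES (1, 1, 2, '100.00', '2025-03-15')")  # accepted

try:
    # A debtor who does not exist: the table refuses what the string would happily carry.
    con.execute("INSERT INTO obligations VALUES (2, 99, 2, '100.00', '2025-03-15')")
except sqlite3.IntegrityError as e:
    print("rejected at the gate:", e)
```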
The table solves half the problem. What it cannot do, and why the empire of tables creates the other half of the problem even as it solves the first, is the subject of the next chapter.