Evidence Without Custody
A person who relates a hearsay is not obliged to enter into any particulars, to answer any questions, to solve any difficulties, to reconcile any contradictions, to explain any obscurities, to remove any ambiguities; he entrenches himself in the simple assertion that he was told so, and leaves the burden entirely upon the absent, and perhaps unknown author.
The oldest rule of evidence
A witness takes the stand in a federal courtroom. She testifies that on the morning of March 14, the defendant signed a contract in her office, that she watched him read each page, that she observed his handwriting as he initialed the margins and signed the final page, that a second witness was present and signed below. The opposing counsel rises. The cross-examination begins: How far from the defendant were you standing? Was there anyone else in the room? Had you met the defendant before that morning? Is it possible you are confusing this occasion with another? Are your initials on the document as well? Every question is a probe, designed to expose what the testimony hides: the distance, the lighting, the possibility of confusion, the witness's relationship to the parties, the circumstances in which her perception was formed. The trier of fact watches the responses, registers the witness's confidence or hesitation, and weighs the testimony against the documentary and physical evidence already in the record.
The system works because the assertion and the person who made it are both present. The oath binds the witness to the truth of her statement. The penalty for perjury gives the oath its force. Cross-examination tests the statement against the circumstances of its making. The judge supervises the process. The court reporter produces a transcript. The apparatus surrounds the bare assertion with safeguards, and the safeguards travel with the testimony into the permanent record of the proceeding.
But when the witness is unavailable (dead, ill, beyond the court's jurisdiction, or simply refusing to appear), the system confronts a different problem. In place of live testimony, a document is offered: a letter the witness wrote to her business partner on the evening of March 14, describing the same contract signing in the same detail. The letter contains the same factual assertions. The handwriting is authenticated. The letter's provenance is established through testimony from the partner who received it. But the letter cannot be cross-examined. No one can ask it follow-up questions, test its account against the physical evidence, or observe the declarant's demeanor under pressure. The letter is evidence, but it is evidence that has crossed an institutional boundary: from a setting where the safeguards could operate on the person who made the assertion to a setting where only the document remains.
The common law has a name for this situation. It is called hearsay: an out-of-court statement offered to prove the truth of the matter asserted (United States Courts 2023, Rule 801(c)). For seven hundred years, the Anglo-American legal system has treated hearsay as presumptively inadmissible. The presumption rests on structural grounds: the safeguards that make live testimony reliable (oath, cross-examination, the tribunal's ability to assess demeanor) cannot operate on an absent declarant. What remains is the bare assertion, stripped of the institutional apparatus that vouches for it.
The principle crystallized at Winchester Castle on November 17, 1603 (Langbein, Lerner, and Smith 2009, chs. 9–10). Sir Walter Raleigh stood accused of treason. The prosecution's case rested on an unsworn written examination of Lord Cobham, taken by the Privy Council and read aloud to the jury (Jardine 1832, pp. 389–520). Cobham was imprisoned nearby but never produced as a witness. Raleigh protested in terms that would echo through four centuries of evidence law: "Call my accuser before my face… If you proceed to condemn me by bare inferences, without an oath, without a subscription, without witnesses, upon a paper accusation, you try me by the Spanish Inquisition." The court refused. The jury convicted in fifteen minutes. One of Raleigh's own judges later lamented that "the justice of England has never been so degraded and injured as by the condemnation of Sir Walter Raleigh" (Jardine 1832, p. 520). The Sixth Amendment would codify the confrontation right nearly two centuries later, in 1791. When the Supreme Court gave the Clause its modern interpretation, Justice Scalia pointed to the Raleigh trial as a notorious instance of "the principal evil" at which it was directed: the use of ex parte examinations as evidence against the accused (Crawford v. Washington 2004, 541 U.S. at 50). The Federal Rules of Evidence, adopted in 1975, formalized the hearsay bar and its exceptions into the structure the American legal system uses today.
Bentham saw the problem with characteristic precision (Bentham 1827, bk. III, ch. 15). The person who relates hearsay "entrenches himself in the simple assertion that he was told so, and leaves the burden entirely upon the absent, and perhaps unknown author." The relater may be perfectly faithful in reporting what he was told. But the accuracy of the relay does not vouch for the reliability of the origin. The chain from the original speaker to the courtroom has lost the apparatus that could test the assertion: the oath, the cross-examination, the tribunal's ability to observe the declarant under pressure. The content traveled intact. The safeguards did not.
The legal system responded by identifying the circumstances under which the loss is tolerable. The Federal Rules of Evidence enumerate nearly thirty exceptions to the hearsay bar (United States Courts 2023, Rules 803–804), each specifying circumstances that substitute for the missing safeguards. A business record is admissible when it was "made at or near the time by someone with knowledge," "kept in the course of a regularly conducted activity," and "making the record was a regular practice of that activity" (United States Courts 2023, Rule 803(6)). The temporal proximity, the knowledge requirement, the institutional routine: these conditions substitute for cross-examination by ensuring that the record was produced under circumstances that limit the opportunities for fabrication or error. A dying declaration is admissible when the declarant believed death was imminent (United States Courts 2023, Rule 804(b)(2)), the condition of mortal peril serving as a guarantor of sincerity. A statement against interest is admissible when it was "so contrary to the declarant's proprietary or pecuniary interest" that a reasonable person would not have made it unless it were true (United States Courts 2023, Rule 804(b)(3)).
Each exception identifies a specific procedural or circumstantial structure that makes the evidence trustworthy despite the absence of the person who produced it. The exceptions represent seven centuries of accumulated judgment about what makes evidence reliable when the declarant cannot be questioned. The answer, in every case, involves the conditions under which the evidence was produced, not merely its content.
The string empire produces assertions without institutional backing. The table empire enforces structure but cannot maintain it across boundaries. Both empires generate evidence without custody: assertions that carry content but not the grounds on which the content may be relied upon.
The seam between systems
Every information system operates in the space between the two empires, and the seam between them is where evidence loses its backing.
The legal research platform is a representative case. The platform indexes case holdings in structured fields (court, date, jurisdiction, topic, procedural posture) and stores the holdings themselves as blocks of prose, because no schema can anticipate every doctrinal distinction a court may draw. A junior associate at a law firm asks the platform: "Is a liquidated damages clause enforceable in California?" The system retrieves two holdings from its database. The first, from a 2019 appellate decision, states that a liquidated damages clause is enforceable if it represents "a reasonable forecast of damages that would be caused by a breach." The second, from a 2023 decision in a different procedural posture, holds that a liquidated damages clause may be struck down if it was "procedurally unconscionable at the time of contract formation."
Both holdings are correctly retrieved, accurately quoted, and relevant to the query. The structured metadata (court, date, jurisdiction) is intact.
The system synthesizes: "Liquidated damages clauses are generally enforceable in California if they represent a reasonable forecast, but may be struck down if procedurally unconscionable." The synthesis reads like a competent legal summary. It is wrong in a way that no citation check will detect.
The 2023 holding did not overturn the 2019 standard. It refined it by adding an independent dimension of analysis: a clause can satisfy the "reasonable forecast" test and still fail on procedural unconscionability grounds, because the latter examines the bargaining constraints under which the clause was agreed to (the relative sophistication of the parties, the adequacy of disclosure, the presence or absence of meaningful negotiation), while the former examines only the clause's substantive relationship to anticipated damages. A clause can be substantively reasonable and procedurally unconscionable at the same time. The system's synthesis flattened a doctrinal refinement into a simple conjunction, implying an override where none exists.
The holdings were found. The text is accurate. The failure occurs at the seam: the system received two holdings from two distinct legal contexts (different courts, different procedural postures, different stages of a developing doctrinal line) and merged them in a space where the context had been stripped away. The structured fields told the system that the holdings came from different courts and different dates. They did not encode the doctrinal relationship between the two: whether the second overruled, distinguished, refined, or merely paralleled the first. That relationship is a fact about the legal context, and it lived outside the schema.
The pattern recurs across every hybrid system. A clinical decision support tool retrieves published treatment guidelines from a structured database and patient records from an electronic health record. The guidelines were produced by expert committees, reviewed over multi-year cycles, and published with explicit confidence grades. The patient records were entered by clinicians under time pressure, with abbreviations, ambiguities, and implicit assumptions that make sense to the treating physician and may not survive extraction into a different system. The tool combines both sources in a recommendation engine that weighs the guidelines against the patient's history. If the patient's allergy record is incomplete (if, say, a known allergy was documented in a specialist's note but never propagated to the allergy table), the recommendation engine will fail to flag the contraindication.
The guideline says: "Do not prescribe penicillin to patients with documented penicillin allergy." The allergy table says: no allergy on record. The specialist's note, buried in a free-text field that the recommendation engine does not parse, says: "Patient reports severe reaction to amoxicillin in 2019." The guideline's rigor and the patient record's documentary fragility are both present in the source data and indistinguishable in the recommendation engine's combined input. The allergy that matters most exists in the wrong part of the hybrid: it lives in a string where the system expected a table entry, and the seam between the two is where the system's reliability fails. No error message. No audit trail. A physician who trusts the recommendation system's silence will prescribe a drug the patient cannot safely take.
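The allergy example can be made concrete with a minimal sketch. Everything in it is an assumption made for illustration: the patient identifier, the two dictionaries standing in for the allergy table and the free-text notes, and the three-item list of penicillin-class drugs. The keyword scan is a crude stand-in for clinical text processing, shown only to make the point that the evidence exists in the string while the engine consults the table.

```python
# Hypothetical patient data; identifiers and field names are illustrative only.
ALLERGY_TABLE = {"patient-001": []}  # structured field: nothing on record
FREE_TEXT_NOTES = {
    "patient-001": ["Patient reports severe reaction to amoxicillin in 2019."]
}
# Assumed, deliberately tiny list of penicillin-class drug names.
PENICILLIN_CLASS = {"penicillin", "amoxicillin", "ampicillin"}


def table_flags_allergy(patient_id: str) -> bool:
    """What the recommendation engine checks: the structured table only."""
    recorded = ALLERGY_TABLE.get(patient_id, [])
    return any(drug in PENICILLIN_CLASS for drug in recorded)


def notes_mention_allergen(patient_id: str) -> list[str]:
    """A crude keyword scan of free text, returning notes for human review."""
    notes = FREE_TEXT_NOTES.get(patient_id, [])
    return [
        note for note in notes
        if any(drug in note.lower() for drug in PENICILLIN_CLASS)
    ]


engine_sees_allergy = table_flags_allergy("patient-001")
notes_for_review = notes_mention_allergen("patient-001")

print(engine_sees_allergy)    # False: the table is silent
print(len(notes_for_review))  # 1: the evidence was in the string all along
```

The scan does not solve the problem. It only shows that the silence of the table and the absence of an allergy are different facts.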
The same structure recurs wherever a neural network's outputs pass through a rules engine: the network proposes, the rules engine vetoes, and neither speaks the other's language. The customer sees a recommendation list from which items have been silently removed; the patient sees a treatment plan from which options have been silently excluded. In every case, the seam between the statistical model and the constraint system is a site where the reasoning behind the output is forfeited in the handoff.
What custody demands
The legal system's answer to the hearsay problem has three components, and each maps onto a requirement that computational hybrids systematically fail to provide.
The first is provenance: the identity of the source and the chain through which the evidence reached the trier of fact. When a blood sample is entered as evidence in a criminal trial, the prosecution must document every person who handled the sample from collection to courtroom. The phlebotomist who drew the blood signs the collection form and seals the vial in a tamper-evident bag: a brown paper bag with an adhesive seal that tears visibly if broken, the technician's initials written across the seal in permanent marker, the date and time of collection noted on the attached chain-of-custody form. The courier who transports the sample to the laboratory signs the form upon receipt. The laboratory analyst signs upon intake, notes the seal's condition, breaks the seal under documentation, performs the analysis, reseals the results, and signs again. The evidence clerk who stores the results signs. Every link in the chain is identified. Every transfer is documented. Every handoff is signed.
A gap in the chain (an unsigned transfer, an unaccounted-for period between collection and laboratory receipt, a seal that cannot be verified as intact) does not prove contamination. But it opens the door to the argument that contamination is possible, that the sample presented in court may not be the sample collected at the scene. The defense attorney exploits the gap by demonstrating that the safeguards preventing tampering were not maintained (United States Courts 2023, Rule 901(a)). Once the chain is shown to be broken, the prosecution must re-establish the evidence's integrity through independent means or accept its exclusion.
The evidence room that houses the sample between hearings is itself a provenance infrastructure: a climate-controlled vault with a single monitored entrance, its shelves organized by case number, each bin sealed and logged, each checkout signed and countersigned. The physical architecture (walls, locks, cameras, paper forms) is a material expression of the same requirement the hearsay rule articulates through doctrine: evidence must be traceable through every stage of its custody.
Provenance in computational systems is weaker by design. When a retrieval-augmented system pulls a text passage from a database and injects it into a language model's context window, the passage enters as a sequence of tokens. The context window, the buffer where the model assembles its inputs before generating a response, does not preserve the provenance of each token sequence. A passage from a government regulatory filing and a passage from a competitor's marketing copy arrive as adjacent text blocks, indistinguishable in format. The filing's authority (a regulatory agency with enforcement power, operating under statutory mandate) and the marketing copy's adversarial motivation (a competitor describing a rival's product for commercial advantage) are present in the original documents and absent from the context window.
What the model receives is a flat concatenation. Tokens from the filing sit next to tokens from the marketing copy with nothing to mark the boundary between a claim backed by enforcement power and a claim backed by a sales incentive. A developer monitoring the pipeline sees this plainly: the retrieval module's JSON log lists each chunk with a source URL, a similarity score, and a character offset, but the prompt assembled from those chunks carries none of that provenance into the generation step. The JSON log is metadata about the retrieval. The token stream is the evidence the model acts on. Between the two, the custody is severed. The model processes the combined stream and generates a synthesis that treats both sources as equivalent evidence. The context window is shared memory without access control: every claim enters with the same permissions, and no claim can be quarantined, challenged, or weighted on the basis of its origin.
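The severing described above can be sketched in a few lines. This is an illustration under stated assumptions, not any particular framework's pipeline: the `Chunk` fields mirror the log entries the passage mentions (source URL, similarity score, character offset), while the `assemble_prompt` function, the example URLs, and the passage texts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """One retrieved passage, as a (hypothetical) retrieval log records it."""
    text: str
    source_url: str
    similarity: float
    char_offset: int


def assemble_prompt(question: str, chunks: list[Chunk]) -> str:
    """Build the prompt as a flat concatenation: text only, no provenance."""
    context = "\n\n".join(chunk.text for chunk in chunks)
    return f"{context}\n\nQuestion: {question}"


chunks = [
    Chunk("The device failed two of five safety inspections.",
          "https://regulator.example/filings/123", 0.91, 4096),
    Chunk("The device is widely known to be unsafe.",
          "https://competitor.example/why-switch", 0.89, 212),
]

prompt = assemble_prompt("Is the device safe?", chunks)

# The log still knows where each passage came from...
logged_sources = [chunk.source_url for chunk in chunks]
# ...but none of those URLs survive into the text the model actually reads.
provenance_in_prompt = [url for url in logged_sources if url in prompt]
print(provenance_in_prompt)  # []
```

Writing the URL into the prompt would carry a label across the seam, though a label alone does not tell the model how much weight a regulator's filing deserves against a competitor's copy.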
The second is integrity: the assurance that the evidence has not been altered between its production and its presentation. The sealed evidence bag, the initialed tape, the sequential numbering of the chain-of-custody form: each is a physical mechanism designed to make unauthorized alteration detectable. In the digital domain, cryptographic hash functions serve the same structural purpose. A hash of a document at the time of its creation produces a fixed-length fingerprint that changes unpredictably if even a single character in the original is modified. A recipient who possesses both the document and the original hash can verify integrity by recomputing and comparing. If the fingerprints match, the document has not been altered since the hash was created.
The mechanism is sound within its narrow scope. The problem is that integrity verification requires the hash to have been computed at the moment of creation, by an identified party, under documented conditions. A hash computed after the fact, by an unidentified system, stored in an unaudited log, provides no guarantee: the document may have been altered before the hash was computed, and the hash itself may have been generated to match the already-altered version. A database administrator who discovers that transaction logs were modified will find the current hashes perfectly consistent, because the hashes were recomputed after the modification. The integrity mechanism verified what was presented to it, which is not the same as verifying what originally existed. The sealed evidence bag has the same vulnerability: if the bag was opened and resealed by someone who knew how to replicate the tamper-evident tape, the visual inspection at the laboratory will show an intact seal. The integrity mechanism works only when the conditions of its application are themselves trustworthy. Integrity, like provenance, is only as reliable as the verification chain under which the mechanism was applied. The hash does not certify itself. Something must certify the hash.
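Both halves of this argument can be shown with Python's standard `hashlib` module. The log entry and the helper names `fingerprint` and `verify` are hypothetical, chosen for illustration. The sketch shows a hash recorded at creation catching an alteration, and a hash recomputed after the alteration passing verification.

```python
import hashlib


def fingerprint(document: bytes) -> str:
    """SHA-256 digest of the document, as a hex string."""
    return hashlib.sha256(document).hexdigest()


def verify(document: bytes, recorded_hash: str) -> bool:
    """True when the document matches the hash on record."""
    return fingerprint(document) == recorded_hash


original = b"2024-03-14 transfer 1,000.00 to account A"
hash_at_creation = fingerprint(original)

# Case 1: the hash was recorded at creation. A one-character change is caught.
tampered = b"2024-03-14 transfer 9,000.00 to account A"
caught = not verify(tampered, hash_at_creation)

# Case 2: whoever altered the log also recomputed the hash afterwards.
# Verification now passes, because it only compares what it is shown.
recomputed_hash = fingerprint(tampered)
passes_anyway = verify(tampered, recomputed_hash)

print(caught, passes_anyway)  # True True
```

Nothing in `verify` can distinguish the two cases. The difference lies in who recorded the hash, when, and where it was kept, which is the custody question restated.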
The third is context: the circumstances under which the original assertion was produced, and the scope within which it may be relied upon. This is the most demanding of the three requirements and the one that computational systems handle least well.
"New York City" appears in three records. In a shipping manifest filed with the United States Postal Service, it designates the five-borough municipality incorporated in 1898, with ZIP codes spanning Manhattan through Staten Island, a single legal entity for purposes of mail delivery and address validation. In a real estate listing on a Manhattan-focused brokerage platform, "New York City" means something narrower: the island of Manhattan, the market where a two-bedroom apartment trades at four thousand dollars per square foot, the borough that real estate agents invoke when they want the word's highest-price connotations. In a tourist review on a travel platform, "New York City" designates the experience: Times Square, Central Park, the Statue of Liberty, the subway at rush hour, an entity that overlaps with the municipality but carries different boundaries and different practical referents.
Three records sharing one string, each produced under different conventions, each carrying a distinct referent. A fourth record, a property tax assessment filed by the City of New York's Department of Finance, uses "New York City" in yet another sense: the taxing jurisdiction, which includes all five boroughs but subdivides them into tax classes that cross-cut the postal, real estate, and tourist boundaries. The string is identical in every case. What differs is the context that produced it: the postal system's administrative categories, the real estate market's pricing geography, the tourism industry's experiential mapping. A system that merges these records without tracking context will produce results that silently conflate a postal address with a real estate market with a tourist destination. A user searching for Manhattan apartments will see listings from the Bronx. A user searching for Brooklyn properties will see nothing, because the Manhattan-focused source never uses "New York City" to include Brooklyn. Both failures are silent. Both trace to the same cause: the context under which each record produced the string has been stripped away, and the merged record carries the content without the provenance.
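A small sketch makes the silent conflation visible. The three records, their identifiers, and the `source` field are hypothetical. The second grouping function shows one possible way of keeping the producing convention attached to the string and is not a claim about how any real platform does it.

```python
from collections import defaultdict

# Hypothetical records: one identical string, three producing conventions.
records = [
    {"id": "r1", "place": "New York City", "source": "postal", "borough": "Bronx"},
    {"id": "r2", "place": "New York City", "source": "brokerage", "borough": "Manhattan"},
    {"id": "r3", "place": "New York City", "source": "travel", "borough": None},
]


def group_by_string(rows):
    """Merge on the bare string, discarding the convention that produced it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["place"]].append(row["id"])
    return dict(groups)


def group_by_string_and_source(rows):
    """Keep the producing source in the key so referents stay separate."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["place"], row["source"])].append(row["id"])
    return dict(groups)


def manhattan_search(rows):
    """A search that reads the string the way the brokerage does (Manhattan)."""
    return [row["id"] for row in rows if row["place"] == "New York City"]


flat = group_by_string(records)
scoped = group_by_string_and_source(records)
hits = manhattan_search(records)

print(flat)         # {'New York City': ['r1', 'r2', 'r3']}
print(len(scoped))  # 3
print(hits)         # ['r1', 'r2', 'r3'], including the Bronx postal record
```

The flat merge raises no error, which is the point: the string comparison succeeded, and the referents were never compared at all.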
The courtroom's three requirements (provenance, integrity, context) compose into a single demand: the evidence must travel with enough backing that a recipient can evaluate its reliability without access to the person who originally produced it. The hearsay rule's exclusions apply when this demand cannot be met. The hearsay exceptions apply when specific circumstantial or procedural safeguards substitute for the missing backing. The demand concerns the relationship between the evidence and the apparatus that produced it, indifferent to the medium (parchment, testimony, tokens) in which the evidence is encoded.
What strings forfeit, what tables forfeit
The sentence "This garment is sustainably produced" begins its life in a supplier's product catalog in Dhaka, where it carries an implicit authority: the supplier manufactured the garment, obtained a certification from a third-party auditor, and stands behind the claim. A retailer in London ingests the catalog and displays the sentence in a product listing, adding a layer of curatorial endorsement: someone selected this supplier and approved its catalog. A recommendation engine in a data center in Virginia indexes the listing and stores the sentence as a feature in a vector database, where it becomes a data point influencing ranking algorithms without any mechanism for identifying who asserted it or under what authority. A customer-facing chatbot retrieves the sentence from the vector database and presents it as part of a response to a consumer's question in Toronto: "Is this dress sustainably made?" The chatbot answers: "Yes, this garment is sustainably produced."
The words traveled intact. The authority behind them did not. The supplier's commercial responsibility, the retailer's curatorial judgment, the vector database's statistical provenance: all were present at some point in the chain and absent at the end of it.
The chain does not terminate at the chatbot. The consumer in Toronto reads the chatbot's answer and makes a purchasing decision. The retailer in London records the sale as a "sustainability-driven purchase" in its analytics database. The analytics feed a quarterly report to investors claiming that sustainable product lines grew by eighteen percent. The investor report becomes a regulatory disclosure. At no point in this extended chain does anyone re-examine whether the original supplier's claim was accurate, because every intermediate system treated the claim as established fact on the grounds that the system before it had done the same. The original assertion has traveled from a factory in Dhaka to a regulatory filing, and the custody was lost at the first handoff. Everything downstream was hearsay. The chatbot's answer is Bentham's hearsay problem at machine speed: the system "entrenches itself in the simple assertion that it was told so, and leaves the burden entirely upon the absent, and perhaps unknown author." The author, in this case, is a supplier in Dhaka whose identity the chatbot does not know, whose reliability it cannot assess, and whose commercial incentive to exaggerate it cannot detect.
A table loses context at the point of composition. The customer-12345 problem from the preceding chapter illustrates the mechanics: two databases assign the same integer key to different people, and the join on the shared key produces a single record that conflates a wholesale buyer in Antwerp with a retail consumer in Lyon. The join algorithm executed correctly, the key comparison operated as designed, and the schema's constraints were satisfied in both source tables and in the merged result. What was forfeited was the context: the two systems used the same identifier space but operated under different identity conventions, and the join could not detect the divergence because the divergence was encoded in the practices surrounding the data, not in the data itself.
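The mechanics of the collision fit in a short sketch. The two dictionaries stand in for the two databases. The field names and values (`credit_limit_eur`, `loyalty_tier`) are invented for illustration, and qualifying each key with its issuing system is one possible remedy, assumed here and not taken from the text.

```python
# Hypothetical source systems that both happen to issue the integer key 12345.
wholesale_db = {
    12345: {"segment": "wholesale buyer", "city": "Antwerp", "credit_limit_eur": 50000}
}
retail_db = {
    12345: {"segment": "retail consumer", "city": "Lyon", "loyalty_tier": "silver"}
}


def naive_join(left, right):
    """Join on the bare key; on a field clash the right-hand value wins."""
    return {key: {**left[key], **right[key]} for key in left.keys() & right.keys()}


def namespaced_union(**systems):
    """Qualify each key with the system that issued it, so nothing collides."""
    return {
        (system_name, key): record
        for system_name, table in systems.items()
        for key, record in table.items()
    }


conflated = naive_join(wholesale_db, retail_db)
separate = namespaced_union(wholesale=wholesale_db, retail=retail_db)

# One "customer": a retail consumer in Lyon holding a wholesale credit limit.
print(conflated[12345])
# Two customers, each still tied to the system that identified them.
print(sorted(separate))  # [('retail', 12345), ('wholesale', 12345)]
```

The naive join satisfies every constraint it was given. The namespaced version works only because someone outside the data knew that the two systems issued keys independently.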
The most revealing failure mode is the one that neither empire can detect: the impossible query. Ask a large language model to list all integers between one and one hundred that are both prime and divisible by four. The model will produce a list (2, 4, 12, 20, 28 is representative of the kind of output produced; actual responses vary across models and runs, but reliably include non-primes), delivered with the syntactic confidence of a correct answer. Every number on the list is wrong. No prime greater than two can be divisible by four, and two itself is not divisible by four. The query is impossible. The constraints are jointly unsatisfiable, and the correct response is a refusal: a structured explanation of why the constraints conflict, which constraint eliminates which candidates, and how the user might reformulate the query to produce a satisfiable alternative.
The string-based system cannot refuse because it has no mechanism for detecting that the constraints conflict. Its training rewarded responsive, helpful output, and producing a confident-sounding list of numbers is more responsive than explaining an impossibility. The table-based system can sometimes detect the conflict (a well-designed database will reject an insertion that violates a declared constraint), but only within the boundaries of a single schema. A query that combines constraints from different schemas, different institutions, different conceptual domains may satisfy each locally and make no sense collectively. The mathematical fact that primality and divisibility by four are mutually exclusive is the clearest case of a general pattern: constraints that hold within one domain may contradict constraints from another, and neither strings nor tables carry the mechanism for detecting cross-domain contradiction.
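For this particular query the conflict is easy to detect mechanically, which makes the contrast with a generated list sharper. The sketch below is a brute-force check over a finite range, not a general satisfiability procedure. The function name `answer_or_refuse` and the shape of the refusal record are assumptions made for illustration.

```python
import math


def is_prime(n: int) -> bool:
    """Trial division; adequate for the small range used here."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, math.isqrt(n) + 1))


def answer_or_refuse(low: int, high: int, constraints: dict) -> dict:
    """Return matching integers, or a refusal that documents the conflict."""
    candidates = range(low, high + 1)
    matches = [n for n in candidates if all(test(n) for test in constraints.values())]
    if matches:
        return {"status": "answered", "values": matches}
    return {
        "status": "refused",
        "reason": "no integer in range satisfies every constraint at once",
        # How many candidates each constraint admits when applied alone.
        "satisfying_alone": {
            name: sum(1 for n in candidates if test(n))
            for name, test in constraints.items()
        },
    }


constraints = {
    "prime": is_prime,
    "divisible by four": lambda n: n % 4 == 0,
}
result = answer_or_refuse(1, 100, constraints)
print(result["status"])            # refused
print(result["satisfying_alone"])  # {'prime': 25, 'divisible by four': 25}
```

Each constraint admits twenty-five candidates on its own and none together. A reformulated query, such as prime and even, returns the single value 2.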
Refusal is itself a form of custody.
A system that can explain why it cannot answer, that produces an artifact documenting the conflicting constraints, the chain of reasoning that leads to impossibility, and the limits within which a reformulated query might succeed, is a system that maintains awareness of its own limits. The distinction between "I have the answer" and "I have generated a string that resembles an answer" is the custody distinction. A witness who says "I don't know" under cross-examination is more trustworthy than a witness who fabricates a confident response to a question she cannot answer. A system that refuses an impossible query is more trustworthy than a system that produces plausible-sounding nonsense. In both cases, the reliability depends on the backing: the oath that binds the witness to truthfulness, the constraint system that binds the computational agent to consistency. Without that backing, the output is hearsay: assertions offered without the safeguards under which their reliability can be assessed.
The metadata objection
The most natural response to the custody problem is to attach more metadata. Tag every retrieved document with its source URL. Log every database join with the schemas that produced it. Append provenance headers to every API response. Stamp every claim with a timestamp, an author identifier, a confidence score, a source trail. The engineering is straightforward. The standards exist. PROV-O, a W3C recommendation, defines a model for provenance information: entities, activities, agents, and the derivation relationships among them (Lebo, Sahoo, and McGuinness 2013). Dublin Core provides fifteen elements for describing digital resources (Dublin Core Metadata Initiative 2020). Schema.org offers a vocabulary of types and properties intended to cover any web content. These are real engineering artifacts, developed by competent communities, deployed at scale. A system that implements PROV-O to track the provenance of every claim it processes is better than a system that tracks nothing.
The objection deserves serious engagement, because it captures a real insight: provenance information should be recorded, and better tooling for recording it would improve the reliability of every system that handles evidence. The question is whether metadata is sufficient: whether attaching descriptive tags to claims solves the custody problem or relocates it.
The relocation is visible the moment you ask who verifies the metadata. A source tag that says "Supplier A" is useful only if the identity of Supplier A is stable across the systems that process the record, if the tag was applied by a process with access to the original source, and if the tag has not been modified in transit. Each of these conditions is itself a claim that requires verification. The tag is metadata about the content. The verification of the tag is metadata about the metadata. The chain of metadata descriptions never reaches a self-certifying terminus. It bottoms out, as it always has, at an institutional structure: an organization, a protocol, a person with authority who takes responsibility and faces consequences if the assertion proves false.
Consider the timestamp. A provenance system stamps each claim with the time it was ingested. Ingestion time records arrival; production time records origin. The two can diverge by months or years. A supplier's catalog entry may have been written six months before ingestion into the retailer's system. A legal holding may have been issued three years before it was indexed by the research platform. The ingestion timestamp tells the system when it first saw the claim, not when the claim was made. To record the production timestamp, the provenance system must trust the source's own declaration about when the source produced the claim, which is itself an assertion requiring the same custody guarantees the provenance system was built to provide.
The medieval notary confronted the same recursion and solved it institutionally. The notary's seal on the register was not a descriptive tag attached to the document after the fact. It was a witnessed assertion: the notary had been commissioned by a recognized authority, had verified the identities of the contracting parties, had confirmed that the transaction described in the register had occurred in his presence, and staked his professional standing, his publica fides, on the accuracy of the record. The seal worked because it represented a formal commitment backed by personal consequences. A forged seal was not a data quality error; it was a criminal act, prosecutable in the courts that recognized the notary's commission. The tag says "this claim came from Supplier A." The seal says "I, the notary, witnessed this transaction, and I will be held accountable if the record is false."
Metadata describes; institutional structure vouches. The description is indispensable for organizing, searching, and filtering claims. But the custody problem is a reliability problem, not an organizational one: can the recipient of the evidence rely upon it in the absence of the person who produced it? The answer bottoms out at the apparatus under which the evidence was produced and whether that apparatus provides adequate guarantees. The hearsay rule does not ask whether the out-of-court statement is well-labeled. It asks whether the circumstances of its production provide sufficient assurance of trustworthiness to substitute for the absent declarant's live testimony.
What the courtroom already knew
The hearsay rule's exceptions are not a miscellaneous collection of carve-outs accumulated through centuries of case-by-case adjudication. They have a structure (Wigmore 1904, §§ 1420–1427), and it maps onto the same five requirements that the bill of exchange evolved through commercial practice. The oath serves the function the signature served: binding the declarant to the assertion. The business-records exception enforces the same temporal discipline the usance enforced. Mortal stakes in a dying declaration play the role endorser liability played in the commercial instrument. Cross-examination and the protest procedure are both mechanisms for challenge. The chain of custody and the endorsement chain are both sequential, auditable records of every hand that touched the evidence. What is new is that the legal system arrived at these requirements from an entirely independent tradition.
The first requirement is binding. The hearsay rule's central concern is the absence of oath and cross-examination, mechanisms that bind the declarant to the assertion. The oath creates a formal commitment between the speaker and the truth of the statement. The penalty for perjury makes the commitment enforceable. When the oath is absent, when the statement was made in a letter, in a conversation, in a casual remark, the binding is weakened, and the hearsay rule presumes the evidence unreliable. The exception for former testimony applies precisely when the declarant "testified about it under oath" and was "subject to cross-examination" (Courts 2023, Rule 804(b)(1)), when the binding mechanisms were operational even though the declarant is now unavailable. The binding survived the speaker's absence because it was created by the oath at the time the testimony was given, preserved in the transcript, and enforceable through the court's authority.
The second is conditions. Every hearsay exception specifies the circumstances under which the statement was produced, and treats those circumstances as reliability guarantees. The "present sense impression" exception admits statements made "while or immediately after the declarant perceived" the event described (Courts 2023, Rule 803(1)). Temporal proximity limits the declarant's opportunity to fabricate or forget. The "excited utterance" exception admits statements made "while the declarant was under the stress of excitement" caused by a startling event (Courts 2023, Rule 803(2)); the emotional intensity serves as a circumstantial guarantor, because a person in extremis reports rather than constructs.
Stakes — the third requirement — are visible in two of the most venerable exceptions. The dying declaration admits statements made by a declarant who "believed the declarant's death to be imminent." A person facing death has no future in which to benefit from deception; the condition of mortal peril serves as a guarantor of sincerity. The statement against interest admits assertions "so contrary to the declarant's proprietary or pecuniary interest" that no reasonable person would have made them unless they were true. In both cases, the declarant's exposure — to death, to financial loss — substitutes for the courtroom's formal safeguards by creating circumstances under which falsehood is unlikely.
The fourth requirement is recourse. The hearsay rule exists because cross-examination is unavailable when the declarant is absent. The exceptions compensate by specifying circumstances that make challenge less necessary, but the rule's baseline assumption is that testimony without the possibility of challenge is procedurally deficient. The exception for former testimony applies precisely because the cross-examination already occurred in an earlier proceeding, and the result is preserved in the transcript.
Composition, the fifth requirement, is the legal system's mechanism for evidence that passes through multiple hands. When a blood sample moves from the crime scene to the laboratory to the courtroom, each transfer is documented, each custodian identified, each handoff signed. The chain composes: each link vouches for the integrity of the transfer it controlled, and the complete chain is inspectable as a whole. A break in any single link compromises the evidentiary value of the entire chain, because downstream custodians cannot certify what happened during the gap. The composition is explicit, sequential, and auditable.
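The composition rule described above is simple enough to state as a check. The sketch below assumes a deliberately reduced model in which a handoff is just a releasing custodian, a receiving custodian, and a flag for whether the transfer was signed. The names Handoff and first_break and the custodians in the example are hypothetical, and a real chain of custody records far more than this.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Handoff:
    """One documented transfer of the evidence (simplified, hypothetical model)."""

    released_by: str
    received_by: str
    signed: bool


def first_break(chain: List[Handoff]) -> Optional[int]:
    """Return the index of the first handoff that fails to compose, or None.

    A handoff fails if it is unsigned, or if the custodian releasing the
    evidence is not the custodian who received it in the previous handoff.
    """
    for i, link in enumerate(chain):
        if not link.signed:
            return i
        if i > 0 and chain[i - 1].received_by != link.released_by:
            return i
    return None


intact = [
    Handoff("scene officer", "evidence room", signed=True),
    Handoff("evidence room", "laboratory", signed=True),
    Handoff("laboratory", "courtroom", signed=True),
]

# The evidence room's release to the laboratory was never documented, so the
# laboratory's later handoff has no accounted-for source.
gapped = [
    Handoff("scene officer", "evidence room", signed=True),
    Handoff("laboratory", "courtroom", signed=True),
]

print(first_break(intact))  # None: every link vouches for its own transfer
print(first_break(gapped))  # 1: custody is unaccounted for before this handoff
```

The check only inspects the record of the chain. It can show where the documentation stops composing, not what happened to the evidence during the gap, which is why everything downstream of the break loses its evidentiary value.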
Five requirements. The legal system arrived at each through centuries of case-by-case adjudication over what makes evidence reliable when the person who produced it is absent. The Advisory Committee that drafted the Federal Rules identified three core conditions (oath, personal presence of the trier of fact, cross-examination) that each exception must functionally replace through "circumstantial guarantees of trustworthiness" (Courts 2023, Advisory Committee Note, Art. VIII). The commercial system arrived at the same five through centuries of practice over what makes a financial instrument reliable when the person who issued it is in another city. Legal scholars have acknowledged that no single coherent theory explains every exception; some rest on historical accident or the mere absence of motive to falsify (Morgan 1948). But the convergence with an independently evolved commercial tradition suggests the latent logic runs deeper than any single tradition's self-account. These requirements are what any system, parchment or computational, medieval or modern, must provide if it is to compose claims from different sources with known reliability.
The structure that remains
The string empire produces evidence that anyone can generate and no one can vouch for. The table empire produces evidence that is vouched for within its schema and uninterpretable outside it. The hybrids that modern systems construct from both inherit the incapacities of each, and the seam between them is where evidence loses the backing that makes it reliable.
The loss is architectural. A string enters a context window and loses its provenance because the architecture treats all tokens as equivalent input regardless of their origin. A table joins with another table and loses cross-boundary context because the join algorithm operates on shared keys, and keys carry identity without carrying the conventions that define what identity means. The seam between the empires is governed by neither empire's rules, and no amount of metadata, logging, or post-hoc annotation can substitute for safeguards that were never present in the computational medium.
What the courtroom knew — what the Lex Mercatoria evolved across centuries of commercial practice, what the notarial system codified in its registers and seals, what the hearsay rule articulated in its exceptions — is that evidence crossing a trust boundary requires a specific set of guarantees. Those guarantees have been named, tested, and refined across seven hundred years of legal practice and three centuries of commercial innovation. They are not mysterious. The binding that the oath provides, the conditions that the business-records exception specifies, the stakes that endorser liability creates, the recourse that cross-examination enables, the composition that the chain of custody documents: these are engineering requirements, derived from centuries of practice, applicable to any medium that carries claims across boundaries.
The bill of exchange required three centuries and four stages of evolution to develop from a notarial deed that could not travel without the notary's physical presence to a negotiable instrument that composed across holders and jurisdictions. The database migration requires months of coordination to add a single column to a schema. Both timelines reflect the same reality: changing the form of record requires coordination that scales with the number of downstream systems that depend on the form.
Consider what it would mean for a database join to carry, alongside the matching rows, the terms under which the match was made — the schemas of both source tables, the conventions that defined the shared key, and a mechanism by which a downstream consumer could challenge the match if the conventions diverged. The customer-12345 collision would become, instead of a silent conflation, a formal dispute: the join would produce the merged record and the evidence that the two identity conventions are incompatible, and the system would halt or flag rather than propagate a falsehood. The sustainability claim that traveled from Dhaka to a regulatory filing would carry, at each handoff, a record of who asserted it, under what authority, and what recourse a downstream consumer has if the assertion proves false — the computational equivalent of the endorsement chain on the back of a bill of exchange, each signature visible, each signer accountable.
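One way to picture such a join is as a function that returns more than rows. The following is a sketch under strong simplifying assumptions: each table declares its key convention as a plain string, two conventions are treated as compatible only if the strings match exactly, and the system flags disputes rather than halting. The type names, the convention labels, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class KeyedTable:
    name: str
    key_convention: str  # hypothetical label for what a key in this table identifies
    rows: Dict[str, dict]


@dataclass(frozen=True)
class JoinResult:
    rows: Dict[str, dict]   # matched rows, each side kept under its table name
    terms: Dict[str, str]   # the convention each source declared for the key
    disputes: List[str]     # one entry per match made under conflicting terms

    @property
    def ok(self) -> bool:
        return not self.disputes


def join_with_terms(left: KeyedTable, right: KeyedTable) -> JoinResult:
    """Join on the shared key, carrying the terms of the match with the result."""
    terms = {left.name: left.key_convention, right.name: right.key_convention}
    shared = sorted(left.rows.keys() & right.rows.keys())
    rows = {k: {left.name: left.rows[k], right.name: right.rows[k]} for k in shared}
    disputes: List[str] = []
    if left.key_convention != right.key_convention:
        disputes = [
            f"key {k!r}: {left.name} means {left.key_convention!r}, "
            f"{right.name} means {right.key_convention!r}"
            for k in shared
        ]
    return JoinResult(rows=rows, terms=terms, disputes=disputes)


retail = KeyedTable("retail", "retail account number", {"12345": {"city": "Leeds"}})
wholesale = KeyedTable("wholesale", "wholesale account number", {"12345": {"terms": "net 30"}})

result = join_with_terms(retail, wholesale)
print(result.ok)        # False: the match exists, but under incompatible terms
print(result.disputes)
```

A downstream consumer that checks ok before using rows has the material for a challenge: the merged record, the terms each side declared, and the specific keys in dispute. What the sketch cannot supply is any reason to trust the declared conventions themselves, which remains the custody problem.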
The requirements have been identified. The courtroom confirmed them. The bill of exchange embodied them. And whoever provides them in the computational medium will occupy the position the notary occupied in the commercial one, collecting the trust tax that Chapter 3 named, unless the guarantees can be made structural rather than custodial. "Dispensing with confrontation because testimony is obviously reliable," Justice Scalia wrote in the decision that gave the Raleigh precedent its modern force, "is akin to dispensing with jury trial because a defendant is obviously guilty" (States 2004, 541 U.S. at 62). Reliability is not a property of the evidence. It is a product of the apparatus under which the evidence was produced. Whether that apparatus can be built into a computational medium (whether protocol and data structure can provide what parchment and seal provided through convention, at a cost that makes large-scale composition practical) is the question the rest of the volume turns on. The medieval merchant who needed a bill to travel from Florence to Bruges could not wait for computation. The modern system that needs evidence to compose across databases, language models, and trust boundaries cannot wait for another three centuries of evolution.