The Cost of Forgetting

Information is inevitably tied to a physical representation and therefore to restrictions and possibilities related to the laws of physics and the parts available in the universe.

Rolf Landauer, 'The Physical Nature of Information' (1996), abstract

Capital migrates toward whoever controls the binding constraint. In the emerging regime, that constraint is energy-structured-into-computation. Computation is a physical process, subject to physical limits, and those limits determine what the new regime can and cannot do.

The link between information theory and thermodynamics can be stated as a single inequality, published by Rolf Landauer in 1961 (Landauer 1961). The result is deceptively modest: erasing one bit of information requires dissipating at least kT \ln 2 joules of energy, where k is Boltzmann's constant and T is temperature. kT \ln 2. Four symbols. The floor beneath all computation. At room temperature this works out to about 2.87 \times 10^{-21} joules, a number so small it seems academic. But its implications are structural. It establishes a thermodynamic floor to irreversible computation, and that floor cannot be negotiated away by cleverness alone.

The bound applies not to "thinking" in general, but to the moments when a physical system is forced to forget. Computation, in its logical form, often involves operations that are logically irreversible: gates that discard information about which inputs produced a given output (Bennett 1982). An AND gate, for instance, maps four possible input states to two possible outputs; the information about which input produced the output is destroyed. This logical irreversibility must be physically compensated. The Second Law of Thermodynamics forbids the total entropy of a closed system from decreasing. When a computational system erases information, when it reduces the number of distinguishable microstates in the memory register, that entropy must go somewhere. It goes into the environment as heat.
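The entropy bookkeeping for the AND gate can be made concrete. A minimal sketch, assuming uniformly random inputs (the amount of information erased depends on the input distribution):

```python
from math import log, log2

k = 1.380649e-23  # Boltzmann's constant, J/K
T = 300.0         # room temperature, K

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Uniformly random inputs: four equally likely input states, 2 bits in.
h_in = entropy([0.25, 0.25, 0.25, 0.25])

# AND outputs 1 only for input 11, so P(out = 1) = 1/4: ~0.811 bits out.
h_out = entropy([0.75, 0.25])

bits_erased = h_in - h_out               # ~1.19 bits forgotten per gate
min_heat = bits_erased * k * T * log(2)  # Landauer minimum heat, joules
print(f"{bits_erased:.2f} bits erased, >= {min_heat:.2e} J to the environment")
```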

Landauer's limit is a thermodynamic law, as fundamental as the prohibition on perpetual motion.

At human-relevant scale: erasing one gigabyte (8.6 \times 10^9 bits) requires, at Landauer's minimum, approximately 2.5 \times 10^{-11} joules at room temperature, about 25 picojoules. A modern NVMe solid-state drive performing the same erasure dissipates on the order of microjoules, six orders of magnitude above the floor. The gap does not disappear with scale: from register-level operations to frontier training workloads, the ratio of engineering overhead to thermodynamic minimum is largely set by device physics rather than by the size of the computation.
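The arithmetic is short enough to check directly. A sketch, where the drive-energy figure is an assumed order of magnitude from the text, not a measurement:

```python
from math import log, log10

k = 1.380649e-23        # Boltzmann's constant, J/K
T = 300.0               # room temperature, K
e_bit = k * T * log(2)  # Landauer minimum per erased bit, ~2.87e-21 J

bits = 8.6e9            # one gigabyte: 2**30 bytes * 8 bits
e_floor = bits * e_bit  # ~2.5e-11 J, about 25 picojoules

e_ssd = 1e-5            # assumed NVMe erase energy, order of 10 microjoules
print(f"floor {e_floor:.1e} J, drive ~{e_ssd:.0e} J, "
      f"gap ~{log10(e_ssd / e_floor):.0f} orders of magnitude")
```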

Modern transistors operate roughly six orders of magnitude above the Landauer limit. A state-of-the-art processor dissipates on the order of 10^{-15} to 10^{-14} joules per switching event, depending on architecture and workload, while the Landauer minimum at room temperature is 10^{-21} joules per irreversible bit erasure (Lloyd 2000). The gap represents engineering overhead: resistance, capacitance, imperfect switching, clock distribution, all the accumulated realities of silicon at scale. The historical trajectory compressed that overhead from roughly 10^{-3} joules per operation in the vacuum tube era through 10^{-9} in early transistors to 10^{-15} in modern CMOS: a twelve-order-of-magnitude improvement, driven by exponential gains in architecture and fabrication (Dennard et al. 1974). Six orders of magnitude remain before the constant factor, kT \ln 2 (physics rather than engineering), dominates.

At system scale, the gap compounds. A single inference query to a frontier language model (prompt processing plus a few hundred generated tokens) requires on the order of 10^{13} floating-point operations. At Landauer's minimum, the irreversible component of that arithmetic costs less than a microjoule. Actual energy consumption for one such query, including memory hierarchy traversal, inter-GPU data movement, voltage regulation, and cooling, runs on the order of 10^3 to 10^4 joules. The ratio between thermodynamic floor and deployed practice is not six but roughly eleven orders of magnitude: six from transistor-level switching overhead, and five more from system architecture. In modern inference workloads, the energy cost of moving data between memory and processor dominates the energy cost of the computation itself.
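The same back-of-envelope arithmetic recovers the eleven-order figure. The inputs below are the order-of-magnitude assumptions from the text (one irreversible erasure per operation, a few kilojoules per query), not measurements:

```python
from math import log, log10

k, T = 1.380649e-23, 300.0
e_bit = k * T * log(2)   # ~2.87e-21 J per irreversible bit erasure

ops = 1e13               # assumed operations per frontier inference query
e_floor = ops * e_bit    # ~2.9e-8 J: well under a microjoule

e_query = 3e3            # assumed deployed energy per query, mid-range of
                         # the text's 10^3 to 10^4 joules
gap = log10(e_query / e_floor)
print(f"floor {e_floor:.1e} J, deployed ~{e_query:.0e} J, "
      f"gap ~{gap:.0f} orders of magnitude")
```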

The gap is both a ceiling and an opportunity, and the ceiling is descending. Dennard scaling kept efficiency gains on a stable trajectory into the mid-2000s, until leakage current broke the link between shrinking transistors and constant power density. The approach toward the Landauer floor has slowed, but it has not stopped.

Charles Bennett showed in 1982 that logically reversible computations can in principle be performed with arbitrarily low energy dissipation; entropy increases only when information is discarded (Bennett 1982). The tradeoff (storing all intermediate states and accepting slower circuits) is unfavorable for current applications, but the theoretical point stands: Landauer's limit applies to erasure, not to computation as such. A perfectly reversible computer could in principle compute with arbitrarily little energy. Real computers are not reversible, and so they pay the thermodynamic cost.
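Bennett's distinction can be illustrated with the Toffoli gate, the standard reversible building block: it writes AND(a, b) into a spare target bit while passing its inputs through unchanged, so no microstates are merged and the map is its own inverse. A minimal sketch:

```python
def toffoli(a, b, c):
    """Reversible gate: flips target c iff both controls a and b are 1."""
    return a, b, c ^ (a & b)

for a in (0, 1):
    for b in (0, 1):
        out = toffoli(a, b, 0)
        # With the target initialized to 0, the third wire carries AND(a, b),
        # but unlike a bare AND gate the inputs survive alongside it.
        assert out == (a, b, a & b)
        # Applying the gate again restores the original state: nothing is
        # erased, so Landauer's bound imposes no mandatory dissipation.
        assert toffoli(*out) == (a, b, 0)
print("Toffoli is its own inverse on all inputs")
```

The cost resurfaces elsewhere: the spare target bit must eventually be reset before it can be reused, and that reset is exactly the erasure Landauer's bound prices.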

Seth Lloyd extended the analysis to ask what maximum computational rate a physical system of fixed energy and size could achieve (Lloyd 2000). His answer, drawing on both quantum mechanics and thermodynamics, suggests approximately thirty orders of magnitude of theoretical headroom between current systems and the ultimate physical limit. But this headline number is misleading. The Lloyd limit assumes perfect utilization of all matter as computational substrate, a technology that remains speculative. Physics does not cap computation at human-relevant scales. Engineering and economics do. The thermodynamic limits are far away. The engineering limits are much closer. The economic limits (the cost of energy, the scarcity of fabrication capacity, the time required to build infrastructure) are binding now.

The Landauer and Lloyd limits matter for economics because they establish boundary conditions on a quantity that is becoming a primary factor of production. At room temperature, one joule of energy implies an upper bound of roughly 3.5 \times 10^{20} irreversible bit erasures, an astronomically large number. Today's binding constraints are engineering and delivery, not the thermodynamic floor. But the floor exists, and it cannot be repealed.

The constraint becomes binding under two conditions. First, energy scarcity: if total energy throughput is limited by resource constraints, climate policy, or cost, then computation faces a ceiling. Second, thermodynamic maturity: if devices approach Landauer efficiency, further improvement requires either more energy or lower temperatures. Neither condition is imminent, but both are visible on the trajectory. AI training runs now consume megawatts. If model capabilities continue to scale with compute, and compute continues to require energy, then the energy-computation-output linkage becomes a structural feature of the economy.
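The "lower temperatures" branch is quantitative: the floor kT \ln 2 scales linearly with T, so cooling moves it down directly. A sketch comparing room temperature with liquid-helium operation (ignoring the refrigeration energy needed to hold 4.2 K, which in practice claws back much of the gain):

```python
from math import log

k = 1.380649e-23  # Boltzmann's constant, J/K

def landauer_bound(temp_k):
    """Minimum energy in joules to erase one bit at temperature temp_k."""
    return k * temp_k * log(2)

room = landauer_bound(300.0)  # ~2.87e-21 J per bit
cryo = landauer_bound(4.2)    # liquid helium
print(f"300 K: {room:.2e} J/bit, 4.2 K: {cryo:.2e} J/bit, "
      f"floor lowered {room / cryo:.0f}x")
```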

Where compute is the bottleneck, advantage accrues to whoever controls the cheapest reliable joules delivered to silicon at scale. This is the economic translation of Landauer's principle. The physics sets the floor; the engineering determines how far above the floor we operate; the economics determines who can afford to operate at scale. A company with access to cheap, reliable power and efficient hardware has a structural advantage over one without.

The era of costless scaling has a physical endpoint. After that endpoint, more computation requires more energy — not more cleverness, not better algorithms, but more joules. The constraint may not bind today, but it will bind eventually, and the infrastructure decisions being made now will determine who can operate at scale when it does. The question is how much room remains between here and the floor, and what the economics of that room look like.