Chapter 13 — Limitations, Risks, and Open Challenges

Published on: 2026-03-15 | Last updated on: 2026-06-08 | Version: 4

This is the thirteenth post in the chapter-by-chapter walkthrough of LLM Primer II: Language Models Through Mathematics. Chapter 13 takes the math and turns it toward the ceilings, the ones that cannot be raised by more compute alone.


The chapter that takes the math seriously, the other way

Most chapters in this book have used math to explain how something works. Chapter 13 uses math to explain why some things cannot keep working — or cannot keep improving at the current rate.

This is, in my experience, the chapter that surprises readers. They expect the limitations section to be the box-ticking ethics section of an AI book. It isn't. It's a quantitative argument about what scale is going to cost, where bias enters the math, and which problems the math demonstrably cannot solve on its own.

13.1 Model size, compute cost, and energy constraints

Section 13.1 puts numbers on the question "can we just keep scaling?"

The chapter walks through the cost arithmetic. Training a frontier model today costs nine figures in compute alone. The scaling laws from Chapter 8 predict that each additional order of magnitude of capability requires roughly two orders of magnitude more compute. Add two more orders of magnitude of compute after that, and you are talking about training costs that exceed the gross domestic product of small countries.
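The cost argument reduces to a power law, and a few lines of arithmetic make it concrete. This is a minimal sketch, not the book's own calculation. The $100 million baseline is just one illustrative nine-figure number, the exponent of 2 encodes "one order of magnitude of capability costs two orders of magnitude of compute", and the assumption that dollars scale linearly with compute is mine.

```python
def projected_cost(base_cost_usd: float, capability_gain: float,
                   compute_exponent: float = 2.0) -> float:
    """Training cost after multiplying capability by `capability_gain`.

    Assumes compute grows as capability_gain ** compute_exponent and that
    cost is proportional to compute (both are illustrative assumptions).
    """
    return base_cost_usd * capability_gain ** compute_exponent


BASE_COST_USD = 1e8  # hypothetical nine-figure baseline

for gain in (1, 10, 100):
    cost = projected_cost(BASE_COST_USD, gain)
    print(f"{gain:>4}x capability -> ${cost:,.0f}")
```

Under these assumptions the second step (100x capability, 10,000x compute) lands at a trillion dollars, which is the scale at which the comparison to national GDP starts to apply.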

Energy is the same story translated into joules. The chapter does the arithmetic. A modern data center training one large model consumes power comparable to a small city. Doubling the model size at least doubles the consumption. There is a limit to how much electricity can be physically generated and routed to a single training run, and the leading labs are already touching it.
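The energy version of the argument is just power multiplied by time. The sketch below uses hypothetical figures (a 30 MW sustained draw over a 90-day run) rather than the book's estimates, and it assumes the run length stays fixed while power draw scales with model size. The `exponent` parameter is my own device for the "or worse" case: 1.0 means consumption doubles when the model doubles, anything above 1.0 means it more than doubles.

```python
HOURS_PER_DAY = 24


def training_energy_mwh(power_mw: float, days: float) -> float:
    """Energy in megawatt-hours for a run drawing `power_mw` for `days`."""
    return power_mw * HOURS_PER_DAY * days


def scaled_power_mw(power_mw: float, size_factor: float,
                    exponent: float = 1.0) -> float:
    """Power draw after scaling model size by `size_factor`.

    exponent = 1.0 is linear scaling; exponent > 1.0 models "or worse".
    """
    return power_mw * size_factor ** exponent


# Hypothetical run: 30 MW sustained for 90 days.
baseline = training_energy_mwh(30.0, 90.0)
doubled = training_energy_mwh(scaled_power_mw(30.0, 2.0), 90.0)
print(f"baseline: {baseline:,.0f} MWh, doubled model: {doubled:,.0f} MWh")
```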

The section then asks the harder question: what do you do about it? The book is careful not to be glib here. It lays out the three families of responses — better algorithms (this is what Chapter 7 was about), better hardware (a topic with its own deep math), and changing the goalposts (deciding that "the next order of magnitude" isn't what we actually need next).

One line: the scaling laws say capability is a smooth function of compute. The energy grid says compute is not a smooth function of will. The two curves are about to meet.

13.2 Bias, ethics, and societal impact

Section 13.2 is the second half of the chapter, and the longer one. It treats bias as a mathematical phenomenon first — and the ethical conversation second, on top of that mathematical foundation.

The chapter shows where bias enters the pipeline. Training data is a sample of human text, and human text is itself an uneven sample of human writing: neither is uniform across populations, viewpoints, or eras. Maximum likelihood training (Chapter 3) makes the model reflect the empirical distribution of the data. If the data underrepresents a group, the model underrepresents that group. If the data systematically associates certain occupations with certain genders, the model reproduces that association. The math does not see this as bias. It sees it as correctly fitting the distribution it was given.
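A toy example shows why maximum likelihood cannot tell skew from signal. For a simple categorical model, the maximum likelihood estimate of P(word | context) is the relative frequency in the data, so whatever imbalance the corpus has is exactly what the model learns. The corpus below is invented for illustration, and `mle_conditional` is a hypothetical helper, not anything from the book.

```python
from collections import Counter


def mle_conditional(pairs):
    """Maximum likelihood estimate of P(word | context).

    For a categorical model this is the relative frequency of each
    (context, word) pair among all pairs sharing that context.
    """
    joint = Counter(pairs)
    context_totals = Counter(context for context, _ in pairs)
    return {(c, w): n / context_totals[c] for (c, w), n in joint.items()}


# Invented, deliberately skewed corpus of (occupation, pronoun) pairs.
corpus = (
    [("nurse", "she")] * 9 + [("nurse", "he")] * 1
    + [("engineer", "he")] * 8 + [("engineer", "she")] * 2
)

probs = mle_conditional(corpus)
for (context, word), p in sorted(probs.items()):
    print(f"P({word} | {context}) = {p:.2f}")
```

The estimator is doing its job perfectly here: the 9-to-1 split in the data becomes a 0.9 probability in the model, with no term anywhere that could flag it as a problem.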

The section then turns to mitigations. Data curation. Reweighting. Constrained decoding. RLHF with explicit fairness signals. Each one has costs in the math. None of them removes the underlying issue: the model is a mirror of the data it learned from, and the data was made by people.
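Reweighting is the easiest of these mitigations to put into numbers, including its cost. The sketch below assumes inverse-frequency weights over known group labels, which is one common scheme and not necessarily the one the book uses. The group labels and counts are hypothetical. The cost is measured with Kish's effective sample size, (sum of weights)^2 / (sum of squared weights), which shrinks as the weights become more uneven.

```python
from collections import Counter


def inverse_frequency_weights(groups):
    """Per-example weights so that every group carries equal total weight."""
    counts = Counter(groups)
    n, k = len(groups), len(counts)
    return [n / (k * counts[g]) for g in groups]


def effective_sample_size(weights):
    """Kish's effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


# Hypothetical dataset: 90 examples from group "a", 10 from group "b".
groups = ["a"] * 90 + ["b"] * 10
weights = inverse_frequency_weights(groups)

share_b = sum(w for w, g in zip(weights, groups) if g == "b") / sum(weights)
ess = effective_sample_size(weights)
print(f"weighted share of group b: {share_b:.2f}")
print(f"effective sample size: {ess:.1f} of {len(groups)}")
```

In this toy case the reweighted data is balanced across the two groups, but the 100 examples now carry roughly the statistical weight of 36. The imbalance has been traded for variance, not removed.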

The ethical conversation that sits on top of this — who decides what's biased, who is harmed by a given failure mode, who gets to set the constraints — is, the book says clearly, not a mathematical question. It is a political and social one. Pretending it is mathematical is itself a kind of error. The chapter is direct about this and does not try to wrap it in equations.

The closing pages cover broader societal impact: labor, copyright, misinformation, surveillance. The book treats each one with the same honesty: here is what the math implies, here is what is not in the math, here is who is making the decision.

Worth holding onto: the math of bias tells you precisely where it enters and what it costs to remove. It does not tell you what counts as bias in the first place. That distinction is the whole chapter.

The shape of this chapter, deliberately

Some readers expect this kind of chapter to either dismiss the concerns (with "the math doesn't care") or surrender to them (with "we should slow down"). The book does neither. It treats limitations the same way it has treated everything else in Part III and Part IV — with the math made explicit where math applies, and the boundary clearly marked where it doesn't.

If Chapter 8 was the chapter that admitted what theory cannot yet explain, Chapter 13 is the chapter that admits what theory could never explain on its own.

What Chapter 13 sets up

You finish Chapter 13 with a much more honest picture of what this technology can and cannot do, what it costs, who pays the costs, and which decisions are mathematical versus political. From here, the book closes with one practical chapter for engineers — the chapter that turns understanding into practice.


Next — Chapter 14: Practical Knowledge for Engineers. The closing chapter. How to keep deepening your understanding once the book is done. The tools, libraries, and habits that turn the math you now have into shipping work. And the bridge to the other books in the LLM Primer series — RAG, MCP, real-world applications, scaling, security — each a different door into the same machine.

Want the full picture? The book includes specific numerical estimates for training costs and energy consumption, traces several real-world bias failures back to the math that produced them, and lays out a clear-eyed assessment of which open problems are technical and which are social. View LLM Primer II on Amazon →

SHO
CTO of Receipt Roller Inc., he builds innovative AI solutions and writes to make large language models more understandable, sharing both practical uses and behind-the-scenes insights.