LLM Primer IV — Designing AI Cognition with MCP: Series Introduction & Index

"Agents are only as good as the context they see, the tools they can reach, and the memory they carry." Welcome to Book IV in the LLM Primer series — and to the walkthrough that goes with it. Over the next fourteen days, one post per chapter, we'll open up the Model Context Protocol and the cognition layer it makes possible, and look at the decisions that determine whether an agent system quietly works or quietly fails.

Why Book IV exists

Books I, II, and III in this series gave you the model and the retrieval apparatus around it. Book I told the plain-language story of what LLMs are. Book II opened the mathematics underneath. Book III walked the production architecture of RAG. Book IV is about what surrounds a model once you try to make it act — to call tools, to maintain state across turns, to coordinate with other agents, and to do all of this without rewriting integration glue every quarter.

The pattern that broke in 2025 was the monolithic agent: a long system prompt, a handful of tools, a single context window asked to absorb every concern at once. It worked for demos. It frayed in production as prompts grew, tool surfaces expanded, and every new model release demanded another round of bespoke adapter code. The diagnosis converged from several angles — context dilution, instruction collision, the N times M integration matrix — and pointed at the same architectural answer: a protocol layer underneath the model that lets agents discover capabilities, negotiate sessions, and compose tools without either side knowing about the other in advance.

That layer is the Model Context Protocol. This book walks it honestly, layer by layer. The promise is not that MCP solves every agent problem. The promise is that, by the end, you will know what the protocol gives you, what it does not, and which patterns built on top of it survive contact with production.
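The N times M integration matrix mentioned above is easy to see with a little arithmetic. The sketch below assumes the usual framing: without a shared protocol every model-tool pair needs its own adapter, and with one each side implements the protocol once. The function names and the counts (4 models, 25 tools) are hypothetical, chosen only for illustration.

```python
def adapters_without_protocol(num_models: int, num_tools: int) -> int:
    """Each model needs a bespoke adapter for each tool: N * M."""
    return num_models * num_tools


def adapters_with_protocol(num_models: int, num_tools: int) -> int:
    """Each model and each tool implements the shared protocol once: N + M."""
    return num_models + num_tools


# Hypothetical counts, for illustration only.
models, tools = 4, 25
pairwise = adapters_without_protocol(models, tools)
shared = adapters_with_protocol(models, tools)
print(f"bespoke adapters: {pairwise}, protocol implementations: {shared}")

# Marginal cost of adopting one more model under each regime.
extra_pairwise = adapters_without_protocol(models + 1, tools) - pairwise
extra_shared = adapters_with_protocol(models + 1, tools) - shared
print(f"one more model costs {extra_pairwise} adapters versus {extra_shared}")
```

The second comparison is the one that hurts in practice: under the pairwise regime the cost of each new model release grows with the size of the tool surface, while under a shared protocol it stays constant.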

Book in one sentence: Agentic systems need a protocol that decouples models from tools, a discipline for budgeting attention and memory, and a security model that takes server provenance seriously — and MCP is the layer where all three meet.

Who I wrote this for

Engineers building agent systems, technical PMs scoping them, and architects who have to defend the choices to a security review. The book assumes the reader is comfortable with the Book I picture of how an LLM behaves and the Book III picture of how retrieval is wired in; it does not assume the Book II mathematics. The center of gravity is the engineering: where the failure modes live, which decisions are reversible, and which lock the team in for years.

How to read it

Three modes that have worked for early readers. Front-to-back, if you are about to start building an MCP-based agent system and want the protocol in the order the decisions actually arrive. As a reference, if you have a working system and a specific layer that is hurting — the transport chapter, the memory chapter, the security chapters all stand on their own. Or as a sidebar for the architecture review, where the chapters become the prompts for the conversation a team needs to have before committing to a deployment topology.

The 14-chapter walk

March 30 — Chapter 1: The AI Integration Crisis and the Rise of Agentic Architecture. Why monolithic agents fray, what the N times M integration problem is, and the move from prompt engineering to context engineering.

March 31 — Chapter 2: Unveiling the Model Context Protocol. What MCP standardizes, the three roles (Host, Client, Server), how dynamic discovery differs from REST, and the session lifecycle with capability negotiation.

April 1 — Chapter 3: Server Primitives: Exposing Context and Capabilities. Resources, Prompts, and Tools — the three nouns a server can offer, their schemas, their lifecycles, and the discipline of choosing the right primitive for each thing.

April 2 — Chapter 4: Client Primitives: Sampling, Roots, Elicitation. The inverse surface — what the host gives back to the server, and the security implications of every capability handed back across the trust boundary.

April 3 — Chapter 5: Transport and Discovery. stdio versus Streamable HTTP, when to choose each, and how servers and clients find each other in local and remote deployments.

April 4 — Chapter 6: Fundamental Orchestration Patterns. The agent loop, tool routing, intermediate state management, and the patterns that keep an agent's reasoning legible.

April 5 — Chapter 7: Advanced Orchestration Patterns. Planner-executor, multi-agent coordination, hierarchical decomposition, and where each pattern earns its complexity.

April 6 — Chapter 8: Deployment Layouts. Strict MCP purity versus Reusable AI Agents versus hybrid layouts, and the tradeoffs each forces on the host.

April 7 — Chapter 9: The Attention Budget. Context as a managed resource, the cost of long windows, and the policies that decide what enters the model's view each turn.

April 8 — Chapter 10: Long-Horizon Memory. Episodic versus semantic memory, summarization policies, and the architectures that let an agent carry state across days.

April 9 — Chapter 11: Attack Surfaces in MCP Systems. The threat model — prompt injection through resources, malicious servers, tool poisoning, exfiltration paths.

April 10 — Chapter 12: Protocol Hardening. Server cards, consent UI, capability scoping, and the operational controls that put policy where it belongs.

April 11 — Chapter 13: Frameworks and Cloud. The ecosystem around MCP — agent frameworks, hosted servers, registries — and how to pick what to build on.

April 12 — Chapter 14: Benchmarking Agents. The measurements that matter, the ones that mislead, and how to build an eval harness that survives a model change.

What's different about Book IV: Books I and II were about the model. Book III was about the retrieval apparatus around it. This one is about the cognition layer: the protocol, the orchestration, the memory, and the security that turn a model into something that can act. Most agent failures are not model failures. They are decisions made one layer up that no amount of prompt engineering can undo.

About this book and the series

The LLM Primer series is the long answer to the question I kept being asked by engineers, founders, and the occasional regulator: how do these systems actually work, and what does it take to build one that holds up under load? Book I gave the shape of it. Book II gave the mathematics. Book III gave the production architecture of RAG. Book IV gives the cognition layer that sits above the model. Book V, in progress, turns to building real-world LLM applications end to end.

Want the whole picture right now? LLM Primer IV: Designing AI Cognition with MCP is the book this walkthrough maps, with the full protocol reference, orchestration playbooks, security checklists, and deployment templates that the walkthrough only sketches. It is available on Amazon.

See you tomorrow, with Chapter 1.
