Chapter 4 — Client Primitives: Agentic Behaviors and Control

Published on: 2026-04-02 Last updated on: 2026-06-12 Version: 1

Chapter 4 — Client Primitives: Agentic Behaviors and Control

Fourth post of the chapter-by-chapter walkthrough of LLM Primer IV: Designing AI Cognition with MCP. Server primitives expose what the server can offer; client primitives expose what the host will lend back — and each loan is a capability the user grants and a risk the host accepts.


Why this chapter exists

A server that only exposes resources, prompts, and tools knows nothing about its host: not the model it runs, not the files it can see, not whether the user is even at the keyboard. For many integrations that wall is intentional. For whole categories of useful behavior it is too tall. A server that needs to summarize a long document should not have to ship its own LLM. A server that wants to operate on a project should know which project. A server that needs the user's permission for a destructive step should be able to ask. Client primitives are how MCP punches small, controlled holes through the wall.

The three primitives — Sampling, Roots, Elicitation — each extend the server's reach into the host's territory in a precise, negotiated way. Each is also a security surface the host accepts on behalf of its user, and each composes with the others to produce behavior that is more agentic than the sum of the parts.

One line: Sampling lets the server use the host's model, Roots tell the server what scope it is working in, and Elicitation lets the server ask the user a question — each a deliberate loan, each a deliberate risk.

4.1 Sampling: borrowing the host's brain

With sampling enabled, a server can ask the host to run an inference call and return the result. It ships no model, holds no API key, and never learns what model the host actually used — it sends messages plus soft preferences (cost, speed, intelligence), and the host runs the call. From the server's vantage the host has become a generic LLM endpoint.

The leverage is real. A document-store server can run small reasoning steps inside its own logic — retrieve, rank, compare — without leaking domain knowledge to the host or shipping inference itself. The risk is equally real. A malicious server can use the host's model — and the user's budget — to do work the user never approved. The classic attack is a sampling payload disguised as a user instruction. The host is the only line of defense, which is why mature hosts default to denying sampling unless the user has opted in for that specific server, surface every call's prompt for inspection, cap calls per session, and meter cost against the user's wallet rather than the server's.

A subtler concern: sampling can encapsulate an internal agentic loop. To the host it looks like one sampling call; inside the server, an entire sub-agent has run. The pattern worth naming is the bounded sub-agent — the user grants a budget of N calls, T seconds, X dollars; the server runs whatever it likes inside that envelope and returns either a result or a graceful partial. Sampling does not give the server access to the conversation either; it sees only what it sent, and a host that smuggles conversation context into sampling payloads has weakened the trust boundary regardless of intent.

4.2 Roots: filesystem boundaries and project scope

A root is a URI — usually file://, though the spec is general — that the host has declared to be in-scope. The server calls roots/list and asks "what am I working with?" The host advertises a roots capability at initialization and emits notifications/roots/list_changed when the user switches projects. The list is just a list; there is no nested permission language, no glob patterns, no include/exclude rules.

The coarseness is deliberate, and it reflects an MCP-wide design choice worth understanding. A fine-grained permission language at the protocol layer creates a false sense of safety: users read a long scope list and assume the protocol enforces it, when in fact the protocol cannot prevent the server from misbehaving in ways the language never anticipated. A coarse primitive backed by external isolation — process boundaries, container mounts, OS-level access control — is honest about where enforcement actually lives. Roots are an honor system at the protocol layer, made enforceable at the runtime layer when sandboxing is in play.

Two practical consequences. First, when roots change, well-behaved servers drop cached state scoped to the old root — indices, parsed ASTs, watched-file subscriptions — or they leak state across project boundaries. Second, roots bound what the server should look at, not what it is allowed to do. A sensitive file inside a granted root — a credential dump, an envfile, personal correspondence — is still sensitive, and a polite server still applies judgment about what it surfaces.

4.3 Elicitation: letting the server ask a question

Elicitation is the newest of the three and the one that reflects what designers learned from the first wave of agent deployments. The server emits elicitation/create with a message (the question) and a requestedSchema (the shape of the expected answer). The host renders the question in its UI, collects the answer, validates it, and returns it. The server resumes its work with the answer in hand.

The schema is what makes elicitation safer than free-form prompts. A boolean cannot be answered with prose; an enum cannot be answered with a fourth option. The host can show the user in clear UI that they are being asked for a yes/no, and refuse anything else. The spec deliberately restricts schemas to flat objects of primitives — for real forms, use a tool whose argument schema is the form, where the user can review the model's filled-in values before invocation. Elicitation is for the one or two field cases.

The risk profile is phishing-shaped: a server that surfaces "the system requires your AWS access key to continue" is trying to get the user to type a secret in the wrong place. Hosts mitigate with clear attribution — every elicitation prompt labeled with the originating server, separately from the assistant's voice — and by refusing schemas that look credential-shaped. The positive flip side is the destructive-tool confirmation pattern: a tool's first job is to elicit "are you sure?" and only on confirmation does it act. This is one of the most effective MCP hardening practices in production.

4.4 Composing the three

The primitives are designed to compose. A server with all three has, in effect, a complete agent runtime delegated to the host: it can read scope (roots), reason about it (sampling), ask clarifying questions (elicitation), and act through its own tools. The combinations matter. Sampling plus roots is an autonomous agent server; sampling plus elicitation is a conversational server with no filesystem reach; roots plus elicitation is a deterministic helper. Each on its own is bounded; together they multiply, and a host's consent UI is at its best when it can name the combination the user is granting rather than asking permission for each in isolation.

Worth holding onto: read the three client primitives as one question — how much of itself is the host willing to lend to the server? Sampling lends the model. Roots lend a scope. Elicitation lends the user's attention. Each loan is negotiated separately so consent can be deliberate, and the protocol's job is not to remove the risk but to make it legible.

What Chapter 4 sets up

Server primitives and client primitives between them describe everything the host and server can do to each other. What we have not yet addressed is how any of this travels over the wire. tools/list and sampling/createMessage are not just abstract messages — they ride on a transport, and the choice of transport quietly decides almost every operational property of an MCP integration. Servers also need to be findable: a host that wants to use a server has to know it exists, where it lives, and whether to trust the claim. Chapter 5 takes up both.


Next — Chapter 5: Transport Protocols and Discovery. The three transports MCP supports — stdio, SSE, Streamable HTTP — compared honestly, and the .well-known/mcp.json plus Server Card layer that turns point integrations into something that resembles an ecosystem.

Want the full picture? The book walks each primitive's full lifecycle, error model, and risk surface, treats the bounded sub-agent pattern in depth, and lays out the tiered-consent model that has emerged for hosts managing read, scoped, and agentic servers. View LLM Primer IV on Amazon →

SHO
SHO
CTO of Receipt Roller Inc., he builds innovative AI solutions and writes to make large language models more understandable, sharing both practical uses and behind-the-scenes insights.