Chapter 5 — Transport Protocols and Discovery

Fifth post of the chapter-by-chapter walkthrough of LLM Primer IV: Designing AI Cognition with MCP. Every message in Chapters 3 and 4 has been floating in the air between host and server — but messages do not float, they ride on a transport, and the transport quietly decides nearly everything operational about an integration.

Why this chapter exists

MCP defines its message format — JSON-RPC 2.0 over a duplex channel — and then defines three transports that implement the channel. The transports are not interchangeable. Each fits a deployment pattern, carries different operational baggage, and exposes different security surfaces. A second question hangs over the same territory: how does a host find a server in the first place? Until the host knows a server exists and where it lives, the most beautifully specified protocol produces no conversations.

This chapter walks both. The transports decide whether the server is co-located or remote, whether you have authentication or process isolation, and how the server scales. The discovery layer decides whether your server is something one engineer uses or something a team can adopt.

One line: stdio is co-located trust, Streamable HTTP is networked authentication, SSE is the deprecated middle, and .well-known/mcp.json plus Server Cards are what turn a configuration problem into a lookup problem.

5.1 stdio, SSE, and Streamable HTTP

stdio is the host launching the server as a child process. The host writes JSON-RPC to the child's stdin and reads responses from stdout; stderr carries logs. No port, no socket, no network. Authentication is handled by the OS process-launch boundary. Claude Desktop, Cursor, and the official MCP inspector all use stdio for local servers. The model is honest about what it is: a tool on your own machine, co-located, in the same trust domain as the host. It cannot be shared across hosts, cannot run on a different machine, and one-per-host is the only multiplicity available — five clients wanting the same server burn five processes.
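To make the stdio shape concrete, here is a minimal sketch of a host launching a child process and exchanging one newline-delimited JSON-RPC message with it. The child is a stand-in written inline for illustration, not a real MCP server, and the names CHILD_SOURCE and call_over_stdio are hypothetical.

```python
import json
import subprocess
import sys

# Stand-in child "server" (hypothetical): answers each JSON-RPC line on stdout
# and writes a log line to stderr, mirroring the stdin/stdout/stderr split.
CHILD_SOURCE = r'''
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    print("handling " + request["method"], file=sys.stderr)
    reply = {"jsonrpc": "2.0", "id": request["id"], "result": {"echo": request["method"]}}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
'''


def call_over_stdio(command, request):
    """Launch the server as a child process, send one request, read one response."""
    child = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    try:
        # One JSON-RPC message per line; json.dumps emits no raw newlines.
        child.stdin.write(json.dumps(request) + "\n")
        child.stdin.flush()
        response = json.loads(child.stdout.readline())
    finally:
        # Closing stdin ends the child's read loop, so it exits on its own.
        child.stdin.close()
        logs = child.stderr.read()
        child.stdout.close()
        child.stderr.close()
        child.wait(timeout=5)
    return response, logs


if __name__ == "__main__":
    request = {"jsonrpc": "2.0", "id": 1, "method": "ping"}
    response, logs = call_over_stdio([sys.executable, "-c", CHILD_SOURCE], request)
    print(response)
    print(logs.strip())
```

Note what is absent: no port, no token, no TLS. The only access control is whoever is allowed to start the process, which is exactly the co-located trust the transport assumes.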

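HTTP+SSE was MCP's first answer to the network case. Bidirectional, but the duplex was bolted together from two unidirectional pieces: a Server-Sent Events stream for server-to-client messages and separate HTTP POSTs for client-to-server ones. Deprecated. Still in the wild. You will meet SSE-based servers for some time yet.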

Streamable HTTP is the preferred network transport. A single endpoint — typically /mcp — accepts JSON-RPC requests and replies either with a single response (for synchronous calls) or with a stream of Server-Sent Events when there are multiple messages, including server-initiated notifications. Sessions are identified by an Mcp-Session-Id header set at initialization and threaded through every subsequent request. The session is the unit of state, freeable by DELETE or reaped by timeout. The transport composes cleanly with load balancers, reverse proxies, edge caches, and TLS. Horizontal scaling is sticky-session over the session ID. None of this is novel HTTP engineering; the point is that MCP fits inside it without inventing a new layer.
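The session threading and the two reply shapes can be sketched without any network code. This is a client-side illustration only: the class and function names are hypothetical, and the SSE parsing handles just the data: lines needed here, not the full event-stream grammar.

```python
import json

SESSION_HEADER = "Mcp-Session-Id"


class StreamableHttpSession:
    """Tracks the session ID and builds headers for POSTs to the MCP endpoint."""

    def __init__(self):
        self.session_id = None

    def request_headers(self):
        headers = {
            "Content-Type": "application/json",
            # Accept both reply shapes: a single JSON body or an SSE stream.
            "Accept": "application/json, text/event-stream",
        }
        if self.session_id is not None:
            headers[SESSION_HEADER] = self.session_id
        return headers

    def note_response_headers(self, headers):
        """Remember the session ID the server assigned at initialization."""
        for name, value in headers.items():
            if name.lower() == SESSION_HEADER.lower():
                self.session_id = value


def parse_reply(content_type, body):
    """Return the JSON-RPC messages carried by either reply shape."""
    if not content_type.startswith("text/event-stream"):
        return [json.loads(body)]
    messages, data_lines = [], []
    # A blank line ends an event; the extra "" flushes a trailing event.
    for line in body.splitlines() + [""]:
        if line.startswith("data:"):
            value = line[5:]
            data_lines.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data_lines:
            messages.append(json.loads("\n".join(data_lines)))
            data_lines = []
    return messages
```

A host would call note_response_headers once on the initialization response and request_headers on every request after it, which is all that "threaded through every subsequent request" amounts to on the client side.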

The choice is a security decision as much as a deployment one. stdio servers tend toward over-privilege — they inherit the user's filesystem and network rights, with no authentication because there is no remote caller. Network servers face the open internet and must authenticate every request, with OAuth 2.1 now the converging standard. Picking the wrong transport for your security posture forces you to bolt on what should have been built in.

5.2 Server discovery: `.well-known/mcp.json` and Server Cards

A protocol that requires every host to be hand-configured with every server's address does not become an ecosystem. It becomes a configuration nightmare. MCP's discovery layer borrows the well-known URI pattern from RFC 8615, the same mechanism behind .well-known/security.txt and .well-known/openid-configuration. A server publishes .well-known/mcp.json at its base URL. A host fetches it, parses it, and learns what the server claims about itself: the MCP endpoint, the protocol version, the auth scheme, identity metadata, and a pointer to the Server Card.
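The lookup itself is small. The sketch below builds the discovery URL and checks a fetched document for the pieces listed above. The JSON key names are assumptions made for illustration, since the post names the concepts rather than the exact keys, and the https-only rule is likewise an assumption.

```python
from urllib.parse import urlsplit, urlunsplit

WELL_KNOWN_PATH = "/.well-known/mcp.json"

# Hypothetical key names standing in for: MCP endpoint, protocol version,
# auth scheme, identity metadata, and the pointer to the Server Card.
REQUIRED_FIELDS = ("endpoint", "protocolVersion", "auth", "identity", "serverCard")


def well_known_url(server_url):
    """Build the discovery URL from any URL on the server's origin."""
    parts = urlsplit(server_url)
    if parts.scheme != "https" or not parts.netloc:
        # Assumption: discovery is only attempted over https.
        raise ValueError("discovery requires an absolute https URL")
    return urlunsplit((parts.scheme, parts.netloc, WELL_KNOWN_PATH, "", ""))


def parse_discovery(document):
    """Return the required fields, or raise if the document omits any."""
    missing = [field for field in REQUIRED_FIELDS if field not in document]
    if missing:
        raise ValueError("discovery document is missing: " + ", ".join(missing))
    return {field: document[field] for field in REQUIRED_FIELDS}
```

Everything the parser returns is still only what the server says about itself; deciding whether to believe it is a separate step.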

The Server Card is MCP's openapi.json: a self-describing artifact that machines and humans can both consume. A host can show the card to the user during the connect flow — "this server says it can access your Linear workspace, will read tickets but not delete them, is run by Linear themselves, uses OAuth" — and the user can decide whether the claim is acceptable. The cards in their unsigned form are claims, not proof. The mitigation is signed attestation, with a signing model that has converged on sigstore-style OIDC: a card signed via GitHub's OIDC integration under github.com/some-org/mcp-server can be verified by any host that trusts GitHub's identity claims. Hosts can layer policy on top — trust only signed cards, only signed cards from a specific provider, only those not on a revocation list.
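The policy layering in that last sentence can be sketched as a pure function. Signature verification itself is out of scope here: the signature_valid flag is assumed to come from a real verifier run elsewhere, and all type and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CardEvidence:
    signature_valid: bool   # outcome of a real signature check, done elsewhere
    signer_identity: str    # identity the signature is bound to
    card_id: str


@dataclass(frozen=True)
class TrustPolicy:
    require_signature: bool = True
    allowed_signer_prefixes: tuple = ()       # empty tuple means any signer
    revoked_card_ids: frozenset = frozenset()


def evaluate(policy, evidence):
    """Return (accepted, reason) for a card under the host's policy."""
    if policy.require_signature and not evidence.signature_valid:
        return False, "unsigned or invalid signature"
    if policy.allowed_signer_prefixes:
        # A signer identity only means something if the signature checked out.
        if not evidence.signature_valid:
            return False, "signer cannot be established"
        if not evidence.signer_identity.startswith(policy.allowed_signer_prefixes):
            return False, "signer not on allowlist"
    if evidence.card_id in policy.revoked_card_ids:
        return False, "card revoked"
    return True, "accepted"
```

Writing the allowlist entries with a trailing slash, as in github.com/some-org/, keeps a lookalike such as github.com/some-org-evil/ from matching the prefix.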

The flow looks like adding a Wi-Fi network: you do it once. The host stores the configuration, and subsequent sessions skip discovery unless something changed. What "something changed" means in practice (version bumps, scope expansions, capability shifts) should trigger renewed consent, the way a TLS certificate change does. Hosts that skip this step let servers silently expand their reach over time, which is exactly the slow capability creep that compromises trust in any long-lived integration.
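One way to make "something changed" checkable is to diff what the host stored at consent time against what the server offers now. The dictionary keys below (version, scopes, capabilities) are hypothetical, chosen to mirror the three kinds of change named above.

```python
def consent_changes(stored, offered):
    """List the reasons a stored server configuration needs renewed consent."""
    reasons = []
    if offered["version"] != stored["version"]:
        reasons.append(
            "version changed: %s -> %s" % (stored["version"], offered["version"])
        )
    # Only additions widen the server's reach, so only additions are flagged.
    new_scopes = sorted(set(offered["scopes"]) - set(stored["scopes"]))
    if new_scopes:
        reasons.append("scopes added: " + ", ".join(new_scopes))
    new_capabilities = sorted(
        set(offered["capabilities"]) - set(stored["capabilities"])
    )
    if new_capabilities:
        reasons.append("capabilities added: " + ", ".join(new_capabilities))
    return reasons
```

An empty list means the session can proceed on the stored consent; anything else is the cue to put the card back in front of the user.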

5.3 The boring parts that decide whether you ship

Three operational concerns, plus one closely related attack, deserve their own treatment because each is a place where careless implementation produces either broken integrations or security holes.

CORS matters for any MCP server reachable from a browser-based host. The temptation is to set Access-Control-Allow-Origin: * and move on. This is wrong. Combined with cookie auth it is a CSRF disaster; combined with bearer tokens in localStorage it is a token-exfiltration risk. Use a per-origin allowlist, and prefer Authorization headers over cookies so same-origin is not your only line of defense.
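A per-origin allowlist is only a few lines. The sketch below is framework-neutral and returns the response headers to attach; the allowlisted origin is a placeholder, and the choice to list Authorization, Content-Type, and Mcp-Session-Id is an assumption about what a browser-based host would send.

```python
ALLOWED_ORIGINS = frozenset({"https://app.example.com"})  # placeholder origin


def cors_headers(origin, allowed=ALLOWED_ORIGINS):
    """Return CORS response headers for an allowlisted origin, else none."""
    if origin is None or origin not in allowed:
        # No CORS headers at all: the browser will block the page from reading.
        return {}
    return {
        # Echo the specific origin rather than "*".
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Headers": "Authorization, Content-Type, Mcp-Session-Id",
        "Access-Control-Expose-Headers": "Mcp-Session-Id",
        # The response differs per origin, so caches must key on it.
        "Vary": "Origin",
    }
```

Exact string matching is deliberate: substring or suffix checks on origins are a common source of bypasses.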

Origin validation is CORS applied server-side, because non-browser clients enforce nothing. The classic failure: a developer runs an MCP server on localhost:8000, visits a webpage that knows the convention, the page fires a request, and the server — with no origin check — does whatever the page asked. Validate Origin on every Streamable HTTP server.
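The server-side check can be sketched as a gate that runs before any request is dispatched. Letting requests without an Origin header through is an assumption made here, on the reasoning that non-browser clients usually omit it and are covered by authentication instead; a stricter server could refuse those too. The function name is hypothetical.

```python
def check_origin(headers, allowed_origins):
    """Return an HTTP status: 200 to proceed, 403 to refuse."""
    normalized = {name.lower(): value for name, value in headers.items()}
    origin = normalized.get("origin")
    if origin is None:
        # Assumption: no Origin means a non-browser client; rely on auth.
        return 200
    return 200 if origin in allowed_origins else 403
```

With this in place, the localhost:8000 scenario ends in a 403 instead of an executed request, because the page's own origin is not on the list.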

Caching headers are the third. Static documentation can be cached aggressively; a database row only if subscriptions are wired in to signal changes; a document being actively edited not at all. Return the right Cache-Control and ETag per resource. The interaction between caching and subscriptions is where engineers get bitten: an intermediary that serves a stale copy means the host's notifications/resources/updated notification never fires, because the subscription mechanism never reaches the live server.
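The three cases map onto header choices roughly as follows. The resource kinds, the max-age values, and the truncated SHA-256 ETag are all illustrative assumptions, not values taken from the book.

```python
import hashlib


def caching_headers(kind, body, has_subscription=False):
    """Pick Cache-Control and ETag for a resource of the given kind."""
    if kind == "live_document":
        # Actively edited: never store a copy anywhere.
        return {"Cache-Control": "no-store"}
    if kind == "static_doc":
        cache_control = "public, max-age=86400"
    elif kind == "database_row":
        # Cache briefly only when a subscription can signal changes;
        # otherwise force revalidation on every use.
        cache_control = "private, max-age=60" if has_subscription else "no-cache"
    else:
        raise ValueError("unknown resource kind: " + kind)
    # Content-derived validator, quoted as ETag syntax requires.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    return {"Cache-Control": cache_control, "ETag": etag}
```

Because the ETag is derived from the body, a client that revalidates with If-None-Match gets a cheap 304 when nothing changed and fresh content when something did.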

DNS rebinding rounds out the neighborhood and disproportionately affects local servers. Bind to 127.0.0.1 not 0.0.0.0, validate the Host header against an allowlist, require an auth token even for localhost. A local MCP server without these is one demo away from being a security advisory.
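The three defenses fit in one small gate. This is a sketch under assumptions: the port matches the localhost:8000 example above, the token is presented as a bearer token in the Authorization header, and the names are hypothetical.

```python
import hmac

# Bind to loopback only; 0.0.0.0 would expose the server on every interface.
BIND_ADDRESS = ("127.0.0.1", 8000)

# Host header values a legitimate local client would send.
ALLOWED_HOSTS = frozenset({"127.0.0.1:8000", "localhost:8000"})


def authorize_local_request(headers, expected_token, allowed_hosts=ALLOWED_HOSTS):
    """Return 200 to proceed, 403 for a bad Host, 401 for a bad token."""
    normalized = {name.lower(): value for name, value in headers.items()}
    # A rebinding attack arrives with the attacker's hostname in Host.
    if normalized.get("host", "").lower() not in allowed_hosts:
        return 403
    supplied = normalized.get("authorization", "")
    expected = "Bearer " + expected_token
    # Constant-time comparison avoids leaking the token through timing.
    if not hmac.compare_digest(supplied.encode(), expected.encode()):
        return 401
    return 200
```

The Host check is what defeats rebinding specifically: the attacker controls where their hostname resolves, but not which hostname the browser puts in the header.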

Worth holding onto: transport and discovery sound like protocol trivia, and they are where most MCP deployments quietly fail. A stdio server that should have been Streamable HTTP cannot be shared; a network server with a wildcard CORS policy can be borrowed by any tab the user opens; a server without a Server Card is invisible to every host except the one its author personally configured. Pick the transport for the trust domain. Ship the discovery metadata. Get the headers right.

What Chapter 5 sets up

This closes Part II of the book. We have the protocol in hand: server primitives in Chapter 3, client primitives in Chapter 4, the wire and discovery layer here. We know what messages travel, in what direction, over what transport, between what trust domains, after what handshake. Part III turns from primitives to patterns. A single server connected to a single host is useful; the architectural payoff of MCP is composition — multiple servers, multiple agents, multiple models, cooperating toward larger goals.

Next — Chapter 6: Fundamental Orchestration Strategies. When a single well-tooled agent beats a multi-agent design, and the two foundational shapes — sequential pipelines and scatter-gather concurrency — that most production deployments rely on.

Want the full picture? The book walks each transport's latency profile, failure-isolation behavior, and OAuth 2.1 integration, treats the Server Card signing model in depth, and includes the operational hygiene checklist for any production Streamable HTTP server.