MCP in Production — What Nobody Tells You
Everyone's building MCP servers. Almost nobody's shipping them to production. Here's what the gap looks like from the inside.
The production gap
MCP has crossed the adoption threshold. The protocol is in the Linux Foundation, backed by every major AI vendor, and most engineering teams we talk to have either built an MCP server or are planning to. The protocol itself is solid — clean transport model, well-designed tool abstraction, good ergonomics for agent consumption.
The problem isn't adoption. The problem is that most MCP servers running today wouldn't survive a compliance audit, a traffic spike, or a security review. The distance between "protocol-compliant" and "production-grade" is where most of the real engineering lives — and it's larger than most teams expect when they start.
The gaps nobody demos
There's a set of problems that every team building MCP for production hits. They're not exotic edge cases. They're table stakes that the specification intentionally leaves to implementors.
Identity is the first wall. In a demo, the client is you. In production, the client is an agent acting on behalf of a user, within an organization, under specific permissions. "Who is making this request?" becomes a layered question that the transport layer alone doesn't answer. The patterns that work here aren't novel — they're borrowed from mature API gateway design — but applying them correctly to the MCP lifecycle requires care.
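One way to make that layering concrete, assuming a bearer-token transport, is to resolve every request into an explicit identity object before any tool logic runs. The types and helper names below are ours, not part of the MCP specification or SDK, and the token verifier is a stub you'd wire to your own identity provider:

```typescript
// Illustrative sketch: a layered identity for incoming MCP requests.
// None of these names come from the MCP spec or SDK.

interface RequestIdentity {
  agentId: string;   // which agent (client) is calling
  userId: string;    // which human the agent is acting on behalf of
  orgId: string;     // which tenant that user belongs to
  scopes: string[];  // permissions granted to this session
}

// Hypothetical verifier: in practice, validate the token (JWT signature,
// introspection endpoint, etc.) and map its claims onto the layers above.
async function verifyToken(token: string): Promise<RequestIdentity> {
  throw new Error("not implemented: connect to your identity provider");
}

async function resolveIdentity(headers: Headers): Promise<RequestIdentity> {
  const auth = headers.get("authorization") ?? "";
  if (!auth.startsWith("Bearer ")) {
    throw new Error("missing bearer token");
  }
  // A single token often encodes all three layers. If it only identifies
  // the agent, the user and org context must arrive over a verified
  // channel, never from unauthenticated headers the agent controls.
  return verifyToken(auth.slice("Bearer ".length));
}
```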
Authorization gets strange fast. An agent might have access to a tool but not to the data the tool returns in a specific context. Tool-level permissions are necessary but insufficient. The interesting design decisions happen at the intersection of tool access, data scope, and session context. Most implementations we've seen punt on this entirely.
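One pattern that holds up is checking two layers on every call: is the tool granted to this session at all, and does the session's data scope cover what this particular invocation would touch? A minimal sketch, with illustrative types and scope names:

```typescript
// Illustrative two-layer authorization check. The policy shape is an
// assumption for this sketch, not anything defined by MCP.

interface SessionContext {
  userId: string;
  orgId: string;
  grantedTools: Set<string>;  // tools this session may call
  dataScopes: Set<string>;    // e.g. "accounts:own", "reports:org"
}

interface ToolCallRequest {
  tool: string;
  resourceScope: string;      // scope this specific call needs, derived per tool
}

function authorize(session: SessionContext, call: ToolCallRequest): void {
  // Layer 1: tool-level permission.
  if (!session.grantedTools.has(call.tool)) {
    throw new Error(`tool ${call.tool} is not granted to this session`);
  }
  // Layer 2: data-scope permission. Having the tool does not imply
  // access to every record the tool can reach.
  if (!session.dataScopes.has(call.resourceScope)) {
    throw new Error(
      `tool ${call.tool} is allowed, but scope ${call.resourceScope} is not`
    );
  }
}
```

The interesting work is in deriving `resourceScope` per tool and per call; the check itself stays boring on purpose.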
Observability is harder than logging. Structured logs are a start, not a finish. When an agent chains three tool calls across two MCP servers to answer one question, understanding what happened — and whether it should have happened — requires tracing that crosses protocol boundaries. Standard APM tools weren't designed for this interaction model. You either build the instrumentation layer yourself or accept opacity.
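A lightweight version of that instrumentation is to record every tool call as a span keyed by the caller's trace ID, so a chain of calls across servers can be stitched back together afterward. A rough sketch, with illustrative names and a trace ID assumed to arrive from the caller (for example via a W3C traceparent header):

```typescript
// Illustrative tracing wrapper for tool handlers. Spans share the agent's
// trace ID so multi-server call chains can be reassembled later.

interface ToolSpan {
  traceId: string;     // propagated from the caller
  spanId: string;
  tool: string;
  startedAt: number;   // epoch milliseconds
  durationMs: number;
  ok: boolean;
}

function newSpanId(): string {
  return crypto.randomUUID().replace(/-/g, "").slice(0, 16);
}

async function traced<T>(
  traceId: string,
  tool: string,
  fn: () => Promise<T>,
  emit: (span: ToolSpan) => void
): Promise<T> {
  const startedAt = Date.now();
  const spanId = newSpanId();
  try {
    const result = await fn();
    emit({ traceId, spanId, tool, startedAt, durationMs: Date.now() - startedAt, ok: true });
    return result;
  } catch (err) {
    // Failed calls are spans too; opacity about failures is the expensive kind.
    emit({ traceId, spanId, tool, startedAt, durationMs: Date.now() - startedAt, ok: false });
    throw err;
  }
}
```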
Caching is deceptively complex. Weather data can be cached for ten minutes. A user's account balance cannot. An MCP server that treats all tool responses the same way will either serve stale data or burn through rate limits. The caching strategy needs to be tool-aware, which means the server needs metadata about its own tools that goes beyond what the specification requires.
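In practice that means the server carries its own cache-policy table alongside its tool definitions. A minimal sketch, with hypothetical tool names and an in-memory cache standing in for whatever store you actually use:

```typescript
// Illustrative tool-aware caching. The policy registry is something the
// server maintains itself; the MCP spec does not carry this metadata.

interface CachePolicy {
  cacheable: boolean;
  ttlSeconds: number;
}

// Hypothetical tools with very different freshness requirements.
const cachePolicies: Record<string, CachePolicy> = {
  get_weather:         { cacheable: true,  ttlSeconds: 600 }, // ten minutes is fine
  get_account_balance: { cacheable: false, ttlSeconds: 0 },   // never serve stale
};

const cache = new Map<string, { value: unknown; expiresAt: number }>();

async function callWithCache(
  tool: string,
  args: Record<string, unknown>,
  invoke: () => Promise<unknown>
): Promise<unknown> {
  // Default to "not cacheable" for any tool without an explicit policy.
  const policy = cachePolicies[tool] ?? { cacheable: false, ttlSeconds: 0 };
  const key = `${tool}:${JSON.stringify(args)}`;

  if (policy.cacheable) {
    const hit = cache.get(key);
    if (hit && hit.expiresAt > Date.now()) return hit.value;
  }

  const value = await invoke();
  if (policy.cacheable) {
    cache.set(key, { value, expiresAt: Date.now() + policy.ttlSeconds * 1000 });
  }
  return value;
}
```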
Failure modes multiply. An MCP server that wraps a single API has predictable failure modes. A server that orchestrates across multiple upstream services has combinatorial ones. Partial failures — where tool A succeeds but tool B doesn't — need graceful handling that preserves the agent's ability to reason about what happened.
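One shape that works is returning per-step outcomes instead of a single pass-or-fail, so the agent can see exactly which upstream succeeded and decide what to do next. A sketch, with placeholder endpoints and illustrative types:

```typescript
// Illustrative partial-failure handling: each upstream call reports its
// own outcome rather than failing the whole tool response.

type StepResult<T> =
  | { step: string; ok: true; value: T }
  | { step: string; ok: false; error: string; retryable: boolean };

async function runStep<T>(
  step: string,
  fn: () => Promise<T>,
  retryable = true
): Promise<StepResult<T>> {
  try {
    return { step, ok: true, value: await fn() };
  } catch (err) {
    return { step, ok: false, error: String(err), retryable };
  }
}

// Hypothetical tool that fans out to two upstream services and returns
// whatever succeeded, plus an explicit record of what did not.
async function fetchCustomerOverview(customerId: string) {
  const [profile, invoices] = await Promise.all([
    runStep("profile", () =>
      fetch(`https://api.example.com/customers/${customerId}`).then((r) => r.json())
    ),
    runStep("invoices", () =>
      fetch(`https://billing.example.com/invoices?customer=${customerId}`).then((r) => r.json())
    ),
  ]);
  return { results: [profile, invoices] };
}
```

The payoff is that "tool A succeeded, tool B timed out" is something the agent can reason about, instead of a generic error that erases the half that worked.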
None of these problems are unsolvable — but solving them well requires having hit them before. That's what we do.
The compliance dimension
For teams in regulated industries — healthcare, financial services, legal — the standard set of production concerns gets an additional layer. Audit trails aren't optional. Data residency matters. The ability to reconstruct exactly what an agent did, with what data, at what time, under whose authority, is a hard requirement.
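Here's a sketch of what one such audit record might capture. The field names are illustrative, and the hard part, write-once storage with retention guarantees, is deliberately elided:

```typescript
// Illustrative append-only audit record for tool invocations: what ran,
// with what data, when, and under whose authority.

interface AuditRecord {
  timestamp: string;        // ISO 8601, from a trusted clock
  traceId: string;          // ties the record back to the agent's trace
  orgId: string;
  userId: string;           // the human the agent acted for
  agentId: string;          // the agent/client that made the call
  tool: string;
  argsHash: string;         // hash of inputs; raw args may be too sensitive to keep
  outcome: "success" | "denied" | "error";
  dataClasses: string[];    // e.g. ["pii", "financial"] touched by this call
}

async function hashArgs(args: unknown): Promise<string> {
  const bytes = new TextEncoder().encode(JSON.stringify(args));
  const digest = await crypto.subtle.digest("SHA-256", bytes);
  return Array.from(new Uint8Array(digest))
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
}

async function recordAudit(entry: Omit<AuditRecord, "argsHash">, args: unknown) {
  const record: AuditRecord = { ...entry, argsHash: await hashArgs(args) };
  // In production this belongs in write-once storage (an append-only table
  // or WORM bucket), not in application logs that rotate away.
  console.log(JSON.stringify(record));
}
```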
This isn't a criticism of MCP. The protocol is deliberately minimal. It's an observation about the distance between the protocol and what production compliance actually demands. That distance is the engineering work — and it's the kind of work where experience compounds. We've built compliance-aware MCP systems for exactly these constraints.
Edge changes the calculus
Deploying MCP servers at the edge — Cloudflare Workers, Deno Deploy, similar runtimes — introduces a different set of trade-offs. Cold start times matter when an agent expects sub-second tool responses. Stateless execution means session state has to live somewhere else. Memory limits constrain how much context the server can hold.
But the advantages are real: global distribution, automatic scaling, reduced infrastructure management. For stateless, cacheable tool calls — which many practical MCP use cases are — edge deployment is a natural fit. The trick is knowing which tools belong at the edge and which don't, and designing the server topology accordingly.
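To make the stateless-execution point concrete: on an edge runtime, session state has to be re-read from an external store on every request. A sketch against Cloudflare Workers KV, with an assumed binding name (`SESSIONS`) and a session shape we've made up for illustration:

```typescript
// Illustrative session storage for an edge-deployed MCP server. Assumes a
// Workers KV namespace bound as SESSIONS; the session shape is invented.

interface SessionState {
  userId: string;
  grantedTools: string[];
  lastSeen: number;
}

interface Env {
  SESSIONS: {
    get(key: string, type: "json"): Promise<SessionState | null>;
    put(key: string, value: string, opts?: { expirationTtl?: number }): Promise<void>;
  };
}

export async function loadSession(env: Env, sessionId: string): Promise<SessionState | null> {
  // Stateless execution: every request re-reads its session from KV,
  // because the instance that handled the last request may be gone.
  return env.SESSIONS.get(`session:${sessionId}`, "json");
}

export async function saveSession(env: Env, sessionId: string, state: SessionState): Promise<void> {
  // An expiration TTL keeps abandoned sessions from accumulating.
  await env.SESSIONS.put(`session:${sessionId}`, JSON.stringify(state), {
    expirationTtl: 60 * 60, // one hour
  });
}
```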
Where the maturity will come from
The MCP ecosystem will mature the same way every protocol ecosystem does: through production scars. Teams that deploy, break things, and fix them will develop operational patterns that eventually become shared knowledge.
We're at the early end of that process. The organizations building MCP servers today are generating the lessons that the ecosystem needs. Some of those lessons will become tooling. Some will become best practices. Some will stay hard-won institutional knowledge.
The teams that invest in production-grade implementation now — not just protocol compliance, but operational maturity — will have a meaningful head start. Not because the problems are unsolvable, but because solving them takes time, iteration, and exposure to real workloads.
What this means
The protocol isn't the risk. The implementation layer is. The distance from "spec-compliant" to "production system I'd bet the company on" is real, and it's made of engineering decisions that don't show up in the specification.
That's not a reason to wait. It's a reason to work with people who've already navigated it.
We build production-grade MCP infrastructure — edge-deployed, compliance-aware, designed for teams that can't afford to get it wrong. If you're past the demo phase and need to ship something real, we should talk.