The Capability Bus

01 · The Mis-Sizing

A skill teaches behavior. MCP grants reach.

A skill is a small thing. It is a behavioral package that sits next to the harness and tells an agent how to do something: how to format a spreadsheet, how to run a deploy, the convention for this one repo. It changes what the agent does on a turn. Useful, bounded, local. You can hold a skill in your head.

MCP is not that, and the industry keeps sizing it as if it were. The reflex is to file it next to plugins and skills, one more way to hand the model a capability. That reflex is the mistake, and it is everywhere, because the first time anyone meets MCP it looks small: a local server that lets the model read a file or call an API. The hello-world is a filesystem tool. So people conclude MCP is a tool-plugin mechanism and move on.

The size asymmetry is the tell. A skill affects agent behavior. MCP affects tool access, identity, credential scope, sandboxing, routing, governance, observability, versioning, and the deployment surface of every integration the agent touches. One of those is a recipe. The other is the road network.

There is a second tell, and it is structural. A plugin is a passive guest: the host calls it, it answers, control runs one way. MCP servers are not passive. Through the sampling primitive, a server can turn around mid-execution and ask the host model to do reasoning for it, then use the answer to finish its own work. The guest commissions the host. That one capability breaks the plugin model outright, because a thing that can make the host think on its behalf was never a plugin. It is a peer.

A skill tells the agent how to drive. MCP is the road that decides where it can go. Filing them in the same drawer is the error.

02 · One Name, Many Things

MCP is underestimated because it answers to one name while being many.

The reason MCP gets mis-sized is not that people are careless. It is that MCP is genuinely hard to point at, because it is not one kind of thing. Most components in a stack are exactly one kind of thing: a queue is a process, a schema is a contract, a socket is a boundary. MCP is all of those registers at once.

It is a protocol: JSON-RPC, lifecycle, capability negotiation. It is a server shape: the thing you build and run. It is a discovery mechanism: how an agent learns what exists without being told. It is a schema system: typed inputs and typed outputs. It is a resource interface and a prompt interface. It is a policy object the gateway governs. It is a registry object you version and deprecate. And the moment you put it in a sandbox, it is an execution unit with an identity and a blast radius.

Count them or do not; the number is not the point. The point is that no single layer of the stack can hold MCP, because MCP shows up at every layer wearing a different face. When you say the word, you might mean the wire format, or the process, or the contract, or the governed capability, and you would be right every time. That polymorphism is exactly why it is substrate and not a feature. A feature lives in one place. Substrate is the thing every layer is made of.

People underestimate MCP because they catch one facet and name the whole. It is a protocol and a process and a contract and a boundary, and it answers to one word for all four.

03 · The Membrane

The layers are real, and they leak. Both are true.

Draw the stack and it looks like clean plates of glass: model, harness, MCP, gateway, sandbox, policy, adapter, endpoint. The diagram is useful and the diagram is a lie, and the discipline is holding both at once.

The layers are real. They name genuinely different jobs, and collapsing them loses the seams you need to debug and govern. But they are not glass. They are membranes. MCP bleeds into the harness, because the harness is what decides which servers exist at all. The sandbox bleeds into MCP, because the sandbox is the boundary the server runs inside. Policy bleeds into all of it, because governance is not a layer; it is a property that pierces every layer.

This is why "what goes where" feels impossible the moment you try to be precise. The question assumes a single axis, a ladder with one rung per component. There is no such ladder, because the components are not the same kind of thing. Some are interfaces. Some are processes. Some are boundaries. Some are control planes. You cannot linearize four orthogonal axes into one stack and expect the ordering to hold.

So there are two valid views, and they disagree on purpose. The conceptual layer view orders by altitude of concern: model, harness, capability, governance, execution. The runtime containment view orders by what physically wraps what: the harness calls the gateway, the gateway routes to a sandboxed server, the server executes against the world. In the first view the sandbox sits low. In the second view the sandbox wraps the MCP server directly. Both are correct. They answer different questions, and the confusion only starts when you force them into one picture and ask which rung is right.

The stack is not plates of glass. It is membranes. The layers are real and they leak, and a diagram that hides the leak is lying to look clean.

04 · The Wall

The whole control plane is one boundary around the probabilistic core.

The last section said every layer is a membrane, and every membrane leaks. This boundary is a membrane too, with one difference: it is engineered. A cell membrane is not a hole and not a brick. It is selectively permeable, built to pass certain things and stop others. That is what this boundary is. It passes capability and stops the credential, every time, on purpose. The membranes in the last section leak because nobody designed them. This is the membrane you design so that it cannot.

Here is what the gateway, the sandbox, and the credential broker have in common, the thing that explains why the architecture has the shape it has. They are not three features. They are one wall, drawn once, around a single dangerous fact.

The model is probabilistic. It decides contextually, and it is fallible by construction. The credential is deterministic and absolute: a token does exactly what its scope allows, every time, no judgment involved. You do not want the probabilistic actor holding the deterministic key. The entire job of the control plane is to keep those two on opposite sides of a boundary.

So the credential lives on the deterministic side, inside a compiled process the model never enters. The model lives on the probabilistic side, deciding which tool to call. MCP is the interface between them, and it is the only thing that crosses. The agent gets a capability, never the credential. It can ask the broker to act; it cannot read what the broker holds. Every piece of it, the sandbox boundary, the scoped short-lived token, the egress allowlist, the approval gate, is a brick in that one wall.

This is the same split named elsewhere as compiled versus interpretive doctrine, pushed all the way down to the credential. The compiled side enforces; the model cannot violate it because the path to violate it does not exist in its surface. The probabilistic side decides, and is allowed to be wrong, because nothing it can reach is irreversible without a gate. This is the half the membrane cannot do alone. Blocking the credential is the cheap win; a capability the membrane passes on purpose can still be aimed at something irreversible, and stopping that is the gate's job, not the membrane's. Confidentiality is the easy side of the boundary; controlling the action is the hard one. MCP is where the two meet, which is exactly why it is load-bearing and exactly why mis-sizing it is dangerous. It is not a plugin. It is the membrane between the part of the system that can be wrong and the part that must not be.

The credential is deterministic. The model is probabilistic. MCP is the interface between them, and the entire control plane is one wall keeping them apart.

05 · The Factory

If you can mass-produce it, it is substrate. The proof is the factory.

A feature is something you build once. Substrate you manufacture. The test of whether MCP is substrate is whether you can build a factory that stamps out governed capabilities on demand, and you can, which settles the question.

The shape is a reusable core plus a thin endpoint adapter. The core is everything that does not change between integrations: the protocol handler, the tool and resource registries, schema validation, the credential-broker client, rate limits, timeouts and cancellation, audit logging, output redaction, error normalization, dry-run, idempotency, approval hooks. That is the bulk of an enterprise-grade server, and it is identical whether you are talking to a payments processor, a DNS provider, or an issue tracker.

What changes per endpoint is small and declarative: the auth method, the client library, the tool names and their schemas, the permission scopes, the rate limits, the risk class of each action, the business rules. Stripe, Cloudflare, GitHub: different tool packs over the same core. You do not write twenty servers. You write one core and twenty manifests.

A per-endpoint manifest, not configuration

The risk column is the law.

dns_record_list → risk: read · approval: never. dns_record_update → risk: write · approval: always. cache_purge → risk: write · approval: sometimes. The core reads the manifest and the boundary holds by construction. The agent cannot route around it, because the gate is compiled into the path, not written in a sentence the model is asked to remember.

That manifest is not configuration. It is compiled doctrine. When it declares dns_record_update as write-and-approval-always while dns_record_list is read-and-approval-never, it is authoring an enforced boundary, not a suggestion. The risk field is cages and constitutions expressed as a typed column: the small set of truths that must stay mechanical no matter how capable the agent becomes, sitting right next to the tool they govern.

A final thought, before the section closes its case. The manifest declares the risk of each tool, but a declaration is not a verification. The core can enforce that pet_cat is approval-always; it cannot check that pet_cat does what its name promises. The gate holds by construction. The truthfulness of the label does not, and that is a separate problem, and a harder one. The boundary this section builds is real and it is the part you can compile. What binds the name a model reads to the code a server runs is the part you cannot, and it is its own essay.

You do not build twenty MCP servers. You build one core and twenty manifests. The manifest is not config. It is compiled doctrine with a risk column.

A skill is a recipe. MCP is the superbus highway. People are sleeping on it.

The mistake is small and the cost is large. You meet MCP as a filesystem tool, you file it next to plugins, and you miss that the same word names the protocol, the process, the contract, the boundary, and the unit you sandbox and ship. Skills change what an agent does on a turn. MCP determines what the agent can reach, under which identity, with which credential, against which targets, through which gates, with what written to the audit log. One of those is behavior. The other is the typed capability fabric between agents and the world, the road network every other layer drives on.

In a factory of factories, that fabric is not something you consume. It is something you manufacture, govern, route, sandbox, version, and evolve. The credential stays on the deterministic side. The model stays on the probabilistic side. MCP is the bus between them, and the bus is the substrate. Build it as a plugin and the credential ends up on the wrong side of the wall.