MCP and Kubernetes Operators Aren't Competing. They're Layers.
Should you use a Kubernetes operator or MCP servers for FDE infrastructure? It's the wrong question. They operate at different layers of the stack and you'll eventually want both.
A question that keeps surfacing in FDE infrastructure discussions: should we use a Kubernetes operator or Model Context Protocol servers?
It's the wrong question. Not because the answer is obvious — it isn't — but because the framing assumes the two tools are alternatives competing for the same slot in your architecture. They're not. They operate at different layers of the stack, solve different problems, and most cloud-native FDE organizations with AI in their workflow will eventually want both.
This post breaks down what each tool actually does, where they genuinely differ, and when you'd choose one without the other.
What a Kubernetes Operator Actually Does
A Kubernetes operator manages the lifecycle of custom resources in a cluster. It watches for changes to custom resources — instances of types you define through Custom Resource Definitions, like FDEDeployment or DeploymentPrimitive — and runs a reconciliation loop that drives actual state toward desired state.
What this means concretely: when an FDE creates an FDEDeployment resource pointing to a specific primitive version, the operator sees that resource, resolves the primitive reference, generates the concrete Kubernetes manifests for the customer namespace, applies them, monitors deployment health, handles rollbacks if something goes wrong, and updates the resource's status conditions. All automatically. All without an AI model in the loop.
The operator is fundamentally about infrastructure lifecycle management. It automates the operational workflows that are too complex, too stateful, and too compliance-sensitive to be managed with ad-hoc kubectl commands or CI/CD scripts. Its primary users are platform engineers and FDEs who are managing the operational reality of running integrations in customer environments.
The operator works the same way regardless of whether any AI is involved in your engineering workflow. It doesn't know or care whether the FDE who created the FDEDeployment resource was using an AI coding assistant or writing YAML by hand. It just reconciles.
What MCP Actually Does
The Model Context Protocol is an open standard introduced by Anthropic in November 2024 that defines how AI models can call tools and access context from external systems. The specification and reference implementations are open source, and the protocol has since been adopted across major AI tooling providers.
The protocol is about how an AI reasons about and acts on a system, not about how that system is deployed or managed. An MCP server exposes a set of tools — functions with typed parameters and descriptions — that an AI assistant can discover and invoke. The AI model reads the tool descriptions, decides which tools are relevant to the user's task, calls them, and incorporates the results into its reasoning.
In an FDE context, an MCP server backed by your pattern library might expose tools like:
- `search_patterns(domain: string, tags: string[])` — find relevant primitives for an integration type
- `get_pattern_details(name: string, version: string)` — retrieve full parameter schema and documentation
- `check_deployment_status(customer: string, deployment: string)` — surface telemetry for a running deployment
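The shape of what such a server exposes — named tools with typed parameter schemas and descriptions the model reads at discovery time — can be modeled in a few lines of plain Python. This is a sketch of the concept, not the MCP SDK; the registry, the backing pattern store, and all names here are hypothetical:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    description: str        # the text the AI model reads to pick a tool
    params: dict[str, str]  # parameter name -> type, the "schema"
    handler: Callable       # what actually runs when the model calls it

REGISTRY: dict[str, Tool] = {}

def register(tool: Tool) -> None:
    REGISTRY[tool.name] = tool

def list_tools() -> list[dict]:
    """Discovery: the assistant sees schemas and descriptions, not code."""
    return [{"name": t.name, "description": t.description, "params": t.params}
            for t in REGISTRY.values()]

def call_tool(name: str, **kwargs):
    """Invocation: runs when the model decides a tool is relevant."""
    return REGISTRY[name].handler(**kwargs)

# Hypothetical backing store for the pattern library.
_PATTERNS = [{"name": "kafka-bridge", "domain": "messaging", "tags": ["streaming"]}]

register(Tool(
    name="search_patterns",
    description="Find relevant primitives for an integration type.",
    params={"domain": "string", "tags": "string[]"},
    handler=lambda domain, tags: [p for p in _PATTERNS
                                  if p["domain"] == domain
                                  and set(tags) & set(p["tags"])],
))
```

The real protocol adds transport, sessions, and JSON Schema on top, but the division of labor is the same: the server publishes tool metadata, and the model — not your code — decides when to call what.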
When an FDE describes their integration challenge in plain language to Claude or another AI assistant, the assistant uses these tools to retrieve relevant patterns, explain the parameters, and help draft the FDEDeployment spec. The AI is navigating the pattern library on the FDE's behalf.
MCP requires an LLM in the loop. The tools it exposes aren't called by your operators or your CI/CD system — they're called by AI models reasoning about user requests in real time. The primary users are developers using AI coding assistants for FDE work.
Where They Overlap (and Why That Creates Confusion)
The confusion arises because, described loosely, both tools connect an AI agent to enterprise integrations. But the overlap ends at that surface description; the mechanisms underneath are completely different.
A Kubernetes operator could manage MCP server deployments as Kubernetes resources — treating MCP server instances as things to be lifecycle-managed, scaled, updated, and monitored. This is a legitimate use of the operator pattern: model your MCP servers as custom resources, and let the operator handle their operational lifecycle.
An MCP server could surface the pattern library to FDEs using AI assistants — making the knowledge captured by the operator-managed system accessible to AI reasoning in real time. This is a legitimate use of MCP: expose structured tools that let an AI navigate something your operator manages.
Neither of these use cases means the tools are competing. They're composable.
Where They Are Genuinely Different
The genuine differences are in layer, dependency, and target user.
Layer: The operator manages what gets deployed and maintains its lifecycle. MCP manages how an AI agent discovers and interacts with what's been deployed. Operator = deployment layer. MCP = AI reasoning layer. These are not the same layer.
Dependency on AI: The operator has none. It runs a reconciliation loop driven by Kubernetes API events. It works identically whether your team uses AI coding assistants or not. MCP is entirely dependent on an LLM being in the loop — without an AI model calling its tools, an MCP server does nothing.
Target user: The operator serves platform engineers and FDEs managing the operational lifecycle of customer deployments. MCP serves developers using AI assistants to navigate that operational reality. An FDE debugging a production integration failure primarily reaches for operator status conditions and telemetry. An FDE writing a new integration for a customer they've never worked with before reaches for an AI assistant that can search the pattern library intelligently.
Failure modes: An operator failure means customer deployments aren't being managed — rollbacks don't happen, health checks go unanswered, resource quotas aren't enforced. This is a production incident. An MCP server failure means AI assistants can't surface pattern recommendations — FDEs fall back to searching the pattern library manually. This is an inconvenience.
The risk profile difference alone tells you something important about architectural sequencing.
The Layered Answer
If you accept that these tools operate at different layers, the question shifts from "which one?" to "which layer do I need right now, and what order do I build in?"
The answer is: build the operator layer first, because it works without AI. The pattern library, the lifecycle management, the Full Loop mechanics, the security model — none of that requires AI to function. It requires Kubernetes and careful engineering.
Add the MCP layer when you want AI agents to navigate what you've built. At that point, the pattern library is already populated with validated primitives, the deployment data is already being captured, and the telemetry already flows. The MCP server has something real to expose. The AI assistant has something genuine to reason about.
Build the MCP layer first, and you have AI agents navigating an empty library or surfacing unvalidated patterns without the governance model that makes them trustworthy. The reasoning interface works; the knowledge it reasons about doesn't yet.
This is the sequential dependency: the operator layer creates the thing; the MCP layer gives AI the interface to find, understand, and interact with the thing. Neither is optional if you want both capabilities. But they have an order.
When You'd Choose One Without the Other
The layered answer doesn't mean you always need both. There are three clear scenarios where you'd choose just one:
If you have no AI in your FDE workflow: Use the operator, skip MCP. The pattern library, lifecycle management, telemetry capture, and Full Loop mechanics all work without AI reasoning. Many high-quality FDE organizations don't use AI coding assistants in their core workflow, either by preference or by compliance constraint (classified environments, certain regulated industries). The operator delivers its full value independently. MCP adds nothing without an LLM to call it.
If you're pure SaaS, no Kubernetes, but use AI assistants for FDE work: Use MCP against a REST API, skip the operator. If your deployment environment is purely SaaS — the pattern library lives in a database, deployments are REST API calls, there's no Kubernetes cluster to run an operator — you can expose that system to AI assistants via MCP without ever touching Kubernetes. The operator pattern assumes CRDs and Kubernetes. If your runtime is different, the operator doesn't fit; MCP still does.
If you have both Kubernetes and AI agents in your workflow: Use both layers, in order. Operate the pattern library and lifecycle management with the operator. Expose discovery and interaction capabilities to your AI assistants via MCP. The stack is: Kubernetes operator managing what exists and its lifecycle → MCP server giving AI agents structured access to what the operator manages → AI assistant reasoning about pattern selection and deployment configuration on behalf of the FDE.
The Enterprise Reality Check
There's a caveat that the previous post in this series makes unavoidable: for many enterprise FDE environments, neither tool is deployable today.
The IBM engineer working in a Stuttgart manufacturing plant with SSH access and a two-week change management queue cannot deploy a Kubernetes operator in the customer environment. They also aren't going to route AI assistant tool calls through an MCP server that needs to reach a control plane behind a VPN that drops every four hours. The enterprise FDE environments described in Post 4 of this series are too constrained for either of these tools to function as described.
What remains portable — what survives the stripping away of both Kubernetes and the MCP protocol — is the pattern registry concept: the idea that integration knowledge should be versioned, parameterized, annotated with provenance and compliance context, discoverable, and reusable. That concept can be implemented as Kubernetes CRDs, as a Python library, as a REST API, as a document database with a search interface, or as a structured wiki with enough discipline to be useful.
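That portable concept is small enough to sketch. Here is a minimal registry entry and store, independent of Kubernetes and MCP — the field names are illustrative, not taken from the series' CRDs, and the version comparison is deliberately naive:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class PatternEntry:
    """One versioned, annotated integration pattern."""
    name: str
    version: str                                          # versioned
    params: dict[str, str] = field(default_factory=dict)  # parameterized
    provenance: str = ""                  # who validated it, where it ran
    compliance: tuple[str, ...] = ()      # e.g. ("gdpr", "air-gapped-ok")
    tags: tuple[str, ...] = ()            # discoverable

class PatternRegistry:
    """The portable core: store, version, and search entries."""
    def __init__(self) -> None:
        self._entries: list[PatternEntry] = []

    def add(self, entry: PatternEntry) -> None:
        self._entries.append(entry)

    def search(self, tag: str) -> list[PatternEntry]:
        return [e for e in self._entries if tag in e.tags]

    def latest(self, name: str) -> PatternEntry:
        # Naive: lexicographic, not semver-aware. Good enough for a sketch.
        versions = [e for e in self._entries if e.name == name]
        return max(versions, key=lambda e: e.version)
```

Back this with CRDs and you have the operator's data model; wrap it in tool schemas and you have an MCP server's backing store; put it behind a REST API or a wiki and you have the degraded-but-useful version that works in constrained enterprise environments today.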
The Kubernetes operator and the MCP server are both implementations of the same underlying concept, optimized for environments where their dependencies are available. The pattern registry is the concept underneath both of them. It's the portable thing.
For enterprise FDE environments that can't yet deploy either tool, building the pattern registry as a simpler artifact — a library, a structured API, an indexed knowledge base — is the work that makes the more sophisticated implementations possible later. When those environments eventually normalize Kubernetes (and many are trending that way), the operator layer can be added. When they normalize AI assistants in engineering workflows (also trending), the MCP layer follows.
Neither tool is a shortcut past the harder organizational work of actually capturing and curating FDE knowledge. That work has to happen regardless of which runtime carries it.
What This Means for Your Architecture Decision
If you're a platform engineer or FDE lead deciding what to build:
Start by asking whether you have Kubernetes in the environments where you manage FDE deployments. If yes, the operator pattern is the right foundation for lifecycle management and the Full Loop. If no, a platform-agnostic pattern registry is the right starting point.
Then ask whether AI assistants are part of your FDE workflow today, or on the near-term roadmap. If yes, designing your pattern library to be MCP-accessible from the start costs almost nothing extra and pays off when the AI tooling matures. If no, build the library as if you might add MCP later — clean schemas, typed parameters, structured metadata — but don't build the MCP server until there's an AI system to call it.
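The two questions above reduce to a small decision table. A sketch, using this post's own labels and nothing more:

```python
def build_order(has_kubernetes: bool, uses_ai_assistants: bool) -> list[str]:
    """Map the two architecture questions to a build sequence."""
    if has_kubernetes and uses_ai_assistants:
        return ["operator", "mcp-server"]          # both layers, in order
    if has_kubernetes:
        return ["operator"]                        # MCP adds nothing without an LLM
    if uses_ai_assistants:
        return ["pattern-registry", "mcp-server"]  # no cluster: MCP over a REST API
    return ["pattern-registry"]                    # the portable core, always first
```

Note that no branch starts with the MCP server: in every scenario something has to exist for the AI to reason about before the reasoning layer earns its keep.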
The wrong move is to build the MCP layer first because MCP is exciting right now and the operator seems like older thinking. Kubernetes operators are not legacy architecture. They're steady-state infrastructure that works without drama. MCP is a real protocol solving a real problem — but it reasons about things that have to already exist and be well-structured before the reasoning is useful.
Build the foundation. Then give AI agents a way to navigate it.
The question isn't MCP or Kubernetes operator. It's MCP after Kubernetes operator. Or, in environments where Kubernetes isn't available: platform-agnostic pattern registry first, MCP interface second, operator lifecycle management when the runtime catches up.
Two tools. Two layers. One question worth asking in the right order.
This post is part of a series on [Forward Deployed Engineering](https://newsletter.pragmaticengineer.com/p/forward-deployed-engineers) infrastructure. [Post 3](/open-source-trends/fde-infrastructure/03-fde-model-economics) covers the business economics that make FDE infrastructure necessary. [Post 4](/open-source-trends/fde-infrastructure/04-fde-enterprise-constraints) covers why enterprise environments can't simply adopt the AI lab FDE model. [Post 5](/open-source-trends/fde-infrastructure/05-fde-operator-beachhead) describes the [Kubernetes operator](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/) architecture in detail, including the four CRDs and the Full Loop implementation.