DevOps Killed the Dev/Ops Wall. SRE Killed the Scale Wall. FDE Just Hit a Wall Nobody Has Named Yet.

DevOps removed the dev/ops wall. SRE removed the scale wall. FDE is hitting the enterprise deployment wall — and the field has not yet named what it is building.


This piece stands alone — you don't need any background to follow it. It's also the closing post in a 7-part series on Forward Deployed Engineering infrastructure covering the business case, enterprise vs. AI lab environments, a Kubernetes operator architecture, and how MCP and operators fit together as layers. Links to the full series at the bottom.


There's a pattern hiding in the history of software engineering disciplines. Every major one emerged the same way: software hit a boundary, found manual humans on the other side of it, and then systematically replaced the manual work with engineering discipline.

This isn't a retrospective observation. It's a live process. And it's happening again right now with Forward Deployed Engineering — except the field hasn't yet named what it's building.

The Three Walls

Let's trace the pattern.

Wall One: The Dev/Ops Wall.

Before DevOps, writing software and running software were two separate tribes. Developers threw code over the wall to operations. Operations caught it, deployed it as best they could, and then managed the fallout. It was slow, it was fragile, and every failure triggered a blame negotiation rather than a fix.

The insight that became DevOps wasn't technical. It was organizational: these two tribes were solving the same problem from opposite ends of the same system. The discipline that emerged — everything as code, shared ownership, the CALMS framework — turned a human coordination problem into an engineering problem. The wall came down.

Wall Two: The Scale Wall.

DevOps worked. Which meant companies started deploying more software, faster, at greater scale. And then operations hit a different wall: you simply cannot run a hyperscale system with manual incident response. Google had tens of thousands of services. No human process could have kept them reliable.

Site Reliability Engineering was Google's answer. The core insight was precise: reliability is a software engineering problem, not an operations problem. Error budgets, SLOs, toil elimination — these weren't management frameworks, they were engineering ones. SRE didn't make operations more disciplined. It made operations a subfield of software engineering.

Wall Three: The Enterprise Deployment Wall.

Fast forward to the era of complex enterprise AI and data products. Product-led growth became the dominant go-to-market strategy — ship self-serve, let customers adopt, optimize the funnel. It worked for Slack and Figma. It doesn't work when your product requires integrating with Gary's COBOL job.

Gary, for those who haven't met him: he's 63, he's the only person who understands a critical batch job at your enterprise customer, and he's retiring in November. His COBOL job runs every weeknight at 11pm, it touches three systems nobody has documentation for, and before your product can go to production, someone needs to sit with Gary for two weeks and understand every decision he made in 1987.

Product-led growth assumed customers could operationalize complex software themselves. Gary's COBOL job is why they can't. Forward Deployed Engineering emerged as the answer: embed expert engineers directly in customer environments, solve the operationalization problem from the inside, and ship outcomes rather than software. Job postings for the role grew 1,165% in 2025, with demand spikes across OpenAI, Anthropic, Databricks, and Anduril.


Here's what makes the pattern visible:

| | DevOps | SRE | FDE |
|---|---|---|---|
| Boundary removed | Dev/Ops wall | Scale wall | Enterprise deployment wall |
| The manual work before | Manual deployments, siloed ops | Manual incident response at scale | Manual bespoke enterprise integrations |
| The discipline's core move | Everything as code, shared ownership | Treat reliability as software engineering | Embed engineers in customer environments |
| Primary artifact | Pipeline | Error budget / SLO | DeploymentPrimitive |
| Knowledge capture | Runbook | Playbook / postmortem | PatternLibrary |
| Feedback loop | CI/CD metrics | SLO dashboards | Full Loop |
| Org change required | Dev and Ops in same team | SREs embedded in product teams | FDEs with knowledge capture infrastructure |
| Systematizer role it created | Platform Engineer | Chaos Engineer / Reliability Engineer | ??? |

Notice the last row.

Every discipline creates two jobs: the practitioner, and the person who systematizes the practitioner's knowledge. DevOps created Pipeline Engineers — and then Platform Engineers who built the internal developer platforms that made pipelines reusable and governable. SRE created Reliability Engineers — and then Chaos Engineers who stress-tested the assumptions those reliability engineers were making.

FDE creates Forward Deployed Engineers. But the second job — the systematizer — doesn't have a name yet. And that's the tell. It means the discipline is still in its first generation.

The Recursive Move

There's something deeper in this pattern worth naming.

Each discipline applies the same trick: take something humans were doing manually and ad-hoc, and bring software engineering discipline to it. The move is always the same. The boundary is always somewhere new.

What the FDE operator — a Kubernetes-based infrastructure pattern for codifying deployment knowledge — does is make that recursive move one more time. It applies software engineering discipline to the FDE work itself.

Field intelligence flows in from actual customer deployments. That intelligence gets parameterized into DeploymentPrimitives. Those primitives accumulate into a PatternLibrary. The Full Loop closes when a new customer deployment pulls from that library rather than being built from scratch. And then the new deployment generates its own field intelligence, which improves the library, which makes the next deployment cheaper.
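The loop above can be sketched as plain data flow. Everything in this sketch is hypothetical: the type names, fields, and methods are chosen to mirror the DeploymentPrimitive and PatternLibrary vocabulary, not to describe any real operator API.

```python
from dataclasses import dataclass, field

@dataclass
class FieldObservation:
    """Raw intelligence from a live customer deployment."""
    customer: str
    finding: str  # e.g. "stagger ETL start to avoid the 11pm batch window"

@dataclass
class DeploymentPrimitive:
    """A field observation parameterized for reuse."""
    name: str
    parameters: dict                 # customer-specific values extracted as knobs
    source_customers: list = field(default_factory=list)

class PatternLibrary:
    """Accumulates primitives; new deployments pull from here first."""
    def __init__(self):
        self.primitives = {}

    def contribute(self, primitive: DeploymentPrimitive):
        self.primitives[primitive.name] = primitive

    def pull(self, name: str):
        return self.primitives.get(name)  # None means: build from scratch

# Closing the Full Loop: observation -> primitive -> library -> next deployment
lib = PatternLibrary()
obs = FieldObservation("acme-corp", "stagger ETL start to avoid the 11pm batch window")
lib.contribute(DeploymentPrimitive(
    name="batch-window-staggering",
    parameters={"window_start": "23:00", "offset_minutes": 45},
    source_customers=[obs.customer],
))
reused = lib.pull("batch-window-staggering")  # the next customer starts here, not from zero
```

The design choice worth noticing is that the library is keyed by primitive, not by customer: the same observation, once parameterized, serves every subsequent engagement.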

This is the same move SRE made with reliability, applied to deployment knowledge: treat it as a software engineering problem with named artifacts and a closed feedback loop. The insight is structural, not cosmetic.

Three Practices That Don't Have Names Yet

If the systematizer role doesn't have a name, neither do the practices it will own. Here are three of them.

Pattern Engineering

When DevOps gave us "infrastructure as code," it created an entire sub-discipline around writing that code well — modularity, versioning, idempotency, testing. The tooling followed: Terraform, Pulumi, Ansible. Careers followed. Senior engineers who specialized in infrastructure abstraction became a real and valued job.

Pattern Engineering is the equivalent for deployment knowledge. The discipline of extracting, parameterizing, versioning, and governing deployment primitives from field work. It sounds deceptively mundane. It isn't.

The core tension in Pattern Engineering is the abstraction level problem: a primitive too specific to one customer has low reuse; a primitive too generic loses the operational judgment that made it useful. This is exactly the problem library authors face when designing APIs — you're choosing between expressiveness and safety, flexibility and learnability. But the literature on API design is deep and established. The literature on deployment primitive design is essentially nonexistent.

Pattern Engineering will also need to own deprecation governance (when does an old primitive stop being supported?), quality gates (what does a primitive need to demonstrate before it enters the library?), and version management across customer environments that may be running different primitive versions simultaneously.
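A quality gate and a deprecation check can be expressed as ordinary code. This is a minimal sketch under assumed rules; the `Primitive` fields, the two-customer threshold, and the semver convention are all illustrative, not an existing standard.

```python
from dataclasses import dataclass

@dataclass
class Primitive:
    name: str
    version: str        # semver, so different customers can pin different versions
    reuse_count: int    # distinct customers that have deployed it successfully
    deprecated: bool = False

def passes_quality_gate(p: Primitive, min_reuse: int = 2) -> bool:
    """Hypothetical gate: a primitive enters the library only after it has
    proven itself at more than one customer and has not been deprecated."""
    return not p.deprecated and p.reuse_count >= min_reuse

candidate = Primitive("cobol-batch-bridge", "1.2.0", reuse_count=3)
one_off = Primitive("acme-custom-auth-shim", "0.1.0", reuse_count=1)
```

The point of the gate is exactly the abstraction-level tension above: requiring reuse across multiple customers is one crude but checkable proxy for "not too specific to one environment."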

No current discipline owns this. DevOps engineers think about infrastructure abstractions, not deployment knowledge abstractions. SREs think about reliability characteristics, not operationalization patterns. Pattern Engineering is its own craft.

Engagement Archaeology

Back to Gary.

When an FDE finishes an engagement like Gary's COBOL job, they leave with hard-won knowledge: why that authentication approach and not the simpler one, why the data quality check order matters, what three approaches failed before the one that worked. They also leave with a working deployment.

But they leave. And most of that knowledge goes with them.

The next FDE who hits a similar COBOL integration at a different customer starts from zero. Or from a conversation in Slack where someone vaguely remembers: "I think Jamie did something like this at [company] — you could ask her."

Engagement Archaeology is the systematic mining of completed engagements for reusable intelligence. It treats past engagements as a corpus to be analyzed, not a history to be forgotten. The artifact it produces is a DeploymentPrimitive. The practice it most resembles is SRE's post-incident review — except the trigger is deployment completion rather than service degradation, and the output populates a PatternLibrary rather than an incident report.

The interesting question for Engagement Archaeology is what structured information to extract. It's not just "what worked." It's:

  • What was the decision space? What options were considered?
  • What constraints made the general solution fail and the specific one work?
  • What signals would indicate this pattern applies to a new customer versus a superficially similar one that requires something different?

This is harder than extracting technical steps. It's extracting judgment. And judgment is the thing that currently evaporates when the FDE moves on.
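One way to make "extracting judgment" concrete is to give it a schema. The record type below is a hypothetical sketch: the field names simply restate the three questions above as structured data, and the sample values echo the Gary scenario.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class EngagementRecord:
    """What Engagement Archaeology mines from a completed engagement.
    The fields capture judgment, not just the steps that happened to work."""
    customer: str
    decision_space: List[str]         # options considered, including rejected ones
    binding_constraints: List[str]    # why the general solution failed here
    applicability_signals: List[str]  # how to tell a new customer truly matches
    failed_approaches: List[str] = field(default_factory=list)

record = EngagementRecord(
    customer="acme-corp",
    decision_space=["direct DB sync", "file drop + nightly import", "API shim"],
    binding_constraints=["no inbound network access to the mainframe"],
    applicability_signals=["batch-oriented system of record", "no documented API"],
    failed_approaches=["direct DB sync (blocked by network policy)"],
)
```

Even a schema this small forces the departing FDE to write down the rejected options, which is precisely the knowledge that otherwise leaves with them.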

Field Loop Engineering

SRE measures reliability. DevOps measures deployment frequency. Neither discipline owns the measurement FDE infrastructure lives or dies by: how fast does field intelligence become platform capability?

That feedback loop — from deployment observation to platform primitive to library to next deployment — is the core value mechanism of FDE infrastructure. When it works well, each customer engagement makes the next one cheaper. When it doesn't, each engagement is equally expensive and the model never scales.

Field Loop Engineering is the discipline of managing that feedback loop as a first-class operational concern, with its own measurement framework:

  • Full Loop latency < 2 weeks — the SLO. From field observation to primitive in library, the cycle shouldn't take longer than two weeks. If it does, the knowledge is stale before it's reusable.
  • Pattern reuse rate > 60% — the SLI. If more than 40% of deployment work at a new customer site is being done from scratch rather than pulled from the library, the library isn't working.
  • Pattern contribution rate per FDE per quarter — the leading indicator. If FDEs aren't contributing primitives, the loop is broken upstream.
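The three metrics above are straightforward to compute once the underlying events are recorded. A minimal sketch, assuming only timestamps and task counts; the two-week and 60% thresholds are the article's proposed targets, not an established standard.

```python
from datetime import date

def full_loop_latency_days(observed: date, in_library: date) -> int:
    """SLO input: days from field observation to primitive available in the library."""
    return (in_library - observed).days

def pattern_reuse_rate(reused_tasks: int, total_tasks: int) -> float:
    """SLI: share of deployment work pulled from the library vs built from scratch."""
    return reused_tasks / total_tasks

# A deployment task breakdown at a hypothetical new customer
latency = full_loop_latency_days(date(2025, 3, 3), date(2025, 3, 12))
reuse = pattern_reuse_rate(reused_tasks=7, total_tasks=10)

meets_slo = latency <= 14   # the two-week Full Loop target
meets_sli = reuse > 0.60    # the 60% reuse target
```

Nothing here requires new tooling; the hard part is instrumenting engagements so that "observed," "in library," and "reused" are recorded events rather than anecdotes.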

None of these metrics exist anywhere in standard engineering dashboards. SRE dashboards show p99 latency and error rates. DevOps dashboards show deployment frequency and lead time. No current discipline has even named the feedback mechanism these metrics would measure.

What This Means for Org Design

DevOps didn't just change how code got deployed. It changed the org chart: no more separate dev and ops tribes. Engineers who resisted working across that boundary became organizational liabilities.

SRE changed reporting structures: SREs embedded in product teams, with reliability budgets that created explicit negotiation between product velocity and system stability.

FDE infrastructure will require a similar structural change. Specifically, it requires creating a role that currently doesn't exist: the person who owns the PatternLibrary, governs the DeploymentPrimitive quality gate, and reports Full Loop latency as a KPI to leadership.

This isn't overhead. It's the load-bearing piece. Without it, the FDE model grows headcount linearly with customer count. With it, each new FDE becomes more effective because of everyone who came before them. The difference between those two trajectories, compounded over three years, is enormous.
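The two trajectories can be made concrete with a toy cost model. The numbers and the maturation rule are invented for illustration only: each engagement costs a base number of FDE-weeks, discounted by how much of the work the library already covers.

```python
def cumulative_effort(customers: int, base_weeks: float, reuse_rate: float) -> float:
    """Toy model: total FDE-weeks across a sequence of engagements.
    reuse_rate=0 is the linear model (no knowledge capture); otherwise the
    library matures toward the target reuse rate over the first engagements."""
    total, rate = 0.0, 0.0
    for _ in range(customers):
        total += base_weeks * (1 - rate)
        rate = min(reuse_rate, rate + reuse_rate / 4)  # library ramps up, then plateaus
    return total

linear = cumulative_effort(20, base_weeks=12, reuse_rate=0.0)       # no library
compounding = cumulative_effort(20, base_weeks=12, reuse_rate=0.6)  # working Full Loop
```

Under these assumed numbers the compounding organization spends less than half the FDE-weeks over twenty customers; the exact figures matter less than the shape, which is linear in one case and flattening in the other.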

The first company to formalize this role — call it Pattern Engineer, Field Platform Engineer, Deployment Intelligence Engineer, whatever — gains a compounding advantage that competitors can't easily replicate. Because the library that role builds isn't just a technical asset. It's institutional memory with an API.

The Open Question

DevOps got its watershed moment when the State of DevOps Report started producing data. High-performing engineering organizations deployed more frequently, had shorter lead times, recovered from incidents faster, and had lower change failure rates. The data was clean. The four DORA metrics gave every team a scorecard and gave the discipline a legitimate claim to business impact.

SRE got its watershed moment when Google published the SRE book. The error budget framework became public, became teachable, became hirable. "Do you have SLOs?" became a due-diligence question in acquisition conversations.

FDE infrastructure needs both of these: a measurement framework that makes compounding versus evaporating legible, and public documentation of what "high-performing" actually looks like.

Nobody has published that framework yet. The DORA metrics took years of research across hundreds of organizations to establish. The FDE equivalent will require the same rigor — real data from real FDE organizations comparing the ones where knowledge compounds against the ones where it evaporates, and reverse-engineering what distinguishes them.

That research doesn't exist yet. It's a genuinely open question.


If you had to name the four DORA-equivalent metrics for FDE organizations — the ones that separate compounding knowledge from evaporating knowledge — what would they be?


The FDE Infrastructure Series

  1. FDE and the Cloud-Native Operator Model
  2. The Enterprise Thought Piece: Why FDE Exists
  3. FDE Model Economics
  4. Enterprise Constraints and AI Lab Environments
  5. The Operator Beachhead Strategy
  6. MCP and Operator Layers
  7. Three Walls ← you are here

Built by [Decision Crafters](https://github.com/decision-crafters) — engineering and coordination infrastructure for complex systems.
