How similar should the dev environment and production environment be?


You'll often hear reliability folks say something like:

The development environment should be as similar
to the production environment as possible.

That can't be correct, though. Have you ever seen anybody deploy an identical production environment, with mirrored traffic, for each developer on the team? Would that be a pleasant environment to develop in? Is it "not possible"? Technically it is possible, yet it seems far-fetched to me.

I'll suggest the contrary: the aspirational development environment does not resemble the production environment.

Let's step back first. What is the goal of a software deliverable? Well, the production version of your code aspirationally satisfies at least these requirements:

  • Correctness - System delivers results that match expectations.
  • Latency - System delivers results within the expected delay after being triggered.
  • Availability - System responds to triggers at all times it is expected to.
  • Throughput - System can satisfy the other requirements while processing the expected number of simultaneous requests.

In shipping a software deliverable, you build, test, deploy, monitor, and automate systems in order to converge upon expectations for each of these requirements. The environment necessary depends on the requirement in question.

The development environment's aspiration is to provide the most efficient suite of tools for bringing a system to full correctness. And you stop there, because being good at correctness is hard enough already. If a single one-size-fits-all environment were the best way to get there, then building one would be the advice.

Want a way to validate availability for a distributed system? Deploy some servers into a staging environment, and initiate scenarios that resemble potential disaster incidents (server failure, hardware failure, etc.). Not the dev environment.
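To make the shape of such a scenario concrete, here is a minimal in-memory sketch. It is a toy stand-in, not a real staging setup: the `Replica` class, the `serve` failover loop, and `survives_single_failure` are all hypothetical names invented for illustration, and a real test would take down actual servers rather than flip a flag.

```python
class Replica:
    """Hypothetical stand-in for one server in a staging environment."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


def serve(replicas, request):
    """Try each replica in order and return the first successful response."""
    for replica in replicas:
        try:
            return replica.handle(request)
        except ConnectionError:
            continue
    raise RuntimeError("no replica available")


def survives_single_failure(replicas, request="ping"):
    """Take each replica down in turn and check the request is still served."""
    for victim in replicas:
        victim.alive = False  # simulated server failure
        try:
            serve(replicas, request)
        except RuntimeError:
            return False
        finally:
            victim.alive = True  # restore before the next scenario
    return True
```

Calling `survives_single_failure([Replica("a"), Replica("b")])` checks the two-server case, while a single-replica list shows the scenario failing.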

Want a way to validate latencies for a distributed system? Deploy enough servers with a network that resembles production and send a simulated workload through it. Not the dev environment.
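As a rough sketch of what "send a simulated workload through it" could look like, the snippet below times repeated calls and compares the 99th percentile against a budget. The `send_request` callable, the choice of p99, and the budget value are assumptions made for illustration; a real run would issue requests against the production-like deployment.

```python
import statistics
import time


def measure_latencies(send_request, n_requests):
    """Call send_request n_requests times; return each call's elapsed seconds."""
    latencies = []
    for _ in range(n_requests):
        start = time.perf_counter()
        send_request()
        latencies.append(time.perf_counter() - start)
    return latencies


def p99(latencies):
    """99th percentile; quantiles(n=100) returns 99 cut points, the last is p99."""
    return statistics.quantiles(latencies, n=100)[-1]


def meets_latency_budget(latencies, budget_seconds):
    """True if the 99th percentile latency is within the budget."""
    return p99(latencies) <= budget_seconds
```

For example, `meets_latency_budget(measure_latencies(send_request, 1000), 0.2)` would check a hypothetical 200 ms budget over a thousand simulated requests.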

No single pre-production environment satisfies all of these. Taken together, your pre-production environments should cover every necessary requirement. Each one should simulate the production environment only in the ways that matter for the requirement it targets.
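One way to keep yourself honest about that aggregate is to write the mapping down and check it. The sketch below does this for the four requirements listed earlier; the environment names and which requirements each one targets are assumptions for illustration, not a recommended layout.

```python
from enum import Enum


class Requirement(Enum):
    CORRECTNESS = "correctness"
    LATENCY = "latency"
    AVAILABILITY = "availability"
    THROUGHPUT = "throughput"


# Hypothetical environments and the requirements each is meant to validate.
ENVIRONMENTS = {
    "dev": {Requirement.CORRECTNESS},
    "staging-failure-drills": {Requirement.AVAILABILITY},
    "load-test": {Requirement.LATENCY, Requirement.THROUGHPUT},
}


def uncovered(environments):
    """Return the requirements that no environment in the mapping targets."""
    covered = set().union(*environments.values())
    return set(Requirement) - covered
```

An empty result from `uncovered(ENVIRONMENTS)` means every requirement has at least one environment aimed at it; anything left over is a gap to fill with a new environment, not a reason to make dev look more like production.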

That's all for this essay. If you have a question or an interesting thought to bounce around, email me back at david@davidmah.com. I'd enjoy the chat.

This essay was published 2020-05-03.