I am currently contracting for a large insurance company in the Nordics, helping them with architecture and Kubernetes (K8s) setup. When I first joined the project, their environments were arranged in an unorthodox way. They were spread across only two AKS clusters in two different subscriptions, with namespaces acting as environments. Examples of namespaces were dev, test, acceptance, staging, etc.
I am not passing judgment, as there are probably good reasons why it turned out that way. However, we wanted to simplify it and reduce the number of environments overall. We wanted the standard setup of dev, test, and production AKS clusters, with each team getting a namespace in each cluster, so that namespaces no longer act as environments. Ideally, we would also have had the dev cluster in a separate Azure subscription.
But then the questions started coming: what do we do when a team has an external dependency, e.g., an on-prem data warehouse, that has more environments than our new setup? In addition, testers were expecting stable environments like acceptance and staging.
I began to think about a concept that I first encountered while studying discrete mathematics at university. The first time I read it, I had to go over it a couple of times. It seemed so self-explanatory that I was worried I had misunderstood it. The concept was the pigeonhole principle. The short version is that if you have x pigeonholes and x + y pigeons, where y is a whole number larger than 0, then at least two pigeons need to share a pigeonhole. Yeah, you read that right.
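To make the principle concrete, here is a minimal Python sketch. The cluster names and the list of expected environments are only assumptions for illustration. It counts how many expected environments land in each cluster and checks the count against the pigeonhole lower bound.

```python
from collections import Counter
from math import ceil


def min_crowding(pigeons: int, holes: int) -> int:
    """Lower bound on how many pigeons the fullest hole must hold."""
    return ceil(pigeons / holes)


# Hypothetical example: the three clusters are the pigeonholes ...
clusters = ["dev", "test", "production"]

# ... and the environments an external dependency expects are the pigeons.
# The mapping itself is an assumption; any mapping hits the same lower bound.
mapping = {
    "dev": "dev",
    "test": "test",
    "acceptance": "test",
    "staging": "test",
    "production": "production",
}

load = Counter(mapping.values())
fullest = max(load.values())

# With 5 environments and 3 clusters, some cluster must hold at least 2.
assert fullest >= min_crowding(len(mapping), len(clusters))
print(dict(load))
```

No matter how the five environments are mapped, at least one of the three clusters ends up hosting two or more of them.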
Bringing this principle to our new setup, we see that an environment like test will need to hold several different versions for teams expecting more than one testing environment. But is that a problem? We landed on no. It is totally up to the teams how they want to handle the environments for their external dependencies. If they have acceptance and staging versions in their testing environment, we are totally fine with that. The team can deploy acceptance and staging K8s Services and Deployments in the same namespace and split traffic with different Ingresses. Easy.
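As a rough sketch of what that could look like, the Python below builds a Deployment, Service, and Ingress per variant, all in one team namespace, and prints them as a JSON list (Kubernetes accepts JSON manifests as well as YAML). The namespace `team-a`, the app name `orders`, the image tags, and the hostnames are all hypothetical, and the Ingress omits cluster-specific details such as `ingressClassName` and TLS.

```python
import json

NAMESPACE = "team-a"  # hypothetical team namespace in the test cluster
APP = "orders"        # hypothetical application name


def manifests(variant: str, image: str, host: str) -> list[dict]:
    """Build Deployment, Service and Ingress for one variant of the app."""
    name = f"{APP}-{variant}"
    labels = {"app": APP, "variant": variant}
    meta = {"name": name, "namespace": NAMESPACE, "labels": labels}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": meta,
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{
                    "name": APP,
                    "image": image,
                    "ports": [{"containerPort": 8080}],
                }]},
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": meta,
        # The variant label keeps acceptance and staging pods apart.
        "spec": {"selector": labels,
                 "ports": [{"port": 80, "targetPort": 8080}]},
    }
    ingress = {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": meta,
        # One host per variant is what splits the traffic.
        "spec": {"rules": [{
            "host": host,
            "http": {"paths": [{
                "path": "/",
                "pathType": "Prefix",
                "backend": {"service": {"name": name,
                                        "port": {"number": 80}}},
            }]},
        }]},
    }
    return [deployment, service, ingress]


# Hypothetical images and hostnames for the two variants.
items = (
    manifests("acceptance", "registry.example.com/orders:1.4.0",
              "orders-acceptance.example.com")
    + manifests("staging", "registry.example.com/orders:1.5.0-rc1",
                "orders-staging.example.com")
)
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```

Both variants live in the same namespace. Only the `variant` label and the Ingress host tell them apart.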
The alternative, where the platform has to mirror every environment of every external dependency, is not a long-term solution. A good way to evaluate a solution is to extrapolate it forward. How will it look if teams need ten environments? How about 20?
Another point is that the question itself may be wrong. Instead of asking whether they can have more environments, teams should ask how to reduce them. Often, teams want more environments because testers need a stable one in which to verify a particular version. Trunk-based development turns this setup upside down: constantly push small changes into production. If there are problems or bugs, you have a much smaller set of commits to search through to identify the error. Instead of long-lived "stabilization" branches, the teams can use feature flags to keep unfinished work switched off, so environments stay stable for testers.
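A feature flag can be as simple as a configuration value checked at runtime. The sketch below is a minimal illustration, not a recommendation of a particular flag library. The `FEATURE_` environment variable prefix, the flag name `new_pricing`, and the `quote_premium` function are all made up for the example.

```python
import os


def flag_enabled(name: str, default: bool = False) -> bool:
    """Read a flag from an environment variable such as FEATURE_NEW_PRICING."""
    raw = os.environ.get(f"FEATURE_{name.upper()}")
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def quote_premium(base: float) -> float:
    """Hypothetical business function with new behaviour behind a flag."""
    if flag_enabled("new_pricing"):
        # New code path: merged to trunk and deployed, but switched off
        # until the testers have verified it.
        return round(base * 0.95, 2)
    # Existing, stable behaviour.
    return base
```

The same build can then run everywhere. Testers turn the flag on where they want to verify the new behaviour, while it stays off for everyone else.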
When the question is wrong, it is OK to push back on the requirements. Reducing the number of environments, and in return getting a much faster flow of features through the ones that remain, can be a good trade. Work closely with the testers and familiarize them with feature flags and the new approach.
Do not use namespaces for environments like dev, test, and production.
It is OK for pigeons to share a home.