My father started his career working in one of Norway's most incredible software companies: Norsk Data AS. There he used line editors and punch cards, and created compilers. Norsk Data developed, produced, and sold "mini-computers" that cost several hundred thousand dollars each. Since then he has seen those computers shrink to desktops, large phones, and small smartphones, become part of the cloud, and get virtualized using VMware.
I was sharing how I work with Kubernetes and containers when he asked: "Can you explain in a few sentences what it is and why I should care?" So I thought I could test my own knowledge and write a blog post about it, and extend the answer to be relevant to managers with the same question: managers who want to know why they should set aside money to train their developers and port applications to Kubernetes.
Containers are a technology that enables developers to create a standardized unit of software. This unit consists of the source code packaged together with setup instructions for the environment where the code will run. A container runtime takes that package and creates an isolated area by using virtualization at the operating system level. The isolated area gets, among other things, its own process tree, CPU/memory slice, and file system. So from the application's point of view, it looks like it has the machine all to itself. The most popular container runtime is Docker. The standardized unit of software is called an image, and the instances that the container runtime creates from it using OS virtualization are called containers.
If you are coming from the VM world you can imagine creating a new image for each version of your software.
Creating these images is easier than creating VMs. In the case of Docker, you create a special file called a Dockerfile that describes how to create an image. This file is checked in alongside your source code and describes the steps to set up the environment that the code will run in. Let us look at an example:
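As a minimal sketch, assume a hypothetical Go service whose `go.mod` sits at the root of the repository. The base image tags and the binary name `app` are illustrative choices, not taken from a real project.

```dockerfile
# Build stage: the environment that compiles the code (assumed Go toolchain image).
FROM golang:1.22 AS build
WORKDIR /src
# Copy the source code that lives next to this Dockerfile.
COPY . .
# Produce a statically linked binary; "app" is a hypothetical name.
RUN CGO_ENABLED=0 go build -o /out/app .

# Runtime stage: the environment the application actually runs in.
FROM alpine:3.20
COPY --from=build /out/app /usr/local/bin/app
# Start the application when a container is created from this image.
ENTRYPOINT ["/usr/local/bin/app"]
```

The first `FROM` describes the build environment and the second describes the much smaller environment that ends up in the final image.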
Without going into detail, the developers are stating exactly what the environment that builds the software should look like, as well as the environment that runs the code. The running application "thinks" it is running alone on a machine. Notice that all dependencies of the application are reduced to the Docker runtime, i.e., if you have Docker installed you can run the application. No more hunting down missing OS features on an individual machine.
Ok, that is all well and good but why should you as a manager care?
Let's say that you had a great idea for a start-up in the '90s. You had to buy servers, spend money on people who could host it, and constantly monitor the machines for problems such as disk crashes. And since it is a start-up, the traffic on the site would most likely be very small in the beginning, i.e., the machines would spend most of their CPU cycles doing nothing. You probably estimated a wave of users in the first weeks and bought big machines to handle the traffic. In other words: a lot of wasted CPU cycles and available memory.
VMs partially solved this by "splitting" the hardware into several virtual machines. That way we reduced the cost spent on hardware dramatically. However, with VMs you still pay for the operating system license for each of the VMs and you still need to guess how large the VMs should be. The VMs also spend most of their time wasting CPU cycles idling. The operating systems also need to be patched, adding to the operating costs.
Containers take virtualization one step further by basically saying: instead of slicing the hardware into several virtual machines, let the operating system do it. Each of these slices is called a container. There are a host of benefits to doing it this way. We pay for the OS license once. The containers can be created and destroyed in seconds in software instead of waiting 30 minutes for an OS install in the VM case. The utilization of the hardware can be many times higher.
In one of my projects we were able to reduce hosting costs by over 90%. The team was already hosting the applications with a cloud provider, but by utilizing containers and the container orchestrator Kubernetes, the need for CPUs and hosting plans was reduced drastically.
This argument also holds if you are renting your servers through a cloud provider directly or through a SaaS or PaaS. There are, however, cloud services where you only pay for the exact usage, i.e., small start-up traffic = low cost. But there are more reasons why you should care about containers that a SaaS or PaaS does not deliver on. One of them is immutability.
"But it worked yesterday! And we did not deploy any new code..."
A very good design decision that was made in the container community was that the parts that make up an image (the specification that containers are created from) are immutable. You cannot change an image once it has been built. Need to make a change? Create a new one. This eradicates those problems where someone just needed to make an insignificant change in production that turned out to be not so insignificant.
"Works on my machine" to "Works wherever we have Docker" aka fewer bugs aka happier customers
Developers need to have a local copy of the software they are creating on their machines. This is where they do development. Depending on how complicated the environment is, this local setup can deviate a great deal from the environment where the software runs in production. This deviation leads to a number of problems.
One problem is when the team has tested the functionality in all of the environments except production. The code is shipped, only to reveal that the production setup is slightly different, resulting in a bug. Why not have an exact replica of production, you might ask? Well, often this is not feasible, or at least it was not before containers came along. Production is set to update at a different time from the test environment, or production needs more machines to load balance the traffic. There can be several (good) reasons for having a slightly different environment in production.
The probability of exposing a bug to the end user due to differences in environments is reduced dramatically with containers. Another argument is the time developers spend setting up the development environment before they can do any development. I have seen page after page of instructions on how to set up an environment. And if a developer was working on several projects with conflicting setups on the same machine, she needed to use virtual machines, or at least something to keep the environments from interfering with each other.
Just a final example to bring it all home on how easy it is to set up an environment. Wondering how long it will take to set up a professional build server that lets the developers build better software? This one-liner:
Dev to DevOps
There are arguments to be made that organizations that embrace DevOps are more likely to succeed due to the increased speed of delivering valuable working features to customers. We saw that in the creation of the images, which are the blueprints for the container creation, we needed to bring in some of the operations part of software development. One example is the decision on what operating system the application should be running on. It turns out that moving to containers is a good way to "push" the organization, and the developers, in the direction of DevOps practices.
There are many definitions of what DevOps is. In this blog post, we will define it as a set of practices that extends the developers' area of concern so that it does not stop once the code is tested and handed over for deployment. It also covers monitoring the code, making sure it runs effectively in production, and checking that users are actually using the new features.
With the code for environment setup bundled with the application code, like we saw above, we are well on our way. However, a system or SOA service is so much more than a single application. It is often numerous applications working together to form some functionality. The modern requirements for scaling, uptime, zero-downtime deployments, etc. are pretty high.
We need to orchestrate these applications and keep them running. Enter the world of container orchestrators.
There are numerous container orchestrators, such as Mesos, Docker Swarm, and Kubernetes (K8s). Kubernetes has emerged as the de facto standard, so we will concentrate on that one here. However, the arguments hold true for the other ones as well.
We are now ready for a definition straight from the Kubernetes website:
Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications.
A good definition, but it needs some elaboration for managers. You have packaged your applications into containers and are ready to run them in production. One way of doing that is to run the container directly on a VM with the Docker runtime. However, what happens if that VM is having issues? Or if your service quickly becomes so popular that just one instance of your container is not enough? Also, as you move to create larger systems with multiple different applications (in containers) that need to collaborate, your deployment and hosting become more and more difficult.
Kubernetes changes this by being a platform for running, scaling, and managing a set of containers. Now you have a standard set of K8s APIs that you can interact with to deploy your set of containers declaratively. You describe the K8s objects declaratively in YAML files and use a CLI to send them to the K8s APIs. Let's look at an example.
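A minimal sketch of such a file, assuming a hypothetical application that simply runs the public `nginx` image. The names, labels, and image tag are placeholders.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment        # hypothetical name
  labels:
    app: nginx
spec:
  replicas: 3                   # desired number of running instances
  selector:
    matchLabels:
      app: nginx                # must match the pod template labels below
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2     # assumed image; replace with your own
        ports:
        - containerPort: 80
```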
This is an example of a file in the YAML format that describes a K8s deployment. Notice that the image is defined on line 19. By default, K8s will pull the images from a registry called Docker Hub. You can of course push your images to that registry or create your own. Then you use a command-line interface (CLI) called kubectl to "push" the file to your K8s API:
> kubectl apply -f deployment.yaml
That is it. K8s will now make sure that it is always running 3 instances of the application.
We are of course only scratching the surface of the possibilities K8s gives us. You have network policies, so you can declaratively describe what kind of traffic between applications is allowed. You have mutating and validating webhooks, so you can mutate or validate all of the YAML sent to the API and make sure your company policies are followed. You have custom scaling rules, so you can create an elastic platform that scales according to load. You have zero-downtime deployments that can be rolled back automatically, and the list goes on. A common denominator is that all of these features can be described declaratively in code, which brings in all the benefits of code compared to ad-hoc wiki descriptions.
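To make one of these concrete, a scaling rule could be sketched like this. It assumes a hypothetical Deployment named `nginx-deployment`, that its containers declare CPU requests, and that a metrics server is installed in the cluster. The replica limits and the 70% threshold are arbitrary.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-autoscaler          # hypothetical name
spec:
  scaleTargetRef:                 # the workload to scale (assumed to exist)
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 3                  # never go below three instances
  maxReplicas: 10                 # arbitrary upper bound
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70    # add instances when average CPU use exceeds 70%
```

Like the deployment, this file would be checked in with the code and sent to the API with `kubectl apply -f`.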
Attracting good developers
As a manager, you want to have the best developers working on your team. And at the moment there are arguments to be made that the best developers want to work with containers and K8s, simply because it enables them to create better systems.
I think it is pretty safe to say that, for the time being, containers and K8s will be a key component in meeting the ever-increasing demands from customers. The ecosystem is thriving and the technology is now at a stage where a lot of big companies are running it in production. You as a manager can be part of a push to reduce cost, bring in highly motivated developers, and create platforms that give you a competitive edge.