I have been working on moving a 12-year-old Rails application from virtual machines to Docker for a year now (we’re a small team and still need to produce features and fix bugs) and thought I would share some mistakes I made so that others may experience less pain.
As I see it, there are two rather large problems with understanding Docker:
- Assimilation vs accommodation. Docker (and other containers) seem to be a lot like virtual machines but very much are not. Unfortunately brains love to shove information into already established areas and resist accommodating new ideas. Especially if those new ideas seem like old ideas. This is called assimilation and it isn’t trivial but it’s a heck of a lot easier than accommodation. Accommodation involves a new conceptual framework. This is when ideas are decidedly not like other things you are familiar with. Docker seems like virtual machines but it has major differences that require accommodation.
- Documentation on Docker and its orchestration (Kubernetes, the Hashi stack, etc.) has ballooned to the point that reading all of it is incredibly frustrating, as each part tends to refer to many other concepts that the reader doesn’t understand yet. The word salad of confusing terms and new ideas obfuscates many important core ideas.
Given that preamble, here is the first concept I misunderstood as it flew by my eyes somewhere in the docs: leave restarting processes to the framework. It is not your responsibility to make it happen, only to configure how it happens. The following is an example of how we learned this the hard way.
One of the services that support our main application is written in Clojure. On a VM or “bare metal” it’s important to use something to restart the application if it crashes. Since Clojure is a JVM language we were able to use the tried and tested “daemonization” concept. We decided to use JSVC to do the actual monitoring/restarting. In practice this means that JSVC starts up two processes: one running as root to monitor/restart the application, and another that runs as a lower-privilege user and does the actual work of the application. We spent an embarrassing amount of time trying to get this to work in Docker.
When using JSVC in Docker, starting the container resulted in an immediate crash. Where were the logs? Gone. Docker, when started by Nomad (or with `docker run --rm`, which most tutorials recommend), is ephemeral: the instant a container stops, all information not baked into the image is gone. So we re-invented a technique I’m sure many developers have before us: make the startup script (the “entrypoint” or “cmd”) an infinite loop with a sleep inside, then use `docker exec -it <container> /bin/bash` to “sorta like ssh but not really” into the “machine,” run the process manually, and see the errors/logs it produces. Yes, taking out the entrypoint (or cmd), rebuilding the image, and running `docker run -it <image> /bin/bash` accomplishes the same thing, but we didn’t know that at the time.
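For the curious, here is a minimal sketch of the keep-it-alive trick. You don’t even need to rebuild the image: `docker run` can override the entrypoint on the command line (the image name `my-clojure-app` is a stand-in for whatever you’re debugging):

```shell
# Start the container with a do-nothing loop as its main process,
# so it stays up long enough to be inspected.
docker run -d --name debug-me --entrypoint /bin/sh my-clojure-app \
  -c 'while true; do sleep 60; done'

# "Sorta like ssh but not really": open a shell inside the running
# container, start the app by hand, and watch the errors scroll by.
docker exec -it debug-me /bin/sh
```

As long as the loop keeps the main process alive, the container (and everything you write inside it) sticks around for inspection.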
Unfortunately, using JSVC to start up a JVM app leads to the obscure error “set_caps(CAPS) failed for user …” Researching the error turns up suggestions to use “privileged mode.” However, everyone warns against privileged mode. At the time, no amount of searching would reveal the root cause. The short answer was that nobody does daemonization in Docker, because it’s forbidden (for good reason) and also not necessary.
Why is it forbidden? JSVC and daemonization are trying to give a user permission to monitor/kill processes owned by other users. This is fine in a virtual machine because there is insulation between the vm processes and the host machine. Having processes kill or manage other processes is fine because they are all in the vm and can’t hurt the host. A container may be thought of as a full other machine but it is not. In many ways it is just another process running on the host with a large dose of Linux awesome to make it run like it’s in another operating system. Having one process be able to kill or manipulate another process (that could be owned by another company when, say, Amazon is hosting a container) is a huge security hole. Therefore Docker correctly shuts down any attempt to do so. Btw, “privileged mode” turns off this protection.
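You can see this capability model in action from the command line. Docker grants containers only a small default set of Linux capabilities, and operations outside that set get shut down, which is (as far as I can tell) exactly what JSVC was tripping over:

```shell
# chroot is in Docker's default capability set (CAP_SYS_CHROOT),
# so this should succeed:
docker run --rm alpine chroot / true

# Drop every capability and the same call should fail with
# "Operation not permitted", even though the process runs as root:
docker run --rm --cap-drop ALL alpine chroot / true
```

Privileged mode (`--privileged`) is the opposite extreme: it hands the container essentially all capabilities and host device access at once, which is why everyone warns against it.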
Why is daemonization not needed? The job of monitoring/restarting/managing is handled by K8s or the Hashi stack or whatever is used to look after the containers. The main (or only) process of a container essentially IS the container. When it stops, the whole thing is restarted per configuration of the container wrangler.
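In other words, you configure the restart behavior and let the container wrangler do the watching. A minimal sketch with plain Docker (the image name `my-clojure-app` is hypothetical; Nomad’s `restart` stanza and Kubernetes’ `restartPolicy` express the same idea in their own config):

```shell
# Run the app as the container's one and only main process, and tell
# Docker to restart the whole container if that process exits non-zero,
# giving up after 3 attempts.
docker run -d --restart=on-failure:3 my-clojure-app
```

No JSVC, no second monitoring process: when the app dies, the container dies, and the runtime brings it back per your configuration.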
At the core here is a Docker philosophy that gets lost amongst all the other documentation: applications that used to use an entire VM or host are now “just” another process on that host. This allows all the container apps to use resources as needed, instead of the wasteful and inflexible assignment of memory and CPU cycles to a VM that doesn’t need them 90% of the time. If a container isn’t using some memory, that memory is completely free to be used by other processes. This is one of the reasons containers are way more efficient than virtual machines or dedicated hardware.