I’ve been working on the Docker conversion of a 10-year-old Rails app that is heavily integrated with several other services for a while now, and I’d like to point out that we adopted some advanced Docker/container orchestration ideas before we should have. In retrospect, it seems obvious that one should get everything working together before adding in, say, random ports, but we added them anyway. Why?
Two things led my team (one other developer and me) to take on more than was wise at the time. The first is hype. When you look at K8s or the HashiCorp stack, there are some very cool things these frameworks can do that are recent(ish) additions, so their proponents emphasize them because that’s where they are in their container journey. A read-only container is pretty darn cool from both a security and reliability point of view. Of course, when you’re debugging a half-converted application, it is not helpful at all.
The second reason we moved forward with an advanced technique before getting it all working together boils down to scheduling. A team of two that supports an entire product (one of many products at our company) is going to have periods where it must abandon a large project to work on a feature, fix a bug, etc. Sometimes we would leave our Docker/HashiCorp migration project alone for weeks at a time. Additionally, we had scheduling conflicts with other teams building out the underlying infrastructure for our project. Things we needed to get an entire ecosystem going in containers were unavailable until our very busy IT team could fit them in. Unfortunately, both of these scheduling problems led to a scramble to find something to do. Ideally, once it was all working on a developer machine, we would have made sure it worked in our prod-like staging areas. With that option unavailable, it would have been more efficient to work on something else entirely rather than moving the Linux OS to Alpine. Sure, Alpine is great for keeping image size down, but debugging on an Alpine Linux instance can be a nightmare.
Keeping the Linux image fat and full of debugging tools would have been more productive than having to install all the tools on Alpine. We didn’t actually end up making our containers read-only, but had there been more delays, I can’t guarantee that we wouldn’t have. The pressure of a deadline when all sensible paths are blocked will make the mind fuzzy. No one is immune.
FYI: I am not throwing our IT team under the bus. A fair number of our servers suffered a catastrophic, nature-based death (with no outage to our customers, amazingly) during the time we were bugging them about building the shiny new future, so we completely understood why schedules needed to be adjusted.