Docker, you lied to me

I know, I know. I’m dramatic. The title is basically clickbait, but the point behind it is real. You should read this article because it will save your life! Okay, it might not save your life, but it will most likely solidify some very important information about Docker containers, and that will save you some drama down the line.

The mindset of Docker

You know, and I know, that Docker is hailed as the next big thing in software: containerization. Comparisons with virtual machines and chroot environments, image sizes, and many other benefits and features dominate the whole ecosystem.

They are completely right. Docker is the next big thing, or perhaps just a big thing - it’s here now.

The mindset of Docker however comes with some caveats. The idea is to “just run your software, anywhere”, but the whole ecosystem is made up of people who write software, and people who choose to set up and use the software in different ways. Many ways to skin a cat. And therein lies the problem.

Docker is relatively new technology, and it can sometimes come back and bite you in the behind. That doesn’t mean it’s useless - absolutely not! It’s very much like fire, in the sense that if you’re a caveman you can be absolutely boggled by it, before you move on and realize: grilled steak is tasty.

I’ve been using Docker in production for a while now, and there are lessons I learned the hard way - lessons which make me feel like a poor caveman who just got burned by the fire.

The story of a database

About a year ago, one of the first automated containers I started using was the Percona fork of MySQL. As these things go, it was pretty easy: the standard docker run scenario applied, the latest image was pulled and voila - you have a running database.
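For reference, it was something along these lines - the container name, password and volume path here are placeholders, not my exact setup:

```
docker run -d --name percona-db \
  -e MYSQL_ROOT_PASSWORD=secret \
  -v /srv/percona/data:/var/lib/mysql \
  percona:latest
```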

Of course, the database I was running in 2015 is not exactly the same database I am running in 2016. There are a few things I hadn’t thought about at the time, which have turned into a slight inconvenience today. I wanted to use xtrabackup to take a snapshot of the database instead of using mysqldump directly, due to an issue I faced on a highly loaded database.

Upgrading the underlying data

Well, as you might know, database schema changes occur. Additional capabilities get added to the database’s internal tables, which means you need to upgrade them on occasion to keep all the nice commands working. Commands like show status or show variables.

Yes, I was surprised that such basic commands might stop working at some point, but then again, it’s obvious that something as complex as MySQL will grow in complexity over time. Not to mention that the Percona fork traditionally adds extensions on top of what MySQL already provides.

MySQL, for this purpose, provides a script called mysql_upgrade - and with the docker image, this script never gets run. Things die on upgrades if you didn’t think of running it by hand. You live, you learn.
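Running it by hand amounts to something like the following, assuming the container from earlier is up on the old data directory - the container name is, again, just an example:

```
docker exec -it percona-db mysql_upgrade -uroot -p
```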

Using xtrabackup

Installing the percona-xtrabackup package left me thinking that MySQL 5.7 was completely unsupported. Only after a short discussion on Twitter did I find out that it’s supported in xtrabackup 2.4. But I thought I was installing the latest version?

Much like percona:latest currently points to MySQL 5.7, you’d expect the Debian package named percona-xtrabackup to point to the latest release - instead, it points to one that only supports MySQL 5.6. Percona provides a separate percona-xtrabackup-24 package because they don’t want to force an upgrade on anyone, but it seems a bit silly to me.
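So the fix is to ask for the 2.4 series by name - something like this, with the repository setup omitted:

```
apt-get update
apt-get install -y percona-xtrabackup-24
```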

The happy ending to this discussion is that I don’t need to export my database as SQL files and import it into a 5.6 instance. That would be dreadful.

The main “problem” with scenarios like this one is that it’s very difficult to be sure what you’re installing. If a package has multiple installation candidates, apt-get will just fetch the newest one by default. You can constrain it - pin a minimum or exact version, for example, as in the sketch below - but when a docker image takes several minutes to build, the iteration loop is just painfully slow.
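For what it’s worth, pinning looks roughly like this - list the candidates first, then pick one. The version string below is purely illustrative and assumes the Percona repository is already configured:

```
apt-cache madison percona-xtrabackup-24
apt-get install -y percona-xtrabackup-24=2.4.4-1.jessie
```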

Official image?

You know that Percona image I’m using? It’s “official” in the sense that it isn’t even built by Percona.

You know, sometimes the concept of “trust” has to come from an authority - Docker should be that authority, and Percona should be the trusted party. What is the point of official images if you’re giving blanket trust to the person who builds them, and not to the vendor that creates the software?

I’m sure there’s a better way.

The story of a web server

I come from a PHP background. I’m very comfortable with nginx, and I use various modules which come with it. Modules like the Lua module, which provides some extended logic for caching and authentication, and simpler ones like the real_ip module, which provides the client IP to the servers behind a known reverse proxy.
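To give an idea of the second one: behind a known reverse proxy, restoring the client IP is only a couple of directives in the relevant server block - the proxy address below is a placeholder:

```
set_real_ip_from 10.0.0.1;
real_ip_header   X-Forwarded-For;
```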

I’ve been using richarvey/nginx-php-fpm for a significant number of these backend services. At some point around February, the author decided to switch the base image from debian:jessie (afaik) to Alpine. The reasoning behind that should be obvious - an Alpine base image weighs in at about 5MB, while a Debian one weighs in at over 100MB.

Remember the point I made before? “Just run your software, anywhere.”.

I feel violated because, obviously, my software is not my software - it’s some other guy’s software. Some guy named Richard, who I’m sure is very smart, uses his software, and iterates the image when he needs new stuff thrown in. Unfortunately for me, the Alpine build doesn’t include the real_ip module, so at some point my configuration didn’t validate anymore. Only due to providence on my part did I manage to scrape by this issue without a serious outage.

What providence, you ask? Having a development environment where things inevitably broke first, and where I had time to fix them.

Fork or roll your own image

You should create your own Dockerfiles and build your own images when needed. This should go without saying - if you decide to use Richard Harvey’s nginx-php-fpm image, you don’t really have control over what he might add to it at some later point in time. At some point, the image was exactly what I needed. Today, it has added functionality, changed the underlying software so it no longer supports all of the features it once did, and, as a cherry on top, it added support for letsencrypt.
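Even a very thin Dockerfile of your own keeps those decisions in your hands. Something roughly like this - the base tag and package names are placeholders for whatever your configuration actually needs:

```
FROM debian:jessie

RUN apt-get update \
 && apt-get install -y --no-install-recommends nginx-extras \
 && rm -rf /var/lib/apt/lists/*

COPY nginx.conf /etc/nginx/nginx.conf

CMD ["nginx", "-g", "daemon off;"]
```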

But I use a stand-alone container for letsencrypt. I didn’t want it coupled with the web server in the way it is now. For those of you already familiar with letsencrypt: the minimum it requires to function is that a /.well-known folder is accessible on every domain for which you want to generate SSL certificates. Supporting that takes only a few lines in a config file.
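In nginx terms, it’s roughly this - the root path is wherever your letsencrypt container writes its challenge files, and the one below is just an example:

```
location /.well-known/acme-challenge/ {
    root /var/www/letsencrypt;
}
```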

And if every image came bundled with letsencrypt, what a world that would be?

Conclusion

Docker is a very powerful tool, but you have to plan for things you might not have planned for when you come from a background of installing virtual machines, or dealing with a full Linux distribution. I am using at least three distributions that I know of - Ubuntu, Debian and Alpine - and I’m pretty sure that the base image for some of the software I use might even be outside of those three.

When you don’t have control over what you use, you might run into some of the same problems down the line.

Today I learned something that could have been avoided from the start. But, to be honest, most engineering problems are the same. Create, test, use, break, fix, repeat.

While I have you here...

It would be great if you bought one of my books.

I promise you'll learn a lot more if you buy one. Buying a copy supports me in writing more about similar topics. Say thank you and buy my books.

Feel free to send me an email if you want to book my time for consultancy/freelance services. I'm great at APIs, Go, Docker, VueJS and scaling services, among many other things.