Docker, you lied to me
I know, I know. I’m dramatic. The title is basically clickbait, but the subject is just as true as the title. You should read this article because it will save your life! Okay, it might not save your life, but it will most likely solidify some very important information about Docker containers, which will most likely save you some drama down the line.
The mindset of Docker
You know, and I know, that Docker is hailed as the next big thing in software: containerization. Comparisons between virtual machines, chroot environments, image sizes and many other benefits and functions dominate the whole ecosystem.
They are completely right. Docker is the next big thing, or perhaps just a big thing - it’s here now.
The mindset of Docker however comes with some caveats. The idea is to “just run your software, anywhere”, but the whole ecosystem is made up of people who write software, and people who choose to set up and use the software in different ways. Many ways to skin a cat. And therein lies the problem.
Docker is relatively new technology and it can sometimes come and bite you in the behind. That doesn’t mean it’s useless: absolutely not! It’s very much like fire in the sense that if you’re a caveman you can be absolutely boggled by it, before you move on and realize: grilled steak is tasty.
I’ve been using Docker in production for a while now. And there are lessons I learned the hard way, which make me feel like a poor caveman who just got burned by fire.
The story of a database
About a year ago, one of the first automated containers which I started using was the Percona fork of MySQL. As things go, it was pretty easy: the standard docker run scenario applied, the latest image was pulled and voila - you have a running database.
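For reference, the invocation looked something like this - a sketch, where the container name, password and data path are placeholders rather than my actual setup:

```shell
# Hypothetical example - container name, root password and
# host data directory are placeholders
docker run -d --name db \
  -e MYSQL_ROOT_PASSWORD=secret \
  -v /srv/mysql:/var/lib/mysql \
  percona:latest
```

The -v mount is the important part: it keeps the data on the host, which is exactly what makes the upgrade story below possible (and painful).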
Of course, the database which I was running in 2015 is not exactly the same database which I am running in 2016. There are a few things which I hadn’t thought about at the time, which made for a slight inconvenience today. I wanted to use xtrabackup to make a snapshot of the database instead of using mysqldump directly, due to an issue which I faced on a highly loaded database.
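As a sketch, taking such a snapshot from inside the running container might look like this - the container name, credentials and target directory are assumptions:

```shell
# Hypothetical: take a physical (non-blocking) backup inside the container
docker exec db xtrabackup --backup \
  --target-dir=/backup \
  --user=root --password=secret
```

Unlike mysqldump, this copies the data files directly, which is the whole point on a highly loaded database.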
Upgrading the underlying data
Well, as you might know, database schema changes occur. Additional capabilities are added to the database, which means you need to upgrade it on occasion to keep all the nice commands working. Commands like show status.
Yes, I was surprised that such basic commands might stop working at some point, but then again - it’s obvious that something as complex as MySQL will add in complexity with time. Not to mention that the Percona fork traditionally has extensions on top of what MySQL already provides.
MySQL, for this purpose, provides a script called mysql_upgrade - and with the Docker image, this script never gets run. And things die on upgrades if you didn’t think of running the script by hand. You live, you learn.
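In practice, that means remembering something like this after every image bump - the container name and credentials here are placeholders:

```shell
# Run the upgrade script by hand inside the already-running container,
# so the system tables match the new server version
docker exec db mysql_upgrade -uroot -psecret
```

Nothing in the image does this for you; it’s on you to wire it into your upgrade procedure.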
The Debian percona-xtrabackup package left me thinking that MySQL 5.7 is completely unsupported. Only through a short discussion on Twitter did I find out that it is supported in xtrabackup 2.4 - but I thought I was already installing the latest version. In the same way that percona:latest currently points to MySQL 5.7, the Debian package for percona-xtrabackup points to a release that only supports MySQL 5.6. Percona provide percona-xtrabackup-24 as a separate package because they don’t want to force an upgrade, but it seems a bit silly to me.
The happy end to this discussion is that I don’t need to export my database as SQL files and import it into a 5.6 instance. That would be dreadful.
The main “problem” with scenarios like this one is that it’s very difficult to be sure what you’re installing. If your package has multiple installation candidates, apt-get will just fetch the newest version by default. There are some constraints you can apply when using apt-get (a minimum version, for example), but when you consider that Docker images take several minutes to build, the iteration loop is just so slow.
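One way to at least fail loudly instead of silently getting a different candidate is to pin an exact version in the Dockerfile. A sketch - the version string is a placeholder, and it assumes the Percona apt repository is already configured in the image:

```dockerfile
FROM debian:jessie

# Pinning an exact version makes the build break if the candidate changes,
# instead of silently installing something newer (2.4.4-1 is a placeholder)
RUN apt-get update \
 && apt-get install -y percona-xtrabackup-24=2.4.4-1 \
 && rm -rf /var/lib/apt/lists/*
```

A broken build is annoying, but far better than discovering the version drift in production.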
You know that percona image I’m using? It’s “official” - in the sense that it’s not even built by Percona themselves.
You know, sometimes the concept of “trust” just has to come from an authority in the sense that Docker should be this authority, and Percona should be the trusted party. What is the point of official images, if you’re giving blanket trust to the person who builds them, and not the provider which creates the software?
I’m sure there’s a better way.
The story of a web server
I come from a PHP background. I’m very comfortable with nginx, and use various modules which come with it. Modules like mod_lua, to provide some extended logic for caching and authentication, and simpler ones like mod_realip, which provides the client IP to the servers behind a known reverse proxy.
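For client IPs, the module in question is nginx’s realip module, and its setup amounts to a couple of directives like these - the proxy address here is an assumption:

```nginx
# Trust the known reverse proxy and take the real client IP
# from the header it sets
set_real_ip_from 10.0.0.1;
real_ip_header   X-Forwarded-For;
```

Two lines of config - but only if the module was compiled into the nginx you’re running, which is exactly where this story is headed.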
I’ve been using richarvey/nginx-php-fpm for a significant number of these backend services. At some point around February, the author decided to switch the base image from debian:jessie (afaik) to alpine. The reasoning behind that should be obvious - alpine weighs in at about 5MB, while a debian image weighs in at over 100MB.
Remember the point I made before? “Just run your software, anywhere.”
I feel violated because obviously, my software is not my software, it’s some other guy’s software. Some guy named Richard, who I’m sure is very smart and uses his software and iterates the image when he needs new stuff thrown in.
Unfortunately for me, the alpine build doesn’t have mod_realip, so at some point my configuration didn’t validate anymore. Only due to providence on my part did I manage to scrape by this issue without a serious outage.
What providence you ask? Having a development environment where things inevitably broke and I had time to fix them.
Fork or roll your own image
You should create your own Dockerfiles and build your own images when needed. This should go without saying: you don’t really have control over what Richard Harvey might add to his nginx-php-fpm image at some later point in time, if you decide to use it. At some point, the image was exactly what I needed. Today, it has added functionality, changed the underlying software so it no longer supports all of the features it once did and, as a cherry on top: it added support for letsencrypt.
But I use a stand-alone container for letsencrypt. I didn’t want it to be coupled with the web server in the way it is now. As those of you already familiar with letsencrypt will know, the minimum it requires to function is a folder which is accessible on the domains for which you want to generate your SSL certificates. Supporting this only takes a few lines in a config file.
And if every image came bundled with Letsencrypt, what a world would that be?
Docker is a very powerful tool, but you have to plan for some things which you might not have planned for when you come from a background of installing virtual machines, or dealing with a full Linux distribution. I am using at least three distributions that I know of - Ubuntu, Debian and Alpine - and I’m pretty sure that the base image for some of the software I use might even be outside of those three.
When you don’t have control over what you use, you might touch on some of the same problems down the line.
Today I learned something that could have been avoided from the start. But, to be honest, most engineering problems are the same. Create, test, use, break, fix, repeat.