Things I learned about DevOps in Q1

Being a software engineer, programmer, developer - is not a destination. The good ones have a natural curiosity, where the main measure of an activity is what you learned while doing it. There are a number of things that people learn - some of which can eventually be explained away by documentation. What is logical isn't always how things are.

GitHub releases

I use Codeship and I love them. But it feels like I'm fighting them a bit. And it's honestly not their fault. They are a cloud-based CI platform which offers things like Parallel Builds and very awesome Docker integration. I just keep picking scenarios which are edge cases for them (or where my logic isn't exactly aligned with some of their less obvious technical choices).

The way Codeship works, you connect a GitHub repository to their service, create a few configuration files, and when you push to GitHub, it runs your build steps. One of those build steps might be adding a release to GitHub using something like c4milo/github-release.

Except that adding a release to GitHub also creates a new tag. A new tag triggers a new build, a new build triggers a new release, and a new release creates a new tag. It's a good thing I noticed quickly enough to press cancel after some 20 jobs had run.

Codeship builds every commit that is pushed, independently of the branch it appears on. To skip builds, you can add --skip-ci or [skip ci] to the commit message of the last commit before it is pushed, and that push will be ignored.
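
For illustration, skipping a build is just a marker in the last commit message you push; a minimal example (the commit message and branch name are made up):

    # Codeship ignores the push because the last commit carries [skip ci]
    git commit -m "Update README [skip ci]"
    git push origin master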

Well, it turns out that this notification of new tags doesn't come with a commit message which Codeship could inspect and ignore (yes, I tried). Codeship by itself doesn't support skipping builds based on the type of event (much of which is configurable with a classic webhook integration on GitHub).

But they do allow filtering build steps based on the tag/branch, which is documented here. They do figure out that no steps can be executed and skip building the service, but I suspect only after they spin up a temporary VM instance (it takes 15 seconds to pass such a build).

For people keeping score, Me vs. Documentation: 0 - 1.

Docker releases

People have set ideas about how they build software. I have a set idea of a "DIY" build procedure, which unfortunately (usually) doesn't work out like that in real life.

For example - building Docker containers for Go applications is a two-step process in my book. In the first step, ideally I'll build a container with which to build the Go app, and in the second step I'll build a container with just this app (without any development toolchains). So, one container is the complete development environment, and one is the production release which has just the app. This is called the "builder pattern".
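
Roughly, this is what I have in mind - a sketch of the two steps, assuming a Go project with vendored dependencies in the current directory (image names and paths are illustrative):

    # step 1: use the full Go toolchain image to produce a static binary
    docker run --rm \
      -v "$PWD":/go/src/app -w /go/src/app \
      -e CGO_ENABLED=0 \
      golang:1.8 go build -o app .

    # step 2: package just the binary into a minimal runtime image,
    # assuming a Dockerfile that does little more than COPY ./app in
    docker build -t titpetric/sonyflake .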

In fact, this goes so far that even Docker added support for multi-stage builds, which should land in a future stable release, slated for 17.05.0 (May?). If you want to play around with it, Docker Captain Alex Ellis wrote a nice article about multi-stage builds.
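
For comparison, a minimal multi-stage sketch, assuming a Docker build that already understands the new syntax (17.05+); the base images and paths are placeholders:

    # write a throwaway multi-stage Dockerfile and build it; the first stage
    # carries the Go toolchain, the second stage only the compiled binary
    cat > Dockerfile.multistage <<'EOF'
    FROM golang:1.8 AS builder
    WORKDIR /go/src/app
    COPY . .
    RUN CGO_ENABLED=0 go build -o /app .

    FROM alpine:3.5
    COPY --from=builder /app /app
    ENTRYPOINT ["/app"]
    EOF

    docker build -f Dockerfile.multistage -t titpetric/sonyflake .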

Codeship provides support for pushing Docker images to remote registries, like Docker Hub. There's a tutorial on how to set up pushing to a remote registry, which involves providing secure credentials so docker can push. There's also some documentation about it under push steps.

So, using the builder pattern and Codeship's Docker-in-Docker support, you can build your image, for example titpetric/sonyflake, and push it by adding a step like this:

- type: serial
  tag: master
  service: sonyflake
  steps:
    - name: build
      command: make
    - name: push
      type: push
      image_name: titpetric/sonyflake

It works just as you'd think it would: the image you create in the "build" step will be pushed by the push step. Codeship also supports image_tag, so you can tag your images with the CommitID, for example.

    - name: push
      type: push
      image_name: titpetric/sonyflake
      image_tag: '{{ .CommitID }}'

Well, this doesn't work the way logic dictates. Somehow the image from the sonyflake service was pushed instead. After a long exchange of emails with Codeship support, I again came to realize that documentation beats logic.

From the push steps documentation: image_name - the name of the generated image as it exists in your service.yml/json file, in the format [registry/][owner/][name].

From the push tutorial: Your service image_name can differ from the repository defined in your steps file. Your image will be tagged and pushed based on the push step.

This is just a mess. I can build any image I want in a build step and push it, but only as long as I don't explicitly define an image_tag to push. If I push without tags, there's a bug where the image is pushed as it is named (and logically, that's fine - for me), and if I want to push a tagged image, they tag the image based on the sonyflake service - just as it's written in the docs. Part bug, part "I have different ideas". But paper beats rock.

For people keeping score, Me vs. Documentation: 0 - 2.

Docker networking

If you want to start more than 1023 Docker containers, you will have to resort to using a custom networking driver - the limit comes from the Linux kernel bridge behind docker0, which supports at most 1024 ports per bridge. People suggest Open vSwitch, which from what I could read supports up to 65535 containers (a /16 CIDR, if you prefer).
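
Two commands I found useful while poking at this; the driver name in the second one is purely a placeholder for whatever Open vSwitch plugin you end up installing:

    # how many containers are currently attached to the default docker0 bridge;
    # the kernel bridge driver caps out at 1024 ports, hence the 1023 figure
    docker network inspect bridge --format '{{ len .Containers }}'

    # a sketch of creating a bigger user-defined network with a /16 subnet;
    # "openvswitch" stands in for the actual plugin driver name
    docker network create --driver openvswitch --subnet 10.200.0.0/16 big-net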

Of course, I had to try it to find out. Me vs. Documentation: 0 - 3.

You could also recompile the Linux kernel to raise this limit. I'm not a fan of that approach, because it's not really a configurable option - you have to edit code which wasn't intended to be configurable. There's some more information about these limits on this IBM Bluemix blog. The guys from IBM ran 10k containers on a single (very fat) host. Also: 10k containers is not such an insane scale TBH.
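
For the curious, the limit isn't a sysctl or a Kconfig option - it's a plain #define in the bridge driver, which is why "configuring" it means patching and rebuilding the kernel (assuming a kernel source tree in the current directory):

    grep -n -A1 'define BR_PORT_BITS' net/bridge/br_private.h
    # prints something like:
    #   #define BR_PORT_BITS   10
    #   #define BR_MAX_PORTS   (1<<BR_PORT_BITS)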

Docker credential helpers

Docker provides credential helpers which you can connect to your keychain, or even your infrastructure (Amazon ECR, Google GCR). Basically, it's a neat trick: docker runs docker-credential-[provider], which returns a JSON blob with the response from your credential storage. The registry for which authentication is requested is sent over stdin to this program, and it should return a JSON response on stdout:

{ "Username": "titpetric", "Secret": "WillNotTellYou" }

Of course, I wanted to write a simple bash script that would print out two environment variables, so I could use that instead of issuing docker login -u [user] -p [pass]. Since there's no debugging information available from docker at this point, I was just stuck, even though I echoed a valid JSON response.
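
For reference, here's a minimal sketch of what I was going for, assuming it's saved as docker-credential-env somewhere on the PATH and enabled with "credsStore": "env" in ~/.docker/config.json (the helper name and the REGISTRY_USER/REGISTRY_PASS variables are my own invention):

    #!/usr/bin/env bash
    # minimal credential helper sketch: docker calls this with a command
    # (get/store/erase) as the first argument and the registry URL on stdin
    set -eu

    case "${1:-}" in
      get)
        read -r _registry   # registry URL from docker, unused here
        printf '{ "Username": "%s", "Secret": "%s" }\n' \
          "${REGISTRY_USER:?}" "${REGISTRY_PASS:?}"
        ;;
      store|erase)
        cat >/dev/null      # nothing to persist for env-based credentials
        ;;
      *)
        echo "unsupported command: ${1:-}" >&2
        exit 1
        ;;
    esac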

And then I found the shell-test for docker-credential in the official docker repo.

I’m going to count this as a point for Documentation: Me vs. Documentation: 0 - 4.

Docker swarm and sysctl

The worst post-fact discovery was that Docker Swarm mode doesn't support sysctl at this time. There are a number of applications which suggest tuning (Redis, Elasticsearch, etc.), and especially in high-traffic scenarios (load balancing with NGINX, HAProxy, etc.) there are a number of sysctl tunables required to ensure the stability of the service.

The lack of this feature and others is tracked in this epic issue, or specifically in this one for the lack of sysctl.

It's unfortunate to figure out that there are such limitations after you've already set up a few hosts. Running a few high-traffic edge points pretty much locks us into docker run. It's not bad, but I can't say that I wouldn't want to distribute the load over a Docker Swarm service.
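
To make it concrete: the per-container tuning below works with plain docker run (the values and image are illustrative), but there was no equivalent option when creating a Swarm service at the time:

    # plain docker run accepts namespaced sysctls per container
    docker run -d --name edge-proxy \
      --sysctl net.core.somaxconn=4096 \
      --sysctl net.ipv4.ip_local_port_range="1024 65000" \
      nginx:1.11

    # docker service create had no --sysctl flag at this point, so the same
    # tuning couldn't be expressed for a Swarm service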

Me vs. Documentation: 0 - 5.

To be honest, I'd still have tried out Swarm even if I had known about this beforehand. However, I'm still confused as to why sysctl support is considered broken and in need of fixing, and why people think that. From what I could deduce, the biggest issue is figuring out which sysctls are namespaced, but I suspect that's a moving target anyway as kernel development moves forward. Why it's considered broken, I have no idea, even after reading the linked issues above.

Total score

Software development is one of the few industries where reinventing the wheel happens on a daily basis. Wisdom comes from failure. Knowledge comes from documentation. If we did the same shit as mechanics, doctors or aerospace engineers, our streets would be full of exploding cars, dead people and rockets that mostly wouldn’t fly.

You can love the art of engineering as much as I do, but at some point you have to be honest as well - a lot of the things can be learned from documentation, many things can be solved with logic, but failure is the only thing that will give you wisdom. Knowledge is only as good as your ability to apply it to real world problems. That ability doesn’t live in a book.

You read documentation when you're stuck. You search on Google and browse Stack Overflow when you've just forgotten the syntax for some obscure awk expression, because that's going to be shorter than the awk man page. You draw a map of where you want to be, and you read documentation to find a way to get there. It's mostly boring.

Much like you don't become Elon Musk by reading his biography, you don't become an expert at using Docker without constantly learning about new features, reading documentation and solving issues. If you think about it, Docker is barely 4 years old - even with plenty of experience and wisdom and documentation, there are still going to be those 'a-ha' moments.

I can't even tell you how many times I've been really grateful to people for linking me to a documentation chapter with a footnote which I didn't notice. "Here buddy, you're misunderstanding this whole thing." Even an RTFM is sometimes a blessing. The worst part is when you're stuck, and you have nowhere to look and nobody to ask. Not everything can be solved with logic, and not everything can be explained by documentation.

And on that sombre note, I’m off to break more things in Q2.
