Docker done right enforces immutable infrastructure. This requires lots of work on your build system. Work you should be doing anyway. VMs aren’t slow, and if they are, you won’t notice it for 90% of workloads. Docker is actively slower than immutable build cloud infrastructure on a public cloud.
On hosted hardware, no hot-migration is a deal breaker. On the public cloud you are grouping apps together into single points of failure.
In-container monitoring that executes scripts will cause problems.
Throw away your dev machines hourly.
This is an extended post based on an email discussion at $WORK. Previously I’ve been a sysadmin, which means I built infrastructure according to the evidence that I had gathered.
I’ve moved on from VFX, where we genuinely dealt with scale, into the world of online news, where the yardstick is different. In VFX you have to provide enough CPU and IO, otherwise the world ends. In news a CDN does pretty much all the heavy lifting, so there is a different understanding of scale. Most problems are about HA, which is mistakenly thought of as “scale”.
What is Docker?
At $WORK there is a strong push towards Docker. Firstly, what is Docker? Docker is basically a wrapper around cgroups (and kernel namespaces) to provide containers. Great, but what does that mean? A container is a bunch of system libraries, application glue and config in a self-contained tarball. The only thing needed to run a Docker container is a compatible kernel (sort of).
You can think of it as a very lightweight VM. However, that is part of the problem. Docker is not a VM. Docker is a glorified chroot with a cgroup wrapper, and it’s not an original idea. Rescue CDs basically allow you to do the same thing, but without the cgroups bit (so no isolation). This means you can boot a CD with a stable RHEL kernel and system libs, mount the local disk and run the installed fsck, even though the installed distro could be Ubuntu. (Bad example, but within reason you can run any programme on the local disk.)
It’s also not new. At $WORK we are migrating away from SunOS Zones.
At a previous job we had a similar setup to this, for various testing and building purposes. However, that was born out of necessity because we didn’t have enough cash for a real VM cluster at the time.
Where does docker shine?
Docker is brilliant as a replacement for Vagrant VMs. There is a minimal spin-up penalty, and a vast library of prebuilt images.
It also allows people to provide a known shippable app that is more or less self-contained. As time goes on this will become more important.
Fail early, fail fast
Docker forces you to resolve your dependencies at build time. This is possibly the single biggest plus. Because you have to build a container from scratch, you must run all the various build stages from nothing.
This eliminates the classic “it builds on dev because someone manually installed x by mistake”. Lord knows this is important. If you take away anything from this rant, it should be this: always build from scratch. Throw away your dev machines hourly.
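To make “resolve your dependencies at build time” a bit more concrete, here is a minimal sketch of a build step that refuses to continue if any dependency is left floating. It assumes a pip-style requirements list purely for illustration; the function name unpinned and the sample package names are my own, not part of any particular build system.

```python
import re

# Matches an exactly pinned requirement such as "requests==2.31.0".
PINNED = re.compile(r"^[A-Za-z0-9_.\-\[\]]+==[A-Za-z0-9_.\-+!]+$")


def unpinned(requirements_text):
    """Return the requirement lines that are not pinned to an exact version."""
    bad = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if not PINNED.match(line):
            bad.append(line)
    return bad


if __name__ == "__main__":
    # Hypothetical requirements, used only to show the check.
    sample = """
    flask==2.3.2
    requests>=2.0      # floating: may resolve differently next week
    gunicorn
    """
    floating = unpinned(sample)
    print(floating)  # the two lines that would make a rebuild non-repeatable
```

A build that fails on a non-empty result is one way to make sure a from-scratch rebuild next month pulls in the same things as today’s did.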
Uses of Docker at scale
The majority of Docker use cases involve CI. It’s brilliantly suited because it allows a dev to spin up an isolated environment to install dependencies without affecting the build host. This is invaluable, as it allows more than one team to share a build host.
How not to use docker:
The following is an extract from an email conversation about a link to Spotify’s lightning talk:
This doesn’t instil much confidence: in 2013 they didn’t think to use virtualisation. The presenter countered with “we wanted to give some room to grow”. That’s great capacity planning, guys. Just capex all the things. (He goes on to say that they didn’t want the speed issues of virtualisation, which shows that they didn’t really do any research.)
He goes on to say “we use Puppet to manage 5000+ servers, and it’s really hard”, “we package things into Debian packages, and upgrading requires developers ssh’ing on to the boxes” and “as you can imagine it fails often”.
The mind boggles. First, the whole point of config management is that you can control packages; if you’re ssh’ing in to upgrade, you’re doing it wrong. Second, why is it hard to figure out why the package upgrade failed? Isn’t that the whole point of package management, to provide dependencies and bundle upgrade scripts/paths?
“We can’t know for sure that all our machines are in a consistent state.” Isn’t that the point of Puppet/Ansible/Salt/etc.? It’s state enforcement, after all. If it’s not in your config management, it’s not repeatable.
Sounds like Spotify needed to invest in better Jenkins deploy scripts. That, and learn how to build VM clusters. What’s even less clear is why devs were even having to think about packaging, apt-get’ing and orchestration.
Surely we should aim to be in a place where all we need to do is: git push test feature/blah, do some testing, then git push prod master to get it live. All the rest of the details should be handled by the build system (maybe with an orchestration file to do app pinning and order of starting).
The main thing that annoyed me was that, despite not appearing to have any capacity management at all, they dismissed virtualisation because of “speed issues”. If you have 10% utilisation of your infrastructure then you’ll not see any speed difference, unless you’re doing it wrong and not providing enough RAM/IO/network. Besides, the “VM penalty” will apply equally to Docker. You’re still cramming n processes into one machine.
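As a rough idea of what the “order of starting” part of such an orchestration file could boil down to, here is a small sketch that turns a dependency map into a start order. The service names and the start_order helper are hypothetical; the only real API used is graphlib.TopologicalSorter from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter


def start_order(depends_on):
    """Return service names ordered so each starts after its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical orchestration data: service -> services it needs first.
ORCHESTRATION = {
    "web": {"api"},
    "api": {"db", "cache"},
    "db": set(),
    "cache": set(),
}

if __name__ == "__main__":
    # db and cache come first (in either order), then api, then web.
    print(start_order(ORCHESTRATION))
```

If someone declares a circular dependency, static_order() raises graphlib.CycleError, which is exactly the kind of thing you want the build system to catch rather than a dev at deploy time.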
Zombies and you
Docker best practice encourages you to run a single process in each container. That process becomes PID 1, and there is no init behind it. This means that if your process forks at any point and doesn’t wait on its children, they never get reaped. In effect this is a resource leak. You end up with zombies.
Let me give an example. Say you are running an app that does the occasional exec, or you have baked in some monitoring like Nagios or Sensu. Each time something is exec’d, even though the child has quit, its process table entry won’t be reclaimed. Cleaning that up is normally the job of init (SysV init or systemd). Effectively you’ll have a leak: the container will fill up with zombies and eventually fall over.
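For the curious, this is roughly the bit of housekeeping that init does and a lone app in a container usually doesn’t. It is a minimal, Unix-only sketch: reap_children and spawn_short_lived_child are names I made up for illustration, and a real init would do this in response to SIGCHLD rather than after a sleep.

```python
import os
import time


def reap_children():
    """Collect every child that has already exited; return their PIDs."""
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children left at all
        if pid == 0:
            break  # children exist, but none have exited yet
        reaped.append(pid)
    return reaped


def spawn_short_lived_child():
    """Fork a child that exits immediately, like a fire-and-forget exec."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit straight away without running cleanup
    return pid


if __name__ == "__main__":
    children = {spawn_short_lived_child() for _ in range(3)}
    time.sleep(0.5)  # children have exited by now and sit there as zombies
    reaped = set(reap_children())
    print(f"spawned {len(children)}, reaped {len(reaped & children)}")
```

Until something calls waitpid, each of those exited children keeps its slot in the process table. If your app is PID 1 and never does this, nobody else will.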
The networking is shit.
Docker doesn’t yet have an easy way to assign IPs to containers. The best it does is round robin between two containers operating on the same port. With work, you can hack about and assign virtual IPs, but then that’s work. (If you’d bought VMware, you’d have been done by now…)
Your build system is now critical
If you have or practise immutable infrastructure, then this is largely irrelevant. However, in a year’s time, when you need to patch a critical system bug, will your build scripts work? If you kill and rebuild your containers, do you have adequate monitoring to ensure that only good containers are promoted to production?
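What “only good containers are promoted” might look like in its simplest form is a gate between build and production. This is a sketch under assumptions: the image tags are invented, and each health check is just a zero-argument callable standing in for whatever monitoring you actually have.

```python
def passes(check, attempts=3):
    """True only if the check succeeds on every attempt; errors count as failure."""
    for _ in range(attempts):
        try:
            if not check():
                return False
        except Exception:
            return False
    return True


def promote(candidates, attempts=3):
    """Return the tags whose health check passed.

    candidates maps an image tag to a zero-argument health check callable.
    """
    return [tag for tag, check in candidates.items() if passes(check, attempts)]


if __name__ == "__main__":
    def broken():
        raise RuntimeError("container never came up")

    # Hypothetical freshly rebuilt images and stand-in health checks.
    rebuilt = {
        "app:2024-01": lambda: True,
        "app:2024-02": lambda: False,
        "app:2024-03": broken,
    }
    print(promote(rebuilt))
```

The point is less the code than the default: a container that can’t prove it is healthy stays out of production.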
Modern VMs aren’t slow
The cargo cult goes: Docker overcomes the VM penalty. What I assume people mean is the AWS penalty, because a properly provisioned VM is not noticeably slower. Yes, AWS is dogshit slow. EBS is terribly limited, as you are paying for scale, not speed. Unless you pay much monies, your CPU is throttled to buggery.
What a VM system does give you is migration. That means that when you have hardware issues, or you need to upgrade hardware, you can move VMs about with no downtime. (You don’t really see this in AWS. It’s something that happens behind the scenes, if at all.)
If a host becomes overloaded, VMs can be automatically moved to somewhere less congested. Docker? Yeah, good luck. You could of course run Docker inside a VM, but then you get the “penalty”, right?
Docker done right at scale
https://speakerdeck.com/teddziuba/docker-at-ebay is a presentation by Ted Dziuba. He is captain pragmatic. Whilst I wouldn’t have done it in the same way, what he shares is a solid use of Docker. What’s not shared is the provisioning of the physical machines, the clustering of the database, the build system, the monitoring and orchestration. Ironically, they are possibly more key to how Docker works well than Docker itself.
Docker is a tiny component in a glue of badly integrated systems. Without a decent Linux build, network, extensive build systems and monitoring and alerting, you’ll still end up with a shit system that’s hard to maintain. It’s just that you’ll have containers.
Choose what’s right.
As with all things, you need to choose the right tool for the job. Docker will work, however not without significant work on your build system. Before considering anything, you need to understand what’s currently deficient with your present infrastructure.
Even if you choose not to use Docker, you should really consider resolving all your dependencies at build time, not deploy time. If you use the cloud, then you really have no excuse for not resolving dependencies at build time. Before running down the Docker road you need to assess immutable infrastructure first. Why? Because to the dev it’s pretty much the same thing, just with easier debugging and monitoring.