Matt Weinberger
Reporter

Contain yourself: The layman’s guide to Docker

News Analysis
Nov 21, 2014 | 5 mins
Linux, Virtualization

Don't worry, this article is self-contained.

Credit: Shutterstock

Welcome to the age of containerization, where an ecosystem led by startup Docker is pushing IT organizations to ineffable peaks of efficiency, helping them scale their workloads ever higher, and probably baking them a nice cake to boot (it’s my birthday, I have cake on the brain, sue me). Microsoft, Google and Amazon Web Services are all tripping over themselves to make sure prospective customers know that their clouds are the place to be if you want to get the most from Docker.

That’s great and all, but what really is Docker, and why are containers suddenly such a hot topic? Without getting lost in the weeds, and without breaking out the diagrams, let’s take a look.

The big idea behind Docker, which arrived at a stable version 1.0 over the summer, is nothing new. Docker is essentially a wildly popular open source implementation of lightweight Linux containers, with some secret sauce on top (standardizing them in the process). The company sells services on top of these containers, in something like the Red Hat model.

Linux containers have been part of the kernel since 2008, and enable — I’m quoting from the Wikipedia entry here — “operating system-level virtualization through a virtual environment that has its own process and network space, instead of creating a full-fledged virtual machine.” 
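
If that still sounds abstract, a couple of illustrative commands (assuming Docker and the stock Ubuntu image are on hand) show what that isolation means in practice:

    # The container reports the host's kernel version, because no separate guest OS is booted
    docker run --rm ubuntu uname -r

    # But it gets its own hostname, process table and network interfaces
    docker run --rm ubuntu hostname

The second command prints an auto-generated container ID rather than your machine’s name, because each container gets its own isolated slice of the system.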

If you’re a big public cloud customer, you probably already see why this is a big deal. Virtual machines are the cornerstone of cloud computing, where we abstract away the hardware and focus on running as many guest operating systems as the infrastructure will support. That’s a fancy (and oversimplified) way of saying that you trick one server into thinking that it’s five or 10 or 100 virtual machines, all running Ubuntu or Windows Server or whatever, and repeat across your entire server farm. 

At the scale of your average data center, the gains in speed, resilience and resource utilization are significant. At the scale of a Google or an Amazon Web Services, it’s what makes modern web services possible. 

But there’s an issue here: Virtual machines are tricky beasts. Because each one relies on its own guest operating system to run applications, tiny differences between environments can add up to a lot of complexity when trying to move workloads between your server rack and the cloud. Or from the cloud to your server rack. Or from one cloud to another. Configuration management can be a nightmare. And that’s on top of the fact that just running the operating system can eat a lot of CPU cycles.

So let’s go back to Docker, which famously got its name from the shipping containers you can see on any major cargo vessel in any seaport. Shipping goods internationally across the water was an untenable proposition at large scales for a long time, goes Docker founder Solomon Hykes’ standard pitch, until the invention of the standardized shipping container. Before containers, a ship would have to account for hundreds or thousands of different goods on board, all of different shapes and sizes; with containers, transportation companies can stack as many of the colorful boxes atop each other as they like without the slightest concern for what’s actually in them.

So extend that metaphor to the cloud. By packaging an application and all its dependencies in such a way that it doesn’t require a full-fledged virtual machine to run, you can shove as many as you want onto a single host Linux operating system. So if you’re running Ubuntu on your home PC, and you package up your applications (web servers and databases are popular candidates) into a Docker container, they’ll run just as well on any other modern Linux distro, whether it’s running on a private cloud, a standard server or Amazon Web Services.
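
To make that concrete, here’s a minimal, hypothetical Dockerfile, the plain-text recipe Docker uses to build a container image. The base image, file name and port below are made up for illustration; the point is that everything the app needs travels with it.

    # Start from a standard Ubuntu base image
    FROM ubuntu:14.04

    # Install the app's system dependencies (here, just Python)
    RUN apt-get update && apt-get install -y python

    # Copy the hypothetical application code into the image
    COPY app.py /opt/app/app.py

    # The port the app listens on
    EXPOSE 8000

    # What to run when the container starts
    CMD ["python", "/opt/app/app.py"]

Build it once with "docker build -t myapp ." and start it with "docker run -p 8000:8000 myapp" on any machine with Docker installed; the resulting image behaves the same on a laptop, a rack server or a public cloud.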

The advantage, besides the portability, is that you no longer need a virtual machine spun up for each and every app. That means more processing power is freed up for more Docker containers or any other application you want to run. To wit: You can do a lot more with a lot less.

Understand, though, that containers are pretty dumb on their own, lacking functions for replication and scheduling or a robust way to handle packages and install services.

This is where that burgeoning ecosystem comes in. Startups like CoreOS provide an extremely stripped-down Linux operating system built to host those containerized apps, providing higher-level services without hurting performance. Meanwhile, Google — a longtime user of Linux containers itself — released Kubernetes, a Docker container management and orchestration tool, to the open-source community.

This really only touches on some of what’s happening in the market, but it has a lot of people very excited. That’s why the market is reacting so quickly and so many cloud providers are rushing to declare their support.

But Docker isn’t for every application. Some, especially legacy apps, have too many dependencies or too many complexities to be packaged up neatly. Newer apps, designed from the start to run at web scale, tend to fare better: WordPress, MySQL, Redis and Nginx are among the most popular images on Docker Hub, the project’s central image repository.

That’s why companies like VMware and Microsoft choose to work with Docker rather than crush it, despite the seeming threat it poses to their lucrative virtual machine hypervisor businesses: Many IT shops are finding success in trimming down their footprint with Docker, but still rely on VMs for many applications. 

Since Linux containers have been around for a while, Docker isn’t the only game in town; it’s just the most popular. Joyent and Canonical both recently open-sourced their own spins on the concept of containerization, putting some competitive pressure on the market.

So yes, containers are here to stay.