When the Docker revolution started, one argument among many that was put forward to use containers instead of virtual machines was their size. Container images were supposed to be small.
However, several anti-patterns emerged quickly in Docker's early days. First, most people wanted to treat containers just like VMs: they wanted an SSH server in them, they wanted to run multiple processes in them, and they wanted their regular Linux distributions.
This quickly grew the size of the Docker images that could be pulled from the Docker Hub. Official Ubuntu and CentOS images used to be above 600 MB. Once dependencies and application code got added, it was not rare to see Docker images of several GB around.
Folks reacted rather quickly and several things happened:
- The official regular distro images got smaller
- FROM scratch became very popular
- Alpine took off
Alpine Linux indeed allows you to create very small Docker images. It is based on BusyBox and musl libc, and it is rooted in embedded Linux. While very useful for testing and development, I believe that Alpine is challenging in an enterprise setting used to CentOS and Debian, where package provenance and patching are critical and where code may break unexpectedly with musl.
But if we leave those concerns aside, is it really that small?
Comparing Official Images
If you pull the latest official images of well-known distributions, you get the following sizes:
- centos:7 = 191 MB
- debian:jessie = 194 MB
- debian:jessie-slim = 80 MB
- ubuntu:16.04 = 129 MB
- alpine:3.4 = 4.8 MB
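The sizes listed above can be reproduced with the Docker CLI. Here is a minimal sketch, assuming a Docker client on the PATH and a reachable daemon; image_sizes is a hypothetical helper name, and the sizes reported today may differ from the figures quoted in this post since images get rebuilt.

```shell
# Print "repository:tag = size" for each image name passed as an argument.
# Assumes a Docker CLI is on PATH and a daemon is reachable.
image_sizes() {
  for image in "$@"; do
    # Pull first so the image is present locally; stop on the first failure.
    docker pull "$image" >/dev/null || return 1
    docker images --format '{{.Repository}}:{{.Tag}} = {{.Size}}' "$image"
  done
}

# Usage (not run here, since it downloads several hundred MB):
# image_sizes centos:7 debian:jessie debian:jessie-slim ubuntu:16.04 alpine:3.4
```

Defining a function rather than running the commands inline keeps the snippet safe to source without triggering any download.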
My first reaction is that Alpine is super small. The second is that the CentOS and Debian communities have reacted very well since the early days and shrunk their images. The third is: what is actually in those images?
If we just compare image sizes, aren't we comparing apples and oranges?
I will leave chasing the source of the Dockerfile(s) that make those base images to another post. This is another rabbit hole, one that really scared me. But I am a Python guy, so let's see how to run Python with those images.
The first "surprise" is that Python is already in CentOS by default:
    $ docker run --rm -it centos:7 python
    Python 2.7.5 (default, Nov 6 2016, 00:28:07)
    [GCC 4.8.5 20150623 (Red Hat 4.8.5-11)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>>
But it is not in the Debian, Ubuntu or Alpine images. If we look at the official Python images from the Docker Hub (which is a recommended best practice), we see that they are based on Debian (via FROM buildpack-deps). Let's pull a few and check their sizes:
- python:2.7 = 676 MB
- python:2.7-slim = 181 MB
- python:2.7-alpine = 72 MB
And there, big surprise: the official image is huge, and the Alpine one jumps to 72 MB compared to 4.8 MB when empty. The slim image is at 181 MB, really close to the 191 MB of straight-up centos:7.
Caveat: I did look a bit deeper into the buildpack images. Interestingly, python:2.7-slim is actually not based on the official debian:jessie-slim. If you were to use that official Debian image, you would get a 131 MB Python image, a bit less than 2x the size of the Alpine one.
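As a quick sanity check on these multiples, here is a small sketch that divides the sizes quoted in this post. The ratio helper is a hypothetical name introduced for illustration; it only needs awk.

```shell
# ratio A B: print A divided by B, rounded to one decimal place.
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

ratio 191 72    # centos:7 (Python included) vs python:2.7-alpine, prints 2.7
ratio 131 72    # Python on debian:jessie-slim vs python:2.7-alpine, prints 1.8
ratio 72 4.8    # python:2.7-alpine vs the bare alpine:3.4 base, prints 15.0
```

The last line is the telling one: adding Python multiplies the size of the Alpine base by about fifteen, which is why comparing bare base images says little about the final image.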
Now for the last step of this post.
Let's say that you are not happy with a 191 MB official CentOS image or a 131 MB Debian-based image, solely because they are 2x or 3x bigger than the Alpine-based Python image (musl and apk packages aside). Could we shrink a Python image?
The bitnami/minideb:jessie image sits at 50 MB, gives you standard glibc and access to standard Debian packages. It also has a convenience script (install_packages) to install packages, clean the cache, remove man pages and so on, in order to keep images small.
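To give an idea of what such a convenience script does, here is a hedged sketch of an "install, then clean up" helper for a Debian-based image. This is not the actual minideb script: install_and_clean and the APT_LISTS_DIR override are hypothetical names, and the sketch covers only the package cache and index cleanup, not man pages.

```shell
# Hypothetical stand-in for an "install and clean up" helper; NOT the actual
# minideb script. Meant to run as root inside a Debian-based image.
# APT_LISTS_DIR is an assumed override for where the package indexes live.
install_and_clean() {
  lists_dir="${APT_LISTS_DIR:-/var/lib/apt/lists}"
  # Refresh indexes, install without recommended extras, then drop the
  # downloaded .deb cache and the indexes so they do not stay in the layer.
  apt-get update -qq &&
    apt-get install -y --no-install-recommends "$@" &&
    apt-get clean &&
    rm -rf "$lists_dir"/*
}

# Usage inside a Dockerfile RUN step (not run here): install_and_clean python
```

Doing the install and the cleanup in a single step matters because files removed in a later layer still count toward the image size.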
Let's do a Python image with minideb. A simple Dockerfile will do:

    FROM bitnami/minideb:jessie
    RUN install_packages python

Build and run:
    $ docker build -t minideb-python .
    $ docker run --rm -it minideb-python python
    Python 2.7.9 (default, Jun 29 2016, 13:08:31)
    [GCC 4.9.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>>

And you get it for 79 MB, compared to the 72 MB of the official Alpine-based Python image.
And I am sure we could do a minicentos:7 version of it, which would shrink the official CentOS image even further.
This post is already too long, so let's keep it short:
- Yes, Alpine-based images are very small, but not as small as we think (at least not with Python in them).
- Pervasive distros like CentOS and Debian already have very small official Docker images.
- You can't compare images based solely on size; you need to check how they are actually made (base Dockerfiles, packages installed/removed).
- bitnami/minideb:jessie is a great minimalist Debian-based Docker image which can compare in size with an Alpine image.
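One way to check how an image is actually made, without chasing its Dockerfile, is to look at its layers. Here is a minimal sketch, assuming a Docker CLI recent enough to support --format on docker history; layer_sizes is a hypothetical helper name.

```shell
# List each layer of an image with its size and the instruction that made it.
# Assumes a Docker CLI that supports --format on `docker history`.
layer_sizes() {
  docker history --no-trunc --format '{{.Size}}  {{.CreatedBy}}' "$1"
}

# Usage (not run here): layer_sizes python:2.7-slim
```

Each output line pairs a layer size with the command that created it, which makes it easier to see where the megabytes come from.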