sebgoa on docker, python, google, go, binary

Static Binaries from Python Scripts with Grumpy

Today, Google open-sourced Grumpy.

"Grumpy is an experimental Python runtime for Go. It translates Python code into Go programs, and those transpiled programs run seamlessly within the Go runtime."

While this will surely raise eyebrows, Google justifies the effort by the need to run concurrent Python workloads more efficiently. YouTube, it seems, is based on Python. Interestingly, Grumpy appears to have no GIL (Global Interpreter Lock), a well-known limitation of Python threads.
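
To make the GIL point a bit more concrete, here is a minimal sketch of CPU-bound work spread over threads. Under the standard CPython interpreter only one thread runs Python bytecode at a time, so these threads do not actually execute in parallel; a runtime without a GIL could, in principle, run them on several cores. The function names are purely illustrative, and I have not verified that Grumpy's threading support covers everything used here, so treat that as an assumption.

```python
import threading


def count_primes(limit):
    """CPU-bound work: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        is_prime = True
        d = 2
        while d * d <= n:
            if n % d == 0:
                is_prime = False
                break
            d += 1
        if is_prime:
            count += 1
    return count


def run_in_threads(limits):
    """Run count_primes for each limit in its own thread and collect the results."""
    results = [None] * len(limits)

    def worker(index, limit):
        # Each thread writes only to its own slot, so no lock is needed here.
        results[index] = count_primes(limit)

    threads = [threading.Thread(target=worker, args=(i, limit))
               for i, limit in enumerate(limits)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling run_in_threads([200000] * 4) under CPython typically takes about as long as running the four counts one after another, which is the limitation the GIL imposes on this kind of workload.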

What quickly sparked my interest was the ability to generate a static binary from a Python script and use that binary in a container, making the container image extremely small.

There are solutions to build static binaries of Python scripts (e.g., PyInstaller), but I have never been very lucky with them. With Grumpy it really took me 10 minutes, so let me share:

Install Grumpy

Remember that this is experimental; nobody is saying that you should use this in production today.

Assuming you have a working Go environment, clone Grumpy, run make, and set a few temporary environment variables:

git clone https://github.com/google/grumpy.git  
cd grumpy  
make  
export GOPATH=$PWD/build  
export PYTHONPATH=$PWD/build/lib/python2.7/site-packages  

Translating Python to Go

You are now ready to translate your Python script to Go.

Let's take a toy print statement:

$ cat hello.py
print "hello, world"

Translate it with tools/grumpc, build it with Go and execute it:

$ tools/grumpc hello.py > hello.go
$ go build -o hello hello.go
$ ./hello
hello, world  

If you are not on Linux, you can use Go to build for another platform:

GOOS=linux GOARCH=amd64 go build -o hello-linux-amd64 hello.go  

You now have a static binary for 64-bit Linux. And that means... drum roll... you can easily stick it in a container.
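
The translate-and-build steps above can be wrapped in a small helper if you want to repeat them for several scripts. This is only a sketch run with your regular Python interpreter: build_plan and run_plan are hypothetical names of my own, the tools/grumpc and go build invocations are the ones shown above, and the binary naming scheme simply mirrors hello-linux-amd64. It assumes you run it from the grumpy checkout with GOPATH and PYTHONPATH already exported.

```python
import os
import subprocess


def build_plan(script, goos=None, goarch=None):
    """Describe the two steps for one script without running anything."""
    base = os.path.splitext(os.path.basename(script))[0]
    go_file = base + ".go"
    if goos and goarch:
        # Cross-compilation: name the binary after the target, as in the post.
        binary = "%s-%s-%s" % (base, goos, goarch)
        env = {"GOOS": goos, "GOARCH": goarch}
    else:
        binary = base
        env = {}
    return {
        "translate": ["tools/grumpc", script],  # its stdout becomes go_file
        "go_file": go_file,
        "build": ["go", "build", "-o", binary, go_file],
        "env": env,
        "binary": binary,
    }


def run_plan(plan):
    """Execute a plan produced by build_plan and return the binary name."""
    # Equivalent of: tools/grumpc hello.py > hello.go
    with open(plan["go_file"], "w") as out:
        subprocess.check_call(plan["translate"], stdout=out)
    # Equivalent of: [GOOS=... GOARCH=...] go build -o hello hello.go
    env = dict(os.environ)
    env.update(plan["env"])
    subprocess.check_call(plan["build"], env=env)
    return plan["binary"]
```

For example, run_plan(build_plan("hello.py", "linux", "amd64")) would leave a hello-linux-amd64 binary in the current directory, provided both commands succeed.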

Putting it in a Container

Since you built a static binary, you do not need a Docker base image with the Python runtime. You can just use FROM scratch. That is the real kicker in my view.

Here is the basic Dockerfile:

FROM scratch  
ADD hello-linux-amd64 /  
CMD ["/hello-linux-amd64"]  

Build the image and run the container:

$ docker build -t hello .
$ docker run hello
hello, world  

The image is now 4.8 MB. Compared with the 72 MB of the official Python Alpine image, this is a huge reduction.

Caveats

Not surprisingly, the 4.8 MB for a single print statement is a big bump from the 21 bytes of the actual Python source. But it is executable as is, without any dependencies or separate runtime. It will be interesting to see how this grows when we start using real Python scripts with imported modules.
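
For the curious, the size figures can be checked with a few lines of arithmetic. This sketch assumes hello.py is the single line print "hello, world" (with the comma, matching the program's output) plus a trailing newline, and that "MB" means 10**6 bytes; the image sizes are the ones quoted in this post, not fresh measurements.

```python
# Assumption: hello.py is exactly this one line plus a trailing newline.
source = 'print "hello, world"\n'
source_bytes = len(source)

# Assumption: "MB" means 10**6 bytes; sizes are the ones quoted in the post.
MB = 10 ** 6
grumpy_image_bytes = 4.8 * MB
alpine_image_bytes = 72 * MB

# How many times larger the image is than the script it runs.
growth_vs_source = grumpy_image_bytes / source_bytes

# How many times smaller the scratch image is than the Python Alpine image.
reduction_vs_alpine = alpine_image_bytes / grumpy_image_bytes

print("source: %d bytes" % source_bytes)
print("image is about %d times the source size" % round(growth_vs_source))
print("scratch image is about %d times smaller than the Python Alpine image"
      % round(reduction_vs_alpine))
```

So the image is a few hundred thousand times the size of the script, yet roughly 15 times smaller than the Python Alpine image.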

The big caveat, since Grumpy was just released, is the lack of a toolchain that makes existing Python modules usable. One will need to get their source instead of installing them via PyPI, and have Grumpy translate them as well.

It could also mark a new evolution, where Python developers start using Go packages directly in their scripts.

It is still very early for Grumpy, but even setting aside the big implication of having no GIL, the possibility of creating a static binary that I can use with a FROM scratch image is terrific. Well done!