How-to: run a Microsoft Cognitive Toolkit-based application inside a Docker image

Published 12/7/2017 5:40:39 AM
Filed under Machine Learning

As part of a Community Hack at Microsoft Denmark, people can hack together basic AI models with CNTK. Two fellow MVPs asked me if they could use Docker to work on CNTK applications.

Sure, you can use a Docker image to work on CNTK applications. Microsoft publishes a Docker image for CNTK, and I've created an image on top of it that includes some additional configuration to make things even easier. You can download it right here: ai-community-hack image

How does it work

Microsoft has kindly created a Docker image that comes with the Cognitive Toolkit preinstalled. They provide both a CPU version and a GPU version of it. I have used the CPU version, but you could switch to the GPU version by basing your image on the GPU variant instead.

To use the microsoft/cntk image, you base your own image on it. Basically like this:

FROM microsoft/cntk:2.3-cpu-python3.5

....

You then add a CMD line to your own Dockerfile and execute your Python code with that. The format of the CMD line, however, is somewhat different from the usual one.

You first have to activate the CNTK environment in the Docker image and then run your Python command, like so:

CMD ["/bin/bash","-c","source /cntk/activate-cntk && python <your-script-path>"]
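As a rough illustration of what <your-script-path> could point to, here is a minimal sketch of a script that checks whether the cntk package is importable before doing any real work. The function names are hypothetical and not part of the image; the sketch only assumes that the cntk package exposes a `__version__` attribute once the environment has been activated.

```python
import importlib.util


def cntk_available():
    """Return True if the cntk package can be found on the current Python path."""
    return importlib.util.find_spec("cntk") is not None


def describe_environment():
    """Return a short, human-readable status line about the CNTK environment."""
    if not cntk_available():
        # Most likely the activation script was not sourced before running Python.
        return "cntk not found; was /cntk/activate-cntk sourced first?"
    import cntk  # imported lazily so the check above can report a clear message
    return "cntk version: " + cntk.__version__


if __name__ == "__main__":
    print(describe_environment())
```

If the script is started without the activation step, it should print the "not found" message instead of failing with an import error.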

Instead of executing just a Python program, I opted to run the Python notebook server. This enables you to edit Python code from the browser.

Check out the contents of the Dockerfile to get a sense of what that looks like.

How can you use this for production

In any normal production scenario you'd execute Python code that starts a service or trains a model and stores that model somewhere on disk or on a server.

Python notebooks aren't very useful in this situation. I usually follow this pattern to build a model that I want to use in production later:

  1. Build, train and validate the experiment in a Python notebook
  2. Set up a proper Python project with a __main__.py
  3. Copy-paste the code blocks from my experiment to Python files
  4. Write a bunch of unit tests and refactor the code to structure it properly
  5. Create a new Dockerfile that includes my Python program
  6. Build and publish my Docker image to a server
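
Steps 2 and 3 can be sketched roughly as follows. This is a minimal, hypothetical `__main__.py`, not the code from my image: `train_model` and `save_model` are placeholders rather than CNTK APIs, and the `--epochs` and `--output` options are assumptions made for illustration. The real training code from the notebook would replace the body of `train_model`.

```python
import argparse
import json
from pathlib import Path


def train_model(epochs):
    """Placeholder for the training code copied from the notebook.

    A real implementation would build and train a CNTK model here.
    """
    return {"epochs": epochs, "weights": [0.0, 0.0, 0.0]}


def save_model(model, output_path):
    """Write the placeholder model to disk as JSON, creating folders as needed."""
    path = Path(output_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(model))
    return path


def main(argv=None):
    """Parse command-line options, train the model and store it on disk."""
    parser = argparse.ArgumentParser(description="Train a model and store it on disk.")
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--output", default="model.json")
    args = parser.parse_args(argv)

    model = train_model(args.epochs)
    save_model(model, args.output)
    return 0


if __name__ == "__main__":
    main()
```

Keeping the training and saving logic in separate functions makes step 4 easier, since each function can be unit-tested without going through the command line.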

Thoughts and/or ideas?

I hope you find the Docker image useful. If you have any thoughts or ideas, let me know in the comments below!