It's time to get a faster, cleaner anaconda

Published 6/6/2019 7:53:51 AM
Filed under Machine Learning

Why I downloaded the massive Anaconda binaries in the past is beyond me, to be honest. I like to use virtual environments, so almost none of the stuff in the base Anaconda installation is ever used. Time to change things around and get a leaner, faster Anaconda installation.

In this post, I'll show you how to switch your computer to Miniconda and how I use virtual environments in my projects to keep a clean workspace.

Getting miniconda

Let's start by getting Miniconda installed. Miniconda is an Anaconda variant that contains a minimal set of tools to get started with Python on Anaconda. Unlike the full-blown Anaconda distribution, it doesn't contain packages like scipy, scikit-learn and pandas. You have to install those manually.

The benefit of Miniconda is that it is only 58MB on Windows. That's a lot less than the 500MB installer that you have to download for Anaconda.

You can download the miniconda installer from the anaconda website. Once downloaded, follow the instructions to install it on your machine.

Make sure to include the anaconda tools in your PATH variable. The installer will provide the options for this during installation.
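If you want to confirm that the PATH setting took effect, a small Python check can help. This is only a sketch: the helper name find_tool is my own, and it assumes the installer exposes executables named conda and python.

```python
import shutil


def find_tool(name):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    # "conda" and "python" are the tools we assume the installer exposes.
    for tool in ("conda", "python"):
        location = find_tool(tool)
        print(f"{tool}: {location or 'not found on PATH'}")
```

If conda shows up as not found, open a new terminal first, because PATH changes only apply to newly started sessions.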

Now that you have Miniconda installed, let's take a look at how you can use it from PowerShell and Visual Studio Code on Windows.

Setting up PowerShell for Python development

My personal favorite terminal on Windows is PowerShell. It has a pretty decent scripting engine and many open-source extensions exist for it.

Setting up useful PowerShell modules

For example, I like to have two extensions installed on my machine: posh-git and oh-my-posh.

The posh-git extension shows the git status for the current directory. With this you can quickly see if you have any pending changes or need to push commits to the server.

The second extension, oh-my-posh, provides a customizable prompt with several themes. It's not strictly necessary, but it's a nice addition and it makes my command prompt less boring.

To install these extensions, you first need to set up the NuGet module feed for PowerShell:

Install-PackageProvider NuGet -MinimumVersion '2.8.5.201' -Force
Set-PSRepository -Name PSGallery -InstallationPolicy Trusted

First, we add the NuGet package provider to PowerShell. Next, we configure the policy for the PSGallery repository so it is trusted. Note that you can always disable this afterwards if you want.

After setting up the feed, we can install the modules using the following commands:

Install-Module -Name 'posh-git'
Install-Module -Name 'oh-my-posh'

Once installed, edit your PowerShell profile using the following command:

notepad $profile

Include the following lines in the file:

Import-Module 'posh-git'
Import-Module 'oh-my-posh'
Set-Theme agnoster

The first line imports the git module for the status information. The second line imports the oh-my-posh module. Finally, the third line sets the theme for the oh-my-posh module to agnoster.

Let's modify PowerShell some more with the addition of the anaconda tools.

Adding anaconda commands to PowerShell

Anaconda has a built-in virtual environment manager that allows you to run different Python versions side-by-side on your computer. This is an important tool for Python developers, because you will want different versions of packages depending on the project you're working on. Virtual environments support this.

To use the virtual environment tooling of Anaconda in PowerShell, you need to execute the following command:

conda init powershell

This will install a custom module in PowerShell that allows you to use virtual environments.

With the PowerShell environment set up, let's take a quick look at how to use virtual environments.

Working with miniconda inside PowerShell

In the previous sections, we've talked about installing and configuring Miniconda. Because Miniconda ships with only a minimal set of packages, you'll install what each project needs into its own virtual environment. This might seem like a lot more work, but I assure you it will save you on headache medication.

To work on a new project, you'll need to create a virtual environment. You can create a new virtual environment with the following command:

conda create -n myenv python=3.7 pip

Running this command creates a new environment named myenv with Python 3.7 and pip.

After you've created the environment, you can activate it using the following command:

conda activate myenv
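To double-check that the activation worked, you can ask Python itself which interpreter and environment it is running in. The sketch below relies on the CONDA_DEFAULT_ENV and CONDA_PREFIX variables that conda sets on activation; the function name describe_environment is just something I made up for illustration.

```python
import os
import sys


def describe_environment(environ=None):
    """Summarize which Python interpreter and conda environment are in use."""
    environ = os.environ if environ is None else environ
    return {
        # conda sets these variables when an environment is activated.
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),
        "conda_prefix": environ.get("CONDA_PREFIX"),
        # sys.prefix is the installation prefix of the running interpreter.
        "python_prefix": sys.prefix,
        "python_version": "{}.{}".format(*sys.version_info[:2]),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

With myenv active, conda_env should read myenv and python_prefix should point into the environment's folder rather than the base installation.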

Once activated, you can start installing packages using conda or pip:

conda install scipy
pip install tensorflow

Installing packages locally is really easy and also a little dangerous. Unless you tell your coworkers about the packages, they will probably run into errors telling them they are missing packages.
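A quick way to see whether an environment is missing something is to ask Python which modules it can find. This is a minimal sketch, not a replacement for a proper environment file. The helper name missing_packages and the list of names are assumptions for the example, and it expects top-level import names, which can differ from the package name (scikit-learn is imported as sklearn, for instance).

```python
import importlib.util


def missing_packages(names):
    """Return the import names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Hypothetical list of top-level import names a project depends on.
    required = ["scipy", "tensorflow"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```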

To keep everyone's packages in sync, you can export your environment to a descriptor file. Run the following command to export the environment:

conda env export > environment.yml

This will create an anaconda environment file that contains all the settings you used to configure your virtual environment. Your coworkers can now get the same environment by using the following command:

conda env update -f=environment.yml

This will update the environment with the settings from the environment file. Similarly, you can create a new environment based on the file using the following command:

conda env create -f=environment.yml -n myenv
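The environment file is plain YAML, so it's easy to inspect. The sketch below uses a hand-written, simplified sample (a real export pins exact versions and lists many more packages) and a deliberately naive line scanner to pull out the conda dependencies. The names SAMPLE and conda_dependencies are made up for this example, and for anything serious you'd want a real YAML parser instead.

```python
# Hand-written, simplified stand-in for the output of "conda env export".
SAMPLE = """\
name: myenv
channels:
  - defaults
dependencies:
  - python=3.7
  - pip
  - scipy
  - pip:
    - tensorflow
"""


def conda_dependencies(text):
    """Collect top-level entries under 'dependencies:' (rough sketch, not a YAML parser)."""
    deps = []
    in_deps = False
    for line in text.splitlines():
        if not line.startswith(" "):
            # An unindented line is a top-level key; track whether it opens the section.
            in_deps = line.strip() == "dependencies:"
            continue
        # Keep plain list items; skip the nested "- pip:" block and its children.
        if in_deps and line.startswith("  - ") and not line.rstrip().endswith(":"):
            deps.append(line[4:].strip())
    return deps


if __name__ == "__main__":
    print(conda_dependencies(SAMPLE))
```

Packages installed with pip end up in the nested pip section, which this scanner skips on purpose.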

Done working on your project? You can deactivate the environment using the command:

conda deactivate

Here's to your new clean Python workspace

And there it is, a cleaner and faster way to work with Python inside PowerShell. Enjoy!