ML.NET Tutorial: Getting started with ML.NET

Published 8/2/2019 7:25:48 AM
Filed under Machine Learning

A while ago, Microsoft released ML.NET, a machine learning library for .NET. It lets C# engineers who want to take their machine learning to a higher level build custom models rather than relying on the standard Cognitive Services. In this post I'll show you how to get started.

Since some people don't like to read, I've also created a short video demonstrating how to get started with ML.NET.

In this post we'll cover the following topics:

  • What is ML.NET?
  • Installing the ML.NET command-line utilities
  • Building your first machine learning model

What is ML.NET?

ML.NET is a library and a set of tools that allow .NET developers to build and integrate machine learning into their .NET application.

With ML.NET, you can create regression, classification, recommendation, and forecasting models in C#.

You can build machine learning models in ML.NET using code, or you can generate them automatically using the command-line tools. In this post we're focusing on the automated machine learning tools. I'll get back to the code in a future post.

Installing the ML.NET command-line utilities

To get started with automated machine learning in ML.NET, you'll need the command line utility or Visual Studio. The command-line tool can be installed using the following command:

dotnet tool install -g mlnet

Make sure you have .NET Core 2.2 or later installed on your machine before running this command. If you haven't got it yet, you can get it from the website: https://dot.net/
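If you're not sure which version you have, a quick check like the one below can help. This is a minimal sketch: version_at_least is a helper name I made up, and I'm assuming that the major and minor numbers printed by dotnet --version (the SDK version) line up with the .NET Core version, which is usually but not always the case.

```shell
# Hypothetical helper: succeed when version $1 (e.g. "2.2.401")
# is at least major $2, minor $3.
version_at_least() {
  major=${1%%.*}          # text before the first dot
  rest=${1#*.}            # text after the first dot
  minor=${rest%%.*}       # text before the next dot
  [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

if command -v dotnet >/dev/null 2>&1; then
  if version_at_least "$(dotnet --version)" 2 2; then
    echo "dotnet looks new enough for the mlnet tool"
  else
    echo "please install .NET Core 2.2 or later from https://dot.net/"
  fi
else
  echo "dotnet not found; install it from https://dot.net/"
fi
```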

Tip: If you've tried ML.NET before, you may run into an error telling you that the tool is already installed. In this case, run dotnet tool uninstall -g mlnet to uninstall the old version. Then, run the original install command to get the updated version of the tool.

For Mac users: I ran into a problem where the tool did install, but couldn't be found because its folder wasn't in the `$PATH` variable of my system. I extended my `$PATH` to include the .NET tools folder like so:

export PATH=$PATH:~/.dotnet/tools/

After this, the tool worked like a charm!
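One thing to keep in mind: the export only lasts for the current terminal session, and if you put it in your shell profile as-is, it appends the folder again every time the profile is loaded. Here's a small sketch that only adds the folder when it's missing. The add_to_path function is a name I made up for this example, not something that ships with .NET.

```shell
# Hypothetical helper: append a directory to PATH only if it is not there yet.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;                # already present, nothing to do
    *) PATH="$PATH:$1" ;;       # append to the end of PATH
  esac
}

add_to_path "$HOME/.dotnet/tools"
export PATH
```

Calling add_to_path twice with the same folder leaves a single entry in the variable.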

Let's move on to building our first machine learning model.

Building your first machine learning model

With the tools installed, let's jump into creating a machine learning model. We're going to use a toy dataset for this from the UCI machine learning repository. After we've downloaded the data and transformed it to CSV, we're going to use the ML.NET command-line tool to train a model.

Getting sample data from the UCI machine learning repository

First, we need to download a dataset. You can get some pretty cool datasets from the UCI machine learning repository: https://archive.ics.uci.edu/ml/datasets.php

Among the samples is a dataset that can be used to train a regression model that predicts the mileage of a car based on its properties. You can get it here: https://archive.ics.uci.edu/ml/datasets/Auto+MPG

You'll find that the dataset is stored in a weird format: the columns are separated by whitespace instead of commas. I've gone ahead and converted it to CSV using Python. You can get the CSV file from my GitHub account: https://gist.github.com/wmeints/80c1ba22ceeb7a29a0e5e979f0b0afba
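If you'd rather convert the raw file yourself, the sketch below shows one way to do it from the shell. It is not the Python script I used. It assumes that every row in auto-mpg.data has eight whitespace-separated values followed by the car name in double quotes. The header names follow the attribute list on the dataset page (with underscores added by me) and may differ from the ones in the gist. Missing values, marked with a ? in the raw file, are passed through unchanged.

```shell
# Sketch: convert the whitespace-separated auto-mpg.data file into CSV.
# convert_auto_mpg is a hypothetical helper name; $1 is the raw data file.
convert_auto_mpg() {
  echo 'mpg,cylinders,displacement,horsepower,weight,acceleration,model_year,origin,car_name'
  awk '
    # Skip blank or incomplete lines.
    NF < 9 { next }
    {
      # Join the first 8 whitespace-separated fields with commas.
      row = $1
      for (i = 2; i <= 8; i++) row = row "," $i
      # Everything from the first double quote onwards is the quoted car name.
      name = substr($0, index($0, "\""))
      print row "," name
    }
  ' "$1"
}
```

You would then run convert_auto_mpg auto-mpg.data > auto-mpg.csv to produce the file used in the rest of this post.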

Using the ML.NET command-line utility

With the dataset in place, we can train an ML.NET machine learning model using the following command:

mlnet auto-train --task regression --dataset auto-mpg.csv --label-column-index 0 -x 30

This command takes the following parameters:

  • --task: The machine learning task to solve, regression in our case (auto-train also accepts binary-classification and multiclass-classification)
  • --dataset: The path to the dataset
  • --label-column-index: The index of the column that we want to predict
  • -x: The maximum amount of time, in seconds, spent exploring different models

I've set the time to explore really low (30 seconds) for demonstration purposes. Depending on your dataset you will need more time to get a good model.
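Counting columns by hand to find the right index gets old quickly with wider datasets. The sketch below looks up the index from the header row instead. Some assumptions on my part: column_index is a made-up helper, the CSV has an unquoted header row containing a column named mpg, and the index is zero-based (which matches the 0 used above if mpg is the first column).

```shell
# Hypothetical helper: print the zero-based position of a named column
# in the header row of a CSV file (prints nothing when the column is missing).
column_index() {
  head -n 1 "$1" | tr ',' '\n' | awk -v name="$2" '$0 == name { print NR - 1; exit }'
}

# Train only when the CLI and the dataset are present and the column was found.
# "mpg" is an assumed header name; check the first line of your CSV.
if command -v mlnet >/dev/null 2>&1 && [ -f auto-mpg.csv ]; then
  idx=$(column_index auto-mpg.csv mpg)
  if [ -n "$idx" ]; then
    mlnet auto-train --task regression --dataset auto-mpg.csv --label-column-index "$idx" -x 30
  fi
fi
```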

Exploring the generated output

Once the command line tool is done, it will generate a sample project and a model project for you. The image below shows what the project looks like in Visual Studio Code.

The generated projects in Visual Studio Code

There are two projects in the solution:

  1. The model project, which contains the trained model.
  2. A sample console application, demonstrating the use of the model project.

The easiest way to get the model into your project is to copy the sources of the model project to your own application and use the console application as a template to write the necessary glue code.

Summary

In this post you saw how to get started with ML.NET and where to get free sample datasets to start learning. In the next post we'll explore the pipeline further and learn how to build a machine learning model without automated machine learning tools.

Thanks for reading!