ML.NET – Introduction to Machine Learning With C#

Posted by MrD Brains | Updated Date: May 10, 2024

To download the source code for this article, you can visit our GitHub repository.


Introduction

Machine learning is one of the most exciting and rapidly evolving fields in computer science. Lately, we have seen the emergence of advanced AI tools such as ChatGPT. The basis for these tools is machine learning, and its growing prevalence should make developers sit up and take notice.

What Is Machine Learning and How Does ML.NET Enable It?

Machine Learning, or ML for short, is a field of computer science that involves training algorithms to recognize patterns in data. Predictions or decisions are made based on these patterns. The goal of the machine learning model is to predict a new system state based on previous states.

From a C# developer’s point of view, machine learning can be challenging because building and training models requires a lot of specialized knowledge and resources. That is where ML.NET comes into play.

ML.NET is an open-source machine learning framework that makes it simpler for C# developers to build and deploy machine learning models. It provides a range of algorithms for supervised and unsupervised learning, as well as tools for data preparation, training, evaluation, and deployment.

Setting Up the Development Environment

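The first step is to install the required ML.NET packages. In Visual Studio, we can do that from the Package Manager Console:

PM> Install-Package Microsoft.ML

Alternatively, we can use the NuGet Package Manager UI, or run dotnet add package Microsoft.ML from the command line. Multiple ML.NET packages are available; for most projects, installing only the Microsoft.ML package is enough.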

Understanding Supervised Learning

There are three main categories of machine learning:

  1. Supervised learning
  2. Unsupervised learning
  3. Semi-supervised learning

Supervised learning (the focus of this article) trains a model on labeled data, where each data point has a known outcome.
Unsupervised learning trains a model on unlabeled data to find patterns.
Semi-supervised learning combines labeled and unlabeled data.

Common supervised learning algorithms include linear regression, logistic regression, and decision trees. Each algorithm suits different problems and data types.
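To make "training on labeled data" concrete, here is a minimal sketch of the simplest algorithm in that list, linear regression with a single feature, written in plain C# without ML.NET. The LeastSquares helper and the sample points are hypothetical and exist only for illustration.

```csharp
using System;
using System.Linq;

// Labeled training data: every x comes with a known outcome y (here y = 2x + 1).
double[] xs = { 1, 2, 3, 4, 5 };
double[] ys = { 3, 5, 7, 9, 11 };

var (slope, intercept) = LeastSquares.Fit(xs, ys);
Console.WriteLine($"Learned: y = {slope:0.##}x + {intercept:0.##}");
Console.WriteLine($"Prediction for x = 10: {LeastSquares.Predict(slope, intercept, 10):0.##}");

// Hypothetical helper, not part of ML.NET.
public static class LeastSquares
{
    // Ordinary least squares for one feature: slope = cov(x, y) / var(x).
    public static (double Slope, double Intercept) Fit(double[] xs, double[] ys)
    {
        double meanX = xs.Average();
        double meanY = ys.Average();
        double covariance = xs.Zip(ys, (x, y) => (x - meanX) * (y - meanY)).Sum();
        double variance = xs.Sum(x => (x - meanX) * (x - meanX));
        double slope = covariance / variance;
        return (slope, meanY - slope * meanX);
    }

    // Applies the learned line to a new, unlabeled input.
    public static double Predict(double slope, double intercept, double x) =>
        slope * x + intercept;
}
```

Because the sample points lie exactly on a line, the fit recovers a slope of 2 and an intercept of 1. Real datasets are noisy, which is why we evaluate a model on data it has not seen during training.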

Building and Training a Simple Model With ML.NET

We'll use the Credit Risk Customers dataset (21 columns) and focus on credit_amount, duration, age, and class.
The class column is the one we want to predict.

Defining ModelInput and ModelOutput Classes

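// Requires: using Microsoft.ML.Data;
// LoadColumn takes the zero-based index of the column in the CSV file,
// and ColumnName sets the name the column has inside the pipeline.
public class ModelInput
{
    [ColumnName("duration"), LoadColumn(1)]
    public float Duration { get; set; }
    [ColumnName("credit_amount"), LoadColumn(4)]
    public float CreditAmount { get; set; }
    [ColumnName("age"), LoadColumn(12)]
    public float Age { get; set; }
    [ColumnName("class"), LoadColumn(20)]
    public string Class { get; set; }
}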

// Prediction result model
public class ModelOutput
{
    [ColumnName("PredictedLabel")]
    public string Prediction { get; set; }
}

Defining the ModelBuilder Class

public class ModelBuilder
{
    private MLContext _mlContext = new MLContext(seed: 0);
    private PredictionEngine<ModelInput, ModelOutput> _predictionEngine;
    private IDataView _trainingDataView;
    private IDataView _testDataView;
    private ITransformer _mlModel;
    ...
}

Model Creation Method

public void CreateModel(string dataFilePath, string savingPath)
{
    LoadAndSplitData(dataFilePath);
    var pipeline = PreProcessData();

    BuildAndTrainModel(_trainingDataView, pipeline);

    EvaluateModel();
    SaveModel(savingPath);
    _predictionEngine = _mlContext.Model.CreatePredictionEngine<ModelInput, ModelOutput>(_mlModel);
}

Step-by-Step ML.NET Model Development

Collecting Data

private void LoadAndSplitData(string dataFilePath)
{
    var allDataView = _mlContext.Data.LoadFromTextFile<ModelInput>(
                              path: dataFilePath,
                              hasHeader: true,
                              separatorChar: ',');
    var split = _mlContext.Data.TrainTestSplit(allDataView, testFraction: 0.1);
    _trainingDataView = split.TrainSet;
    _testDataView = split.TestSet;
}
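To see what TrainTestSplit does in isolation, here is a small sketch that splits 100 in-memory rows instead of the CSV file. The SampleRow class and its values are made up for this example. As far as we know, the split relies on random sampling, so the test set is only approximately 10% of the rows.

```csharp
using System;
using System.Linq;
using Microsoft.ML;

var mlContext = new MLContext(seed: 0);

// 100 made-up rows stand in for the dataset.
var rows = Enumerable.Range(0, 100).Select(i => new SampleRow { Value = i }).ToList();
IDataView data = mlContext.Data.LoadFromEnumerable(rows);

// Same call as in LoadAndSplitData: roughly 10% of the rows go to the test set.
var split = mlContext.Data.TrainTestSplit(data, testFraction: 0.1);

int trainCount = mlContext.Data
    .CreateEnumerable<SampleRow>(split.TrainSet, reuseRowObject: false).Count();
int testCount = mlContext.Data
    .CreateEnumerable<SampleRow>(split.TestSet, reuseRowObject: false).Count();

Console.WriteLine($"Training rows: {trainCount}, test rows: {testCount}");

// Hypothetical row type used only for this illustration.
public class SampleRow
{
    public float Value { get; set; }
}
```

Every row ends up in exactly one of the two sets, so the counts always add up to the original 100.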

Preparing the Data

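public IEstimator<ITransformer> PreProcessData()
{
    // Append returns a new estimator chain instead of modifying the existing one,
    // so both steps are chained into a single expression.
    var pipeline = _mlContext.Transforms.Conversion
        .MapValueToKey(inputColumnName: "class", outputColumnName: "Label")
        .Append(_mlContext.Transforms.Concatenate("Features", "duration", "credit_amount", "age"));
    return pipeline;
}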
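The following sketch shows what these two transforms produce. It runs the same MapValueToKey and Concatenate steps on three made-up rows and prints the resulting Label and Features columns. The LoanRow and TransformedRow classes and all values are hypothetical and not taken from the dataset.

```csharp
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext(seed: 0);

// Three made-up rows, for illustration only.
var rows = new[]
{
    new LoanRow { Duration = 12, CreditAmount = 1500, Age = 30, Class = "good" },
    new LoanRow { Duration = 48, CreditAmount = 9000, Age = 22, Class = "bad" },
    new LoanRow { Duration = 24, CreditAmount = 3000, Age = 45, Class = "good" },
};
IDataView data = mlContext.Data.LoadFromEnumerable(rows);

// Append returns a new estimator chain, so its result has to be kept.
var pipeline = mlContext.Transforms.Conversion
    .MapValueToKey(outputColumnName: "Label", inputColumnName: "class")
    .Append(mlContext.Transforms.Concatenate("Features", "duration", "credit_amount", "age"));

IDataView transformed = pipeline.Fit(data).Transform(data);

var preview = mlContext.Data
    .CreateEnumerable<TransformedRow>(transformed, reuseRowObject: false)
    .ToList();

foreach (var row in preview)
{
    Console.WriteLine($"Label key = {row.Label}, Features = [{string.Join(", ", row.Features)}]");
}

// Hypothetical input type mirroring the column names used in this article.
public class LoanRow
{
    [ColumnName("duration")] public float Duration { get; set; }
    [ColumnName("credit_amount")] public float CreditAmount { get; set; }
    [ColumnName("age")] public float Age { get; set; }
    [ColumnName("class")] public string Class { get; set; }
}

// Hypothetical output type that reads back the two generated columns.
public class TransformedRow
{
    public uint Label { get; set; }
    public float[] Features { get; set; }
}
```

Each distinct class value gets its own numeric key, and the three numeric columns end up in a single Features vector, which is the shape the trainer expects.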

Training the Model

public IEstimator<ITransformer> BuildAndTrainModel(IDataView trainingDataView, IEstimator<ITransformer> pipeline)
{
    var trainingPipeline = pipeline
            .Append(_mlContext.MulticlassClassification.Trainers.SdcaMaximumEntropy("Label", "Features"))
            .Append(_mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"));
    _mlModel = trainingPipeline.Fit(trainingDataView);
    return trainingPipeline;
}
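To try the training pipeline without the CSV file, here is a self-contained sketch that trains the same kind of model on synthetic, clearly separable data and makes one prediction. The Sample and SamplePrediction classes, the "high" and "low" labels, and the generated values are all assumptions made for this example.

```csharp
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext(seed: 0);

// Synthetic data: points near (1, 1) are "high", points near (-1, -1) are "low".
var random = new Random(0);
var samples = Enumerable.Range(0, 200).Select(i =>
{
    bool isHigh = i % 2 == 0;
    float center = isHigh ? 1f : -1f;
    return new Sample
    {
        X1 = center + (float)(random.NextDouble() - 0.5) * 0.2f,
        X2 = center + (float)(random.NextDouble() - 0.5) * 0.2f,
        Category = isHigh ? "high" : "low",
    };
}).ToList();
IDataView trainingData = mlContext.Data.LoadFromEnumerable(samples);

// Same four steps as in the article: label to key, features to vector,
// trainer, and predicted key back to the original label text.
var pipeline = mlContext.Transforms.Conversion
    .MapValueToKey(outputColumnName: "Label", inputColumnName: nameof(Sample.Category))
    .Append(mlContext.Transforms.Concatenate("Features", nameof(Sample.X1), nameof(Sample.X2)))
    .Append(mlContext.MulticlassClassification.Trainers.SdcaMaximumEntropy("Label", "Features"))
    .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"));

ITransformer model = pipeline.Fit(trainingData);

var engine = mlContext.Model.CreatePredictionEngine<Sample, SamplePrediction>(model);
var prediction = engine.Predict(new Sample { X1 = 1f, X2 = 1f });
Console.WriteLine($"Predicted category: {prediction.Category}");

// Hypothetical input type for the synthetic data.
public class Sample
{
    public float X1 { get; set; }
    public float X2 { get; set; }
    public string Category { get; set; }
}

// Hypothetical output type; PredictedLabel is the column produced by MapKeyToValue.
public class SamplePrediction
{
    [ColumnName("PredictedLabel")]
    public string Category { get; set; }
}
```

Working with a tiny dataset like this is a convenient way to check that a pipeline is wired correctly before training on real data.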

Evaluating the Model

public void EvaluateModel()
{
    var testMetrics = _mlContext.MulticlassClassification.Evaluate(_mlModel.Transform(_testDataView));
    Console.WriteLine($"- MicroAccuracy:\t{testMetrics.MicroAccuracy:0.###}");
    Console.WriteLine($"- MacroAccuracy:\t{testMetrics.MacroAccuracy:0.###}");
    Console.WriteLine($"- LogLoss:\t\t{testMetrics.LogLoss:0.###}");
    Console.WriteLine($"- LogLossReduction:\t{testMetrics.LogLossReduction:0.###}");
}

Sample Evaluation Metrics:
MicroAccuracy: 0.796
MacroAccuracy: 0.514
LogLoss: 34.539
LogLossReduction: -69.23

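The micro accuracy looks decent, but the other metrics show plenty of room for improvement. Since the class column has only two values in this dataset (good and bad), a macro accuracy close to 0.5 suggests that the model rarely gets one of the classes right. A negative LogLossReduction means the predicted probabilities are worse than a baseline that always predicts the class frequencies.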
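The gap between micro and macro accuracy is easier to see with a worked example. The sketch below is plain C# and assumes the usual definitions: micro accuracy is the fraction of all rows classified correctly, macro accuracy is the average of the per-class accuracies, and log-loss reduction is the relative improvement over a prior-based baseline. The Metrics helper and the confusion matrix are hypothetical and not the article's actual results.

```csharp
using System;

// Hypothetical confusion matrix: rows are actual classes, columns are predicted classes.
// 70 "good" and 30 "bad" customers, and a model that always answers "good".
int[,] confusion =
{
    { 70, 0 }, // actual good: 70 predicted good, 0 predicted bad
    { 30, 0 }, // actual bad:  30 predicted good, 0 predicted bad
};

// 70 of 100 rows are correct, so micro accuracy is 0.7.
Console.WriteLine($"Micro accuracy: {Metrics.MicroAccuracy(confusion):0.###}");
// Per-class accuracy is 1.0 for good and 0.0 for bad, so macro accuracy is 0.5.
Console.WriteLine($"Macro accuracy: {Metrics.MacroAccuracy(confusion):0.###}");
// A model log loss above the baseline log loss gives a negative reduction.
Console.WriteLine($"Log-loss reduction: {Metrics.LogLossReduction(1.2, 0.6):0.###}");

// Hypothetical helper, not part of ML.NET.
public static class Metrics
{
    // Fraction of all rows classified correctly.
    public static double MicroAccuracy(int[,] confusion)
    {
        int classes = confusion.GetLength(0);
        double correct = 0, total = 0;
        for (int actual = 0; actual < classes; actual++)
        {
            for (int predicted = 0; predicted < classes; predicted++)
            {
                total += confusion[actual, predicted];
                if (actual == predicted) correct += confusion[actual, predicted];
            }
        }
        return correct / total;
    }

    // Average of the per-class accuracies, so every class counts equally.
    public static double MacroAccuracy(int[,] confusion)
    {
        int classes = confusion.GetLength(0);
        double sum = 0;
        for (int actual = 0; actual < classes; actual++)
        {
            double rowTotal = 0;
            for (int predicted = 0; predicted < classes; predicted++)
            {
                rowTotal += confusion[actual, predicted];
            }
            sum += confusion[actual, actual] / rowTotal;
        }
        return sum / classes;
    }

    // Relative improvement over a baseline that always predicts the class priors.
    public static double LogLossReduction(double modelLogLoss, double priorLogLoss) =>
        (priorLogLoss - modelLogLoss) / priorLogLoss;
}
```

A model that ignores the minority class can still reach a high micro accuracy on an imbalanced dataset, which is why it pays to look at macro accuracy and log loss as well.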

Saving and Loading the Model

private void SaveModel(string saveModelPath)
{
    _mlContext.Model.Save(_mlModel, _trainingDataView.Schema, 
        Path.Combine(Environment.CurrentDirectory, saveModelPath));
}

public void LoadModel(string path)
{
    _mlModel = _mlContext.Model.Load(path, out _);
    _predictionEngine = _mlContext.Model.CreatePredictionEngine<ModelInput, ModelOutput>(_mlModel);
}
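LoadModel discards the second value returned by Load with out _. As a small sketch of what that value contains, the example below saves a trivial fitted transformer together with its input schema, loads it again, and prints the stored column names. The PointRow class, the temporary file name, and the Concatenate-only pipeline are assumptions standing in for a real trained model.

```csharp
using System;
using System.IO;
using System.Linq;
using Microsoft.ML;

var mlContext = new MLContext(seed: 0);

IDataView data = mlContext.Data.LoadFromEnumerable(new[]
{
    new PointRow { X = 1f, Y = 2f },
    new PointRow { X = 3f, Y = 4f },
});

// A trivial fitted transformer stands in for a trained model here.
ITransformer model = mlContext.Transforms
    .Concatenate("Features", nameof(PointRow.X), nameof(PointRow.Y))
    .Fit(data);

// Save the model together with the schema of the data it expects as input.
string modelPath = Path.Combine(Path.GetTempPath(), "roundTripModel.zip");
mlContext.Model.Save(model, data.Schema, modelPath);

// Load returns the model and, through the out parameter, the stored input schema.
ITransformer loaded = mlContext.Model.Load(modelPath, out DataViewSchema inputSchema);
string[] inputColumns = inputSchema.Select(column => column.Name).ToArray();
Console.WriteLine($"Input columns stored with the model: {string.Join(", ", inputColumns)}");

File.Delete(modelPath);

// Hypothetical row type used only for this illustration.
public class PointRow
{
    public float X { get; set; }
    public float Y { get; set; }
}
```

Keeping the schema around can be useful when the loading code needs to check that incoming data matches what the model was trained on.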

Prediction Usage Example

public ModelOutput Predict(ModelInput input)
{
    return _predictionEngine.Predict(input);
}

// Usage:
const string savedModelFilename = "trainedModel.zip";
var modelBuilder = new ModelBuilder();
modelBuilder.LoadModel(savedModelFilename);
var modelInput = new ModelInput
{
    Age = 300,
    CreditAmount = 100000,
    Duration = 120
};
var prediction = modelBuilder.Predict(modelInput);
Console.WriteLine($"\nExample input class is {prediction.Prediction.ToUpper()}!");

Usage is straightforward once your model is trained and saved.


Improving Model Performance

What Can We Do With ML.NET?

Conclusion

In this article, we’ve covered the basics of machine learning and explored how to create a simple ML model in C# using ML.NET. We also learned about techniques to improve model performance and several interesting ML.NET scenarios.

ML.NET is a great addition to the Microsoft stack, enabling C# developers to keep pace with this exciting, rapidly evolving technology.