Optical Character Recognition with F# and ML.NET

Mark Farragher
7 min read · Jul 23, 2020

In this article, I’m going to build an app that recognizes handwritten digits from the famous MNIST machine learning dataset:

The MNIST challenge requires machine learning models to read images of handwritten digits and correctly predict which digit is visible in each image.

This may seem like an easy challenge, but look at this:

These are actual digits from the dataset. Are you able to identify each one? It probably won’t surprise you to hear that the human error rate on this exercise is about 2.5%.

There are 70,000 images of 28x28 pixels in the dataset. I’m going to use a truncated set of 5,000 images to speed up the training.
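To make those numbers concrete, here is a minimal F# sketch of how one image could be represented and how a truncated subset could be loaded. It assumes the images are stored as CSV rows of the form `label,p0,p1,...,p783` with raw pixel intensities from 0 to 255; that file layout, the `Digit` record, and the `parseRow` and `truncateTo` helpers are all hypothetical names chosen for illustration, not something the dataset or ML.NET prescribes.

```fsharp
// Image geometry stated above: 28x28 pixels per image.
let imageWidth = 28
let imageHeight = 28
let pixelsPerImage = imageWidth * imageHeight   // 784 values per image

/// One labeled image (hypothetical record, for illustration only).
type Digit = { Label: int; Pixels: float32[] }

/// Parse one CSV row, ASSUMING the layout "label,p0,...,p783"
/// with raw intensities in the range 0..255.
let parseRow (line: string) : Digit =
    let fields = line.Split(',')
    if fields.Length <> pixelsPerImage + 1 then
        failwithf "expected %d fields, got %d" (pixelsPerImage + 1) fields.Length
    { Label = int fields.[0]
      // Scale each intensity from 0..255 down to 0..1.
      Pixels = fields.[1..] |> Array.map (fun p -> float32 p / 255.0f) }

/// Keep only the first n rows, mirroring the truncated 5,000-image subset.
let truncateTo (n: int) (rows: seq<string>) : Digit[] =
    rows |> Seq.truncate n |> Seq.map parseRow |> Seq.toArray

// Usage: build a synthetic all-black image labeled 7 and parse it.
let sampleRow = "7," + String.concat "," (Array.create pixelsPerImage "0")
let sample = parseRow sampleRow
printfn "label=%d pixels=%d" sample.Label sample.Pixels.Length
```

Flattening each image into 784 values is a common way to feed pixel data to a classifier, and `Seq.truncate` keeps the subset lazy, so only the rows that are actually needed get parsed.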

And I’ll build my app in F# with ML.NET and .NET Core.

ML.NET is Microsoft’s new machine learning library. It can run linear regression, logistic regression, clustering, deep learning, and many other machine learning algorithms.

And F# is a perfect language for machine learning. It’s a functional-first programming language based on OCaml and inspired by Python, Haskell, Scala, and Erlang. It has a powerful…
