Machine learning systems are complicated. And sometimes, it’s not the fault of the engineers who build them. It’s the nature of machine learning systems.

Here is what I mean…

Let’s say that you did a great job at finding good data, you prepared it reasonably well, and your model made great predictions. Everything is pretty cool at the moment!

But there are times when you won’t be able to prevent the worst from happening. A model that used to make good predictions can start to make misleading ones. What can go wrong?

There are two reasons why that can happen…


Recently, I had the good fortune to host a world-class Machine Learning practitioner who has not only built a wide range of Machine Learning systems but also helped many people build a career in Machine Learning.

He is Santiago Valdarrama, someone you might know if you are part of the ML community on Twitter. As Director of Computer Vision at Levatas, he leads a team of software developers and machine learning engineers in the development of Levatas’ flagship product. …


Exploratory Data Analysis, or EDA as many people call it, is a critical step in a machine learning or data science project. This step is about getting to know the data, and when done properly, it can surface interesting insights that help you understand why certain predictions were made.

In this post, I want to share how I approach exploratory analysis. I usually start with simple things that you may already know, especially if you have done an end-to-end project.

Taking a Quick Look at the Dataset

This is the initial step in the process of exploratory analysis. It is here that you…


Addressing the common hidden mistakes that can hurt the results of machine learning systems even when everything else was done well!


At almost every step of building a machine learning project, there is a chance that something is done incorrectly. Often it is a small mistake that is hard to notice but can completely ruin everything.

Here is what I mean…

The failure of a machine learning project can be caused by many factors, but two common pitfalls are data leakage and inconsistent data preprocessing functions. …
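As a rough illustration of both pitfalls, here is a minimal sketch in plain Python. It assumes a single numeric feature that has already been split into train and test sets. The helper names `fit_scaler` and `transform` and the sample values are hypothetical, not taken from the article. The idea is to learn the scaling statistics from the training split only (so no test information leaks in) and to reuse the exact same function and statistics on the test split (so preprocessing stays consistent).

```python
from statistics import mean, pstdev


def fit_scaler(values):
    """Learn scaling statistics from the training split only."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # fall back to 1.0 for a constant feature
    return mu, sigma


def transform(values, mu, sigma):
    """Apply previously learned statistics; nothing is re-fitted here."""
    return [(v - mu) / sigma for v in values]


# Hypothetical feature values, already split into train and test.
train = [10.0, 12.0, 14.0, 16.0]
test = [11.0, 30.0]

mu, sigma = fit_scaler(train)              # statistics come from training data only
train_scaled = transform(train, mu, sigma)
test_scaled = transform(test, mu, sigma)   # same function, same statistics
```

Computing `mu` and `sigma` on `train + test` instead would be one form of data leakage, and writing a second, slightly different scaling routine for the test set would be an inconsistent preprocessing function. Libraries such as scikit-learn offer fitted transformers for the same purpose.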


Every real-world dataset comes with its own mix of feature types. When working with real-world data, you will often have to deal with categorical data: features whose values come from a limited number of categories. Take, for example, a gender feature that can have two categories: male and female.

Why do we have to handle categorical features?

The reason is that most Machine Learning algorithms accept only numerical values as input. So we have to convert these categories into a format that the learning algorithms accept.
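One common way to do that conversion is one-hot encoding. The sketch below is a minimal plain-Python version that reuses the gender example. The helper names `fit_categories` and `one_hot` are hypothetical, and treating unseen categories as an all-zero vector is an assumption made for illustration.

```python
def fit_categories(values):
    """Collect the distinct categories seen in the data, in a stable order."""
    return sorted(set(values))


def one_hot(value, categories):
    """Return a 0/1 vector with a 1 at the position of `value`.

    A value that was not seen when the categories were collected
    maps to an all-zero vector (an assumption for this sketch).
    """
    return [1 if value == category else 0 for category in categories]


# Hypothetical values for the gender feature.
gender = ["male", "female", "female", "male"]

categories = fit_categories(gender)                  # ["female", "male"]
encoded = [one_hot(value, categories) for value in gender]
```

Each row of `encoded` is now purely numerical, with one column per category. In practice, a library encoder such as scikit-learn's `OneHotEncoder` would typically handle this step.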

In this article, I want to talk…



In this simple article, I reflect on my early days of high school and how machine learning could have solved some hard problems back then!

I remember that in the early days of high school, we used to have maths problems where we would be given a table of two variables, X and y, and the question was to find the linear relationship (equation) between X and y.

Here is what the table used to look like:


Let’s say that you wanted to build a real-world image classifier, but you found out that you can only get 20 images. You thought you could collect more images and get 100 in total. You know that if you spend more time you can get 150, 200, or 250, but it’s slow and daunting. What can you do to get enough training images for your classifier?

ML keeps succeeding at solving complex problems, but it needs enough data, and that is rarely available at the start. The good news is that with the advent of ML techniques…


Building machine learning models is one thing, and making the most of them is another. There is a lot of iterative work involved in building an effective ML application that can ultimately solve the business problem. The gap between the problem and the desired goal is often caused by the lack of a proper framework for approaching machine learning systems.

In this article, I want to talk about the 7 key points that can potentially help on the path to building effective ML systems. But I would also like to mention that I am still learning this art…


A new way of diagnosing machine learning systems!

The standard way of doing Machine Learning has focused on choosing the best learning algorithm and tweaking its hyperparameters to get good accuracy. There is no doubt that this approach yields some gains in accuracy or other desired metrics. However, a recent trend in the ML community suggests something different: recognizing that the model is a small fraction of what it takes to build an effective, working machine learning system, and spending time on improving the data instead. In other words, the data-centric approach.

In this article…



A helpful note on choosing the right learning algorithm!

There are many types of machine learning models, from linear models, tree-based models, and ensembles to neural networks. Knowing which model to pick for a given problem can be a battle.

In this write-up, I want to share some takes that can hopefully shorten your modeling learning curve. While I will list different factors to consider, model selection is a no-free-lunch scenario: no model is guaranteed to solve a problem before you try and evaluate different models.

The first thing to consider is the scope of the project. …

Jean de Dieu Nyandwi

Writing about Machine Learning!
