Identifying and tackling the challenges of Machine Learning

One of our experts explains how to identify and tackle common challenges within Machine Learning
Sun 20 Jan 2019

Unlike classical technology projects, machine learning projects come with certain unique challenges. As companies discover clear use cases for applied machine learning in their organization, they may benefit from knowing what these challenges are, and how to deal with them before the project starts. In this article, we share our experience of how to ensure that these projects are deployed successfully, and in a way that yields a high ROI (Return On Investment).

Challenge 1: Creating a common understanding of your goal

Most businesses come with their own vocabulary to describe their business environment, challenges, products, and much more. The world of machine learning also comes with its own words: supervised, unsupervised and reinforcement learning, type 1 and type 2 errors, etc. The terms used can cause a lot of confusion and misunderstandings. What may seem perfectly clear in one stakeholder's mind may mean something completely different to someone else. By their very nature, machine learning projects tend to be highly technical, and risk creating more misunderstandings than most regular projects.
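
As a small illustration of how such jargon maps onto plain counts, the sketch below tallies type 1 errors (false positives) and type 2 errors (false negatives) for a binary classifier. The function name and the data are hypothetical and only serve the example.

```python
def count_errors(predictions, labels):
    """Count type 1 (false positive) and type 2 (false negative) errors."""
    pairs = list(zip(predictions, labels))
    # Type 1 error: the model says "yes" (1) while the truth is "no" (0).
    type_1 = sum(1 for p, y in pairs if p == 1 and y == 0)
    # Type 2 error: the model says "no" (0) while the truth is "yes" (1).
    type_2 = sum(1 for p, y in pairs if p == 0 and y == 1)
    return {"type_1": type_1, "type_2": type_2}


# Hypothetical predictions and ground-truth labels.
predictions = [1, 1, 0, 0, 1, 0]
labels = [1, 0, 0, 1, 0, 0]

errors = count_errors(predictions, labels)
print(errors)
```

Agreeing early on which of the two error types is more costly for the business is often more useful than agreeing on the terminology itself.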

These sometimes subtle misunderstandings cause more problems the longer they remain undetected. In one of our projects, a new member of our client's team had a different understanding of what the algorithm under test was actually predicting. This member, who was tracking the accuracy of our algorithm, was (much to our confusion) reporting different accuracies for several weeks than those we had found internally. By the time we found out about the misunderstanding, we had lost a couple of weeks, caused a lot of confusion, and needed additional meetings to rectify something that we could have prevented from the very beginning by ensuring this new member had access to the historical knowledge of the project and the decisions that were taken along the way.
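
To make this kind of mismatch concrete, here is a minimal sketch in which the same predictions are scored against two different definitions of the target. The churn scenario, the variable names, and the numbers are all hypothetical; they are not taken from the project described above.

```python
def accuracy(predictions, labels):
    """Share of predictions that match the labels."""
    matches = sum(1 for p, y in zip(predictions, labels) if p == y)
    return matches / len(labels)


# Hypothetical model output: 1 means "this customer will churn".
predictions = [1, 0, 0, 1, 0, 1, 0, 0]

# Assumed target the model was built for: churn within 30 days.
churned_within_30_days = [1, 0, 0, 1, 0, 0, 0, 0]
# Assumed target a newcomer might believe is being predicted: churn within 90 days.
churned_within_90_days = [1, 1, 0, 1, 1, 0, 0, 1]

internal_accuracy = accuracy(predictions, churned_within_30_days)
reported_accuracy = accuracy(predictions, churned_within_90_days)

print(f"accuracy against the intended target: {internal_accuracy:.3f}")
print(f"accuracy against the assumed target:  {reported_accuracy:.3f}")
```

Neither number is wrong in itself. They answer different questions, which is why the target definition is worth writing down where every new team member can find it.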

Since then, Kantify has taken a number of actions to avoid these misunderstandings. Developing a clear methodology to gain a deep understanding of the goals and challenges of our clients, together with investing in tools that help our clients understand the workings and limitations of machine learning algorithms, goes a long way toward preventing these kinds of problems. One of the simple but effective tools we have developed internally is our Data Science Project Canvas. On a single page, we describe the main elements of the project.

Because we limit this document to a single page, we avoid creating an overly detailed scoping document that might limit the effectiveness of the project rather than enhance it. We attempt to create a consensus on "why" we are doing a project, and at least a tentative agreement on "what" we are going to implement and "how". This high-level approach leaves us the flexibility to approach the project in an agile way. While we believe procedures won't replace good communication, and that it's not always possible to prevent all misunderstandings, we have seen these tools and techniques significantly decrease the number of misunderstandings in our projects.

Challenge 2: Cutting the problem into pieces

When clients initially approach us with ideas for machine learning projects, most projects tend to be in the ideation phase - meaning that the scope is still flexible, and the exact outcome our client is looking for is still ill-defined.

To deal with this complexity, our clients' first reflex is often to draw up a very detailed project scope in order to create clarity. We have learned that in almost all cases, this exercise is extremely difficult if not impossible, and doesn't help projects deliver value quickly.

Rather than attempting to describe a full project, we recommend splitting the project into three steps, and creating a "stop-or-go" moment between each step:

  1. The Roadmap: Create a common understanding among the different stakeholders of the goals (often resulting, amongst other documents, in the file we described in our first challenge), possibilities, and limitations of machine learning. This roadmap often includes looking at both successful and failed attempts at implementing similar projects, and the lessons to draw from these projects.
  2. The Proof of Concept: Before we actually implement a project, we usually recommend working on a minimal proof of concept. The idea here is to create a minimally-viable product - a product that only includes the absolute minimum to be tested, but that does not contain all the nice-to-haves that you would really want in a production environment. The Proof of Concept is meant to validate that the goals we are aiming for and the KPIs that we set can actually be met.
  3. The Implementation: If the proof of concept is validated, we recommend implementing projects in a step-wise fashion, to build trust in the machine-learning algorithm, adapt the client's business processes, and ensure smooth uptake within the company.
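
The "stop-or-go" moment after the Proof of Concept can be sketched as a simple check of measured results against the agreed KPIs. This is only an illustration under our own assumptions: the `Kpi` class, the `stop_or_go` function, and the KPI names and values are hypothetical, and in practice the decision also involves judgment that code cannot capture.

```python
from dataclasses import dataclass


@dataclass
class Kpi:
    name: str
    target: float
    measured: float
    higher_is_better: bool = True

    def is_met(self) -> bool:
        # For metrics such as latency or cost, lower values are better.
        if self.higher_is_better:
            return self.measured >= self.target
        return self.measured <= self.target


def stop_or_go(kpis):
    """Return ("go", []) if every KPI is met, else ("stop", names of missed KPIs)."""
    missed = [kpi.name for kpi in kpis if not kpi.is_met()]
    return ("go" if not missed else "stop", missed)


# Hypothetical KPIs agreed on before the proof of concept.
poc_kpis = [
    Kpi("precision", target=0.80, measured=0.84),
    Kpi("seconds per prediction", target=0.5, measured=0.9, higher_is_better=False),
]

decision, missed = stop_or_go(poc_kpis)
print(decision, missed)
```

Writing the targets down before the proof of concept starts keeps the later decision from turning into a debate about what "good enough" meant.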

Perhaps counter-intuitively, this stepped approach has increased the number of applications we have put into production, by allowing our clients to gain trust and validate projects along the way.

Are you about to start a data science or machine learning project, and would you like to use the Kantify Data Science Project Canvas? If so, send us a message at hello@kantify.com and we will send you the Canvas (in case you wonder, it is free).