
This is part three of the series - if you haven’t yet, check out part one and part two.

For the final installment of this series, we discuss monitoring, and how product managers can add value to machine learning projects.

Monitor for errors, technical glitches, and model accuracy

If you’ve ever built a successful and sustainable product, you’ve taken monitoring seriously. A quick run through why monitoring is important, especially in the context of ML systems:

Why do you need to monitor?

  1. You’ve built a complex system with multiple moving parts:
    Machine learning products are complex and evolving. They depend on the UX and the full user journey, and one product manager does not necessarily control all parts of this experience. Imagine, for example, another team removing your ‘thumbs up’ feedback loop to optimize for conversion. This can quickly snowball into bad retraining data for your model and, in some cases, render the model useless.
  2. Radical changes to customer behavior need to be captured:
    Customer preferences, wants, and tastes change. Business cycles change. This means that the model is under continuous stress to learn and adapt. New features may need to be added, old features may need to be removed, and changes to customer behavior need to be captured - all to ensure that model performance does not degrade over time.
  3. Models need to be retrained:
    Even without any external influence, models lose their calibration over time and performance can continuously degrade. Good monitoring ensures you can retrain on fresh data well before the degradation reaches your users.

The metrics for monitoring performance are usually radically different from business metrics. Monitoring something like conversion and setting an alert every time conversion tanks simply does not cut it. Here are the reasons why:

  1. Business metrics lag:
    When monitoring from business metrics, what you see is an effect and not a cause — conversion monitoring, for example, indicates only that customers are not converting. This could be due to multiple factors within the user journey, seasonality, business cycles, new launches, just about anything. Business metrics are always a measure of reaction from customers. They say nothing about system health, and usually need time to mature. A glitch in the e-commerce delivery system, for example, will only reflect in the product closer to the customer’s delivery time. This could mean days of lag before you even discover that there is a glitch, all the while losing precious customers. You want a monitor that can detect potential system issues in time and mitigate them before problems creep up.
  2. You (almost) never have a real time view of business metrics:
    Even if you choose to monitor a specific business goal — clicks on the ‘buy’ button for example — business metrics will still lag. This is mainly because of calculation overheads. Data processors responsible for business KPIs perform extensive calculations and process large amounts of data. The consequence is that business dashboards often lag by a few hours if not more (if you can see your business metrics in real time, I am deeply and frighteningly jealous).
  3. It’s a nightmare to debug business metrics:
    So your conversion tanked. What now? Let’s go through 6,000 lines of code and look at all the wrong predictions that your model spat out — not very scalable. System and model specific monitors are great at pointing to the exact location of the bug that can lead to potential catastrophes. Good alerting makes debugging easier and buys precious time for rollbacks.

What should you monitor?

For the purpose of this discussion, I will skip monitoring for the non-ML bits. However, I am taking a giant leap of faith and assuming that bugs in code are already being monitored - never, ever rule out that part.

The machine learning model itself should be monitored for precision and recall. Define an acceptable threshold, and fire an alert every time the model’s performance breaches it.
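As a rough sketch, such a monitor can be very simple: periodically take a batch of predictions that have ground-truth labels, compute precision and recall, and compare against thresholds. The threshold values and the alert format below are illustrative, not a prescription:

```python
# Illustrative thresholds - tune these to your product's tolerance.
PRECISION_THRESHOLD = 0.90
RECALL_THRESHOLD = 0.85

def precision_recall(predictions, labels):
    """Compute precision and recall for binary predictions (1 = positive)."""
    tp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(predictions, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall

def check_model_health(predictions, labels):
    """Return a list of alert messages for every breached threshold."""
    precision, recall = precision_recall(predictions, labels)
    alerts = []
    if precision < PRECISION_THRESHOLD:
        alerts.append(f"precision {precision:.2f} below {PRECISION_THRESHOLD}")
    if recall < RECALL_THRESHOLD:
        alerts.append(f"recall {recall:.2f} below {RECALL_THRESHOLD}")
    return alerts
```

In practice you would wire `check_model_health` into whatever alerting stack your team already uses; the point is that the model gets its own health check, separate from business dashboards.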

I usually set aside x% of my traffic for perpetual monitoring. For example, if you’re building a recommender system, it’s always a good idea to have a sample set that never interacts with the recommender system and monitor their KPIs. If you’re automating human processes, always send a sample of the model’s predictions to a human agent to validate.
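One common way to carve out such a perpetual holdout is deterministic bucketing: hash the user id into a bucket so the same user always lands on the same side of the split. The 5% fraction and the experience names below are purely illustrative stand-ins:

```python
import hashlib

HOLDOUT_FRACTION = 0.05  # illustrative stand-in for the "x%" above

def in_holdout(user_id: str) -> bool:
    """Deterministically assign a user to the holdout by hashing their id,
    so the same user always stays in (or out of) the holdout group."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < HOLDOUT_FRACTION * 100

def serve(user_id: str) -> str:
    # Holdout users never see the model; compare their KPIs against the rest.
    if in_holdout(user_id):
        return "baseline_experience"
    return "model_recommendations"
```

Because the assignment is deterministic, you can compare KPIs between the holdout and the model-served population over any time window without storing per-user flags.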

A good example is spam detection - if 100% of your traffic goes through an ML spam detector, you never know if the model is right - you work under the presumption that everything the model marks is legitimately spam. Having a human agent validate this assumption and monitoring the precision gives added confidence that all is well with the ML system.
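A sketch of that validation loop: route a small random sample of the model’s ‘spam’ verdicts to a human review queue, then compute observed precision from the human labels. The 2% review rate and the function names are illustrative assumptions:

```python
import random

REVIEW_RATE = 0.02   # illustrative: fraction of spam verdicts humans re-check
review_queue = []

def handle_message(message_id, model_says_spam):
    """Quarantine model-flagged spam, sampling some verdicts for human review."""
    if model_says_spam:
        if random.random() < REVIEW_RATE:
            review_queue.append(message_id)  # a human will label this one
        return "quarantine"
    return "inbox"

def observed_precision(human_labels):
    """human_labels: booleans, True if the human agrees the message was spam."""
    if not human_labels:
        return None
    return sum(human_labels) / len(human_labels)
```

The observed precision from the review queue is exactly the kind of model-specific signal you can alert on, long before any business metric moves.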

All machine learning models are in essence precise probability calculators. Monitoring ensures that the system is healthy and reliable at all times.

Get Creative!

Or why product managers are essential to ML success

As can be seen throughout the framework, product managers play a role in the entire ML System journey right from conceptualization to production and beyond. PMs are deeply aware of the customer journey, the underlying business logic, the happy paths and the disasters.

As someone who sees the full picture, PMs can not only influence the ML Product, but also become the difference between success and failure.

For some final words, here is a quick summary of the framework, and how you as a PM can add value.

1. Identify the problem
Product managers are a strong voice of the user. Identifying the right problem, with great business and customer impact, is key to a delightful ML Product.

2. Get the right data set
Machine learning needs data. Think about creative ways to generate reliable data sets — ask your users, interpret their preferences from their actions, ask humans to generate the data set, or use a combination of all three! Facebook has at least one Product guy who is super happy about his data collection approach:

Facebook reactions are awesome — they leverage peer to peer engagement to determine the popularity and sentiment of a post, and can be used to curate the user’s feed.

3. Fake it first
This one simply cannot work without a little Product magic. Think out of the box to conjure an experience that mimics the one you want to build. Here’s an extreme example:

Ford dressed up one of their employees as a car seat to simulate self driving cars and study the reactions of everyone on the road. Here’s the full article.

4. Weigh the cost of getting it wrong
It’s your job to protect your users and the business. Ensure awareness of all the pitfalls of machine learning. Remember, you are what stands between your users and disaster.

5. Build a safety net
You know the disaster scenarios, and you know how to get your users out of them. Ensure that users are not stuck in loops or spiraling down into disasters waiting to happen.

6. Build a feedback loop
This helps gauge customer satisfaction and generates data for improving your machine learning model. We’ve covered this extensively in part two.

7. Monitor
For errors, technical glitches, and prediction accuracy.

And for some parting words - data scientists can build amazing models, but product managers translate them into delightful and usable products. Don’t let the jargon intimidate you - ML models are, simply put, precise probability calculators.