Machine Learning Monitoring is a key part of the MLOps workflow: it tells us whether a model is available and how well it performs.
Why is Machine Learning Monitoring important?
In traditional software, we plan every step and operation to reduce the risk of errors in the production environment as much as possible. We do this with automated tests that ensure our applications or systems perform as expected.
With Machine Learning, the situation is significantly different. A trained model provided by a data scientist is nothing more than a representation of historical data or events. Once deployed in a production environment, such a model is constantly exposed to new kinds of input, which sometimes causes undesirable behavior.
Since we can’t explicitly test every possible prediction case, we need to continuously monitor the system to ensure it is operating effectively. Moreover, such monitoring needs to be performed in a specific way, based on the requirements of the model and its use case.
What is Machine Learning Monitoring?
It is important to understand what is happening with models in the production environment. Monitoring is a very important part of the whole Machine Learning Operations workflow. However, it is not a well-defined term. Some people think of monitoring in terms of model availability, while others look at it from a model performance perspective. Both are important, especially when we talk about automation.
Is the model available?
When we talk about availability, we mainly mean resource utilization. We check whether the model has enough resources, how it scales under a high rate of incoming requests, and how much CPU, GPU, and memory it consumes.
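As a rough illustration of the resource questions above, here is a minimal sketch that measures wall time, CPU time, and peak Python memory for a single prediction call using only the standard library. The names `measure_call` and `dummy_predict` are hypothetical and stand in for a real model. Note that `tracemalloc` only sees memory allocated by Python itself, so GPU memory and native library allocations would need a dedicated tool.

```python
import time
import tracemalloc


def measure_call(predict, features):
    """Run predict(features) and report wall time, CPU time, and peak Python memory."""
    tracemalloc.start()
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        result = predict(features)
        wall_seconds = time.perf_counter() - wall_start
        cpu_seconds = time.process_time() - cpu_start
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    usage = {
        "wall_seconds": wall_seconds,
        "cpu_seconds": cpu_seconds,
        "peak_python_bytes": peak_bytes,
    }
    return result, usage


# Hypothetical stand-in for a real model's predict function.
def dummy_predict(features):
    return sum(features) / len(features)


prediction, usage = measure_call(dummy_predict, [1.0, 2.0, 3.0])
print(prediction, usage)
```

In a real deployment these numbers would typically be exported to a metrics system and tracked over time rather than printed.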
How is the model performing?
In terms of model performance, our goal is to find information about traffic patterns, latency, or errors. We mainly ask: What is the accuracy? What is the drift? When is it time to retrain?
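One way to put a number on drift is the Population Stability Index (PSI), which compares the distribution of a feature in production against the distribution seen at training time. The sketch below is one possible approach, not a prescribed one: the function name, the equal-width binning, the `1e-6` floor for empty bins, and the `0.2` retraining threshold are all assumptions made for illustration.

```python
import math


def population_stability_index(reference, current, bins=10):
    """PSI between two numeric samples, using equal-width bins over the reference range."""
    lo, hi = min(reference), max(reference)
    # Fall back to a width of 1.0 when all reference values are equal.
    width = (hi - lo) / bins or 1.0

    def bin_shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the reference range into the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor keeps the logarithm defined for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    ref_shares = bin_shares(reference)
    cur_shares = bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_shares, cur_shares))


# Illustrative data: production values have shifted towards the upper half.
reference = [i / 100 for i in range(100)]
current = [0.5 + i / 200 for i in range(100)]

RETRAIN_THRESHOLD = 0.2  # assumed rule-of-thumb value, tune for your use case

psi = population_stability_index(reference, current)
needs_retraining = psi > RETRAIN_THRESHOLD
print(f"PSI = {psi:.3f}, retrain = {needs_retraining}")
```

A PSI of zero means the two binned distributions match exactly. Larger values indicate a bigger shift and can serve as one signal that it is time to retrain.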
What are Machine Learning Monitoring metrics?
When it comes to metrics, there are many options. However, the most important questions to raise are: What do you want to measure? What is the most important factor for you?
In terms of model performance, the most popular metrics are:
- Type 1 Error
- Type 2 Error
- F1 score
- Mean Absolute Error (MAE)
- Mean Squared Error (MSE)
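The metrics listed above can be sketched in plain Python as follows. This is a minimal illustration, assuming binary labels encoded as 0 and 1, and interpreting Type 1 and Type 2 error as the false positive rate and false negative rate. The function names and sample data are made up for the example.

```python
def classification_metrics(y_true, y_pred):
    """Type 1 / Type 2 error rates and F1 score for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        # Type 1 error: share of actual negatives predicted as positive.
        "type1_error_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        # Type 2 error: share of actual positives predicted as negative.
        "type2_error_rate": fn / (fn + tp) if (fn + tp) else 0.0,
        # F1 is the harmonic mean of precision and recall.
        "f1": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0,
    }


def regression_metrics(y_true, y_pred):
    """Mean Absolute Error and Mean Squared Error for numeric predictions."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    return {
        "mae": sum(abs(e) for e in errors) / len(errors),
        "mse": sum(e * e for e in errors) / len(errors),
    }


# Illustrative data only.
cls = classification_metrics([1, 0, 1, 1, 0, 0], [1, 1, 0, 1, 0, 0])
reg = regression_metrics([3.0, 5.0, 2.0], [2.5, 5.0, 4.0])
print(cls)
print(reg)
```

The first three metrics apply to classification models, while MAE and MSE apply to regression models, so which ones you track depends on the type of model you run in production.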