How to start with MLOps, what it is exactly, and how to take it into account when creating an AI strategy with your AI team.
According to the latest Tech Trends report published by Deloitte, companies will need to rethink AI adoption in 2021 and take MLOps into consideration.
Why is that?
More and more companies try to apply AI in their products, processes, or services. However, many of them face problems with AI model deployment. As a result, their projects end as a proof of concept and never reach production.
According to the IDC report, about 28% of machine learning projects do not succeed. The problem is a lack of necessary expertise, production-ready data, and integrated development environments. Meanwhile, roughly 47% fail to even make it out of the experimental phase and into production.
The solution to that problem is MLOps, which aims to help with AI delivery and management in a production environment.
Therefore, in this article, we will discuss how to start with MLOps, what it is exactly, and how to take it into account when creating an AI strategy.
Machine Learning Operations (MLOps) is a set of best practices that aim to seamlessly deploy, integrate, and run machine learning models in a production environment. Unlike DevOps, it is based not only on continuous integration and continuous deployment but also on continuous training.
Deloitte, Tech Trends 2021: "Through continuous development, testing, deployment, monitoring, and retraining, MLOps can improve collaboration among teams and shorten development life cycles, thereby enabling faster, more reliable, and more efficient model deployment, operations, and maintenance as well."
When applying MLOps, you gain control over the scalability, monitoring, maintenance, and governance of the whole AI model pipeline. As a result, the entire model delivery process becomes faster and less prone to failure.
In the following paragraphs, we will try to investigate how to apply MLOps in your organization.
As a first step, we need to establish the MLOps process maturity of the organization. According to Google, there are three levels of ML process maturity in a company. Each of them depends on the share of automated tasks in the entire ML workflow.
At the first level, level 0, the whole process of building and deploying AI models is entirely manual. It is characterized by a large gap between the Data Science team and the operations team. There are no CI/CD pipelines and no performance monitoring.
Level 0 is sufficient only when models are rarely deployed to production.
The next level, level 1, consists of at least one automated pipeline that allows Data Scientists to speed up repeatable tasks. One example is continuous training of the model in production: the model is automatically retrained on new incoming data, and the retraining is initiated by a trigger.
This level occurs very often in small AI teams that need to handle and adapt to a fast-changing ML environment.
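To make the idea of trigger-based continuous training more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the trigger is "enough new samples have arrived", the threshold `retrain_every` is a made-up value, and `MeanModel` is a toy stand-in for a real model. None of these names come from a specific MLOps tool.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class MeanModel:
    """Toy stand-in for a real model: predicts the mean of its training targets."""
    value: float = 0.0

    def fit(self, targets):
        self.value = mean(targets)
        return self

    def predict(self):
        return self.value


@dataclass
class ContinuousTrainer:
    """Retrains the model once enough new samples have arrived (hypothetical trigger)."""
    retrain_every: int = 3                      # assumed threshold, for illustration only
    model: MeanModel = field(default_factory=MeanModel)
    seen: list = field(default_factory=list)    # all samples received so far
    pending: int = 0                            # samples received since the last training run
    runs: int = 0                               # number of completed training runs

    def ingest(self, target):
        """Store one new sample; retrain and return True if the trigger fires."""
        self.seen.append(target)
        self.pending += 1
        if self.pending >= self.retrain_every:  # the trigger
            self.model.fit(self.seen)
            self.pending = 0
            self.runs += 1
            return True
        return False


trainer = ContinuousTrainer(retrain_every=3)
fired = [trainer.ingest(x) for x in [1.0, 2.0, 3.0, 4.0]]
```

In a real setup the trigger could just as well be a schedule or a drop in model quality; the point is only that retraining starts without a manual step.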
The last level, level 2, is characterized by a fully automated CI/CD pipeline. Data Scientists are able to rapidly explore new hypotheses and deploy them into the production environment to validate assumptions.
Such a level of maturity includes data, model, and code versioning, as well as monitoring. In addition, it is highly secure and easily manageable, and it includes continuous integration, deployment, and training pipelines.
Summing up, there are three levels of MLOps maturity, and each of them is characterized by different needs and challenges.
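As a rough illustration, the three levels can be expressed as a small classification function. This is a simplification made for this example, not Google's official assessment method; the function name and its two inputs are assumptions.

```python
def maturity_level(automated_pipelines: int, full_cicd: bool) -> int:
    """Map a coarse description of an ML workflow to a maturity level (0, 1, or 2).

    automated_pipelines: number of automated pipelines (e.g. continuous training)
    full_cicd: True if build, test, deployment, and training are fully automated
    """
    if full_cicd:
        return 2          # fully automated CI/CD pipeline
    if automated_pipelines >= 1:
        return 1          # at least one automated pipeline
    return 0              # everything is manual


levels = [
    maturity_level(0, False),
    maturity_level(1, False),
    maturity_level(3, True),
]
```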
Level 0 is the most common, especially among companies that start to build in-house AI teams. Therefore, in the next paragraphs, we will focus on the path from level 0 to level 1.
Three things significantly speed up work and help companies go from level 0 to level 1. Let's have a look at them in detail.
When talking about the AI workflow (MLOps workflow), it is important to mention that there are two main phases. The first one is experimental: we explore data, build features, and train models. The second one is operational: we actually deploy the model to production and deal with everything related to operations.
The most important step towards level 1 is to take care of that first phase. Tracking all experiments allows Data Scientists to reproduce past results very easily. It is important to emphasize that good reproducibility requires not only model versioning but also code and data versioning.
Keeping track of all your work and being able to recreate that work when needed is crucial in the Data Science world.
The second important improvement worth considering when going from level 0 to level 1 is Continuous Deployment.
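A minimal sketch of what tracking an experiment could look like, using only the Python standard library. The record layout, the function names, and the example values are assumptions made for illustration. In practice, the code version would typically be a commit hash from your version control system, and the log would be a file or a dedicated tracking tool rather than a list.

```python
import hashlib
import json


def fingerprint(payload: bytes) -> str:
    """Short SHA-256 digest used as a version identifier."""
    return hashlib.sha256(payload).hexdigest()[:12]


def experiment_record(code: str, data: bytes, params: dict, metrics: dict) -> dict:
    """Bundle everything needed to reproduce one run into a single record."""
    return {
        "code_version": fingerprint(code.encode("utf-8")),
        "data_version": fingerprint(data),
        # sort_keys makes the hash independent of the order of the dict keys
        "params_version": fingerprint(json.dumps(params, sort_keys=True).encode("utf-8")),
        "params": params,
        "metrics": metrics,
    }


def append_to_log(record: dict, log: list) -> None:
    """Append the record as one JSON line; a list stands in for a log file here."""
    log.append(json.dumps(record, sort_keys=True))


# Illustrative values only: three runs, the third one on changed data.
log = []
run_a = experiment_record("train.py v1", b"x,y\n1,2\n", {"lr": 0.1, "epochs": 5}, {"accuracy": 0.9})
run_b = experiment_record("train.py v1", b"x,y\n1,2\n", {"epochs": 5, "lr": 0.1}, {"accuracy": 0.9})
run_c = experiment_record("train.py v1", b"x,y\n1,3\n", {"lr": 0.1, "epochs": 5}, {"accuracy": 0.8})
for run in (run_a, run_b, run_c):
    append_to_log(run, log)
```

Because the data and parameters are hashed, two runs with identical inputs get identical version identifiers, while any change in the data shows up as a different `data_version`.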
Not long ago I spoke with a couple of SMEs (small and medium-sized enterprises), and it turns out that the Data Science workflow itself is not the problem. However, there is practically no operational and management layer when it comes to production models.
Why is that a problem? Because models that have 99% accuracy but are not deployed do not give any business value.
The solution is to build an operational abstraction layer for AI model deployments. It would allow Data Scientists to deliver models in the form of microservices with little effort.
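To illustrate what "a model as a microservice" can mean at its simplest, here is a sketch built on Python's standard `http.server` module. The `/predict` route, the JSON shape, and the toy `predict` function are assumptions for this example; a production service would normally use a proper web framework and a real trained model.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Toy stand-in for a trained model: returns the sum of the features."""
    return sum(features)


def handle_predict(body: bytes):
    """Turn a JSON request body into a (status_code, response_dict) pair."""
    try:
        features = json.loads(body)["features"]
        return 200, {"prediction": predict(features)}
    except (ValueError, KeyError, TypeError):
        return 400, {"error": 'expected JSON like {"features": [1, 2, 3]}'}


class ModelHandler(BaseHTTPRequestHandler):
    """HTTP wrapper around handle_predict."""

    def do_POST(self):
        if self.path != "/predict":            # hypothetical route name
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_predict(self.rfile.read(length))
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 8000):
    """Start the service; blocks until the process is interrupted."""
    HTTPServer(("0.0.0.0", port), ModelHandler).serve_forever()


# The request logic can be exercised without starting the server:
status, response = handle_predict(b'{"features": [1, 2, 3]}')
```

Calling `serve()` would start the service on port 8000. Keeping the request logic in `handle_predict`, separate from the HTTP handler, makes it easy to test the model interface without any network setup.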
The third improvement is monitoring. Once models are deployed to production, it is important to know whether they are alive. Therefore, adding a monitoring system that gives us basic information about resource consumption is the way to go.
We know "where" we are and "what" we want to achieve, so the question that now arises is "How?".
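A very basic version of such monitoring can be sketched with the Python standard library alone. The function names and the fields of the snapshot are assumptions for this example. Note that `tracemalloc` only tracks memory allocated by Python itself, so the numbers are a lower bound rather than the full footprint of the process.

```python
import time
import tracemalloc

tracemalloc.start()                 # begin tracking Python memory allocations
_STARTED_AT = time.monotonic()      # reference point for uptime


def health_snapshot() -> dict:
    """Collect basic liveness and resource-consumption information."""
    current, peak = tracemalloc.get_traced_memory()
    return {
        "alive": True,                                   # reaching this line means the process responds
        "uptime_seconds": time.monotonic() - _STARTED_AT,
        "cpu_seconds": time.process_time(),              # CPU time used by this process
        "memory_bytes": current,                         # Python-tracked memory right now
        "peak_memory_bytes": peak,                       # highest value since tracking started
    }


def is_healthy(snapshot: dict, max_memory_bytes: int) -> bool:
    """Simple rule: the service is alive and stays under a memory budget."""
    return snapshot["alive"] and snapshot["memory_bytes"] <= max_memory_bytes


snapshot = health_snapshot()
```

A snapshot like this could be exposed on a health endpoint or written to a log at regular intervals, so that a missing or unhealthy report is noticed quickly.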
There are two solutions. You can either build your own (in-house) system that will help you speed up your work, or you can buy one. There are pros and cons when it comes to both approaches.
I love the infographic published by one of the best companies in the AI space, which hits the spot when it comes to the buy-versus-build dilemma.
However, some use cases need a special approach when it comes to AI delivery, and the answer is not so obvious. For instance, the energy supply sector deals with specific requirements, and buying an industry-agnostic platform is very often not a solution at all.
It is best to estimate the costs and benefits of each option and make the decision yourself...
...and choose Syndicai! :D
The main purpose of this article was to build awareness that proper collection and cleaning of data, together with thorough building and training of the model, will not on their own ensure success in your ML project. You also need to remember the basics of MLOps and think about AI delivery once the models are trained.