Machine Learning Operations (MLOps) — Augmenting Machine Learning Activities for better results

Jun 11, 2022

What is MLOps?

MLOps is a set of practices that aims to deploy and maintain ML models in production reliably and efficiently. Machine learning models are tested and developed in isolated experimental systems. When an algorithm is ready to be launched, MLOps is practised between Data Scientists, DevOps, and Machine Learning engineers to transition the algorithm to production systems. MLOps seeks to increase automation and improve the quality of production models, while also focusing on business and regulatory requirements. MLOps applies to the entire lifecycle — from model generation (software development lifecycle, continuous integration/continuous delivery), orchestration, and deployment, to health, diagnostics, governance, and business metrics.

MLOps includes the capabilities that data science, product, and IT teams need to deploy, operate, govern, and secure models in production. It covers model training, development, infrastructure, model version control, monitoring, security, governance, and compliance, which together enable an AutoML pipeline that maximises your machine learning performance and ROI.

The goal of MLOps is to develop, train, and deploy machine learning models with automated procedures that integrate data, development, security, and infrastructure teams.

Why MLOps?

Machine learning only provides value once models reach production. However, organizations often underestimate the complexity of moving models to production: they focus most of their resources on model development and treat machine learning like standard software.

Different teams are involved in the MLOps lifecycle, as mentioned below:

Business/Product
Data Engineer
Data Scientist
DevOps

Business/Product:

Business Analysts play an important part in MLOps. They are the bridge between the stakeholders and the team, and they break down the business problem into actionable use cases. They also keep stakeholders informed about project timelines and progress.

Business Analysts are domain experts with good storytelling skills. They spend most of their time preparing helpful documentation and presentations, and they are good at planning and management.

Data Engineer:

Data engineers are the platform enablers. They spend most of their time building data pipelines, which ensure an uninterrupted flow of data and apply basic transformations to it.

They ensure that the relevant data meets the required quality standards and is available for the project. They also orchestrate the individual tasks and run them on a schedule.
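
As a minimal sketch of this idea, a pipeline can be written as an ordered list of small tasks, with a quality check that stops bad data from flowing downstream. The in-memory records, the `amount` field, the rule that amounts must not be negative, and all function names are illustrative assumptions, not part of any particular tool.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Task = Callable[[List[Record]], List[Record]]


def drop_incomplete(records: List[Record]) -> List[Record]:
    """Basic transformation: remove rows that contain a missing value."""
    return [r for r in records if all(v is not None for v in r.values())]


def check_quality(records: List[Record]) -> List[Record]:
    """Quality gate: fail the run if any amount is negative (hypothetical rule)."""
    bad = [r for r in records if r["amount"] < 0]
    if bad:
        raise ValueError(f"{len(bad)} record(s) failed the quality check")
    return records


def run_pipeline(records: List[Record], tasks: List[Task]) -> List[Record]:
    """Orchestrate the tasks by running them in order on the same data."""
    for task in tasks:
        records = task(records)
    return records


# Hypothetical raw input; the second row has a missing amount.
raw = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
    {"id": 3, "amount": 4.5},
]
clean = run_pipeline(raw, [drop_incomplete, check_quality])
```

In a real project a scheduler would trigger `run_pipeline` at fixed intervals; that part is left out of the sketch.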

Data Scientist:

Data Scientists are central to the machine learning project. Their main responsibilities are data preparation, finding the model that best solves the business problem, tuning its hyperparameters, and testing and evaluating it.
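
To make hyperparameter tuning concrete, here is a small sketch of a grid search. It assumes a toy one-dimensional k-nearest-neighbours classifier and made-up data; the data, the candidate values of `k`, and the function names are all hypothetical, and a real project would normally rely on an established library instead.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, int]  # (feature value, class label 0 or 1)

# Hypothetical training and validation sets.
train: List[Point] = [(0.1, 0), (0.3, 0), (0.4, 0), (0.6, 1), (0.8, 1), (0.9, 1)]
validation: List[Point] = [(0.2, 0), (0.45, 0), (0.55, 1), (0.7, 1)]


def knn_predict(data: Sequence[Point], x: float, k: int) -> int:
    """Majority vote among the k training points closest to x."""
    neighbours = sorted(data, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in neighbours)
    return 1 if votes * 2 > k else 0


def accuracy(data: Sequence[Point], held_out: Sequence[Point], k: int) -> float:
    """Fraction of held-out points that are classified correctly."""
    correct = sum(knn_predict(data, x, k) == y for x, y in held_out)
    return correct / len(held_out)


def grid_search(candidates: Sequence[int]) -> Tuple[int, float]:
    """Try every candidate k and keep the one with the best validation accuracy."""
    best_k = max(candidates, key=lambda k: accuracy(train, validation, k))
    return best_k, accuracy(train, validation, best_k)


candidates = [1, 3, 5]
best_k, best_score = grid_search(candidates)
```

The key design choice is that each candidate is scored on validation data the model has not seen, so the selected value is not simply the one that memorises the training set.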

DevOps:

DevOps engineers automate the deployment of models to all environments. The environments are typically Dev, QA, and Production, although this can differ from organization to organization.

They take the models from Data Scientists and integrate them with the products that use them. Data Scientists often use Python to build, test, and validate their models, while DevOps engineers expect the ML models to be accessible through APIs.
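
One way to picture a model that is "accessible through an API" is the sketch below, which uses only the Python standard library. The linear scorer stands in for a trained model, and its weights, the `features` and `score` JSON keys, the port, and the function names are assumptions made for illustration; production systems usually use a dedicated serving framework.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import List, Tuple


def predict(features: List[float]) -> float:
    """Stand-in for a trained model: a fixed linear scorer (hypothetical weights)."""
    weights = [0.5, -0.25]
    return sum(w * x for w, x in zip(weights, features))


def handle_predict(body: bytes) -> Tuple[int, bytes]:
    """Turn a JSON request body into an HTTP status code and a JSON response."""
    try:
        features = json.loads(body)["features"]
        score = predict([float(x) for x in features])
    except (ValueError, KeyError, TypeError):
        error = {"error": "expected a JSON object with a 'features' list"}
        return 400, json.dumps(error).encode()
    return 200, json.dumps({"score": score}).encode()


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Read the request body, delegate to the pure function, write the reply.
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_predict(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)


def make_server(port: int = 8000) -> HTTPServer:
    """Build (but do not start) a local server that exposes the model."""
    return HTTPServer(("127.0.0.1", port), PredictHandler)
```

Calling `make_server().serve_forever()` would start the service. Keeping the request logic in `handle_predict` means it can be tested without opening a network port.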

MLOps provides the connection between the code and all the other components, including:

Data collection
Feature selection
Auto EDA
Training data
Model training
Model validation
Inference data
Deployment
Model pipelining
Version control
Monitoring
Security
Governance
Explainability and interpretability

Challenges MLOps solves

Managing these tasks at scale is not easy, and there are many gaps that need to be addressed. The major challenges that MLOps addresses are:

Easing the shortage of data scientists and developers available to build and deploy scalable tools
Bridging communication gaps between technical, data, and business teams that arise from the usual workflow (Data Scientists develop models/algorithms and hand them over to operations to deploy into production. Lack of coordination and improper handoff between the two parties lead to delays and errors)
Facilitating risk management in the implementation of ML models: ML models often behave like black boxes, and they tend to drift away from what they were initially intended to do. Assessing the risk of such failures is a very important step.
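
The drift mentioned in the last point can be monitored with very simple statistics. The sketch below flags drift when the mean of live data moves away from the mean of the reference (training) data by more than a chosen number of reference standard deviations. The metric, the default threshold of 2.0, and the function names are illustrative assumptions; real monitoring setups often use more robust tests.

```python
from statistics import mean, stdev
from typing import Sequence


def drift_score(reference: Sequence[float], live: Sequence[float]) -> float:
    """Shift of the live mean, in units of the reference standard deviation."""
    spread = stdev(reference)
    if spread == 0:
        raise ValueError("reference data has no variation")
    return abs(mean(live) - mean(reference)) / spread


def has_drifted(
    reference: Sequence[float], live: Sequence[float], threshold: float = 2.0
) -> bool:
    """Flag drift when the score exceeds the (hypothetical) threshold."""
    return drift_score(reference, live) > threshold


# Hypothetical feature values seen during training and in production.
reference = [10, 11, 9, 10, 12, 8]
stable_batch = [10, 9, 11]
shifted_batch = [15, 16, 14]

stable_alert = has_drifted(reference, stable_batch)
shifted_alert = has_drifted(reference, shifted_batch)
```

A check like this can run on every batch of inference data, so that a drifting input raises an alert before it degrades business decisions.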

Benefits of MLOps

MLOps practices bring many benefits to a project, such as:

Risk reduction during model validation: Models with errors or incorrect assumptions can produce misleading results for the decision-maker. Model risk arises from errors within models or from the incorrect use of models and, if undetected, can have a significant impact on businesses and organizations whose decisions and business processes depend on the model outputs. Model validation is therefore needed to reduce model risk. Validation is the process of comparing model outputs with the observed behaviour of the system.
Actionable insights: Actionable insights are conclusions drawn from data that can be turned directly into an action or a response. The data can be structured or unstructured, quantitative or qualitative. Actionable insights indicate which action needs to be taken or which process needs to be executed.
Faster CI/CD processes with fewer errors using AutoML: CI/CD is a method of frequently delivering apps/solutions to customers by introducing automation into the stages of app/solution development. The main concepts attributed to CI/CD are continuous integration, continuous delivery, and continuous deployment.
Simpler implementation of more complex problems: With CI/CD in place, moving code to the production environment and automating that step become easy. It also helps in actively monitoring and updating the models so that they evolve with the data, improving the efficiency of AI systems.
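
Model validation and CI/CD meet in an automated validation gate: a pipeline step that compares model outputs with observed outcomes and blocks deployment when the candidate is not good enough. The following is a minimal sketch under stated assumptions. The accuracy metric, the 0.9 default threshold, the comparison against the production model, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ValidationReport:
    accuracy: float
    passed: bool
    reasons: List[str]


def validate(
    predictions: Sequence[int],
    actuals: Sequence[int],
    min_accuracy: float = 0.9,
    production_accuracy: float = 0.0,
) -> ValidationReport:
    """Compare model outputs with observed outcomes and decide on promotion."""
    if len(predictions) != len(actuals) or not actuals:
        raise ValueError("predictions and actuals must be non-empty and equal in length")
    accuracy = sum(p == a for p, a in zip(predictions, actuals)) / len(actuals)
    reasons: List[str] = []
    if accuracy < min_accuracy:
        reasons.append(f"accuracy {accuracy:.2f} is below the minimum {min_accuracy:.2f}")
    if accuracy < production_accuracy:
        reasons.append("candidate is worse than the current production model")
    # The model is promoted only when no rule was violated.
    return ValidationReport(accuracy=accuracy, passed=not reasons, reasons=reasons)


# Hypothetical outputs of a candidate model and the outcomes actually observed.
report = validate([1, 0, 1, 1], [1, 0, 1, 0], min_accuracy=0.7)
```

A CI/CD job could run `validate` after training and fail the build when `report.passed` is false, so that a model with errors never reaches production unnoticed.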

Written by Cetas AI