This article was published as part of the Data Science Blogathon
Learning workflows. Machine learning is a big buzzword these days. Everybody seems to want to jump into the ML these days. Many startups and startups implement machine learning projects in many ways. But only a part of these products can work and be maintained in the long term.
What is DevOps?
The software must be built and run correctly. Then, How is the complete life cycle of a software product?
Good, DevOps makes the entire software operation process seamless. It consists of a set of practices and processes that work together to automate and integrate the entire software development process.
DevOps unifies the software development process (Dev) and software operation (Ops). DevOps makes automation easy, the monitoring and smooth running of the creation, the development, the integration, tests, the implementation, the launch, maintenance, scaling, the maintenance and management of the software infrastructure.
Some key DevOps goals are short development cycles, fluid and fast, fast implementation, stable releases and follow business goals correctly. DevOps advocates and professionals use the art of the infinite loop (up) to represent the DevOps cycle and show how each process relates to the other. This also shows how software development and operations are an ongoing role.. You need constant collaboration and improvement throughout the cycle.
Proper DevOps implementation enables organizations to do more. Continuous integration (THERE) and continuous delivery (CD) help organizations see more efficient development and higher frequency of deployment. Repair times are reduced, cybersecurity flaws can be easily detected and facilitate experimentation and testing.
Therefore, we can say that DevOps makes the overall software engineering process easier to implement and maintain.
MLOps: DevOps for machine learning
MLOps borrows many of the principles of DevOps in an effort to make the entire ML cycle more efficient and easier to operate.. The main goal of MLOps is to create automated ML pipelines and track times and other metrics. Pipelines will be used for multiple iterations throughout the AA project lifecycle.
MLOps consists of:
- Design
- Development model
- Operations
The design phase consists of requirements engineering, ML use case prioritization, data availability check and other steps.
Model development includes machine learning engineering, modeling, data engineering and model testing and validation.
Operations deals with the implementation of models, CI pipelines / CD and model monitoring and maintenance.
Let's now see what the general MLOps scenario for Azure MLOps looks like.
Microsoft Azure MLOps
Azure MLOps follows the goals of:
- Faster model development and experimentation
- Faster deployment of models in production
- Quality control and lineage tracking from start to finish
Las herramientas MLOps ayudan a realizar un seguimiento de los cambios en la Data SourceA "Data Source" refers to any place or medium where information can be obtained. These sources can be both primary and, such as surveys and experiments, as secondary, as databases, academic articles or statistical reports. The right choice of a data source is crucial to ensure the validity and reliability of information in research and analysis.... o las canalizaciones de datos, the code, SDK models, etc. The life cycle is made easier and more efficient with automation, repeatable workflows and assets that can be reused over and over again.
Azure Machine Learning services allow us to create reproducible Machine Learning pipelines. Software environments for model training and deployment are also reusable. These pipelines allow us to update models, test new models and continuously implement new AA models.
The typical AA process
The typical Machine Learning process consists of:
1. Data collection
2. Training model
3. Pack the model
4. Validate the model
5. Deploy model
6. Monitor model
7. Retrain the model
But in many cases, the process needs more refinement. New data is available and the code is changed. Multiple processes are running, each of which involves a large amount of data.
As usual, much of the data is disorganized. Raw data must also be conditioned.
How can MLOps help?
MLOps consists of several elements. Some notable elements are the automation of the deployment, monitoring, frame testing, data version control, etc. MLOps tries to take DevOps concepts and improve the whole machine learning process.
All Azure Machine Learning architecture looks like this:
MLOps is supposed to do a coordinated process which can ease the flow of the whole process and also support CI environments / Large-scale CD.
MLOps can help with automation, facilitating code implementations. Validation is also facilitated with SWE's best quality control practices. En el caso del trainingTraining is a systematic process designed to improve skills, physical knowledge or abilities. It is applied in various areas, like sport, Education and professional development. An effective training program includes goal planning, regular practice and evaluation of progress. Adaptation to individual needs and motivation are key factors in achieving successful and sustainable results in any discipline.... de modelos, if we think our model needs retraining, we can always go back and do a retraining of the model.
Azure Machine Learning has many domain-specific pre-trained models, how to:
View, speaks, language, search, etc.
If we talk about, what exactly is Azure ML Services, is a suite of Azure Cloud Services along with a Python software development kit / R, that allows us to prepare data, build models, manage models, train models, track experiments and implement models.
There are many elements involved, let's see some of them.
Data sets: record the data
Experiments: Training races
Pipelines: Training workflow
Models: Registered Models
End points: deployed models and training workflow endpoints
Calculate: Managed Computing
Environments: Defined environments of formation and inference.
Azure services are built in such a way that they are DevOps compliant and make everything work properly.
Managing the machine learning lifecycle with MLOps is a breeze. Models must be monitored and retrained. MLOps processes facilitate real business results and, Thus, enable faster time to market and implementation for ML-based solutions. They also increase collaboration and alignment between teams.
Azure MLOps
Azure Machine Learning has the following MLOps features.
Create playable ML pipelines
We can define reusable and repeatable pipes. All steps that comprise data collection and model evaluation can be reused.
Reusable software environments
The software dependencies of our projects can be tracked and reproduced as needed.
Register, pack and deploy models from anywhere
Model registration makes the above steps very easy to implement. Models have name and version.
Capture governance data for an end-to-end machine learning lifecycle
Azure ML can integrate with Git to track our code. We can check from which branch / repository comes from our code. We can also track, profile and version our data.
Alerts and notifications for events in the AA life cycle
Azure ML lists milestones in Azure EventGrid, that can be used for various purposes.
Monitor operational problems
We can understand the data that is sent to our model and the predictions that are returned. So that we can understand how the model is performing.
Automate the AA life cycle
Azure Pipelines and Github can be used to create a continuous, autonomous integration that trains a model. Most AA lifecycle steps can be automated.
Then, as a general thought, we can understand that MLOps is a concept where we use various resources to make the whole ML journey smooth and efficient.
Then, thinking about a hypothetical MLOps workflow, first, the data science team and the application development team would collaborate to build the application and also train the model. Later, the application will be tested. parallel, the model will be validated. Later, the model will be implemented and the application will be launched.
The application and the model will be monitored. From the general process of managing models and applications, we can perform performance analysis. Depending on the needs, we can also retrain the model.
Veamos un proyecto de ML de muestra en Azure DevOpsAzure DevOps is an end-to-end platform that provides tools for planning, Software Development and Delivery. Includes services such as Azure Boards, for project management; Azure Repos, for version control; and Azure Pipelines, that automates integration and continuous delivery. Its collaborative approach allows development teams to optimize their processes and improve software quality, facilitating greater agility in the cycle of....
The various elements that we had discussed are visible here.
When entering Repos, we can check the code and other files.
Regarding the pipes, let's see a pipeline executed successfully.
We can see that all the jobs were executed correctly.
For the AA project of particular sample, we can see the requirements.
We can implement the requirements.
pip install azure-cli==2.0.69 pip install --upgrade azureml-sdk[cli] pip install -r requirements.txt
Code to train the model:
Start by importing the necessary things.
import pickle from azureml.core import Workspace from azureml.core.run import Run import os from sklearn.datasets import load_diabetes from sklearn.linear_model import Ridge from sklearn.metrics import mean_squared_error from sklearn.model_selection import train_test_split from sklearn.externals import joblib import numpy as np import json import subprocess from typing import Tuple, List # run_history_name="devops-ai" # the.makedirs('./outputs', exist_ok=True) # #ws.get_details() # Start recording results to AML # run = Run.start_logging(workspace = ws, history_name = run_history_name) run = Run.get_submitted_run()
Obtaining the data.
X, y = load_diabetes(return_X_y=True) columns = ["age", "gender", "bmi", "bp", "s1", "s2", "s3", "s4", "s5", "s6"] X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, random_state=0) data = {"train": {"X": X_train, "Y": y_train}, "test": {"X": X_test, "Y": y_test}}
Then, training.
# Randomly pic alpha alphas = np.arange(0.0, 1.0, 0.05) alpha = alphas[np.random.choice(alphas.shape[0], 1, replace=False)][0] print(alpha) run.log("alpha", alpha) reg = Ridge(alpha = alpha) reg.fit(data["train"]["X"], data["train"]["Y"]) preds = reg.predict(data["test"]["X"]) run.log("mse", mean_squared_error(preds, data["test"]["Y"])) # Save model as part of the run history model_name = "sklearn_regression_model.pkl" # model_name = "." with open(model_name, "wb") as file: joblib.dump(value=reg, filename=model_name) # upload the model file explicitly into artifacts run.upload_file(name="./outputs/" + model_name, path_or_stream=model_name) print("Uploaded the model {} to experiment {}".format(model_name, run.experiment.name)) dirpath = os.getcwd() print(dirpath) # register the model # run.log_model(file_name = model_name) # print('Registered the model {} to run history {}'.format(model_name, run.history.name)) print("Following files are uploaded ") print(run.get_file_names()) run.complete()
There are many significant differences in the way we operate here.
Me, being a student, I didn't exactly have a business case or the monetary resources to work on a full ML project in Azure DevOps. The above work is done taking as reference.
Visit this Repo for reference: DevOpsForAI
Conclution
MLOps can solve many problems related to the machine learning life cycle. Therefore, many of the general challenges are also solved.
MLOps is a very compelling proposition for companies to run their ML models. MLOps makes working between teams super easy and straightforward.
These are some of the references I used to gather knowledge for this article. Have a read.
References:
2. https://docs.microsoft.com/en-us/learn/modules/start-ml-lifecycle-mlops/1-introduction
3. https://docs.microsoft.com/en-us/azure/architecture/example-scenario/mlops/mlops-technical-paper
4. https://docs.microsoft.com/en-us/azure/architecture/reference-architectures/ai/mlops-python
About me:
Prateek Majumder
Data science and analytics | Digital Marketing Specialist | SEO | Content creation
Connect with me on Linkedin.
My other articles on DataPeaker: Link.
Thanks.
The media shown in this article is not the property of DataPeaker and is used at the author's discretion.