MLOps is to data science as DevOps is to software engineering. Both methodologies aim to operationalize rapid deployment, of software in one case and of machine learning (ML) initiatives in the other, though they differ in subtle ways. MLOps is growing in popularity as organizations realize that efficiently deploying ML at scale is the key to unlocking greater business value. But in this newly developing space, few organizations fully understand how MLOps works or where to start. Using WWT’s MLOps Maturity Model, this article will help you evaluate whether your organization is ready for MLOps and show you how to get started.
Taking ML to the next level
Artificial intelligence (AI) and machine learning (ML) are on CIOs' lists of priority initiatives today. AI is about machines’ ability to gain intelligence to learn and act for themselves. ML, the primary force behind AI advancement in recent years, uses statistics to train an algorithm to recognize patterns, both subtle and obvious, in large data sets.
Many data and analytics technology decision-makers have implemented, or are implementing, some form of AI. Basic adoption of ML techniques is prevalent in enterprises of all sizes. Data scientists are hired, technology choices are made (usually with the help of an external consultant), use cases are identified, and the first proof of concept (POC) is launched. These initial steps happen fairly quickly, but then many organizations’ AI projects stall. According to a Gartner report, after initial data science kicks off, “only about 1 in 5 CIOs who thought they would employ AI within the next 12 months actually achieved that, especially in the 2019-2021 timeframe.” Why is that?
ML is more than just code
Scaling ML is not easy because ML is more than just code. It is a careful orchestration of many processes and tools across an entire ecosystem. Most AI initiatives are believed to stall after a few POCs for several reasons:
- Developing ML systems is relatively fast and cheap, but deploying and maintaining them at scale is difficult due to the effort associated with data collection, data verification, testing and debugging, model analysis, monitoring and system configuration.
- Teams are overwhelmed by the complexity of tools and technologies and may not have the comprehensive skill set to make the right decisions.
- Lack of immediate return on investment (ROI) from POCs, mostly due to a poor choice of problem domain, poor-quality data, or siloed data aggregation, curation, governance, storage or retrieval.
This is why many are now turning to MLOps. MLOps is a framework of processes, tools and people that can help rapidly scale ML programs.
MLOps is meant to streamline the highly iterative ML experiment process and to relieve data scientists of concerns such as software compatibility, infrastructure provisioning, version control and data refresh. The end goal of MLOps is to enable organizations to deploy ML programs rapidly and at scale.
MLOps is similar to DevOps in many ways. Both aim to unify development (Dev) with operations (Ops) and advocate automation and monitoring at every step of the process. However, in addition to the continuous integration and continuous delivery (CI/CD) that DevOps relies on, MLOps addresses machine learning’s unique challenges by adding continuous monitoring and continuous training (CM/CT).
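To make the CM/CT idea concrete, here is a minimal Python sketch, not a production design. It assumes a toy "model" that predicts a single number, and the names `MonitoredModel`, `tolerance` and `monitor_and_maybe_retrain` are hypothetical, chosen only for illustration. Continuous monitoring measures the live error on each fresh batch of data; continuous training refits the model only when that error drifts past a threshold.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Sequence


@dataclass
class MonitoredModel:
    """Toy model wrapper with a monitoring check and a retraining hook (illustrative only)."""
    train: Callable[[Sequence[float]], float]  # hypothetical training function: data -> fitted value
    model: float                               # the "model" is a single predicted number
    baseline_error: float                      # error measured when the model was deployed
    tolerance: float = 1.5                     # assumed threshold: retrain above baseline * tolerance
    retrain_count: int = 0

    def live_error(self, observations: Sequence[float]) -> float:
        """Continuous monitoring: mean absolute error of the current model on fresh data."""
        return mean(abs(x - self.model) for x in observations)

    def monitor_and_maybe_retrain(self, observations: Sequence[float]) -> bool:
        """Continuous training: refit on fresh data only when monitoring flags degradation."""
        if self.live_error(observations) > self.baseline_error * self.tolerance:
            self.model = self.train(observations)
            self.retrain_count += 1
            return True
        return False


# Usage: a model fitted to values around 10, then exposed to stable and drifted batches.
monitored = MonitoredModel(train=mean, model=10.0, baseline_error=0.5)
print(monitored.monitor_and_maybe_retrain([10, 10.5, 9.5, 10]))  # stable batch: no retraining
print(monitored.monitor_and_maybe_retrain([20, 21, 19, 20]))     # drifted batch: retraining triggered
print(monitored.model, monitored.retrain_count)
```

In a real pipeline the monitored metric, the threshold and the retraining job would come from your own tooling; the point of the sketch is only that monitoring output, rather than a code change, is what triggers a new training run.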
Finding the starting line
At this point, you may be wondering where you can start on MLOps. You may have questions like:
- Should I plan to move all my ML/AI within MLOps, or will I still reap the benefits of using it in just a couple of projects?
- Which MLOps tools should my team use? Why?
- If I am already on Azure/Google Cloud/AWS, how might I start MLOps?
- Does MLOps include data architecture and warehousing?
- How much investment will I need in total? In servers? Tools? Training? Consultant services?
- Will my team be able to do MLOps with their current skills? Will they need training? Should I hire new talent?
To help answer all these questions and quickly assess your organization’s MLOps capability growth, WWT has identified key MLOps development steps that account for strategy, people, process and technology perspectives and developed them into the MLOps Maturity Model.
The first step in the MLOps adoption process is to assess your organization’s MLOps maturity using the maturity framework below. The framework assumes that your organization can already carry out a few data projects, even if more sophisticated data management skills are not yet in place.
Here are the critical ML capabilities required at each maturity level:
- Level 1: Modeling processes are heavily manual, and models are rarely changed or retrained. At this level, model production is time-consuming, error-prone and not scalable.
- Level 2: An initial pipeline is set up with automation and proper data and model validation, CT and CD. At this level, the organization starts to benefit from rapid iteration of experiments and can quickly auto-retrain models on new data. Data science and operations teams also begin to act as a unified team.
- Level 3: The CI/CD and ML pipeline automation is set up and managed by a fully integrated DS/Ops team. Few organizations are at this level; those that are exhibit characteristics such as:
  - Ability to deploy new implementations of the entire ML pipeline frequently (e.g., several times a day).
  - Ability to deliver new predictions based on new information assimilated into the ML pipeline with agility.
  - Increased workplace innovation and efficiency.
  - Ability to apply model versioning and traceability with auditing.
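Two of the capabilities above, model validation at Level 2 and model versioning with traceability and auditing at Level 3, can be illustrated with a short Python sketch. This is a sketch under stated assumptions, not a description of any particular product: `ModelRegistry`, `ModelVersion`, `min_accuracy` and the in-memory storage are all hypothetical. A candidate model is registered only if it passes a validation gate, and each accepted version records fingerprints of its hyperparameters and training data so it can be traced later.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class ModelVersion:
    version: int
    params_hash: str    # fingerprint of the hyperparameters
    data_hash: str      # fingerprint of the training data snapshot
    accuracy: float     # validation metric recorded at registration time
    registered_at: str  # UTC timestamp for the audit trail


class ModelRegistry:
    """Hypothetical in-memory registry; a real one would persist to durable storage."""

    def __init__(self, min_accuracy: float) -> None:
        self.min_accuracy = min_accuracy  # assumed validation gate
        self.versions: List[ModelVersion] = []
        self.audit_log: List[str] = []

    @staticmethod
    def fingerprint(obj) -> str:
        """Stable short hash of any JSON-serializable object."""
        payload = json.dumps(obj, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:12]

    def register(self, params: dict, training_rows: list, accuracy: float) -> Optional[ModelVersion]:
        """Validate a candidate model, then version it; every decision is logged."""
        if accuracy < self.min_accuracy:
            self.audit_log.append(f"REJECTED accuracy={accuracy:.3f} < {self.min_accuracy:.3f}")
            return None
        entry = ModelVersion(
            version=len(self.versions) + 1,
            params_hash=self.fingerprint(params),
            data_hash=self.fingerprint(training_rows),
            accuracy=accuracy,
            registered_at=datetime.now(timezone.utc).isoformat(),
        )
        self.versions.append(entry)
        self.audit_log.append(
            f"REGISTERED v{entry.version} params={entry.params_hash} data={entry.data_hash}"
        )
        return entry

    def latest(self) -> Optional[ModelVersion]:
        return self.versions[-1] if self.versions else None


# Usage: one rejected candidate, then two accepted versions trained on the same data.
registry = ModelRegistry(min_accuracy=0.8)
rows = [[1, 2], [3, 4]]
registry.register({"depth": 3}, rows, accuracy=0.75)  # fails the validation gate
registry.register({"depth": 3}, rows, accuracy=0.85)
registry.register({"depth": 5}, rows, accuracy=0.90)
for line in registry.audit_log:
    print(line)
```

Because both accepted versions share the same data fingerprint but differ in their parameter fingerprint, an auditor can tell that the change between them came from the hyperparameters rather than from the training data.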
Building your MLOps strategy
MLOps is about the entire ML ecosystem, not just ML code. Building a blueprint for success requires attention to four key pillars: strategy, people, process and technology. In our experience, clients who have embarked on a journey to accelerate their ML capabilities through MLOps choose to focus on all four dimensions simultaneously and with equal attention.
WWT is well-positioned to help your organization build a strategy that encompasses these four dimensions and addresses the challenges stated earlier. We have internal and external client experience evolving MLOps maturity by identifying the ideal use cases for an MLOps workflow and pipeline. Beyond that, WWT’s AI R&D team has leveraged MLOps to reuse an existing ML pipeline and design an automated process to generate a model as a microservice. The MLOps knowledge and hands-on experience WWT has gained through our own AI R&D program and client collaborations position us well to help your team design and execute a strategy to generate impactful business value from your AI initiatives.
Building ML rapidly at scale is not easy. It is about more than just code and requires careful orchestration of many processes and tools across teams. But by adopting MLOps, you can unleash your organization's ML potential and realize the benefits of continuous monitoring and continuous training. WWT is on the leading edge of this ever-evolving space, and we have the grit and experience to serve as your guide through this developing landscape. To learn more, please request an MLOps Platforming Workshop.
References
- D. Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-Francois Crespo, and Dan Dennison. 2015. Hidden technical debt in Machine learning systems. Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2 (NIPS'15). MIT Press, Cambridge, MA, USA, 2503–2511.
- MLOps: Continuous delivery and automation pipelines in machine learning. (2020, November 16). https://cloud.google.com/solutions/machine-learning/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning?hl=de.
- MLOps aims to unify ML system development | Google Cloud Blog. (n.d.). Retrieved January 26, 2021, from https://cloud.google.com/blog/products/ai-machine-learning/key-requirements-for-an-mlops-foundation
- Gartner, Inc. (2019, January 3). 2019 CIO Survey: CIOs Have Awoken to the Importance of AI. https://www.gartner.com/en/documents/3897266/2019-cio-survey-cios-have-awoken-to-the-importance-of-ai.
- Hao, K. (2020, April 02). What is machine learning? Retrieved January 19, 2021, from https://www.technologyreview.com/2018/11/17/103781/what-is-machine-learning-we-drew-you-another-flowchart/
- Sridharan, S., Leganza, G., & Vale, J. (2020, May 20). Research Overview: Artificial Intelligence A Guide To Navigating Our AI Technology Research Portfolio. https://www.forrester.com/report/Research+Overview+Artificial+Intelligence/-/E-RES160761.
- Worldwide MLOps Web Search. (2021, January 3). https://trends.google.com/trends/.
- Lyft. (2020). (rep.). Lyft Annual Report 2019. Retrieved from https://investor.lyft.com/index.php/static-files/9da68816-849a-4720-bdc8-140de503ef95
- Netflix. (2020). (rep.). Netflix Annual Report 2019. Retrieved from https://s22.q4cdn.com/959853165/files/doc_financials/2019/ar/2019-10-K.pdf
- GoogleApps. (2019, April 10). ML Ops Best Practices on Google Cloud (Cloud Next' 19). https://www.youtube.com/watch?v=20h_RTHEtZI.