MLOps Tools: An Analysis of the Third-Party Landscape
Machine learning operations (MLOps) can transform business outcomes by unlocking the potential of machine learning. However, because MLOps is a relatively new discipline, many organizations lack in-house expertise and are unsure how to get started and make full use of it. This series aims to bridge that gap and provide resources to begin your organization's MLOps journey.
The first article, Top Considerations for Implementing MLOps, focused on the holistic approach of connecting people, processes, and technology. Then, Ins and Outs of Choosing a Cloud Provider broke down MLOps offerings from GCP, AWS, and Azure. However, the emergence of MLOps has also driven a wave of innovation in the third-party tool space. Because organizations can face decision fatigue with the seemingly unlimited number of options, What to Know Before Selecting a Third-Party Tool laid out an approach for assessing whether a third-party tool is right for you.
As the series finale, this article will build on the considerations in the last article and provide a point-in-time snapshot of specific third-party tools that your organization can implement.
Recall the various components that make up an MLOps system:
- Data management
- Model versioning and storage
- Model training and deployment
- Model validation
- Continuous integration & continuous delivery (CI/CD)
Some tools target one or more of these components, while others cover entire MLOps systems. Understanding your organizational needs can help identify the types of tools that are best for you. However, given how many tools there are, it may not be practical to use this framework to compare every tool on the market. When looking for a data management tool, for example, the number of options is overwhelming enough to make it difficult for an organization to reach consensus on a single choice.
The considerations outlined in our last article provide a framework for assessing individual tools and platforms:
- Support & ease of use
- Flexibility & customizability
These factors can provide additional insight and help narrow the scope of tools. Assume, again, that you're looking for a data management tool. You can apply one or more of these considerations to condense the list further. The resulting list would contain data management tools that are compatible with your existing environments, have extensive supporting documentation, and fit within your budget.
To make your search for third-party tools easier, this article highlights a number of tools that offer a high level of support, and more specifically, a high ease of use. We've chosen a set of tools organized along a no-code to code-intensive spectrum. No-code platforms are accessible to those with only a conceptual understanding of machine learning, whereas the most code-intensive ones require extensive coding in addition to conceptual knowledge. As a caveat, we have found that simplicity often trades off against flexibility. In other words, the more coding involved, the more flexible the solution. For MLOps platforms, flexibility often means additional customizability of data inputs, model capabilities, and compute power, coupled with a greater ability to work with and alongside other platforms and systems.
Our data scientists chose seven tools from their industry knowledge and direct customer experience that span the spectrum of no-code to code-intensive.
Starting at the left, DataRobot is the most no-code-friendly platform of those listed, whereas Domino Data Lab is the most code-intensive. It's important to note that this spectrum isn't a ranking: we can't say which tool will be better than another for your organization. The goal of applying these considerations is to identify what you're optimizing for and then choose the tool that addresses your needs. We're often asked whether one technology is better than another, but the answer always depends on your individual needs. Below, we'll discuss each of these seven tools in more detail, in order.
Ease of use
DataRobot provides MLOps and AIOps application-building capabilities from within a no-code interface.
DataRobot offers end-to-end management with proprietary model health monitoring to ensure robust solutions and performance. Models can be deployed from anywhere, with support for cloud, on-premises, and hybrid environments. All models in production are centrally monitored, allowing you to build on your preferred infrastructure while benefiting from single-pane-of-glass monitoring.
The platform has extensive documentation on its offerings, including an MLOps 101 playbook to guide beginners. This documentation, combined with the no-code interface, positions DataRobot as an ideal fit for customers beginning their MLOps journey.
Ease of use
MLReef provides a fully no-code development environment alongside a publishing process that converts code to modules usable inside the MLReef interface.
MLReef is an open-source MLOps platform that focuses on cross-user and cross-project collaboration. MLReef increases the efficiency of ML projects by structuring the life cycle into modules that are atomic Git repositories for models, data operations, and data visualizations. This allows for easy scaling of ML activities across teams and, with instant sharing and fine-grained permission management, increases knowledge accessibility for both technical and non-technical users. MLReef supports cloud and on-premises repositories, infrastructure, and deployment management.
Ease of use
Dataiku caters to all skill levels by providing options for no-code workspaces and Python capabilities for building CI/CD MLOps pipelines.
Dataiku offers a broad-based solution for MLOps pipeline development. Dataiku specializes in increasing production speed through visualized and automated data preparation, and in easing management through project bundling from test to production environments. The platform supports edge deployment via ONNX Runtime for machine learning projects running on edge devices, enabling monitoring, data updates, and model retraining at larger scale and higher speed. Additionally, Dataiku offers drift detection on training and test data to enhance model reliability during automated model retraining. Dataiku has full Git integration for version control and provides an API that integrates with Jenkins, GitLab CI, Travis CI, or Azure Pipelines for CI/CD within your MLOps pipeline.
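To give a sense of what drift detection does under the hood, the sketch below compares a "live" feature distribution against the training distribution with a two-sample Kolmogorov-Smirnov statistic. This is a tool-agnostic illustration in plain Python, not Dataiku's API, and the threshold value is an illustrative assumption that real systems tune per feature.

```python
import bisect
import random

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap
    between the empirical CDFs of the two samples."""
    a, b = sorted(sample_a), sorted(sample_b)

    def ecdf(sorted_sample, x):
        # Fraction of observations <= x.
        return bisect.bisect_right(sorted_sample, x) / len(sorted_sample)

    return max(abs(ecdf(a, v) - ecdf(b, v)) for v in a + b)

random.seed(0)
training = [random.gauss(0.0, 1.0) for _ in range(1000)]      # reference data
live_stable = [random.gauss(0.0, 1.0) for _ in range(1000)]   # same distribution
live_shifted = [random.gauss(0.8, 1.0) for _ in range(1000)]  # simulated drift

DRIFT_THRESHOLD = 0.1  # illustrative cutoff, not a universal constant
print(f"stable:  {ks_statistic(training, live_stable):.3f}")
print(f"shifted: {ks_statistic(training, live_shifted):.3f}")
```

When the statistic crosses the threshold for a monitored feature, a pipeline would typically raise an alert or trigger automated retraining, which is the workflow Dataiku automates.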
Ease of use
Datadog recently unveiled an end-to-end testing system that lets team members create end-to-end tests without writing any code.
Datadog provides an end-to-end monitoring solution for data pipelines, focusing on automatic anomaly detection, correlation, forecasting, and dashboard visualization to ease the burden on machine learning engineers during troubleshooting and maintenance. Datadog supports over 500 integrations, including AWS, Azure, and AKS.
There are several ways Datadog enables model monitoring. Datadog automatically collects log data from an organization's services, applications, and platforms, ingesting them into a data lake for easy filtering, live tailing, and metric generation. Datadog also includes a web recorder and code-free test simulations of a user journey through applications to proactively detect performance issues for CI/CD. Additionally, Datadog offers real-time interactive dashboard visualizations with slice-and-dice filters and performance alerts for reporting key metrics or issues, allowing users to easily demonstrate production data impact to other stakeholders and respond to dynamic situations quickly.
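Datadog's anomaly detection is proprietary, but the core idea behind metric alerting can be illustrated with a simple rolling z-score: flag any point that deviates sharply from the recent history of the series. The sketch below is a minimal, stdlib-only illustration of that concept, not Datadog's algorithm; the window size and threshold are assumptions.

```python
import math
from collections import deque

def anomalies(series, window=20, z_threshold=3.0):
    """Flag indices whose value deviates more than z_threshold standard
    deviations from the rolling mean of the previous `window` points."""
    recent = deque(maxlen=window)
    flagged = []
    for i, x in enumerate(series):
        if len(recent) == window:
            mean = sum(recent) / window
            std = math.sqrt(sum((v - mean) ** 2 for v in recent) / window)
            if std > 0 and abs(x - mean) / std > z_threshold:
                flagged.append(i)
        recent.append(x)
    return flagged

# A steady latency metric with one spike injected at index 50.
latency_ms = [100 + (i % 5) for i in range(100)]
latency_ms[50] = 400
print(anomalies(latency_ms))  # -> [50]
```

A production system like Datadog layers seasonality awareness, forecasting, and correlation across services on top of this basic idea, which is why teams buy rather than build it.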
Ease of use
Cloud data platform Snowflake recently released Snowpark, which makes it quick and easy to connect to Snowflake databases for MLOps pipeline development. User-defined functions (UDFs) can be written in several programming languages, including Python, but coding is required; there is no no-code or low-code interface.
Ease of use
Databricks acquired no-code platform 8080 Labs in 2021 and has since integrated some no-code features into its platform, but what makes Databricks truly shine is its flexibility. To get the most out of its features, you'll want to take advantage of its programming and Python capabilities.
Databricks offers a comprehensive product suite, highlighted by model versioning and a model registry. Notably, Databricks is the developer of MLflow, an open-source platform for developing ML projects. Built on Delta Lake, an open data lakehouse architecture, Databricks ML enables the standardization of pipelines without relying on the support of a data engineering team. The Databricks ML package includes pre-configured clusters for PyTorch, TensorFlow, and scikit-learn; simplified model deployment with a data-lineage-based feature search and integration with governance workflows; and model management using Managed MLflow, which allows automatic cluster scheduling for production models. Deployment is available on Apache Spark or via REST APIs, using built-in integration with Docker containers, Azure ML, or Amazon SageMaker.
Ease of use
Domino Data Lab creates flexible and easily maintainable tools for data scientists that are currently accessible only through code.
Domino Data Lab's primary offering is its Enterprise MLOps platform, which serves over 20% of Fortune 100 companies. The platform includes a System of Record for model versioning; an Integrated Model Factory for model training, deployment, and integrated model monitoring, including automated drift detection; and a Self-Service Infrastructure Portal with a unified data access library for data management. Domino Data Lab has partnerships with multiple data science tool providers, such as Snowflake, SAS, RStudio, Amazon SageMaker, and Stata, and with infrastructure platforms such as AWS, Azure, GCP, NVIDIA, VMware, and Red Hat. Beyond integrating solutions on one platform, Domino Data Lab also offers a multi-cloud MLOps solution, currently available only through private preview: Domino Nexus, a single pane of glass for viewing data science and ML workloads across compute clusters on multi-cloud infrastructure.
Across the MLOps pipeline, many platforms and tools are available for developing, deploying, and monitoring machine learning solutions. For organizations where scalability is a top concern, the low-latency offering of Databricks ML may keep their MLOps pipelines ahead of the curve. Alternatively, for an organization focused on accuracy and reliability, Dataiku's automated model retraining with drift detection can be an essential feature. Organizations must identify the critical issues rooted in their business needs and build their MLOps environment around them.
This article, along with the whole series, aims to help you begin your MLOps journey. We've provided examples and focused on big-picture ideas to help you get started. Now it's time to evaluate and develop strategies tailored to your individual needs.
Disclaimer: This article only provides a point-in-time snapshot of the offerings from DataRobot, MLReef, Dataiku, Datadog, Snowpark, Databricks and Domino Data Lab. We anticipate the tools and services in the MLOps space will continually evolve, given the rapid pace of MLOps and technology development today.