
This article was written by Sathish Thyagarajan at NetApp.

What's the difference between an alpaca and a llama? They have a very similar appearance. But apparently, alpacas are very gentle and shy, and llamas are very confident and brave animals. I love all animals, but I am not here to speak about these wonderful creatures. I am here to write about a different Llama and other large language models (LLMs) that are driving adoption of generative AI (GenAI) and are central to our current AI revolution.

The rise of generative AI underpinned by LLMs is a macrotrend across industries, and it's becoming easier than ever to build intelligent chatbots that can go through stacks of documents, code, or images to respond to user prompts. The use cases vary greatly. For example, biopharmaceutical and life sciences companies have initiated human trials for AI-powered drug discovery (AIDD) by combining Gene Ontology (GO) with pretrained models like ProteinBERT to rapidly design protein structures. On the other end of the spectrum are digital artists in visual media companies and 3D computer-aided design (CAD) engineers in additive manufacturing companies. These industries have begun to combine rendering technologies like neural radiance fields (NeRF) with generative AI to convert static 2D images into immersive 3D scenes.

Since the AI boom took off in late 2022, LLMs like Llama, GPT, and BLOOM have become widely available through cloud services. These AI solutions require industrial-scale compute resources and an intelligent data infrastructure that spans multicloud environments and private data centers while preserving data privacy, governance, and security. When a powerful technology like generative AI emerges, leaders must assess the many possible ways that people might interact with it. They must take every step possible to help ensure that AI systems are designed in a responsible, explainable, and ethical manner.

Strategic Questions

Should we train, fine-tune, or use RAG?

Across industries, most businesses face the question of whether to train an LLM from scratch (a costly approach), to fine-tune a pretrained LLM (feasible yet complex), or to use retrieval-augmented generation (RAG), which supplies a pretrained model with relevant documents retrieved at query time. The answer is a strategic decision.
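To make the RAG option more concrete, here is a minimal sketch of its retrieval step in Python. It is an illustration under stated assumptions, not a NetApp implementation: the documents, the query, and the function names (`retrieve`, `build_prompt`) are hypothetical, and a simple bag-of-words cosine similarity stands in for the embedding model and vector database that a production system would typically use. The call to the LLM itself is left out.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Return up to top_k documents that share vocabulary with the query."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine_similarity(query_vec, Counter(tokenize(doc))), doc)
        for doc in documents
    ]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Drop documents with no overlap at all so that irrelevant text
    # never reaches the prompt.
    return [doc for score, doc in ranked[:top_k] if score > 0]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the augmented prompt that would be sent to a pretrained LLM."""
    context_block = "\n".join(f"- {doc}" for doc in context)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {query}"
    )


# Hypothetical private documents, invented for this sketch.
DOCUMENTS = [
    "Checkpoint files are written to shared storage every 500 training steps.",
    "The cafeteria menu changes every Monday.",
    "Fine-tuning jobs read the tokenized dataset from the data lake.",
]
QUERY = "How often are checkpoint files written?"

if __name__ == "__main__":
    print(build_prompt(QUERY, retrieve(QUERY, DOCUMENTS, top_k=1)))
```

The point of the sketch is that the model's weights never change. Private data stays in your own storage and is pulled into the prompt only at query time, which is what distinguishes RAG from training or fine-tuning.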

Should we use open-source or proprietary models?

The choice between open-source LLMs (like Llama 2 and CM3Leon) and proprietary LLMs (like GPT-4 and DALL·E) adds another layer of complexity to AI decisions.

Each question includes a multitude of factors like cost, scalability, data bias, hallucinations, value alignment, and vendor dependency, which play a crucial part in decision-making, depending on an organization's priorities and business objectives.

So, what do you gain by using NetApp in your AI environment?

NetApp, a leader in delivering intelligent data infrastructure, serves a broad range of enterprise customers in healthcare, manufacturing, financial services, and many other industries. LLM solutions rely largely on transformer-based deep learning algorithms and GPU compute. However, when the GPU buffer fills up, the data must be written quickly to storage.

Some AI models are small enough to execute in memory. But for fast access to large datasets, LLMs require high-IOPS, high-throughput storage, which makes an intelligent data infrastructure a cornerstone for generative AI. The NetApp® intelligent data infrastructure helps you get your data ready for at-scale AI solutions. As NetApp CEO George Kurian said recently, "… data is the unique intellectual property and asset that an organization has, and it's particularly valuable in the age of AI."

Furthermore, as stated in Harvard Business Review, to scale generative AI, companies should enable their explorers, build platforms, and prioritize impact. From that standpoint, NetApp has been committed to the use of AI internally in refining and designing its core products, such as NetApp ONTAP® data management software. We also think deeply about the overall relationship between AI, scale, and value creation for customers like you. NetApp Autonomous Ransomware Protection, developed by using AI and advanced machine learning, illustrates NetApp's capability to build AI-integrated systems that identify threats before they affect your business operations.

In addition to AI capabilities that are infused in our core product engineering and in the organization's culture, NetApp also collaborates closely with a robust network of AI partners such as NVIDIA and Domino Data Lab. These collaborations bring value to your organization across the various stages of an LLM lifecycle, such as data preprocessing, AI model pretraining, fine-tuning, inferencing, and prompt engineering, helping you meet the data management demands of generative AI.

Generative AI: NetApp value proposition
  • NetApp AI POD (formerly ONTAP AI) offers validated architectures that pair two prominent NVIDIA GPUs, the A100 and the H100, with NetApp cloud-connected all-flash storage. NetApp AI POD, based on NVIDIA DGX BasePOD, is a scalable architecture for deep learning and AI workloads, helping you pretrain or fine-tune LLMs.
  • Cloud platforms. You get a fully managed cloud storage offering that's available natively on Microsoft Azure as Azure NetApp Files, on AWS as Amazon FSx for NetApp ONTAP, and on Google Cloud as Google Cloud NetApp Volumes. It's easy to fine-tune LLMs and to run highly available generative AI workloads with cloud-native machine learning platforms such as Amazon SageMaker, Azure OpenAI Service, and Google Cloud Vertex AI.

But wait, there's more to it. NetApp FlexCache technology brings real-time rendering capabilities and data mobility for real-time generative AI models and prompt execution. NetApp SnapLock compliance software, available with ONTAP-based storage systems, brings immutable disk capability for dataset versioning. Your data is protected during denial-of-service (DoS) attacks, in which a malicious attacker interacts with an LLM in a resource-consuming way.

NetApp BlueXP helps you identify, map, and classify personal information; meet privacy requirements on premises or in the cloud; and improve your security posture. It offers these features as well as data protection when you're deploying LLMs. NetApp Spot helps you scale your GPU-enabled instances and reduce your GPU rental costs in the cloud. You can also use NetApp technology to deploy private RAG and to maintain data security and governance. To learn more about NetApp's product offerings and capabilities for generative AI, read my paper Generative AI and NetApp Value.

Generative AI can produce effective results only when the model is trained on reams of high-quality data. And that data can reside anywhere, from your manufacturing floors, to your data centers, to your public cloud and multicloud environments. NetApp delivers true unified data storage and intelligent data infrastructure that combine any data, structured or unstructured, in a secure and governed manner.

NetApp has helped customers across industries with their AI initiatives. For example, we have helped medical organizations build predictive AI solutions on healthcare data (such as surgical robot videos, brain images, and genome sequencing). NetApp solutions have enabled medical practitioners to spend more time on patient care and less time on examining images. For anti-money laundering and fraud detection in the banking and financial services industry, NetApp intelligent data infrastructure helps in designing generative AI solutions that involve generative adversarial network (GAN) frameworks. And those examples are just a few among many.
