A look at open source AI models
by Antone Gonsalves, Tech Target
Do-it-yourself models that enterprises can tailor to their needs offer an alternative to the massive generative AI models developed by major cloud providers AWS, Google and Microsoft.
Startups such as Cerebras Systems, Databricks and MosaicML hope to convince enterprises that they can get more bang for their buck with open source AI models that they control and train on their own data, whether to provide information to customers or to help employees with specific tasks.
"If you believe you can provide an advantage with your data, you want to use an open source model and train it with your specific data," said Andrew Feldman, CEO of Cerebras, which builds computer systems and models for AI applications.
Drugmakers, financial services, academic researchers and government agencies have used AI for years. Generative AI changed the game by enabling text-based, humanlike responses to natural language queries.
The advancement promises to make AI accessible to everyone. Software developers can use it to generate code, salespeople can fine-tune email pitches, and marketing teams can craft better product descriptions. Also, employees can get answers to questions and summaries of documents and meetings.
Generative AI's potential for driving efficiency in business processes is why enterprises are intensely interested in the early-stage technology.
Cost of generative AI models
The cost of an in-house model depends on its size, measured in parameters, and on how quickly the system must respond when many users submit queries simultaneously.
On-premises models at the low end will cost a few hundred thousand dollars, so it's often cheaper for organizations that use them only occasionally to run them in the cloud, said Dylan Patel, an analyst at SemiAnalysis. However, companies that use models regularly could cut costs with on-premises systems while gaining more flexibility in deployment and customization.
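The cloud-versus-on-premises tradeoff Patel describes comes down to a break-even calculation. The sketch below is illustrative only: the function name `break_even_hours` and every dollar figure in it are hypothetical assumptions, not numbers from Patel or any vendor.

```python
def break_even_hours(on_prem_upfront: float,
                     cloud_rate_per_hour: float,
                     on_prem_rate_per_hour: float = 0.0) -> float:
    """Hours of use after which on-premises hardware becomes cheaper than cloud.

    on_prem_upfront: one-time purchase cost of the on-premises system.
    cloud_rate_per_hour: hourly cost of equivalent cloud capacity.
    on_prem_rate_per_hour: hourly operating cost (power, staff) on premises.
    """
    hourly_savings = cloud_rate_per_hour - on_prem_rate_per_hour
    if hourly_savings <= 0:
        # Cloud is never more expensive per hour, so on-premises never pays off.
        return float("inf")
    return on_prem_upfront / hourly_savings


# Hypothetical figures: $300,000 upfront, $30/hour in the cloud,
# $5/hour to operate on premises.
hours = break_even_hours(300_000, 30.0, 5.0)
print(f"Break-even after {hours:,.0f} hours of use")
```

Under these made-up numbers, an organization that runs its model only a few hours a week would take decades to recoup the hardware cost, while one that runs it around the clock would break even in well under two years.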
WWT trained generative AI models with 100 million to 3 billion parameters, sizes the company found could fit into the memory of typical on-premises hardware, excluding the RAM used by the GPU. By comparison, OpenAI's GPT-3 has 175 billion parameters.
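A rough way to see why parameter count determines whether a model fits in memory is to multiply the number of parameters by the bytes used to store each one. The sketch below assumes 16-bit weights by default and counts the weights alone; training needs additional memory for gradients and optimizer state, and the helper name `weight_memory_gb` is illustrative rather than anything WWT used.

```python
# Bytes needed to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params: int, dtype: str = "fp16") -> float:
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Model sizes mentioned in the article.
models = [
    ("100 million", 100_000_000),
    ("3 billion", 3_000_000_000),
    ("6 billion", 6_000_000_000),
    ("175 billion (GPT-3)", 175_000_000_000),
]

for label, params in models:
    print(f"{label}: about {weight_memory_gb(params):.1f} GB at fp16")
```

By this estimate, a 3-billion-parameter model needs about 6 GB for its weights at 16-bit precision, while a 175-billion-parameter model needs about 350 GB, which is far more than a single typical server holds.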
WWT concluded that enterprises training models with 100 million to 6 billion parameters would also have to fine-tune them for specific tasks, such as text summarization, question answering and text classification, to get accurate responses and meet user expectations.
"The experiments and the results provide a rough guide of the kind of capabilities one can expect from open source models of small sizes," said Aditya Prabhakaron, WWT's data science lead.