A Google conference about AI

Each year, Google gathers customers and partners for its Google Cloud Next conference, where the company announces its latest offerings and hands out its Partner of the Year awards (World Wide Technology was an award winner!).

Like nearly every tech conference right now, Google Cloud Next 2024 was all about Generative AI (GenAI). While the hype around the topic is considerable, it's also driving significant spending as companies prepare to deploy their own AI solutions: they are searching for AI platforms and recruiting talent that can build on top of them. Last year, in my recap of Google's conference, I talked about how GenAI was different and represented a significant evolution of AI. So it comes as no surprise that this is still very much the innovation focus for Google.

It was only last December that Google released Gemini, a new large language model (LLM) with a name and brand meant to unify the various Generative AI offerings across Google's ecosystem of products. In a broad fulfillment of that promise, Gemini was everywhere at Google Cloud Next. If last year was about announcing this new model, this year was about showcasing and expanding it.

Gemini

Gemini 1.5 Pro

The Gemini LLM itself has been upgraded to version 1.5, with the headline new feature being a 1 million token context window. Think of tokens as the small chunks of text an LLM processes; 1 million tokens corresponds to roughly 700,000 words. The previous Gemini version was limited to 32,000 tokens, so this is a significant upgrade.
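
If you want a feel for how much text that is, the Vertex AI Python SDK can count tokens before you send a prompt. Here's a minimal sketch, assuming the google-cloud-aiplatform SDK; the project ID and model name are placeholders:

```python
# A minimal sketch of token counting with the Vertex AI Python SDK.
# The project ID and model identifier below are placeholder assumptions.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-project-id", location="us-central1")
model = GenerativeModel("gemini-1.5-pro")

# count_tokens reports how many tokens a prompt would consume, which is
# handy for checking how close a document comes to the 1M-token window.
response = model.count_tokens("The quick brown fox jumps over the lazy dog.")
print(response.total_tokens)
```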

Why is the size of the context window important? LLMs are trained on vast datasets but often lack the specific details behind more specialized questions you might ask. For example, to ask Gemini questions about a technical research project your company completed, you would probably need to feed the model documents related to the project. In previous model iterations, the relatively small context window meant that to feed the model these documents, you would likely need to build a solution around the model, such as a retrieval augmented generation (RAG) pipeline. This created extra complexity and, in some cases, may not have been worth the added effort.

With a larger context window, you can simply feed the documents to the model as part of your prompt. There's no RAG pipeline or model fine-tuning required to get context-specific answers from the LLM. This is extremely useful when engineering resources are limited or when the use case simply doesn't call for anything more elaborate.
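
To make that concrete, here's a minimal sketch of long-context prompting with the Vertex AI Python SDK. The project ID, file name, and model identifier are placeholder assumptions (the exact Gemini 1.5 Pro model name may differ depending on availability in your region):

```python
# A sketch of long-context prompting: the document rides along in the
# prompt itself -- no RAG pipeline, no vector database, no fine-tuning.
# Project ID, file path, and model name are placeholder assumptions.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-project-id", location="us-central1")
model = GenerativeModel("gemini-1.5-pro")

# Read an entire project document; with a 1M-token window this can be
# hundreds of thousands of words of source material.
with open("research_project_report.txt", "r", encoding="utf-8") as f:
    report = f.read()

response = model.generate_content(
    [
        "Using only the report below, summarize the key findings "
        "and list any open technical risks.\n\n",
        report,
    ]
)
print(response.text)
```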

Gemini Code Assist

Gemini Code Assist (formerly Duet AI) is a big upgrade from its predecessor. It can be integrated directly with popular developer environments such as VS Code and JetBrains products and provides code suggestions, auto-completion, and many of the features you would expect from an AI code assistant. I've been using it with my Python projects and it's already making a noticeable impact. It's especially useful for generating code that resembles another part of your project, as it's able to read code across the project and make recommendations based on this context.

Gemini Cloud Assist

Another impressive use case for GenAI came in the form of Gemini Cloud Assist. This product is targeted at developers and engineers who build and maintain solutions in the cloud. In the Developer Keynote demo, Google showed off a scenario where a firewall issue had caused a product website outage. The engineer chatted with Gemini at each step of the troubleshooting process, greatly increasing the speed at which he could retrieve the information needed to find the root cause. He asked for recent alerts, got a link to the relevant log, and asked Gemini to summarize the log and even show what recent firewall changes had been made. It was an extremely impressive demonstration of how LLMs can sift through vast amounts of information and provide actionable intelligence for issue-resolution use cases.

Gemini in BigQuery

Google also showed us Gemini in BigQuery, which allows for intelligent query completion along with generating queries entirely from natural language. But where the functionality gets really interesting is the ability to run the LLM across a large batch of data and generate results. For example, say you have a large dataset of customer survey text and want to understand whether the sentiment of each response is positive or negative. You could instruct Gemini to evaluate the data in this way and generate a new BigQuery column with the results. It's all done without leaving the BigQuery product, which makes it a really convenient and powerful way to augment datasets with additional insights.
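
As a rough sketch of how that batch pattern might look, BigQuery ML exposes generation through its ML.GENERATE_TEXT function. The sketch below assumes a remote Gemini model has already been created over a Vertex AI connection; all project, dataset, table, and model names are placeholders:

```python
# A sketch of batch sentiment scoring inside BigQuery via ML.GENERATE_TEXT.
# Assumes a remote Gemini model (`mydataset.gemini_model`) already exists
# over a Vertex AI connection; all names here are placeholder assumptions.
from google.cloud import bigquery

client = bigquery.Client(project="your-project-id")

query = """
CREATE OR REPLACE TABLE `mydataset.survey_sentiment` AS
SELECT
  prompt,
  ml_generate_text_llm_result AS sentiment
FROM ML.GENERATE_TEXT(
  MODEL `mydataset.gemini_model`,
  (
    SELECT CONCAT(
      'Classify the sentiment of this customer survey response ',
      'as positive or negative: ', survey_text
    ) AS prompt
    FROM `mydataset.customer_surveys`
  ),
  STRUCT(0.2 AS temperature, 16 AS max_output_tokens, TRUE AS flatten_json_output)
)
"""

# Everything runs inside BigQuery; the new table carries the model's
# sentiment label alongside each original prompt.
client.query(query).result()
```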

Gemini in Workspace

Gemini was also everywhere in Google Workspace. Similar to how Microsoft has integrated Copilot into nearly every aspect of its collaboration tools, Google has done the same with Gemini. Expect additional offerings from Google that heavily integrate LLM technology into Gmail, Calendar, Drive, and its other collaboration products.

Vertex AI

For those not familiar: while Gemini is Google's LLM, Vertex AI is Google's platform for developing, training, and deploying AI solutions. You can use Gemini as your model or choose from a wide variety of others, such as Claude 3 from Anthropic or Llama 2 from Meta. In addition to new model choices, several key enhancements to Vertex AI were announced at Google Cloud Next.
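
For example, Anthropic publishes a Vertex-flavored client in its Python SDK, so switching models can be as small a change as swapping the client object. A hedged sketch, with the project, region, and model version string as placeholder assumptions:

```python
# A sketch of calling Claude 3 on Vertex AI via Anthropic's Python SDK
# (pip install "anthropic[vertex]"). Project, region, and the exact
# model version string are placeholder assumptions.
from anthropic import AnthropicVertex

client = AnthropicVertex(project_id="your-project-id", region="us-central1")

message = client.messages.create(
    model="claude-3-sonnet@20240229",
    max_tokens=256,
    messages=[{"role": "user", "content": "Summarize what Vertex AI does."}],
)
print(message.content[0].text)
```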

Vertex AI Agent Builder

This was maybe my favorite announcement at Next: the Vertex AI Agent Builder. It's actually an upgrade and rebrand of an existing product, Vertex AI Search and Conversation. The Agent Builder is Google's no-code online tool for creating and deploying AI "agents." An agent is simply an AI solution designed to assist with a specific task, such as a chatbot or an intelligent search.

What's uniquely powerful about Vertex AI Agent Builder is that it cuts out many of the arduous steps typically required to set up a well-functioning model. A RAG search application grounded with data from a specific website can be constructed and deployed in just a few minutes, with no need to create your own API integrations or stand up a vector database. It's all built in and does much of the hard work for you. While this won't work for every enterprise solution, there are definitely use cases where it will be applicable and will save businesses a ton of time getting an AI solution in place.

Vertex AI Prompt Management

New enhancements to Vertex AI prompting allow you to save prompts and their results and even feed the results back into the model, asking for suggestions on how to improve the original prompt. It seems that even when it comes to what we should be asking LLMs… we can ask LLMs!
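
The console feature handles this for you, but the underlying idea is easy to sketch: feed a saved prompt and the answer it produced back to the model and ask for a better prompt. This illustrates the technique, not the Vertex AI Prompt Management API itself; the project ID and model name are placeholders:

```python
# A sketch of prompt refinement: ask the model to critique and rewrite
# a prompt given the answer it produced. Illustrative only -- this is
# not the Vertex AI Prompt Management API. Names are placeholders.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-project-id", location="us-central1")
model = GenerativeModel("gemini-1.5-pro")

original_prompt = "Summarize our Q3 customer survey results."
original_result = model.generate_content(original_prompt).text

# Feed the prompt/result pair back in and ask for an improved prompt.
critique = model.generate_content(
    "Here is a prompt and the answer it produced.\n\n"
    f"PROMPT:\n{original_prompt}\n\nANSWER:\n{original_result}\n\n"
    "Suggest a rewritten prompt that would yield a more specific, "
    "better-structured answer."
)
print(critique.text)
```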

Other announcements

In all, there were 218 things announced at Google Cloud Next!

Rather than go over each one, I'll highlight a few I found particularly interesting:

  • Gemini can use its 1M token context window to process audio
  • Imagen, Google's prompt-to-image tool, can now create short video clips
  • Users can ground model responses with Google Search or their own data in BigQuery
  • Gemma, Google's family of open models, is available across Vertex AI
  • Cloud TPU v5p is now generally available and supported in Google Kubernetes Engine (GKE)
  • A3 Mega VMs (powered by NVIDIA H100) will be generally available in May 2024
  • NVIDIA Blackwell GPUs are coming to GCP in early 2025
  • Google Distributed Cloud will have GenAI search via Gemma, available Q2 2024

Key takeaways

At Google Cloud Next, Google continued its intense focus on expanding its portfolio of AI products. Whether it was AI platform tools, collaboration applications, or high-performance hardware, nearly everything linked back to AI. The pace of AI product delivery and enhancement from Google and the industry at large is remarkable. It's clear there will be no shortage of products to try, and companies will continue building solutions on these platforms. Generative AI has supercharged the race to innovate, and it's incredibly exciting to be a part of it.