AI is growing up. As large language models (LLMs) become some of the most critical components in artificial intelligence (AI)-centric enterprise software applications, their use is being developed, refined, segmented, and interconnected. The latest news on that front is not far away.
Google DeepMind CEO Demis Hassabis has announced the next version of Google's Gemini LLM. Now at version 1.5 and available in a new format, the LLM (whose chatbot front end was formerly known as Bard) is said to represent a "step change" in the technology's advancement, and its Pro version has been released as a developer preview.
Long context understanding
Hassabis praised Gemini 1.5's ability to deliver "long-context understanding," a term used to describe an AI model's capacity to track vector relationships across long stretches of text, and, as multimodal AI moves toward ingesting images, videos, sound files, and so on, across those data sources as well.
Long-context understanding technologies are designed to recognize relationships between more distant data points, an acknowledgment that performance naturally degrades as the amount of information grows. Just as importantly, the relevant information does not have to sit at the beginning or the end of the input.
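To make that idea concrete, below is a minimal sketch of the "needle in a haystack" style of test commonly used to probe long-context understanding. The `ask_model` function is a hypothetical stub standing in for any real LLM call, and the filler text and needle are invented for illustration.

```python
# A minimal "needle in a haystack" long-context test sketch.
# `ask_model` is a hypothetical stand-in; swap in a real LLM client to run it.

FILLER = "The quick brown fox jumps over the lazy dog. " * 20_000
NEEDLE = "The secret launch code is AURORA-7."

def build_prompt(needle_position: float) -> str:
    """Bury the needle at a relative position (0.0 = start, 1.0 = end)."""
    cut = int(len(FILLER) * needle_position)
    return (FILLER[:cut] + NEEDLE + " " + FILLER[cut:]
            + "\n\nQuestion: What is the secret launch code?")

def ask_model(prompt: str) -> str:
    # Hypothetical stub: a real test would send `prompt` to the model.
    raise NotImplementedError

# A model with genuine long-context understanding should retrieve the
# needle wherever it sits in the prompt, not just near the edges.
for position in (0.0, 0.25, 0.5, 0.75, 1.0):
    prompt = build_prompt(position)
    # print(position, ask_model(prompt))
```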
First introduced last December, this version of the Gemini series is positioned as a "research release" aimed squarely at software application developers and Google Cloud customers. In some ways it is reminiscent of the developer preview system Redmond once ran through the Microsoft Developer Network (MSDN), where the company worked more closely with (or at least thought first of) the programming community than more open approaches would. Whether Google is trying to win hearts and minds, take a stricter line on AI safety, or simply keep more overall control and steering is open to debate.
Mixture of experts
Hassabis also explained how Gemini 1.5 features a new Mixture of Experts (MoE) architecture. An approach designed to break the logic of a neural network down into smaller, more incremental "expert" networks, MoE epitomizes how AI is now being developed into a structure of more specialized component models, connected to one another, each good at things their larger monolithic counterparts are not.
Because any given "training corpus" (a body of knowledge or work) contains a huge amount of information, allowing an AI model to focus in one direction or another helps it better understand what is in front of it. Think of it as sitting in a room with a mixed group of experts, some of whom understand food science and gastronomy, while others have an innate understanding of rocket science. When a video showing how to make the perfect omelet plays, the rocket scientists mostly switch off or start thinking about lunch, while the food experts' eyes light up as they absorb the information. MoE models are built the same way, selectively engaging only when it matters and activating the relevant expert pathways within the neural network architecture.
"Depending on the type of input given, MoE models learn to selectively activate only the most relevant expert pathways in its neural network. This specialization massively enhances the model's efficiency. Google has been an early adopter of and pioneer in the MoE technique for deep learning through research," Hassabis explained in a Google AI blog post. "Our latest innovations in model architecture allow Gemini 1.5 to learn complex tasks more quickly and maintain quality, while being more efficient to train and serve. Our teams are now able to iterate, train and deliver more advanced versions of Gemini faster than ever before, and we are working on further optimizations."
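For readers who like to see the mechanics, here is a minimal NumPy sketch of the gating idea at the heart of MoE. The "experts" here are toy linear layers rather than full feed-forward sub-networks, and the routing logic is illustrative only, not Gemini's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
N_EXPERTS, D_IN, D_OUT, TOP_K = 4, 8, 8, 2

# Each "expert" is just a small linear layer here; in a real MoE model
# each would be a full sub-network with its own learned parameters.
experts = [rng.normal(size=(D_IN, D_OUT)) for _ in range(N_EXPERTS)]
gate_weights = rng.normal(size=(D_IN, N_EXPERTS))

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route the input through only the top-k most relevant experts."""
    scores = softmax(x @ gate_weights)    # gating: how relevant is each expert?
    top_k = np.argsort(scores)[-TOP_K:]   # keep the k highest-scoring experts
    out = np.zeros(D_OUT)
    for i in top_k:                       # only these experts run at all
        out += scores[i] * (x @ experts[i])  # blend their weighted outputs
    return out

print(moe_forward(rng.normal(size=D_IN)))
```

The efficiency gain comes from that top-k selection: the experts the gate ignores never execute, so most of the network's capacity sits idle on any single input.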
Providing tokens
With this next-generation version of Gemini, Google has increased the amount of information its AI models can process consistently, now running up to 1 million tokens. As explained here before, tokens are a core AI technique used to segment, define, and classify words, parts of words (even letters), or word components, enabling an AI model to learn to place relationships and values on pieces of information. We can add here that a token can also be an image, a video, audio, or code. The more tokens an AI model can process, the more knowledge it can potentially gain.
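As a toy illustration (not Gemini's actual tokenizer, which, like most production models, relies on a learned subword vocabulary along the lines of BPE or SentencePiece), the sketch below shows how a short string might be broken into subword tokens; the vocabulary is invented for the example.

```python
# Toy subword tokenizer: greedily match the longest known piece,
# falling back to single characters for anything unknown.

VOCAB = {"un", "believ", "able", "token", "s", " ", "!"}

def toy_tokenize(text: str) -> list[str]:
    tokens, i = [], 0
    while i < len(text):
        match = next((p for p in sorted(VOCAB, key=len, reverse=True)
                      if text.startswith(p, i)), text[i])
        tokens.append(match)
        i += len(match)
    return tokens

print(toy_tokenize("unbelievable tokens!"))
# ['un', 'believ', 'able', ' ', 'token', 's', '!']  -> 7 tokens

# A model's context window caps how many such tokens it can attend to
# at once; Gemini 1.5's stated window is 1,000,000 of them.
```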
Google's Chaim Gartenberg noted: "The full 1 million token context window is computationally intensive and still requires further optimizations to improve latency, but we are actively working on that as we scale it out."
Is it safe? According to Hassabis, this latest technology update has been built in line with Google's own AI Principles and robust safety policies.
"We ensure our models undergo extensive ethics and safety tests. We then integrate these research learnings into our governance processes and our model development and evaluations to continuously improve our AI systems," he said. "Since introducing Gemini 1.0 Ultra in December, our teams have continued refining the model, making it safer for a wider release. We have also conducted novel research on safety risks and developed red-teaming techniques to test for a range of potential harms."
The Gemini family
There is a whole "family" of Google Gemini options, from the company's Nano products designed for mobile phones, through Gemini Pro for developers, to the premium Gemini Ultra version. Whether the price and service feature differentiation Google currently offers will remain part of how it channels its products in future is unclear, but given the widespread trend toward diversification and specialization in AI itself, as demonstrated here, diversifying the offering makes sense.
The need to include small language models (SLMs), also known by "private AI" and various other names, in the LLM universe is now recognized, and the drive to broaden and extend the use of AI with fluid tokenization controls and Mixture-of-Experts (MoE) architectures is very much of the moment.
Did Google name it Gemini because it wanted us to think of its AI as the "twin" of our human existence? Apparently not. Most sources suggest the name marks the coming together of the company's two AI teams (Google Brain and Google DeepMind), but the astrological connection doesn't hurt either.