With the launch of the new Llama 3 model, Meta is preparing to bring artificial intelligence to WhatsApp, Instagram, and Facebook, a move that could ultimately put the model in the hands of more than 3 billion users every day. This unprecedented flow of human-AI interaction data could help Meta pull ahead in the AI race.
For the past year, OpenAI's GPT-4 has been the dominant large language model. While its popularity dwarfs that of competitors from Google, Meta, and Mistral, the gap in model quality keeps shrinking. Llama 3 is the second model this month, after Cohere's Command R+, to surpass the original GPT-4 on the LMSYS Chatbot Arena leaderboard. In just the past few days, Microsoft announced its Phi-3 models and Snowflake joined the competition with an LLM of its own. The language model space is becoming increasingly crowded, and its players are struggling to differentiate themselves.
Intense competition makes for a fast-moving space. A technological advance that is groundbreaking one day can become table stakes months later, as competitors quickly adapt their approaches. Architectural and algorithmic innovations have not yielded defensible competitive advantages, and despite the billions of dollars poured in by big tech companies and venture capital, neither have budget or scale. One way to secure a lasting advantage is to leverage more and better data.
A positive feedback loop for AI
Research accompanying recent LLM releases shows that even models trained on billions of words of text remain undertrained. Increasing the quantity and quality of training data is a proven way to improve performance. With the key players already using the best public training data, the next place to look is the use of LLMs themselves. The data generated as people query and steer language models through conversations offers more than sheer volume: it helps reinforce what the models already do well, and it pinpoints the areas where they struggle. Both OpenAI and Google offer their models directly to end users through chat interfaces and have already started building such datasets.
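The "undertrained" claim echoes the compute-optimal scaling results popularized by DeepMind's Chinchilla paper, which suggest roughly 20 training tokens per model parameter. A back-of-the-envelope sketch makes the point; note that the 20x heuristic and the Llama 3 training figure are assumptions drawn from outside this article:

```python
# Back-of-the-envelope check of the Chinchilla compute-optimal heuristic:
# roughly 20 training tokens per model parameter (Hoffmann et al., 2022).
# The specific figures below are illustrative assumptions, not from the article.

CHINCHILLA_TOKENS_PER_PARAM = 20

def chinchilla_optimal_tokens(n_params: float) -> float:
    """Approximate compute-optimal training token count for a model with n_params parameters."""
    return CHINCHILLA_TOKENS_PER_PARAM * n_params

# An 8B-parameter model would be "compute-optimal" at about 160B tokens...
optimal = chinchilla_optimal_tokens(8e9)
print(f"Chinchilla-optimal tokens for 8B params: {optimal / 1e9:.0f}B")

# ...yet Llama 3 8B was reportedly trained on ~15T tokens and was still
# improving, which is why more (and better) data remains such a powerful lever.
actual = 15e12
print(f"Ratio of actual to 'optimal' training data: {actual / optimal:.0f}x")
```

The asymmetry is the point: once architectures converge, whoever can keep feeding a model useful data well past the "optimal" point keeps gaining.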
When usage data drives quality, and quality drives usage, a positive feedback loop can turn the space into a near monopoly. We have seen this story before with search engines. Google initially gained the top spot thanks to an innovative ranking algorithm built on public data, but it has maintained and grown that lead by learning from the vast amounts of usage data its users generate. As a result, other players struggle to capture more than a small portion of the market. Even Microsoft CEO Satya Nadella testified last summer that Bing lags behind Google, attributing the gap to data.
Companies have yet to leverage this AI usage feedback loop effectively. The improvements to GPT-4 since its initial release can be attributed in part to usage data, but the changes feel more iterative than transformational. Yet OpenAI, with more than 100 million monthly active users, has built an unprecedented conversation dataset. With the long-awaited GPT-5, rumored for release this summer, we may finally see that data put into action. From there, OpenAI could continue to widen the gap and assert its dominance.
Even if the quality gains from usage data prove modest, a record of past interactions allows a model to tailor itself to each user better than any competitor can. According to Sam Altman, that is where the real long-term differentiation of AI models lies: they will be the ones that know you best.
Mainstream AI will be delivered through messaging apps and social networks
The data feedback loop and OpenAI's upcoming releases make Meta's timing especially interesting. Meta not only captures the time and attention of billions of people; it also owns the default places where most of them go to chat. Deploying Llama-based AI features directly into their inboxes could drive adoption even faster than ChatGPT's incredible ramp.
The rest of this year will determine which AI providers can pull ahead by leveraging user data. Players who own the virtual spaces where people interact with their models are far better positioned than those who only provide models and cannot collect data. This is good news for OpenAI, Microsoft, Google, and Meta, but non-big-tech competitors like Mistral and Cohere may be left out.
Next year's AI space is likely to have very different competitive dynamics than what we see today. Currently, abundant funding and competitive open-source models have made the most powerful AI capabilities incredibly cheap, but those price points are unlikely to hold if only the largest companies can offer the best AI experiences. Meta's current habit of open sourcing its models makes it the friendliest option for now, but it may reconsider if other open-source players fall behind. Today's state-of-the-art AI is perhaps the most open and accessible it has ever been. We would be wise to enjoy it while it lasts.
Jeroen Van Hout is co-founder and CPO/CTO of TechWolf, a VC-backed AI/HR technology startup based in Ghent, Belgium. He has written several research papers on applying artificial intelligence to human resources data.