Last weekend, X.ai released Grok-1, the largest “open source” large language model (LLM) to date. At 314 billion parameters, it far exceeds previous record holders like the 180-billion-parameter Falcon 180B.
While that's impressive, the timing raised eyebrows: the release came just weeks after X.ai's founder, Elon Musk, filed a lawsuit against OpenAI over its alleged lack of openness. Is Grok-1 a genuine effort to advance open source AI, or is it a ploy for Musk's company to prove it's more open than its rivals?
“One strategy is true open source. And the second strategy is more of an ‘open-weight’ strategy. What X.ai did here is go with the open-weight strategy.” —Mahyal Ali, Smodin
“There are two open source strategies that large AI companies typically employ. One strategy is true open source, and the second strategy is more of an ‘open-weight’ strategy,” said Mahyal Ali, product lead at Smodin, an AI company based in Casper, Wyoming. “What X.ai did here is go with the open-weight strategy.”
Grok hasn't settled the AI open source debate
The definition of “open source” is a controversial topic in the AI community, as Meta's Llama 2 demonstrates. Although Meta provides the model's weights, evaluation code, and documentation, the company does not reveal the model's training data or the code used to train it. There are also commercial restrictions: anyone who wants to use Llama 2 in a product or service with more than 700 million monthly active users must request a license from Meta.
Grok-1 wins on licensing. It is released under Apache 2.0, a popular open source license introduced more than 20 years ago. “The Apache 2.0 license is important because the definition of open source varies widely. The gold standard is the Apache 2.0 license,” said Cameron R. Wolfe, director of AI at Minneapolis-based e-commerce platform Rebuy Engine. The license allows use by organizations of any size and places no restrictions on commercial use.
“The difference between Grok-1 and anything that's been released so far from an open source perspective is that Grok is so much bigger. This is great because it's probably closer to what OpenAI is doing.” —Cameron R. Wolfe, Rebuy Engine
However, although the license is open, Grok's release did not come with the documentation and benchmarks that Meta published alongside Llama and Llama 2. And like Meta, X.ai does not disclose how the model was trained, nor has the company released the code used to do so. Wolfe contrasted Grok's release with OLMo, an open-source LLM that includes not only model weights and documentation but also training code, logs, and metrics. “It's open source, but [X.ai is] holding things back,” Wolfe said.
For Ali, the lack of data in Grok's release is a problem because it limits Grok's usefulness in AI research. Anyone can download the model weights and deploy the model, but without the training data and code, the model can prove difficult to analyze and understand. “If a company releases a model, can you replicate that model? That's what open source is,” Ali said.
More parameters, more problems?
The most notable thing about the Grok-1 release is its sheer size. Its 314 billion parameters are about 4.5 times as many as those of Meta's largest Llama 2 model, Llama-2-70b.
“The difference between Grok-1 and anything that has been released so far from an open source perspective is that Grok is so much bigger, which is great because it's probably closer to what OpenAI is doing,” Wolfe said. Grok's scale could help open source catch up to higher-performing closed models like OpenAI's GPT-4 and Anthropic's Claude 3 Opus. (In both cases the parameter counts remain undisclosed, but common estimates put each well above 1 trillion.)
Grok-1 could serve as a warning for future open source models: pursue size at your own risk.
But size can be both a blessing and a curse. Adding parameters improves the quality of the model, but it also makes the model more difficult for developers to deploy.
“It's relatively easy for small businesses and the open source community to fine-tune small models,” Ali said. “But with a model this big, it's hard to even load [it]. You'll need a GPU that costs about $15 to $20 per hour [to rent] just to run this model. And you'll need 20 to 30 of them to fine-tune it.”
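Ali's numbers hold up to simple arithmetic. The sketch below is a rough back-of-envelope estimate, not an X.ai figure; the 16-bit precision, the 80-gigabyte data-center GPU, and the 4x fine-tuning overhead are all assumptions chosen to illustrate the scale.

```python
# Back-of-envelope memory estimate for a 314-billion-parameter model.
# Assumptions (not from X.ai): 2 bytes per parameter at 16-bit precision,
# 80 GB of memory per data-center GPU, and roughly 4x the weight memory
# for full fine-tuning (gradients plus optimizer state), a common rule
# of thumb for Adam-style optimizers.
import math

PARAMS = 314e9
BYTES_PER_PARAM_FP16 = 2
GPU_MEMORY_GB = 80

weights_gb = PARAMS * BYTES_PER_PARAM_FP16 / 1e9           # ~628 GB
gpus_to_load = math.ceil(weights_gb / GPU_MEMORY_GB)       # 8 GPUs just to hold the weights

finetune_gb = weights_gb * 4                               # ~2.5 TB with gradients and optimizer state
gpus_to_finetune = math.ceil(finetune_gb / GPU_MEMORY_GB)  # 32 GPUs

print(f"Inference at 16-bit: {weights_gb:.0f} GB -> {gpus_to_load} x {GPU_MEMORY_GB} GB GPUs")
print(f"Full fine-tuning:    {finetune_gb:.0f} GB -> {gpus_to_finetune} x {GPU_MEMORY_GB} GB GPUs")
```

Even this optimistic estimate, which ignores activations and batch sizes entirely, lands in the same range as Ali's figure of 20 to 30 GPUs for fine-tuning.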
This issue is particularly relevant to Grok-1 because, unlike most models released to the public, it has not been fine-tuned. That means it is not adapted for specific uses (like chat) and lacks the built-in reliability and safety measures that fine-tuned models often include.
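To make concrete what “not adapted for chat” means, here is a minimal sketch of the difference between a base model and a chat fine-tune, using the Hugging Face transformers API. The model names are hypothetical placeholders; Grok-1 itself shipped as raw weights with its own example code, not as a ready-to-use transformers checkpoint.

```python
# Minimal sketch: base model vs. chat-tuned model.
# Both model names below are hypothetical placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

# A base model (like Grok-1 as released) only continues text. There is no
# notion of "user" or "assistant" turns, and no trained-in safety behavior.
tok = AutoTokenizer.from_pretrained("some-org/base-model")           # hypothetical
model = AutoModelForCausalLM.from_pretrained("some-org/base-model")  # hypothetical
inputs = tok("The capital of France is", return_tensors="pt")
print(tok.decode(model.generate(**inputs, max_new_tokens=8)[0]))

# A chat fine-tune expects a structured conversation. The tokenizer's chat
# template wraps messages in the special tokens the fine-tune was trained
# on; that training is where instruction-following and safety behavior live.
chat_tok = AutoTokenizer.from_pretrained("some-org/chat-model")      # hypothetical
messages = [{"role": "user", "content": "What is the capital of France?"}]
prompt = chat_tok.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
```

Until X.ai (or the community) produces such a fine-tune, developers get only the first half of this picture.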
Wolfe said this is not an unusual move; OLMo also released a base model first, followed by fine-tuned versions. On the other hand, Musk's social media posts opposing what he calls “woke” AI suggest that the lack of safety measures may be viewed as a feature rather than a bug. X.ai has not said when, or whether, it will release a fine-tuned model.
Previous large open source LLMs (such as the Falcon 180B, whose logo is pictured here) were not popular in the open source community. Technology Innovation Institute
The history of similar models suggests that fine-tuning and deploying Grok-1 will be difficult. Previous large releases, including Meta's Galactica 120B and TII's Falcon 180B, were clearly intended to bring the benefits of large-scale models to the broader AI development community. Yet those models never caught on: on the Hugging Face Open LLM Leaderboard, models with between 7 billion and 72 billion parameters still dominate. If Grok-1 fails to buck that trend, it will serve as a warning for future open source models: pursue size at your own risk.
“Grok attracted a lot of headlines. … But in the long run, I think it will be forgotten like other big models,” Ali said. “What works are small-scale models that individual researchers can use and scale. I think this is hype and it's going to go away quickly.”