AI hardware startup Cerebras Systems has announced a third-generation AI processor that it claims is the world's fastest. The company said today (March 13) that its WSE-3 chip delivers twice the performance of its previous-generation chip, which was the prior record holder.
“Once again, we have delivered the largest and fastest AI chip on the planet in the same dinner plate-sized form factor,” said Andy Hock, vice president of product management at Cerebras.
The Sunnyvale, California-based startup entered the hardware market in 2019 with the Wafer Scale Engine (WSE), an ultra-large 8-inch by 8-inch AI chip. It was 56 times larger than the largest GPU, with 1.2 trillion transistors and 400,000 computing cores, making it the fastest and largest AI chip available at the time.
Then, in 2021, Cerebras launched WSE-2, a 7-nanometer chip with 2.6 trillion transistors and 850,000 cores, doubling the performance of the original.
900,000 cores
The company today nearly doubled its performance again with its WSE-3 chip, which has 4 trillion transistors and 900,000 cores and delivers 125 petaflops of performance. The new 5-nanometer processor powers Cerebras' new CS-3 AI server, designed to train the largest AI models.
“CS-3 is a huge step forward for us,” Hock told Data Center Knowledge. “It's twice the performance of the CS-2 [server]. This means training large AI models will be twice as fast with the same power consumption, and available at the same price [as the CS-2] to customers.”
Since its launch, Cerebras has positioned itself as an alternative to Nvidia GPU-powered AI systems. The startup's pitch is that customers can run AI training on Cerebras hardware using significantly fewer chips, rather than thousands of GPUs.
“One [Cerebras] server can do the same work as 10 racks of GPUs,” said Karl Freund, founder and principal analyst at Cambrian AI Research.
Cerebras enters the AI market
Nvidia dominates the AI market, capturing about 85% of the AI chip market with its GPUs, while the remaining players, such as AMD, Intel, Google, AWS, Microsoft, and Cerebras, account for about 15%, the analyst said.
While it has yet to be proven that competitors can seize a significant portion of market share from Nvidia, Cerebras has been successful since launching its first product five years ago, Freund said, calling it the most successful AI startup.
“Cerebras took a completely different approach from the beginning,” he said. “Everyone else is trying to beat Nvidia, which is really difficult. Cerebras did something no one had done before: build an entire wafer-scale AI engine. Its advantage is incredibly high performance.”
Cloud access
Cerebras does not sell its processors on their own. A company spokesperson said the company makes money by selling servers powered by the chips, which cost millions of dollars each. Cerebras offers its CS-3 systems to customers via the cloud, but also sells to large enterprises, government agencies, and international cloud providers.
For example, Cerebras recently added medical provider Mayo Clinic to its growing roster of customers, which also includes Argonne National Laboratory and pharmaceutical giant GlaxoSmithKline.
In July 2023, Cerebras announced it had signed a $100 million contract to build the first of nine interconnected, cloud-based AI supercomputers for G42, a technology holding group based in the United Arab Emirates.
Since then, the companies have built two supercomputers with a total of 8 exaflops of AI compute. Accessible via the cloud, the supercomputers are optimized for training large language models and generative AI models and are used by organizations across a variety of industries for climate, health, energy research, and other projects.
Cerebras and G42 are currently building their third supercomputer, Condor Galaxy 3, in Dallas. This supercomputer is powered by 64 CS-3 systems and delivers 8 exaflops of AI compute. The two companies plan to complete nine supercomputers with a total computing power of 55.6 exaflops by the end of 2024.
“The fact that Cerebras produced a third-generation wafer-scale engine is a testament to the company's customer traction. They generated the revenue necessary to pay for all the engineering,” Freund said.
By the numbers: WSE-3 chip and CS-3 AI system
Cerebras' WSE-3 has 52 times more cores than Nvidia's H100 GPU. Compared to the Nvidia DGX H100 system, the Cerebras CS-3 system powered by the WSE-3 chip trains models eight times faster, has 1,900 times more memory, and can train AI models with up to 24 trillion parameters, which Cerebras executives say is 600 times larger than what the DGX H100 can handle.
A Llama 70-billion-parameter model that takes 30 days to train on GPUs can be trained in one day using a CS-3 cluster, Hock said.
Cerebras partners with Qualcomm on AI inference
Cerebras' hardware is focused on AI training, so until now the company didn't have an answer for its customers' AI inference needs. A new partnership with Qualcomm changes that.
The companies announced today that they have collaborated to ensure that models trained on Cerebras hardware are optimized to run inference on Qualcomm's Cloud AI 100 Ultra accelerator.
“They have optimized the output of the large CS-3 machine to work very well with these very low-cost, low-power Qualcomm AI inference engines,” Freund said.