Amazon, Alphabet, Meta, and Microsoft are the four largest companies in AI cloud computing. They are also Nvidia's biggest buyers. Nvidia's graphics processing units (GPUs) have become essential for training AI software and running inference applications from AI models.
Another thing the cloud giants have in common is that they build their own chips to supplement Nvidia's supply.
While none of the Cloud Four are likely to sell chips to companies like Dell, Super Micro, or other hardware providers, their chips could still take a bite out of Nvidia's 90%-plus share of the AI GPU market.
Here's why: All four cloud providers want and need to manage rapidly increasing capital expenditure budgets. Meta, Microsoft, Amazon, and Google parent Alphabet are investing billions in AI infrastructure. Combined capital spending by the companies is expected to reach $178 billion in fiscal year 2024, up more than 26% from a year ago, according to FactSet estimates.
Much of the cost control is achieved by designing a proprietary "stack" that includes software, hardware, and chips, similar to Apple's approach.
Amazon entered the chip business in 2015 with the acquisition of Israeli chip design company Annapurna Labs. The deal has produced three chip lines: a CPU called Graviton, which Amazon says offers 40% better price performance than comparable x86 chips; Trainium, for training large AI models; and Inferentia, for AI inference workloads.
Gadi Hutt, director of business development at Annapurna, said Amazon started looking at the AI chip market in 2016. "There was no gen AI or LLMs [large language models], but machine learning was growing as a use case for Amazon and our customers," he said. "They all came to us with basically the same product description: We want to do more, but it's too expensive." Solving that problem is the foundation of Amazon's AI approach.
“Compared to other solutions available in the cloud (mainly Nvidia), we get cost savings and high performance, sometimes even higher performance,” Hutt says.
Amazon has signed up some of the companies most invested in AI, including OpenAI rival Anthropic, which uses Trainium to train its AI models. Airbnb, ByteDance, Snap, and Deutsche Telekom are Inferentia customers. Amazon uses its own chips to run Alexa, Amazon Ads, and the new Rufus shopping bot.
Like Amazon, Alphabet's Google has been working on AI chips for nearly a decade, and like Amazon, it sees an advantage in being able to build complete systems that include not just chips but also the associated software, networking infrastructure, and storage. Google calls its chips "tensor processing units," or TPUs. ("Tensor" refers to matrix multiplication, the math behind AI software.) Google's fifth-generation TPUs come in two flavors: one focused on performance, for training models at maximum speed, and one tuned for optimal efficiency.
Google uses TPUs to build large language models, including the widely deployed Gemini software. It also uses TPUs to power Gmail and other services. Among the many startups relying on Google TPUs are Character AI, which creates personalized chatbots, and Midjourney, which provides text-to-image software.
Mark Lohmeyer, Google Cloud's vice president of compute and machine learning infrastructure, says the effort began more than a decade ago with a thought experiment: how much computing power would Google need if every user interacted with voice-prompted search for a few minutes a day? That question, he says, spurred the company's development of processors for AI and machine learning.
Earlier this month, Meta announced its second-generation AI chip, called MTIA (Meta Training and Inference Accelerator). The Facebook and Instagram parent uses its MTIA chips, also made by TSMC, to power its social-media ranking and advertising models.
“MTIA is a long-term venture that provides the most efficient architecture for Meta's unique workloads,” the company said in a blog post announcing the new chip. “As AI workloads become increasingly important to our products and services, these efficiencies will improve our ability to deliver the best experience to our users around the world.”
Microsoft is the furthest behind of the cloud players when it comes to chips, but it has entered the race. Last year, the company announced the Azure Maia 100 AI Accelerator, a chip for model training and inference. So far, Microsoft has limited its use to internal workloads such as GitHub Copilot, Bing, and OpenAI's GPT-3.5, but customers will eventually get access as well.
“We are optimizing and integrating every layer of the stack,” said Rani Borkar, corporate vice president, Microsoft Azure Hardware Systems and Infrastructure. “It's not just about silicon. We're rethinking every layer from silicon to servers to systems to data center infrastructure to maximize performance and efficiency.”
Borkar added that Microsoft wants to offer "options" to its cloud customers. As she puts it: "It's not this or that. It's this and that. … We need diverse suppliers."
Email Eric J. Savitz at eric.savitz@barrons.com.