The Battle of the AI Chips Among Apple, Microsoft, NVIDIA, and Google

Updated: May 09 2024 20:52


As AI continues to revolutionize the tech industry, major players like Apple, Microsoft, NVIDIA, and Google are investing heavily in custom silicon to power their AI-focused data centers. With the growing demand for efficient, scalable, and sustainable compute power, these companies are reimagining their infrastructure to meet the needs of their customers and stay ahead in the AI race.

According to recent reports from Bloomberg and the Wall Street Journal, Apple is developing its own server chips, similar to those designed for the Mac, to process the most advanced AI tasks coming to Apple devices. In other words, Apple plans to leverage its powerful Apple silicon designs to power AI-focused data centers.

Apple New AI Chip for Data Centers

The first AI-focused chip in Apple's data centers will be the M2 Ultra, which currently powers the Mac Pro and Mac Studio. It’s likely that a future M4 Ultra will soon follow. This move highlights Apple's strategy to unify its product lineup with Apple silicon, a strength the company has been leveraging since the introduction of the M1 chip in 2020.


Apple's decision to apply its chip-design strengths to AI-focused data centers highlights the company's commitment to staying competitive in the AI landscape. Unifying its product lineup around Apple silicon has already paid off with the Mac's M1 chip, and harnessing the same chip team to build AI-focused data centers could prove to be a long-term strength that Apple's competitors may struggle to match.

Microsoft’s Maia and Cobalt AI Chips


Microsoft has also unveiled two custom-designed chips and integrated systems: the Microsoft Azure Maia AI Accelerator and the Microsoft Azure Cobalt CPU. The Azure Maia AI Accelerator is optimized for AI tasks and generative AI, while the Azure Cobalt CPU is an Arm-based processor tailored to run general-purpose compute workloads on the Microsoft Cloud.


These chips represent Microsoft's effort to deliver infrastructure systems that are designed from top to bottom and can be optimized with internal and customer workloads in mind. The chips started rolling out to Microsoft's data centers in early 2024, initially powering the company's services such as Microsoft Copilot and Azure OpenAI Service.
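
From a developer's point of view, the underlying silicon is abstracted behind Azure's service APIs. Below is a minimal sketch of calling Azure OpenAI Service with the official openai Python client; the endpoint, API key, deployment name, and API version are placeholders rather than details from this article.

```python
# Minimal sketch: calling Azure OpenAI Service with the openai Python client.
# The endpoint, API key, deployment name, and API version are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",  # placeholder endpoint
    api_key="YOUR_API_KEY",                                   # placeholder key
    api_version="2024-02-01",                                  # assumed API version
)

response = client.chat.completions.create(
    model="my-gpt-4-deployment",  # your Azure deployment name, not the base model name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize what an AI accelerator does."},
    ],
)

print(response.choices[0].message.content)
```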


Microsoft's approach to building its own custom silicon allows the company to target specific qualities and ensure that the chips perform optimally on its most important workloads. Its testing process includes characterizing how every chip performs under different frequency, temperature, and power conditions to reach peak performance, and testing each chip in the same conditions and configurations it would experience in a real-world Microsoft data center.

NVIDIA's B200 Blackwell AI Chip

At GTC 2024, NVIDIA introduced its latest flagship chip, the B200, based on the Blackwell architecture. The chip is up to 30 times faster than its predecessor, the H100, at certain tasks such as serving up answers from chatbots. The B200 combines two dies, each the size of the company's previous flagship, into a single component, significantly boosting performance.


This Blackwell chip will cost between $30,000 and $40,000 per unit, according to CEO Jensen Huang.

"We had to invent some new technology to make it possible," Huang said, estimating that NVIDIA spent about $10 billion in research and development costs.



NVIDIA's chips are used by major customers, including Amazon, Google, Microsoft, OpenAI, and Oracle, for their cloud-computing services and AI offerings. The company has also introduced new software tools, called microservices, to help developers more easily sell AI models to companies using NVIDIA technology.
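
As a rough illustration of how these microservices are consumed: NVIDIA's NIM-style containers typically expose an OpenAI-compatible HTTP endpoint, so a deployed model can be queried with a plain REST call. The sketch below assumes a hypothetical instance running locally on port 8000 and a Llama-style model identifier; both are placeholders, not details from this article.

```python
# Hedged sketch: querying a locally hosted NIM-style microservice through its
# OpenAI-compatible chat completions endpoint. The URL and model name are
# assumptions for illustration only.
import requests

NIM_URL = "http://localhost:8000/v1/chat/completions"  # hypothetical local endpoint

payload = {
    "model": "meta/llama3-8b-instruct",  # example model identifier; substitute your own
    "messages": [
        {"role": "user", "content": "Explain what an AI inference microservice is."}
    ],
    "max_tokens": 128,
}

resp = requests.post(NIM_URL, json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```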

More NVIDIA Blackwell details can be found here: NVIDIA Blackwell Architecture Technical Brief

NVIDIA Blackwell-Powered DGX SuperPOD

The NVIDIA DGX SuperPOD™, powered by NVIDIA GB200 Grace Blackwell Superchips, is built for processing trillion-parameter models with constant uptime for superscale generative AI training and inference workloads.

Featuring a highly efficient, liquid-cooled rack-scale architecture, the new DGX SuperPOD is built with NVIDIA DGX™ GB200 systems and provides 11.5 exaflops of AI supercomputing at FP4 precision and 240 terabytes of fast memory — scaling to more with additional racks.


Each DGX GB200 system features 36 NVIDIA GB200 Superchips — which include 36 NVIDIA Grace CPUs and 72 NVIDIA Blackwell GPUs — connected as one supercomputer via fifth-generation NVIDIA NVLink®. GB200 Superchips deliver up to a 30x performance increase compared to the NVIDIA H100 Tensor Core GPU for large language model inference workloads.
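
Those headline numbers roughly line up if you assume a base SuperPOD of eight DGX GB200 systems, about 20 petaflops of FP4 compute per Blackwell GPU, and around 30 TB of combined HBM and LPDDR5X fast memory per system. None of those per-unit figures come from this article, so treat the sketch below as back-of-the-envelope arithmetic only.

```python
# Back-of-the-envelope check of the DGX SuperPOD headline figures.
# Assumed inputs (not stated in this article): 8 DGX GB200 systems per base
# SuperPOD, ~20 PFLOPS FP4 per Blackwell GPU, ~30 TB fast memory per system.
systems_per_superpod = 8
gpus_per_system = 72            # 36 GB200 Superchips x 2 Blackwell GPUs each
fp4_pflops_per_gpu = 20         # assumed FP4 throughput per GPU, in petaflops
fast_memory_tb_per_system = 30  # assumed HBM + LPDDR5X per DGX GB200 system

total_gpus = systems_per_superpod * gpus_per_system
total_exaflops = total_gpus * fp4_pflops_per_gpu / 1000  # petaflops -> exaflops
total_fast_memory_tb = systems_per_superpod * fast_memory_tb_per_system

print(f"GPUs: {total_gpus}")                       # 576
print(f"FP4 compute: ~{total_exaflops:.1f} EF")    # ~11.5 EF
print(f"Fast memory: ~{total_fast_memory_tb} TB")  # ~240 TB
```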

Google's Cloud TPU v5p

Google has recently announced the launch of its new Cloud TPU v5p, an updated version of its Cloud TPU v5e. The v5p pod consists of 8,960 chips and features Google's fastest interconnect yet, with up to 4,800 Gbps per chip.


Google claims that the v5p offers significant improvements in performance and efficiency compared to its predecessor, the TPU v4. The v5p is designed to train large language models such as GPT-3 (175B parameters) 2.8 times faster than the TPU v4 while being more cost-effective. Google's DeepMind and Research teams have observed 2x speedups for LLM training workloads using TPU v5p chips compared to the TPU v4 generation.
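
For a sense of what a v5p slice looks like from software, frameworks such as JAX simply see the slice as a grid of devices to shard work across. The snippet below is a minimal sketch that assumes it runs on an already provisioned Cloud TPU VM with JAX's TPU backend installed; it only enumerates devices and runs a trivially parallelized computation, not a real training job.

```python
# Minimal sketch, assuming a provisioned Cloud TPU VM with JAX's TPU backend.
# It lists the TPU devices visible to the process and runs a tiny computation
# replicated across the local cores.
import jax
import jax.numpy as jnp

print(f"Backend: {jax.default_backend()}, total devices: {len(jax.devices())}")

# Replicate a trivial element-wise computation across all local TPU cores.
n_local = jax.local_device_count()
x = jnp.ones((n_local, 1024))
y = jax.pmap(lambda row: row * 2.0)(x)
print(y.shape, float(y.sum()))
```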


TPUs have long been the basis for training and serving Google's AI-powered products like YouTube, Gmail, Google Maps, Google Play, and Android. Gemini, Google's most capable and general AI model, was trained on and is served using TPUs.

Google's Cloud AI Hypercomputer


Google also announced AI Hypercomputer from Google Cloud, a supercomputer architecture that employs an integrated system of performance-optimized hardware, open software, leading ML frameworks, and flexible consumption models. Traditional methods often tackle demanding AI workloads through piecemeal, component-level enhancements, which can lead to inefficiencies and bottlenecks. In contrast, AI Hypercomputer employs systems-level co-design to boost efficiency and productivity across AI training, tuning, and serving. Google Cloud customers will have access to VMs powered by both the NVIDIA HGX B200 and GB200 NVL72 GPUs.

AI Chips Comparison and Analysis

While Apple, Microsoft, NVIDIA, and Google are all investing heavily in AI-focused chips and infrastructure, their approaches differ in some key aspects.

Apple's strategy revolves around leveraging its expertise in chip design, which has already proven successful with its M-series chips for Macs, to create powerful AI-focused chips for its data centers. This approach allows Apple to maintain tight control over its hardware and software integration, potentially giving the company an edge in terms of performance and efficiency.

Microsoft, on the other hand, is taking a more holistic approach by designing custom chips, servers, and racks that are tailored specifically for its cloud and AI workloads. By having visibility into the entire stack, Microsoft aims to optimize every layer of its infrastructure to maximize performance, sustainability, and cost-effectiveness. The company's partnerships with NVIDIA and AMD also provide customers with a range of options for AI acceleration.

NVIDIA, as a dedicated GPU manufacturer, continues to push the boundaries of AI acceleration with its powerful GPU offerings. Its GPUs have become the de facto standard for AI training and inferencing in the cloud. OpenAI and Microsoft's adoption of NVIDIA's latest AI chipset technology highlights the importance of NVIDIA's role in the AI ecosystem.

Google's Cloud TPU v5p demonstrates the company's commitment to pushing the boundaries of AI hardware performance and efficiency. That said, NVIDIA remains very important to Google. In the same blog post announcing its latest AI chip, Google mentioned NVIDIA 20 times. The company said it is updating its A3 supercomputer, which runs on NVIDIA GPUs, and reminded customers that it is using NVIDIA's latest Blackwell chips in its AI Hypercomputer.

NVIDIA H100 Shipments Chart

Omdia Research has published interesting data on NVIDIA H100 shipments by customer in Q3 2023. According to its estimates, roughly 28% of Q3 revenue and 27% of revenue for the first nine months of 2023 was attributable to two major customers: Meta and Microsoft.


The Potential Impact of Apple's AI Chips

Apple's move to develop its own AI chips for data centers could have significant implications for the future of AI on Apple devices. By processing the most advanced AI tasks in the cloud, Apple can offload the heavy lifting from individual devices, allowing for more sophisticated AI features and better performance.


This approach also allows Apple to maintain tight control over the entire AI pipeline, from hardware to software, ensuring a consistent and secure experience for users. As AI becomes increasingly important in our daily lives, Apple's investment in AI-focused data centers could give the company a significant competitive advantage.

As Apple continues to develop its AI capabilities, we can expect to see more advanced AI features in future versions of iOS, iPadOS, and macOS. The company's unified approach to chip design and AI processing could lead to breakthroughs in areas such as natural language processing, computer vision, and machine learning.


Apple's expertise in chip design, Microsoft's holistic approach to infrastructure optimization, NVIDIA's cutting-edge GPU technology, and Google's custom TPUs all contribute to the rapidly evolving AI landscape. As these companies continue to innovate and compete, customers stand to benefit from more powerful, efficient, and cost-effective AI solutions in the cloud. Ultimately, the winner in this race will be the company that can deliver the most compelling combination of performance, efficiency, and value to its customers. With each of these companies bringing its unique strengths to the table, the future of AI in the cloud looks brighter than ever.

