In an era shaped by artificial intelligence, Google has unveiled Ironwood, its seventh-generation Tensor Processing Unit (TPU). The chip is more than an incremental upgrade: it is designed first and foremost for inference, the phase in which trained AI models interpret and respond to real-world queries.
Performance and Power
Ironwood is built to be transformative. Compared with its predecessor, Trillium (TPU v6e), it delivers four times the performance per chip, whether used for training new models or serving them in production. Against TPU v5p, the leap is even more striking: a tenfold increase in peak performance.
Each Ironwood chip delivers a peak of 4,614 teraflops of FP8 compute. Linked together into pods of 9,216 chips, their combined throughput reaches 42.5 exaflops. To appreciate this scale: that is more than 24 times the 1.7 exaflops of El Capitan, the world's largest supercomputer (though El Capitan's figure is measured at the much stricter FP64 precision, so the comparison is indicative rather than direct). This capacity makes Ironwood a foundation for running and improving the largest, most advanced language models.
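The pod-level figure follows directly from the per-chip rating. A short sanity check, assuming the pod size of 9,216 chips stated in Google's announcement:

```python
# Sanity-check the pod-scale arithmetic from the figures above.
# Assumption: an Ironwood pod contains 9,216 chips (per Google's announcement).
PER_CHIP_TFLOPS = 4_614          # peak teraflops per Ironwood chip
CHIPS_PER_POD = 9_216            # assumed pod size
EL_CAPITAN_EFLOPS = 1.7          # El Capitan peak, in exaflops (FP64)

pod_eflops = PER_CHIP_TFLOPS * CHIPS_PER_POD / 1_000_000  # teraflops -> exaflops
ratio = pod_eflops / EL_CAPITAN_EFLOPS

print(f"Pod compute: {pod_eflops:.1f} exaflops")  # -> 42.5 exaflops
print(f"vs. El Capitan: {ratio:.1f}x")
```

The product works out to about 42.5 exaflops, matching the announced pod figure; the ratio to El Capitan lands near 25 with the rounded 1.7-exaflop input, consistent with the "more than 24 times" claim.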
A Leap in Memory and Connectivity
Ironwood’s design addresses a central challenge in AI inference: moving and processing massive amounts of data quickly. Each chip carries 192 gigabytes of HBM3e memory, six times the capacity of Trillium, with 7.37 terabytes per second of memory bandwidth per chip, easing the data-movement bottlenecks that commonly stall inference workloads.
Communication between chips is equally advanced. The bidirectional inter-chip interconnect operates at 1.2 terabytes per second, allowing chips to cooperate in large networks without slowing down. At pod scale, 1.77 petabytes of high-bandwidth memory are shared across 9,216 chips, keeping vast datasets and model weights close at hand for immediate use.
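The memory figures tie out the same way: the pod total is simply the per-chip HBM capacity aggregated over the pod (again assuming 9,216 chips per pod, per Google's announcement):

```python
# Check that per-chip HBM capacity aggregates to the stated pod total.
# Assumption: 9,216 chips per pod (per Google's announcement).
HBM_PER_CHIP_GB = 192
CHIPS_PER_POD = 9_216

pod_memory_pb = HBM_PER_CHIP_GB * CHIPS_PER_POD / 1_000_000  # GB -> PB (decimal)
print(f"Pod HBM: {pod_memory_pb:.2f} PB")  # -> 1.77 PB
```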
Efficiency and Engineering Discipline
Ironwood’s remarkable power is matched by its efficiency. With twice the performance per watt of Trillium, it increases speed while reducing the energy required per unit of work. Operating a pod of thousands of Ironwood chips still demands significant electricity, about 10 megawatts, yet the efficiency gains set a new industry standard for high-volume AI workloads.
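Dividing the two pod-level figures gives a rough sense of what that efficiency means. This is a back-of-envelope estimate, not a chip specification, since the 10-megawatt figure covers the whole pod, including networking and supporting infrastructure:

```python
# Back-of-envelope efficiency from the stated pod figures.
# Note: 10 MW covers the entire pod (chips, networking, etc.), so this is
# a rough system-level estimate, not a per-chip specification.
POD_EFLOPS = 42.5
POD_POWER_MW = 10

tflops_per_watt = (POD_EFLOPS * 1e6) / (POD_POWER_MW * 1e6)  # EF->TF, MW->W
print(f"~{tflops_per_watt:.2f} TFLOPS per watt at pod scale")  # -> ~4.25
```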
Every aspect of Ironwood’s architecture strives to minimize unnecessary data movement and reduce wait times for large-scale calculations. This commitment is crucial for delivering AI results with low latency, keeping responses fast and reliable for users and applications around the world.
Reliability and Security
Trust and reliability are foundational to Ironwood’s design. Each chip includes an integrated hardware root of trust, along with built-in self-test routines and monitoring that can detect rare but critical faults such as silent data corruption. On-chip arithmetic checks help verify that computations are sound, protecting both data integrity and user confidence.
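The idea behind a lightweight arithmetic check can be illustrated in software. The sketch below is not Google's mechanism, which is implemented in hardware and not publicly detailed; it uses Freivalds' algorithm, a classic probabilistic test that can detect a corrupted matrix-multiply result far more cheaply than recomputing the product:

```python
import random

def freivalds_check(A, B, C, trials=20):
    """Probabilistically verify that A @ B == C using O(n^2) work per
    trial instead of the O(n^3) cost of recomputing the full product."""
    n = len(A)
    for _ in range(trials):
        r = [random.randint(0, 1) for _ in range(n)]
        # Compute B @ r, then A @ (B @ r), and compare against C @ r.
        Br = [sum(B[i][j] * r[j] for j in range(n)) for i in range(n)]
        ABr = [sum(A[i][j] * Br[j] for j in range(n)) for i in range(n)]
        Cr = [sum(C[i][j] * r[j] for j in range(n)) for i in range(n)]
        if ABr != Cr:
            return False  # corruption detected
    return True  # consistent with A @ B == C (with high probability)

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C_good = [[19, 22], [43, 50]]  # correct product
C_bad = [[19, 22], [43, 51]]   # one silently flipped entry

print(freivalds_check(A, B, C_good))  # True
print(freivalds_check(A, B, C_bad))   # False (with overwhelming probability)
```

Each trial misses a corrupted entry with probability at most 1/2, so twenty trials drive the chance of an undetected error below one in a million, which is the same trade hardware integrity checks make: cheap redundancy in exchange for near-certain detection.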
Shaping the Future of AI Deployment
Ironwood’s debut has already begun to reshape the competitive landscape of AI. Anthropic, creator of Claude, has committed to using up to one million of Google’s TPU chips, an industry endorsement on a massive scale and a signal of a shift away from purely GPU-based infrastructure.
After a limited preview in the first half of 2025, Ironwood is now being rolled out more broadly, with general availability expected shortly.
The Dawn of the “Age of Inference”
Ironwood stands as a pivotal advancement, ushering in what Google calls the “age of inference.” For years, AI innovation centered on the training of ever-larger models. Now, the spotlight shifts to applying these models at scale, powering applications that touch lives daily.
Through its balance of raw computing power, memory bandwidth, energy efficiency, and security, Ironwood embodies a new philosophy: one devoted not only to making greater AI possible, but also to making it sustainable, accessible, and dependable for the world.
