Arm Newsroom Blog

Is Computing Facing An Energy Crisis?

Arm Fellow & Director of Technology Rob Aitken explores how much-needed efficiency gains can be had by combining CPUs, NPUs, GPUs and networking processors in novel ways.
By Rob Aitken, Fellow & Director of Technology, Arm
Performance per Watt

Is the end near? If the topic is energy efficiency gains in computing, the answer depends on whom you ask.

The steady increase in performance per watt over the decades has been one of the most important drivers in our industry. Last year I was thumbing through a neighbor’s 1967 Motorola IC catalog that featured such space-age wonders as a small control chip of the sort that went into the Apollo moon mission. While cutting edge then, if you tried to build a smartphone with it today, the phone would consume about 16MW of power and take up 12 football fields. You’d think twice before signing up for a cell plan.

Skeptics believe we are headed for choppier waters. Moore’s Law is delivering diminishing returns. Meanwhile, techniques that have kept data center power consumption flat for the past 15 years (virtualization, ambient cooling, workload consolidation, unplugging “zombie” servers) have already been exploited fairly extensively. Many cutting-edge data centers already tout Power Usage Effectiveness (PUE) ratings of close to 1, meaning that almost all of the energy goes to running IT equipment. Further improvements will require innovation in core computing architecture.
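To make the PUE point concrete, here is a minimal sketch of the arithmetic. PUE is the ratio of total facility energy to the energy that reaches IT equipment, so a rating of 1.0 would mean no overhead at all. The meter readings below are hypothetical and chosen only for illustration; they do not describe any real data center.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """PUE: total facility energy divided by IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def it_energy_share(pue_value: float) -> float:
    """Fraction of facility energy that reaches IT equipment (1 / PUE)."""
    return 1.0 / pue_value


# Hypothetical daily readings in kWh (assumed values, not measurements).
legacy = pue(total_facility_kwh=2000.0, it_equipment_kwh=1000.0)
modern = pue(total_facility_kwh=1100.0, it_equipment_kwh=1000.0)

for name, value in (("legacy", legacy), ("modern", modern)):
    print(f"{name}: PUE={value:.2f}, IT share={it_energy_share(value):.1%}")
```

Once a facility is already near 1.1, cutting the remaining overhead to zero would save less than a tenth of the total, which is why the skeptics say the next gains have to come from the computing itself.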

AI turns up the heat

Worse, AI will turn up the heat. We’re graduating from basic AI problems (finding cat videos!) to more energy-intensive tasks like autonomous driving or medical diagnostics. Applied Materials warns that without advances in materials, chip designs and algorithms, data center power could rise from 2 percent of worldwide electricity consumption to 10 or even 15 percent.

On the other hand, the optimists have a compelling argument: we’ve heard it before. In 1999 some predicted the Internet might consume half of the grid in ten years. That scary future was avoided through leapfrog innovations like FinFETs, but also through steady improvements in overall system design and mapping algorithms to hardware. Good engineering, they argue, still has quite a bit of headroom.

Plus, you need to look at the big picture. Worldwide emissions dropped by 2.4 billion tons, or 7 percent, in 2020 as videoconferencing replaced commuting and business trips. While travel will likely rebound, a good portion of meetings will stay on Zoom. Similarly, smart devices and artificial intelligence (AI) are being deployed to help curb the estimated 30 percent of power that gets wasted in buildings. Electronics, one can argue, can deliver a net benefit to the environment.

Nonetheless, many optimists are also reluctant to look beyond a two-to-three-year horizon. So who’s right? Both sides bring up very good points and the debate has certainly added a jolt to conference panels. But personally, I’m a cautious optimist. While Moore’s Law may be past its prime, the semiconductor industry has already launched into a design-centric era where gains will be mainly realized through innovations in SoC and core architectures instead of process shrinks. Large integrated caches and graphics processing unit (GPU) accelerators were arguably the first step in this era. 3D NAND was another major milestone: transistor stacking changed the design and economic equations for flash memory companies.

Focus on combining CPUs, GPUs and NPUs

Arm has been placing particular focus on exploring the synergies that can be achieved by combining processors (CPUs), neural processing units (NPUs), GPUs and networking processors or DPUs in novel ways. Combining CPUs and NPUs, for instance, has been shown to boost efficiency by 25x while increasing performance on tasks like inference by 50x over CPU-only solutions. For IoT devices, that means an ability to produce more precise, more interesting insights on a fixed energy budget that won’t tax batteries. You’ll see a similar philosophy with the Total Compute strategy coming to handhelds.

In data centers, AWS says its single-threaded, Arm-based 64-core Graviton2 processor provides more than 3X the performance per watt over more traditional multithreaded processors with fewer cores. Similarly, AWS says that over 70 percent of the instances available on EC2 take advantage of its Nitro system for offloading tasks like virtualization, security and networking to dedicated hardware and Arm-based silicon.

One of the next big milestones for us all will be the commercialization of chiplets. Chiplet designs allow companies to maximize yields and mix process manufacturing nodes for optimal effect. Chiplet designs, however, will also have a positive impact on the power-performance equation. Imagine a 4 x 4 array of chiplets each with 640 CPUs, 640 NPUs, and gigabytes of SLC all linked by a high-speed interconnect. Such a system could deliver petaflops of performance on around 1.4kW of power.
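As a rough back-of-the-envelope check on that thought experiment, the sketch below totals the processing elements and divides up the power budget. The array size, per-chiplet counts and 1.4kW figure come from the paragraph above; the 1 petaflop figure is an assumption used only as an illustrative floor, since the text says "petaflops" without giving a number.

```python
# Figures taken from the thought experiment in the text.
CHIPLETS = 4 * 4            # 4 x 4 array
CPUS_PER_CHIPLET = 640
NPUS_PER_CHIPLET = 640
TOTAL_POWER_W = 1400.0      # "around 1.4kW"

# Assumption: 1 petaflop as a lower bound for "petaflops of performance".
ASSUMED_FLOPS = 1e15

total_cpus = CHIPLETS * CPUS_PER_CHIPLET
total_npus = CHIPLETS * NPUS_PER_CHIPLET
power_per_chiplet_w = TOTAL_POWER_W / CHIPLETS
gflops_per_watt = ASSUMED_FLOPS / TOTAL_POWER_W / 1e9

print(f"CPUs: {total_cpus}, NPUs: {total_npus}")
print(f"Power per chiplet: {power_per_chiplet_w:.1f} W")
print(f"Efficiency at the assumed 1 PFLOPS: {gflops_per_watt:.0f} GFLOPS/W")
```

Under these assumptions each chiplet would have a budget of under 90W for 640 CPUs, 640 NPUs and their cache, which gives a sense of how tight the per-core power envelope would need to be.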

And what do we do when we tap out the gains there? Dig deeper with chip-level technologies like in-memory computing and computational storage. By some estimates, over 60 percent of total system energy is spent moving data between main memory and compute. We’ve only scratched the surface of what is possible at the device and circuit level.

Granted, these advances will take some very hard work, but I’m confident they can occur before we hit a power wall.

This blog originally appeared in Semiconductor Engineering

Discover Low Power Compute with Arm

Arm’s power-efficient technologies sit at the heart of billions of devices worldwide. Discover more about Arm products.

Any re-use permitted for informational and non-commercial or personal use only.

Editorial Contact

Brian Fuller and Jack Melling