Arm Newsroom Blog

AI data center CPU demand: Why agentic AI is scaling the role of CPUs

With always-on, agent-based systems, hyperscalers are scaling CPUs to maximize performance per watt, rack efficiency and return on capital.
By Arm Editorial Team

For much of the last decade, the data center conversation has revolved around accelerators. GPUs, TPUs and the like have dominated headlines, investor decks and infrastructure roadmaps as AI training workloads exploded in scale. But as AI moves from model experimentation into scaled-up products and user-facing applications, and increasingly into always-on, agent-based inference, a more profound shift is underway inside hyperscale data centers.

And amid this shift, the CPU’s role is becoming more crucial than ever, not as a legacy holdover, but as the orchestration and data-processing engine that makes modern AI systems viable at scale.

This shift helps explain a striking point from Arm’s recent quarterly earnings: Arm’s data center business is expected to match or surpass its smartphone business within the next few years. For investors, that statement signals more than growth. It reflects a structural change in how hyperscalers design, deploy and monetize AI infrastructure, which is why CPU scalability, efficiency and ease of system integration matter more than ever.

Why does AI growth increase CPU demand?

AI growth increases CPU demand because modern AI systems are becoming continuous, agent-based workloads that require scheduling, coordination, memory access, data retrieval, pre- and post-processing, security, and low-latency control across heterogeneous infrastructure. Accelerators handle model computation, but CPUs orchestrate the system around them. As AI moves from episodic training to always-on inference, hyperscalers need more high-core-count, power-efficient CPUs to improve accelerator utilization, rack efficiency, and return on capital.

How agentic AI changes data center CPU demand

Early AI infrastructure was built around sustained, high-intensity workloads: large-scale model training and high-throughput inference. In those environments, accelerators understandably took center stage.

That model no longer reflects reality.

As modern AI applications expand across enterprise platforms and user-facing products, they are increasingly agent-based. These are persistent systems that plan, reason, retrieve information, coordinate actions, and interact continuously with users and services, all while learning through these interactions.

Agentic AI systems don’t just run models; they orchestrate workflows and process data in real time across databases, web services and application layers. Agents don’t sleep. They schedule, retrieve context, manage memory and coordinate actions continuously.

Practically speaking, this means:

  • Continuous scheduling and coordination
  • Persistent memory access (KV cache, vector databases, context retrieval)
  • Pre- and post-processing around every model invocation
  • Secure, low-latency control paths between heterogeneous components

Those responsibilities fall squarely on the CPU.
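
To make that division of labor concrete, here is a minimal Python sketch of a single agent step. Everything in it is an illustrative assumption: `fake_model_call` stands in for an accelerator-backed model invocation, a plain dictionary stands in for a context store, and all names are hypothetical rather than taken from any real framework. It only shows that context retrieval, prompt assembly, post-processing and memory updates wrap every model call, and that this surrounding work runs on the CPU.

```python
import time
from dataclasses import dataclass


@dataclass
class StepTiming:
    """Accumulates wall-clock time spent on CPU-side work vs. the model call."""
    cpu_seconds: float = 0.0
    model_seconds: float = 0.0

    def cpu_share(self) -> float:
        total = self.cpu_seconds + self.model_seconds
        return self.cpu_seconds / total if total > 0 else 0.0


def fake_model_call(prompt: str) -> str:
    """Hypothetical stand-in for a model invocation on an accelerator."""
    return f"answer to: {prompt}\n"


def agent_step(request: str, memory: dict[str, str], timing: StepTiming) -> str:
    t0 = time.perf_counter()
    # CPU-side: retrieve context and assemble the prompt (pre-processing).
    context = memory.get(request, "")
    prompt = f"{context}\n{request}".strip()
    t1 = time.perf_counter()

    # Model invocation: in a real system this is the accelerator's share.
    raw = fake_model_call(prompt)
    t2 = time.perf_counter()

    # CPU-side: post-process the output and update persistent memory.
    result = raw.strip()
    memory[request] = result
    t3 = time.perf_counter()

    timing.cpu_seconds += (t1 - t0) + (t3 - t2)
    timing.model_seconds += t2 - t1
    return result


timing = StepTiming()
memory: dict[str, str] = {}
reply = agent_step("summarize ticket 17", memory, timing)
print(reply)
print(f"CPU-side share of step time: {timing.cpu_share():.0%}")
```

Because the model call here is a stub, the printed share says nothing about real systems. The point of the sketch is the shape of the loop: an always-on agent repeats these CPU-side stages around every invocation.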

Why always-on AI workloads need high-core-count CPUs

Agentic AI doesn’t just increase CPU importance; it changes CPU demand characteristics.

Instead of brief orchestration bursts around accelerator-heavy workloads, AI systems now spend a greater share of time in CPU-bound activities. These workloads require large numbers of power-efficient cores operating continuously, often within fixed power and cost envelopes.

This is not theoretical. Hyperscalers are scaling CPUs aggressively:

These are structural increases in CPU density — not incremental bumps. They reflect recognition that CPU-led orchestration and data processing are now critical limiting factors in AI data center scalability.

As AI workloads become continuous rather than episodic, core count and efficiency become defining metrics.

How CPU scaling improves AI data center economics 

For investors, the implications are fundamentally economic, not technical. Accelerator availability and model scope (larger, more capable foundation models, increasing parameter counts, multimodality and so on) are no longer the only limiting factors in AI data centers. Power, cooling and capital efficiency have joined the list: hyperscalers now operate within fixed energy envelopes and physical rack space constraints, and returns depend on how efficiently infrastructure is utilized. In this environment, maximizing output per rack, rather than peak performance in isolation, has become the defining metric for sustainable AI growth.
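
The difference between peak performance and output per rack can be shown with a small back-of-the-envelope calculation. The Python sketch below uses made-up numbers in arbitrary units; the two node configurations, their power draws and the rack power budget are hypothetical and do not describe any real product. It assumes the rack is limited only by power and that output scales linearly with node count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeConfig:
    name: str
    peak_output: float  # useful work per node per second (arbitrary units)
    watts: float        # power draw per node


def rack_output(node: NodeConfig, rack_power_w: float) -> float:
    """Total output of a rack whose only limit is its power budget."""
    nodes_that_fit = int(rack_power_w // node.watts)
    return nodes_that_fit * node.peak_output


# Illustrative values only: a node with higher peak output but higher power
# draw, and a node with lower peak output that is more power-efficient.
fast = NodeConfig("higher-peak", peak_output=100.0, watts=1000.0)
lean = NodeConfig("more-efficient", peak_output=80.0, watts=600.0)
rack_power_w = 12_000.0

for node in (fast, lean):
    print(node.name, rack_output(node, rack_power_w))
```

With these assumed figures, the node with the lower peak delivers more total output per rack, because more of them fit inside the same power envelope. Real deployments also face cooling, space and networking limits that this sketch ignores.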

And accelerators alone don’t solve for these constraints. In fact, without sufficient CPU capacity to orchestrate workloads efficiently, expensive AI accelerators can sit idle or underutilized.

Scalable Arm-based CPUs address this problem by enabling hyperscalers to deliver:

  • Always-on inference within fixed power budgets
  • Better accelerator utilization
  • Higher AI output per rack
  • System-level integration rather than bolt-on architectures

That is why CPU scaling and AI economics are now directly linked.
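
The link between CPU capacity and accelerator utilization can be sketched with a deliberately simple two-stage pipeline model in Python. It assumes the CPU stage prepares requests at a rate proportional to core count and the accelerator consumes them at a fixed rate, so the accelerator is busy for at most the fraction of time the CPU can keep it fed. The function name and all rates are hypothetical, and the model ignores queuing, batching and burstiness.

```python
def accelerator_utilization(
    cpu_cores: int,
    requests_per_core_per_s: float,
    accel_requests_per_s: float,
) -> float:
    """Fraction of time the accelerator is busy in a two-stage pipeline.

    Assumes CPU-side throughput scales linearly with core count and that
    the slower stage limits the whole pipeline.
    """
    if accel_requests_per_s <= 0:
        raise ValueError("accelerator rate must be positive")
    cpu_rate = cpu_cores * requests_per_core_per_s
    return min(1.0, cpu_rate / accel_requests_per_s)


# Illustrative values only: each core prepares 5 requests per second and the
# accelerator can serve 400 requests per second.
for cores in (32, 64, 96):
    util = accelerator_utilization(cores, 5.0, 400.0)
    print(f"{cores} cores -> accelerator utilization {util:.0%}")
```

Under these assumptions, utilization rises with core count until the accelerator becomes the bottleneck, after which extra cores no longer help. That is the sense in which CPU capacity determines how much of an accelerator investment is actually used.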

What does the CPU do in an AI data center?

In AI infrastructure, the CPU coordinates the work around accelerators. It handles scheduling, request routing, pre-processing, post-processing, memory access, context retrieval, data movement, security, networking, and low-latency control paths between CPUs, GPUs, accelerators, storage, databases, and application services.

Why AI infrastructure demand for CPUs is structural, not cyclical

Independent analysis reinforces that this shift is not a short-term correction but a multi-year architectural realignment. As research from Futurum Group notes, the future of AI infrastructure is moving away from “how much raw compute can we deploy” toward “how intelligently can we orchestrate compute across diverse requirements.”

This evolution favors scalable, power-efficient CPU architectures that can serve as the control layer across heterogeneous systems.

For Arm, this aligns directly with long-standing strengths: scalable architecture, power efficiency and an ecosystem that enables hyperscalers to build custom silicon without fragmenting software.

Arm does not monetize individual AI models or specific accelerator wins; it monetizes the expansion of compute itself, across every new core deployed to support AI workloads.

That distinction matters in a world where core counts are rising structurally.

Why Arm is positioned for AI data center growth 

AI infrastructure is no longer constrained only by accelerators. As AI moves toward agentic, always-on inference, hyperscalers need CPUs that can orchestrate workloads continuously, keep accelerators utilized, improve output per rack, and operate efficiently within power and cooling limits. That makes high-core-count, power-efficient CPUs central to AI data center economics — and strengthens Arm’s role as AI infrastructure scales.

FAQs

Why does AI need CPUs if GPUs do the model computation?

GPUs and accelerators perform much of the model computation, but CPUs coordinate the system around them. CPUs handle scheduling, routing, memory access, context retrieval, data movement, pre- and post-processing, security, and control paths across heterogeneous infrastructure.

What is CPU orchestration in AI infrastructure?

CPU orchestration is the coordination layer that keeps AI workloads moving across accelerators, memory, networking, storage, databases, and application services.

How does agentic AI increase CPU demand?

Agentic AI systems operate continuously. They plan, reason, retrieve context, manage memory, coordinate actions, call tools, and interact with users and services, creating more CPU-bound work around each model invocation.

Why is performance per watt important in AI data centers?

AI data centers operate within fixed power, cooling, and rack-space limits. Improving performance per watt helps hyperscalers increase useful AI output without simply adding more power-hungry infrastructure.

How do CPUs improve accelerator utilization?

CPUs help keep accelerators fed with data, requests, context, and orchestration. Without enough CPU capacity, expensive AI accelerators can sit idle or underutilized.

Why is Arm relevant to AI data center growth?

Arm’s data center relevance comes from scalable, power-efficient CPU architectures, Arm Neoverse adoption, hyperscaler custom silicon momentum, and rising demand for high-core-count CPUs in AI infrastructure.

Forward-looking statements

This news blog contains forward-looking statements within the meaning of Section 27A of the Securities Act of 1933, as amended, and Section 21E of the Securities Exchange Act of 1934, as amended, and as defined in the Private Securities Litigation Reform Act of 1995. All statements other than statements of historical fact could be deemed forward-looking statements, including without limitation, statements relating to the anticipated growth of Arm’s data center business, Arm’s expected share among top hyperscalers, and expectations with respect to CPU importance and demand that are based on Arm’s current expectations, estimates, assumptions and projections. In some cases, you can identify forward-looking statements because they contain words such as “may,” “might,” “will,” “could,” “would,” “should,” “expect,” “is/are likely to,” “intend,” “plan,” “objective,” “anticipate,” “believe,” “estimate,” “predict,” “potential,” “target,” “continue,” “ongoing” or similar words or phrases, or the negative of these words or phrases. These statements involve known and unknown risks, uncertainties and other important factors that may cause Arm’s actual results, levels of activity, performance or achievements to be materially different from the information expressed or implied by these forward-looking statements. There are many factors that could cause or contribute to such differences, including, but not limited to, those discussed in Arm’s Annual Report on Form 20-F for the fiscal year ended March 31, 2025, filed with the Securities and Exchange Commission on May 28, 2025. Any forward-looking statement in this news blog speaks only as of the date hereof, and Arm does not undertake any obligation to update any forward-looking statement to reflect events or circumstances after the date of this news blog except as required by applicable law. Arm cautions that you should not place undue reliance on any of Arm’s forward-looking statements.


Any re-use permitted for informational and non-commercial or personal use only.
