Arm Newsroom Blog

Why cloud developers are moving to Arm: Building the AI-ready infrastructure of the future 

Cloud developers are rapidly adopting Arm-based platforms for their unmatched performance-per-watt, lower costs, and faster path to scalable, production-ready AI workloads.
By Arm Editorial Team

As AI reshapes the digital landscape, developers face new pressures to build infrastructure that is not just powerful, but scalable, cost-efficient, and energy aware. Leading hyperscalers and AI leaders – Amazon, Google, Microsoft, Oracle Cloud Infrastructure, and NVIDIA – are deploying AI data centers using purpose-built compute based on Arm architectures. 

This trend is already well underway. Close to 50% of new server deployments at top hyperscalers are set to be Arm-based. The Arm Neoverse platform is powering production-scale AI pipelines, vector search engines, real-time machine learning (ML) platforms, and cloud-native microservices, while delivering measurable improvements in cost-efficiency, throughput, and energy savings. As AI workloads expand, infrastructure choices matter more than ever. 

If you’re a developer, this is the time to explore the tools available for adopting Arm-based infrastructure, many of which are self-service and designed to accelerate the transition. If you’re an enterprise CIO, read on to see which companies have already made the switch to Arm and the performance and cost advantages they’re seeing in production. 

Here’s what’s changing, why it matters now, and how developers are making the leap to Arm faster than you might think.

AI workloads are end-to-end and Arm enables full-pipeline optimization

AI is no longer just about inference or model training. From data pre-processing and model orchestration to real-time serving and memory management, today’s AI stacks stretch across the entire compute pipeline. That introduces new system-level challenges in latency, cost, power, and scaling – challenges that general-purpose CPUs were never designed for.

Arm plays a central role in enabling this transformation – not just at the CPU level, but across the entire AI system architecture.

At AWS, Arm Neoverse cores power Graviton for general-purpose compute, drive Nitro data processing units (DPUs), and serve as the head node for AI accelerators – enabling tightly integrated, energy-efficient infrastructure for AI pipelines.

Similarly, at NVIDIA, Arm is the foundation of the Grace CPU and the Vera host, both used as AI head nodes, and also powers the BlueField DPU for data movement and offload – creating a unified platform approach to AI data center design.

With high performance-per-watt, strong memory bandwidth, and growing deployment across platforms like AWS Graviton, Google Cloud Axion, Microsoft Azure Cobalt, and NVIDIA Grace, Arm-based infrastructure is increasingly chosen for scalable, cost-effective AI workloads. 

Why Arm is becoming standard for cloud compute

We’re seeing a directional change across the cloud: world-leading cloud providers are investing in Arm-based infrastructure as their default path for scaling converged AI data centers. It’s not a trial phase; it reflects a long-term architectural strategy. 

Software companies such as Atlassian, Spotify, and Uber have started migrating critical workloads to Arm-based cloud infrastructure using public tools and community documentation, without requiring deep platform rewrites. Atlassian reported reduced compute costs and improved CI/CD pipeline speeds after moving critical services to Arm instances. Spotify saw meaningful power savings and infrastructure efficiency gains when trialing Graviton for backend workloads. Uber has leveraged Arm-based infrastructure to optimize microservice performance while lowering per-instance operating costs.
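A common first step in migrations like these is making build and deploy scripts architecture-aware, so one pipeline can target both x86 and Arm instances. Here is a minimal sketch in Python; the alias table and label names are illustrative conventions (matching common container-registry labels), not part of any specific Arm tooling:

```python
import platform

# Map raw machine strings (as reported by platform.machine() or `uname -m`)
# to the canonical labels most container registries use.
_ALIASES = {
    "aarch64": "arm64",  # Linux on Arm
    "arm64": "arm64",    # e.g. macOS on Apple silicon
    "x86_64": "amd64",
    "amd64": "amd64",
}

def normalized_arch(machine=None):
    """Return a canonical arch label ('arm64' or 'amd64') for the host."""
    m = (machine or platform.machine()).lower()
    if m not in _ALIASES:
        raise ValueError(f"unsupported architecture: {m}")
    return _ALIASES[m]
```

A deploy script can then select the matching build artifact, for example `f"myservice:{normalized_arch()}"` (a hypothetical image tag), instead of hard-coding an x86-only one.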

Tools such as the Arm MCP Server and the Arm Cloud Migration Agent in GitHub Copilot are helping developers assess compatibility, accelerate cloud transitions, and scale reliably. Now released to all software developers, the Arm MCP Server – purpose-built for any Arm cloud platform – brings cloud migration tools and expertise directly into your favorite AI assistant, enabling agentic workflows.

This simplifies the migration path by automating best practices, accelerating the developer process, and providing real-time guidance – making it even easier for teams to unlock cost, energy, and performance benefits from day one. Feedback from early adopters confirms its strong utility in real-world migration scenarios.

Five examples of developers moving to Arm 

Beyond these global software companies, other technology companies are experiencing similar benefits when adopting Arm-based cloud infrastructure for their day-to-day operations. 

1. LLM inference costs reduced by 35% with Graviton3 

Vociply AI, an AI startup deploying large language models (LLMs) at scale, reduced monthly infrastructure costs from $2,000 to $1,300 after switching to AWS Graviton3. Performance and efficiency improved too, including: 

  • 40% better price-performance 
  • 15.6% higher token throughput 
  • 23% lower power draw 

Results were driven by Arm Neoverse cores, NEON optimizations, and quantized inference engines like llama.cpp. 
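NEON – Advanced SIMD on 64-bit Arm, reported as the `asimd` flag in Linux’s `/proc/cpuinfo` – is what quantized inference engines such as llama.cpp lean on for their Arm kernels. A hedged pre-flight check, assuming a standard Linux `/proc/cpuinfo` layout (the function and its name are illustrative, not part of llama.cpp):

```python
def has_arm_simd(cpuinfo_text, machine):
    """Return True if the host is Arm and advertises NEON/ASIMD support.

    cpuinfo_text: contents of /proc/cpuinfo (Linux).
    machine: value of platform.machine() or `uname -m`.
    """
    if machine.lower() not in ("aarch64", "arm64", "armv7l"):
        return False
    flags = cpuinfo_text.lower()
    # 64-bit Arm kernels report "asimd" (Advanced SIMD); 32-bit report "neon".
    # A substring check is enough for a sketch; production code would parse
    # the Features line properly.
    return "asimd" in flags or "neon" in flags
```

If this check passes on a Graviton instance, NEON-optimized builds can take the SIMD code paths that contributed to the throughput gains reported above.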

2. Faster generative AI pipelines with 40% lower infrastructure cost 

Esankethik, a generative IT and AI solutions platform, migrated its full stack – pre-processing, training, inference – to Arm-based Graviton instances. Results included: 

  • 25% faster inference latency 
  • 40% lower Lambda costs per million requests 
  • 15% better memory efficiency 

Running preprocessing, training, and inference on Arm reduced bottlenecks and improved scalability. 

3. Real-time ML scalability at SiteMana 

SiteMana, a lead generation technology company, moved real-time ML inference and data ingestion to Graviton3. Benefits: 

  • ~25% lower monthly costs 
  • ~15% faster p95 latency 
  • 2.5× higher network bandwidth 

The migration addressed CPU throttling and stabilized performance under peak loads. 

4. Developer pipeline efficiency at AuthZed 

AuthZed, which provides a specialized platform for authorization infrastructure, standardized all its operations on Arm, from dev laptops to cloud. This led to: 

  • 40% faster local builds 
  • 20–25% more efficient CPU usage in prod 
  • ~20% compute cost reduction 

This approach streamlined workflows without requiring changes to developer habits. 

5. Higher throughput for AI Search at Zilliz Cloud 

Zilliz Cloud, a fully managed vector database engineered for AI applications in production, migrated its vector search engine to Graviton3. This resulted in: 

  • 50% faster index building performance 
  • 20% faster vector search on billion-scale queries 
  • Lower cost per query, higher throughput 

These results apply to semantic search, retrieval-augmented generation (RAG), and multimodal AI tasks.

Built for the AI cloud era 

Arm Neoverse is architected for modern workloads – LLMs, vector search, real-time ML, analytics, and high-density microservices. Compared to x86, Arm-based instances offer: 

  • A greater price-performance benefit 
  • Higher performance for AI and cloud-native workloads 
  • A mature software ecosystem and robust developer tooling 
  • Optimized support for AI frameworks through Arm Kleidi, enabling seamless performance tuning and integration

Arm provides a suite of developer resources, performance tuning guides, and cloud migration checklists to simplify the migration process for AI and cloud workloads. These reduce friction and support performance tuning without full platform rewrites.

Developers can explore the Arm Cloud Migration Program for migration resources, technical guides, and expert advice.  

In addition, the Arm MCP Server is now available to all developers, helping them identify and execute x86-to-Arm transitions. Developers can access the Arm MCP Server here.

The infrastructure platform for AI

Arm-based cloud infrastructure is emerging as a central pillar of AI compute strategy.  

As workloads scale and energy efficiency becomes critical, infrastructure needs to do more with less. Arm offers a practical path forward for developers building the next generation of AI systems.


Any re-use permitted for informational and non-commercial or personal use only.
