Arm Newsroom Blog

Why cloud developers are moving to Arm: Building the AI-ready infrastructure of the future 

Cloud developers are rapidly adopting Arm-based platforms for their unmatched performance-per-watt, lower costs, and faster path to scalable, production-ready AI workloads.
By Arm Editorial Team

As AI reshapes the digital landscape, developers face new pressures to build infrastructure that is not just powerful, but scalable, cost-efficient, and energy aware. Leading hyperscalers and AI leaders – Amazon, Google, Microsoft, Oracle Cloud Infrastructure, and NVIDIA – are deploying AI data centers using purpose-built compute based on Arm architectures. 

This trend is already well underway. Close to 50% of new server deployments at top hyperscalers are set to be Arm-based. The Arm Neoverse platform is powering production-scale AI pipelines, vector search engines, real-time machine learning (ML) platforms, and cloud-native microservices, while delivering measurable improvements in cost-efficiency, throughput, and energy savings. As AI workloads expand, infrastructure choices matter more than ever. 

If you’re a developer, this is the time to explore the tools available for adopting Arm-based infrastructure, many of which are self-service and designed to accelerate the transition. If you’re an enterprise CIO, read on to see which companies have already made the switch to Arm and the performance and cost advantages they’re seeing in production. 

Here’s what’s changing, why it matters now, and how developers are making the leap to Arm faster than you might think.

AI workloads are end-to-end and Arm enables full-pipeline optimization

AI is no longer just about inference or model training. From data pre-processing and model orchestration to real-time serving and memory management, today’s AI stacks stretch across the entire compute pipeline. That introduces new system-level challenges in latency, cost, power, and scaling – challenges that general-purpose CPUs were never designed for.

Arm plays a central role in enabling this transformation – not just at the CPU level, but across the entire AI system architecture.

At AWS, Arm Neoverse cores power Graviton for general-purpose compute, drive Nitro data processing units (DPUs), and serve as the head node for AI accelerators – enabling tightly integrated, energy-efficient infrastructure for AI pipelines.

Similarly, at NVIDIA, Arm is the foundation of the Grace CPU and the Vera host, both used as AI head nodes, and also powers the BlueField DPU for data movement and offload – creating a unified platform approach to AI data center design.

With high performance-per-watt, strong memory bandwidth, and growing deployment across platforms like AWS Graviton, Google Cloud Axion, Microsoft Azure Cobalt, and NVIDIA Grace, Arm-based infrastructure is increasingly chosen for scalable, cost-effective AI workloads. 

Why Arm is becoming standard for cloud compute

We’re seeing a directional change across the cloud: world-leading cloud providers are investing in Arm-based infrastructure as their default path for scaling converged AI data centers. It’s not a trial phase; it reflects a long-term architectural strategy. 

Software companies such as Atlassian, Spotify, and Uber have started migrating critical workloads to Arm-based cloud infrastructure using public tools and community documentation, without requiring deep platform rewrites. Atlassian reported reduced compute costs and improved CI/CD pipeline speeds after moving critical services to Arm instances. Spotify saw meaningful power savings and infrastructure efficiency gains when trialing Graviton for backend workloads. Uber has leveraged Arm-based infrastructure to optimize microservice performance while lowering per-instance operating costs.
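A common first step in migrations like these is making build and deploy scripts architecture-aware, so one pipeline can target both x86 and Arm instances. Here is a minimal sketch in Python; the alias table and label names are illustrative conventions (matching common container-registry labels), not part of any specific Arm tooling:

```python
import platform

# Map raw machine strings (as reported by platform.machine() or `uname -m`)
# to the canonical labels most container registries use.
_ALIASES = {
    "aarch64": "arm64",  # Linux on Arm
    "arm64": "arm64",    # e.g. macOS on Apple silicon
    "x86_64": "amd64",
    "amd64": "amd64",
}

def normalized_arch(machine=None):
    """Return a canonical arch label ('arm64' or 'amd64') for the host."""
    m = (machine or platform.machine()).lower()
    if m not in _ALIASES:
        raise ValueError(f"unsupported architecture: {m}")
    return _ALIASES[m]
```

A deploy script can then select the matching build artifact, for example `f"myservice:{normalized_arch()}"` (a hypothetical image tag), instead of hard-coding an x86-only one.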

Tools such as the Arm MCP Server and the Arm Cloud Migration Agent in GitHub Copilot are helping developers assess compatibility, accelerate cloud transitions, and scale reliably. Now released to all software developers, the Arm MCP Server – purpose-built for any Arm cloud platform – brings cloud migration tools and expertise directly into your favorite AI assistant, enabling agentic workflows.

This simplifies the migration path by automating best practices, accelerating the developer process, and providing real-time guidance – making it even easier for teams to unlock cost, energy, and performance benefits from day one. Feedback from early adopters confirms its strong utility in real-world migration scenarios.

Five examples of developers moving to Arm 

Beyond these global software companies, other technology companies are experiencing similar benefits when adopting Arm-based cloud infrastructure for their day-to-day operations. 

1. LLM inference costs reduced by 35% with Graviton3 

Vociply AI, an AI startup deploying large language models (LLMs) at scale, reduced monthly infrastructure costs from $2,000 to $1,300 after switching to AWS Graviton3. Performance and efficiency improved too, including: 

  • 40% better price-performance 
  • 15.6% higher token throughput 
  • 23% lower power draw 

Results were driven by Arm Neoverse cores, NEON optimizations, and quantized inference engines like llama.cpp. 
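NEON – Advanced SIMD on 64-bit Arm, reported as the `asimd` flag in Linux’s `/proc/cpuinfo` – is what quantized inference engines such as llama.cpp lean on for their Arm kernels. A hedged pre-flight check, assuming a standard Linux `/proc/cpuinfo` layout (the function and its name are illustrative, not part of llama.cpp):

```python
def has_arm_simd(cpuinfo_text, machine):
    """Return True if the host is Arm and advertises NEON/ASIMD support.

    cpuinfo_text: contents of /proc/cpuinfo (Linux).
    machine: value of platform.machine() or `uname -m`.
    """
    if machine.lower() not in ("aarch64", "arm64", "armv7l"):
        return False
    flags = cpuinfo_text.lower()
    # 64-bit Arm kernels report "asimd" (Advanced SIMD); 32-bit report "neon".
    # A substring check is enough for a sketch; production code would parse
    # the Features line properly.
    return "asimd" in flags or "neon" in flags
```

If this check passes on a Graviton instance, NEON-optimized builds can take the SIMD code paths that contributed to the throughput gains reported above.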

2. Faster generative AI pipelines with 40% lower infrastructure cost 

Esankethik, a generative IT and AI solutions platform, migrated its full stack – pre-processing, training, inference – to Arm-based Graviton instances. Results included: 

  • 25% faster inference latency 
  • 40% lower Lambda costs per million requests 
  • 15% better memory efficiency 

Running preprocessing, training, and inference on Arm reduced bottlenecks and improved scalability. 

3. Real-time ML scalability at SiteMana 

SiteMana, a lead generation technology company, moved real-time ML inference and data ingestion to Graviton3. Benefits: 

  • ~25% lower monthly costs 
  • ~15% faster p95 latency 
  • 2.5× higher network bandwidth 

The migration addressed CPU throttling and stabilized performance under peak loads. 

4. Developer pipeline efficiency at AuthZed 

AuthZed, which provides a specialized platform for authorization infrastructure, standardized all its operations on Arm, from dev laptops to cloud. This led to: 

  • 40% faster local builds 
  • 20–25% more efficient CPU usage in prod 
  • ~20% compute cost reduction 

This approach streamlined workflows without requiring changes to developer habits. 

5. Higher throughput for AI Search at Zilliz Cloud 

Zilliz Cloud, a fully managed vector database engineered for AI applications in production, migrated its vector search engine to Graviton3. This resulted in: 

  • 50% faster index building performance 
  • 20% faster vector search on billion-scale queries 
  • Lower cost per query, higher throughput 

These results apply to semantic search, retrieval-augmented generation (RAG), and multimodal AI tasks.

Built for the AI cloud era 

Arm Neoverse is architected for modern workloads – LLMs, vector search, real-time ML, analytics, and high-density microservices. Compared to x86, Arm-based instances offer: 

  • A greater price-performance benefit 
  • Higher performance for AI and cloud-native workloads 
  • A mature software ecosystem and robust developer tooling 
  • Optimized support for AI frameworks through Arm Kleidi, enabling seamless performance tuning and integration

Arm provides a suite of developer resources, performance tuning guides, and cloud migration checklists to simplify the migration process for AI and cloud workloads. These reduce friction and support performance tuning without full platform rewrites.

Developers can explore the Arm Cloud Migration Program for migration resources, technical guides, and expert advice.  

In addition, the Arm MCP Server is now available to all developers, helping them identify and execute x86-to-Arm transitions. Developers can access the Arm MCP Server here.

The infrastructure platform for AI

Arm-based cloud infrastructure is emerging as a central pillar of AI compute strategy.  

As workloads scale and energy efficiency becomes critical, infrastructure needs to do more with less. Arm offers a practical path forward for developers building the next generation of AI systems.


Any re-use permitted for informational and non-commercial or personal use only.
