Arm Newsroom Blog

Gemma 4 on Arm: Accessible, immediate, optimized on-device AI to accelerate the mobile app experience

Gemma 4 on Arm brings fast, privacy-preserving, power-efficient AI directly onto Android devices, helping developers deliver richer real-time app experiences to billions of users without relying on the cloud.
By Alex Spinelli, SVP, AI and Developer Platforms and Services, Arm

Real-time assistance, seamless communication, and greater personalization are now baseline expectations for billions of smartphone users worldwide. Highly capable on-device AI that operates within the power envelope of modern smartphones is essential to delivering instant, intelligent experiences at scale, while unlocking AI’s future potential.

Google’s launch of Gemma 4 accelerates the ongoing shift to on-device AI, enabling developers to seamlessly access optimized performance and bring increasingly capable AI experiences directly into the apps people use every day. Unlocking these benefits at global smartphone scale depends on the underlying compute foundation, with one constant that is ubiquitous across the entire Android ecosystem: Arm.

What’s new for Gemma 4

Gemma 4 further advances on-device AI by delivering improved performance and efficiency, while expanding support for the kinds of multimodal experiences that matter most on Arm-based devices, including reasoning, agentic workflows, and vision-and-audio enabled use cases. With enhanced capabilities across text, audio*, and image, broader language support, and a foundation for real-time assistive experiences, it enables more responsive, context-aware interactions directly on-device without increasing memory footprint. 

Exploring Gemma 4 performance on Arm CPUs

In early Arm engineering tests, SME2 shows promising performance gains for running Gemma 4 workloads. Initial tests on the Gemma 4 E2B (Effective 2 Billion) model demonstrate an average 5.5x speedup in prefill (processing user input) and up to 1.6x faster decode (generating responses), highlighting the potential of Armv9 CPU innovations for on-device AI workloads. These engineering tests include upcoming patches to Google XNNPACK and Arm KleidiAI.
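To see how prefill and decode gains combine into end-to-end response time, here is a rough, illustrative calculation. Only the 5.5x and 1.6x factors come from the tests above; the token counts and baseline throughputs are assumptions chosen for the sketch, not measured Gemma 4 figures.

```python
# Hypothetical illustration of how prefill and decode speedups combine
# into end-to-end latency. Token counts and baseline throughputs are
# assumptions for this sketch, not measured figures.

def end_to_end_latency(prompt_tokens, output_tokens,
                       prefill_tps, decode_tps,
                       prefill_speedup=1.0, decode_speedup=1.0):
    """Return total seconds to process a prompt and generate a response."""
    prefill_s = prompt_tokens / (prefill_tps * prefill_speedup)
    decode_s = output_tokens / (decode_tps * decode_speedup)
    return prefill_s + decode_s

# Assumed workload: 512-token prompt, 128-token reply, with assumed
# baseline throughputs of 200 prefill tokens/s and 20 decode tokens/s.
baseline = end_to_end_latency(512, 128, prefill_tps=200, decode_tps=20)

# Same workload with the reported 5.5x prefill / 1.6x decode speedups.
with_sme2 = end_to_end_latency(512, 128, prefill_tps=200, decode_tps=20,
                               prefill_speedup=5.5, decode_speedup=1.6)

print(f"baseline: {baseline:.2f}s, with SME2: {with_sme2:.2f}s")
```

Under these assumed numbers, the prompt-processing stage shrinks from the dominant cost to a small fraction of total latency, which is why prefill acceleration is so noticeable in interactive use.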

As an early example of what is possible with these improvements, Envision, an accessibility-focused app for blind and low-vision users, evaluated an on-device approach for delivering more of its experience locally. Historically, Envision’s scene interpretation relied on cloud connectivity. In this prototype, Gemma 4 was evaluated running locally on Arm CPUs with SME2 capabilities, enabling users to capture a photo and receive a detailed scene description directly on-device without requiring a network connection or sending sensitive data off-device. 

These explorations on Arm CPUs highlight the broader flexibility of the Arm compute platform and the potential for continued innovation across CPU and heterogeneous compute pathways. 

The result is lower latency, stronger privacy, and more consistent user experiences regardless of connectivity conditions. This shift from cloud dependency to local inference is critical for mobile applications. It has the potential to reduce infrastructure costs for developers, improve reliability for users, and unlock new categories of real-time applications.

“Envision is excited to work with Arm and Google to bring powerful accessibility experiences directly onto smartphones. Running visual understanding models like Gemma 4 on-device on SME2-enabled Arm CPUs opens the door to reliable, low-latency scene description and visual Q&A for blind and low-vision users. For our community, the ability to access these capabilities offline is incredibly meaningful because it ensures the technology works wherever they are, while also improving privacy by keeping more processing on the device itself.” – Karthik Mahadevan, CEO, Envision

Envision is an early example of what’s possible when Gemma 4 meets the Arm compute platform at mobile scale. As more developers integrate Gemma 4, on-device AI will increasingly become the default architecture rather than the exception.

Why Arm matters for on-device AI at Android scale

The Armv9 architecture is the most secure, pervasive, and advanced ISA Arm has ever built. Arm Scalable Matrix Extension 2 (SME2) – a set of advanced CPU instructions in the Armv9 architecture – is a key technology, as it accelerates matrix-heavy AI workloads within the power envelope of smartphones. Already built into the Arm C1 CPUs integrated into the latest Android smartphones, SME2 unlocks higher sustained performance and improved efficiency.
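One common intuition for why matrix acceleration benefits prefill more than decode: prefill processes the whole prompt at once as batched matrix-matrix products, while decode generates one token at a time as matrix-vector products. A minimal operation-count sketch, using made-up layer dimensions rather than Gemma 4’s actual shapes:

```python
# Illustrative operation counts showing why prefill is the matrix-heavy
# stage: it multiplies a whole prompt's worth of activations against the
# weights at once, while decode handles one token per step. The
# dimensions below are made up for illustration, not Gemma 4's shapes.

def matmul_macs(m, k, n):
    """Multiply-accumulate count for an (m x k) @ (k x n) matmul."""
    return m * k * n

prompt_tokens, hidden, out = 512, 2048, 2048

# Prefill: one large matrix-matrix product covering all prompt tokens.
prefill_macs = matmul_macs(prompt_tokens, hidden, out)

# Decode: a matrix-vector product per generated token.
per_token_macs = matmul_macs(1, hidden, out)

print(prefill_macs, per_token_macs, prefill_macs // per_token_macs)
```

The batched prefill product exposes far more parallel multiply-accumulates per step, which is the work pattern matrix instructions like SME2 are designed to accelerate.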

Through Arm KleidiAI – Arm’s software acceleration layer integrated into leading runtime libraries, like Google’s XNNPACK, and frameworks, like Google LiteRT and MediaPipe – the benefits of SME2 are readily accessible to mobile developers, with no changes required to existing code, models, or deployment pipelines. As a result, developers automatically get out-of-the-box performance optimizations simply by targeting Arm-based Android devices built with SME2.

In practice, these software-level gains translate directly into better on-device experiences. Users benefit from faster responses, smoother sustained interactions, and more reliable on-device AI, all while maintaining battery life and thermal stability, even as models grow more capable.

“Delivering Gemma 4 efficiently across the Android ecosystem requires deep collaboration across hardware and software. Our work with Arm reflects a shared commitment to advancing on-device AI, combining the benefits of the Armv9 architecture and built-in acceleration technologies, like SME2, with the Android operating system to unlock greater performance and efficiency at scale. Together, we’re making it easier for developers to bring fast, responsive, and privacy-preserving AI experiences to our users, without needing to modify their existing applications.” – Sandeep Patil, Engineering Director, Android

Arm and Google: Building the future of on-device AI together

As more applications move AI on-device, Arm and Google are committed to supporting developers with accessible performance optimizations and clear guidance that help Gemma 4 accelerate application experiences across all Arm-based mobile devices.

The future of mobile AI will not be defined solely by larger models, but by how efficiently, securely, and pervasively they run at scale across the Android ecosystem. Through this collaboration, the benefits of on-device AI will be felt by billions of Android smartphone users worldwide.

*Audio is supported only on the E2B (Effective 2 Billion) and E4B (Effective 4 Billion) models.

