May 15, 2024

Generative AI is on Mobile and it’s Powered by Arm

Exciting new developments that demonstrate the advanced AI capabilities of the Arm CPU.

By James McNiven, Vice President of Product Management, Client Line of Business, Arm

Generative AI, which includes today’s well-known, highly publicized large language models (LLMs), has arrived at the edge on mobile. This means that generative AI inference, from generating images and videos to understanding words in context, is starting to be processed entirely on the mobile device, rather than being sent to the cloud and back.

Arm is the foundational technology that enables AI to run everywhere. When it comes to generative AI on mobile, exciting new developments demonstrate this in action, from the latest AI-enabled flagship smartphones to LLMs being processed directly on the Arm CPU.

New AI-powered smartphones

High-performance, AI-enabled smartphones built on Arm’s v9 CPU and GPU technologies are now on the market. These include the new MediaTek Dimensity 9300-powered vivo X100 and X100 Pro smartphones, the Samsung Galaxy S24, and the Google Pixel 8.

The combination of performance and efficiency provided by these flagship mobile devices is delivering unprecedented opportunities for AI innovation. In fact, Arm’s own CPU and GPU performance improvements have doubled AI processing capabilities every two years during the past decade.
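To put "doubled every two years during the past decade" into a single number, the compounding can be worked out directly. The short Python sketch below is purely illustrative: the function name is our own, and it assumes a perfectly steady doubling cadence, which real product generations will only approximate.

```python
def growth_factor(years: float, doubling_period_years: float) -> float:
    """Return the cumulative multiplier after `years` of steady doubling.

    Assumes an idealized, constant doubling cadence (an illustrative
    simplification, not a measured performance curve).
    """
    return 2.0 ** (years / doubling_period_years)


# Ten years of doubling every two years is five doublings: 2^5.
decade_gain = growth_factor(years=10, doubling_period_years=2)
print(f"Cumulative gain over a decade: {decade_gain:.0f}x")  # prints 32x
```

Under that assumption, five doublings compound to roughly a 32x increase in AI processing capability across the decade.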

This trend will only advance in the future, with more AI performance, technologies, and features on our robust consumer technology roadmap. It will be supported by the rise of AI inference at the edge, which is the process of using a trained model, such as an LLM, to power AI-based applications. CPUs are best placed to serve this need as more AI support and specialized instructions continue to be added.

It all starts on the CPU…

In most cases, the use of AI on our favorite mobile devices starts on the CPU, with some good examples being face, hand and body tracking, advanced camera effects and filters, and segmentation across the many social applications. The CPU will handle such AI workloads in their entirety or be supported by accelerators, including GPUs or NPUs. Arm technology is crucial to enabling these AI workloads, as our CPU designs are pervasive across the SoCs in today’s smartphones used by billions of people worldwide.

This has led to 70 percent of AI in today’s third-party applications running on Arm CPUs, including the latest social, health and camera-based applications and many more. Alongside the pervasiveness of the designs, the flexibility and AI capabilities of the Arm CPU make it the best technology for mobile developers to target for their applications’ AI workloads.

In terms of flexibility, Arm CPUs can run a wide variety of neural networks in many different data formats. Looking ahead, future Arm CPUs will include more AI capabilities in the instruction set, such as the Scalable Matrix Extension (SME) for the Armv9-A architecture, for the benefit of Arm’s industry-leading ecosystem. These capabilities help the world’s developers deliver improved performance, innovative features and scalability for their AI-based applications.

The new Arm Kleidi Libraries, which will be embedded directly into AI frameworks, enable developers to transparently access the outstanding AI capabilities of the Arm CPU, so they can build their applications quickly at the highest possible performance.

The combination of leading hardware and software ecosystem support means Arm has a performant compute platform that is enabling the rise of generative AI at the edge, which could include gaming advancements, image enhancements, language translation, text generation and virtual assistants.

LLM on mobile on the Arm compute platform

At Mobile World Congress (MWC) 2024, we produced a virtual assistant demo that utilized Meta’s Llama2-7B LLM on mobile via a chat-based application. However, new models continue to emerge and we are committed to improving the LLM experience on Arm.

When the latest Llama3 model from Meta and Phi-3 3.8B model from Microsoft came out, we worked quickly to run them on Arm CPUs on mobile. These new AI models are far more capable and can respond to a wider range of questions. Our latest demo utilizes Microsoft’s Phi-3 3.8B model on mobile through ‘Ada’, a chatbot specifically trained to be a virtual teaching assistant for science and coding.

The generative AI workloads take place entirely at the edge on the mobile device on the Arm CPUs, with no involvement from accelerators. The impressive performance is enabled through a combination of existing CPU instructions for AI, alongside dedicated software optimizations for LLMs through the ubiquitous Arm compute platform that includes the Arm AI software libraries.

As the demo video shows, there is a very impressive time-to-first-token response and a text generation rate of just under 15 tokens per second, which is faster than the average human reading speed. This is made possible by highly optimized CPU routines in the software library developed by the Arm engineering team, which improve time-to-first-token and text generation significantly compared to the native implementation of the models.
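For a rough sense of how 15 tokens per second compares with reading speed, the rate can be converted into words per minute. The Python sketch below is a back-of-the-envelope illustration only: the ratio of 0.75 words per token and the 250 words-per-minute reading speed are common rules of thumb that we are assuming here, not figures from the demo, and both vary with the tokenizer, the language and the reader.

```python
# Assumed rule-of-thumb constants (not measurements from the Arm demo).
ASSUMED_WORDS_PER_TOKEN = 0.75   # rough average for English text
ASSUMED_READING_WPM = 250.0      # rough adult silent-reading speed


def tokens_per_second_to_wpm(tokens_per_second: float,
                             words_per_token: float = ASSUMED_WORDS_PER_TOKEN) -> float:
    """Convert a token generation rate into approximate words per minute."""
    return tokens_per_second * words_per_token * 60.0


def outpaces_reader(tokens_per_second: float,
                    reading_wpm: float = ASSUMED_READING_WPM) -> bool:
    """Return True if text is generated faster than the assumed reading speed."""
    return tokens_per_second_to_wpm(tokens_per_second) > reading_wpm


generation_wpm = tokens_per_second_to_wpm(15)
print(f"{generation_wpm:.0f} words per minute")  # prints 675 words per minute
print(outpaces_reader(15))                       # prints True
```

Under these assumptions, a rate near 15 tokens per second comfortably exceeds typical reading speed, so the reader is not left waiting for text to appear.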

The Arm CPU provides the AI developer community with opportunities to experiment with their own techniques to provide further software optimizations that make LLMs smaller, more efficient and faster.

Enabling more efficient, smaller LLMs means more AI processing can take place at the edge. The user benefits from quicker, more responsive AI-based experiences, as well as greater privacy through user data being processed locally on the mobile device. Meanwhile, for the mobile ecosystem, there are lower costs and greater scalability options to enable AI deployment across billions of mobile devices.
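One way to see why smaller models matter on a phone is to estimate how much memory the model weights alone would occupy. The Python sketch below is a simplified illustration, not a description of how the demo was built: it assumes weight quantization to a given number of bits per parameter as the size-reduction technique, and it ignores activations, the KV cache and runtime overhead. The function name is hypothetical.

```python
def weight_footprint_gb(num_parameters: int, bits_per_weight: int) -> float:
    """Estimate the memory needed for model weights alone, in gigabytes.

    Simplifying assumption: every parameter is stored at the same bit
    width, and nothing else (activations, KV cache, runtime) is counted.
    """
    total_bytes = num_parameters * bits_per_weight / 8
    return total_bytes / 1e9


# Parameter counts taken from the model names mentioned in this post.
models = {"Phi-3 3.8B": 3_800_000_000, "Llama2-7B": 7_000_000_000}

for name, params in models.items():
    for bits in (16, 8, 4):
        size = weight_footprint_gb(params, bits)
        print(f"{name} at {bits}-bit weights: ~{size:.1f} GB")
```

On this estimate, a 3.8B-parameter model drops from about 7.6 GB at 16-bit weights to about 1.9 GB at 4-bit weights, which is the kind of reduction that makes on-device inference practical within a smartphone's memory budget.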

We are also excited to see the open-source developer community working with models on Arm. Developers in that community had the new Llama3 and Phi-3 3.8B models up and running on Arm in around 48 hours. We look forward to seeing more open-source engagement with generative AI on Arm.

Find out more information about the previous Llama2-7B demo and current Phi-3 3.8B demo from the Arm engineers that developed them in this technical blog.

Driving generative AI on mobile

As the most ubiquitous mobile compute platform and leader in efficient compute, Arm has a responsibility to enable the most efficient and highest-performing generative AI at the edge. We are already demonstrating the impressive performance of LLMs that are running entirely on our leading CPU technologies. However, this is just the start.

Through a combination of smaller, more efficient LLMs, improved performance on mobile devices built on Arm CPUs and innovative software optimizations from our industry-leading ecosystem, generative AI on mobile will continue to proliferate.

Arm is foundational to AI and we will enable AI everywhere, for every developer, with the Arm CPU at the heart of future generative AI innovation on mobile.

Any re-use permitted for informational and non-commercial or personal use only.

Editorial Contact

Brian Fuller & Jack Melling

editorial@arm.com
