Blog

May 15, 2024

Generative AI is on Mobile and it’s Powered by Arm

Exciting new developments that demonstrate the advanced AI capabilities of the Arm CPU.

By James McNiven, Vice President of Product Management, Client Line of Business, Arm

Generative AI, which includes today’s well-known, highly publicized large language models (LLMs), has arrived at the edge on mobile. This means that generative AI inference, from generating images and videos to understanding words in context, is starting to be processed entirely on the mobile device, rather than being sent to the cloud and back.

Arm is the foundational technology that enables AI to run everywhere. When it comes to generative AI on mobile, exciting new developments demonstrate this in action, from the latest AI-enabled flagship smartphones to LLMs being processed directly on the Arm CPU.

New AI-powered smartphones

High-performance, AI-enabled smartphones built on Arm’s v9 CPU and GPU technologies are now on the market. These include the new MediaTek Dimensity 9300-powered vivo X100 and X100 Pro smartphones, the Samsung Galaxy S24, and the Google Pixel 8.

The combination of performance and efficiency provided by these flagship mobile devices is delivering unprecedented opportunities for AI innovation. In fact, Arm’s own CPU and GPU performance improvements have doubled AI processing capabilities every two years during the past decade.
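To put the doubling claim in concrete terms, here is a small arithmetic sketch. It assumes a clean doubling every two years over exactly ten years, which simplifies the statement above. The function name is illustrative.

```python
def compound_growth(years: float, doubling_period_years: float = 2.0) -> float:
    """Return the overall multiplier after `years` of steady doubling.

    Assumption: growth is a smooth doubling every `doubling_period_years`.
    """
    return 2.0 ** (years / doubling_period_years)


# Ten years of doubling every two years is five doublings.
decade_multiplier = compound_growth(10)
print(f"Growth over a decade: {decade_multiplier:.0f}x")
```

Under that assumption, five doublings compound to a 32x increase over the decade.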

This trend will continue, with more AI performance, technologies, and features on our robust consumer technology roadmap. It will be supported by the rise of AI inference at the edge, which is the process of using a trained model, such as an LLM, to power AI-based applications. CPUs are best placed to serve this need as more AI support and specialized instructions continue to be added.

It all starts on the CPU…

In most cases, the use of AI on our favorite mobile devices starts on the CPU, with some good examples being face, hand and body tracking, advanced camera effects and filters, and segmentation across the many social applications. The CPU will handle such AI workloads in their entirety or be supported by accelerators, including GPUs or NPUs. Arm technology is crucial to enabling these AI workloads, as our CPU designs are pervasive across the SoCs in today’s smartphones used by billions of people worldwide.

This has led to 70 percent of AI in today’s third-party applications running on Arm CPUs, including the latest social, health and camera-based applications. Alongside the pervasiveness of the designs, the flexibility and AI capabilities of the Arm CPU make it the best technology for mobile developers to target for their applications’ AI workloads.

In terms of flexibility, Arm CPUs can run a wide variety of neural networks in many different data formats. Looking ahead, future Arm CPUs will include more AI capabilities in the instruction set, such as the Scalable Matrix Extension (SME) for the Armv9-A architecture, for the benefit of Arm’s industry-leading ecosystem. These capabilities help the world’s developers deliver improved performance, innovative features and scalability for their AI-based applications.

The new Arm Kleidi Libraries, which will be embedded directly into AI frameworks, enable developers to transparently access the outstanding AI capabilities of the Arm CPU, so they can build their applications quickly at the highest possible performance.

The combination of leading hardware and software ecosystem support means Arm has a performant compute platform that is enabling the rise of generative AI at the edge, which could include gaming advancements, image enhancements, language translation, text generation and virtual assistants.

LLM on mobile on the Arm compute platform

At Mobile World Congress (MWC) 2024, we produced a virtual assistant demo that utilized Meta’s Llama2-7B LLM on mobile via a chat-based application. However, new models continue to emerge and we are committed to improving the LLM experience on Arm.

When the latest Llama3 model from Meta and Phi-3 3.8B model from Microsoft came out, we worked quickly to run them on Arm CPUs on mobile. These new AI models are far more capable and can respond to a wider range of questions. Our latest demo utilizes Microsoft’s Phi-3 3.8B model on mobile through ‘Ada’, a chatbot specifically trained to be a virtual teaching assistant for science and coding.

The generative AI workloads take place entirely at the edge on the mobile device on the Arm CPUs, with no involvement from accelerators. The impressive performance is enabled through a combination of existing CPU instructions for AI, alongside dedicated software optimizations for LLMs through the ubiquitous Arm compute platform that includes the Arm AI software libraries.

As the demo video shows, there is a very impressive time-to-first-token response and a text generation rate of just under 15 tokens per second, which is faster than the average human reading speed. This is made possible by highly optimized CPU routines in the software library developed by the Arm engineering team, which improve time-to-first-token and text generation significantly compared to the native implementation of the models.
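The comparison with reading speed can be checked with rough arithmetic. The sketch below is not a measurement: the words-per-token ratio and the human reading speed are assumed ballpark values rather than figures from the demo, and both vary with the tokenizer, the language and the reader.

```python
def tokens_per_second_to_wpm(tokens_per_second: float, words_per_token: float) -> float:
    """Convert a token generation rate into approximate words per minute."""
    return tokens_per_second * words_per_token * 60.0


GENERATION_RATE_TPS = 15.0   # upper bound from the article ("just under 15")
WORDS_PER_TOKEN = 0.75       # assumption: rough ratio for English text
HUMAN_READING_WPM = 250.0    # assumption: ballpark adult silent-reading speed

generated_wpm = tokens_per_second_to_wpm(GENERATION_RATE_TPS, WORDS_PER_TOKEN)
print(f"Generated text: ~{generated_wpm:.0f} words per minute")
print(f"Faster than assumed reading speed: {generated_wpm > HUMAN_READING_WPM}")
```

With these assumed values, the generation rate works out to several hundred words per minute, comfortably above the assumed reading speed. Changing either assumption shifts the margin but not the overall picture.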

The Arm CPU provides the AI developer community with opportunities to experiment with their own techniques to provide further software optimizations that make LLMs smaller, more efficient and faster.

Enabling more efficient, smaller LLMs means more AI processing can take place at the edge. The user benefits from quicker, more responsive AI-based experiences, as well as greater privacy through user data being processed locally on the mobile device. Meanwhile, for the mobile ecosystem, there are lower costs and greater scalability options to enable AI deployment across billions of mobile devices.
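One way to see why smaller models matter on a phone is to estimate how much memory the weights alone occupy at different numeric precisions. The sketch below uses the parameter counts of the models named in this post. The bit widths are illustrative assumptions, not the settings used in the Arm demos, and the estimate ignores activations, the KV cache and per-block quantization metadata.

```python
def weight_footprint_gb(num_parameters: float, bits_per_weight: int) -> float:
    """Approximate weight storage in gigabytes (10^9 bytes).

    Counts only the raw weights; runtime memory will be higher.
    """
    return num_parameters * bits_per_weight / 8 / 1e9


# Parameter counts taken from the model names in the article.
MODELS = {"Phi-3 3.8B": 3.8e9, "Llama2-7B": 7.0e9}
# Assumption: common precisions chosen for illustration only.
BIT_WIDTHS = (16, 8, 4)

for name, params in MODELS.items():
    for bits in BIT_WIDTHS:
        size_gb = weight_footprint_gb(params, bits)
        print(f"{name} at {bits}-bit: ~{size_gb:.1f} GB")
```

Halving the bits per weight halves the storage, which is why lower-precision formats make it more practical to fit a multi-billion-parameter model within a mobile device's memory.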

We are also excited to see the developer open-source community engaged in working with models on Arm. This was demonstrated by the fact that developers in the open-source community managed to have the new Llama3 and Phi-3 3.8B models up and running on Arm in around 48 hours. We look forward to seeing more open-source engagement with generative AI on Arm.

Find out more information about the previous Llama2-7B demo and current Phi-3 3.8B demo from the Arm engineers that developed them in this technical blog.

Driving generative AI on mobile

As the most ubiquitous mobile compute platform and leader in efficient compute, Arm has a responsibility to enable the most efficient and highest-performing generative AI at the edge. We are already demonstrating the impressive performance of LLMs that are running entirely on our leading CPU technologies. However, this is just the start.

Through a combination of smaller, more efficient LLMs, improved performance on mobile devices built on Arm CPUs and innovative software optimizations from our industry-leading ecosystem, generative AI on mobile will continue to proliferate.

Arm is foundational to AI and we will enable AI everywhere, for every developer, with the Arm CPU at the heart of future generative AI innovation on mobile.

Any re-use permitted for informational and non-commercial or personal use only.

Editorial Contact

Brian Fuller & Jack Melling

editorial@arm.com


