
Accelerating Generative AI at the Edge on Arm with ExecuTorch Beta Release

By Alex Spinelli, VP, Developer Technology, Arm

News highlights:

  • The combination of the Arm compute platform and the ExecuTorch framework enables smaller, optimized models for faster generative AI at the edge
  • New quantized Llama models are ideal for on-device and edge AI applications on Arm, offering a reduced memory footprint and improved accuracy, performance and portability
  • 20 million Arm developers can create and deploy more intelligent AI-based applications more quickly, at scale, across billions of edge devices

To realize the true potential of AI, we need to make it accessible to the broadest range of devices and developers. By collaborating with the PyTorch team at Meta on the new ExecuTorch Beta release, we are fulfilling this mission, bringing AI and machine learning (ML) capabilities to billions of edge devices and millions of developers worldwide.

Generative AI improvements on the Arm Compute Platform with ExecuTorch and new quantized Llama 3.2 1B and Llama 3.2 3B models

The powerful combination of the ubiquitous Arm compute platform, which powers many of the world’s edge devices, and ExecuTorch, a PyTorch-native framework for deploying AI models on mobile and edge devices, is enabling smaller, optimized models, including the new quantized Llama 3.2 1B and 3B models. These quantized models are ideal for generative AI use cases on smaller devices, such as virtual chatbots, text summarization and AI assistants, as they offer a reduced memory footprint, higher accuracy, better performance and greater portability.

Developers can seamlessly integrate the new quantized models into their applications with no additional modifications or optimizations, saving time and resources. This empowers them to quickly create and deploy more intelligent AI-based applications at scale across a broad range of Arm-powered devices.
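As an illustration of that flow, the sketch below exports a small PyTorch module to an ExecuTorch .pte program with XNNPACK delegation, which is the route to CPU-optimized execution on Arm. It is a minimal sketch assuming the ExecuTorch Beta Python APIs (torch.export, to_edge, XnnpackPartitioner); TinyChatHead is a hypothetical stand-in for a real checkpoint such as quantized Llama 3.2 1B.

```python
import torch
from executorch.exir import to_edge
from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner

# Hypothetical stand-in model; in practice this would be a quantized
# Llama 3.2 1B/3B checkpoint loaded via the ExecuTorch Llama examples.
class TinyChatHead(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(512, 512)

    def forward(self, x):
        return torch.nn.functional.relu(self.proj(x))

model = TinyChatHead().eval()
example_inputs = (torch.randn(1, 512),)

# Capture the model graph, lower it to the Edge dialect, and delegate
# supported operators to XNNPACK, which carries the Arm CPU optimizations.
exported = torch.export.export(model, example_inputs)
edge = to_edge(exported)
edge = edge.to_backend(XnnpackPartitioner())

# Serialize the program; the .pte file is what the on-device
# ExecuTorch runtime loads and executes.
with open("tiny_chat_head.pte", "wb") as f:
    f.write(edge.to_executorch().buffer)
```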

As with the new Llama 3.2 large language model (LLM) releases, Arm is optimizing AI performance through the ExecuTorch framework, so real-world generative AI workloads run faster on edge devices built on the Arm compute platform. Developers can access these enhancements from day one of the ExecuTorch Beta release.

Video demo highlighting the benefits of the Arm and ExecuTorch collaboration

Accelerating generative AI on mobile with KleidiAI integration

In mobile, Arm’s work with ExecuTorch means virtual chatbots, text generation and summarization, and real-time voice and virtual assistants all run with improved performance entirely on device on the Arm CPU. This was achieved by integrating KleidiAI, which introduces micro-kernels optimized for 4-bit quantization, into ExecuTorch via XNNPACK, seamlessly speeding up the execution of LLMs with 4-bit quantization on the Arm compute platform. For example, the prefill stage of the quantized Llama 3.2 1B model now runs 20 percent faster with the KleidiAI integration, leading to speeds of over 400 tokens per second for text generation on some Arm-based mobile devices. The result is quicker, more responsive AI-based experiences for end users on their mobile devices.
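To sketch how a model picks up that 4-bit path, the example below applies torchao’s 8-bit dynamic activation, 4-bit weight quantization (the “8da4w” scheme used by the ExecuTorch Llama examples) before export. The module path, quantizer name and group size are assumptions based on the torchao releases current at the time of the Beta, and the toy model is a hypothetical stand-in for an LLM’s linear layers.

```python
import torch
from torchao.quantization.quant_api import Int8DynActInt4WeightQuantizer

# Hypothetical toy model standing in for an LLM's linear layers.
model = torch.nn.Sequential(
    torch.nn.Linear(4096, 4096),
    torch.nn.ReLU(),
    torch.nn.Linear(4096, 4096),
).eval()

# Quantize weights to 4-bit (grouped) with 8-bit dynamic activations:
# this is the layout the KleidiAI micro-kernels accelerate via XNNPACK.
quantizer = Int8DynActInt4WeightQuantizer(groupsize=128)
model = quantizer.quantize(model)

# The quantized model then goes through the same torch.export ->
# to_edge -> XNNPACK delegation flow shown earlier.
exported = torch.export.export(model, (torch.randn(1, 4096),))
```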

Learn more about Arm support for ExecuTorch in mobile markets in this blog.

Accelerating real-time processing for edge AI applications in IoT

Meanwhile, in IoT markets, the ExecuTorch work will improve real-time processing for edge AI applications across a broad range of IoT devices, from smart home appliances and wearables to autonomous systems used in retail and industrial IoT. Processing more real-time AI tasks at the edge means IoT devices and applications can respond to their environments in milliseconds, which is crucial for safety and functionality.

ExecuTorch can be leveraged across Arm’s Cortex-A CPUs and Ethos-U NPUs to accelerate the development and deployment of edge AI applications. In fact, by combining ExecuTorch, the Arm Corstone-320 reference platform (also available as an emulated Fixed Virtual Platform), the Arm Ethos-U85 NPU driver and compiler support in one package, developers can start creating their edge AI applications months before the platforms arrive on the market.
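A rough sketch of how that looks in code, assuming the compile-spec and partitioner APIs (ArmCompileSpecBuilder, ArmPartitioner) that ship with the Beta’s Arm examples; the target string, system config and memory mode below are illustrative values, not a tested configuration, and a real model would be quantized first since the Ethos-U executes integer operators.

```python
import torch
from executorch.exir import to_edge
from executorch.backends.arm.arm_backend import ArmCompileSpecBuilder
from executorch.backends.arm.arm_partitioner import ArmPartitioner

# Hypothetical small vision-style model for an IoT device. In practice
# the model would be quantized first (the Ethos-U runs integer ops).
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, 3),
    torch.nn.ReLU(),
).eval()
example_inputs = (torch.randn(1, 3, 32, 32),)

# Compile spec targeting the Ethos-U85 NPU, as emulated by the
# Corstone-320 Fixed Virtual Platform; values are illustrative.
compile_spec = (
    ArmCompileSpecBuilder()
    .ethosu_compile_spec(
        "ethos-u85-128",
        system_config="Ethos_U85_SYS_DRAM_Mid",
        memory_mode="Shared_Sram",
    )
    .build()
)

# Delegate supported operators to the NPU, then serialize the program
# for the on-device ExecuTorch runtime.
exported = torch.export.export(model, example_inputs)
edge = to_edge(exported).to_backend(ArmPartitioner(compile_spec))

with open("iot_model.pte", "wb") as f:
    f.write(edge.to_executorch().buffer)
```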

Learn more about Arm support for ExecuTorch in IoT markets in this blog.

More accessible, faster edge AI experiences

We believe that ExecuTorch has the potential to become one of the world’s most popular frameworks for efficient AI and ML development. By combining ExecuTorch with the ubiquitous Arm compute platform, we are accelerating the democratization of AI through new quantized models that empower developers to deploy their applications more quickly across more devices and bring more generative AI experiences to the edge.


Media Contacts

Melissa Woodbridge
Senior PR Manager
+44 7469 851193