Blog

March 3, 2025

On-device Audio Generation Accelerated by 30x with Arm Kleidi

Arm and Stability AI collaboration runs an optimized text-to-audio generative AI model entirely on smartphones powered by Arm CPUs

By Ronan Naughton, Director, Product Management, Client Line of Business, Arm

Imagine editing a video on your smartphone and needing the perfect sound effect or wanting to generate your own custom sound for a ringtone, alarm or social media post. Instead of searching online or purchasing audio clips, you type a description – “gentle ocean waves at sunset” – and within seconds, your device generates the perfect sound without even connecting to the internet. This seamless, on-the-spot audio generation entirely on the device has become a reality, thanks to a new collaboration between Arm and Stability AI.

Arm and Stability AI collaboration accelerates text-to-audio response times

To achieve this, Stability AI, which develops AI models for image, video, 3D, and audio, leveraged Arm KleidiAI, which offers optimized performance-critical routines – known as micro-kernels – tailored for Arm CPUs. Through the KleidiAI integrations into the XNNPack library and ExecuTorch framework and Stability AI’s own optimizations, the team unlocked significant AI performance improvements on Stability AI’s text-to-audio open model, “Stable Audio Open.”

The results are remarkable. Text-to-audio generation time drops from minutes to seconds, a 30x faster response. This is all while running the Stable Audio Open model entirely on smartphone devices on Arm CPUs – a first for text-to-audio AI – with no internet connection required.

Video: The Arm Stability AI audio generation demo

Stability AI used the automatic KleidiAI accelerations to speed up model responses for improved on-device AI performance without compromising quality. These KleidiAI performance uplifts require no additional developer effort from users of the Stable Audio Open model, saving time and costs. Arm and Stability AI are continuing to work together to implement yet more performance improvements that will further enhance this outstanding AI user experience.

The dramatic improvements show how targeted hardware and software integration can make previously unattainable AI applications feasible on mobile, fueling future innovation opportunities. They also mean that advanced AI audio capabilities are now within reach of billions of smartphone users worldwide, as 99 percent of the world’s smartphones are built on Arm technology.

Video: The Arm and Stability AI partnership

Solving complex AI challenges together

Despite the efficiency of the Stable Audio Open model, running it directly on-device on smartphone CPUs presented significant challenges. Initial attempts resulted in generation times exceeding four minutes for a single audio sample, rendering the experience impractical for end users.
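As a rough back-of-the-envelope check on these figures, the sketch below divides a four-minute baseline by the reported 30x speedup. The four-minute value is only the lower bound quoted above, and the function name is purely illustrative, so treat the result as an estimate rather than a measured benchmark.

```python
def time_after_speedup(baseline_seconds: float, speedup: float) -> float:
    """Return the estimated generation time once a speedup factor is applied."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_seconds / speedup


# Assumption: "exceeding four minutes" is treated as exactly 4 minutes (a lower bound).
baseline_seconds = 4 * 60
# Reported improvement in response time.
reported_speedup = 30.0

estimated_seconds = time_after_speedup(baseline_seconds, reported_speedup)
print(f"Estimated generation time: {estimated_seconds:.0f} s")
```

With these inputs the estimate comes to about eight seconds per clip, which is consistent with the "minutes to seconds" description.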

Working with Arm, Stability AI distilled the model down to a parameter count suited to mobile. Stability AI then ran this distilled model with the KleidiAI performance accelerations available through the XNNPack and ExecuTorch integrations, so it could generate audio clips in seconds on Arm mobile CPUs.

“As more and more professional creatives and businesses adopt generative AI to power their production pipeline, it’s important that our models and workflows are available everywhere for builders to build and creators to create. We are excited to partner with Arm for this exact reason. Arm’s prevalence across the ecosystem from the server to the smartphone and its work to accelerate AI models across all the popular frameworks by integrating Arm Kleidi into the software stack, made it a no brainer,” said Prem Akkaraju, CEO, Stability AI.

The rise of text-to-audio AI

Since 2022, Stability AI has been at the forefront of the generative AI evolution, initially making waves with Stable Diffusion, the industry-leading image model. Building upon this success, the company then introduced Stable Audio, one of the first fully licensed audio models designed to generate high-quality music and sound effects from textual prompts. These are some of the top-ranking AI models on leading platforms like Hugging Face, and they have cultivated an active community of millions of users.

Arm and Stability AI at MWC

At Mobile World Congress (MWC) 2025, Arm and Stability AI showcased the results of the KleidiAI accelerations on the Stable Audio Open model at the Arm booth in Hall 2 Stand I60. The demos were generated using Stability AI’s models and workflows and all ran offline on Arm-based hardware, including the vivo X200 Series of flagship smartphones built on the MediaTek Dimensity 9400 featuring the latest Armv9 CPUs.

Advanced audio AI experiences accessible to all

This is just the start of the partnership between Arm and Stability AI, with yet more performance optimizations planned to further enhance the user experience. Working together, we are setting the stage for on-device AI across audio, images, video, and 3D – reshaping how everyone creates content and interacts with digital media. By distilling advanced models and leveraging optimized software on ubiquitous hardware, we are paving the way for a future where sophisticated AI applications, models and experiences are accessible to all, directly from the devices in our pockets.

Arm and Stability AI

Learn more about the Arm and Stability AI partnership

Any re-use permitted for informational and non-commercial or personal use only.

Editorial Contact

Arm Editorial Team

editorial@arm.com
