Arm Ethos-U85: Addressing the High Performance Demands of IoT in the Age of AI
As artificial intelligence (AI) has a growing influence on our day-to-day lives, inferencing is migrating from the cloud to the edge and endpoints. Edge-based inferencing brings intelligence to a broad range of IoT devices, enabling data to be processed locally and decisions to be made in real time, with increased data privacy and security.
How do Arm’s Ethos NPUs enhance AI performance at the edge and endpoints?
Arm has been developing edge AI accelerators for several years to support the growing demand for edge and endpoint inferencing workloads. The Ethos-U55 and Ethos-U65 NPUs are two very successful products, bringing high-performance, energy-efficient solutions to AI applications at the edge and endpoints.
Ethos-U55 is deployed in many Cortex-M-based heterogeneous systems. The Ethos-U65 extends the applicability of the Ethos-U family to Cortex-A-based systems, while delivering twice the on-device machine learning (ML) performance. Both products offer a unified toolchain for easy development and support for common ML networks, including convolutional neural networks (CNNs) and recurrent neural networks (RNNs).
What is the impact of transformer architecture on AI development?
Introduced in 2017, transformer architecture has revolutionised generative AI and become the architecture of choice for many new neural networks. Transformer-based models can process sequential data using attention mechanisms and have achieved state-of-the-art results in many AI tasks, such as machine translation, natural language understanding, speech recognition, segmentation and image captioning.
These models can be adapted and compressed to run efficiently on edge devices without compromising much on accuracy, showcasing state-of-the-art capabilities across many edge and endpoint use cases.
What are the key advantages of the Ethos-U85 NPU for edge and endpoint workloads?
Building on the success of our previous Ethos-U family of NPUs, we have a new product offering: Ethos-U85. It is an accelerator with the same high-performance, energy-efficient philosophy as previous Ethos-U NPUs, while enabling current and upcoming transformer-based workloads at the edge and endpoints.
Ethos-U85 is the third-generation NPU in Arm’s Ethos-U product line and the highest-performing, most energy-efficient Ethos NPU to date. It delivers a 4x performance uplift and 20% higher power efficiency compared to its predecessor, with up to 85% utilization on popular networks. This addresses IoT applications with ever greater performance demands, such as factory automation and commercial or smart home cameras. It is also designed to run with both Cortex-M and Cortex-A-based systems and tolerates high DRAM latencies.
Some of the key features for Ethos-U85 include:
- Support for configurations from 128 to 2048 MACs/cycle – 256 GOPS to 4 TOPS at 1 GHz (see the arithmetic sketch after this list).
- Support for int8 weights and int8 or int16 activations.
- Support for transformer architecture networks, along with CNNs and RNNs.
- Native hardware support for 2:4 structured sparsity, doubling throughput.
- Internal SRAM of 29 to 267 KB and up to six 128-bit AXI5 interfaces.
- Support for weight compression, with both standard and fast weight decoder.
- Support for extended compression.
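As a quick sanity check on those throughput figures, the peak-rate arithmetic works out as in the minimal sketch below. It assumes the common convention of two operations per MAC (multiply plus accumulate), the 1 GHz reference clock quoted above, and a fully 2:4-sparse workload for the doubling:

```python
# Peak-throughput arithmetic for the Ethos-U85 configuration range (sketch).
OPS_PER_MAC = 2       # multiply + accumulate counted separately
CLOCK_HZ = 1e9        # 1 GHz reference clock used above

def peak_tops(macs_per_cycle: int, sparse_2_4: bool = False) -> float:
    """Peak trillions of operations per second for a given MAC configuration."""
    tops = macs_per_cycle * OPS_PER_MAC * CLOCK_HZ / 1e12
    return tops * 2 if sparse_2_4 else tops  # 2:4 sparsity doubles throughput

print(peak_tops(128))                     # ~0.256 TOPS (256 GOPS), smallest config
print(peak_tops(2048))                    # ~4.1 TOPS, largest config
print(peak_tops(2048, sparse_2_4=True))   # ~8.2 TOPS effective with 2:4 sparsity
```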
In addition to the operators currently supported by the Ethos-U55 and Ethos-U65, Ethos-U85 will include native hardware support for transformer networks and the DeepLabV3 semantic segmentation network by supporting operations such as TRANSPOSE, GATHER, MATMUL, RESIZE_BILINEAR, and ARGMAX.
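To illustrate why these operators matter for transformer workloads, the NumPy sketch below shows scaled dot-product attention, the core transformer building block: its heavy lifting is exactly the MATMUL and TRANSPOSE pattern listed above. Shapes and names are illustrative only and not tied to any specific Ethos-U85 mapping:

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Core transformer building block: matmul, transpose, softmax, matmul."""
    d_k = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_k)   # MATMUL + TRANSPOSE
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax
    return weights @ v                                   # MATMUL

# Illustrative shapes: batch of 1, sequence length 8, embedding dimension 16.
q = k = v = np.random.rand(1, 8, 16).astype(np.float32)
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (1, 8, 16)
```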
Ethos-U85 also supports elementwise operator chaining. Chaining combines an elementwise operation with the preceding operation, so the intermediate tensor does not have to be written to SRAM and read back. This improves NPU efficiency by reducing the amount of data transferred between the NPU and memory. Chaining is one of several efficiency improvements in Ethos-U85 over Ethos-U65, alongside the fast weight decoder, improved power efficiency of the MAC array, and improved elementwise efficiency.
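Conceptually, chaining behaves like the fused single pass in the sketch below: the elementwise add is applied as each element of the producing operation is computed, so no full-size intermediate tensor is written out and read back. This is a purely illustrative Python sketch of the data-movement saving, not Ethos-U85 behaviour at the hardware level:

```python
import numpy as np

x = np.random.rand(64, 64).astype(np.float32)
bias = np.random.rand(64, 64).astype(np.float32)

# Unchained: the producing operation writes its full result, and the
# elementwise add then re-reads that intermediate tensor in a second pass.
intermediate = x * 2.0            # stands in for the output of a previous op
unchained = intermediate + bias   # separate elementwise pass

# "Chained": both steps applied per element in one pass, so no full-size
# intermediate tensor is kept between them.
chained = np.empty_like(x)
for i in range(x.shape[0]):
    for j in range(x.shape[1]):
        chained[i, j] = x[i, j] * 2.0 + bias[i, j]

assert np.allclose(unchained, chained)
```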
Ethos-U85 can be used in the same system configurations as Ethos-U55 and Ethos-U65, and we are introducing the capability to drive Ethos-U85 directly from a Cortex-A-based system.
Ethos-U85 also supports the same software toolchain established with the previous Ethos-U products, based on the TensorFlow Lite Micro (TFLM) runtime. This extends the value of investments already made in systems using Cortex-A/Cortex-M with Ethos-U55/Ethos-U65, as Ethos-U85 builds on that foundation to enable wider use cases based on transformer networks. In the future, we expect to enable support for ExecuTorch, the PyTorch runtime for edge devices.
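For context, a typical deployment flow looks like the sketch below: quantize the model to int8 with the TensorFlow Lite converter, compile it offline with the Vela compiler for the target Ethos-U configuration, then run it on the device under TensorFlow Lite Micro. The toy model, file names, and the ethos-u85-256 accelerator-config value are illustrative assumptions to check against your toolchain version:

```python
import numpy as np
import tensorflow as tf

# 1. Convert and quantize a trained Keras model to int8 TensorFlow Lite,
#    matching the int8 weight/activation support of the Ethos-U family.
#    (This tiny model is a stand-in for your real network.)
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(96, 96, 1)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])

def representative_data():
    # Calibration samples for full-integer quantization.
    for _ in range(100):
        yield [np.random.rand(1, 96, 96, 1).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
with open("model_int8.tflite", "wb") as f:
    f.write(converter.convert())

# 2. Compile offline with the Vela compiler so supported operators map onto
#    the NPU. Shown as a comment; the exact accelerator-config string for
#    Ethos-U85 is an assumption to verify against your Vela version:
#      vela model_int8.tflite --accelerator-config ethos-u85-256
# 3. Deploy the Vela-optimized .tflite on the device under the TensorFlow
#    Lite Micro runtime with the Ethos-U NPU driver.
```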
Operators supported by Ethos-U85 are accelerated on the NPU itself; any unsupported operators fall back to the CPU, where some can be accelerated on Cortex-M-based systems using CMSIS-NN. For example, in the case of TinyLlama, the model was fully mapped onto Ethos-U85 with no operator fallback to the CPU.
And finally, Ethos-U85 sits right at the heart of Corstone-320, our latest IoT Reference Design Platform. This helps accelerate the development and deployment of high-performance systems-on-chip (SoCs) across a variety of AI-based IoT solutions.
Unleash every AI capability at the edge
Ethos-U85 will bring the compute power necessary to execute many state-of-the-art AI capabilities at the edge and on endpoint devices. As the world of AI develops, our partners will have reliable, efficient, and high performing Ethos-U based solutions. We expect to see Ethos-U85 deployed in emerging edge AI use cases, in smart home, retail or industrial settings, where there is demand for higher performance compute with support for the latest AI frameworks.
At Arm, we take pride in enabling our partners and ecosystem, with cutting-edge hardware and software solutions. With Ethos-U85, we are opening a world of possibilities of edge and endpoint-based AI inference use-cases that will transform the world. Arm is taking edge AI innovation to the next level as we continue to build the future of edge AI on Arm.
Learn more about the Arm Ethos-U85 here.