From Possibility to Reality: Enabling AI and ML at the Edge with Arm
When you think of artificial intelligence (AI) and machine learning (ML), you naturally think of tractors, right? Of course not – but this comparison, made in the Economist, can be a helpful one.
Here’s why. Despite tractors’ heralded launch in the early 20th century, farmers were slow to embrace the technology. Only 23% of U.S. farms used them by 1940. Why the slow uptake? Limited functionality, reliability issues, maintenance challenges, and prohibitive costs for the most part. But despite the challenges, most farmers could see the transformation these machines would bring once the bugs were ironed out and they became more economically attractive.
The pace of technological adoption today far outstrips that of the 20th-century agriculture sector, but the lessons from the tractor's evolution are relevant to the early adoption of AI and ML at the edge. Put another way, the case for investing in AI systems must move from amazing possibilities (1940s farmers gazing admiringly at tractors) to realistic implementation plans – the equivalent of the concrete gains that ultimately sold the tractor: greater farming efficiency, the diversification and intensification of agriculture, and an ecosystem of specialized attachments and services.
Before AI and ML at the edge can reach scale, however, several obstacles to widespread adoption must be overcome.
Navigating the diversity: How are AI and ML shaping the future of IoT at the edge?
One of the challenges of deploying AI and ML at the edge is the diversity of hardware available for different applications and use cases. Often, the variety of hardware options means that developers must tailor their models and code for the specific hardware they are targeting, which adds complexity and overhead to the development process.
In reality – just as in mobile and high-performance IoT – the majority of ML models run on CPUs. The common denominator in IoT is the Arm architecture. In 2020, Arm launched Helium as a seamless extension to the Cortex-M instruction set, enabling ML acceleration on ultra-low-power devices. With Helium, developers can achieve up to 15x more performance and 5x more energy efficiency for ML applications compared to previous Cortex-M generations. More than 35 partners are already shipping devices with Helium technology, including NXP, Renesas, Ambiq, and Alif. Embedded World 2024 will see even more devices built on Helium, as we enter a decade of AI innovation in embedded systems.
The natural progression in this performance journey is the Arm family of Ethos NPUs, designed to deliver the highest performance and efficiency for ML workloads at the edge. Ethos NPUs are scalable and configurable, offering different levels of performance and power consumption for different applications, such as computer vision, natural language processing, speech recognition, and recommendation systems. Ethos NPUs can be integrated with any Arm-based system-on-chip (SoC), providing a seamless solution for ML acceleration on devices ranging from smart speakers to security cameras.
What does the AI model lifecycle look like at the edge?
Another challenge is the lifecycle of AI models, which includes training, tuning, and deployment. To deploy AI models at the edge, developers need to consider how to optimize the models for the specific hardware they are targeting. This involves choosing the right model architecture, data format, quantization scheme, and inference engine for the embedded device – in particular, an inference engine that can leverage hardware features such as an Ethos NPU or Helium technology to accelerate execution of the model.
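To make the quantization step concrete, here is a minimal sketch of symmetric, per-tensor int8 post-training quantization – a common scheme for preparing weights for an embedded inference engine. The function names and values are purely illustrative, not part of any Arm tool or API:

```python
def quantize_int8(weights):
    """Map float weights to int8 using a single symmetric scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    # Clamp to the int8 range [-128, 127] after rounding.
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the int8 representation."""
    return [v * scale for v in q]

weights = [0.9, -1.27, 0.05, 0.0]
q, scale = quantize_int8(weights)   # q == [90, -127, 5, 0], scale ~= 0.01
approx = dequantize(q, scale)       # close to the original weights
```

Storing one byte per weight instead of four is what makes it practical to fit models into the flash and SRAM budgets of a microcontroller, at the cost of a small, bounded loss of precision.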
Arm makes it easy to use popular ML frameworks, such as PyTorch and ExecuTorch, on embedded devices. For example, Arm Keil MDK, the integrated development environment (IDE) that simplifies the development and debugging of embedded applications, supports CMSIS Packs, which provide a common abstraction layer for device capabilities and ML models. Simplified development flows are bringing AI within reach on a single toolchain and single proven architecture, with more than 100 billion Cortex-M devices shipped to date amid a global ecosystem of more than 100 ML partners.
By using Arm solutions, developers can reduce the time and cost of developing ML applications for embedded devices and achieve better performance and efficiency.
How do embedded devices overcome ML constraints for optimal performance?
One of the main challenges of embedded development is to optimize the performance and efficiency of ML applications on resource-constrained devices. Unlike cloud-based solutions, which can leverage the abundant computing power and memory of servers, embedded devices have to run ML models locally and often under strict power and latency constraints. To achieve the desired ML performance, developers often have to compromise on price or power consumption in the first iteration of a product.
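A back-of-envelope sizing check captures the constraint: do the quantized weights fit in flash, and does the peak activation buffer fit in SRAM? The figures below are illustrative assumptions, not the specifications of any particular device:

```python
def model_fits(num_params, peak_activation_bytes,
               flash_bytes, sram_bytes, bytes_per_param=1):
    """With int8 quantization, each parameter costs 1 byte by default."""
    weight_bytes = num_params * bytes_per_param
    return weight_bytes <= flash_bytes and peak_activation_bytes <= sram_bytes

# A ~250k-parameter keyword-spotting model on a hypothetical MCU
# with 1 MB of flash and 256 KB of SRAM:
ok = model_fits(num_params=250_000,
                peak_activation_bytes=96 * 1024,
                flash_bytes=1024 * 1024,
                sram_bytes=256 * 1024)   # fits
```

Running this kind of budget check early – long before silicon is chosen – is exactly the class of iteration that cloud-based simulation aims to make cheap.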
Arm Virtual Hardware, which offers cloud-based simulations of Arm-based systems, is an innovative solution that allows developers to create and test ML applications without having to rely on physical hardware. It integrates seamlessly with MLOps solutions, such as AWS SageMaker and Google Cloud AI Platform, to streamline the deployment and management of ML models across devices. These platforms provide tools and services for automating the entire ML lifecycle, from data management and model training to deployment and monitoring. By combining Arm Virtual Hardware and MLOps solutions, developers can achieve faster time to market, lower costs and better scalability for their embedded ML applications.
How can we effectively deploy and secure intellectual property on edge devices?
Deploying and securing valuable intellectual property across millions of endpoints is a major challenge. This stems from the fact that ML models are essentially mathematical functions that can be extracted and replicated by anyone with access to the device or the data stream. This exposure leaves devices and data open to tampering, manipulation, or malicious attacks that could compromise their functionality and reliability. Developers, therefore, need to ensure that their ML models are protected and cannot be easily reverse engineered.
One of the ways that Arm helps developers deploy and secure their ML models on edge devices is by working within the framework provided by PSA Certified. Based on the Platform Security Architecture (PSA) – best practices and specifications developed by Arm and its partners to help secure IoT devices – PSA Certified enables users to verify and trust the security of IoT products, and comply with regulations and standards.
What innovations are accelerating AI and ML at the edge?
The emergence of AI and ML is reshaping the landscape of embedded systems, and this will be on full display next week at Embedded World in Nuremberg, an event that’s quickly evolving into what you might call “Edge AI World.”
Last year, we and our partners talked about the myriad ways some familiar challenges of embedded development were being tackled – whether it was the rise of development solutions such as Arm Virtual Hardware, the emergence of new industry standards, or the adoption of the Arm architecture to enable flexibility and efficiency while minimizing security risk.
At this year’s Embedded World, we confront the dizzying pace of innovation of AI and ML at the edge and the consequences for the Arm developer ecosystem. Consider that with the rise of interconnected devices at the IoT edge, there’s an exponential surge in data, providing ample opportunity for AI algorithms to process and derive real-time insights. And while the spotlight often shines on generative AI and large language models (LLMs), smaller models are making their mark by being deployed on edge IoT devices, such as Raspberry Pi. Transformer network models are also making waves at the edge, setting themselves apart from conventional convolutional neural networks (CNNs) by their inherent flexibility.
The accelerated pace of change is breathtaking. We at Arm are excited to play a vital role in enabling AI in high-performance IoT devices and systems. Our vision is to deliver intelligent and secure devices and systems that can empower innovation and transform lives. Arm remains committed to assisting developers in tackling challenges by offering:
- Optimized hardware and software for AI in high-performance IoT that carefully balance performance, power consumption, cost-effectiveness, security and scalability.
- Streamlined tools and platforms that democratize the development and deployment of AI in high-performance IoT, empowering developers and system builders from diverse backgrounds to create and tailor solutions according to their needs.
- Robust ecosystem support and strategic partnerships that drive the adoption and maximize the impact of AI in high-performance IoT, encouraging collaboration and co-creation across various stakeholders and industries.
These are the pillars of our vision for AI at the IoT edge, which we believe – in the same way the tractor revolutionized farming and the food chain – will transform the way we interact with the physical world and unlock new possibilities for human creativity and innovation.
Join us at Embedded World at the Nuremberg Messe, Hall 4, Stand 504.