Memory Safety: How Arm Memory Tagging Extension Addresses this Industry-wide Security Challenge
Learn all about Memory Tagging Extension, how it is being implemented across the Arm mobile ecosystem and why it’s a vital security feature to tackle the industry-wide challenge of memory safety bugs.
By Michael Lu, Director of Strategy (Security and Privacy), Client Line of Business, Arm
The future of computing will be driven by the increasing digitization of all aspects of our everyday lives, which in turn increases software and system complexity. The National Institute of Standards and Technology (NIST) recorded more than 23,000 vulnerabilities in 2022 (with more than 17,000 classified as critical), setting a record for the sixth straight year.
Through Arm CPUs built on the latest v9 architecture, we are providing security features like Memory Tagging Extension (MTE) to reduce this complexity and provide profound security, safety, cost and time-to-market benefits for software developers, silicon vendors and device manufacturers. Armv9 security improvements remove up to 95 percent of certain classes of vulnerabilities, such as memory safety violations, which account for the majority of all serious security bugs.
Arm’s MTE was first introduced as part of the Armv8.5 instruction set in August 2019 before being built into the first Armv9 compliant CPUs that were announced in May 2021. Even before the introduction of the Armv9 architecture, Google announced that it was adopting Arm’s MTE in Android and committed to supporting MTE across the entire Android stack. Then, at the end of 2022, Honor announced at its developer conference that it would make MTE-enabled MagicOS 6.x and MagicOS 7 devices available to developers through Honor SkyNet and in future DiagnosisKit tools. In the future, this could apply to Honor mobile devices that come to market for the end-user.
In this article, we answer some of the key questions about MTE, including what it is, how it solves security challenges like memory safety, the benefits it provides and what our partners are doing with the feature.
What does MTE do? And how does it lead to better software in the Arm ecosystem?
MTE allows developers to find memory-related bugs quickly, speeding up the application debugging and development process. Moreover, because the feature's configuration can be changed dynamically, accurate information about the location of an access failure in the field can be relayed back to developers through bug reporting and telemetry systems.
It’s worth noting that the first encounter that many developers have with MTE may reveal far more vulnerabilities than they will be able to fix. However, developers can decide to fix the most serious vulnerabilities before a launch and then track down the less severe ones during updates. Also, as time goes on, developers’ code will become progressively cleaner, with subsequent full-body scans pulling up fewer flaws, making the process less time-consuming. The frequency of crashes, complaints, and fire drills will also decline.
MTE is beneficial for the wider mobile ecosystem because it allows developers to detect and avoid memory safety vulnerabilities before and after deployment. Locating and fixing vulnerabilities before deployment is important for security because it reduces the attack surface of deployed code. Detecting vulnerabilities after deployment supports the ability to reactively fix vulnerabilities before they are widely exploited, with MTE assisting with this detection. This provides robustness against attacks that are attempting to subvert security code.
Why is it so important to resolve memory safety violations?
Memory safety has been a major source of security vulnerabilities for decades. Operating system vendors (OSVs) report that vulnerabilities due to violations of memory safety account for most of the security issues in their products. Google’s Chromium Project team stated that 70 percent of all serious security bugs are memory safety issues.
The impact of memory safety violations on users can be substantial. Rogue applications can take advantage of unsafe memory to read sensitive data, such as user credentials and passwords, giving bad actors access to confidential information. Alongside the security risks, the disruption caused by unaddressed memory safety bugs reduces user satisfaction, increases the cost of software development and means more time spent addressing these issues at a later date.
The National Security Agency (NSA) recently published guidance to help software developers and operators prevent and mitigate software memory safety issues, with its Cybersecurity Technical Director, Neal Ziring, stating that “memory management issues have been exploited for decades and are still entirely too common today.” The agency’s “Software Memory Safety” Cybersecurity Information Sheet highlights how malicious cyber actors can exploit poor memory management issues to access sensitive information, promulgate unauthorized code execution, and cause other negative impacts.
What are memory safety violations?
There are two main types of memory safety violations: spatial and temporal. MTE provides the mechanism to detect both types in production code with no instrumentation.
Spatial safety is violated when an object is accessed outside of its true bounds, for example when data is written beyond the end of a buffer or other object. This may be exploited to alter the target address of a function pointer, saved register, or similar.
Temporal safety is violated when a reference to an object is used after it has expired, typically after the object’s memory has been freed – exploiting an existing “use after free” bug. Using knowledge of the allocator, an attacker can place a new and malicious object in place of the expected version.
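To make the two classes concrete, here is a minimal Python sketch of how a tag-based check can catch both. It is a toy simulation under stated assumptions, not Arm's hardware or any real allocator: the names ToyTaggedHeap, TagMismatch and violates are hypothetical, a "pointer" is modelled as a (tag, address) pair, and the toy allocator always gives neighbouring and freed regions a different tag, whereas a real allocator may choose tags differently.

```python
import random

GRANULE = 16  # bytes covered by one memory tag


class TagMismatch(Exception):
    """Raised when a pointer tag differs from the tag stored for the memory."""


class ToyTaggedHeap:
    """Hypothetical bump allocator with per-granule tags (illustration only)."""

    def __init__(self, size, seed=0):
        self.mem_tags = [0] * (size // GRANULE)  # one 4-bit tag per granule
        self.sizes = {}                          # base address -> granule count
        self.next_granule = 0
        self.rng = random.Random(seed)

    def _fresh_tag(self, avoid):
        # Assumption of this toy model: tag 0 means "untagged", and the
        # allocator never hands out a tag listed in `avoid`.
        return self.rng.choice([t for t in range(1, 16) if t not in avoid])

    def malloc(self, nbytes):
        count = -(-nbytes // GRANULE)  # round up to whole granules
        first = self.next_granule
        if first + count > len(self.mem_tags):
            raise MemoryError("toy heap exhausted")
        neighbour = self.mem_tags[first - 1] if first else 0
        tag = self._fresh_tag({neighbour})
        self.mem_tags[first:first + count] = [tag] * count
        self.sizes[first * GRANULE] = count
        self.next_granule += count
        return (tag, first * GRANULE)  # a "pointer" is (tag, address)

    def free(self, ptr):
        tag, base = ptr
        count = self.sizes.pop(base)
        first = base // GRANULE
        # Retag the freed granules so stale pointers no longer match.
        new_tag = self._fresh_tag({tag})
        self.mem_tags[first:first + count] = [new_tag] * count

    def load(self, ptr, offset):
        tag, base = ptr
        granule = (base + offset) // GRANULE
        if self.mem_tags[granule] != tag:
            raise TagMismatch(
                f"pointer tag {tag} != memory tag {self.mem_tags[granule]}"
            )
        return True


def violates(heap, ptr, offset):
    """Return True if the access would be flagged as a tag mismatch."""
    try:
        heap.load(ptr, offset)
        return False
    except TagMismatch:
        return True


heap = ToyTaggedHeap(64)
a = heap.malloc(16)
b = heap.malloc(16)
in_bounds = violates(heap, a, 0)       # access inside the object
overflow = violates(heap, a, 16)       # spatial: one byte past the end of `a`
heap.free(a)
use_after_free = violates(heap, a, 0)  # temporal: `a` is now a stale pointer
```

In this model the in-bounds access passes, while the overflow into the neighbouring object and the access through the freed pointer are both flagged. Note that the check works at granule granularity, so an overflow that stays inside the same 16-byte granule would not be flagged by this sketch.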
How does MTE work?
Arm implements MTE as a two-phase system, known as the ‘lock’ and the ‘key’. If the key matches the lock, then the memory access is permitted; otherwise the access can be recorded or faulted. In this way, hard-to-catch memory safety errors can be detected more easily, which also aids general debugging.
Within the lock and key two-phase system, there are two types of tagging:
Address tagging, which acts as the key. This adds four bits to the top of every pointer in the process. Address tagging only works with 64-bit applications since it uses ‘top-byte-ignore’, which is an Arm 64-bit feature.
Memory tagging, which acts as the lock. Memory tags also consist of four bits, linked with every aligned 16-byte region in the application’s memory space. Arm refers to these 16-byte regions as tag granules. These four bits are not used for application data and are stored separately.
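The lock-and-key comparison can be sketched with plain integer arithmetic. The Python below is an illustration only, not how software normally interacts with MTE: the function names are hypothetical, the memory tags live in an ordinary dictionary, and placing the four key bits at bits 59:56 of the pointer reflects our reading of the Arm architecture rather than anything stated in this article.

```python
TAG_BITS = 4
TAG_SHIFT = 56                      # assumption: key occupies pointer bits 59:56
TAG_MASK = 0xF << TAG_SHIFT
PTR_MASK = 0xFFFF_FFFF_FFFF_FFFF    # keep values within 64 bits
GRANULE_BYTES = 16                  # one lock (memory tag) per 16-byte granule


def set_tag(ptr, tag):
    """Return `ptr` with its key bits replaced by the 4-bit `tag`."""
    return (ptr & ~TAG_MASK & PTR_MASK) | ((tag & 0xF) << TAG_SHIFT)


def get_tag(ptr):
    """Extract the 4-bit key from a tagged pointer."""
    return (ptr >> TAG_SHIFT) & 0xF


def strip_tag(ptr):
    """Return the address with the key bits cleared."""
    return ptr & ~TAG_MASK & PTR_MASK


def granule_index(ptr):
    """Index of the 16-byte tag granule that the address falls in."""
    return strip_tag(ptr) // GRANULE_BYTES


def access_allowed(ptr, memory_tags):
    """Key (pointer tag) must equal lock (memory tag); untagged memory is 0."""
    return memory_tags.get(granule_index(ptr), 0) == get_tag(ptr)


# Tag storage cost implied by 4 bits per 16-byte granule.
tag_overhead = TAG_BITS / (GRANULE_BYTES * 8)   # 4 / 128

addr = 0x0000_7FFF_1234_5670
key_ptr = set_tag(addr, 0xA)
locks = {granule_index(key_ptr): 0xA}           # lock for this granule

ok = access_allowed(key_ptr, locks)                   # key matches lock
wrong_key = access_allowed(set_tag(addr, 0xB), locks) # different key
next_granule = access_allowed(key_ptr + 16, locks)    # neighbouring granule
```

Here `ok` is true, while `wrong_key` and `next_granule` are false because the key no longer matches the lock for the granule being accessed. The arithmetic also shows the storage cost of the scheme described above: four tag bits per 128 bits of data is 1/32, or about 3 percent.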
Why do mobile devices need MTE?
Mobile devices are coming to market with more advanced compute capabilities and, as a result, larger attack surfaces. At the same time, the quantity and value of personal content and data becoming available through these devices are increasing all the time. Therefore, it is vital to implement a security feature that delivers a secure ecosystem and safe digital experience for the end-user.
MTE is very flexible and can be deployed in different configurations at various stages of product development and deployment. For example, MTE can be configured, per-process, in asynchronous and synchronous modes. Asynchronous mode runs with very low overhead and can be used to identify areas of code with memory problems, while synchronous mode faults on the instruction causing a safety violation, producing rich debugging information as the bug is detected. This flexibility is especially useful for large scale deployments, with MTE highly scalable and capable of being run across a fleet of millions, or even billions of devices, providing robust error detection for system and application software.
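The difference between the two modes can be sketched as follows. This is a toy Python model under stated assumptions, not the operating system interface: ToyProcess, SyncTagFault and checkpoint are hypothetical names, and the idea that an asynchronous mismatch is remembered and reported at a later checkpoint, without the faulting instruction, is a simplification for illustration.

```python
class SyncTagFault(Exception):
    """Raised at the faulting access in the toy synchronous mode."""


class ToyProcess:
    """Hypothetical per-process tag-check configuration (illustration only)."""

    def __init__(self, mode):
        if mode not in ("sync", "async"):
            raise ValueError("mode must be 'sync' or 'async'")
        self.mode = mode
        self.pending_fault = False  # set by a mismatch in async mode
        self.completed = []         # accesses that ran to completion

    def access(self, instruction, ptr_tag, mem_tag):
        if ptr_tag != mem_tag:
            if self.mode == "sync":
                # Precise: the report names the instruction that misbehaved.
                raise SyncTagFault(f"tag mismatch at {instruction!r}")
            # Imprecise: remember that something went wrong and carry on.
            self.pending_fault = True
        self.completed.append(instruction)

    def checkpoint(self):
        """Report and clear any mismatch recorded since the last checkpoint."""
        if self.pending_fault:
            self.pending_fault = False
            return "tag mismatch detected (faulting instruction unknown)"
        return None


# The same buggy access under both configurations.
debug_build = ToyProcess("sync")
try:
    debug_build.access("store_1", ptr_tag=0x3, mem_tag=0x7)
    sync_report = None
except SyncTagFault as fault:
    sync_report = str(fault)

field_build = ToyProcess("async")
field_build.access("store_1", ptr_tag=0x3, mem_tag=0x7)
field_build.access("load_2", ptr_tag=0x7, mem_tag=0x7)
async_report = field_build.checkpoint()
```

In this model the synchronous process stops at the offending access and its report identifies the instruction, while the asynchronous process keeps running and only learns later that a mismatch occurred somewhere. That mirrors the trade-off described above: richer debugging information in one mode, lower overhead in the other.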
Who is Arm working with on MTE?
In August 2019, Google announced that it was adopting Arm’s MTE in Android and committed to supporting MTE across the Android stack, stating that the technology makes “it very hard (if not impossible) to exploit memory bugs.”
The aim of the Arm/Google MTE collaboration is to detect memory safety bugs in both existing codebases and in new code as it is written. Here’s what Google’s Kostya Serebryany and Sudhi Herle had to say:
We believe that memory tagging will detect the most common classes of memory safety bugs in the wild, helping vendors identify and fix them, discouraging malicious actors from exploiting them.
Android 12 has added an initial MTE implementation that detects use-after-free and buffer-overflow bugs, which are the most common source of memory safety bugs in the Google codebases. In Android 13, Google has added a developer mode boot switch to enable MTE on devices that have hardware support but do not have MTE permanently switched on. For future Android releases, Arm and Google are focusing on lowering the memory used by MTE.
What about silicon vendors and device manufacturers?
MTE is an intrinsic feature that is part of all of Arm’s v9 CPUs. Multiple Arm partners who are committed to tackling memory safety bugs in the software ecosystem have already built and enabled this feature across their chipsets. One device manufacturer who is leading the charge for MTE adoption is Honor, with the company announcing that its MTE-enabled MagicOS 6.x and MagicOS 7 devices will be available to developers through its Honor SkyNet and in future DiagnosisKit tools. This is a significant indication that MTE could be switched on across mobile devices built on Armv9 technology that are coming to the consumer market.
Already this work is having a positive impact. Kuaishou – a leading content community and social platform for video sharing and live streams with over 360 million daily average users and 626 million monthly average users, making it the second-largest short video platform in the world – is partnering with Honor SkyNet and using Arm MTE to enhance the efficiency of memory safety across the development cycles of large-scale software projects. This has led to 90 percent of memory bugs being detected before release, alongside the following additional improvements:
3x scan speed improvement when scanning for memory bugs before app deployment
Native RAM usage decreasing by 50 percent
Ten serious bugs found that could not have been detected without MTE.
Here’s what Kuaishou – whose overseas products, Kwai and SnackVideo, cover more than 30 countries and 160 million users – had to say about MTE:
At Kuaishou, our mission is to be the most customer-obsessed company in the world. Therefore, we place a high emphasis on ensuring the ultimate user experience, with privacy and safety a significant part of this commitment. Kuaishou’s R&D team has committed serious efforts to ensuring a safe and secure user experience, with MTE playing a significant role in guaranteeing memory safety across the platform. Due to the high performance overhead of traditional memory sanitizer tools and the requirement of re-compiling all of the source code, it is almost impossible to use these tools in Kuaishou’s large C++ codebase daily practice on mobile platforms.
Why did Arm build MTE into its products and solutions?
MTE is part of a range of new and existing security features in the Armv9 architecture to improve security across all consumer market segments. This means our partners can achieve better value from their own software investment into security measures, leading to a more standardized and scalable solution that can address a diversity of security challenges.
What is Arm’s latest MTE development?
Through the second-generation Armv9 CPUs announced in June 2022, we introduced a brand-new Asymmetric MTE, which offers improved flexibility between the speed, precision, and targeting of these security vulnerabilities. This benefits software development with more stable applications, while also enabling a broader rollout of MTE across the ecosystem. Chipsets that adopt Arm Cortex-X3, Cortex-A715 and the newly updated version of Cortex-A510 (A510-r1) CPUs will have asymmetric MTE built in.
A safe and secure digital experience built on Arm
Arm is redefining the future of computing, with MTE being one key feature that is delivering secure mobile experiences. Arm is working with all partners across the mobile ecosystem – silicon vendors, device manufacturers, OSVs and developers – to encourage the implementation of the MTE feature to reduce time and costs and deliver safe and secure user experiences. Best of all for our ecosystem is that MTE can be deployed easily at scale. Through the Armv9 architecture – which is the computing foundation for billions of mobile devices worldwide – MTE is now widely available, with the world’s digital security being built on Arm.