Blog August 16, 2021
Arm Memory Tagging Extension: Securing Software Starts with Hardware
Our thinking behind the Arm Memory Tagging Extension (or MTE), a new architectural feature in Armv9
By Travis Walton, Distinguished Engineer and Director of Developer Ecosystem Strategy, Arm
Physicist Art Rosenfeld was working late at Lawrence Berkeley National Lab one night in 1973 when he noticed it. Despite an ongoing energy crisis, his colleagues routinely left their lights on after they left.
Waste was one of the largest consumers of power in the state, he soon discovered: pilot lights consumed 10 percent of gas in homes. Switching from physics to politics, he led the charge to pass laws mandating efficiency standards for buildings and appliances.
Appliance makers “claimed it was the (expletive) end of civilization,” he said.
Fast forward nearly 50 years. California refrigerators consume around half as much power, hold 60 percent more food and cost roughly the same in real dollars. More importantly, per capita electricity consumption has stayed roughly flat in California. In the rest of the U.S., it’s grown by nearly 50 percent. The Rosenfeld effect is now a major policy consideration worldwide.
Software faces a similar dilemma with memory vulnerabilities. Both Microsoft and Google estimate that memory corruption is at the root of approximately 70 percent of the vulnerabilities identified in their products. Other developers likely experience the same thing. Many applications are written in C and C++, which predate the modern security era.
While many vulnerabilities are merely annoying, their impact is substantial. Customers get annoyed and trade messages on social media channels. Developers waste hours tracking down the root cause of the problem. Reputations and revenue falter. In the worst cases, the vulnerability becomes a gateway for data theft or denial of service attacks. Eliminating them would be a boon for everyone.
Yet the problem persists. Developers complain they don’t have time to filter through libraries or conduct more exhaustive reviews of their own code before launch. Social media and streaming services would like to see the problem solved, but don’t have the power or means to change it. And while the ultimate costs might be substantial, they are also down the road.
NIST examined the cost of bug fixes in a famous paper in 2002 and the lessons still hold true today – 75 percent of bugs can be detected in the integration phase or earlier and the cost of delay grows exponentially.
To break the cycle, we need to find a way to ferret out bugs at the earliest possible stage in the product development cycle and with the minimum amount of time, which means removing them at the silicon level. Like in California in the 70s, starting will be the hardest part. Over time, better code hygiene will become baked into the development process. The upfront costs will decline and the benefits will increase.
Hardware or software?
So how do we get clean? While software fixes have been developed, they probably aren’t suitable for mass adoption. Hardware-assisted AddressSanitizer (HWASan), for example, detects 99 percent of bugs, but it comes with a 50 percent code overhead penalty and can take up nearly 20 percent of RAM capacity when in use. GWP-ASan, meanwhile, has a very small footprint but also has a low detection rate.
We’ve decided to include a hardware approach. Memory Tagging Extension (or MTE), a relatively new feature in the Arm architecture and one that will become more pervasive with Armv9, effectively cross-checks the validity of pointers to memory locations before using them. Think of it as the equivalent of an airline employee checking boarding passes right before boarding.
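To make the boarding-pass analogy concrete, here is a minimal sketch in Python of the cross-check described above. It is a toy software model, not Arm's implementation or any real allocator: the class and method names (`TaggedMemory`, `allocate`, `free`, `load`) are hypothetical, a "pointer" is simply an (address, tag) pair, and the rule that neighboring blocks never share a tag is an assumption made so the example is deterministic. The 4-bit tag and 16-byte granule sizes are taken from the figures quoted in this article.

```python
import random

GRANULE = 16   # bytes covered by one tag (figure quoted in the article)
TAG_BITS = 4   # width of each tag (figure quoted in the article)


class TagMismatch(Exception):
    """Raised when a pointer's tag differs from the tag on the memory it touches."""


class TaggedMemory:
    """Toy model: one tag per 16-byte granule, checked on every access."""

    def __init__(self, size, seed=0):
        self.data = bytearray(size)
        self.tags = [0] * (size // GRANULE)
        self.rng = random.Random(seed)

    def _retag(self, addr, size):
        first = addr // GRANULE
        last = (addr + size - 1) // GRANULE
        # Assumption for this sketch: avoid the old tags and both neighbors'
        # tags, so stale pointers and adjacent overflows never match.
        avoid = set(self.tags[max(first - 1, 0):last + 2])
        tag = self.rng.choice(
            [t for t in range(1, 2 ** TAG_BITS) if t not in avoid]
        )
        for granule in range(first, last + 1):
            self.tags[granule] = tag
        return tag

    def allocate(self, addr, size):
        """Tag the block and return a 'pointer' carrying the same tag."""
        return (addr, self._retag(addr, size))

    def free(self, ptr, size):
        """Retag the block so any pointer kept after the free goes stale."""
        self._retag(ptr[0], size)

    def load(self, ptr, offset=0):
        """Read one byte, but only if the pointer tag matches the memory tag."""
        addr, tag = ptr
        if self.tags[(addr + offset) // GRANULE] != tag:
            raise TagMismatch(f"pointer tag {tag} rejected at {addr + offset}")
        return self.data[addr + offset]
```

In this model, reading within a block succeeds, reading 16 bytes past the start of a 16-byte block raises `TagMismatch` (a buffer overflow into the neighbor), and reading through a pointer after `free` raises it too (a use-after-free). Those are the two broad classes of bug the tag check is meant to surface.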
In a consumer device, it runs in asynchronous mode, highlighting areas of code with memory problems. This has a minimal performance cost. During development, it runs in synchronous mode, checking code precisely, instruction by instruction as it executes, but at a greater performance cost. MTE requires 4 bits for every 16 bytes of memory protected, or roughly 3 percent of capacity. Tests and simulations show the performance cost is minimal. You can get a deeper technical dive here.
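The "roughly 3 percent" figure follows directly from the two numbers above: 4 tag bits for every 16 bytes (128 bits) of protected memory. The short Python sketch below works through that arithmetic. The function names and the 8 GiB example are illustrative assumptions, not anything defined by Arm.

```python
TAG_BITS = 4          # tag bits per granule (figure quoted in the article)
GRANULE_BYTES = 16    # bytes protected by one tag (figure quoted in the article)


def tag_overhead_fraction(tag_bits=TAG_BITS, granule_bytes=GRANULE_BYTES):
    """Fraction of extra storage needed for tags, relative to protected memory."""
    return tag_bits / (granule_bytes * 8)


def tag_storage_bytes(protected_bytes, tag_bits=TAG_BITS, granule_bytes=GRANULE_BYTES):
    """Bytes of tag storage needed to cover `protected_bytes` of memory."""
    granules = -(-protected_bytes // granule_bytes)   # ceiling division
    return (granules * tag_bits + 7) // 8             # round bits up to whole bytes


if __name__ == "__main__":
    print(tag_overhead_fraction())                    # 0.03125, i.e. about 3 percent
    print(tag_storage_bytes(8 * 2**30) // 2**20)      # 256 (MiB of tags for 8 GiB)
```

So tagging all of a hypothetical 8 GiB of memory would take 256 MiB of tag storage, which is 3.125 percent of the protected capacity.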
Google partnered with Arm on MTE and has committed to supporting it across the Android stack. Expect to see more programs that help developers integrate it into their workflows.
“We believe that memory tagging will detect the most common classes of memory safety bugs in the wild, helping vendors identify and fix them, discouraging malicious actors from exploiting them,” write Kostya Serebryany and Sudhi Herle of Google.
The first encounter many developers have with MTE could be daunting. It will ferret out far more vulnerabilities than they will want to fix. Some may simply turn it off right there. But we believe most developers will take a triage approach, fixing the serious vulnerabilities before a launch and tracking down the less severe ones during updates. As time goes on, their code will become progressively cleaner. Subsequent full-body scans will pull up fewer flaws, making the process potentially less time-consuming. The frequency of crashes, complaints, and fire drills will likewise decline.
Potentially, an industry-driven response could even help stave off regulation down the road. HIPAA fines and other regulatory compliance measures were instrumental in building momentum for encryption. Even if memory vulnerabilities get to the regulatory phase, could better code cleaning lead to discounts on insurance premiums or other, similar benefits? Absolutely.
And if history is any guide, the fear of the cure often looms much larger than the cure itself. Seatbelts threatened to end the romance and passion of driving. Instead, they saved lives. Eliminating smoking sections threatened to be the death knell of restaurants and pubs and turn the wait staff into security enforcers. Now, any smoker lighting up would be hounded outside by the patrons. (And if you want to go way back, critics also worried about restrictions on child labor, leaded paint, and tainted meat.)
We’re at the same point with memory vulnerabilities. The problem won’t get better on its own. And, who knows, five years from now we might wonder why it took so long.
This article previously appeared on Semiconductor Engineering
Any re-use permitted for informational and non-commercial or personal use only.
Brian Fuller and Jack Melling