Maximizing performance and development efficiency with Armv8-M 

The embedded industry is rapidly evolving, and with it, the need for powerful, secure, and efficient microcontrollers. Enter the Armv8-M architecture, the latest leap forward in the Cortex-M family. Over the course of this three-part blog series, we’ll explore the new features and improvements introduced in Cortex-M23 and Cortex-M33 microcontrollers, designed to address modern challenges in embedded systems.

Here’s what we’ll cover:

Part 1: The new C11 language support and significant MPU (Memory Protection Unit) updates. These enhancements are critical for ensuring data integrity and memory security, key components of any robust embedded system.

Part 2: Next, we’ll shift gears and focus on the performance improvements that Cortex-M23 brings over Cortex-M0/M0+, such as hardware division, long branch instructions, and more efficient MOV operations. These updates give developers a performance edge while maintaining low power consumption.

Part 3: Finally, we’ll wrap up by taking a closer look at security improvements. With embedded devices increasingly connected to networks, TrustZone technology and stack protection have become essential for safeguarding systems from threats. We’ll explore how these features enhance security.

Whether you’re designing for industrial automation, IoT, or safety-critical applications, this series will give you valuable insights into how Armv8-M is shaping the future of embedded systems. Stay tuned as we explore each of these topics in detail!

Blog 2: Maximizing performance and development efficiency with Armv8-M: Cortex-M0/M0+ performance improvements

Pushing performance further with Armv8-M

In this article, we’re exploring the performance improvements offered by the Armv8-M Baseline architecture, particularly where it goes beyond the capabilities of Cortex-M0/M0+. These cores are widely used to replace 8/16-bit microcontrollers or in devices where low cost and low power consumption are key. Now, with the Cortex-M23 (based on Armv8-M Baseline), we see several performance boosts that make it a more powerful alternative.

Key enhancements in Cortex-M23 for performance:
The Cortex-M23 has leveled up with some key instruction upgrades. Let’s break them down:

  1. Hardware division: No more relying on software libraries to handle division.
  2. Compare and branch (CBZ, CBNZ) instructions: Simplified and faster decision-making in your code.
  3. Long Branch (B) instruction: Extended jump range to support larger software projects.
  4. MOV instruction with 16-bit immediate values: More efficient constant handling.

Now, let’s review each of these improvements in more detail.

1. Hardware division

In the earlier Cortex-M0/M0+, there was no hardware divide instruction, which meant developers had to rely on software library routines for division. While functional, this approach wasn’t the fastest. Armv8-M Baseline adds hardware divide instructions (SDIV and UDIV), so division operations are much faster and more efficient, reducing both execution time and code size. In fact, switching from Cortex-M0+ to Cortex-M23 saves 224 bytes in code size.

Fun fact: The hardware divider in Cortex-M23 completes a division in 17 cycles (fast divider) or 34 cycles (slow divider), while a software library routine could take up to 5 times longer!
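
To make this concrete, here is a minimal sketch of a division whose divisor is only known at run time. The function name and the ADC scenario are hypothetical, chosen only for illustration. The instruction notes in the comments describe what a compiler typically generates for each target; the exact output depends on your toolchain and optimization settings.

```c
#include <stdint.h>

/* Hypothetical helper: converts a raw ADC count to millivolts.
 * The divisor is a run-time value, so the compiler cannot replace
 * the division with a multiply-and-shift sequence. */
uint32_t adc_to_millivolts(uint32_t raw, uint32_t vref_mv, uint32_t full_scale)
{
    if (full_scale == 0u) {
        return 0u;  /* guard against division by zero */
    }

    /* Cortex-M0/M0+ (Armv6-M): typically a call into a runtime library
     * division routine such as __aeabi_uidiv.
     * Cortex-M23 (Armv8-M Baseline): typically a single UDIV instruction.
     * Note: raw * vref_mv must fit in 32 bits (fine for a 12-bit ADC). */
    return (raw * vref_mv) / full_scale;
}
```

For example, adc_to_millivolts(2048, 3300, 4096) returns 1650. The C source is identical for both cores; only the generated code changes, which is where the cycle and code size savings come from.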

2. Compare and Branch (CBZ, CBNZ) instructions

How often do you write code that checks if a variable is zero or not? Pretty often, right? In Cortex-M0+, this would typically take two instructions: a compare followed by a conditional branch. But with Cortex-M23, the new CBZ (Compare and Branch if Zero) and CBNZ (Compare and Branch if Not Zero) instructions allow you to handle these cases in one go, speeding things up and making your code more compact.

These instructions are already available on the Cortex-M3/M4/M7 series, meaning you can reuse code easily across those cores.
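
As an illustration, the following sketch shows a typical zero check in C: walking a linked list until the pointer becomes NULL. The node type and function name are hypothetical. The instruction sequences in the comment are what a compiler might plausibly emit, not guaranteed output. Keep in mind that CBZ and CBNZ only branch forward over a short range, so the compiler decides case by case whether they apply.

```c
#include <stddef.h>

/* Hypothetical singly linked list node, used only for illustration. */
struct node {
    struct node *next;
    int value;
};

/* Counts the nodes in a list. The loop condition is a plain
 * compare-against-zero on a pointer. */
size_t list_length(const struct node *head)
{
    size_t count = 0;

    while (head != NULL) {
        /* Cortex-M0+ (typical):  CMP r0, #0
         *                        BEQ done
         * Cortex-M23 (typical):  CBZ r0, done  */
        count++;
        head = head->next;
    }
    return count;
}
```

Calling list_length on a three-node list returns 3, and on NULL it returns 0.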

3. Long Branch (B) instruction

As your software grows, so does the need to branch over long distances in your code. On the Cortex-M0+, the unconditional branch (B) instruction is limited to an offset of -2,048 to +2,046 bytes, but the Cortex-M23 adds a 32-bit encoding that extends this range significantly, to -16,777,216 to +16,777,214 bytes (roughly ±16 MB). This means fewer instructions are needed to achieve those long jumps, saving time and simplifying the code.
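
If you are wondering where those range limits come from, the short sketch below derives them from the number of immediate bits in each encoding. It assumes the Thumb encodings as we understand them: 11 immediate bits in the 16-bit B instruction and 24 in the 32-bit one, each shifted left by one bit because branch targets are halfword aligned. The type and function names are hypothetical.

```c
#include <stdint.h>

/* Inclusive byte range reachable by a signed, halfword-aligned branch. */
typedef struct {
    int32_t min;
    int32_t max;
} branch_range;

/* imm_bits: number of signed immediate bits in the encoding
 * (assumed: 11 for the 16-bit B, 24 for the 32-bit B).
 * The immediate is shifted left by one, so the offset spans
 * -2^imm_bits to 2^imm_bits - 2 bytes. Valid for imm_bits < 31. */
branch_range thumb_branch_range(unsigned imm_bits)
{
    branch_range r;
    int32_t span = (int32_t)1 << imm_bits;

    r.min = -span;
    r.max = span - 2;  /* largest even offset below +span */
    return r;
}
```

With imm_bits set to 11 this gives -2048 to 2046, and with 24 it gives -16777216 to 16777214, matching the figures quoted above.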

4. MOV instruction with 16-bit immediate values

Previously, in Cortex-M0+, the MOV instruction could only handle immediate values up to 255. But what if you wanted to set a value like 301? You’d need multiple instructions to achieve that. Not anymore! The Cortex-M23 can handle 16-bit immediate values, so you can set larger constants in a single instruction, streamlining your code.
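
Here is a small sketch of the 301 case. The constant name and helper functions are hypothetical, and the instruction sequences in the comment are examples of what a compiler could choose; the actual choice depends on the toolchain. On Cortex-M23 the 16-bit immediate form is usually written with the MOVW mnemonic.

```c
#include <stdint.h>

/* Hypothetical constant that does not fit in an 8-bit immediate. */
#define TIMEOUT_TICKS 301u

uint32_t timeout_ticks(void)
{
    /* Cortex-M0+ (one possible sequence):  MOVS r0, #255
     *                                      ADDS r0, #46
     *            (or a literal-pool load:  LDR  r0, =301)
     * Cortex-M23 (typical):                MOVW r0, #301  */
    return TIMEOUT_TICKS;
}

/* Nonzero if value fits the 8-bit immediate available on Cortex-M0+. */
int fits_imm8(uint32_t value)
{
    return value <= 0xFFu;
}

/* Nonzero if value fits the 16-bit immediate available on Cortex-M23. */
int fits_imm16(uint32_t value)
{
    return value <= 0xFFFFu;
}
```

Here fits_imm8(301) is false while fits_imm16(301) is true, which is exactly the gap the new instruction closes.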

How does Cortex-M23 stack up?

Here’s a look at the CoreMark benchmark scores per MHz for Cortex-M0, Cortex-M0+, and Cortex-M23:

[Figure: CoreMark/MHz scores for Cortex-M0, Cortex-M0+, and Cortex-M23]

As you can see, Cortex-M23 offers a performance boost, even if it’s a slight one, over its predecessors. But when you combine that with the reduced code size, increased efficiency, and hardware upgrades, the jump to Cortex-M23 is well worth it.

What’s next?

In our next blog, we’ll tackle security improvements in the Armv8-M architecture, exploring how TrustZone and other features are making embedded systems more secure.

As always, the IAR Embedded Workbench for Arm supports the entire Cortex-M family, including Armv8-M (Cortex-M23, Cortex-M33). With this unified development environment, you can keep pushing the limits of what’s possible with these incredible microcontrollers.

Download the evaluation version of IAR Embedded Workbench for Arm today and see firsthand how these performance enhancements can elevate your embedded projects.

 

Hiroki Akaboshi
Field Application Engineer APAC

 
