Cortex-M0 Plus Microcontrollers Instruction Manual

September 9, 2024
ST

Hello, and welcome to this presentation of the ARM® Cortex®-M0+ core which is embedded in all products of the STM32U0 microcontroller family.

Cortex-M0+ processor overview

  • ARMv6-M architecture
  • Von Neumann architecture, 2-stage pipeline
  • Single-issue architecture
  • Multiply in 1 cycle
  • Memory Protection Unit (MPU)
  • Single-cycle I/O port

  • Ultra-low-power design: low power consumption and high energy efficiency
  • Very compact code: except for control instructions and branch and link, all instructions are 16 bits long

The Cortex®-M0+ core is part of the ARM Cortex-M group of 32-bit RISC cores. It implements the ARMv6-M architecture and features a 2-stage pipeline.
The Cortex®-M0+ has a single AHB-Lite master port, but supports concurrent instruction fetch and data access when the data access targets the Fast I/O Port address range.

Cortex-M processors compatibility

Seamless architecture across all applications

STM32U0 microcontrollers integrate an ARM® Cortex®-M0+ core to benefit from its high performance-per-milliwatt ratio.
All Cortex®-M CPUs have a 32-bit architecture.
The Cortex®-M3 was the first Cortex®-M CPU released by ARM.
Then ARM decided to distinguish two product lines: high performance and low power, while maintaining the compatibility between them.
The Cortex®-M0+ belongs to the low power product line. It is designed for battery-powered devices, which are very sensitive to power consumption.

Core architecture overview

The Cortex®-M0+ core delivers more performance than the Cortex®-M0 core thanks to its 2-stage instruction pipeline.
Let’s start our description of the CPU with the processor core, which is in charge of fetching and executing instructions.

ARM Cortex-M0+ → 2-stage pipeline

Most ARMv6-M instructions are 16 bits long. There are only six 32-bit instructions, and most of them are rarely used control instructions. However, the branch and link instruction, which is used to call a subroutine, is also 32 bits long in order to support a large offset between this instruction and the label pointing to the next instruction to be executed.
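To make the "large offset" concrete, here is a minimal sketch that checks whether a branch displacement fits the 16-bit unconditional branch or needs the 32-bit branch and link. The ranges are assumed from the ARMv6-M Thumb encodings (about ±2 KB for the 16-bit B, about ±16 MB for BL). The function names are hypothetical and are not part of any ST or ARM library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* disp is the byte displacement from the PC value seen by the branch
   (address of the branch + 4). It must be halfword-aligned. */

/* 16-bit unconditional B: signed 11-bit halfword offset. */
bool fits_b16(int32_t disp)
{
    return (disp % 2 == 0) && disp >= -2048 && disp <= 2046;
}

/* 32-bit BL: signed 24-bit halfword offset. */
bool fits_bl32(int32_t disp)
{
    return (disp % 2 == 0) && disp >= -16777216 && disp <= 16777214;
}

/* Size in bytes of the smaller of the two encodings able to span disp.
   The link side effect of BL is ignored in this comparison. */
unsigned branch_encoding_bytes(int32_t disp)
{
    assert(fits_bl32(disp)); /* otherwise out of reach for a direct branch */
    return fits_b16(disp) ? 2u : 4u;
}
```

For example, a call to a function located 100 KB away cannot be expressed with an 11-bit offset, which is why the call instruction needs the longer encoding.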
Ideally, one 32-bit access loads two 16-bit instructions, which results in fewer fetches per instruction.
During clock number 2, no instruction fetch occurs. The AHB-Lite port is therefore available to perform a data access when instruction N is a load/store instruction.
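The fetch saving can be estimated with a small calculation. The sketch below counts how many 32-bit bus accesses are needed to fetch a run of consecutive 16-bit instructions. It is an idealized model that assumes linear code, no branches and no wait states. The function name and the example addresses are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Number of 32-bit accesses needed to fetch `halfwords` consecutive
   16-bit units starting at byte address `addr` (halfword-aligned). */
uint32_t word_fetches(uint32_t addr, uint32_t halfwords)
{
    assert((addr & 1u) == 0u);
    if (halfwords == 0u) {
        return 0u;
    }
    uint32_t first_word = addr / 4u;
    uint32_t last_word  = (addr + 2u * halfwords - 1u) / 4u;
    return last_word - first_word + 1u;
}
```

With a word-aligned start, four 16-bit instructions need two accesses. If the run starts on the upper halfword of a word, the same four instructions need three accesses, which is why the text says "ideally".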

Branch performance

Cortex®-M0+ core
  • Maximum of two 16-bit branch shadow instructions

On a given branch, fewer pre-fetched instructions are wasted, thanks to the 2-stage pipeline.
In clock number 1, the processor fetches Inst0 and an unconditional branch instruction.
In clock number 2, it executes Inst0.
In clock number 3, it executes the branch instruction while fetching the next two sequential instructions, Inst1 and Inst2, called branch shadow instructions.
In clock number 4, the processor discards Inst1 and Inst2 and fetches InstN and InstN+1.
The Cortex®-M0, M3 and M4 implement a 3-stage pipeline: Fetch, Decode and Execute. Their number of branch shadow instructions is larger: up to four 16-bit instructions.
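As a rough illustration of what these limits mean over many branches, the sketch below computes an upper bound on the number of 16-bit instructions fetched and then discarded. The per-branch maxima (two and four) are the figures quoted above. Everything else, including the type and function names, is a hypothetical model and not a cycle-accurate description of the cores.

```c
#include <assert.h>
#include <stdint.h>

/* Pipeline depths named in the text. */
enum pipeline { PIPELINE_2_STAGE = 2, PIPELINE_3_STAGE = 3 };

/* Upper bound on 16-bit branch shadow instructions per taken branch:
   two for the 2-stage Cortex-M0+, four for the 3-stage cores. */
uint32_t max_branch_shadow(enum pipeline p)
{
    assert(p == PIPELINE_2_STAGE || p == PIPELINE_3_STAGE);
    return (p == PIPELINE_2_STAGE) ? 2u : 4u;
}

/* Worst-case count of fetched-then-discarded 16-bit instructions
   after `taken_branches` taken branches. */
uint32_t worst_case_discarded(enum pipeline p, uint32_t taken_branches)
{
    return max_branch_shadow(p) * taken_branches;
}
```

Under this bound, a loop that takes 1000 branches discards at most 2000 instructions on a 2-stage pipeline and at most 4000 on a 3-stage pipeline.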

Core architecture overview

The Cortex®-M0+ has neither an embedded cache nor internal RAM. Consequently, any instruction fetch transaction is steered to the AHB-Lite interface and any data access is steered either to the AHB-Lite interface or the Single-cycle I/O port.
Note that the STM32U0 implements a SoC-level instruction cache, external to the CPU, located in the embedded flash controller.

The AHB-Lite master port is connected to a bus matrix, enabling the CPU to access memories and peripherals. Since transactions are pipelined on AHB-Lite, the best throughput is 32 bits of data or instructions per clock, with a minimum 2-clock latency.
The Cortex®-M0+ also features a Single-cycle I/O Port, enabling the CPU to access data with a 1-clock latency. External decoding logic determines the address range in which data accesses are steered to this port.
In the STM32U0, the Single-cycle I/O Port is not used to access GPIO port registers. GPIO ports are mapped to AHB instead, which allows them to be accessed by the DMA.
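The steering rule described above can be written as a small decision function. This is only a software sketch of what the external decoding logic does in hardware. The address window is a hypothetical parameter chosen by the chip integrator, and the type and function names are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum bus_port { PORT_AHB_LITE, PORT_SINGLE_CYCLE_IO };

/* HYPOTHETICAL address window decoded for the Single-cycle I/O Port. */
struct ioport_window {
    uint32_t base; /* first byte address of the window */
    uint32_t size; /* window length in bytes           */
};

enum bus_port steer_access(const struct ioport_window *w,
                           uint32_t addr, bool is_instruction_fetch)
{
    assert(w != NULL && w->size != 0u);

    /* Instruction fetches always use the AHB-Lite interface. */
    if (is_instruction_fetch) {
        return PORT_AHB_LITE;
    }
    /* Data accesses use the I/O port only inside the decoded window.
       The unsigned subtraction wraps for addr < base, so it fails the test. */
    if (addr - w->base < w->size) {
        return PORT_SINGLE_CYCLE_IO;
    }
    return PORT_AHB_LITE;
}
```

With a window of 4 KB at an arbitrary base such as 0xA0000000, a data access to 0xA0000010 would be steered to the Single-cycle I/O Port, while an instruction fetch from the same address, or a data access to any address outside the window, would go to AHB-Lite.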

Memory protection unit

  • MPU attribute settings define access permissions
  • 8 independent memory regions
    • Can execute code?
    • Can write data?
    • Unprivileged mode access?

The MPU in STM32U0 microcontrollers supports eight independent memory regions, each with independently configurable attributes for:

  • access permission: read/write allowed or not in privileged/unprivileged mode,
  • execution permission: executable region or region in which instruction fetch is prohibited.
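The two attributes listed above map onto bit fields of the MPU region attribute and size register. Here is a minimal sketch, assuming the ARMv6-M MPU_RASR layout (XN in bit 28, AP in bits 26:24, SIZE in bits 5:1, ENABLE in bit 0). The enum and function names are hypothetical, and the memory-type bits (S, C, B) and the subregion-disable field are left at 0 for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Access-permission (AP) encodings assumed from the ARMv6-M MPU. */
enum mpu_ap {
    MPU_AP_NO_ACCESS    = 0, /* no access in any mode                  */
    MPU_AP_PRIV_RW      = 1, /* privileged read/write only             */
    MPU_AP_PRIV_RW_U_RO = 2, /* privileged read/write, unprivileged RO */
    MPU_AP_FULL_ACCESS  = 3, /* read/write in both modes               */
    MPU_AP_PRIV_RO      = 5, /* privileged read only                   */
    MPU_AP_RO           = 6  /* read only in both modes                */
};

/* Build an MPU_RASR value for an enabled region of 2^size_log2 bytes.
   Regions range from 256 bytes (2^8) to 4 GB (2^32). */
uint32_t mpu_rasr(unsigned size_log2, enum mpu_ap ap, bool executable)
{
    assert(size_log2 >= 8u && size_log2 <= 32u);

    uint32_t rasr = 0u;
    rasr |= executable ? 0u : (1u << 28);    /* XN: execute never      */
    rasr |= (uint32_t)ap << 24;              /* AP[26:24]              */
    rasr |= (uint32_t)(size_log2 - 1u) << 1; /* SIZE: 2^(SIZE+1) bytes */
    rasr |= 1u;                              /* ENABLE                 */
    return rasr;
}
```

For a 1 KB data region that is readable and writable in both modes but not executable, mpu_rasr(10, MPU_AP_FULL_ACCESS, false) yields 0x13000013. On the target, such a value would typically be written to the region attribute register after selecting the region number and base address. Refer to the programming manual for the exact register names and the memory-type bits.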

References

For more details, please refer to these application notes and the Cortex®-M0+ programming manual available on the www.st.com website.
Also visit the ARM website where you will find more information about the Cortex®-M0+ core.

Thank you
© STMicroelectronics – All rights reserved.
ST logo is a trademark or a registered trademark of STMicroelectronics International NV or its affiliates in the EU and/or other countries.
For additional information about ST trademarks, please refer to www.st.com/trademarks
All other product or service names are the property of their respective owners.
