Is Your Thing in Danger?
by Ralph Moore
March 2024
Most IoT Things are embedded systems to which networking has recently been added. As such, hackers coming in via the Hacker’s Highway (aka the Internet) can overcome the weak defenses of such systems and gain access to critical information such as encryption keys. As a consequence, entire networks become compromised all the way into the Cloud.
The Cortex-M architecture accounts for a large proportion of microcontroller units (MCUs) in use today. Cortex-M has powerful processor security features, and most Cortex-M MCUs have Memory Protection Units (MPUs). Yet, these features are used only sparingly, if at all, in most embedded systems, despite the pressing need for better security. Why is this? It seems that the embedded system industry has made a collective judgement that the Cortex-M security features are either too hard to use or not effective and furthermore that they waste too much memory and processor time.
However, we have found that through careful, innovative design techniques, embedded system software can be divided into isolated partitions that provide strong security against hacker invasions. Furthermore, this can be done with only moderate memory and performance losses on the order of 10% — well worth the security gained. However, new tools and methodologies are necessary if reasonable development schedules are to be met, because there are many difficult obstacles to be overcome. It is for this reason that we have developed SecureSMX®, our next generation RTOS with security features built in.
Introduction
The following figure shows the security structure of typical microcontroller embedded software:
There is no security structure! A hacker who has gained access to this system has access to anything he wants, including keys and other secrets. This undermines the security afforded by encryption, authentication, and other security methods employed in modern systems.
In contrast, the above figure shows the ideal partitioning of the same microcontroller embedded software. Each partition is isolated from all of the other partitions. If a hacker gains access to one partition, he cannot access the other partitions. What goes on in a partition stays in a partition. The pmode barrier separates unprivileged mode (umode) from privileged mode (pmode). This barrier is hardware-enforced by the processor. Note that the “Vault,” which contains keys and other valuables, is below the pmode barrier and that networking and other I/O partitions are above the pmode barrier. It is impossible for a hacker to break through the pmode barrier from one of these partitions to access the Vault.
It should be noted that the complete system partitioning shown above is not likely to be necessary for most legacy systems. Often, much less partitioning will achieve the security goals. For new systems, full partitioning may or may not be required, depending upon perceived threats. Ordinarily, partitions that interface to the outside world (shown in blue) are the most vulnerable to hacking. Application partitions (shown in green) are less vulnerable, but it is desirable to keep them safe since they do the important work. Below the pmode barrier are the most trusted partitions (shown in purple). The exception is ISRs, which are there because they must run in pmode. Most ISR processing can be deferred to LSRs, which are safer. Code minimization and ruggedized code can make ISRs resistant to hacking.
Steps to Achieve Fully Isolated Partitions
- Effective pmode/umode processor control.
- Efficient, flexible task-based Memory Protection Unit (MPU) control.
- Software Interrupt (SWI) API for system services (e.g. SemSignal()).
- Multi-heap support.
- Partition portals.
These are discussed below.
Processor Control
SecureSMX supports the Armv7-M and Armv8-M architectures used by Cortex-M processors. These permit tasks to run in either pmode or umode. When a task is created, its mode is specified. When it is dispatched, the task scheduler switches the processor to umode for utasks and leaves the processor in pmode for ptasks. ptasks can directly access all system services, whereas utasks must use the software interrupt API, discussed below, to access system services. Hence, ptasks must be trusted. When a task is suspended or stops, control reverts to SecureSMX in pmode.
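The dispatch rule described above can be sketched as a small host-side simulation. This is only an illustration, not SecureSMX code: the names demo_task_t, task_create(), task_dispatch(), and task_suspend() are hypothetical, and sim_control is an ordinary variable standing in for the Cortex-M CONTROL register, whose bit 0 (nPRIV) selects unprivileged thread mode when set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CONTROL bit 0 (nPRIV): 0 = privileged, 1 = unprivileged thread mode. */
#define CONTROL_NPRIV 0x1u

typedef enum { PTASK, UTASK } task_mode_t;   /* hypothetical names */

typedef struct {
    const char *name;
    task_mode_t mode;          /* fixed when the task is created */
} demo_task_t;

static uint32_t sim_control;   /* simulated CONTROL register; 0 = pmode */

/* Create a task; its mode is specified here and never changes. */
demo_task_t task_create(const char *name, task_mode_t mode)
{
    demo_task_t t;
    assert(name != NULL);
    t.name = name;
    t.mode = mode;
    return t;
}

/* Dispatch: switch to umode for a utask, stay in pmode for a ptask.
   Returns the simulated CONTROL value the task would run with. */
uint32_t task_dispatch(demo_task_t t)
{
    if (t.mode == UTASK)
        sim_control |= CONTROL_NPRIV;
    else
        sim_control &= ~CONTROL_NPRIV;
    return sim_control;
}

/* Task suspended or stopped: the kernel regains control in pmode. */
uint32_t task_suspend(void)
{
    sim_control &= ~CONTROL_NPRIV;
    return sim_control;
}
```

On real hardware the return to pmode happens through an exception rather than a plain function call; the sketch only captures which mode each kind of task ends up running in.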
Normally, embedded applications run entirely in pmode when they are first ported to SecureSMX, as this requires only minor modifications to be made to them. Then vulnerable and untrusted code is moved into isolated umode partitions above the pmode barrier. This greatly improves protection for the rest of the system from hacking and malware.
Memory Protection Unit
The above illustrates how a task accesses memory via slots in an MPU. Each slot permits access to a contiguous region of memory having attributes such as ReadWrite, ReadOnly, ExecuteNever, etc. If a task attempts an access outside of or not permitted by its regions, a Memory Manage Fault (MMF) occurs, the task is stopped, and pmode recovery software takes over. This severely restricts a hacker with regard to the common hacking tricks that he might employ – it is like tip-toeing in a minefield!
SecureSMX supports secure boot and system initialization in pmode, then switches to task mode and dispatches tasks running in pmode (ptasks) and tasks running in umode (utasks). Each pmode partition has one or more ptasks; each umode partition has one or more utasks. Every task has its own Memory Protection Array (MPA), which is loaded from an MPA template when the task is created. Typically, tasks within a partition share the same template. When a task switch occurs, the incoming task's MPA is loaded into the MPU. The task is then dispatched to run in umode or pmode.
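A host-side sketch may help make the template, MPA, and MPU relationship concrete. Everything here is an assumption for illustration: the slot count, the addresses, and the names region_t, mpa_t, net_template, task_create(), task_switch(), and mpu_access_ok() are hypothetical, and sim_mpu is a plain struct standing in for the hardware MPU. A false result from mpu_access_ok() marks the point where real hardware would raise an MMF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MPU_SLOTS 8                       /* assumed slot count */
enum { PERM_R = 1, PERM_W = 2, PERM_X = 4 };

typedef struct {
    uint32_t base;    /* first byte of the region */
    uint32_t size;    /* length in bytes; 0 marks an unused slot */
    uint32_t perms;   /* PERM_* flags; no PERM_X means ExecuteNever */
} region_t;

typedef struct { region_t slot[MPU_SLOTS]; } mpa_t;

typedef struct {
    const char *name;
    mpa_t mpa;        /* per-task copy, filled from a template at creation */
} demo_task_t;

static mpa_t sim_mpu; /* stand-in for the hardware MPU */

/* Partition template (made-up addresses) shared by tasks in one partition. */
const mpa_t net_template = {{
    { 0x20001000u, 0x0400u, PERM_R | PERM_W },  /* data: ReadWrite, ExecuteNever */
    { 0x08004000u, 0x2000u, PERM_R | PERM_X },  /* code: ReadOnly, executable */
}};

/* Create a task and load its MPA from the partition template. */
demo_task_t task_create(const char *name, const mpa_t *tmpl)
{
    demo_task_t t;
    assert(name != NULL && tmpl != NULL);
    t.name = name;
    t.mpa = *tmpl;
    return t;
}

/* Task switch: copy the incoming task's MPA into the simulated MPU. */
const mpa_t *task_switch(const demo_task_t *t)
{
    sim_mpu = t->mpa;
    return &sim_mpu;
}

/* True if [addr, addr + len) lies inside one loaded region granting perms. */
bool mpu_access_ok(const mpa_t *mpu, uint32_t addr, uint32_t len, uint32_t perms)
{
    for (size_t i = 0; i < MPU_SLOTS; i++) {
        const region_t *r = &mpu->slot[i];
        if (r->size == 0u || len > r->size || addr < r->base)
            continue;
        if (addr - r->base <= r->size - len && (r->perms & perms) == perms)
            return true;
    }
    return false;
}
```

The sketch ignores details of a real MPU such as region alignment rules and the priority of overlapping regions; it only shows that a task can touch nothing beyond the regions its own MPA grants.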
Software Interrupt API
ptasks can directly call system services in pmode. However, utasks require an SWI API for system services such as waiting at or signaling a semaphore. The SWI API is implemented using the Cortex-M svc n instruction; it causes an SVC exception that results in switching to pmode and executing the desired system service. The parameter n selects the system service to perform. The svc instruction is the only way that a utask can penetrate the pmode barrier and then only to run a permitted system service. When the system service completes, the utask is resumed in umode with the return value and data, if any, from the service.
Not only system services but also the structures they use (e.g. task control blocks) reside in pmode and thus are not accessible to a hacker from umode. In addition, services that could cause system damage are not permitted from utasks. Attempted use of such a service results in the utask being stopped and pmode recovery software taking control, thus limiting the hacker even more.
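The service selection and filtering just described can be pictured as a table lookup performed in pmode. The following is a simulation under stated assumptions, not the SecureSMX SWI API: the service numbers, the table layout, and the names svc_dispatch(), sem_signal(), sem_peek(), and sys_reset() are all hypothetical. On real hardware the handler would recover n from the immediate field of the svc instruction; here n is simply a function argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { SVC_OK, SVC_TASK_STOPPED } svc_status_t;

typedef struct {
    svc_status_t status;
    int32_t value;          /* service return value, valid when status == SVC_OK */
} svc_result_t;

typedef int32_t (*service_fn)(int32_t arg);

typedef struct {
    service_fn fn;
    bool utask_allowed;     /* false: the service could damage the system */
} svc_entry_t;

static int32_t sem_count;   /* kernel-side state, never mapped into umode */

static int32_t sem_signal(int32_t unused) { (void)unused; return ++sem_count; }
static int32_t sem_peek(int32_t unused)   { (void)unused; return sem_count; }
static int32_t sys_reset(int32_t unused)  { (void)unused; sem_count = 0; return 0; }

enum { SVC_SEM_SIGNAL, SVC_SEM_PEEK, SVC_SYS_RESET, SVC_COUNT };

static const svc_entry_t svc_table[SVC_COUNT] = {
    [SVC_SEM_SIGNAL] = { sem_signal, true  },
    [SVC_SEM_PEEK]   = { sem_peek,   true  },
    [SVC_SYS_RESET]  = { sys_reset,  false },   /* ptasks only, called directly */
};

/* What the SVC handler would do in pmode after a utask executes "svc n".
   An unknown or forbidden service stops the calling utask. */
svc_result_t svc_dispatch(uint32_t n, int32_t arg)
{
    svc_result_t r = { SVC_TASK_STOPPED, 0 };
    if (n >= (uint32_t)SVC_COUNT || !svc_table[n].utask_allowed)
        return r;
    assert(svc_table[n].fn != NULL);
    r.status = SVC_OK;
    r.value = svc_table[n].fn(arg);
    return r;
}
```

Because the table and sem_count live on the pmode side, a utask in this model can influence them only through the numbered entries it is allowed to call.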
umode offers the ultimate in isolation and protection. However, pmode offers some security and is a good place for proven, trusted, mission-critical and system software.
Multi-Heap Support
Multi-heap support is necessary because using a common heap between partitions destroys their isolation from each other. A hacker could wreck a common heap and bring all partitions using it down or cause them to do damage. SecureSMX uses eheap, a heap similar to dlmalloc, but designed for embedded systems. eheap has numerous attractive features beyond the scope of this paper. Of importance here is that it provides simple, multi-heap support and that each heap can be customized for the partition it is in — i.e. it can be very minimal or it can be very extensive. This minimizes legacy software rewriting and supports object-oriented designs (e.g. C++).
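To illustrate why one heap per partition preserves isolation, here is a deliberately simple sketch. It is not eheap: it uses a trivial bump allocator with no free operation, and the names heap_t, net_heap, app_heap, heap_alloc(), and heap_owns() as well as the heap sizes are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t *base;   /* start of this partition's heap memory */
    size_t   size;   /* total bytes available */
    size_t   used;   /* bytes handed out so far */
} heap_t;

/* Separate backing stores; uint64_t arrays give 8-byte alignment. */
static uint64_t net_mem[32];    /* 256-byte heap for a network partition */
static uint64_t app_mem[128];   /* 1024-byte heap for an application partition */

heap_t net_heap = { (uint8_t *)net_mem, sizeof net_mem, 0 };
heap_t app_heap = { (uint8_t *)app_mem, sizeof app_mem, 0 };

/* Allocate n bytes (rounded up to 8) from one specific heap, or NULL. */
void *heap_alloc(heap_t *h, size_t n)
{
    size_t need = (n + 7u) & ~(size_t)7u;
    assert(h != NULL);
    if (need < n || need > h->size - h->used)   /* overflow or exhausted */
        return NULL;
    void *p = h->base + h->used;
    h->used += need;
    return p;
}

/* True if p points into the memory owned by heap h. */
bool heap_owns(const heap_t *h, const void *p)
{
    uintptr_t a = (uintptr_t)p, lo = (uintptr_t)h->base;
    return a >= lo && a - lo < h->size;
}
```

Exhausting net_heap leaves app_heap untouched, and because each heap occupies its own memory, each can sit in its own MPU region so that one partition cannot even address another partition's heap.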
Partition Portals
Libraries, servers, and other code typically employ function-call APIs. This means that to use a function, it must be callable by the client and thus must be accessible by the client — i.e. it must be in a region shared between the client and the server. Typically subroutines and global data also must be shared. Thus isolation between the client and the server is seriously reduced. Since servers often are vulnerable (e.g. network servers) this is not acceptable.
Portals provide a solution to this problem. SecureSMX offers two types of portals: tunnel portals for high-speed multi-block data transfers and free message portals for commands, callbacks, and low-speed data transfers.
A header file is included in client software modules to map server function calls in those modules to shell functions. Shell functions have slightly different names (e.g. sfsp_fread() instead of sfs_fread()) and are contained in a separate module. The shell function for each server function creates a protected message, pmsg, and sends it to the pmsg exchange for the portal. pmsgs are queued at the pmsg exchange in priority order and when a pmsg is accepted by a server, the server adopts the priority of the pmsg.
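The priority-ordered queueing and priority adoption described above might look roughly like the following host-side model. The queue depth, the convention that a larger number means higher priority, and the names exchange_t, server_t, exchange_send(), and exchange_receive() are assumptions for this sketch; a real pmsg carries far more than the two fields shown.

```c
#include <assert.h>
#include <stddef.h>

#define EXCHANGE_DEPTH 8     /* assumed queue depth */

typedef struct {
    int priority;            /* larger number = more urgent (assumption) */
    int fn_id;               /* which server function the client wants */
} pmsg_t;

typedef struct {
    pmsg_t q[EXCHANGE_DEPTH];
    size_t count;
} exchange_t;

typedef struct { int priority; } server_t;

/* Queue a pmsg in descending priority order, FIFO among equal priorities.
   Returns 0 on success, -1 if the exchange is full. */
int exchange_send(exchange_t *x, pmsg_t m)
{
    assert(x != NULL);
    if (x->count == EXCHANGE_DEPTH)
        return -1;
    size_t i = x->count;
    while (i > 0 && x->q[i - 1].priority < m.priority) {
        x->q[i] = x->q[i - 1];   /* shift lower-priority pmsgs back */
        i--;
    }
    x->q[i] = m;
    x->count++;
    return 0;
}

/* Server accepts the head pmsg and adopts its priority.
   Returns the pmsg's fn_id, or -1 if the exchange is empty. */
int exchange_receive(exchange_t *x, server_t *s)
{
    assert(x != NULL && s != NULL);
    if (x->count == 0)
        return -1;
    pmsg_t m = x->q[0];
    for (size_t i = 1; i < x->count; i++)
        x->q[i - 1] = x->q[i];
    x->count--;
    s->priority = m.priority;
    return m.fn_id;
}
```

Adopting the pmsg priority means the server works on behalf of an urgent client at that client's urgency, rather than at a fixed priority of its own.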
Protected messages are so named because they are mini MPU regions, which carry their own region information (e.g. rbar and rasr) with them. When received by a server, the pmsg region is plugged into an MPU slot and voila! the pmsg becomes accessible to the server.
After accepting a pmsg from its portal exchange, a switch function in the server converts the pmsg to the corresponding function call and makes the call. The function return value and data, if any, are passed back to the client software via the pmsg and the shell function. Thus client software sees no difference between direct calls and portal calls, except for small increases in execution times. Yet client and server are fully isolated from each other. Other than sending faulty messages or returning values and data, there is nothing a hacker can do to break down the barrier between client and server. His bag of tricks is empty!
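The shell-function and switch-function round trip can be sketched in a single file. This is a simplified model, not the SecureSMX portal API: srv_read() and srvp_read() are hypothetical names that merely follow the sfs_/sfsp_ naming pattern, their signature is invented, the pmsg layout is made up, and a direct call to server_switch() stands in for sending the pmsg to the exchange and waiting for the reply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PMSG_DATA_MAX 32

enum { FN_READ = 1 };               /* hypothetical function ID */

typedef struct {
    int     fn_id;                  /* in: which server function to call */
    size_t  len;                    /* in: bytes requested */
    int32_t ret;                    /* out: function return value */
    uint8_t data[PMSG_DATA_MAX];    /* out: returned data */
} pmsg_t;

/* ---- server side: private to the server partition ---- */
static const char srv_file[] = "hello portal";

/* The "real" server function: copy up to len bytes into buf. */
static int32_t srv_read(void *buf, size_t len)
{
    size_t avail = sizeof srv_file - 1;
    size_t n = len < avail ? len : avail;
    memcpy(buf, srv_file, n);
    return (int32_t)n;
}

/* Switch function: turn a received pmsg back into a function call. */
void server_switch(pmsg_t *m)
{
    assert(m != NULL);
    if (m->fn_id == FN_READ && m->len <= PMSG_DATA_MAX)
        m->ret = srv_read(m->data, m->len);
    else
        m->ret = -1;                /* unknown function or faulty message */
}

/* ---- client side ---- */
/* Shell function: same signature as srv_read(), but it reaches the
   server only through a pmsg and validates what comes back. */
int32_t srvp_read(void *buf, size_t len)
{
    pmsg_t m = { .fn_id = FN_READ, .len = len };
    server_switch(&m);              /* stands in for send + wait for reply */
    if (m.ret < 0 || (size_t)m.ret > len || (size_t)m.ret > PMSG_DATA_MAX)
        return -1;
    memcpy(buf, m.data, (size_t)m.ret);
    return m.ret;
}

/* In separate client modules, a portal header would do this mapping. */
#define srv_read srvp_read
```

After the mapping, client code that calls srv_read(buf, 5) actually runs the shell function, yet it is written exactly as a direct call would be. Both sides check the message they receive, since faulty messages and bad return values are the only attack surface left.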
Ralph Moore is President of Micro Digital. A graduate of Caltech, he has served many roles at Micro Digital since founding it in 1975. Currently he is lead architect and programmer for SecureSMX.
Copyright © 2021-2024 Micro Digital, Inc. All rights reserved.
smx is a registered trademark of Micro Digital, Inc. smx product names are trademarks of Micro Digital, Inc.