Six keywords appear in almost every embedded project, are cited routinely in code reviews, and are understood correctly by very few new developers. This is not a report from an academic survey; it is a field report from teams building systems where getting these keywords wrong costs weeks or months.
A particular kind of bug keeps haunting embedded systems: the code appears correct, the logic looks sound, the unit tests pass, and the system still fails on hardware. You revisit the algorithm. You verify the peripherals. You add printfs for debugging, which change the timing enough for the fault to disappear. Then you remove them, and the problem returns.
In a significant proportion of such cases, the root cause can be traced to misunderstood keywords. It is not about a missing keyword — it is about a keyword that is present, used with confidence, but doing something entirely different from what the developer believed.
Engineers behind InnoLixir have encountered such faults in flight simulators, avionics systems, and bare-metal embedded devices. These failures are not exotic. They are systematic, just as the misunderstandings behind them are. The language standard teaches you what these keywords mean syntactically. It does not teach you what happens under a compiler with aggressive optimisations enabled, on a processor with caches and speculative execution, interacting with a hardware peripheral that changes state independently of the CPU.
This article covers six keywords. Each is widely used. Each is widely misunderstood. And each misconception can cause a specific, technically precise failure mode that we have either encountered ourselves or debugged in someone's production system.
Ask any embedded developer what volatile does and many will tell you: it prevents the compiler from optimising away reads and writes to a variable. That is correct. That is also dangerously incomplete, in two distinct directions.
It does not guarantee atomicity. A volatile read or write of a 32-bit variable on an 8-bit or 16-bit processor, or of a misaligned 32-bit variable on a 32-bit processor, may require multiple machine instructions. An interrupt occurring between those instructions produces a torn read: you get half of the old value and half of the new one. The variable is volatile. The read is not atomic.
/* Unsafe wherever a 32-bit access takes more than one instruction */
volatile uint32_t sensor_reading;

void ISR_Handler(void) {
    sensor_reading = ADC->DR;   /* assumed atomic */
}

uint32_t val = sensor_reading;  /* assumed safe */

/* Safer: read inside a critical section */
volatile uint32_t sensor_reading;

uint32_t read_sensor_safe(void) {
    uint32_t val;
    __disable_irq();
    val = sensor_reading;
    __enable_irq();             /* assumes interrupts were enabled on entry */
    return val;
}

volatile does not guarantee memory ordering either. The standard itself specifies that volatile accesses are not reordered relative to each other. It says nothing about reordering relative to non-volatile accesses. On processors with weak memory ordering (ARM Cortex-A series, PowerPC, many RISC-V implementations) the hardware itself may reorder memory operations. A volatile write to a data register followed by a volatile write to a status register may arrive at the hardware in the opposite order. To prevent that, you need a memory barrier.
/* On a weakly-ordered architecture, this is NOT safe */
volatile uint32_t *status_reg = (volatile uint32_t *)0x40001000;
volatile uint32_t *data_reg   = (volatile uint32_t *)0x40001004;

*data_reg   = payload;    /* hardware may see this second */
*status_reg = SEND_CMD;   /* or this one first: weak ordering */

/* Correct: insert a data memory barrier */
*data_reg   = payload;
__DMB();                  /* Data Memory Barrier (ARM) */
*status_reg = SEND_CMD;
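For data shared through ordinary RAM (for example between an ISR and a task, or between two cores) rather than through a device register, C11 atomics offer a portable way to get both atomic access and ordering. The following is a minimal sketch, assuming a C11 toolchain that provides <stdatomic.h>, a single producer and a single consumer; the names mailbox_publish, mailbox_try_consume, payload_slot and ready_flag are hypothetical. It is not a substitute for __DMB() on memory-mapped peripherals.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static uint32_t payload_slot;     /* plain data, published through the flag */
static atomic_bool ready_flag;    /* static storage, so it starts out false */

/* Producer: write the data first, then set the flag with release ordering,
   so the payload store cannot be reordered after the flag store. */
void mailbox_publish(uint32_t value)
{
    payload_slot = value;
    atomic_store_explicit(&ready_flag, true, memory_order_release);
}

/* Consumer: the acquire exchange pairs with the release store above, so a
   consumer that sees the flag set also sees the payload written before it. */
bool mailbox_try_consume(uint32_t *out)
{
    if (!atomic_exchange_explicit(&ready_flag, false, memory_order_acquire))
        return false;             /* nothing published yet */
    *out = payload_slot;
    return true;
}
```

The producer must not publish again until the previous value has been consumed. On small cores it is also worth checking ATOMIC_BOOL_LOCK_FREE before using an atomic from an ISR, since a lock-based fallback is not interrupt-safe.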
"volatile is a promise with the compiler, not with the hardware. On a modern processor, the hardware makes no promises at all unless you explicitly demand them with a barrier."
For atomicity and synchronisation, use _Atomic or mutexes.

The widespread belief is that const makes a variable read-only. This is true from the compiler's perspective. It is irrelevant from the hardware's perspective, and from the perspective of any other piece of code holding a non-const pointer to the same address.
These four declarations mean different things. Most developers treat them as equivalent until a bug teaches them otherwise.
const uint32_t *p;          /* pointer to const uint32_t: p can move, *p cannot be written via p */
uint32_t * const p;         /* const pointer to uint32_t: p cannot move, *p can be written */
const uint32_t * const p;   /* const pointer to const uint32_t: neither can change */
uint32_t *p;                /* plain pointer: both can change */
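The declarations above can be exercised in a small host-side sketch. It also shows that const restricts only the access path it qualifies: the object can still change through another, non-const pointer. The names const_pointer_demo, reg_a and reg_b are hypothetical and stand in for ordinary variables, not real registers.

```c
#include <stdint.h>

static uint32_t reg_a = 1u;
static uint32_t reg_b = 2u;

uint32_t const_pointer_demo(void)
{
    const uint32_t *p1 = &reg_a;        /* pointer to const */
    p1 = &reg_b;                        /* OK: the pointer itself may move */
    /* *p1 = 0u;                           error: no writes through p1 */

    uint32_t *const p2 = &reg_a;        /* const pointer */
    *p2 = 10u;                          /* OK: the pointee may be written */
    /* p2 = &reg_b;                        error: p2 cannot be reseated */

    const uint32_t *const p3 = &reg_a;  /* neither may change via p3 */

    /* reg_a was modified through p2, and p3 observes the new value. */
    return *p1 + *p3;                   /* 2 + 10 */
}
```

Uncommenting either marked line should make the compiler reject the translation unit, which is a quick way to confirm which half of each declaration const applies to.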
A hardware register that is read-only from software's perspective — a status register, an input register — should be declared volatile const. The volatile ensures the compiler actually reads it from hardware on every access. The const prevents accidental writes in software. Both qualifiers serve different purposes and both are necessary.
/* Without volatile */
const uint32_t *STATUS = (const uint32_t *)0x40020010;

while (*STATUS & BUSY_BIT) {
    /* compiler may hoist this read out of the loop entirely */
}

/* With volatile const */
volatile const uint32_t *STATUS = (volatile const uint32_t *)0x40020010;

while (*STATUS & BUSY_BIT) {
    /* every iteration reads from hardware */
}

Declaring a large lookup table as const does not guarantee that the linker places it in flash. It only tells the compiler that the data is read-only. Where that data lives is determined by your linker script. On targets with limited RAM, failing to verify (for example in the map file) that your const tables are actually in ROM, not copied to RAM at startup, is a common source of memory exhaustion that often surfaces only under full system load.
static is the single most overloaded keyword in C/C++. It has four distinct meanings depending on where it appears. Applying the wrong mental model of which meaning is in play can produce some of the hardest-to-reproduce bugs in production.
/* Meaning 1: static local variable
— persists across calls, lives in .data/.bss not stack
*/
void update_filter(int sample) {
static int32_t accumulator = 0; /* survives between calls */
accumulator += sample;
}
/* Meaning 2: static global variable
— internal linkage, invisible outside this translation unit
*/
static uint32_t module_state = 0; /* cannot be referenced from other .c/.cpp files */
/* Meaning 3: static function, internal linkage */
static void helper(void)   /* not callable from other translation units */
{
    /* ... */
}
/* Meaning 4: static in an array parameter declarator (C99, C only)
   entirely different, rarely used
*/
void process(int arr[static 4])   /* caller promises arr points to at least 4 elements */
{
    /* ... */
}

Static local variables make a function non-reentrant. In an RTOS with preemptive scheduling, or in any function called from both task context and ISR context, a static local is shared state without protection. The failure mode is a classic race condition: two concurrent callers modify the same static variable, producing corrupted intermediate state that neither caller observes directly but that causes a downstream fault several execution steps later.
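The sharing is visible even without concurrency. The following host-side sketch, using the hypothetical helpers format_value_static and static_results_alias, shows that two successive results point at the same storage, so the first result is silently overwritten by the second call.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical static-buffer formatter, for illustration only. */
static const char *format_value_static(int v)
{
    static char buf[16];                  /* one buffer for every caller */
    snprintf(buf, sizeof buf, "%d", v);
    return buf;
}

/* Returns 1 if two successive results share storage, meaning the text of
   the first result has been replaced by the text of the second. */
int static_results_alias(void)
{
    const char *first  = format_value_static(1);
    const char *second = format_value_static(2);
    return first == second && strcmp(first, "2") == 0;
}
```

With a preemptive scheduler or an ISR in the picture, the same overwrite can happen in the middle of snprintf rather than between two calls.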
/* Non-reentrant: shared static buffer */
char *format_value(int v) {
    static char buf[16];     /* shared across all callers */
    sprintf(buf, "%d", v);
    return buf;              /* caller may see corrupted data */
}

/* Reentrant: caller supplies the buffer */
char *format_value(int v, char *buf, size_t len) {
    snprintf(buf, len, "%d", v);
    return buf;              /* safe: no shared state */
}

Static variables with non-zero initialisers are placed in the .data section and copied from flash to RAM by the startup code (crt0 or equivalent) before main() runs. Zero-initialised statics live in .bss and are cleared by the same code. If your startup code is incomplete or missing, or if code that depends on those variables runs before the copy completes, your statics contain garbage. This is not theoretical: it appears in systems where someone inserts an early hardware initialisation call before the C runtime has been fully set up.
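What the startup code does for .data and .bss can be sketched in a few lines. This is a simplified illustration, not any vendor's actual crt0: the function names are hypothetical, and the linker symbols mentioned in the comment depend entirely on your linker script.

```c
#include <stdint.h>

/* Copy initialised data from its load address (flash) to its run address
   (RAM), one word at a time. */
void startup_copy_data(const uint32_t *load, uint32_t *start, const uint32_t *end)
{
    while (start < end)
        *start++ = *load++;
}

/* Clear the .bss range so zero-initialised statics really start at zero. */
void startup_zero_bss(uint32_t *start, const uint32_t *end)
{
    while (start < end)
        *start++ = 0u;
}

/* In a reset handler this would typically run before anything else, e.g.
 * (symbol names are hypothetical and linker-script specific):
 *
 *     startup_copy_data(&__data_load, &__data_start, &__data_end);
 *     startup_zero_bss(&__bss_start, &__bss_end);
 *     // only now: hardware initialisation that uses statics, then main()
 */
```

Any function called ahead of these two steps must not rely on static or global variables holding their initial values.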
The register keyword is a relic from an era before optimising compilers existed. Its original purpose was to tell the compiler: "keep this variable in a CPU register rather than spilling it over to memory."
Modern compilers — GCC, Clang, ARMCC, IAR — perform register allocations that are almost universally superior to anything that a developer would request manually.
The keyword has only ever been a non-binding hint. In C17 it still prohibits taking the address of a register variable, but it has little or no effect on optimised code generation in mainstream compilers. In C++ it was deprecated in C++11 and removed as a storage class specifier in C++17.
In embedded codebases, register most commonly appears in legacy code copied from the 1990s, in code generated by older code-generation tools, and in codebases maintained by engineers who were taught that it matters. It no longer does. Just remove it. It adds noise and signals to any reviewer that the surrounding code may not have been updated since the days when optimisers were not trusted.
The one practical consequence of register worth knowing: you cannot take the address of a register variable. This constraint survives even though the storage hint is ignored. If you are maintaining code that uses register and attempt to pass the variable by pointer, the compiler will reject it — which can create confusion when the intent of the original code is unclear.
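A short sketch of that constraint, using a hypothetical sum_samples function. The commented-out line is the one a conforming C compiler is expected to reject.

```c
#include <stddef.h>

long sum_samples(const int *samples, size_t count)
{
    register long acc = 0;   /* storage hint; the optimiser decides anyway */

    /* long *p = &acc;          constraint violation: address of a register variable */

    for (size_t i = 0; i < count; i++)
        acc += samples[i];
    return acc;
}
```

Deleting the register keyword is enough to make the address-of expression legal again.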
Introduced in C99, restrict is the most powerful and the most dangerous keyword in this list. It tells the compiler: the object pointed to by this pointer will not be accessed through any other pointer for the lifetime of this pointer. In return, the compiler is permitted to perform optimisations — particularly around load/store elimination and vectorisation — that it cannot otherwise safely apply.
The danger is that restrict is a contract, not a constraint. The compiler does not verify that the promise holds. If you declare a pointer restrict and the pointed-to object is also accessed through another pointer — aliasing occurs — the resulting behaviour is undefined. The compiler may have cached a value it was told would not change. The memory may have changed anyway. The output is wrong, and the compiler has done nothing wrong.
/* Correct use: output and input never overlap */
void dsp_filter(float * restrict output,
                const float * restrict input,
                size_t n)
{
    for (size_t i = 0; i < n; i++)
        output[i] = input[i] * 0.5f;
    /* compiler can vectorise freely */
}

/* Broken contract: called with overlapping ranges */
float buf[128];

dsp_filter(buf + 1,   /* output */
           buf,       /* input: ALIASES output */
           64);
/* restrict promise is broken: output and input alias each other */

In DSP code, DMA transfer handlers, and any other function processing large data buffers on constrained hardware, restrict can produce meaningful performance improvements. A 20–40% reduction in cycle count on SIMD-capable cores is not unusual. But the precondition must be true at every call site. It must be documented, reviewed, and tested explicitly.
Undefined behaviour from a broken restrict contract is among the most difficult classes of bug to diagnose, because the failure usually appears far from the violation itself.
The inline keyword carries two meanings in the minds of most embedded developers: "make this function fast" and "put this function's body at the call site." Both beliefs are only partially true. And the gap between belief and reality produces subtle bugs.
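One way to test the precondition explicitly is a debug-build assertion in front of the restrict-qualified loop. The sketch below is an assumption-laden illustration, not a standard facility: ranges_overlap and dsp_filter_checked are hypothetical names, and comparing addresses through uintptr_t assumes a flat address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if two ranges of `bytes` bytes starting at a and b share any byte.
   Assumes a flat address space where uintptr_t values are comparable. */
bool ranges_overlap(const void *a, const void *b, size_t bytes)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    return (pa < pb) ? (pb - pa < bytes) : (pa - pb < bytes);
}

/* Same loop as a restrict-qualified filter, with the contract checked
   before any element is accessed (compiled out when NDEBUG is defined). */
void dsp_filter_checked(float *restrict output,
                        const float *restrict input,
                        size_t n)
{
    assert(!ranges_overlap(output, input, n * sizeof(float)));
    for (size_t i = 0; i < n; i++)
        output[i] = input[i] * 0.5f;
}
```

A call such as dsp_filter_checked(buf + 1, buf, 64) would trip the assertion in a debug build instead of silently producing wrong output in a release build.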
The C99 standard is explicit: inline is a hint. The compiler may ignore it.
Modern compilers inline aggressively at higher optimisation levels — functions not marked inline are routinely inlined at -O2 if the compiler determines that the call overhead is worth eliminating. Conversely, functions marked inline may not be inlined if the call site count is high, the function is recursive, or the function body is large enough that inlining would cause instruction cache pressure.
If you need to guarantee inlining, use __attribute__((always_inline)) on GCC and Clang, __forceinline on MSVC, or #pragma inline=forced on IAR.
Placing an inline function definition in a header file — a common practice for small utility functions — creates a linkage hazard that varies between C and C++.
In C99, an inline function definition in a header has external linkage by default but provides no external definition. If the compiler decides not to inline a call, it needs an external definition — which must exist in exactly one translation unit. Missing this produces linker errors.
Declaring the function static inline avoids the issue by giving each translation unit its own copy.
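If a single shared out-of-line copy is preferred over one copy per translation unit, the C99 model can be satisfied by requesting the external definition in exactly one source file. A minimal sketch, with the hypothetical files clamp.h and clamp.c shown together and a hypothetical function name clamp_ext; this applies to C only, since C++ uses different inline rules.

```c
/* ---- clamp.h: inline definition, visible to every includer ---- */
inline int clamp_ext(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* ---- clamp.c: includes clamp.h, then adds this one line ---- */
/* The extern declaration turns the inline definition above into the
   external definition for this translation unit, so calls that were
   not inlined elsewhere have something to link against. */
extern inline int clamp_ext(int v, int lo, int hi);
```

The trade-off is one more file to maintain in exchange for a single copy of the function in the final image.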
/* header.h: the three definitions below are alternatives, not one file */

/* C99 inline:
   may cause linker errors if the compiler chooses not to inline */
inline int clamp(int v, int lo, int hi) {
    return v < lo ? lo : v > hi ? hi : v;
}

/* Preferred:
   static inline eliminates linkage ambiguity */
static inline int clamp(int v, int lo, int hi) {
    return v < lo ? lo : v > hi ? hi : v;
}

/* When inlining must be guaranteed:
   compiler-specific, NOT portable */
__attribute__((always_inline))
static inline int clamp(int v, int lo, int hi) {
    return v < lo ? lo : v > hi ? hi : v;
}

Inlined functions disappear as distinct entities in the binary. If you are debugging with a JTAG probe and stepping through the code, an aggressively inlined codebase becomes significantly harder to follow. The debugger may not be able to show the call stack accurately, and breakpoints set on inlined functions may not trigger as expected.
In DO-178C contexts, where you must demonstrate that every statement in the source code maps to at least one object code instruction that can be tested, aggressive inlining can complicate structural coverage analysis unless your toolchain explicitly supports inline-aware coverage.
The pattern across these six keywords is the same: each is meant to solve a specific problem in a specific context. And each is routinely applied outside that context by developers who know the syntax but not the semantics.
The standard is clear. Hardware is indifferent. Compilers are opportunistic.
Understanding the gap between what you write and what the system will actually do is the discipline that separates serious practitioners from casual ones.