Trading modularity for performance and portability
TL;DR By merging the interface and implementation of a component into a single file, marking all functions and data as static, and adding a few C macros, you can help your C compiler generate highly optimized code for you.
Engineering is all about trade-offs! A problem can have many different solutions, even many acceptable solutions.
An acceptable solution is one that satisfies the pre-defined design goals. Defining those goals is, obviously, part of the designer's job.
I think performance and portability are the most important goals for embedded designs, due to the constrained nature of embedded systems.
So a design that sacrifices modularity is an acceptable solution for me.
Design
So the idea here is to get rid of the separate files for the interface and implementation of a software component and merge them. By software component, I mean a logical unit that does some useful task, e.g., a "touch component" that interacts with a touch IC.
Physical layout design
As an example, the physical layout of a project with three components (comp-a, comp-b, and comp-c) will look like this (this is a simplified view; you need more files in a real-world project):
\
+-- platform/
|   \
|   +-- ...
+-- comp-a.h
+-- comp-a.test.c
+-- comp-b.h
+-- comp-b.test.c
+-- comp-c.h
+-- comp-c.test.c
+-- app.h
+-- app.test.c
+-- main.c
Other than main.c, all .c files are test code, and they run on the build machine (not on the target microcontroller, obviously).
Each component comprises:
- an "interface/implementation file" (e.g., comp-a.h)
- a unit test file (e.g., comp-a.test.c)
app.h includes and configures all required components, and provides the app_setup and app_loop functions.
There's also an integration test for the whole app in the app.test.c file.
Program design
There are two phases in a program:
- Initialization
- Main loop
// main.c
#include "app.h"

int
main(void)
{
    app_setup(); // initialization
    while (1)
        app_loop(); // main loop
    return 0;
}
app.h provides app_setup and app_loop.
Platform abstraction
Each component expects a set of macros to be defined. These macros expose the functionality of the platform (abstraction).
Suppose comp-a needs to do IO using the GPIO functionality of the platform. We can pass that functionality to comp-a.h like this (assuming that gpio_write is a function defined by the platform):
// app.h
#define COMP_A_GPIO_WRITE(port, pin) gpio_write(GPIO_PORTS_##port, pin)
#include "comp-a.h"
#undef COMP_A_GPIO_WRITE
comp-a.h can provide an explicit error message when the expected macro is not defined:
// comp-a.h
#ifndef COMP_A_GPIO_WRITE
#error COMP_A_GPIO_WRITE should be defined by user
#endif
This way we can improve the modularity :)
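To make the mechanics concrete, here is a minimal single-file sketch of the same idea. Everything in it is an assumption made for illustration: the recording gpio_write stand-in, the GPIO_PORTS_A/GPIO_PORTS_B enumerators, and the comp_a_led_on function are hypothetical, and the body of comp-a.h is inlined so the example compiles on its own.

```c
#include <assert.h>

// --- platform side: hypothetical stand-in that only records its arguments ---
enum { GPIO_PORTS_A, GPIO_PORTS_B };
static int last_port = -1;
static int last_pin = -1;

static void
gpio_write(int port, int pin)
{
    last_port = port;
    last_pin = pin;
}

// --- app.h: hand the platform functionality to the component ---
#define COMP_A_GPIO_WRITE(port, pin) gpio_write(GPIO_PORTS_##port, pin)

// --- what a hypothetical comp-a.h could contain (inlined here) ---
#ifndef COMP_A_GPIO_WRITE
#error COMP_A_GPIO_WRITE should be defined by user
#endif

static void
comp_a_led_on(void)
{
    // Expanded here, at preprocessing time, to gpio_write(GPIO_PORTS_B, 3)
    COMP_A_GPIO_WRITE(B, 3);
}

// --- app.h again: the macro is no longer needed after the "include" ---
#undef COMP_A_GPIO_WRITE

// Host-side check that the component reached the platform through the macro
static void
comp_a_selfcheck(void)
{
    comp_a_led_on();
    assert(last_port == GPIO_PORTS_B && last_pin == 3);
}
```

Because the macro is expanded where comp_a_led_on is defined, the #undef afterwards does not affect the component; it only keeps the name from lingering in the rest of app.h.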
Now, let's look at a concrete example: a component that talks to a touch IC over I2C (even though I hate the STM32 HAL library, I'll use it for this example):
// app.h
#define TOUCH_I2C_TX(adr, buf, size) \
    (HAL_I2C_Master_Transmit(&hi2c1, (adr), (buf), (size), 1) == HAL_OK)
#define TOUCH_I2C_RX(adr, buf, size) \
    (HAL_I2C_Master_Receive(&hi2c1, (adr), (buf), (size), 1) == HAL_OK)
#include "touch.h"
#undef TOUCH_I2C_RX
#undef TOUCH_I2C_TX
// Our *expected* declarations for `touch_init` and `touch_read` functions
// which are defined in `touch.h`.
// This is useful for catching possible mismatches.
static void
touch_init(void);
static int
touch_read(uint8_t* pressed, uint8_t* released);
We expect touch.h to give us two functions: touch_init and touch_read.
I put the declarations of touch_init and touch_read after the inclusion of touch.h to catch possible errors early and explicitly. This is a double-check!
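Here is a small sketch of why the repeated declaration works as a check. The body of touch_read is a hypothetical stand-in for whatever touch.h really defines; the point is only that C accepts a second declaration when it is compatible with the definition and rejects it when it is not.

```c
#include <assert.h>
#include <stdint.h>

// Stand-in for the definition that comes from touch.h (hypothetical body)
static int
touch_read(uint8_t* pressed, uint8_t* released)
{
    *pressed = 0;
    *released = 0;
    return 1;
}

// The *expected* declaration, as written in app.h.
// It is compatible with the definition above, so it compiles silently.
static int
touch_read(uint8_t* pressed, uint8_t* released);

// If the signature in the header ever drifted, for example to
//     static int touch_read(uint16_t* pressed, uint16_t* released)
// the declaration above would no longer match, and the compiler would
// stop with a "conflicting types" style error instead of letting the
// mismatch through.

static void
touch_decl_selfcheck(void)
{
    uint8_t pressed = 0xFF, released = 0xFF;
    assert(touch_read(&pressed, &released) == 1);
    assert(pressed == 0 && released == 0);
}
```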
Other components are glued together in the same manner in app.h, so app.h would look something like this:
// app.h
#define COMP_A_GPIO_WRITE(port, pin) gpio_write(GPIO_PORTS_##port, pin)
#include "comp-a.h"
#undef COMP_A_GPIO_WRITE
// ...
#include "comp-b.h"
// ...
// ...
#include "comp-c.h"
// ...
static void
app_setup(void)
{
// ...
}
static void
app_loop(void)
{
// ...
}
Improving modularity
To improve modularity and reduce user errors, a component must not leak data, macro, or function definitions (other than the expected ones).
Imagine comp-a defines a COMP_A_DATA_ variable (a static, global variable for storing internal data inside the component), and we don't want to leak it outside comp-a.h.
To accomplish this, we can use macros!
#ifndef COMP_A_H_
#define COMP_A_H_
static struct
{
// ...
} COMP_A_DATA_;
// ...
#define COMP_A_DATA_ PRIVATE_
#endif
The preprocessor will replace every COMP_A_DATA_ identifier that appears after the inclusion of comp-a.h with PRIVATE_, and the user will get a compilation error for an undefined identifier PRIVATE_ if they inadvertently use that variable. Easy!
This technique applies to functions, too. For macros, you can use #undef.
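As a rough sketch of all three cases together (data, a helper function, and a helper macro), consider the header below. The names comp_a_counter_, comp_a_bump_, COMP_A_STEP_, and comp_a_tick are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#ifndef COMP_A_H_
#define COMP_A_H_

#define COMP_A_STEP_ 1 // internal helper macro

static uint32_t comp_a_counter_; // internal data

// Internal helper function
static void
comp_a_bump_(void)
{
    comp_a_counter_ += COMP_A_STEP_;
}

// The only function the component is expected to provide
static uint32_t
comp_a_tick(void)
{
    comp_a_bump_();
    return comp_a_counter_;
}

// Hide the internals from everything that follows the header:
// data and functions are renamed away, macros are simply removed.
#define comp_a_counter_ PRIVATE_
#define comp_a_bump_ PRIVATE_
#undef COMP_A_STEP_

#endif

// User code: only comp_a_tick is reachable by its real name from here on.
// Writing comp_a_bump_() at this point would expand to PRIVATE_() and
// fail to build, because PRIVATE_ is not declared anywhere.
static void
comp_a_private_selfcheck(void)
{
    assert(comp_a_tick() == 1);
    assert(comp_a_tick() == 2);
}
```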
Advantages
Compiler can apply more eager optimizations
Let's take a look at the magic of the static keyword on functions in this example on Compiler Explorer. It isn't real-world code; it's just there to show the effect of static in function declarations.
If you uncomment line 5 of the source code, you'll see that the compiler replaces function calls with simple jumps, which are in general cheaper. (Because the function foo is not that complicated, the benefit is not obvious.)
The reason is that the optimizer knows for sure that this function has internal linkage, so no code outside the translation unit can call it and there's no need to keep it callable the way the ABI prescribes.
But I encourage you to look at the disassembly of your own code (arm-none-eabi-objdump -d -j .text your-firmware.elf). You'll be amazed by the improvement in the generated code.
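If you want something small to feed to the compiler and disassemble, the following sketch contrasts the two kinds of linkage. It is not the Compiler Explorer example from this post; the function names are made up for illustration, and what the optimizer actually does with them depends on your compiler and flags.

```c
#include <assert.h>

// Internal linkage: every caller is visible in this translation unit,
// so the optimizer is free to inline the function and emit no
// standalone copy at all.
static int
scale(int x)
{
    return 3 * x + 1;
}

// External linkage: another translation unit might call this, so a
// callable definition has to stay in the object file even if every
// local call gets inlined.
int
scale_ext(int x)
{
    return 3 * x + 1;
}

int
use_both(int x)
{
    return scale(x) + scale_ext(x);
}

static void
linkage_selfcheck(void)
{
    // Both versions compute the same value; only the linkage differs.
    assert(use_both(2) == 14);
}
```

Compiling this with optimizations enabled and looking at the symbol table or disassembly is a quick way to see which of the two functions still exists as a separate symbol.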
Debugging on development machine
Because of the platform abstractions, the code is very testable (you can mock the platform easily). So you can track down bugs on your development machine, using your favorite debugger, instead of on the target microcontroller.
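A host-side test for the touch component could look roughly like the sketch below. It is only an illustration under several assumptions: the fake I2C functions, the 0x5A device address, the status register 0x00, the two-byte response layout, and the "non-zero means success" return convention of touch_read are all hypothetical, and the body of touch.h is inlined so the example stands alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Fake I2C bus: records what was sent and replays a canned response
static uint8_t fake_tx_log[8];
static size_t fake_tx_len;
static uint8_t fake_rx_data[2];

static int
fake_i2c_tx(uint8_t adr, const uint8_t* buf, size_t size)
{
    (void)adr;
    if (size > sizeof fake_tx_log)
        return 0;
    memcpy(fake_tx_log, buf, size);
    fake_tx_len = size;
    return 1;
}

static int
fake_i2c_rx(uint8_t adr, uint8_t* buf, size_t size)
{
    (void)adr;
    if (size > sizeof fake_rx_data)
        return 0;
    memcpy(buf, fake_rx_data, size);
    return 1;
}

// The test file plugs the fakes in where app.h would plug in the HAL
#define TOUCH_I2C_TX(adr, buf, size) fake_i2c_tx((adr), (buf), (size))
#define TOUCH_I2C_RX(adr, buf, size) fake_i2c_rx((adr), (buf), (size))

// Stand-in for #include "touch.h" (hypothetical implementation)
static int
touch_read(uint8_t* pressed, uint8_t* released)
{
    const uint8_t reg = 0x00; // hypothetical status register
    uint8_t data[2];
    if (!TOUCH_I2C_TX(0x5A, &reg, 1))
        return 0;
    if (!TOUCH_I2C_RX(0x5A, data, sizeof data))
        return 0;
    *pressed = data[0];
    *released = data[1];
    return 1;
}

#undef TOUCH_I2C_RX
#undef TOUCH_I2C_TX

static void
test_touch_read_reports_bus_data(void)
{
    uint8_t pressed = 0, released = 0;
    fake_rx_data[0] = 0x01;
    fake_rx_data[1] = 0x80;
    assert(touch_read(&pressed, &released) == 1);
    assert(pressed == 0x01);
    assert(released == 0x80);
    // The component asked for the status register before reading
    assert(fake_tx_len == 1 && fake_tx_log[0] == 0x00);
}
```

In a real touch.test.c, a main function would call test functions like this one, and the whole file would be built with the host compiler rather than the cross compiler.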
Static and dynamic analysis tools
Regardless of your target compiler, you can use the analysis tools that gcc and clang provide, and you can employ different sanitizers to catch errors dynamically.
You can compile your test code with gcc (version 10 or later) using -fanalyzer to take advantage of gcc's static analyzer (see the GCC documentation for details).
LLVM has the Clang Static Analyzer project.
You can also compile the test code with different sanitizers like AddressSanitizer (using the option -fsanitize=address), UndefinedBehaviorSanitizer (using the option -fsanitize=undefined), etc.
Check your compiler's manual for more info. There are a ton of sanitizers that can help you detect errors before the code hits the target microcontroller, thanks to "platform abstraction"!
Disadvantages
- Having implementation inside files with a .h suffix makes some people grumpy!
- Use of macros (some people just don't like macros!)
- Requires the programmer to be more disciplined in writing code
- Because of more eager optimizations of the compiler, debugging can become harder.
Conclusion
Summary of what we did:
- Hardware abstraction using macros (instead of function pointers)
- Put all the code into a single translation unit using the C preprocessor
- Applied simple macro tricks to improve maintainability
The advantages of this approach outweigh the disadvantages by at least an order of magnitude! So this is a good trade-off for me.
Acknowledgements
I want to thank Ali-Reza Chegini and Hamid Rostami for reading the draft of this post and giving useful feedback.