Author: Lucian-Ioan Popescu (SRIC 1)
State-of-the-art compilers use Undefined Behavior (UB) to trigger various optimizations. The effects of these optimizations range from changes in code size and code speed to changes in power consumption. All of these aspects are critical for IoT applications with limited resources. In this work we study the impact of UB optimizations on IoT devices using Nuttx, a popular real-time operating system for IoT devices, Coremark, a benchmark that measures the performance of microcontrollers, and Clang, the compiler used to build both Nuttx and Coremark.
In C/C++, the standard imposes no requirements on the behavior of a program that contains UB. The compiler is therefore free to assume that UB never occurs during the transformation passes from C/C++ source code to target-specific assembly code. This example [1] shows how signed overflow UB can lead to better code inside loops. In short, on common 64-bit platforms the int type is still 32 bits wide, which causes problems when computing 64-bit addresses with 32-bit offsets: the offset has to be extended to 64 bits before it can be added to the base address. Because signed overflow is UB, however, compilers may assume that the counter never wraps and promote int32 loop counters to int64, leading to shorter and faster code.
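The following is a minimal sketch of the kind of loop this argument applies to; it is not the code from [1]. The function names `sum_signed` and `sum_unsigned` are hypothetical and chosen only for illustration. The two functions differ only in the signedness of the index, which is what decides whether the compiler may assume the index arithmetic never wraps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Signed 32-bit index: overflow of start + i would be UB, so the
 * compiler may assume it never happens. On a 64-bit target it is then
 * allowed to widen the index to 64 bits once and step a pointer through
 * the array. */
int64_t sum_signed(const int32_t *a, int start, int count) {
    assert(a != NULL || count <= 0);
    int64_t sum = 0;
    for (int i = 0; i < count; ++i)
        sum += a[start + i];
    return sum;
}

/* Unsigned 32-bit index: start + i is defined to wrap modulo 2^32, so
 * the compiler has to preserve that wraparound, which can force a
 * 32-bit add and a zero-extension on every iteration. */
int64_t sum_unsigned(const int32_t *a, unsigned start, unsigned count) {
    assert(a != NULL || count == 0);
    int64_t sum = 0;
    for (unsigned i = 0; i < count; ++i)
        sum += a[start + i];
    return sum;
}
```

Both functions return the same result for in-range arguments. One way to observe the difference is to compile them with optimizations enabled for a 64-bit target and compare the generated assembly; the exact output depends on the compiler version and the target.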
Even though we showed that the generated code is shorter and faster for this specific example, in the general case we do not know what the final generated code will look like. This is because transformation passes might interact in unpredictable ways that change the size and speed of the final code.
The first goal of this project was to compile Nuttx with Clang, because much of my earlier work on the impact of UB in other use cases had been done in Clang. There were already some efforts in this area [2], but I could not use them because they did not provide a complete toolchain for compiling Nuttx.
The ISA used by ESP-32 boards is designed by Cadence and is called Xtensa [4]. Much of the work of targeting this architecture in LLVM had already been started by Espressif [3]. The first step of integrating Espressif's LLVM fork into Nuttx was to modify the Nuttx build system so that Nuttx could be compiled with Clang. The patches that I introduced can be found in my fork of Nuttx [5].
In summary, the changes that I needed to make were the following:
After this step was finished, I had to patch Xtensa LLVM so that it could successfully compile all of the Nuttx source code. You can find my fork of Xtensa LLVM here [6]. The modification itself was rather simple: fixing a typo in the register info tablegen, where the name of the `intset` register was wrongly typed as `interrupt`. However, the process was time-consuming because I had to debug various parts of the Xtensa backend before getting to the root cause of the problem.
At this point, I successfully compiled Nuttx with Xtensa LLVM.