This shows you the differences between two versions of the page.
iothings:proiecte:2022sric:ub-benchmark [2023/06/01 16:50] lucian_ioan.popescu |
iothings:proiecte:2022sric:ub-benchmark [2023/06/01 22:09] (current) lucian_ioan.popescu [Running the Benchmarks] |
||
---|---|---|---|
Line 34: | Line 34: | ||
"EEMBC’s CoreMark® is a benchmark that measures the performance of microcontrollers (MCUs) and central processing units (CPUs) used in embedded systems. Replacing the antiquated Dhrystone benchmark, Coremark contains implementations of the following algorithms: list processing (find and sort), matrix manipulation (common matrix operations), state machine (determine if an input stream contains valid numbers), and CRC (cyclic redundancy check). It is designed to run on devices from 8-bit microcontrollers to 64-bit microprocessors." | "EEMBC’s CoreMark® is a benchmark that measures the performance of microcontrollers (MCUs) and central processing units (CPUs) used in embedded systems. Replacing the antiquated Dhrystone benchmark, Coremark contains implementations of the following algorithms: list processing (find and sort), matrix manipulation (common matrix operations), state machine (determine if an input stream contains valid numbers), and CRC (cyclic redundancy check). It is designed to run on devices from 8-bit microcontrollers to 64-bit microprocessors." | ||
- | The results I was interested in are the following: coremark score (speed of execution), code size and power consumption. For the first metric I used the output of coremark, which I will present later. For the second metric I measured the binary size of Nuttx and Coremark after compiling them, and for the third metric I used a USB tester [8] that displays the voltage and the current consumed by my ESP32 board. | + | The results I was interested in are the following: coremark score (speed of execution), code size and power consumption. For the first metric I used the output of coremark, which I will present later. For the second metric I measured the binary size of Nuttx and Coremark after compiling them, and for the third metric I used a USB tester [8] that displays the voltage and the current consumed by my ESP32 board [10]. |
The following is a sample output for coremark: | The following is a sample output for coremark: | ||
Line 64: | Line 64: | ||
Next I will present all configurations used for this experiment. I used a total of 13 configurations based on various flags that change the behavior of the compiler with regard to exploiting UB. | Next I will present all configurations used for this experiment. I used a total of 13 configurations based on various flags that change the behavior of the compiler with regard to exploiting UB. | ||
- | ^ UB flag ^ Description ^ | + | ^ No ^ UB flag ^ Description ^ |
- | | -fwrapv | Treat signed overflow as two's complement | | + | | 1 | -fwrapv | Treat signed overflow as two's complement | |
- | | -fno-strict-aliasing | Don't use type based alias analysis | | + | | 2 | -fno-strict-aliasing | Don't use type based alias analysis | |
- | | -fstrict-enums | Enable optimizations that take advantage of enum's value range | | + | | 3 | -fstrict-enums | Enable optimizations that take advantage of enum's value range | |
- | | -fno-delete-null-pointer-checks | Assume that programs can safely dereference null pointers | | + | | 4 | -fno-delete-null-pointer-checks | Assume that programs can safely dereference null pointers | |
- | | -fno-finite-loops | Don't assume that all loops are finite | | + | | 5 | -fno-finite-loops | Don't assume that all loops are finite | |
- | | -fconstrain-shift-value | Constrain shift RHS so it doesn't produce undefined results when RHS >= bitwidth | | + | | 6 | -fconstrain-shift-value | Constrain shift RHS so it doesn't produce undefined results when RHS >= bitwidth | |
- | | -fno-constrain-bool-value | Don't constrain bool values in {0,1} | | + | | 7 | -fno-constrain-bool-value | Don't constrain bool values in {0,1} | |
- | | all + -O2 | All flags from above + -O2 | | + | | 8 | all + -O2 | All flags from above + -O2 | |
- | | all + -Os | All flags from above + -Os | | + | | 9 | all + -Os | All flags from above + -Os | |
- | | base + -O2 | No flag from above + -O2 | | + | | 10 | base + -O2 | No flag from above + -O2 | |
- | | base + -Os | No flag from above + -Os | | + | | 11 | base + -Os | No flag from above + -Os | |
- | | -fno-use-default-alignment | Use alignment of one for all memory operations | | + | | 12 | -fno-use-default-alignment | Use alignment of one for all memory operations | |
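As an illustration of the first flag in the table, the sketch below contrasts an overflow check that relies on signed wraparound with a portable one. The function names are hypothetical and are not taken from Nuttx or Coremark. In ISO C the first check has undefined behavior when `x` is `INT_MAX`, so an optimizing compiler is allowed to assume it is always false; with `-fwrapv` the addition wraps and the check behaves as written.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical example: relies on signed overflow wrapping around.
 * Undefined behavior in ISO C when x == INT_MAX; only well-defined
 * when the code is compiled with -fwrapv. */
bool increment_overflows_ub(int x)
{
    return x + 1 < x;
}

/* Portable variant: never performs the overflowing addition, so the
 * result does not depend on how the compiler treats signed overflow. */
bool increment_overflows(int x)
{
    return x == INT_MAX;
}
```

Comparing the generated code for the two functions with and without `-fwrapv` is one way to see whether a given compiler exploits this kind of UB.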
==== Results ==== | ==== Results ==== | ||
Line 96: | Line 96: | ||
There is no notable improvement between `base + -O2` and all the configurations that make use of UB. However, what is interesting to see is the impact of `all + -Os` compared with `base + -Os`: there is a performance decrease of 1%. | There is no notable improvement between `base + -O2` and all the configurations that make use of UB. However, what is interesting to see is the impact of `all + -Os` compared with `base + -Os`: there is a performance decrease of 1%. | ||
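For reference, a relative difference such as the one quoted above can be computed as in the following sketch. The helper name and the scores used in the usage note are hypothetical placeholders, not measurements from this experiment.

```c
/* Hypothetical helper: relative change of `variant` with respect to
 * `base`, in percent. A negative value means the variant scored lower
 * than the baseline. The caller must ensure base != 0. */
double percent_change(double base, double variant)
{
    return 100.0 * (variant - base) / base;
}
```

With placeholder scores, `percent_change(100.0, 99.0)` evaluates to `-1.0`, which would be read as a 1% decrease.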
- | Note that no results set contains numbers for the -fno-use-default-alignment configuration. This happens because Nuttx crashes when compiled with this flag and no benchmark can be run. Compare to x86, for which this flag is targeted, Xtensa has stricter alignment rules that cannot be modified. | + | Note that no results set contains numbers for the -fno-use-default-alignment configuration. This happens because Nuttx crashes when compiled with this flag and no benchmark can be run. Compared to x86, for which this flag was initially targeted, Xtensa has stricter alignment rules that cannot be modified. |
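For context, C code that cannot guarantee alignment usually avoids casting a byte pointer to a wider type, because dereferencing a misaligned pointer is undefined behavior. A common alternative is to copy the bytes with `memcpy`. The sketch below uses hypothetical helper names and is not code from Nuttx; it only shows the general technique.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from an address that may
 * not be 4-byte aligned. memcpy makes no alignment assumption, so this
 * is well-defined regardless of the target's alignment rules. */
uint32_t load_u32_unaligned(const unsigned char *p)
{
    uint32_t value;
    memcpy(&value, p, sizeof value);
    return value;
}

/* Hypothetical helper: write a 32-bit value to a possibly unaligned
 * address, using the same byte-wise copy. */
void store_u32_unaligned(unsigned char *p, uint32_t value)
{
    memcpy(p, &value, sizeof value);
}
```

Compilers typically lower these calls to a single load or store on targets that allow unaligned access, and to byte-wise accesses on targets that do not.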
==== Conclusions and Further Work ==== | ==== Conclusions and Further Work ==== | ||
- | The results show that there is not much difference in terms of code size or code speed in the context of undefined behavior for Nuttx and Coremark. One reason for that might be that while developing those systems, the developers made little to no use of undefined behavior. Another reason might be that Xtensa LLVM cannot take proper advantage of undefined behavior when triggering optimizations for Nuttx and Coremark. | + | The results show that there is not much difference in terms of code size, code speed and power consumption in the context of undefined behavior for Nuttx and Coremark. One reason for that might be that while developing those systems, the developers made little to no use of undefined behavior. Another reason might be that Xtensa LLVM cannot take proper advantage of undefined behavior when triggering optimizations for Nuttx and Coremark. |
Those are interesting paths worth further research. Moreover, rerunning the experiments with a better USB tester can lead to more accurate results with regard to the power consumption of the Xtensa processor. | Those are interesting paths worth further research. Moreover, rerunning the experiments with a better USB tester can lead to more accurate results with regard to the power consumption of the Xtensa processor. | ||
Line 106: | Line 106: | ||
==== References ==== | ==== References ==== | ||
- | [1] [[https://gist.github.com/rygorous/e0f055bfb74e3d5f0af20690759de5a7|A bit of background on compilers exploiting signed overflow]] | + | [1] [[https://gist.github.com/rygorous/e0f055bfb74e3d5f0af20690759de5a7|A bit of background on compilers exploiting signed overflow]] \\ |
- | + | [2] [[https://meka.rs/blog/2017/07/03/nuttx-and-clang/|NuttX and Clang]] \\ | |
- | [2] [[https://meka.rs/blog/2017/07/03/nuttx-and-clang/|NuttX and Clang]] | + | [3] [[https://github.com/espressif/llvm-project|Fork of LLVM targeted at Xtensa]] \\ |
- | + | [4] [[https://www.cadence.com/content/dam/cadence-www/global/en_US/documents/tools/ip/tensilica-ip/isa-summary.pdf|Xtensa ISA]] \\ | |
- | [3] [[https://github.com/espressif/llvm-project|Fork of LLVM targeted at Xtensa]] | + | [5] [[https://github.com/lucic71/nuttx|My fork of Nuttx]] \\ |
- | + | [6] [[https://github.com/lucic71/llvm-project-espressif|My fork of Xtensa LLVM]] \\ | |
- | [4] [[https://www.cadence.com/content/dam/cadence-www/global/en_US/documents/tools/ip/tensilica-ip/isa-summary.pdf|Xtensa ISA]] | + | [7] [[https://www.eembc.org/coremark/|Coremark]] \\ |
- | + | [8] [[https://www.emag.ro/tester-usb-ut658-uni-t-afisaj-lcd-9999-mah-mie0415/pd/DYL0T7MBM/?ref=hdr-favorite_products|Tester USB UT658 Uni-T, afisaj LCD, 9999 mAh]] \\ | |
- | [5] [[https://github.com/lucic71/nuttx|My fork of Nuttx]] | + | [9] [[https://www.espressif.com/sites/default/files/documentation/esp32_datasheet_en.pdf|ESP32 datasheet]] \\ |
- | + | [10] [[https://www.emag.ro/placa-dezvoltare-esp32-devkit-v1-ai669/pd/DXV9FDMBM/|Placa dezvoltare ESP32, DEVKIT V1]] | |
- | [6] [[https://github.com/lucic71/llvm-project-espressif|My fork of Xtensa LLVM]] | + | |
- | + | ||
- | [7] [[https://www.eembc.org/coremark/|Coremark]] | + | |
- | + | ||
- | [8] [[https://www.emag.ro/tester-usb-ut658-uni-t-afisaj-lcd-9999-mah-mie0415/pd/DYL0T7MBM/?ref=hdr-favorite_products|Tester USB UT658 Uni-T, afisaj LCD, 9999 mAh]] | + | |
- | + | ||
- | [9] [[https://www.espressif.com/sites/default/files/documentation/esp32_datasheet_en.pdf|ESP32 datasheet]] | + |