— |
ass:labs-2024:02:tasks:04 [2025/08/03 10:11] (current) florin.stancu created |
||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ==== 04. Debugging (aka the "fun" part) ==== | ||
+ | |||
+ | If you faithfully followed the instructions up to this point, you'll be glad to know that you are precisely 3 bugs away from having a working Linux-based system. So, let's get started: | ||
+ | |||
+ | === Bug A - bootm decompression error === | ||
+ | |||
+ | After executing ''bootm'', you may notice that the FDT and the ramdisk load successfully, but the kernel loading step fails with the following error: | ||
+ | |||
+ | <code> | ||
+ | uncompressed: uncompress error -28 | ||
+ | </code> | ||
+ | |||
+ | <note tip> | ||
+ | The U-Boot source code may hold some answers. \\ | ||
+ | The error message formatting looks something like ''"%s: uncompress error %d"''. \\ | ||
+ | The error code is an [[https://man.freebsd.org/cgi/man.cgi?query=errno&sektion=2&manpath=freebsd-release-ports|errno]] value. \\ | ||
+ | **grep** is your friend, but so is a text editor with [[https://neovim.io/doc/user/lsp.html|LSP]] support. | ||
+ | </note> | ||
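
The numeric part of the message can be decoded without leaving the terminal. The snippet below is only a sketch: it assumes the host's errno numbering matches the one U-Boot uses (true for Linux hosts for low values such as this one), and ''describe_uboot_error'' is a hypothetical helper name, not something that exists in U-Boot.

```python
import errno
import os


def describe_uboot_error(code: int) -> str:
    """Map a (usually negative) return code such as -28 to its errno name and message.

    Assumption: the host errno table matches the one compiled into U-Boot.
    """
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"


# The value printed by bootm in the error above
print(describe_uboot_error(-28))
```

Once the symbolic name is known, searching the U-Boot tree for the format string from the tip should lead to the code path that prints the error and to the condition that triggers it.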
+ | |||
+ | === Bug B - Kernel panic === | ||
+ | |||
+ | Congratulations! The kernel is finally booting. I'm certain you're thankful for keeping that Makefile up to date, right? But what's this? | ||
+ | |||
+ | <code> | ||
+ | [ 1.878803] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) | ||
+ | [ 1.887072] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc1-00248-gb6e6cc1f78c7 #1 | ||
+ | [ 1.894996] Hardware name: TechNexion PICO-PI-8M (DT) | ||
+ | [ 1.900051] Call trace: | ||
+ | [ 1.902501] dump_backtrace+0x90/0xe8 | ||
+ | [ 1.906183] show_stack+0x18/0x24 | ||
+ | [ 1.909510] dump_stack_lvl+0x48/0x60 | ||
+ | [ 1.913182] dump_stack+0x18/0x24 | ||
+ | [ 1.916503] panic+0x31c/0x378 | ||
+ | [ 1.919567] mount_root_generic+0x254/0x324 | ||
+ | [ 1.923762] mount_root+0x16c/0x330 | ||
+ | </code> | ||
+ | |||
+ | Seeing a kernel panic for the first time may be a bit daunting, but the reason looks pretty clear: | ||
+ | |||
+ | <code> | ||
+ | Unable to mount root fs on unknown-block(0,0) | ||
+ | </code> | ||
+ | |||
+ | The kernel needs to know the backing device (and partition) for the root file system in order to mount it at ''/''. Apparently, U-Boot did not know how to tell it to use the ramdisk as rootfs, but chances are that the kernel //does// know about it. | ||
+ | |||
+ | This is not documented in the command's help message, but ''bootm'' relies on an environment variable called ''bootargs'' to pass the kernel its command line arguments. This should be a good place to start our investigation: | ||
+ | |||
+ | <code> | ||
+ | u-boot=> printenv bootargs | ||
+ | bootargs=console=ttymxc0,115200,115200 rdinit=/linuxrc clk_ignore_unused | ||
+ | </code> | ||
+ | |||
+ | From ''rdinit=/linuxrc'' we notice that the [[https://www.kernel.org/doc/html/v4.12/admin-guide/initrd.html#obsolete-root-change-mechanism|obsolete change_root procedure]] is being used. Since we don't actually have a persistent rootfs, let's just treat the ramdisk as root. | ||
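
To make the missing piece explicit, the ''bootargs'' string printed above can be split into individual parameters. This is a minimal sketch rather than how the kernel actually parses its command line: ''parse_bootargs'' is a hypothetical helper, and it does not handle quoted values that contain spaces.

```python
from typing import Dict, Optional


def parse_bootargs(bootargs: str) -> Dict[str, Optional[str]]:
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params: Dict[str, Optional[str]] = {}
    for token in bootargs.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


# Value copied from the printenv output above
bootargs = "console=ttymxc0,115200,115200 rdinit=/linuxrc clk_ignore_unused"
params = parse_bootargs(bootargs)

print("root" in params)   # is a root device specified at all?
print(params["rdinit"])   # the init program requested from the ramdisk
```

With no ''root'' parameter present, the kernel has no device to mount at ''/'', which is consistent with the ''unknown-block(0,0)'' in the panic message.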
+ | |||
+ | <note tip> | ||
+ | The environment variable can be changed from the U-Boot shell. \\ | ||
+ | The name of the ramdisk device will be ''/dev/ram0''. \\ | ||
+ | Check the (partial) documentation on the [[https://www.kernel.org/doc/html/v4.12/admin-guide/initrd.html#boot-command-line-options|kernel boot command line options]]. | ||
+ | </note> | ||
+ | |||
+ | <solution -hidden> | ||
+ | <code> | ||
+ | u-boot=> setenv bootargs "console=ttymxc0,115200,115200 root=/dev/ram0 rw clk_ignore_unused" | ||
+ | u-boot=> bootm 0x80000000 | ||
+ | </code> | ||
+ | </solution> | ||
+ | |||
+ | === Bug C - System freeze at login === | ||
+ | |||
+ | Now that the kernel panic has been solved, the only remaining issue is a freeze right after the switch to user space. | ||
+ | The final kernel log message should hint at the underlying problem: | ||
+ | |||
+ | <code> | ||
+ | Welcome to Buildroot | ||
+ | buildroot login: | ||
+ | [ 9.473402] random: crng init done | ||
+ | [ 33.757425] buck1: disabling | ||
+ | </code> | ||
+ | |||
+ | Try to decompile the device tree that we include in the FIT image and that is passed to Linux. What is ''buck1''? \\ | ||
+ | Apply the patch mentioned in the tips section below, then recompile the DTB and the FIT image. This should solve the problem. | ||
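
A DTB can usually be decompiled with something like ''dtc -I dtb -O dts -o board.dts board.dtb'' (the file names here are placeholders). Once you have the text form, the sketch below pulls out the node that declares the regulator named in the log, so its properties can be compared against the patch. Everything in ''SAMPLE_DTS'' is made up for illustration and is not the real PICO-PI-8M tree; ''enclosing_node'' is a hypothetical helper that ignores braces inside strings.

```python
from typing import Optional


def enclosing_node(dts: str, needle: str) -> Optional[str]:
    """Return the text of the innermost DTS node that contains `needle`."""
    pos = dts.find(needle)
    if pos < 0:
        return None

    # Walk backwards to the '{' that opens the node containing the needle.
    depth = 0
    start = pos
    while True:
        start -= 1
        if start < 0:
            return None
        if dts[start] == "}":
            depth += 1
        elif dts[start] == "{":
            if depth == 0:
                break
            depth -= 1

    # Walk forwards to the matching '}'.
    depth = 0
    for end in range(start, len(dts)):
        if dts[end] == "{":
            depth += 1
        elif dts[end] == "}":
            depth -= 1
            if depth == 0:
                break
    else:
        return None

    # Include the node name, which sits on the same line as the opening brace.
    line_start = dts.rfind("\n", 0, start) + 1
    return dts[line_start:end + 1].strip()


# Hypothetical, heavily trimmed decompiler output (all values are invented)
SAMPLE_DTS = """
/ {
	pmic@8 {
		regulators {
			BUCK1 {
				regulator-name = "buck1";
				regulator-min-microvolt = <0xaae60>;
				regulator-max-microvolt = <0x13d620>;
			};
			BUCK2 {
				regulator-name = "buck2";
			};
		};
	};
};
"""

node = enclosing_node(SAMPLE_DTS, 'regulator-name = "buck1"')
print(node)
```

Searching by ''regulator-name'' rather than by node name is a deliberate choice: the name shown in the kernel log comes from that property, and the node itself may be spelled differently.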
+ | |||
+ | <note tip> | ||
+ | This [[https://github.com/TechNexion/linux-tn-imx/commit/daa7ce9c0e92ec9c7f6cbb87e295ff449f1f8e41|patch]] from the TechNexion fork of the Linux kernel contains the solution. \\ | ||
+ | Apparently, the power regulator shuts itself down after a while if a certain attribute is not specified in the FDT. \\ | ||
+ | It's always the <del>butler</del> power regulator. | ||
+ | </note> | ||