Differences

This shows you the differences between two versions of the page.

--- ass:cursuri:04:theory:00 [2024/08/09 14:12]
radu.mantu created
+++ ass:cursuri:04:theory:00 [2024/08/15 19:20] (current)
radu.mantu [QEMU setup for kernel development]
@@ Line 2: / Line 2: @@
 In this section we present a step-by-step guide of setting up a kernel development environment using **qemu-system-aarch64**. This emulator is very widely used, enables easy kernel debugging, and is extremely efficient. However, it is not a replacement for a hardware testing environment. If you want to develop a //device driver// and not say, an //iptables plugin//, you should probably be looking for a development board. Similarly, this emulator does away with most of the ARM boot process, meaning that you're not going to be able to test out the bootloaders studied during the first two sessions. This solution is //strictly// for kernel development.
+=== Bootstrapping the rootfs ===
+In this example setup we're going to be using **debootstrap** to generate a root filesystem based on Debian.
+<code bash>
+$ mkdir rootfs
+$ debootstrap --arch arm64 stable rootfs/ \
+              http://ftp.hosteurope.de/mirror/ftp.debian.org/debian/
+</code>
+== Chrooting into different architectures ==
+If your host system is **x86-64**, then **chroot**-ing into the newly installed rootfs to change the root password (for example) would normally be impossible, since the host CPU cannot execute arm64 binaries. However, you can configure your system to automatically run foreign-architecture binaries using **qemu-user-static**. This makes executing non-native binaries transparent to **chroot**.
+For this, you need **binfmt.d** and **qemu-user-static**. It's up to you to find the appropriate packages on your distribution :p
+Note in the setup instructions below how we copy **qemu-aarch64-static** into our bootstrapped directory. When **chroot**-ing, we are creating what is called a [[https://www.man7.org/linux/man-pages/man7/mount_namespaces.7.html|mount namespace]]. Essentially, we are voluntarily restricting the spawned process's view of the entire filesystem in such a way that it will think that the //rootfs/ // directory is in fact **//the// root filesystem**. However, the transition to the new mount namespace happens //before// spawning the **/bin/bash** process, so that we don't use the native, x64 one. While this doesn't impact the kernel's ability to consult the **binfmt.d** service, it does pose a problem in that the userspace process still needs to have access to **qemu-aarch64-static**. As a result, we need to copy it into the bootstrapped filesystem so that it is available to an x64 kernel and CPU to execute while the userspace process is in a subordinated mount namespace. This is also the reason why we are using the //static// version instead of simply **qemu-aarch64** (which has shared object dependencies): we want to limit the number of x64 binaries installed in an arm64 rootfs and avoid overwriting pre-existing shared objects with the same name.
+<code bash>
+# make sure the binfmt service is active
+$ systemctl status systemd-binfmt
+# create the binfmt config file for aarch64 ELF handling
+#
+# if you have the `qemu-user-static-binfmt` package on your distro, copy this
+# file instead of using the heredoc: /usr/lib/binfmt.d/qemu-aarch64-static.conf
+$ cat >/etc/binfmt.d/qemu-aarch64-static.conf <<EOF
+:qemu-aarch64:M::\x7fELF\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\xb7\x00:\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff:/usr/bin/qemu-aarch64-static:FP
+EOF
+# restart the service so that the new entry gets registered
+$ systemctl restart systemd-binfmt
+# verify that the config was loaded correctly
+$ cat /proc/sys/fs/binfmt_misc/qemu-aarch64
+    enabled
+    interpreter /usr/bin/qemu-aarch64-static
+    flags: PF
+    offset 0
+    magic 7f454c460201010000000000000000000200b700
+    mask ffffffffffffff00fffffffffffffffffeffffff
+# copy qemu-user-static into the bootstrapped dir
+$ cp $(which qemu-aarch64-static) rootfs/usr/bin/
+# chroot into the aarch64 distribution
+$ chroot rootfs /bin/bash
+</code>
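+To make the ''magic'' and ''mask'' fields of the binfmt entry less opaque, here is a minimal sketch of the comparison they describe: a header byte only has to match the magic in the bit positions where the mask is set. The helper names ''matches_aarch64'' and ''header_hex'' are made up for this illustration; the kernel performs the equivalent check internally when a binary is executed.

```shell
#!/bin/bash
# magic and mask from the binfmt entry above (20 bytes, grouped for readability)
MAGIC="7f454c46 02010100 00000000 00000000 0200b700"
MASK="ffffffff ffffff00 ffffffff ffffffff feffffff"
MAGIC=${MAGIC// /}
MASK=${MASK// /}

# matches_aarch64 <hex header>: succeeds if every header byte equals the
# corresponding magic byte once both are filtered through the mask
matches_aarch64() {
    local hdr=${1// /} i h m k
    [ "${#hdr}" -ge "${#MAGIC}" ] || return 1
    for ((i = 0; i < ${#MAGIC}; i += 2)); do
        h=$((16#${hdr:i:2}))
        m=$((16#${MAGIC:i:2}))
        k=$((16#${MASK:i:2}))
        (( (h & k) == (m & k) )) || return 1
    done
    return 0
}

# header_hex <file>: print the first 20 bytes of a file as a hex string
header_hex() {
    od -An -v -tx1 -N20 "$1" | tr -d ' \n'
}

# hand-written example headers (not read from real binaries):
# e_type 0x0003 (shared object / PIE) passes because of the 0xfe mask byte
matches_aarch64 "7f454c46 02010100 00000000 00000000 0300b700" \
    && echo "arm64 PIE header: handled by qemu-aarch64-static"
# e_machine 0x003e (x86-64) does not match 0x00b7 (aarch64)
matches_aarch64 "7f454c46 02010100 00000000 00000000 02003e00" \
    || echo "x86-64 header: not handled by this entry"
```

+On a real file, the same check would be ''matches_aarch64 "$(header_hex rootfs/bin/bash)"''. Note how the ''00'' mask byte makes the OS ABI field irrelevant, while the ''fe'' mask byte accepts both executables and position-independent binaries.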
 === The Linux kernel ===
-This step is very straightforward. Just clone the Linux repo (with depth=1 to squash the commit history) and compile it for the **arm64** target. Optionally, you can enable the generation of debug symbols & **gdb** helper scripts; the config options are ''CONFIG_DEBUG_INFO_DWARF5'' and ''CONFIG_GDB_SCRIPTS'', found under ''Kernel hacking -> Compile-time checks and compiler options''.
+This step is very straightforward. Just clone the Linux repo (with depth=1 to truncate the commit history, reducing the amount of downloaded data) and compile it for the **arm64** target. Notice in the instructions below that we also specify ''kvm_guest.config''. Rather than attempting a //perfect// emulation of arm64 devices, we are open with the kernel and let it know that it's running in a virtual environment. This permits us to cut some corners and save some crucial clock cycles, increasing the performance of the VM. This is also the main difference between //emulation// and //simulation//.
+Optionally, you can enable the generation of debug symbols and **gdb** helper scripts. The config options are ''CONFIG_DEBUG_INFO_DWARF5'' and ''CONFIG_GDB_SCRIPTS'', found under ''Kernel hacking -> Compile-time checks and compiler options''. Also, make sure ''CONFIG_DEBUG_INFO_REDUCED'' is disabled so that the compiler will generate debug info for //structures//.
 <code bash>
 # clone the repo
 ~$ git clone --depth=1 https://github.com/torvalds/linux.git
+~$ cd linux/
 # generate a default config & compile
 # NOTE: the default target will also compile the modules
-linux$ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- defconfig
+linux$ make CROSS_COMPILE=aarch64-linux-gnu- ARCH=arm64 defconfig kvm_guest.config
-linux$ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -j $(nproc)
+linux$ make CROSS_COMPILE=aarch64-linux-gnu- ARCH=arm64 -j $(nproc)
+# prepare the kernel build environment for compiling out-of-tree modules (for later)
+linux$ make CROSS_COMPILE=aarch64-linux-gnu- ARCH=arm64 modules_prepare -j $(nproc)
+# (optional) install modules in the bootstrapped rootfs directory
+linux$ make CROSS_COMPILE=aarch64-linux-gnu- ARCH=arm64 INSTALL_MOD_PATH=../rootfs modules_install
+# notice how the modules are installed under rootfs/lib/modules/ and a subdirectory
+# that's specific to this exact kernel version; you can check this version as shown below
+# NOTE: the "-rc<n>" after the "<major>.<minor>.<patch>" stands for "release candidate"
+linux$ make kernelversion
+6.11.0-rc2
 </code>
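+As a small illustration of the version format, the string can be split with plain shell parameter expansion. The value ''6.11.0-rc2'' below is only an example and is not read from a real tree. Also keep in mind that the directory under ''rootfs/lib/modules/'' is named after the output of ''make kernelrelease'', which can carry an extra local version suffix on top of what ''make kernelversion'' prints.

```shell
#!/bin/bash
# example value only; in a real tree use: ver=$(make -s kernelversion)
ver="6.11.0-rc2"

# strip everything starting with the first '-' to get <major>.<minor>.<patch>
base=${ver%%-*}
# whatever follows the base is the release candidate tag (empty for a final release)
extra=${ver#"$base"}
# split the base on dots
IFS=. read -r major minor patch <<<"$base"

echo "major=$major minor=$minor patch=$patch extra=${extra:-none}"
```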
@@ Line 23: / Line 84: @@
 Unfortunately, the server does not have any foolproof method of inferring the structure of the project. For example, it has no idea whether you are using the //arm64// or the //x86// definition of a function. Luckily for us, the Linux build system contains a script that's able to generate a file named ''compile_commands.json''. This file contains all compiler invocations used during the build, from which the include paths, compiled sources, macro definitions, etc. can be deduced. The language server will automatically load it and generate indices for the compiled source code.
-Note that this script only forks for the Linux kernel. If you want to generate ''compile_commands.json'' for other projects, consider using [[https://github.com/rizsotto/Bear|Bear]]. This is a tool that hooks the ''exec()'' family of library calls via LD_PRELOAD hooking in order to extract the command arguments.
+<code bash>
+linux$ ./scripts/clang-tools/gen_compile_commands.py
+</code>
+Note that this script only works for the Linux kernel. If you want to generate ''compile_commands.json'' for other projects, consider using [[https://github.com/rizsotto/Bear|Bear]]. This is a tool that hooks the ''exec()'' family of library calls via ''LD_PRELOAD'' in order to extract the command arguments.
+Also, for such a large codebase, expect the language server to take up //a lot// of your CPU when you first open your editor, while doing the indexing (usually in a ''.cache/'' directory).
 </note>
-=== Bootstrapping the rootfs ===
+=== Booting the VM ===
-In this example setup we're going to be using **debootstrap** to generate a root filesystem based on Debian.
+In order to boot the VM, we're going to use the following script:
-<code bash>
+<code bash run.sh>
-~$ mkdir rootfs
+#!/bin/bash
-~$ sudo debootstrap --arch arm64 stable rootfs/ \
-                     http://ftp.hosteurope.de/mirror/ftp.debian.org/debian/
+# get workspace directory path from environment
+# or use the current working directory as a default value
+WS_DIR=${WS_DIR:-$(pwd)}
+# boot up the VM
+qemu-system-aarch64                     \
+    -machine virt                       \
+    -cpu cortex-a53                     \
+    -smp 1                              \
+    -m 512M                             \
+    -nographic                          \
+    -kernel ${WS_DIR}/linux/arch/arm64/boot/Image \
+    -virtfs local,path=${WS_DIR}/rootfs,mount_tag=rootfs,security_model=passthrough,id=rootfs,multidevs=remap \
+    -append "root=rootfs rootfstype=9p rootflags=trans=virtio,version=9p2000.L rw console=ttyAMA0"
 </code>
+This may look overwhelming but let's take it one step at a time and analyze each flag passed to **qemu-system-aarch64**:
+  * ''machine'': Specifies what //platform// we want it to emulate. For example, it can even emulate a Raspberry Pi 4B. Here, ''virt'' means that we are open with the guest kernel about the fact that it's running in a virtual environment. For a full list of supported platforms, pass this flag the ''help'' argument.
+  * ''cpu'': The CPU model that we want it to emulate. Note that it can't emulate the //exact// implementation of a vendor, such as the i.MX8M Quad on our boards. Only the high-level specification that ARM provides, namely the Cortex-A53.
+  * ''smp'': Number of symmetric multi-processors (basically CPUs) that we want to allocate to the VM. Keep this under the value of ''$(nproc)''.
+  * ''m'': The amount of RAM to allocate to the VM.
+  * ''nographic'': If this option is left out, **qemu-system** will spawn a GUI window for the VM's terminal. As Linux users, we resent this and instead ask the emulator to redirect the VM's I/O to/from our current terminal. If you need to forcibly shut down **qemu-system** while in this mode, use the ''<Ctrl + A> + X'' key combination.
+  * ''kernel'': Pretty self-explanatory. It's the kernel image that will be loaded.
+  * ''virtfs'': This option specifies a virtual storage device based on the [[https://www.kernel.org/doc/html/latest/filesystems/9p.html|9p]] remote filesystem protocol. Using this instead of a disk image (like the one you had to create with **partx** in the second session) can be chalked up to personal preference. With 9p you can manipulate the guest filesystem from the host while the VM is running. If we used a disk image, we'd have to mount it on both the host and the guest to obtain the same functionality, and this would lead to corruption with //100% certainty// due to the lack of synchronization mechanisms at the VFS driver level (between host and guest kernels). For more information regarding its key-value list of arguments, check out the [[https://wiki.qemu.org/Documentation/9psetup|documentation]].
+  * ''append'': The command line arguments for the kernel. Check [[https://www.kernel.org/doc/html/v4.14/admin-guide/kernel-parameters.html|this]] out for more kernel cli arguments.
+    * ''root=rootfs'': The rootfs device; its value must match that of the ''mount_tag'' attribute from the previous flag.
+    * ''rootfstype=9p'': Informs the kernel what the backing storage device for the rootfs will be.
+    * ''rootflags=trans=virtio,version=9p2000.L'': 9p-specific configuration; it's copy-pasted from the QEMU docs :p
+    * ''rw'': Specifies that the rootfs must be mounted with both read and write permissions. Our home directory is on that partition so yeah, it'd better be writeable.
+    * ''console=ttyAMA0'': The standard output for the kernel's terminal (that's also being used in userspace; this is a "server" flavor of a Debian distribution). The **ttyAMA0** name is specific to a serial device that's instantiated by the emulator based on the ''machine'' flag's value.
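+Since ''root='' has to agree with ''mount_tag'', a quick sanity check can catch a mismatch before the VM fails to mount its rootfs. The following is only a sketch: ''get_opt'' is a hypothetical helper written for this example (it is not part of QEMU), and the two strings mirror the values from the script above with a literal ''rootfs'' path instead of ''${WS_DIR}/rootfs''.

```shell
#!/bin/bash
# option strings mirroring the -virtfs and -append arguments of run.sh
VIRTFS="local,path=rootfs,mount_tag=rootfs,security_model=passthrough,id=rootfs,multidevs=remap"
APPEND="root=rootfs rootfstype=9p rootflags=trans=virtio,version=9p2000.L rw console=ttyAMA0"

# get_opt <string> <separator> <key>: print the value of the first key=value
# item found in a separator-delimited option string; fail if the key is absent
get_opt() {
    local item
    local -a items
    IFS=$2 read -r -a items <<<"$1"
    for item in "${items[@]}"; do
        if [[ $item == "$3="* ]]; then
            # drop everything up to and including the first '='
            echo "${item#*=}"
            return 0
        fi
    done
    return 1
}

tag=$(get_opt "$VIRTFS" , mount_tag)
root=$(get_opt "$APPEND" ' ' root)

if [[ $tag == "$root" ]]; then
    echo "ok: root=$root matches mount_tag=$tag"
else
    echo "mismatch: root=$root but mount_tag=$tag" >&2
fi
```

+The same helper also shows how the kernel command line is structured: items are separated by spaces, so ''rootflags'' keeps its whole comma-separated value, while ''rw'' is a bare flag without a value.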
+<note important>
+**Troubleshooting**
+----
+If the terminal in the VM is acting weird (e.g., commands wrapping over onto the same line and overwriting the characters on display), set ''TERM=xterm'' inside the VM. This variable defines the type of terminal for which the shell prepares its output. You can read up more on this topic [[https://bash.cyberciti.biz/guide/$TERM_variable|here]] but **xterm** is a safe option for **bash**, **zsh** and **sh**.
+Some terminal emulators such as [[https://github.com/kovidgoyal/kitty|kitty]] support custom ''TERM'' values such as **xterm-kitty** but most shells won't have any idea how to particularize the output for it. Stick to **xterm** :p
+----
+If the VM doesn't accept your root login password even after specifically setting it from **chroot** via **passwd**, make sure you're running the script with **sudo**. When debootstrapping the rootfs, the file owner was set to //root//, meaning that your unprivileged user won't have read access to ''/etc/shadow'' that has ''rw%%----%%'' permissions. As a result, ''systemd-login'' won't be able to validate your password against its stored cryptographic hash. Though I agree, it could be a bit more verbose about that fact :/
+</note>