==== QEMU setup for kernel development ====
<code bash>
# verify that the config was loaded correctly
$ cat /proc/sys/fs/binfmt_misc/qemu-aarch64
    enabled
    interpreter /usr/bin/qemu-aarch64-static
    flags: PF
    offset 0
    magic 7f454c460201010000000000000000000200b700
    mask ffffffffffffff00fffffffffffffffffeffffff
  
# copy qemu-user-static into the bootstrapped dir
$ cp /usr/bin/qemu-aarch64-static rootfs/usr/bin/
$ chroot rootfs /bin/bash
</code>
  
=== The Linux kernel ===
This step is very straightforward. Just clone the Linux repo (with depth=1 to squash the commit history, reducing the amount of downloaded data) and compile it for the **arm64** target. Notice in the instructions below that we also specify ''kvm_guest.config''. We don't plan to perform a //perfect// emulation of arm64 devices but instead to be transparent to the kernel and let it know that it's running in a virtual environment. This permits us to cut some corners and save some crucial clock cycles, increasing the performance of the VM. This is also the main difference between //emulation// and //simulation//.
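
As a rough sketch of that flow (the repository URL and the ''aarch64-linux-gnu-'' cross-compiler prefix below are assumptions; use whichever mirror and cross-toolchain your setup provides):

<code bash>
# shallow clone: the full commit history is not needed for building
$ git clone --depth=1 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
$ cd linux

# default arm64 config, with the KVM guest fragment merged on top
linux$ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- defconfig kvm_guest.config

# build the kernel image and the modules
linux$ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -j$(nproc) Image modules
</code>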
  
Optionally, you can enable the generation of debug symbols and **gdb** helper scripts. The config options are ''CONFIG_DEBUG_INFO_DWARF5'' and ''CONFIG_GDB_SCRIPTS'', found under ''Kernel hacking -> Compile-time checks and compiler options''. Also, make sure ''CONFIG_DEBUG_INFO_REDUCED'' is disabled so that the compiler will generate debug info for //structures//.
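
If you'd rather not hunt for these through **menuconfig**, the ''scripts/config'' helper that ships with the kernel tree can toggle them directly. This is a minimal sketch; it assumes a ''.config'' already exists (e.g., from ''make defconfig'') and that the option names haven't changed in your kernel version:

<code bash>
# flip the debug-related options in the existing .config
linux$ ./scripts/config --disable DEBUG_INFO_NONE \
                        --enable  DEBUG_INFO_DWARF5 \
                        --enable  GDB_SCRIPTS \
                        --disable DEBUG_INFO_REDUCED

# let Kconfig resolve the dependencies of the newly set options
# (keep the same ARCH you use for the rest of the build)
linux$ make ARCH=arm64 olddefconfig
</code>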
  
<code bash>
# (optional) install modules in the bootstrapped rootfs directory
linux$ INSTALL_MOD_PATH=../rootfs make modules_install

# notice how the modules are installed under rootfs/lib/modules/ and a subdirectory
# that's specific to this exact kernel version; you can check this version as shown below
# NOTE: the "-rc<n>" after the "<major>.<minor>.<patch>" stands for "release candidate"
linux$ make kernelversion
    6.11.0-rc2
</code>
  
Unfortunately, the language server does not have any foolproof method of inferring the structure of the project. For example, it has no idea whether you are using the //arm64// or the //x86// definition of a function. Luckily for us, the Linux build system contains a script that's able to generate a file named ''compile_commands.json''. This file contains all compiler invocations used during the build, from which include paths, compiled sources, macro definitions, etc. can be deduced. The language server will automatically load it and generate indices for the compiled source code.
  
<code bash>
linux$ ./scripts/clang-tools/gen_compile_commands.py
</code>

Note that this script only works for the Linux kernel. If you want to generate ''compile_commands.json'' for other projects, consider using [[https://github.com/rizsotto/Bear|Bear]]. This is a tool that intercepts the ''exec()'' family of library calls via ''LD_PRELOAD'' in order to extract the command arguments.
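
For reference, a typical Bear invocation looks like the sketch below; this assumes the Bear 3.x syntax, where the build command comes after a double dash (older 2.x releases are invoked as ''bear make'' instead):

<code bash>
# run the project's usual build command under Bear;
# compile_commands.json is written to the current directory
$ bear -- make
</code>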

Also, for such a large codebase, expect the language server to take up //a lot// of your CPU the first time you open your editor, while it builds its index (usually stored in a ''.cache/'' directory).
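
If you want to see where that CPU time went, you can check the size of the on-disk index once it settles; the path below assumes **clangd**, other language servers keep their caches elsewhere:

<code bash>
# clangd keeps its per-project index under .cache/clangd/ in the project root
linux$ du -sh .cache/clangd/
</code>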
</note>
  
=== Booting the VM ===

In order to boot the VM, we're going to use the following script:

<code bash run.sh>
#!/bin/bash

# get workspace directory path from environment
# or use the current working directory as a default value
WS_DIR=${WS_DIR:-$(pwd)}

# boot up the VM
qemu-system-aarch64                     \
    -machine virt                       \
    -cpu cortex-a53                     \
    -smp 1                              \
    -m 512M                             \
    -nographic                          \
    -kernel linux/arch/arm64/boot/Image \
    -virtfs local,path=${WS_DIR}/rootfs,mount_tag=rootfs,security_model=passthrough,id=rootfs,multidevs=remap \
    -append "root=rootfs rootfstype=9p rootflags=trans=virtio,version=9p2000.L rw console=ttyAMA0"
</code>

This may look overwhelming, but let's take it one step at a time and analyze each flag passed to **qemu-system-aarch64**:
  * ''machine'': Specifies what //platform// we want it to emulate. For example, it can even emulate a Raspberry Pi 4B. Here, ''virt'' means that we're transparent to the guest kernel about the fact that it's running in a virtual environment. For a full list of supported platforms, pass this flag the ''help'' argument.
  * ''cpu'': The CPU model that we want it to emulate. Note that it can't emulate the //exact// implementation of a vendor, such as the i.MX8M Quad on our boards, only the high-level specification that ARM provides, namely the Cortex-A53.
  * ''smp'': Number of symmetric multi-processors (basically CPUs) that we want to allocate to the VM. Keep this under the value of ''$(nproc)''.
  * ''m'': The amount of RAM to allocate to the VM.
  * ''nographic'': If this option is left out, **qemu-system** will spawn a GUI window for the VM's terminal. As Linux users, we resent this and instead ask the emulator to redirect the VM's I/O to/from our current terminal. If you need to forcibly shut down **qemu-system** while in this mode, use the ''<Ctrl + A> + X'' key combination.
  * ''kernel'': Pretty self-explanatory. It's the kernel image that will be loaded.
  * ''virtfs'': This option specifies a virtual storage device based on the [[https://www.kernel.org/doc/html/latest/filesystems/9p.html|9p]] remote filesystem protocol. Using this instead of a disk image (like the one you had to create with **partx** in the second session) can be chalked up to personal preference. With 9p you can manipulate the guest filesystem from the host while the VM is running. If we used a disk image, we'd have to double mount it to obtain the same functionality and this would lead to corruptions with //100% certainty// due to the lack of synchronization mechanisms at the VFS driver level (between host and guest kernels). For more information regarding its key-value list of arguments, check out the [[https://wiki.qemu.org/Documentation/9psetup|documentation]].
  * ''append'': The command line arguments for the kernel. Check [[https://www.kernel.org/doc/html/v4.14/admin-guide/kernel-parameters.html|this]] out for more kernel command line arguments.
    * ''root=rootfs'': The rootfs device; its value must match that of the ''mount_tag'' attribute from the previous flag.
    * ''rootfstype=9p'': Informs the kernel what the backing storage device for the rootfs will be.
    * ''rootflags=trans=virtio,version=9p2000.L'': 9p-specific configuration; it's copy-pasted from the QEMU docs :p
    * ''rw'': Specifies that the rootfs must be mounted with both read and write permissions. Our home directory is on that partition so yeah, it'd better be writeable.
    * ''console=ttyAMA0'': The standard output for the kernel's terminal (that's also being used in userspace; this is a "server" flavor of a Debian distribution). The **ttyAMA0** is specific to a serial device that's instantiated by the emulator based on the ''machine'' flag's value.
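
To actually start the VM, run the script from the workspace directory (the one containing ''linux/'' and ''rootfs/''). Running it with **sudo** is not optional here; as explained in the troubleshooting note below, QEMU must be able to serve root-owned files such as ''/etc/shadow'' over 9p:

<code bash>
# make the script executable and launch the VM
$ chmod +x run.sh
$ sudo ./run.sh

# to force-quit QEMU from the -nographic console, press <Ctrl + A>, then X
</code>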
<note important>
**Troubleshooting**
----
If the terminal in the VM is acting weird (e.g., commands wrapping over onto the same line and overwriting the characters on display), set ''TERM=xterm'' inside the VM. This variable defines the type of terminal for which the shell prepares its output. You can read up more on this topic [[https://bash.cyberciti.biz/guide/$TERM_variable|here]] but **xterm** is a safe option for **bash**, **zsh** and **sh**.

Some terminal emulators such as [[https://github.com/kovidgoyal/kitty|kitty]] support custom ''TERM'' values such as **xterm-kitty** but most shells won't have any idea how to particularize the output for it. Stick to **xterm** :p
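
Concretely, this is just an environment variable that you export from the shell running inside the VM:

<code bash>
# run this inside the guest if the console output looks mangled
$ export TERM=xterm
</code>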
----
If the VM doesn't accept your root login password even after specifically setting it from **chroot** via **passwd**, make sure you're running the script with **sudo**. When debootstrapping the rootfs, the file owner was set to //root//, meaning that your unprivileged user won't have read access to ''/etc/shadow'' that has ''rw%%----%%'' permissions. As a result, ''systemd-login'' won't be able to validate your password against its stored cryptographic hash. Though I agree, it could be a bit more verbose about that fact :/
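
A quick way to confirm this from the host is to inspect the file's ownership and permissions directly in the bootstrapped directory:

<code bash>
# owned by root and not readable by other users, hence the need for sudo
$ ls -l rootfs/etc/shadow
</code>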
</note>