04 - Kernel development

Objectives

Learn Linux kernel development basics;
Build your own Linux kernel module;
Understand userspace device types;
Use MMIO to communicate with the IMX UART peripheral!

Lecture

Before beginning the tasks, please check out the lecture slides & notes here.

Additional resources

This is a list of curated sources of information to help you study kernel development on your own:

EmbeTronicX tutorials: Usually it's pretty hard to find up-to-date tutorials on how to write drivers. While some core APIs have been the same for years, there's always one thing that changes every few releases and deprecates previously written materials. These 40+ bite-sized lessons are great for learning about the core systems you interact with while writing modules.
Intro to x86-64 kernel dev: A lab written by yours truly for an ASE Masters class. This walks you through setting up your testing environment (including bootstrapping the rootfs) using qemu-system. It contains some tips about kernel debugging with gdb and teaches how to write an iptables plugin. Note: try this out on an Ubuntu target environment (i.e.: what's in the lab). The latest releases on Arch are in the process of deprecating iptables plugins in favor of transitioning to nftables. Haven't had the chance to look further into this.
Linux Weekly News: A news website containing discussions about new kernel features. Chances are that at some point you'll find a Stack Overflow answer linking back to one of these articles. In case you're worried about the paywalled articles, know that each week you get a composite release of recent articles (past 7 days). While the current release is paywalled, the previous week's articles become free.
Phoronix: Yet another news website, but more focused on hardware.

Tasks

01. Preparation

The Linux kernel is composed of numerous modules. These can be in-tree (part of the kernel source structure) or out-of-tree (independent modules). While there are some limitations when writing out-of-tree modules, such as restricted access to certain functions, this generally doesn't affect your ability to write drivers.

Task A - The kernel

We are going to write a few out-of-tree kernel modules since this method is more portable. If your code is not architecture-dependent, then you can compile and test your module on your host machine just as easily as on the board. However, if you want to compile it to run on the board, you need a copy of the kernel's source tree that is checked out at the same version as that which is running on the target device. In the following code example we assume that the board is running Linux v6.4.

# check kernel version on the board
[root@board ~]$ uname -r
6.4
 
# see available release tags on your copy of the Linux repo
[student@host ~/linux]$ git tag
 
# check out to the appropriate release version
[student@host ~/linux]$ git checkout v6.4

Although we have the desired kernel version, remember that the FDT needs to be slightly modified. Apply the following patch:

kernel_fdt.patch

diff --git a/arch/arm64/boot/dts/freescale/imx8mq-pico-pi.dts b/arch/arm64/boot/dts/freescale/imx8mq-pico-pi.dts
index 89cbec5c41b2..3fe7f3713e4b 100644
--- a/arch/arm64/boot/dts/freescale/imx8mq-pico-pi.dts
+++ b/arch/arm64/boot/dts/freescale/imx8mq-pico-pi.dts
@@ -19,6 +19,25 @@ chosen {
 		stdout-path = &uart1;
 	};
 
+	firmware {
+		optee {
+			compatible = "linaro,optee-tz";
+			method = "smc";
+		};
+	};
+
+	leds {
+		compatible = "gpio-leds";
+		pinctrl-names = "default";
+
+		led {
+			label = "gpio-led";
+			pinctrl-0 = <&pinctrl_led>;
+			gpios = <&gpio5 5 0>;
+			linux,default-trigger = "heartbeat";
+		};
+	};
+
 	pmic_osc: clock-pmic {
 		compatible = "fixed-clock";
 		#clock-cells = <0>;
@@ -80,6 +99,7 @@ buck1: BUCK1 {
 				regulator-min-microvolt = <700000>;
 				regulator-max-microvolt = <1300000>;
 				regulator-boot-on;
+				regulator-always-on;
 				regulator-ramp-delay = <1250>;
 				rohm,dvs-run-voltage = <900000>;
 				rohm,dvs-idle-voltage = <850000>;
@@ -91,6 +111,7 @@ buck2: BUCK2 {
 				regulator-min-microvolt = <700000>;
 				regulator-max-microvolt = <1300000>;
 				regulator-boot-on;
+				regulator-always-on;
 				regulator-ramp-delay = <1250>;
 				rohm,dvs-run-voltage = <1000000>;
 				rohm,dvs-idle-voltage = <900000>;
@@ -101,6 +122,7 @@ buck3: BUCK3 {
 				regulator-min-microvolt = <700000>;
 				regulator-max-microvolt = <1300000>;
 				regulator-boot-on;
+				regulator-always-on;
 				rohm,dvs-run-voltage = <1000000>;
 			};
 
@@ -109,6 +131,7 @@ buck4: BUCK4 {
 				regulator-min-microvolt = <700000>;
 				regulator-max-microvolt = <1300000>;
 				regulator-boot-on;
+				regulator-always-on;
 				rohm,dvs-run-voltage = <1000000>;
 			};
 
@@ -117,6 +140,7 @@ buck5: BUCK5 {
 				regulator-min-microvolt = <700000>;
 				regulator-max-microvolt = <1350000>;
 				regulator-boot-on;
+				regulator-always-on;
 			};
 
 			buck6: BUCK6 {
@@ -124,6 +148,7 @@ buck6: BUCK6 {
 				regulator-min-microvolt = <3000000>;
 				regulator-max-microvolt = <3300000>;
 				regulator-boot-on;
+				regulator-always-on;
 			};
 
 			buck7: BUCK7 {
@@ -131,6 +156,7 @@ buck7: BUCK7 {
 				regulator-min-microvolt = <1605000>;
 				regulator-max-microvolt = <1995000>;
 				regulator-boot-on;
+				regulator-always-on;
 			};
 
 			buck8: BUCK8 {
@@ -138,6 +164,7 @@ buck8: BUCK8 {
 				regulator-min-microvolt = <800000>;
 				regulator-max-microvolt = <1400000>;
 				regulator-boot-on;
+				regulator-always-on;
 			};
 
 			ldo1: LDO1 {
@@ -161,6 +188,7 @@ ldo3: LDO3 {
 				regulator-min-microvolt = <1800000>;
 				regulator-max-microvolt = <3300000>;
 				regulator-boot-on;
+				regulator-always-on;
 			};
 
 			ldo4: LDO4 {
@@ -168,6 +196,7 @@ ldo4: LDO4 {
 				regulator-min-microvolt = <900000>;
 				regulator-max-microvolt = <1800000>;
 				regulator-boot-on;
+				regulator-always-on;
 			};
 
 			ldo5: LDO5 {
@@ -175,6 +204,7 @@ ldo5: LDO5 {
 				regulator-min-microvolt = <1800000>;
 				regulator-max-microvolt = <3300000>;
 				regulator-boot-on;
+				regulator-always-on;
 			};
 
 			ldo6: LDO6 {
@@ -182,6 +212,7 @@ ldo6: LDO6 {
 				regulator-min-microvolt = <900000>;
 				regulator-max-microvolt = <1800000>;
 				regulator-boot-on;
+				regulator-always-on;
 			};
 
 			ldo7: LDO7 {
@@ -189,6 +220,7 @@ ldo7: LDO7 {
 				regulator-min-microvolt = <1800000>;
 				regulator-max-microvolt = <3300000>;
 				regulator-boot-on;
+				regulator-always-on;
 			};
 		};
 	};
@@ -415,4 +447,10 @@ pinctrl_wdog: wdoggrp {
 			MX8MQ_IOMUXC_GPIO1_IO02_WDOG1_WDOG_B 0xc6
 		>;
 	};
+
+	pinctrl_led: ledggrp {
+		fsl,pins = <
+			MX8MQ_IOMUXC_SPDIF_EXT_CLK_GPIO5_IO5	0x19
+		>;
+	};
 };

[student@host ~/linux]$ git apply kernel_fdt.patch

Finally, compile the Linux kernel after creating the arm64 defconfig. Also, consider enabling the generation of debug info in Kernel hacking / Compile-time checks and compiler options. If you are compiling the kernel in a VM, make sure to allocate said VM as many CPUs and as much RAM as you can, otherwise it will take a while.

# assuming the cross compiler bin/ is in PATH
[student@host ~/linux]$ make CROSS_COMPILE=aarch64-none-linux-gnu- ARCH=arm64 defconfig
[student@host ~/linux]$ make CROSS_COMPILE=aarch64-none-linux-gnu- ARCH=arm64 Image -j $(nproc)
[student@host ~/linux]$ make CROSS_COMPILE=aarch64-none-linux-gnu- ARCH=arm64 dtbs

During a previous lab, we saw how important it is to have a language server integrated into your text editor. The language server lets you jump to function definitions or see all references of a variable, even outside the current source file. However, the language server needs some hints regarding what code was compiled; normally, it can't know that you've compiled the arm64 version of an architecture-dependent function and not the x86 one. All this information can be provided via a compile_commands.json file that contains the cmdline of all ${CROSS_COMPILE}gcc invocations. Tools like bear can generate it for you without much hassle, but the Linux build system (separate from Kbuild) has a handy script that assembles it for you after compilation:

[student@host ~/linux]$ ./scripts/clang-tools/gen_compile_commands.py

If you don't have a language server configured, you can use elixir as an online alternative.

Task B - The rootfs

For this lab we want to be able to easily transfer files to our boards. We are going to achieve this via SSH, so include BR2_PACKAGE_OPENSSH in your BuildRoot's .config.

Additionally, we want to place a configuration file for the SSH daemon (i.e. sshd) at /etc/ssh/sshd_config (this is required to allow root login with password). In order to achieve this, we are going to use a rootfs overlay. Essentially, we are going to specify the absolute path to a certain directory (we'll call it overlay/). After BuildRoot finishes creating the rootfs in its staging directory, it will take the contents of our overlay and copy it over, overwriting any pre-existing instance of a file.

# create the overlay directory + intermediate dirs between / and sshd_config
[student@host ~/buildroot]$ mkdir -p overlay/etc/ssh
 
# copy existing sshd_config (from openssh package)
[student@host ~/buildroot]$ cp /etc/ssh/sshd_config overlay/etc/ssh
 
# TODO: make sure the following settings are uncommented and have the right value
#   PermitRootLogin yes
#   PasswordAuthentication yes
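
The TODO above can also be scripted. What follows is only a minimal sketch, not part of the original lab: set_sshd_option is a hypothetical helper name, and it assumes the overlay copy lives at overlay/etc/ssh/sshd_config relative to the current directory. It rewrites an existing (possibly commented-out) setting in place, or appends it if the key is missing.

```shell
#!/bin/sh
# set_sshd_option FILE KEY VALUE  (hypothetical helper, not part of openssh)
# Replaces any existing, possibly commented-out, "KEY ..." line with
# "KEY VALUE"; appends the setting if the key does not appear at all.
set_sshd_option() {
    file=$1; key=$2; value=$3
    if grep -Eq "^[#[:space:]]*${key}[[:space:]]" "$file"; then
        sed -E "s|^[#[:space:]]*${key}[[:space:]].*|${key} ${value}|" "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    else
        printf '%s %s\n' "$key" "$value" >> "$file"
    fi
}

# Assumed path: the overlay copy created above. Nothing happens if it is absent.
cfg=overlay/etc/ssh/sshd_config
if [ -f "$cfg" ]; then
    set_sshd_option "$cfg" PermitRootLogin yes
    set_sshd_option "$cfg" PasswordAuthentication yes
fi
```

Editing the file by hand works just as well; the script is only convenient if you regenerate the overlay often.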

Next, set the BR2_ROOTFS_OVERLAY config variable to the absolute path of buildroot/overlay/. Then, recompile the CPIO archive.

Make sure you have the coreutils, openssh, iproute2 and vim packages installed!

Task C - Persistent storage configuration over UMS

For this task we are going to make our current configuration persistent. The first step is to boot (as we normally do) to bl33, and get the U-Boot shell. From there, we are going to expose the 16GB eMMC memory on the board as an external storage device to our host computer. This means that we can format it and copy files directly once mounted.

Step 1: eMMC partitioning

Some of you may already have the following disk setup. If that's the case, you can move on.

Our immediate goal is to create two partitions. One will hold a FIT image containing the kernel and FDT (but no ramdisk) from which we are going to boot Linux. The second will represent the root filesystem and will contain everything that BuildRoot generated. Between the partition table and the first partition we are going to leave ~10MB of unused space for later use.

# expose eMMC via UMS
u-boot=> ums mmc 0
 
# check on your host what the newly discovered device is called
# from this point on, assuming it's called /dev/sdb
[student@host ~]$ dmesg
[student@host ~]$ lsblk
 
# format the external eMMC storage device
[student@host ~]$ fdisk /dev/sdb
 
# create a fresh MBR partition table
(fdisk) o
 
# create a 100MB partition starting at 10MB offset
(fdisk) n
Partition type: p
Partition number: 1
First sector: 20480       # not the default value!
Last sector: +100M
 
# print the current partition table; check out end sector of partition 1!
(fdisk) p
Device     Boot Start    End Sectors  Size Id Type
/dev/sdb1       20480 225279  204800  100M 83 Linux
 
# create a second partition, to take up the rest of the space
(fdisk) n
Partition type: p
Partition number: 2
First sector: 225280      # end sector of partition 1, plus one
Last sector: <Enter>
 
# write changes to disk
(fdisk) w
 
# format partition 1 as FAT32 & partition 2 as ext4
[student@host ~]$ mkfs.fat -F 32 /dev/sdb1
[student@host ~]$ mkfs.ext4 /dev/sdb2
 
# copy FIT image (without ramdisk!) to FAT32 partition
[student@host ~/staging]$ mount /dev/sdb1 /mnt
[student@host ~/staging]$ cp linux.itb /mnt
[student@host ~/staging]$ umount /mnt
 
# extract rootfs CPIO contents onto ext4 partition
# NOTE: ext4 required in order to support symlinks
[student@host ~/buildroot]$ mount /dev/sdb2 /mnt
[student@host ~/buildroot]$ cpio -i -D /mnt -F output/images/rootfs.cpio
[student@host ~/buildroot]$ umount /mnt
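
The sector numbers in the fdisk session follow from simple arithmetic. The sketch below reproduces them, assuming 512-byte logical sectors (which is what the partition table printed above implies); mib_to_sectors is a hypothetical helper introduced only for illustration.

```shell
#!/bin/sh
# Sector arithmetic for the layout above, assuming 512-byte sectors.
SECTOR_SIZE=512

# mib_to_sectors MIB  (hypothetical helper): convert MiB to a sector count
mib_to_sectors() {
    echo $(( $1 * 1024 * 1024 / SECTOR_SIZE ))
}

p1_start=$(mib_to_sectors 10)           # 10 MiB gap after the partition table
p1_size=$(mib_to_sectors 100)           # "+100M" in fdisk
p1_end=$(( p1_start + p1_size - 1 ))    # last sector of partition 1
p2_start=$(( p1_end + 1 ))              # lowest sector partition 2 can start at

echo "partition 1: start=$p1_start end=$p1_end sectors=$p1_size"
echo "partition 2: start=$p2_start"
```

If lsblk or fdisk report a different sector size for your device, adjust SECTOR_SIZE and recompute before typing the values into fdisk.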

Step 2: Automatic boot to Linux

Next, we want to set up bl33 to boot automatically. For this, we need to configure a number of commands to run by default. Add the following commands (separated by ; instead of new line) to the CONFIG_BOOTCOMMAND variable in U-Boot's config.

fatload mmc 0:1 0x80000000 linux.itb
setenv bootargs console=ttymxc0,115200 root=/dev/mmcblk0p2 rw clk_ignore_unused
bootm 0x80000000

Recompile U-Boot and regenerate the Firmware Image Package (i.e. flash.bin). Next, we are going to copy the FIP onto the eMMC, in the empty space between the MBR and the first partition. When doing an eMMC boot, the bl1 bootrom will look for the FIP at a 33KB offset into the storage device. The same is true for an SD card boot.

# place the FIP onto the eMMC at 33KB offset from the start
[student@host ~/imx-mkimage/iMX8M]$ dd if=flash.bin of=/dev/sdb bs=1024 seek=33 conv=fsync oflag=direct status=progress
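
Before running dd, it may be worth checking that flash.bin actually fits in the gap left in front of the first partition. The sketch below does that arithmetic under the same assumptions as the partitioning step (512-byte sectors, partition 1 starting at sector 20480); fip_fits is a hypothetical helper, and the flash.bin check only runs if the file exists in the current directory.

```shell
#!/bin/sh
# Where the FIP lands on the eMMC and how much room it has.
SECTOR_SIZE=512
FIP_OFFSET=$(( 33 * 1024 ))        # dd: bs=1024 seek=33
P1_START_SECTOR=20480              # first sector of partition 1
gap_bytes=$(( P1_START_SECTOR * SECTOR_SIZE - FIP_OFFSET ))

# fip_fits SIZE_IN_BYTES  (hypothetical helper)
# Succeeds if an image of that size ends before partition 1 begins.
fip_fits() {
    [ "$1" -le "$gap_bytes" ]
}

echo "FIP offset: $FIP_OFFSET bytes (sector $(( FIP_OFFSET / SECTOR_SIZE )))"
echo "space before partition 1: $gap_bytes bytes"

if [ -f flash.bin ]; then
    size=$(( $(wc -c < flash.bin) ))
    if fip_fits "$size"; then
        echo "flash.bin ($size bytes) fits"
    else
        echo "flash.bin ($size bytes) would overwrite partition 1"
    fi
fi
```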

Now change the jumpers on the board to perform an eMMC boot.

Task D - Network configuration

About now you should have logged onto the board via the serial console. In this task we want to establish a network connection between your host and the board. For this to happen, we need to configure static IPs on the two network interfaces (since we don't have a DHCP server). Consider providing network connectivity for the board a bonus task ;)

# observe the ethernet interfaces
# NOTE: usually named ethX, enpXsY, enoX, endX
# NOTE: -c flag for color output from iproute2
# NOTE: "addr show" can be abbreviated to "a s"
[student@host ~]$ ip -c addr show
2: enp60s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 8c:47:be:24:bb:61 brd ff:ff:ff:ff:ff:ff
 
[root@board ~]$ ip -c a s
2: end0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 00:1f:7b:65:03:3c brd ff:ff:ff:ff:ff:ff
 
# configure static IPs on each interface
[student@host ~]$ ip addr add 192.168.101.1/24 dev enp60s0
[root@board ~]$   ip addr add 192.168.101.2/24 dev end0
 
# enable links
[student@host ~]$ ip link set dev enp60s0 up
[root@board ~]$   ip link set dev end0 up
 
# add routing information
[student@host ~]$ ip route add 192.168.101.0/24 dev enp60s0
[root@board ~]$   ip route add 192.168.101.0/24 dev end0
 
# ping the target
[student@host ~]$ ping 192.168.101.2
 
# connect to the target via SSH
[root@board ~]$ systemctl status sshd
[root@board ~]$ systemctl start sshd      # only if not already started
 
[student@host ~]$ ssh root@192.168.101.2

02. Your first kernel module

Way back when, kernels had to be built as a single static image, meaning that adding new functionality required recompiling and installing the kernel, followed by a reboot. Today, things are much easier. By using the kmod tools (man 8 kmod), users are allowed to load and unload modules (i.e.: kernel object files) on demand, without all the fuss. These modules are C programs that must implement initialization and removal functions that are called automatically. Usually, these functions register / unregister other functions contained in your object with core kernel systems.

We can use lsmod to get a list of all present modules, and modinfo to obtain detailed information about a specific module.

[student@host ~]$ lsmod
ecdh_generic           16384  1 bluetooth
 
[student@host ~]$ modinfo ecdh_generic | grep description
description:    ECDH generic algorithm
 
[student@host ~]$ modinfo bluetooth | grep description 
description:    Bluetooth Core ver 2.22

What we can understand from this is that the Elliptic Curve Diffie-Hellman module is 16384 bytes in size and is used by one other module, namely bluetooth, which relies on it as an ECDH helper. As you probably noticed, elixir.bootlin.com is a critical resource in navigating the kernel code.
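
To make the lsmod columns explicit, here is a small sketch that labels the fields of one output row. describe_lsmod_line is a hypothetical helper written for this page, not an existing tool; it relies only on lsmod's whitespace-separated layout (name, size in bytes, use count, comma-separated users).

```shell
#!/bin/sh
# describe_lsmod_line LINE  (hypothetical helper)
# Labels the columns of a single lsmod row.
describe_lsmod_line() {
    # Unquoted on purpose: let the shell split the row into fields.
    set -- $1
    echo "module=$1 size=$2 refcount=$3 used_by=${4:--}"
}

describe_lsmod_line "ecdh_generic           16384  1 bluetooth"
# prints: module=ecdh_generic size=16384 refcount=1 used_by=bluetooth
```

A module with a use count of 0 has an empty last column, which the helper prints as "-".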

If it's not a module that you're unsure about but a device, you can use udevadm to get more information about it. For example, if you have an NVMe drive (an SSD, let's say) and you want to figure out what drivers are involved in its operation, you can tell udevadm to scan sysfs bottom-up, starting with that device:

[student@host ~]$ udevadm info -a /dev/nvme0n1 | grep DRIVER
    DRIVERS=="nvme"
    DRIVERS=="pcieport"

From this, we glean that we need both the NVMe driver and the PCIe driver in order to operate our SSD.

Task A - Prepare your build system

Take a look at this piece of documentation before you get started. Then, create a new directory with the following structure:

.
├── Kbuild    --> defines the output module via obj-m
├── Makefile  --> defines the build targets, relying on the Linux headers
└── my_first_module.c

The makefile should look something like this:

KDIR ?= /lib/modules/`uname -r`/build
 
build:
	$(MAKE) -C $(KDIR) M=$(PWD) modules
 
clean:
	$(MAKE) -C $(KDIR) M=$(PWD) clean

Notice how KDIR is used to determine the precise kernel that we are compiling for. If invoked without overwriting KDIR, its default value ensures that the module is compiled for our current system (given that the kernel headers are installed). Note, however, that in order to compile the module for our board, it's not sufficient to point to the correct repo path. You still have to pass the CROSS_COMPILE and ARCH variables. Otherwise, the kernel's .config will be reset.

TLDR: modify KDIR + pass the appropriate ARCH and CROSS_COMPILE variables when cross-compiling the module against the board's kernel tree!
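
As a rough illustration of the above, the sketch below assembles the make invocation for the board. The defaults are assumptions taken from earlier in this lab (kernel tree in ~/linux, the aarch64-none-linux-gnu- toolchain in PATH), and module_make_cmd is a hypothetical helper that only prints the command so you can inspect it first.

```shell
#!/bin/sh
# Assumed defaults; override them from the environment if your setup differs.
KDIR="${KDIR:-$HOME/linux}"
ARCH="${ARCH:-arm64}"
CROSS_COMPILE="${CROSS_COMPILE:-aarch64-none-linux-gnu-}"

# module_make_cmd TARGET  (hypothetical helper)
# Prints the make command line for the module Makefile shown above.
module_make_cmd() {
    echo "make KDIR=$KDIR ARCH=$ARCH CROSS_COMPILE=$CROSS_COMPILE $1"
}

module_make_cmd build
module_make_cmd clean
```

Variables given on the make command line are passed on to the nested $(MAKE) -C $(KDIR) call, which is why ARCH and CROSS_COMPILE do not need to appear in the Makefile itself. Run the printed command from the module directory once it looks right.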

Task B - Write a minimal module

#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/module.h>
 
MODULE_DESCRIPTION("A test module.");
MODULE_AUTHOR("Student");
MODULE_LICENSE("GPL");
 
/* custom log message header; used by pr_* */
#ifdef pr_fmt
#undef pr_fmt
#endif
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
/* init - module initialization callback
 *  @return :  0 if everything went well ==> module is loaded
 *            <0 (negative errno) if an error occurred ==> module is not loaded
 */
static int init(void)
{
    pr_info("Hello world!\n");
 
    return 0;
}
 
/* fini - module removal callback
 */
static void fini(void)
{
    pr_info("Goodbye cruel, cruel world!\n");
}
 
/* register on_init and on_exit event handlers */
module_init(init);
module_exit(fini);

A few things to mention about this code:

The module_init and module_exit macros mark the module initialization and cleanup functions. In other modules you may find the __init attribute thrown around. That is essentially an alias for __attribute__((section(".init.text"))), placing the initialization function in a special section. All functions located in this section are automatically marked as safe to be deleted after first being executed, in order to reclaim some memory.
pr_fmt is yet another macro that is used by the kernel print function, printk. pr_fmt allows us to prepend unique module identifiers before each logged line, in order to keep track of which module generated what output.
Although we mentioned printk as the main print function, it's recommended to use the pr_${LOG_LEVEL} alternatives. In this case, pr_info is a rather mild message that might not even be considered important enough to print to your console. Instead, all debug messages no matter their importance can be viewed when running dmesg (or printing /proc/kmsg for raw output).
MODULE_LICENSE("GPL"): this line is pretty much required in order for your module to interact with the larger kernel. With this macro, you save the "GPL" string in a special section, indicating that you comply with the GNU General Public License that the kernel uses. Not doing so can lead to restricted API access or even the module not being accepted by the kernel.

Task C - Insert the module into the kernel

After you compile the module for your machine, insert it into the kernel, then remove it. You can do this via insmod and rmmod. Check the kernel debug log using dmesg.

Once you're convinced that the module works, clean the workspace then rebuild it for the board. Pass the kernel object (i.e.: *.ko file) via SSH, then repeat the process remotely.

If you want to investigate the kernel log of a previous session, you can use journalctl:

# show all log info from previous boot
[student@host ~]$ journalctl -b -1 -a

This can come in handy when you want to investigate crashes.

03. Making a simple character device

Check Google for some inspiration on Linux kernel sample character device modules.

Simply take the code, try to build it / fix it (if required, since newer kernel versions may break old APIs).

Also make sure to put some printk() calls for debugging (if they're not already in the sample code).

After a successful compilation, test it on your board using insmod! Check dmesg for the (hopefully) successful messages.

04. Writing to a serial device (UART)

Download & read the i.MX8M's Reference Manual, chapter 16.2. Ahem, not really all of it, just check the memory map / register definition for UTXD and UTS.

Next, we will enhance our character device to print through the IMX UART peripheral using a simple MMIO interface. Useful resources for mapping, reading & writing IO memory: https://www.kernel.org/doc/html/latest/driver-api/device-io.html (read the introduction + MMIO parts).

In principle, these are the steps you'll have to do:

create a function, let's say, myuart_put_char(unsigned char ch) – sends the specified character over the UART.
inside the module's init function, map the UART1's BASE memory using kernel's ioremap API (check either the Reference Manual or the device tree for the addresses!); you'll need to provide the base address + size (since the offset of the IO registers you'll access is pretty small, you can limit it to 0x1000);
implementation for our putchar:
- first, you need to wait for the UTS TX ready flag (bit) to be set; use a while() busy loop, but call cpu_relax() inside it to temporarily lower power consumption (optional, but good practice); use readl() to read from the IO registers;
- afterwards, simply copy the character to the UART TXD register (use writel() for this);
finally, call your putchar function for every character in your buffer inside the character device's write callback!
test it by writing something in your char device, e.g.: echo hello uart > /dev/<mychardevicename>.
