This shows you the differences between two versions of the page.
isc:labs:kernel:tasks:01 [2021/12/02 11:15] radu.mantu |
— (current) | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ==== 01. [??p] Prerequisites ==== | ||
- | === [??p] Task A - Dependencies installation === | ||
- | |||
- | {{:isc:labs:kernel:tasks:skeleton.zip|}} | ||
- | |||
- | <code bash> | ||
- | [student@host]$ sudo apt update | ||
- | [student@host]$ sudo apt install wget make gcc git iptables dnsutils debootstrap qemu-system qemu-utils | ||
- | </code> | ||
- | |||
- | <note important> | ||
- | TODO: Ubuntu | ||
- | </note> | ||
- | |||
- | === [??p] Task B - Development environment === | ||
- | |||
- | When developing new features for the kernel, chances are that you will screw up. Often. Depending on the severity, the kernel may or may not recover. So to avoid restarting your PC over and over, it's better to work in a minimal virtualized environment. As such, we will first bootstrap a loopback disk image with a basic Ubuntu system, but without the kernel. Eventually, we will boot a virtual machine with **qemu-system-x86_64** from this disk image, with a custom kernel that we will build ourselves. | ||
- | |||
- | <note> | ||
- | The bootstrapping and kernel building process may take a few (~15) minutes. Feel free to jump to Exercise 2 and come back once in a while to see if any progress was made. While Task A is doable on your live kernel, make sure to stop there. Task B is meant to generate errors and should be solved in the VM. For your sake :p | ||
- | </note> | ||
- | |||
- | === Bootstrapping === | ||
- | |||
- | For the bootstrapping process, we will use **debootstrap**. This tool will download a Debian-based ecosystem and install it in whatever directory we tell it to. Incidentally, that directory will be the mount point of the disk image that we are going to create. | ||
- | |||
- | <code bash> | ||
- | # create a 5GB empty file -- this will be our disk image | ||
- | [student@host]$ qemu-img create images/ubuntu.raw 5G | ||
- | |||
- | # build an ext4 filesystem onto the disk image -- now we can mount it | ||
- | [student@host]$ mkfs.ext4 images/ubuntu.raw | ||
- | |||
- | # mount the ext4 filesystem -- now we can copy files onto it | ||
- | [student@host]$ sudo mount images/ubuntu.raw /mnt | ||
- | |||
- | # bootstrap the Ubuntu system | ||
- | [student@host]$ sudo debootstrap --arch amd64 focal /mnt http://archive.ubuntu.com/ubuntu | ||
- | </code> | ||
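For a raw image, ''qemu-img create'' produces a sparse file: it reports its full size, but occupies almost no disk space until data is written to it. The sketch below reproduces that behaviour with plain coreutils (''truncate'', ''stat'', ''du''), so it can be tried without QEMU or root. The scratch path and the 64 MiB size are illustrative assumptions, not part of the lab.

```bash
#!/usr/bin/env bash
# Sketch: a sparse file has a large apparent size but uses few disk blocks.
# The scratch path and size below are illustrative, not the lab's image.
img="$(mktemp)"
truncate -s 64M "$img"

apparent_bytes="$(stat -c %s "$img")"   # size reported by ls -l
used_kib="$(du -k "$img" | cut -f1)"    # space actually allocated on disk

echo "apparent: ${apparent_bytes} bytes, on disk: ${used_kib} KiB"
rm -f "$img"
```

Running the same ''stat'' and ''du'' commands on //images/ubuntu.raw// before and after **debootstrap** should show the on-disk usage growing while the apparent size stays at 5GB.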
- | |||
- | Almost there... if you list the contents of //%%/mnt/%%//, you will see most of the usual entries from your //root// directory. At this point, we should be able to boot into this machine (if we had a kernel image), but we wouldn't be able to log in. The only thing that's left for us to do is set a password for the //root// user. For this, we need to trick the **passwd** tool into thinking that //%%/mnt/%%// is in fact the root of our filesystem. So we use **chroot**: | ||
- | |||
- | <code bash> | ||
- | # pretend that /mnt/ is our new / and start a bash instance inside | ||
- | [student@host]$ sudo chroot /mnt /bin/bash | ||
- | |||
- | # change password for current user (root) | ||
- | [ root@jail]$ passwd | ||
- | New password: root | ||
- | Retype new password: root | ||
- | passwd: password updated successfully | ||
- | |||
- | # while we're here, set Google DNS as primary DNS | ||
- | # qemu has a bug where it refuses to fall back to other resolvers | ||
- | # so it will be hard stuck on 127.0.0.53 ==> can't resolve domain names | ||
- | [ root@jail]$ echo 'nameserver 8.8.8.8' > /etc/resolv.conf | ||
- | |||
- | # exit from this bash instance and escape from the chroot jail | ||
- | [ root@jail]$ exit | ||
- | |||
- | # finally, unmount our disk -- we're done with it for now | ||
- | [student@host]$ sudo umount /mnt | ||
- | </code> | ||
- | |||
- | === Kernel building === | ||
- | |||
- | The next step is to get the kernel source code and compile it. By separating the kernel from the disk image, we are able to check out other branches / commits and test different versions without installing them anywhere. Normally, you would have to select what options you want included in the compilation process (e.g.: memory allocators, cryptographic systems, etc.) by running ''make menuconfig''. After finishing your selection and saving the configuration, a //.config// file would be created. Because we haven't the faintest idea what most of the things enumerated in that menu even are, we will rely on default configurations. One thing that is not part of the default configuration and will be useful later is debug symbols. Go through the following commands and refer to the GIF below when you are required to navigate the config menu. | ||
- | |||
- | <code bash> | ||
- | # clone the linux kernel locally | ||
- | [student@host]$ git clone --depth=1 https://github.com/torvalds/linux.git | ||
- | |||
- | # go into the repo directory | ||
- | [student@host]$ pushd linux | ||
- | |||
- | # create a default configuration file (.config) | ||
- | [student@host]$ make x86_64_defconfig kvm_guest.config | ||
- | |||
- | # manually enable debug symbols on top of current .config | ||
- | # NOTE: refer to the GIF below | ||
- | [student@host]$ make menuconfig | ||
- | |||
- | # optional: check out the generated .config file | ||
- | [student@host]$ less .config | ||
- | |||
- | # compile the kernel using all cores | ||
- | [student@host]$ make -j $(nproc) | ||
- | |||
- | # return to the previous directory | ||
- | [student@host]$ popd | ||
- | </code> | ||
- | |||
- | [[https://ocw.cs.pub.ro/courses/_media/isc/labs/kernel/tasks/menuconfig-demo.gif|{{ :isc:labs:kernel:tasks:menuconfig-demo.gif?700 |}}]] | ||
- | <html><center><i> Click GIF to maximize. </i></center></html> | ||
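After saving the menu, one way to confirm that the selection actually landed in //.config// is to search the file for the option. The sketch below assumes that debug symbols show up as ''CONFIG_DEBUG_INFO=y'' (the exact symbol may differ between kernel versions); ''config_enabled'' is a hypothetical helper written for this example, not something shipped with the kernel tree.

```bash
#!/usr/bin/env bash
# Sketch: report whether a kernel option is set to "y" in a config file.
# config_enabled is a hypothetical helper, not part of the kernel tree.
config_enabled() {
    local file="$1" option="$2"
    # -x matches the whole line, so "# CONFIG_FOO is not set" never matches
    grep -qx "${option}=y" "$file"
}

# Usage from inside the linux/ directory, after saving the menu.
# ASSUMPTION: debug symbols appear as CONFIG_DEBUG_INFO=y.
if [ -f .config ]; then
    if config_enabled .config CONFIG_DEBUG_INFO; then
        echo "debug symbols enabled"
    else
        echo "debug symbols NOT enabled"
    fi
fi
```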
- | |||
- | === Booting up the virtual machine === | ||
- | |||
- | We are finally here. Let's boot up the VM from our bootstrapped disk image, with our personally compiled Linux kernel. | ||
- | |||
- | <code bash> | ||
- | [student@host]$ sudo qemu-system-x86_64 \ | ||
- | -m 4G \ | ||
- | -smp 1 \ | ||
- | -enable-kvm \ | ||
- | -kernel linux/arch/x86/boot/bzImage \ | ||
- | -drive file=images/ubuntu.raw,format=raw,index=0 \ | ||
- | -append 'root=/dev/sda rw console=ttyS0 nokaslr' \ | ||
- | -nographic | ||
- | </code> | ||
- | |||
- | Let us have a look at this command, line by line: | ||
- | - ''-m 4G'': allocate 4GB of memory (change this as you wish) | ||
- | - ''-smp 1'': use only 1 vCPU; this is recommended for debugging purposes | ||
- | - ''-enable-kvm'': [[https://www.redhat.com/en/topics/virtualization/what-is-KVM|KVM]] is a Linux kernel module that transforms your operating system into a bare-metal hypervisor. This is what allows you to run __actual virtual machines__ on Linux. Without it, **qemu** would try to __emulate__ the system, resulting in worse performance. | ||
- | - ''-kernel .../bzImage'': this specifies the compiled & compressed kernel image to use when booting the virtual machine | ||
- | - ''-drive ...'' : specifies the disk image to load; note that ''index=0'' will make the VM consider this to be //%%/dev/sda%%//. Adding another drive with ''index=1'' will cause it to be regarded as //%%/dev/sdb%%//. | ||
- | - ''-append ...'': these are command line arguments for the kernel (yes, even it has those). ''root=/dev/sda rw'' marks //%%/dev/sda%%// (i.e.: our //ubuntu.raw// disk image) as the root device to be mounted onto the root directory (i.e.: //%%/%%//) in read-write mode. ''console=ttyS0'' exposes an [[https://en.wikipedia.org/wiki/Universal_asynchronous_receiver-transmitter|UART]] serial interface to the VM and tells Linux to use it for I/O. ''nokaslr'' tells the kernel to disable address space layout randomization. | ||
- | - ''-nographic'': tells **qemu** not to open a separate window for the GUI. Instead, it will take the virtual serial device (which the VM will recognize as //ttyS0//) and link it to the terminal. So whatever the VM sends via the serial to be printed will end up in your //stdout//. Whatever you type into //stdin// will be forwarded to the VM as input. | ||
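The options above can also be collected in a bash array, which makes it easier to toggle individual flags. The following is a minimal sketch under the assumption that KVM availability can be detected by the presence of the //%%/dev/kvm%%// character device; the array name ''qemu_args'' is made up for this example. It only prints the resulting command rather than booting anything.

```bash
#!/usr/bin/env bash
# Sketch: assemble the same QEMU invocation as a bash array,
# adding -enable-kvm only when the host exposes /dev/kvm.
qemu_args=(
    -m 4G
    -smp 1
    -kernel linux/arch/x86/boot/bzImage
    -drive file=images/ubuntu.raw,format=raw,index=0
    -append 'root=/dev/sda rw console=ttyS0 nokaslr'
    -nographic
)

if [ -c /dev/kvm ]; then
    qemu_args+=(-enable-kvm)
else
    echo "warning: /dev/kvm not found, QEMU will fall back to emulation" >&2
fi

# Print the shell-quoted command instead of running it.
printf '%q ' sudo qemu-system-x86_64 "${qemu_args[@]}"
echo
```

To actually boot the VM, replace the final ''printf'' with ''sudo qemu-system-x86_64 "${qemu_args[@]}"''.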
- | |||
- | <note tip> | ||
- | If you have problems with the VM booting and you can't //<Ctrl-C>// out of it, try //<Ctrl+A X>// to signal **qemu** that it is time to exit. Note that if you feel something odd happening with your terminal (e.g.: overlapping lines), you can run **reset**. | ||
- | |||
- | Under normal circumstances, exit the VM by running **poweroff**. | ||
- | </note> | ||
- | |||
- | After starting the VM and logging in as //root// (with the password that was set earlier), try finding out the kernel version in both the host and guest operating systems: | ||
- | |||
- | <code bash> | ||
- | # host has the latest Arch Linux kernel (you may have Ubuntu, etc.) | ||
- | [student@host]$ uname -r | ||
- | 5.15.2-arch1-1 | ||
- | |||
- | # guest has the newest Linux kernel release candidate | ||
- | [ root@guest]$ uname -r | ||
- | 5.16.0-rc2+ | ||
- | </code> | ||
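Besides the kernel version, the guest can also confirm that the ''-append'' arguments reached the kernel, since Linux exposes its command line in //%%/proc/cmdline%%//. The helper below is a hypothetical sketch (''has_cmdline_flag'' is not a standard tool); it takes the command line as an optional second argument so that it can be tried on any string.

```bash
#!/usr/bin/env bash
# Sketch: check whether a space-separated flag is present on the kernel
# command line. has_cmdline_flag is a hypothetical helper for illustration.
has_cmdline_flag() {
    local flag="$1"
    # Default to the running kernel's command line when none is given.
    local cmdline="${2:-$(cat /proc/cmdline)}"
    case " $cmdline " in
        *" $flag "*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Example with the arguments passed via -append in this lab:
lab_cmdline='root=/dev/sda rw console=ttyS0 nokaslr'
if has_cmdline_flag nokaslr "$lab_cmdline"; then
    echo "KASLR is disabled"
fi
```

Inside the guest, ''has_cmdline_flag nokaslr'' (without the second argument) would inspect the live command line instead.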
- | |||
- | <note important> | ||
- | Note how we did not specify a network device to **qemu**. By default, SLiRP is used to provide network connectivity. If you've never heard of SLiRP, don't feel bad. It's a program that emulates a Point-to-Point Protocol (PPP) connection over a shell account and became largely obsolete once dedicated dial-up PPP connections became widely available (I kid you not). While it does provide TCP and UDP connectivity, note that ICMP packets will be dropped and your VM will not be discoverable; not even from your host. | ||
- | |||
- | The correct way of doing things would be creating a bridge (i.e.: a software layer-2 switch) with **brctl**, adding a network device to your VM via the ''-netdev'' flag, and attaching it to the newly created bridge. This is a bit overkill for our purpose today. If you ever need to create such a setup, there are plenty of [[https://www.linux-kvm.org/page/Networking|resources]] available. | ||
- | |||
- | ---- | ||
- | |||
- | Although we said that you //should// have network access in your VM, there //may// be a chance that you don't have an IP address assigned. You may need to do this manually: | ||
- | |||
- | <code bash> | ||
- | # list available interfaces (in a colorful fashion) | ||
- | [root@guest]$ ip -c addr show | ||
- | 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 | ||
- | link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 | ||
- | inet 127.0.0.1/8 scope host lo | ||
- | valid_lft forever preferred_lft forever | ||
- | inet6 ::1/128 scope host | ||
- | valid_lft forever preferred_lft forever | ||
- | 2: enp0s3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000 | ||
- | link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff | ||
- | 3: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000 | ||
- | link/sit 0.0.0.0 brd 0.0.0.0 | ||
- | |||
- | # send a DHCP request on your Ethernet interface | ||
- | [root@guest]$ dhclient enp0s3 | ||
- | |||
- | # check if an IP address was allocated | ||
- | [root@guest]$ ip -c addr show enp0s3 | ||
- | 2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 | ||
- | link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff | ||
- | inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic enp0s3 | ||
- | valid_lft 86313sec preferred_lft 86313sec | ||
- | inet6 fec0::5054:ff:fe12:3456/64 scope site dynamic mngtmpaddr | ||
- | valid_lft 86318sec preferred_lft 14318sec | ||
- | inet6 fe80::5054:ff:fe12:3456/64 scope link | ||
- | valid_lft forever preferred_lft forever | ||
- | </code> | ||
- | </note> |
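When scripting the check for an assigned address, the one-line output mode of **ip** (''-o'') is easier to parse than the multi-line listing shown above. This is a minimal sketch: ''ipv4_of'' is a hypothetical helper, and the sample line is modelled on the //enp0s3// output from the listing rather than captured from a real run.

```bash
#!/usr/bin/env bash
# Sketch: print the IPv4 address(es) found in `ip -o -4 addr show` output.
# ipv4_of is a hypothetical helper; it reads the command output from stdin.
ipv4_of() {
    awk '$3 == "inet" { print $4 }'
}

# Usage in the guest:
#   ip -o -4 addr show dev enp0s3 | ipv4_of

# Demonstration on a sample line in the one-line (-o) format:
sample='2: enp0s3    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic enp0s3\       valid_lft 86313sec preferred_lft 86313sec'
printf '%s\n' "$sample" | ipv4_of    # prints 10.0.2.15/24
```

An empty result means that no IPv4 address is assigned yet, in which case the DHCP request from the note above is the next step.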