athreads is a small preemptive scheduling library for 8-bit AVR microcontrollers, developed and demonstrated on the ATmega2560.
The project implements a lightweight threading runtime in C and AVR assembly. It supports timer-driven preemption, thread context switching, per-thread time quanta, sleeping threads, and basic CPU usage accounting.
The original idea was to better understand how operating systems schedule tasks, but in a very constrained embedded environment where there is no operating system underneath. Instead of only simulating scheduling on a PC, this project runs directly on a microcontroller and exposes its behavior through hardware and profiling tools.
The project is useful because it shows how a small preemptive runtime can be built from low-level primitives: interrupts, stacks, registers, timers, and context switching. It can be used as an educational tool for operating systems, embedded systems, real-time programming, and low-level performance profiling.
The project is centered around the athreads scheduler library. Around it, I built optional modules for visualization and profiling.
The scheduler runs on the ATmega2560 and uses a hardware timer interrupt to periodically preempt the currently running thread. Each thread has its own saved stack pointer and execution context. When a context switch happens, the current CPU state is saved and another thread is restored.
The demo application adds:
+-------------------------------------------------------------+
|                         PC / Laptop                         |
|                                                             |
|  +-------------------------------+                          |
|  |      Python CPU Profiler      |                          |
|  | - reads USART packets         |                          |
|  | - plots per-thread CPU usage  |                          |
|  +---------------^---------------+                          |
|                  | USB Serial                               |
+------------------|------------------------------------------+
                   |
+------------------v------------------------------------------+
|                      Arduino Mega 2560                      |
|                                                             |
|  +-------------------------------------------------------+  |
|  |                  athreads Scheduler                   |  |
|  |                                                       |  |
|  |  - thread table                                       |  |
|  |  - per-thread stack/context                           |  |
|  |  - round-robin scheduling                             |  |
|  |  - per-thread quantum                                 |  |
|  |  - sleeping thread wakeup                             |  |
|  |  - CPU tick accounting                                |  |
|  +-----------^--------------------^----------------------+  |
|              |                    |                         |
|      Timer1 interrupt    Timer2 uptime tick                 |
|         preemption      sleep/encoder timing                |
|                                                             |
|  +------------------+         +------------------+          |
|  |  OLED UI Thread  |         |  Encoder Thread  |          |
|  |   process menu   |         |  input handling  |          |
|  +--------^---------+         +--------^---------+          |
|           |                            |                    |
|       SPI OLED                  Rotary Encoder              |
|                                                             |
|  +-------------------------------------------------------+  |
|  |                    Worker Threads                     |  |
|  | synthetic workloads used for profiling/demo purposes |  |
|  +-------------------------------------------------------+  |
+-------------------------------------------------------------+
The hardware used for the demo consists of:
The OLED is configured in 4-wire SPI mode.
| OLED Pin | Arduino Mega 2560 Pin |
|---|---|
| VCC | 3.3V |
| GND | GND |
| NC | Not connected |
| DIN | D51 / MOSI |
| CLK | D52 / SCK |
| CS | D53 |
| DC | D49 |
| RES | D48 |
| BS0 | GND |
| BS1 | GND |
The rotary encoder is connected as follows:
| Encoder Pin | Arduino Mega 2560 Pin |
|---|---|
| + | 5V |
| GND | GND |
| CLK | D43 |
| DT | D45 |
| SW | D47 |
The OLED and rotary encoder are not required by the scheduler itself. They are used to demonstrate that the scheduler can run multiple independent threads and expose their state interactively.
The OLED displays the current thread list and time quantum values. The encoder allows the user to navigate between threads and modify a selected thread's quantum while the system is running.
The project is developed using:
include/
  athread/      Public scheduler API and generated AVR structure offsets
  platform/     USART, uptime, and debug headers
  profiling/    CPU statistics, tracing, and worker demo headers
  ui/           OLED and encoder headers
src/
  athread/      Scheduler implementation and AVR context switch assembly
  platform/     USART and millisecond uptime support
  profiling/    CPU sampling, trace hooks, and demo workloads
  ui/           SPI OLED driver and rotary encoder input
  main.c        Demo firmware entry point
tools/
  cpu_task_manager.py      Live CPU profiling viewer
  gen_offsets.py           PlatformIO pre-build offset generator
  gen_athread_offsets.c    Offset generator source
The scheduler keeps a table of thread descriptors. Each descriptor contains information such as:
Each thread has its own stack. When a thread is created, the scheduler prepares its initial context so that it can later be started by the context switcher.
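To make the idea of a prepared initial context more concrete, here is a minimal host-side sketch of how such a frame could be laid out. It is not the project's actual layout: the register order, the helper names (`push8`, `build_initial_frame`), and the use of array indices in place of real SRAM addresses are all assumptions made for illustration. It relies only on general AVR facts: the stack grows downward, the ATmega2560 uses a 3-byte return address, avr-gcc expects r1 to be zero, and the first pointer argument travels in r25:r24.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame size: 3-byte return address + r0..r31 + SREG. */
#define FRAME_BYTES (3 + 32 + 1)

/* Emulates an AVR push: store at the current slot, then move SP down. */
static void push8(uint8_t *stack, size_t *sp, uint8_t value)
{
    stack[*sp] = value;
    (*sp)--;
}

/* Builds a fake "already interrupted" context at the top of `stack`.
 * Returns the stack index that would be saved in the thread descriptor.
 * `entry_word_addr` stands in for the address the first return jumps to. */
static size_t build_initial_frame(uint8_t *stack, size_t size,
                                  uint32_t entry_word_addr, uint16_t arg)
{
    assert(stack != NULL && size > FRAME_BYTES);

    size_t sp = size - 1;  /* empty stack: SP points at the top byte */

    /* Return address, low byte first so it ends up at the highest address. */
    push8(stack, &sp, (uint8_t)(entry_word_addr));
    push8(stack, &sp, (uint8_t)(entry_word_addr >> 8));
    push8(stack, &sp, (uint8_t)(entry_word_addr >> 16));

    /* r0..r31: all zero except the argument registers r24 (low) and r25 (high). */
    for (uint8_t reg = 0; reg < 32; reg++) {
        uint8_t value = 0;
        if (reg == 24) value = (uint8_t)(arg);
        if (reg == 25) value = (uint8_t)(arg >> 8);
        push8(stack, &sp, value);
    }

    /* SREG with the I bit (bit 7) set, so the thread starts with interrupts on. */
    push8(stack, &sp, 0x80);

    return sp;
}
```

A context switcher that restores in the reverse order (SREG, then r31 down to r0, then a return) would start the thread at the stored address. In the real library that address may well be the bootstrap wrapper rather than the thread entry function itself.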
Timer1 is used for preemption. When the timer interrupt fires, the running thread's quantum is updated. If the quantum expires, the scheduler selects another ready thread and switches context.
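The tick handling described above can be sketched as a small host-side simulation. This is only an illustration of round-robin selection with per-thread quanta; the type and function names (`sim_thread_t`, `sim_sched_t`, `sim_on_tick`) are made up for this example and the real scheduler, including its treatment of the idle thread, may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIM_MAX_THREADS 8  /* hypothetical table size */

typedef struct {
    bool    ready;      /* eligible to run */
    uint8_t quantum;    /* configured quantum, in ticks */
    uint8_t remaining;  /* ticks left in the current quantum */
} sim_thread_t;

typedef struct {
    sim_thread_t threads[SIM_MAX_THREADS];
    uint8_t count;      /* number of used table entries */
    uint8_t current;    /* index of the running thread */
} sim_sched_t;

/* Round-robin: first ready thread after `current`, wrapping around.
 * Falls back to `current` if nothing else is ready. */
static uint8_t sim_pick_next(const sim_sched_t *s)
{
    assert(s->count > 0 && s->count <= SIM_MAX_THREADS);
    for (uint8_t step = 1; step <= s->count; step++) {
        uint8_t tid = (uint8_t)((s->current + step) % s->count);
        if (s->threads[tid].ready) {
            return tid;
        }
    }
    return s->current;
}

/* Called once per preemption tick; returns the thread that runs next. */
static uint8_t sim_on_tick(sim_sched_t *s)
{
    sim_thread_t *t = &s->threads[s->current];

    if (t->remaining > 0) {
        t->remaining--;
    }
    if (t->remaining == 0) {
        /* Quantum expired: switch and give the new thread a full quantum. */
        s->current = sim_pick_next(s);
        s->threads[s->current].remaining = s->threads[s->current].quantum;
    }
    return s->current;
}
```

With two ready threads whose quanta are 2 and 1, successive ticks keep the first thread for two ticks and the second for one, which is the uneven split the per-thread quantum is meant to produce.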
Timer2 is used for millisecond uptime and periodic support tasks, such as waking sleeping threads and debouncing the rotary encoder in the demo application.
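As a rough illustration of the wakeup step, the sketch below counts down a per-thread sleep timer on every tick and marks a thread runnable again when its timer reaches zero. The structure and the name `sleepers_tick` are assumptions for this example, not the library's internals.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    bool     sleeping;    /* true while the thread is blocked in a sleep */
    uint16_t ticks_left;  /* remaining sleep time, in ticks */
} sleeper_t;

/* Advances every sleeping entry by one tick.
 * Returns how many entries woke up on this tick. */
static uint8_t sleepers_tick(sleeper_t *table, uint8_t count)
{
    assert(table != NULL);

    uint8_t woken = 0;
    for (uint8_t i = 0; i < count; i++) {
        if (!table[i].sleeping) {
            continue;
        }
        if (table[i].ticks_left > 0) {
            table[i].ticks_left--;
        }
        if (table[i].ticks_left == 0) {
            table[i].sleeping = false;  /* thread becomes ready again */
            woken++;
        }
    }
    return woken;
}
```

Because the loop touches every table entry, its cost grows with the number of threads. For a table as small as the one in the demo this is a reasonable trade for simplicity.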
| Function | Description |
|---|---|
| athread_init() | Initializes scheduler state, creates the idle thread, and prepares the scheduler before application threads are created. |
| athread_start() | Starts the scheduler. After this call, execution is controlled by the scheduler and the function does not normally return. |
| athread_create(entry, info) | Creates a new thread with the given entry function and argument pointer. Returns the thread ID or ATHREAD_INVALID_TID on failure. |
| athread_yield() | Voluntarily gives up the CPU and allows another ready thread to run. |
| athread_sleep_ticks(ticks) | Puts the current thread to sleep for a number of scheduler ticks. |
| athread_set_quantum(tid, quantum_ticks) | Changes the time quantum of a thread at runtime. |
| athread_get_quantum(tid) | Returns the configured time quantum of a thread. |
| athread_get_thread_count() | Returns the number of allocated thread IDs, including the idle thread. |
| athread_get_cpu_ticks(out_ticks, max_ticks) | Copies per-thread CPU tick counters for profiling and diagnostics. |
| athread_get_current_tid() | Returns the ID of the currently running thread. |
| athread_tick() | Advances scheduler timing state, including sleeping thread wakeups. Used by the platform timer code. |
| athread_bootstrap() | Internal startup wrapper used when a thread begins execution. |
A quantum is the amount of scheduler time a thread is allowed to run before it can be preempted.
In this project, the quantum is measured in scheduler timer ticks. For example, if the scheduler tick is 5 ms, a quantum of 4 allows a thread to run for approximately 20 ms before the scheduler may switch to another thread.
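The arithmetic behind that example can be written down as a few helper functions. This is a sketch under a simplifying assumption: every thread is always runnable and uses its whole quantum, so sleeping or yielding threads are ignored. The function names are invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Approximate run time of one quantum, in milliseconds. */
static uint32_t quantum_ms(uint8_t quantum_ticks, uint32_t tick_ms)
{
    return (uint32_t)quantum_ticks * tick_ms;
}

/* CPU share of thread `tid` in percent (rounded down) under round-robin,
 * assuming all threads are always runnable and use their full quantum. */
static uint32_t rr_share_percent(const uint8_t *quanta, uint8_t count, uint8_t tid)
{
    assert(quanta != NULL && tid < count);

    uint32_t total = 0;
    for (uint8_t i = 0; i < count; i++) {
        total += quanta[i];
    }
    return total ? (100u * quanta[tid]) / total : 0;
}

/* Longest time thread `tid` waits between two of its own quanta:
 * the sum of everyone else's quanta, converted to milliseconds. */
static uint32_t rr_worst_wait_ms(const uint8_t *quanta, uint8_t count,
                                 uint8_t tid, uint32_t tick_ms)
{
    assert(quanta != NULL && tid < count);

    uint32_t others = 0;
    for (uint8_t i = 0; i < count; i++) {
        if (i != tid) {
            others += quanta[i];
        }
    }
    return others * tick_ms;
}
```

For three busy threads with quanta 4, 2, and 2 on a 5 ms tick, the first thread gets about half of the CPU and waits at most 20 ms between runs, while each of the other two gets about a quarter and may wait up to 30 ms.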
Changing a thread's quantum affects responsiveness and CPU distribution:
The profiling system is optional and is built on top of the scheduler's CPU counters.
The firmware periodically reads per-thread CPU tick counters and sends compact packets over USART. On the PC, a Python program reads the serial stream and displays a live graph of CPU usage per thread.
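One way to turn raw tick counters into the percentages shown in such a graph is to take the difference between two consecutive snapshots. The sketch below assumes 32-bit counters and an invented function name (`cpu_usage_percent`); the project's actual counter width and packet format are not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Converts two snapshots of per-thread tick counters into usage percentages
 * (rounded down) for the interval between them. Unsigned subtraction keeps
 * the deltas correct even if a counter wrapped around once. */
static void cpu_usage_percent(const uint32_t *prev, const uint32_t *now,
                              uint8_t count, uint8_t *out_percent)
{
    assert(prev != NULL && now != NULL && out_percent != NULL);

    uint64_t total = 0;
    for (uint8_t i = 0; i < count; i++) {
        total += (uint32_t)(now[i] - prev[i]);
    }

    for (uint8_t i = 0; i < count; i++) {
        uint64_t delta = (uint32_t)(now[i] - prev[i]);
        out_percent[i] = total ? (uint8_t)((100u * delta) / total) : 0;
    }
}
```

Working with deltas rather than absolute counters means the viewer shows recent behavior, so a change to a thread's quantum becomes visible within a sample or two.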
The viewer can be started with:
python ./tools/cpu_task_manager.py --port COM3
Replace COM3 with the serial port on which the Arduino Mega appears on your machine, for example /dev/ttyACM0 on Linux.
The demo firmware creates several threads to show scheduling behavior:
| Thread | Name | Purpose |
|---|---|---|
| T0 | IDLE | Idle thread created by the scheduler |
| T1 | MAIN | Main application coordinator |
| T2 | ENC | Rotary encoder input handling |
| T3 | OLED | OLED process menu |
| T4 | WRK1 | Synthetic worker workload |
| T5 | WRK2 | Synthetic worker workload |
| T6 | WRK3 | Synthetic worker workload |
| T7 | WRK4 | More dynamic worker workload |
The project successfully implements a working preemptive scheduler on the ATmega2560.
The main achieved results are:
The profiling viewer makes it possible to observe how scheduling decisions affect CPU distribution between threads.
This project helped me understand preemptive scheduling from the hardware level upward. Implementing context switching on a microcontroller made the relationship between stacks, registers, interrupts, and scheduling much clearer than a high-level simulation would have.
The most challenging parts were the AVR assembly context switch, stack initialization for newly created threads, and making profiling work without disturbing the system too much. Managing every thread's stack by hand also made me realize how much I appreciate virtual addressing for its memory efficiency.
The final result is a small but functional preemptive scheduling library, with optional profiling tools that make the runtime behavior visible and easier to reason about.
Possible future improvements include:
The project source code, firmware, tools, and documentation are available in the repository.
Recommended archive contents:
Build command:
pio run
Upload command:
pio run -t upload