Note: this article was originally written in French, then translated with the help of LLMs. You can read the French version here, but I now mainly update this English version.
Introduction #
For the past few months, I’ve been interested in developing games for the Game Boy Advance.
There are many tools available to build GBA ROMs:
There are also many tutorials on creating GBA ROMs. TONC is the most well-known (and the one I personally followed). However, these tutorials generally assume the use of a library that handles the build and initialization of the ROM. The user doesn’t take control at the entry point but rather later, in a clean state that abstracts most of the initialization issues.
It seems to me that writing a ROM for the Game Boy Advance using only Zig and its build system is a good introduction to system and embedded programming. Without diving into the complexities of virtual memory, the kernel, and x86, this project allows us to understand these different topics:
- how a simple CPU works;
- how it executes code;
- how to write code for this type of machine;
- what new constraints this environment imposes;
- what ARM assembly looks like;
- how to build a bare-metal program.
A fairly comprehensive curriculum, with a rather fun and motivating subject: how to program the GBA games of our childhood.
This tutorial is aimed at readers who already know how to program but may have never done low-level programming. We will go through the concepts one by one to eventually reach a very simple while loop, allowing you to then start a tutorial like TONC with your own low-level library.
Why Zig? #
I don’t have much experience with Zig. After programming for a few weeks for the GBA in Rust with the agb crate, I wanted to try a simpler language to better understand the inner workings of the GBA and to avoid being distracted by a language that encourages experimentation and the pursuit of perfection.
Zig is a language that has intrigued me for some time. The promise of a system language for the 21st century, simple and transparent. A modern C with optionals, Rust-like error handling, explicit allocators, and above all, a coherent build system integrated into the language.
So I looked at which library I could use. I quickly spotted ZigGBA, a project that is already quite complete, accompanied by a blog post that I invite you to read.
It was while reading the source code of ZigGBA, more concise than that of agb, that curiosity struck me. I was reading the content of the build system, trying to understand the linker script, and then I realized that the build process of ZigGBA is much easier for me to understand than that of agb.
I want to clarify that I am neither an expert in Rust nor an expert in Zig. I am only speaking from my own knowledge, and I imagine that for a developer accustomed to Rust’s build system and concepts, all this seems simpler. But for me, it is really this simplicity that led me to explore the problem of writing GBA programs from scratch.
All this to say that this project could be done in C, Rust, or, I imagine, any other compiled language with more or less complications, but Zig seems perfect to me for discovering system and embedded programming without having to struggle with the dated quirks of C and with a simple and modern language.
Programming for the Game Boy Advance #
The main difference between programming on the GBA and the programming you may be used to (programming for Linux, the web, or smartphones) is that usually, you are not alone on the machine where your program runs, and you do not know the details of the machine on which it runs.
On your Linux machine, your program is not the only one running: you also have a browser, terminal emulators, maybe a music player. All these programs need to share your machine’s resources: some memory here, a few processor cycles there.
You may have noticed that as a developer, you didn’t have to worry about these considerations. From your point of view, memory is infinite and belongs to you, the processor is yours, and you will never be asked to think about the other programs running on your machine. This is because this role is assigned to your operating system.
It is the OS that must ensure process compartmentalization, i.e., guarantee that a process cannot access the execution environment or memory of another process.
It is also the OS that handles resource sharing: it distributes memory to processes that request it, orchestrates process execution, ensuring that each gets some execution time.
It is always the OS that allows you to write software without having to think about the hardware on which you will run it. No matter the brand of your screen, whether your mouse is ball-based or Bluetooth, whether your RAM is DDR3 or 4, or even if you are in a virtual machine, your program will be the same. This is thanks to the abstraction layer that your OS offers between you and the hardware. This allows you to write portable programs, to distribute your software to everyone using the same OS as you.
flowchart TD
    Browser["Browser"] <--> OS
    Terminal["Terminal"] <--> OS
    MusicPlayer["Music Player"] <--> OS
    YourProgram["Your Program"] <--> OS
    subgraph OS["Operating System"]
    end
    OS <--> Hardware
    subgraph Hardware
        Memory["Memory"]
        Display["Display"]
        Input["Input Devices"]
        Drive["Hard drives"]
    end
This is where the biggest difference with programming for the GBA lies: once the console’s BIOS has loaded your program, you are alone on board. No OS to allocate memory for you, no OS to abstract the console’s resources, no OS to launch multiple programs, no OS to catch crashes. You are launched without a safety net into the deep end! From afar, it may seem complicated, but in reality, the Game Boy Advance’s architecture is relatively simple. It is therefore the perfect gateway to low-level programming.
flowchart BT
    YourProgram["Your Program"]
    subgraph Hardware
        RAM["RAM"]
        VRAM["Screen"]
        Input["Buttons"]
        Sound["Sound channels"]
    end
    YourProgram --->|Memory Reads and Writes| Hardware
    Hardware -->|Interrupts| YourProgram
Before Starting #
What are the prerequisites? #
I want to clarify that this tutorial does not require you to own a Game Boy Advance. To be honest, I don’t know where I put mine, and I don’t have a flash cartridge that would allow me to run a custom program.
Personally, I use the mGBA emulator. There are others, but this one works well.
We will also use readelf, hexdump, and gdb as part of this tutorial. These are tools that will be useful for understanding the behavior of our compiler and our program. They are not mandatory to follow the process, but they are useful tools.
You will also need to install Zig on your machine.
This tutorial assumes:
- basic programming knowledge;
- familiarity with a compiled language like Rust, C, or C++;
- some proficiency with Linux and its console.
We can begin!
Where to start? #
My goal was to reach the first step of the TONC tutorial, which is to display three colored dots on the screen. It’s the equivalent of a “Hello, World!” but for a console that has its own screen.
It is certainly a result that seems basic, but the path to get there is interesting. Because we will have to understand how the GBA processor accesses the cartridge’s code and executes our program, how to build our cartridge so that it is recognized by the GBA, and how to display pixels on a screen.
But then, where to start? I suggest starting with a minimal program: an infinite loop.
Let’s start by understanding how the GBA works.
The Game Boy Advance #
The Game Boy Advance is a portable console sold by Nintendo starting in 2001. Designed as the successor to the Game Boy, the GBA makes a real technical leap by moving from an 8-bit Sharp SM83 processor to a 32-bit ARM7TDMI (even though the ARM7 was already at the end of its life). However, the GBA still contains an SM83 to ensure backward compatibility with the Game Boy.
ARM7TDMI has two modes: a 32-bit ARM instruction mode and a compact 16-bit THUMB instruction mode, which allows our program to take up less space.
The GBA also includes memory:
- 16 KB of BIOS
- 288 KB of RAM
- 96 KB of video memory
- 1 KB of memory for sprite management
- 1 KB of memory for the palette
Sound:
- 4 analog channels
- 2 digital channels
Buttons:
- the famous D-Pad, so 4 directional buttons
- 6 buttons A, B, Start, Select, L, and R
And especially a cartridge port:
- GBA cartridge, max. 32 MB ROM + max. 64 KB SRAM
- Game Boy cartridge, max 32 KB ROM + 8 KB SRAM (and more with banking)
There is also a serial port (for the famous link cable).
Memory map #
If you have done some programming, you probably have an idea of what computer memory is. It is a large table of bytes that you can read from and write to. Each byte has an address, whose size is fixed by the processor.
For ARM7TDMI, addresses are encoded on 32 bits. This means hypothetically, our program can access addresses 0x0000_0000 to 0xFFFF_FFFF.
But wait, 0xFFFF_FFFF is 4294967295! So we would have access to 4 GiB of RAM?
But the previous technical sheet only shows 288 KB, so what is going on?
The trick is that each address does not necessarily represent a physical memory cell. An address range can be used to access RAM, another to access the BIOS ROM, another still to access the video memory that allows pixels to be displayed on the screen. And finally, the vast majority of address ranges are… unused! Reading from or writing to them will not do anything logical or useful (and at worst will cause a pure and simple crash).
Here is the memory map of the Game Boy Advance (copied from GBATEK):
Start Addr | End Addr | Usage |
---|---|---|
0x0000_0000 | 0x0000_3FFF | BIOS - System ROM (16 KBytes) |
0x0000_4000 | 0x01FF_FFFF | Unused |
0x0200_0000 | 0x0203_FFFF | WRAM - On-board Work RAM (256 KBytes, 2 Wait) |
0x0204_0000 | 0x02FF_FFFF | Unused |
0x0300_0000 | 0x0300_7FFF | WRAM - On-chip Work RAM (32 KBytes) |
0x0300_8000 | 0x03FF_FFFF | Unused |
0x0400_0000 | 0x0400_03FE | I/O Registers |
0x0400_0400 | 0x04FF_FFFF | Unused |
0x0500_0000 | 0x0500_03FF | BG/OBJ Palette RAM (1 Kbyte) |
0x0500_0400 | 0x05FF_FFFF | Unused |
0x0600_0000 | 0x0601_7FFF | VRAM - Video RAM (96 KBytes) |
0x0601_8000 | 0x06FF_FFFF | Unused |
0x0700_0000 | 0x0700_03FF | OAM - OBJ Attributes (1 Kbyte) |
0x0700_0400 | 0x07FF_FFFF | Unused |
0x0800_0000 | 0x09FF_FFFF | Game Pak ROM/FlashROM (max 32MB) - Wait State 0 |
0x0A00_0000 | 0x0BFF_FFFF | Game Pak ROM/FlashROM (max 32MB) - Wait State 1 |
0x0C00_0000 | 0x0DFF_FFFF | Game Pak ROM/FlashROM (max 32MB) - Wait State 2 |
0x0E00_0000 | 0x0E00_FFFF | Game Pak SRAM (max 64 KBytes, 8-bit Bus width) |
0x0E01_0000 | 0x0FFF_FFFF | Unused |
0x1000_0000 | 0xFFFF_FFFF | Unused |
As you can see, the vast majority of addresses are not used at all.
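To make the table more concrete, here is a minimal sketch that looks up which region an address belongs to. The `Region` struct, the `memory_map` array, and the `regionOf` function are names I made up for illustration, and the three Game Pak ROM windows are merged into a single entry for brevity.

```zig
const std = @import("std");

/// One entry of the memory map: an inclusive address range and its usage.
const Region = struct {
    start: u32,
    end: u32,
    name: []const u8,
};

/// The used ranges of the table above; everything else is "Unused".
const memory_map = [_]Region{
    .{ .start = 0x0000_0000, .end = 0x0000_3FFF, .name = "BIOS" },
    .{ .start = 0x0200_0000, .end = 0x0203_FFFF, .name = "On-board WRAM" },
    .{ .start = 0x0300_0000, .end = 0x0300_7FFF, .name = "On-chip WRAM" },
    .{ .start = 0x0400_0000, .end = 0x0400_03FE, .name = "I/O registers" },
    .{ .start = 0x0500_0000, .end = 0x0500_03FF, .name = "Palette RAM" },
    .{ .start = 0x0600_0000, .end = 0x0601_7FFF, .name = "VRAM" },
    .{ .start = 0x0700_0000, .end = 0x0700_03FF, .name = "OAM" },
    .{ .start = 0x0800_0000, .end = 0x0DFF_FFFF, .name = "Game Pak ROM" },
    .{ .start = 0x0E00_0000, .end = 0x0E00_FFFF, .name = "Game Pak SRAM" },
};

/// Returns the name of the region containing `addr`, or "Unused".
fn regionOf(addr: u32) []const u8 {
    for (memory_map) |region| {
        if (addr >= region.start and addr <= region.end) return region.name;
    }
    return "Unused";
}

pub fn main() void {
    std.debug.print("0x06000000 -> {s}\n", .{regionOf(0x0600_0000)});
    std.debug.print("0x12345678 -> {s}\n", .{regionOf(0x1234_5678)});
}
```

This runs on your development machine, not on the GBA: it only models the table, it does not touch any hardware.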
But then where is our code? It is in the cartridge. A removable, external cartridge. The memory range allocated to external memory is 0x0800_0000-0x0E00_FFFF. But what does this mean? It means that when the processor tries to access address 0x0800_0000, it actually accesses not its internal memory but the cartridge’s ROM at address 0x0000_0000.
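As a rough illustration of this translation, the sketch below converts a CPU address into an offset inside the cartridge ROM. The `romOffset` function and the constants are hypothetical names of mine, and the code assumes the three Game Pak ROM windows of the table all expose the same ROM, which is how I read GBATEK.

```zig
const std = @import("std");

const rom_start: u32 = 0x0800_0000; // start of the first Game Pak ROM window
const rom_end: u32 = 0x0DFF_FFFF; // end of the third window
const window_size: u32 = 0x0200_0000; // 32 MiB per window

/// Translates a CPU address into an offset inside the cartridge ROM.
/// Returns null when the address is outside the Game Pak ROM windows.
fn romOffset(addr: u32) ?u32 {
    if (addr < rom_start or addr > rom_end) return null;
    // Assumption: each window mirrors the same ROM contents.
    return (addr - rom_start) % window_size;
}

pub fn main() void {
    if (romOffset(0x0800_0000)) |offset| {
        std.debug.print("CPU 0x08000000 -> ROM offset 0x{x}\n", .{offset});
    }
    if (romOffset(0x0200_0000) == null) {
        std.debug.print("CPU 0x02000000 is not in the cartridge ROM\n", .{});
    }
}
```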
Execution #
Now that we know how the GBA’s memory works, how does the processor execute code?
First thing: the code is stored in memory. Yes, the code has an address, and that’s how the processor accesses it.
To understand, you need to know that the processor itself has very little memory of its own: its registers. This memory is very small because it must be very fast, as this is what the processor uses to perform its various operations. ARM7TDMI has 37 registers of 32 bits each, so only 148 bytes of memory.
Register 15 contains the Program Counter (PC). This register contains the address of the instruction the processor is currently executing.
This is not entirely true. The register contains the address of the current instruction plus two instructions. On many other architectures, PC contains the address of the next instruction. With an ARM7TDMI processor, it’s two instructions ahead. This is not very important for understanding this chapter, but it will become important later, so keep this in mind.
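To put numbers on "plus two instructions", here is a small sketch. ARM instructions are 4 bytes wide and THUMB instructions are 2 bytes wide, so the value read from PC is 8 or 4 bytes ahead. The `State` enum and the `visiblePc` function are illustrative names, not part of any library.

```zig
const std = @import("std");

/// The two instruction sets of the ARM7TDMI.
const State = enum { arm, thumb };

/// Size in bytes of one instruction in the given state.
fn instructionSize(state: State) u32 {
    return switch (state) {
        .arm => 4,
        .thumb => 2,
    };
}

/// Value read from PC while the instruction at `current` is executing:
/// the address of the current instruction plus two instructions.
fn visiblePc(current: u32, state: State) u32 {
    return current + 2 * instructionSize(state);
}

pub fn main() void {
    std.debug.print("ARM:   PC = 0x{x}\n", .{visiblePc(0x0800_0000, .arm)});
    std.debug.print("THUMB: PC = 0x{x}\n", .{visiblePc(0x0800_0000, .thumb)});
}
```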
At console startup, the PC is at 0x0000_0000. The processor will therefore read the instruction at address 0x0000_0000, execute it, then read the next instruction, and so on. The instruction itself can modify the PC register, allowing for conditional jumps or loops.
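This fetch-and-execute cycle can be sketched with a toy machine. To be clear, this is not how the ARM7TDMI is implemented: the three-instruction set, the `Instruction` union, and the `run` function are all invented for this example, and PC counts instructions rather than bytes.

```zig
const std = @import("std");

/// A made-up three-instruction machine, far simpler than a real ARM7TDMI.
const Instruction = union(enum) {
    nop, // do nothing, move on to the next instruction
    jump: usize, // overwrite PC with a new address
    halt, // stop the machine
};

/// Runs `program` until it halts, runs off the end of the program, or
/// `max_steps` is reached. Returns the final value of PC.
fn run(program: []const Instruction, max_steps: usize) usize {
    var pc: usize = 0;
    var steps: usize = 0;
    while (steps < max_steps and pc < program.len) : (steps += 1) {
        // Fetch the instruction PC points to, then execute it.
        switch (program[pc]) {
            .nop => {
                pc += 1;
            },
            .jump => |target| {
                pc = target;
            },
            .halt => break,
        }
    }
    return pc;
}

pub fn main() void {
    // The jump skips index 2 and lands on the halt at index 3.
    const skip = [_]Instruction{ .nop, .{ .jump = 3 }, .nop, .halt };
    std.debug.print("final PC: {d}\n", .{run(&skip, 100)});

    // A jump to itself is an infinite loop: only the step limit stops it.
    const forever = [_]Instruction{.{ .jump = 0 }};
    std.debug.print("final PC: {d}\n", .{run(&forever, 1000)});
}
```

The second program is the toy equivalent of the infinite loop we are about to write.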
So you just need to put your code at 0x0000_0000 for it to be executed? In theory yes, but 0x0000_0000-0x0000_3FFF is the BIOS range of the console. The BIOS is a program that allows the console to initialize itself. This program is stored in a ROM and is therefore not editable. It is actually the BIOS that will hand over to our program by jumping to address 0x0800_0000.
We just need to place our infinite loop at the very beginning of our cartridge, and the Game Boy Advance BIOS will take care of handing over to us.
Objective: Infinite Loop #
Our First Executable #
Well, now that it’s clear, we know what we have to do: write our program! Let’s go naively and see where it takes us. First, let’s create a new Zig project.
mkdir mygbarom
cd mygbarom
mkdir src
From there, we can add ./src/main.zig:
pub fn main() void {
    while (true) {}
}
Even if you don’t know Zig, this shouldn’t shock you.
We can also add ./build.zig:
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});
    const exe_mod = b.createModule(.{
        .root_source_file = b.path("src/main.zig"),
        .target = target,
        .optimize = optimize,
    });
    const exe = b.addExecutable(.{
        .name = "mygbarom",
        .root_module = exe_mod,
    });
    b.installArtifact(exe);
}
This is the file that Zig executes to build the program. It’s equivalent to a Makefile or a CMake config. I took the base file generated by zig init and removed the superfluous.
To elaborate a bit, the purpose of our build function is to install our executable in ./zig-out/bin/. To do this, we first need to define the target of our compilation, which is our target variable. Then, we need to define our optimization level with our optimize variable. exe_mod contains our module. Finally, exe defines our final executable, which will be named “mygbarom”. This final executable will be installed in ./zig-out/bin/ thanks to b.installArtifact(exe).
A small clarification: I am not a Zig expert, even less an expert in its build system. Take what I say with a grain of salt and feel free to report any errors.
We can now build the project and run the executable.
zig build
./zig-out/bin/mygbarom
Normally, the program will just do nothing without terminating; you can stop it with CTRL+C.
Great! We have the behavior we wanted: an infinite loop.
Well, now we can test with mGBA.
$ mgba zig-out/bin/mygbarom
Could not run game. Are you sure the file exists and is a compatible game?
Yes, it couldn’t be that simple, as you might expect. But this is an interesting starting point for us because this path between the executable we just generated and the cartridge will gradually introduce us to the world of bare-metal programming, but it will also help us understand a bit more the world we come from (here, Linux on x86-64).
So, if our executable is not a GBA cartridge, then what is it?
$ file zig-out/bin/mygbarom
mygbarom: ELF 64-bit LSB executable, x86-64, version 1 (SYSV),
statically linked, with debug_info, not stripped
An ELF file? Let’s see what that is.
The ELF File Format #
ELF (Executable and Linkable Format) is the standard executable file format on Linux and many modern UNIX systems. It is a format that stores not only the executable code of a program but also data, symbols, and information on how the program should be loaded into memory.
An ELF file is divided into several sections. The main ones are:
- .text: contains the program’s executable code
- .data: contains initialized variables
- .bss: reserves space for uninitialized variables
- .rodata: read-only data (like constant strings)
There are also other specific sections, like .ARM.exidx, which we will see in our file once we target ARM and which is specifically used for exceptions on ARM architecture.
When you execute a program on Linux, the operating system loads the ELF file into memory, following the instructions contained in its header, then positions the program counter (PC) register at the specified entry point.
However, the GBA does not understand this format. It expects raw code that it can execute directly, without a complex loading phase. This is why we will need to extract only the necessary code and data from our ELF file to create our GBA ROM.
Let’s compare this with our goal: on a GBA, we need our code to be placed directly in the cartridge’s ROM memory, at specific addresses and in a very specific format. The ELF format is far too complex with its headers and multiple sections to be used directly.
Let’s see how to transform our ELF file into a raw binary suitable for the GBA.
From ELF File to .gba #
So what is wrong?
- Our executable does not use ARM assembly but x86_64.
- Our executable uses ELF, a format not recognized by our GBA.
Setting the Target #
Let’s first resolve the assembly issue.
In ./build.zig, replace target with:
const target = std.Target.Query{
    .cpu_arch = .thumb,
    .cpu_model = .{ .explicit = &std.Target.arm.cpu.arm7tdmi },
    .os_tag = .freestanding,
};
Now, the Zig compiler will target the ARM7TDMI processor and write in THUMB assembly, which, as a reminder, is a compact ARM assembly with 16-bit instructions, very practical for GBA programming.
Next, update the executable module:
const exe_mod = b.createModule(.{
    .root_source_file = b.path("src/main.zig"),
    .target = b.resolveTargetQuery(target), // here
    .optimize = optimize,
});
Let’s try to build:
$ zig build
install
└─ install mygbarom
└─ zig build-exe foo Debug thumb-freestanding failure
error: warning(link): unexpected LLD stderr:
ld.lld: warning: cannot find entry symbol _start; not setting start address
Ah, an error. Here we are confronted with the first lie of the wonderful world of programming: main is not the program’s entry point. The entry point is (standardly) _start. This symbol represents the address where the program execution begins. This symbol, after potentially performing the program’s initialization, hands over to the main function.
Here, the linker cannot find the _start symbol in our program. The reason is simple: we asked Zig to use the freestanding flag for the OS, which means we are not targeting any OS. Zig will therefore let us manually add _start to our program and export it.
So how do we add our entry point?
export fn _start() noreturn {
    while (true) {}
}
We just replaced main with _start. We also added export, which makes this symbol visible and allows it to be considered the entry point. The function’s return type is now noreturn, which allows the Zig compiler to verify that our function never returns.
noreturn ensures that our pointer in the PC register does not “escape.” Hypothetically, PC could continue to increment past the end of our program, and thus fall on random values until it crashes. We must ensure that whatever happens, PC remains confined to the boundaries of our code (for example, with an infinite loop).
If we zig build, the build should finish successfully! If we use readelf, we have:
$ readelf mygbarom -h
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: ARM
Version: 0x1
Entry point address: 0x54261
Start of program headers: 52 (bytes into file)
Start of section headers: 1456952 (bytes into file)
Flags: 0x5000200, Version5 EABI, soft-float ABI
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 6
Size of section headers: 40 (bytes)
Number of section headers: 21
Section header string table index: 19
We can see that we indeed have an ARM file! Congratulations!
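For the curious, the fields readelf relies on to say "ELF32" and "ARM" sit at fixed positions at the start of the file. The sketch below checks them by hand. `ElfInfo` and `parseElfHeader` are names I invented; the offsets and the value 40 for ARM come from the ELF specification, and the code assumes a little-endian file like ours.

```zig
const std = @import("std");

/// The few header fields we care about here.
const ElfInfo = struct {
    is_32_bit: bool,
    little_endian: bool,
    machine: u16,
};

/// Value of the machine field for ARM in the ELF specification.
const em_arm: u16 = 40;

/// Reads the start of an ELF file. Returns null if the magic number is wrong.
fn parseElfHeader(bytes: []const u8) ?ElfInfo {
    if (bytes.len < 20) return null;
    if (!std.mem.eql(u8, bytes[0..4], "\x7fELF")) return null;
    // The machine field is a 16-bit value at offset 18.
    // Assumption: the file is little-endian, as readelf reports for ours.
    const machine = @as(u16, bytes[18]) | (@as(u16, bytes[19]) << 8);
    return ElfInfo{
        .is_32_bit = bytes[4] == 1,
        .little_endian = bytes[5] == 1,
        .machine = machine,
    };
}

pub fn main() void {
    // The 16 "Magic" bytes printed by readelf above. Bytes 16 to 19 are
    // reconstructed from the Type (EXEC = 2) and Machine (ARM = 40) fields.
    const header = [_]u8{
        0x7f, 0x45, 0x4c, 0x46, 0x01, 0x01, 0x01, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x02, 0x00, 0x28, 0x00,
    };
    if (parseElfHeader(&header)) |info| {
        std.debug.print("32-bit: {}, ARM: {}\n", .{ info.is_32_bit, info.machine == em_arm });
    }
}
```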
Cleaning Up Our ELF File #
Let’s take a closer look at our ELF file.
$ readelf --sections zig-out/bin/mygbarom
There are 21 section headers, starting at offset 0x163b38:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .ARM.exidx ARM_EXIDX 000100f4 0000f4 000d90 00 AL 4 0 4
[ 2] .ARM.extab PROGBITS 00010e84 000e84 000888 00 A 0 0 4
[ 3] .rodata PROGBITS 00011710 001710 0066d8 00 AMS 0 0 8
[ 4] .text PROGBITS 00027de8 007de8 0587f2 00 AX 0 0 4
[ 5] .data PROGBITS 000905dc 0605dc 000004 00 WA 0 0 4
[ 6] .bss NOBITS 00090600 0605e0 001000 00 WA 0 0 64
[ 7] .debug_loc PROGBITS 00000000 0605e0 0775ed 00 0 0 1
[ 8] .debug_abbrev PROGBITS 00000000 0d7bcd 0006e7 00 0 0 1
[ 9] .debug_info PROGBITS 00000000 0d82b4 03434c 00 0 0 1
[10] .debug_ranges PROGBITS 00000000 10c600 008938 00 0 0 1
[11] .debug_str PROGBITS 00000000 114f38 00ee7d 01 MS 0 0 1
[12] .debug_pubnames PROGBITS 00000000 123db5 0051ea 00 0 0 1
[13] .debug_pubtypes PROGBITS 00000000 128f9f 0019d6 00 0 0 1
[14] .ARM.attributes ARM_ATTRIBUTES 00000000 12a975 00003a 00 0 0 1
[15] .debug_frame PROGBITS 00000000 12a9b0 005e70 00 0 0 4
[16] .debug_line PROGBITS 00000000 130820 024313 00 0 0 1
[17] .comment PROGBITS 00000000 154b33 000013 01 MS 0 0 1
[18] .symtab SYMTAB 00000000 154b48 009c00 10 20 1988 4
[19] .shstrtab STRTAB 00000000 15e748 0000da 00 0 0 1
[20] .strtab STRTAB 00000000 15e822 005314 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
D (mbind), y (purecode), p (processor specific)
So we remember well: the .text section contains the code, while the .data, .rodata, and .bss sections contain the data.
Let’s look at our .text section: size = 0x0587f2? That’s huge! How is it that our code section is so large when we just implemented a while loop?
The answer is simple: we ran zig build in debug mode. We told Zig to keep debug symbols, checks, library symbols, and not to optimize.
Here’s how to tell the Zig compiler to remove all that, in ./build.zig:
pub fn build(b: *std.Build) void {
    // ...
    const optimize = .ReleaseSmall;
    // ...
}
The .ReleaseSmall optimization mode tells the compiler to optimize for the size of our code, minimizing the number of instructions.
I tested other optimization modes (notably .ReleaseSafe), and I encountered errors and unrecognized instructions. I haven’t identified the problem, but everything works well in .ReleaseSmall, so I won’t change it for now. I’ll do some research later.
Let’s build:
zig build
$ readelf --sections zig-out/bin/mygbarom
There are 6 section headers, starting at offset 0x16c:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .ARM.exidx ARM_EXIDX 000100d4 0000d4 000010 00 AL 2 0 4
[ 2] .text PROGBITS 000200e4 0000e4 000004 00 AX 0 0 4
[ 3] .ARM.attributes ARM_ATTRIBUTES 00000000 0000e8 00003a 00 0 0 1
[ 4] .comment PROGBITS 00000000 000122 000013 01 MS 0 0 1
[ 5] .shstrtab STRTAB 00000000 000135 000035 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
D (mbind), y (purecode), p (processor specific)
And there we go! We now have a clean ELF file!
From ELF File to Binary #
Now that we have a cleaner ELF file, will it be recognized by mGBA?
$ mgba zig-out/bin/mygbarom
Could not run game. Are you sure the file exists and is a compatible game?
As mentioned earlier, the Game Boy Advance does not recognize the ELF format. Why would it? Reminder: the Game Boy Advance has no OS. It doesn’t need a standard portable format, it doesn’t need a section table, relocation table, dynamic libraries, etc. We need to remove all the superfluous from the ELF format to keep only the essential: our program.
Note: mGBA is able to load ELF files directly. Of course, this is not a native capability of the GBA, so we will not use it.
Here’s how to do it, in ./build.zig:
pub fn build(b: *std.Build) void {
    // ...
    const objcopy_step = exe.addObjCopy(.{ .format = .bin });
    const install_bin_step = b.addInstallBinFile(objcopy_step.getOutput(), "mygbarom.gba");
    install_bin_step.step.dependOn(&objcopy_step.step);
    b.getInstallStep().dependOn(&install_bin_step.step);
    // ...
}
We use the objcopy tool to extract our code from the ELF file, in the .bin format, i.e., raw code without file headers or unloaded sections. In our case, only the .text and .ARM.exidx sections will be copied into mygbarom.gba.
Normally, we can find the binary in ./zig-out/bin/mygbarom.gba. Let’s see what we find in there.
$ hexdump zig-out/bin/mygbarom.gba
0000000 0010 0001 b0b0 80b0 000c 0001 0001 0000
0000010 0000 0000 0000 0000 0000 0000 0000 0000
*
0010010 e7fe 4770
0010014
For those unfamiliar with hexdump, it’s a tool that allows you to see the contents of a file in hexadecimal. So what do we see? The left column shows the offset, a bit like a line number. The other eight columns show the contents of the binary as 16-bit words. By default, hexdump reads each pair of bytes as a little-endian value, so the bytes 10 00 are displayed as 0010. The asterisk stands for repeated lines identical to the previous one, here lines filled with zeros. But then why does our binary look like this? Let’s recheck our ELF file.
$ readelf zig-out/bin/mygbarom -x 1
Hex dump of section '.ARM.exidx':
0x000100d4 10000100 b0b0b080 0c000100 01000000 ................
$ readelf zig-out/bin/mygbarom -x 2
Hex dump of section '.text':
0x000200e4 fee77047 ..pG
So what we see is that our binary consists of the contents of .ARM.exidx, followed by a long padding, then the contents of .text.
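We can double-check this layout with a bit of arithmetic on the numbers printed by readelf. This is only a sketch: it assumes that the raw binary starts at the lowest loaded address and that the gap between the two sections is filled with zeros, which is what the hexdump suggests. All constant names are mine.

```zig
const std = @import("std");

// Addresses and sizes copied from the `readelf --sections` output above.
const exidx_addr: u32 = 0x000100d4; // .ARM.exidx
const exidx_size: u32 = 0x10;
const text_addr: u32 = 0x000200e4; // .text
const text_size: u32 = 0x4;

// Assumption: the file starts at the lowest loaded address and the
// distance between sections is preserved as zero padding.
const text_offset: u32 = text_addr - exidx_addr;
const padding: u32 = text_offset - exidx_size;
const file_size: u32 = text_offset + text_size;

comptime {
    // 0x10010 is the offset where hexdump shows "e7fe 4770".
    std.debug.assert(text_offset == 0x10010);
    // 0x10014 is the last offset printed by hexdump, i.e. the file size.
    std.debug.assert(file_size == 0x10014);
}

pub fn main() void {
    std.debug.print(
        ".text offset: 0x{x}, padding: {d} bytes, file size: {d} bytes\n",
        .{ text_offset, padding, file_size },
    );
}
```

In other words, our four bytes of code drag along 64 KiB of padding because of where the linker chose to put the sections.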
Good! We have our binary!
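If you wonder what the four bytes of .text (fe e7 70 47) actually mean, here is a hedged sketch that decodes the first instruction by hand. It only understands one THUMB instruction, the unconditional branch, using the encoding described in ARM's documentation. `readHalfword` and `thumbBranchTarget` are helper names I made up.

```zig
const std = @import("std");

/// Reads a 16-bit little-endian value, which is how `hexdump` groups bytes
/// by default on a little-endian machine.
fn readHalfword(bytes: []const u8, offset: usize) u16 {
    return @as(u16, bytes[offset]) | (@as(u16, bytes[offset + 1]) << 8);
}

/// If `instruction` is a THUMB unconditional branch, returns the address it
/// jumps to, given the address of the instruction itself. Otherwise null.
fn thumbBranchTarget(instruction: u16, address: u32) ?u32 {
    // Unconditional branches start with the bits 11100.
    if (instruction >> 11 != 0b11100) return null;
    // The low 11 bits are a signed offset, counted in 16-bit halfwords.
    const raw: i64 = instruction & 0x7FF;
    const offset: i64 = if (raw >= 0x400) raw - 0x800 else raw;
    // PC reads as the current address plus two THUMB instructions.
    const pc: i64 = @as(i64, address) + 4;
    const target: u32 = @intCast(pc + offset * 2);
    return target;
}

pub fn main() void {
    // Contents of .text as printed by `readelf -x 2`.
    const text = [_]u8{ 0xfe, 0xe7, 0x70, 0x47 };
    const first = readHalfword(&text, 0);
    std.debug.print("first instruction: 0x{x}\n", .{first});
    if (thumbBranchTarget(first, 0x000200e4)) |target| {
        std.debug.print("branches to 0x{x}\n", .{target});
    }
}
```

The branch at 0x000200e4 targets 0x000200e4: it jumps to itself, which is exactly our while (true) {}. The second halfword, 0x4770, is not a branch of this kind, so the function returns null for it.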
The Cartridge Format #
Let’s see how mGBA reacts to our masterpiece:
$ mgba zig-out/bin/mygbarom.gba
Could not run game. Are you sure the file exists and is a compatible game?
Failed again. So what’s the problem? Once again, I lied when I said the Game Boy Advance doesn’t expect a standard format. That’s false. The console, before launching the game, will check the contents of a header, which must precede our program. Here is its structure:
const Header = extern struct {
    entry_point: u32,
    nintendo_logo: [156]u8,
    game_name: [12]u8,
    game_code: [4]u8,
    maker_code: [2]u8,
    fixed_value: u8,
    unit_code: u8,
    device_type: u8,
    reserved1: [7]u8,
    software_version: u8,
    complement_check: u8,
    reserved2: [2]u8,
};
The first entry of the header is entry_point: it’s an ARM instruction, the first one that will be executed by the processor. This instruction must be positioned at 0x0800_0000 in the GBA’s memory. This instruction is often a simple “jump,” a jump to an address located after the end of the header.
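To show where such a value comes from, here is a sketch that encodes an ARM "branch always" instruction. It relies on the ARM encoding (0xEA in the top byte, then a 24-bit offset counted in 4-byte words and relative to PC, which is 8 bytes ahead). `encodeArmBranch` is a hypothetical helper of mine. With a 192-byte header, it gives back the value 0xEA00002E that we will use for entry_point later in this article.

```zig
const std = @import("std");

/// Encodes an ARM `b` (branch always) instruction located at `from`
/// that jumps to `to`. Both addresses must be 4-byte aligned.
fn encodeArmBranch(from: u32, to: u32) u32 {
    // In ARM mode, PC is two instructions (8 bytes) ahead.
    const pc: i64 = @as(i64, from) + 8;
    const delta: i64 = @as(i64, to) - pc;
    std.debug.assert(@rem(delta, 4) == 0);
    // The offset is stored as a number of 4-byte words, on 24 signed bits.
    const words: i64 = @divExact(delta, 4);
    std.debug.assert(words >= -(1 << 23) and words < (1 << 23));
    const offset24: u32 = @intCast(words & 0x00FF_FFFF);
    // 0xE = "always" condition, 0xA = branch without link.
    return 0xEA00_0000 | offset24;
}

pub fn main() void {
    // The header is 192 (0xC0) bytes long, so the code starts right after.
    const instruction = encodeArmBranch(0x0800_0000, 0x0800_00C0);
    std.debug.print("entry_point = 0x{X}\n", .{instruction});
}
```

Since the value is stored in little-endian order, its most significant byte, 0xEA, ends up as the fourth byte of the file.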
At startup, the GBA will check two values in the header: nintendo_logo and complement_check. nintendo_logo must contain the Nintendo logo bitmap to the bit. This is a way to prevent the commercialization of unofficial cartridges because the logo is copyrighted, and Nintendo can sue companies distributing their logo without authorization. complement_check is a checksum, a value calculated from bytes of the header, which allows verifying whether the header has been corrupted or modified.
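For reference, here is a sketch of that checksum as I understand it from GBATEK: start from zero, subtract every byte from offset 0xA0 to 0xBC (game name to software version), then subtract 0x19, keeping only the low 8 bits. The function name `complementCheck` is mine, and we do not need this value for mGBA, so take it as an illustration.

```zig
const std = @import("std");

/// Computes the header checksum over bytes 0xA0..0xBC of the ROM,
/// following the formula given by GBATEK.
fn complementCheck(rom: []const u8) u8 {
    var sum: u8 = 0;
    for (rom[0xA0..0xBD]) |byte| {
        sum -%= byte; // wrapping subtraction keeps the low 8 bits
    }
    return sum -% 0x19;
}

pub fn main() void {
    // A 192-byte header filled with zeros, except the fixed value 0x96
    // at offset 0xB2 that is discussed just after.
    var rom = [_]u8{0} ** 0xC0;
    rom[0xB2] = 0x96;
    std.debug.print("complement_check = 0x{x}\n", .{complementCheck(&rom)});
}
```

For this mostly empty header, the result is 0x51.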
Most emulators do not check these values (to be able to read unofficial ROMs and for convenience for those like us who just want to debug their game without calculating a checksum). However, mGBA checks two values to determine if what we give it is indeed a GBA cartridge:
- The fourth byte of entry_point, which should be equal to 0xEA.
- fixed_value, which should be equal to 0x96.
Let’s try adding this header to our source file ./src/main.zig:
const Header = extern struct {
    entry_point: u32 = 0xEA00002E,
    nintendo_logo: [156]u8 = @splat(0x00),
    game_name: [12]u8 = @splat(0x00),
    game_code: [4]u8 = @splat(0x00),
    maker_code: [2]u8 = @splat(0x00),
    fixed_value: u8 = 0x96,
    unit_code: u8 = 0x00,
    device_type: u8 = 0x00,
    reserved1: [7]u8 = @splat(0x00),
    software_version: u8 = 0x00,
    complement_check: u8 = 0x00,
    reserved2: [2]u8 = @splat(0x00),
};

const header = Header{};

export fn _start() noreturn {
    while (true) {}
}
So we declared a Header structure, with the extern keyword that tells the compiler to respect the order of attributes.
To be more precise, extern allows respecting what is called the “C ABI.” ABI stands for Application Binary Interface. It’s a convention, a standard that allows sharing (among other things) the same method of struct construction, ensuring that two libraries possibly compiled separately can still share the same structures. When we use extern in Zig, we are actually telling the compiler to respect a standard that will allow other programs (possibly not written in Zig) to interact with ours.
If we zig build, we can see that absolutely nothing has changed in our ELF file or binary file.
I imagine the Zig compiler does not include the header in the program because it is never used.
We need to force the compiler and linker to place this header exactly at the very beginning of the file, then our code right after. To do this, we will need to give directives to the linker using a linker script.
Alignment #
We talked a bit about struct construction standards in the previous section, but we need to go further and address alignment.
By default, in C ABI, structures are not necessarily constructed to optimize the space occupied. They rather adopt a layout that optimizes access speed.
In general, memory accesses for data larger than 8 bits must be aligned, meaning the address must be a multiple of the data size.
For example, a 16-bit int must be written at an address divisible by 2, its size in bytes (2, 4, 6, 0x100, and so on).
Imagine the following structure:
const Example = extern struct {
    a: u8,
    b: u16,
    c: u32,
};
How will this structure be represented in memory?
Naively, we could construct it like this:
block-beta a:2 b:4 c:8
But let’s see how this structure can be viewed when aligned to 32 bits.
block-beta columns 8 a:2 b:4 c1["c"]:2 c2["c"]:6
Imagine we want to access b. Since it’s 16-bit data, we cannot access it directly: we need to use an aligned address. We must therefore read the 32 bits that include a, b, and a part of c.
block-beta columns 8 a:2 b:4 c1["c"]:2
Then we need to shift the bits.
block-beta columns 8 b:4 c:2 z["0"]:2
Finally, we need to zero out (or ignore) the 16 insignificant bits.
block-beta columns 8 b:4 z["0"]:4
This represents many steps to access a value. And again, we only needed one memory access. For c, it would have required two.
How do compilers solve this problem? They add padding.
block-beta columns 4 a:1 p1["padding"]:1 b:2 c:4
Now, each field is aligned and accessible in a single instruction!
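We can ask Zig where it actually puts each field with the @offsetOf and @sizeOf builtins. The checks below hold on the targets I know of (x86-64 and ARM), where a u16 is aligned on 2 bytes and a u32 on 4 bytes. Treat them as an illustration rather than a guarantee for every platform.

```zig
const std = @import("std");

const Example = extern struct {
    a: u8,
    b: u16,
    c: u32,
};

// These checks run at compile time: the build fails if one is wrong.
comptime {
    std.debug.assert(@offsetOf(Example, "a") == 0);
    // One byte of padding is inserted after `a` so that `b` is 2-byte aligned.
    std.debug.assert(@offsetOf(Example, "b") == 2);
    std.debug.assert(@offsetOf(Example, "c") == 4);
    // 1 + 1 (padding) + 2 + 4 = 8 bytes instead of the naive 7.
    std.debug.assert(@sizeOf(Example) == 8);
}

pub fn main() void {
    std.debug.print("a at {d}, b at {d}, c at {d}, size {d}\n", .{
        @offsetOf(Example, "a"),
        @offsetOf(Example, "b"),
        @offsetOf(Example, "c"),
        @sizeOf(Example),
    });
}
```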
How does this concern us? This way of adding padding in structures could make our header incorrect, non-compliant with what the Game Boy Advance expects.
We must therefore specify to the compiler not to modify the alignment of our data.
const Header = extern struct {
    entry_point: u32 align(1) = 0xEA00002E,
    nintendo_logo: [156]u8 align(1) = @splat(0),
    game_name: [12]u8 align(1) = @splat(0),
    game_code: [4]u8 align(1) = @splat(0),
    maker_code: [2]u8 align(1) = @splat(0),
    fixed_value: u8 align(1) = 0x96,
    unit_code: u8 align(1) = 0x00,
    device_type: u8 align(1) = 0x00,
    reserved1: [7]u8 align(1) = @splat(0),
    software_version: u8 align(1) = 0x00,
    complement_check: u8 align(1) = 0x00,
    reserved2: [2]u8 align(1) = @splat(0),
};
Note that on my machine, both versions of Header produce exactly the same result. This is (I think) because the header does not pose the alignment problems we discussed. But it’s a bit by “accident.” Adding align(1) ensures that our header will remain correct regardless of the compiler’s default behavior.
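Rather than trusting our eyes, we can let the compiler verify the layout. The sketch below repeats the struct without default values (only the layout matters here) and checks a few offsets at compile time. The expected numbers, 192 bytes in total with fixed_value at 0xB2 and complement_check at 0xBD, are simply the sums of the field sizes that precede them.

```zig
const std = @import("std");

const Header = extern struct {
    entry_point: u32 align(1),
    nintendo_logo: [156]u8 align(1),
    game_name: [12]u8 align(1),
    game_code: [4]u8 align(1),
    maker_code: [2]u8 align(1),
    fixed_value: u8 align(1),
    unit_code: u8 align(1),
    device_type: u8 align(1),
    reserved1: [7]u8 align(1),
    software_version: u8 align(1),
    complement_check: u8 align(1),
    reserved2: [2]u8 align(1),
};

// If padding ever sneaks into the struct, compilation fails here.
comptime {
    std.debug.assert(@sizeOf(Header) == 192);
    std.debug.assert(@offsetOf(Header, "nintendo_logo") == 0x04);
    std.debug.assert(@offsetOf(Header, "game_name") == 0xA0);
    std.debug.assert(@offsetOf(Header, "fixed_value") == 0xB2);
    std.debug.assert(@offsetOf(Header, "complement_check") == 0xBD);
}

pub fn main() void {
    std.debug.print("header size: {d} bytes\n", .{@sizeOf(Header)});
}
```

A comptime block like this could live next to the real Header in main.zig, so that a future change to the struct cannot silently break the cartridge format.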
The Linker Script #
A quick point on the build process.
When we call zig build, we first call the compiler on each source file, which compiles it into an object file, an incomplete ELF file that only contains information specific to the source file. Then we call the linker, which will take all the object files and merge them into a single executable ELF file.
main.zig --compile--> main.o --link--> out.elf
foo.zig --compile--> foo.o --link--> out.elf
bar.zig --compile--> bar.o --link--> out.elf
The linker script provides the instructions for this final link step. It’s a file that tells the linker how to construct an ELF file from the sections of the intermediate object files. A linker script is not mandatory: in most cases, the linker can manage without one. But in our case, we have to tell the linker that we want our header to come first.
Our goal will be to declare a .gbaheader section containing an instance of
our Header structure and place this section first, before the .text section.
Let’s start by telling the Zig compiler that we want to place our header in our
.gbaheader section, in ./src/main.zig:
export const header linksection(".gbaheader") = Header{};
Next, let’s create our linker script, ./gba.ld.
SECTIONS {
    .gbaheader : {
        KEEP(*(.gbaheader))
    }
    .text : {
        *(.text)
    }
    .ARM.exidx : {
        *(.ARM.exidx)
    }
}
So what did we just do? We declared the contents of the sections that our final
ELF file will contain. For example, .text will contain all the .text sections
from all the source files we are compiling. In our case, there is only main.zig,
but there could be other source files that the linker assembles into a single
ELF file. The same goes for .ARM.exidx, which the linker otherwise insists on
placing before our header. As for .gbaheader, it is the custom section we
created in main.zig, which for now only exists in the generated object file. We
declare a .gbaheader section in our final file, which gathers all the
.gbaheader sections we have declared (normally just one). A small subtlety: we
use the KEEP command to tell the linker to keep this section in the final file,
even if nothing else seems to reference it.
The declaration order matters: by listing .gbaheader first, we place it before
.text and .ARM.exidx.
Last step: tell Zig that we are using a linker script, in ./build.zig:
pub fn build(b: *std.Build) void {
    // ...
    exe.setLinkerScript(.{ .src_path = .{
        .owner = b,
        .sub_path = "gba.ld",
    } });
    // ...
}
And there we go! After a zig build, you should see something like this:
$ readelf zig-out/bin/mygbarom --sections
There are 8 section headers, starting at offset 0x10180:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .gbaheader PROGBITS 00010000 010000 0000c0 00 A 0 0 4
[ 2] .text PROGBITS 000100c0 0100c0 000002 00 AX 0 0 4
[ 3] .text.__aeab[...] PROGBITS 000100c2 0100c2 000002 00 AX 0 0 2
[ 4] .ARM.exidx ARM_EXIDX 000100c4 0100c4 000010 00 AL 2 0 4
[ 5] .ARM.attributes ARM_ATTRIBUTES 00000000 0100d4 00003a 00 0 0 1
[ 6] .comment PROGBITS 00000000 01010e 000013 01 MS 0 0 1
[ 7] .shstrtab STRTAB 00000000 010121 00005d 00 0 0 1
Our section exists! Let’s see what it contains:
$ readelf zig-out/bin/mygbarom -x 1
Hex dump of section '.gbaheader':
0x00010000 2e0000ea 00000000 00000000 00000000 ................
0x00010010 00000000 00000000 00000000 00000000 ................
0x00010020 00000000 00000000 00000000 00000000 ................
0x00010030 00000000 00000000 00000000 00000000 ................
0x00010040 00000000 00000000 00000000 00000000 ................
0x00010050 00000000 00000000 00000000 00000000 ................
0x00010060 00000000 00000000 00000000 00000000 ................
0x00010070 00000000 00000000 00000000 00000000 ................
0x00010080 00000000 00000000 00000000 00000000 ................
0x00010090 00000000 00000000 00000000 00000000 ................
0x000100a0 00000000 00000000 00000000 00000000 ................
0x000100b0 00009600 00000000 00000000 00000000 ................
It’s indeed our header: we can clearly see the entry instruction 0xEA00002E
(displayed here in little endian, as the bytes 2e 00 00 ea), and 0x96 near the
end, at offset 0xB2.
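To make these two observations concrete, here is a small sketch that can be run on a regular computer. It serializes 0xEA00002E in little endian, then decodes it as an ARM branch to find where it jumps. The function name armBranchTarget is hypothetical, and the decoding assumes the word really is a B instruction (a signed 24-bit offset counted in words, relative to the instruction address plus 8).

```zig
const std = @import("std");

/// Destination of an ARM `B` instruction located at `addr`.
/// Sketch only: assumes `insn` really is a branch (bits 27..25 = 0b101).
fn armBranchTarget(addr: u32, insn: u32) u32 {
    const imm24: u24 = @truncate(insn); // low 24 bits: signed offset in words
    const words: i24 = @bitCast(imm24);
    const offset: i32 = @as(i32, words) * 4; // each ARM instruction is 4 bytes
    const pc: i32 = @bitCast(addr +% 8); // in ARM state, PC reads as addr + 8
    return @bitCast(pc +% offset);
}

pub fn main() void {
    // How 0xEA00002E is laid out in memory on a little-endian machine.
    var bytes: [4]u8 = undefined;
    std.mem.writeInt(u32, &bytes, 0xEA00002E, .little);
    std.debug.print("bytes:  {x:0>2} {x:0>2} {x:0>2} {x:0>2}\n", .{
        bytes[0], bytes[1], bytes[2], bytes[3],
    });

    // On the GBA, the cartridge (and so the header) is mapped at 0x0800_0000.
    const target = armBranchTarget(0x0800_0000, 0xEA00002E);
    std.debug.print("target: 0x{X:0>8}\n", .{target});
}
```

The offset field is 0x2E words, that is 0xB8 bytes; added to the PC value of 8, the branch lands 0xC0 bytes after the start of the header, which is exactly the size of the .gbaheader section.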
Now let’s look at our .gba binary:
$ hexdump zig-out/bin/mygbarom.gba
0000000 002e ea00 0000 0000 0000 0000 0000 0000
0000010 0000 0000 0000 0000 0000 0000 0000 0000
*
00000b0 0000 0096 0000 0000 0000 0000 0000 0000
00000c0 e7fe 4770 fffc 7fff b0b0 80b0 fff8 7fff
00000d0 0001 0000
00000d4
Total victory! Our file starts with 0xEA00_002E (still in little endian, grouped
here into 16-bit words), followed by our 0x96 and then, starting at offset
0x0000_00C0, our code!
Last step, the most crucial: how will mGBA react?
mgba zig-out/bin/mygbarom.gba
If everything goes well, you should see a new white screen. The mGBA emulator
launched successfully! Congratulations, mGBA has identified our .gba file as a
cartridge!
At first glance, one might think everything went well: mGBA looks stuck simply because it’s caught in our infinite loop.
But if we launch:
mgba-qt zig-out/bin/mygbarom.gba
…we encounter a mysterious error.
This game uses a BIOS call that is not implemented.
Please use the official BIOS for best experience.
then…
The game has crashed with the following error:
Jumped to invalid address: 0E0000C0
Now, that’s strange. If you already have the solution to this problem, congratulations, because it’s not so intuitive.
Switching from ARM to THUMB #
As mentioned earlier, the ARM7TDMI can execute two instruction sets:
- ARM
- THUMB
As a reminder, THUMB instructions are more compact (16 bits instead of 32) and therefore take up less space and less time to load.
To understand this last point, note that the cartridge’s ROM is accessed through a 16-bit bus: it can only be read 16 bits at a time. Fetching a 32-bit ARM instruction from ROM therefore takes two reads, twice as many as a THUMB instruction. Not every GBA memory has this limitation: the internal work RAM (IWRAM), for example, sits on a 32-bit bus.
And if we look at our file ./build.zig:
pub fn build(b: *std.Build) void {
    const target = std.Target.Query{
        .cpu_arch = .thumb, // it's thumb!
        .cpu_model = .{ .explicit = &std.Target.arm.cpu.arm7tdmi },
        .os_tag = .freestanding,
    };
    // ...
}
Our code compiles to THUMB! Our infinite while loop is made of THUMB
instructions, but our processor starts out reading ARM instructions. We need a
specific ARM instruction to switch to THUMB mode.
BX: branch and exchange instruction set.
Syntax:
BX Rm
where:
Rm is a register containing an address to branch to.
BX Rm derives the target instruction set from bit[0] of Rm:
- If bit[0] of Rm is 0, the processor changes to, or remains in, ARM state.
- If bit[0] of Rm is 1, the processor changes to, or remains in, Thumb state.
In other words, this instruction jumps to a given address while also selecting
the instruction set. BX relies on the fact that instruction addresses are always
aligned to at least 2 bytes (4 bytes in ARM, 2 in THUMB): bit 0 of a valid
instruction address is always 0, so BX can use it to choose between ARM and
THUMB.
Example:
Imagine we want to jump to address 0xCAFE_0000 in THUMB. We will therefore need
to put the value 0xCAFE_0001 in a register. This is possible because 0xCAFE_0001
can never be a real instruction address: it is aligned for neither 32-bit nor
16-bit instructions, so bit 0 is free to carry the mode.
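The rule quoted above can be modeled in a few lines. This is only a sketch of the behavior described for BX, runnable on a regular computer; the names Mode, BxTarget and decodeBx are hypothetical.

```zig
const std = @import("std");

const Mode = enum { arm, thumb };

const BxTarget = struct {
    address: u32,
    mode: Mode,
};

/// Models what `BX Rm` does with the value held in Rm.
fn decodeBx(rm: u32) BxTarget {
    // bit[0] selects the instruction set...
    const mode: Mode = if ((rm & 1) == 1) .thumb else .arm;
    return .{
        // ...and is not part of the address we branch to.
        .address = rm & ~@as(u32, 1),
        .mode = mode,
    };
}

pub fn main() void {
    const t = decodeBx(0xCAFE_0001);
    std.debug.print("jump to 0x{X:0>8} in {s} mode\n", .{ t.address, @tagName(t.mode) });
}
```

With 0xCAFE_0001 in the register, the model branches to 0xCAFE_0000 in THUMB mode; with 0xCAFE_0000 it would branch to the same address in ARM mode.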
How do we write ARM in our project? There are several ways, but we will take the simplest route and use Zig’s inline assembly.
So in ./src/main.zig:
export fn _start() noreturn {
    asm volatile (
        \\.arm
        \\.cpu arm7tdmi
        \\add r0, pc, #1
        \\bx r0
    );
    while (true) {}
}
asm allows us to write assembly in Zig. volatile tells the compiler not to
optimize away or move this piece of assembly, leaving it as is.
- .arm declares that we are writing ARM assembly.
- .cpu arm7tdmi declares that we are targeting an ARM7TDMI CPU.
- add r0, pc, #1 is equivalent to r0 = pc + 1.
- bx r0 jumps to the address contained in r0.
To understand what is happening here, you need to remember what we saw about the PC register. It does not contain the address of the current instruction, but the address two instructions further on.
[ Current instruction: N | N + S | N + 2S ]
With S = 4 for ARM instructions and S = 2 for THUMB instructions.
We will therefore have PC = N + 2S.
So in our example, we have:
[ add r0, pc, #1 | bx r0 | our code... ]
So r0 will indeed contain the address of the start of our THUMB code… plus 1.
Why +1? Because we want to switch to THUMB mode, so bit[0] of r0 must be
equal to 1.
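The arithmetic can be checked with a short sketch that runs on a regular computer. The helper name pcSeenAt is hypothetical, and the address 0x0800_00C0 is an assumption: it is where the add instruction ends up if the ROM is mapped at 0x0800_0000 and our code directly follows the 0xC0-byte header.

```zig
const std = @import("std");

/// Value read from PC by an instruction located at `addr`, where `size` is
/// the instruction width in bytes (4 in ARM state, 2 in THUMB state).
fn pcSeenAt(addr: u32, size: u32) u32 {
    return addr + 2 * size; // PC = N + 2S
}

pub fn main() void {
    // Assumed address of `add r0, pc, #1`: right after the header.
    const add_addr: u32 = 0x0800_00C0;

    // r0 = pc + 1, with pc read by an ARM instruction (S = 4).
    const r0 = pcSeenAt(add_addr, 4) + 1;
    std.debug.print("r0         = 0x{X:0>8}\n", .{r0});

    // Clearing bit 0 gives the address bx actually branches to.
    std.debug.print("THUMB code = 0x{X:0>8}\n", .{r0 & ~@as(u32, 1)});
}
```

The add instruction sits at N and bx at N + 4, so N + 8 is the first address after our two ARM instructions, which is where the compiled THUMB code begins.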
That’s it! Now just run zig build and open mGBA. Normally, you should see the
same white screen, but without errors and without crashes.
But how can we be sure everything is working as we want? We can use the debugger integrated into mgba-qt!
mgba-qt -g zig-out/bin/mygbarom.gba
The -g flag starts a GDB session (GDB being the GNU debugger, which you may
know). More precisely, mGBA launches a GDB server, which you can connect to
with a GDB client.
$ gdb
(gdb) target remote localhost:2345 # connect to mgba gdb server
(gdb) layout asm # show asm TUI window
(gdb) layout reg # show registers TUI window
(gdb) stepi # go to next assembly instruction
Now, you can repeat the stepi command and watch the GBA execute. As you can
see, it starts at 0x0000_0000, immediately jumps to 0x0000_0354 (in the BIOS!),
then after a few instructions, jumps to 0x0800_0000 (it’s our cartridge!), then
to 0x0800_00C0 (right after our header), and executes add r0, pc, #1 then bx r0.
We can see that r0 = 0x0800_00C9, which is indeed the address 0x0800_00C8 where
our THUMB code is located, plus 1 to switch to THUMB mode.
After the jump, we are therefore at 0x0800_00C8, and we are… stuck!
The instruction we are on is b.n 0x0800_00C8, i.e., “jump to address
0x0800_00C8.” So it’s a jump in place, an infinite loop.
Exactly what we wanted.
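We can also check this jump in place by decoding the instruction ourselves. The sketch below assumes the loop is still encoded as the halfword 0xE7FE, the value that appeared at offset 0xC0 in the hexdump of our first build. A THUMB unconditional branch carries a signed 11-bit offset counted in halfwords, relative to the instruction address plus 4. The function name thumbBranchTarget is hypothetical.

```zig
const std = @import("std");

/// Destination of a THUMB unconditional branch (`b`) located at `addr`.
/// Sketch only: assumes the top five bits of `insn` are 0b11100.
fn thumbBranchTarget(addr: u32, insn: u16) u32 {
    const imm11: u11 = @truncate(insn); // low 11 bits: signed offset in halfwords
    const halfwords: i11 = @bitCast(imm11);
    const offset: i32 = @as(i32, halfwords) * 2; // each THUMB instruction is 2 bytes
    const pc: i32 = @bitCast(addr +% 4); // in THUMB state, PC reads as addr + 4
    return @bitCast(pc +% offset);
}

pub fn main() void {
    // Assumed encoding of our `while (true) {}` loop, located at 0x0800_00C8.
    const target = thumbBranchTarget(0x0800_00C8, 0xE7FE);
    std.debug.print("b.n 0x{X:0>8}\n", .{target});
}
```

The offset field 0x7FE is -2 halfwords, that is -4 bytes, which exactly cancels the PC being 4 bytes ahead: the branch targets its own address.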
Victory!
Conclusion #
We have successfully created a program for the Game Boy Advance that runs in an emulator. Congratulations if you managed to read or follow this far!
I should point out that we moved quickly: I did not cover all the subtleties of the ELF format, linker scripts, or the inner workings of the Game Boy Advance.
Nevertheless, we were able to cover fairly advanced topics together, and it’s not over! I intend to continue this tutorial to introduce memory-mapped registers, hardware interrupts, and DMA. We are therefore entering territory that is also covered by the TONC tutorial, and I will draw heavily from it.
My goal is not to teach you how to create a GBA game (although you now have a base to build on!), but to help you understand low-level concepts that are often hidden behind the abstraction layers set up by your OS.
My goal is also to learn these concepts more deeply. These are topics I explored during my studies, and I enjoy rediscovering and sharing them.
If you think I said something wrong, feel free to contact me! Similarly, if there are inaccuracies or points you didn’t understand, it would help me improve this article.
To go further, I invite you to check out:
- GBATEK (documentation on the GBA)
- TONC (on GBA game programming)
- ZigGBA (for programming in Zig on GBA)
Here is the result of this first (long) chapter.
$ tree .
.
├── build.zig
├── build.zig.zon
├── gba.ld
├── src
│   └── main.zig
└── zig-out
    └── bin
        ├── mygbarom
        ├── mygbarom.gba
        └── mygbarom.sav
4 directories, 7 files
./src/main.zig:
const Header = extern struct {
    entry_point: u32 align(1) = 0xEA00002E,
    nintendo_logo: [156]u8 align(1) = @splat(0x00),
    game_name: [12]u8 align(1) = @splat(0x00),
    game_code: [4]u8 align(1) = @splat(0x00),
    maker_code: [2]u8 align(1) = @splat(0x00),
    fixed_value: u8 align(1) = 0x96,
    unit_code: u8 align(1) = 0x00,
    device_type: u8 align(1) = 0x00,
    reserved1: [7]u8 align(1) = @splat(0x00),
    software_version: u8 align(1) = 0x00,
    complement_check: u8 align(1) = 0x00,
    reserved2: [2]u8 align(1) = @splat(0x00),
};
export const header linksection(".gbaheader") = Header{};
export fn _start() noreturn {
    asm volatile (
        \\.arm
        \\.cpu arm7tdmi
        \\add r0, pc, #1
        \\bx r0
    );
    while (true) {}
}
./build.zig:
const std = @import("std");
pub fn build(b: *std.Build) void {
    const target = std.Target.Query{
        .cpu_arch = .thumb,
        .cpu_model = .{ .explicit = &std.Target.arm.cpu.arm7tdmi },
        .os_tag = .freestanding,
    };
    const optimize = .ReleaseSmall;
    const exe_mod = b.createModule(.{
        .root_source_file = b.path("src/main.zig"),
        .target = b.resolveTargetQuery(target),
        .optimize = optimize,
    });
    const exe = b.addExecutable(.{
        .name = "mygbarom",
        .root_module = exe_mod,
    });
    exe.setLinkerScript(.{ .src_path = .{
        .owner = b,
        .sub_path = "gba.ld",
    } });
    const objcopy_step = exe.addObjCopy(.{ .format = .bin });
    const install_bin_step = b.addInstallBinFile(objcopy_step.getOutput(), "mygbarom.gba");
    install_bin_step.step.dependOn(&objcopy_step.step);
    b.getInstallStep().dependOn(&install_bin_step.step);
    b.installArtifact(exe);
}
./gba.ld:
SECTIONS {
    .gbaheader : {
        KEEP(*(.gbaheader))
    }
    .text : {
        *(.text)
    }
    .ARM.exidx : {
        *(.ARM.exidx)
    }
}