How to Create a GBA ROM from Scratch with Zig?

Note: this article was originally written in French, then translated with the help of LLMs. You can read the French version here, but this English version is now the one I mainly update.

Introduction #

For the past few months, I’ve been interested in developing games for the Game Boy Advance.

There are many tools available to build GBA ROMs:

There are also many tutorials on creating GBA ROMs. TONC is the most well-known (and the one I personally followed). However, these tutorials generally assume the use of a library that handles the build and initialization of the ROM. The user doesn’t take control at the entry point but rather later, in a clean state that abstracts most of the initialization issues.

It seems to me that writing a ROM for the Game Boy Advance using only Zig and its build system is a good introduction to system and embedded programming. Without diving into the complexities of virtual memory, the kernel, and x86, this project allows us to understand these different topics:

  • how a simple CPU works;
  • how it executes code;
  • how to write code for this type of machine;
  • what new constraints this environment imposes;
  • what ARM assembly looks like;
  • how to build a bare-metal program.

A fairly comprehensive program with a rather fun and motivating subject: how to program the GBA games of our childhood.

This tutorial is aimed at readers who already know how to program but may have never done low-level programming. We will go through the concepts one by one to eventually reach a very simple while loop, allowing you to then start a tutorial like TONC with your own low-level library.

Why Zig? #

I don’t have much experience with Zig. After programming for a few weeks for the GBA in Rust with the agb crate, I wanted to try a simpler language to better understand the inner workings of the GBA and to avoid being distracted by a language that encourages experimentation and the pursuit of perfection.

Zig is a language that has intrigued me for some time. The promise of a system language for the 21st century, simple and transparent. A modern C with optionals, Rust-like error handling, explicit allocators, and above all, a coherent build system integrated into the language.

So I looked at which library I could use. I quickly spotted ZigGBA, a project that is already quite complete, accompanied by a blog post that I invite you to read.

It was while reading the source code of ZigGBA, more concise than that of agb, that curiosity struck me. I was reading the content of the build system, trying to understand the linker script, and then I realized that the build process of ZigGBA is much easier for me to understand than that of agb.

I want to clarify that I am neither an expert in Rust nor an expert in Zig. I am only speaking from my own knowledge, and I imagine that for a developer accustomed to Rust’s build system and concepts, all this seems simpler. But for me, it is really this simplicity that led me to explore the problem of writing GBA programs from scratch.

All this to say that this project could be done in C, Rust, or, I imagine, any other compiled language with more or less complications, but Zig seems perfect to me for discovering system and embedded programming without having to struggle with the dated quirks of C and with a simple and modern language.

Programming for the Game Boy Advance #

The main difference between programming on the GBA and the programming you may be used to (programming for Linux, the web, or smartphones) is that usually, you are not alone on the machine where your program runs, and you do not know the details of the machine on which it runs.

On your Linux machine, your program is not the only one running: you also have a browser, terminal emulators, maybe a music player. All these programs need to share your machine’s resources: some memory here, a few processor cycles there.

You may have noticed that as a developer, you didn’t have to worry about these considerations. From your point of view, memory is infinite and belongs to you, the processor is yours, and you will never be asked to think about the other programs running on your machine. This is because this role is assigned to your operating system.

It is the OS that must ensure process compartmentalization, i.e., guarantee that a process cannot access the execution environment or memory of another process.

It is also the OS that handles resource sharing: it distributes memory to processes that request it, orchestrates process execution, ensuring that each gets some execution time.

It is always the OS that allows you to write software without having to think about the hardware on which you will run it. No matter the brand of your screen, whether your mouse is ball-based or Bluetooth, whether your RAM is DDR3 or 4, or even if you are in a virtual machine, your program will be the same. This is thanks to the abstraction layer that your OS offers between you and the hardware. This allows you to write portable programs, to distribute your software to everyone using the same OS as you.

flowchart TD
    Browser["Browser"] <--> OS
    Terminal["Terminal"] <--> OS
    MusicPlayer["Music Player"] <--> OS
    YourProgram["Your Program"] <--> OS

    subgraph OS["Operating System"]
    end

    OS <--> Hardware

    subgraph Hardware
        Memory["Memory"]
        Display["Display"]
        Input["Input Devices"]
        Drive["Hard drives"]
    end

This is where the biggest difference with programming for the GBA lies: once the console’s BIOS has loaded your program, you are alone on board. No OS to allocate memory for you, no OS to abstract the console’s resources, no OS to launch multiple programs, no OS to catch crashes. You are launched without a safety net into the deep end! From afar, it may seem complicated, but in reality, the Game Boy Advance’s architecture is relatively simple. It is therefore the perfect gateway to low-level programming.

flowchart BT
    YourProgram["Your Program"]

    subgraph Hardware
        RAM["RAM"]
        VRAM["Screen"]
        Input["Buttons"]
        Sound["Sound channels"]
    end

    YourProgram --->|Memory Reads and Writes| Hardware
    Hardware -->|Interrupts| YourProgram

Before Starting #

What are the prerequisites? #

I want to clarify that this tutorial does not require you to own a Game Boy Advance. To be honest, I don’t know where I put mine, and I don’t have a flash cartridge that would allow me to run a custom program.

Personally, I use the mGBA emulator. There are others, but this one works well.

We will also use readelf, hexdump, and gdb as part of this tutorial. These are tools that will be useful for understanding the behavior of our compiler and our program. They are not mandatory to follow the process, but they are useful tools.

You will also need to install Zig on your machine.

This tutorial assumes:

  • basic programming knowledge;
  • familiarity with a compiled language like Rust, C, or C++;
  • some proficiency with Linux and its console.

We can begin!

Where to start? #

My goal was to reach the first step of the TONC tutorial, which is to display three colored dots on the screen. It’s the equivalent of a “Hello, World!” but for a console that has its own screen.

The result may seem basic, but the path to get there is interesting, because we will have to understand how the GBA processor accesses the cartridge’s code and executes our program, how to build our cartridge so that it is recognized by the GBA, and how to display pixels on a screen.

But then, where to start? I suggest starting with a minimal program: an infinite loop.

Let’s start by understanding how the GBA works.

The Game Boy Advance #

The Game Boy Advance is a portable console sold by Nintendo starting in 2001. Thought of as the evolution of the Game Boy, the GBA makes a real technical leap by moving from an 8-bit Sharp SM83 processor to a 32-bit ARM7TDMI (even though the ARM7 was already at the end of its life). However, the GBA still contains an SM83 to ensure backward compatibility with the Game Boy.

The ARM7TDMI supports two instruction sets: 32-bit ARM instructions and compact 16-bit THUMB instructions, which allow our program to take up less space.

The GBA also includes memory:

  • 16 KB of BIOS
  • 288 KB of RAM
  • 96 KB of video memory
  • 1 KB of memory for sprite management
  • 1 KB of memory for the palette

Sound:

  • 4 analog channels
  • 2 digital channels

Buttons:

  • the famous D-Pad, so 4 directional buttons
  • 6 buttons A, B, Start, Select, L, and R

And especially a cartridge port:

  • GBA cartridge, max. 32 MB ROM + max. 64 KB SRAM
  • Game Boy cartridge, max 32 KB ROM + 8 KB SRAM (and more with banking)

There is also a serial port (for the famous link cable).

Memory map #

If you have done some programming, you probably have an idea of what computer memory is: a large array of bytes that you can read and write. Each byte has an address, whose size depends on the processor.

For ARM7TDMI, addresses are encoded on 32 bits. This means hypothetically, our program can access addresses 0x0000_0000 to 0xFFFF_FFFF.

But wait, 0xFFFF_FFFF is 4294967295! So we would have access to 4 GiB of RAM? But the previous technical sheet only shows 288 KB, so what is going on?

The trick is that each address does not necessarily represent a physical memory cell. One address range can be used to access RAM, another to access the BIOS ROM, yet another to access the video memory that allows pixels to be displayed on the screen. And finally, the vast majority of address ranges are… unused! Reading or writing them does nothing logical or useful (and at worst causes a plain crash).

Here is the memory map of the Game Boy Advance (copied from GBATEK):

Start Addr End Addr Usage
0x0000_0000 0x0000_3FFF BIOS - System ROM (16 KBytes)
0x0000_4000 0x01FF_FFFF Unused
0x0200_0000 0x0203_FFFF WRAM - On-board Work RAM (256 KBytes, 2 Wait)
0x0204_0000 0x02FF_FFFF Unused
0x0300_0000 0x0300_7FFF WRAM - On-chip Work RAM (32 KBytes)
0x0300_8000 0x03FF_FFFF Unused
0x0400_0000 0x0400_03FE I/O Registers
0x0400_0400 0x04FF_FFFF Unused
0x0500_0000 0x0500_03FF BG/OBJ Palette RAM (1 Kbyte)
0x0500_0400 0x05FF_FFFF Unused
0x0600_0000 0x0601_7FFF VRAM - Video RAM (96 KBytes)
0x0601_8000 0x06FF_FFFF Unused
0x0700_0000 0x0700_03FF OAM - OBJ Attributes (1 Kbyte)
0x0700_0400 0x07FF_FFFF Unused
0x0800_0000 0x09FF_FFFF Game Pak ROM/FlashROM (max 32MB) - Wait State 0
0x0A00_0000 0x0BFF_FFFF Game Pak ROM/FlashROM (max 32MB) - Wait State 1
0x0C00_0000 0x0DFF_FFFF Game Pak ROM/FlashROM (max 32MB) - Wait State 2
0x0E00_0000 0x0E00_FFFF Game Pak SRAM (max 64 KBytes, 8-bit Bus width)
0x0E01_0000 0x0FFF_FFFF Unused
0x1000_0000 0xFFFF_FFFF Unused

As you can see, the vast majority of addresses are not used at all.

But then where is our code? It is in the cartridge, a removable, external device. The memory range allocated to external memory is 0x0800_0000-0x0E00_FFFF. But what does this mean? It means that when the processor accesses address 0x0800_0000, it reads not its internal memory but the cartridge’s ROM at offset 0x0000_0000.
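To make the table more concrete, here is a minimal sketch in Zig that classifies an address and, for the cartridge windows, computes the matching offset inside the ROM. The names Region, regionOf, and romOffset are made up for this illustration, and the sketch assumes that the three Game Pak windows expose the same ROM and differ only in their wait states.

```zig
const std = @import("std");

// Hypothetical names for the regions listed in the memory map above.
const Region = enum { bios, ewram, iwram, io, palette, vram, oam, rom, sram, unused };

// Classifies a 32-bit address using the ranges from the table.
fn regionOf(addr: u32) Region {
    return switch (addr) {
        0x0000_0000...0x0000_3FFF => .bios,
        0x0200_0000...0x0203_FFFF => .ewram,
        0x0300_0000...0x0300_7FFF => .iwram,
        0x0400_0000...0x0400_03FE => .io,
        0x0500_0000...0x0500_03FF => .palette,
        0x0600_0000...0x0601_7FFF => .vram,
        0x0700_0000...0x0700_03FF => .oam,
        0x0800_0000...0x0DFF_FFFF => .rom,
        0x0E00_0000...0x0E00_FFFF => .sram,
        else => .unused,
    };
}

// Offset inside the cartridge ROM, or null outside the ROM windows.
// Assumption: the three 32 MB windows mirror the same ROM.
fn romOffset(addr: u32) ?u32 {
    if (regionOf(addr) != .rom) return null;
    return (addr - 0x0800_0000) % 0x0200_0000;
}

test "address decoding" {
    try std.testing.expectEqual(Region.rom, regionOf(0x0800_0000));
    try std.testing.expectEqual(Region.unused, regionOf(0x0000_4000));
    try std.testing.expectEqual(@as(?u32, 0), romOffset(0x0800_0000));
    try std.testing.expectEqual(@as(?u32, null), romOffset(0x0200_0000));
}
```

Saved in a file of its own, this can be checked with zig test. On real hardware this decoding is of course done by the bus, not by software.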

Execution #

Now that we know how the GBA’s memory works, how does the processor execute code?

First thing: the code is stored in memory. Yes, the code has an address, and that’s how the processor accesses it.

To understand, you need to know that the processor itself has very little memory. It is very small because it must be very fast, as this is the memory the processor uses to perform its various operations. ARM7TDMI has 37 registers of 32 bits each, so literally 148 bytes of memory (37 × 4 bytes).

Register 15 contains the Program Counter (PC). This register contains the address of the instruction the processor is currently executing.

This is not entirely true. The register contains the address of the current instruction plus two instructions. On other architectures, PC usually holds the address of the next instruction; on an ARM7TDMI processor, it is two instructions ahead. This is not very important for understanding this chapter, but it will become important later, so keep this in mind.
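As a small worked example, "two instructions ahead" translates into a different byte offset depending on the instruction set: ARM instructions are 4 bytes wide and THUMB instructions 2 bytes wide. The helper name pcAsSeenBy is hypothetical; this is only a sketch of the arithmetic.

```zig
const std = @import("std");

// Value read from r15 (PC) while the instruction at `instr_addr` executes:
// +8 in ARM state (2 x 4 bytes), +4 in THUMB state (2 x 2 bytes).
fn pcAsSeenBy(instr_addr: u32, thumb: bool) u32 {
    const ahead: u32 = if (thumb) 4 else 8;
    return instr_addr + ahead;
}

test "PC is two instructions ahead" {
    try std.testing.expectEqual(@as(u32, 0x0800_0008), pcAsSeenBy(0x0800_0000, false));
    try std.testing.expectEqual(@as(u32, 0x0800_0004), pcAsSeenBy(0x0800_0000, true));
}
```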

At console startup, the PC is at 0x0000_0000. The processor will therefore read the instruction at address 0x0000_0000, execute it, then read the next instruction, and so on. The instruction itself can modify the PC register, allowing for conditional jumps or loops.

So you just need to put your code at 0x0000_0000 for it to be executed? In theory yes, but 0x0000_0000-0x0000_3FFF is the BIOS range of the console. The BIOS is a program that allows the console to initialize itself. This program is stored in a ROM and is therefore not editable. It is actually the BIOS that will hand over to our program by jumping to address 0x0800_0000.

We just need to place our infinite loop at the very beginning of our cartridge, and the Game Boy Advance BIOS will take care of handing over to us.

Objective: Infinite Loop #

Our First Executable #

Well, now that it’s clear, we know what we have to do: write our program! Let’s go naively and see where it takes us. First, let’s create a new Zig project.

mkdir mygbarom
cd mygbarom
mkdir src

From there, we can add ./src/main.zig:

pub fn main() void {
  while (true) {}
}

Even if you don’t know Zig, this shouldn’t shock you.

We can also add ./build.zig:

const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    const exe_mod = b.createModule(.{
        .root_source_file = b.path("src/main.zig"),
        .target = target,
        .optimize = optimize,
    });

    const exe = b.addExecutable(.{
        .name = "mygbarom",
        .root_module = exe_mod,
    });

    b.installArtifact(exe);
}

This is the file that Zig executes to build the program. It’s equivalent to a Makefile or a CMake config. I took the base file generated by zig init and removed the superfluous.

To elaborate a bit, the purpose of our build function is to install our executable in ./zig-out/bin/. To do this, we first need to define the target of our compilation, which is our target variable. Then, we need to define our optimization level with our optimize variable. exe_mod contains our module. Finally, exe defines our final executable, which will be named “mygbarom”. This final executable will be installed in ./zig-out/bin/ thanks to b.installArtifact(exe).

A small clarification, I am not a Zig expert, even less an expert in its build system. Take what I say with a grain of salt and feel free to report any errors.

We can now build the project and run the executable.

zig build
./zig-out/bin/mygbarom

Normally, the program will just do nothing without terminating; you can stop it with CTRL+C.

Great! We have the behavior we wanted: an infinite loop.

Well, now we can test with mGBA.

$ mgba zig-out/bin/mygbarom
Could not run game. Are you sure the file exists and is a compatible game?

Yes, it couldn’t be that simple, as you might expect. But this is an interesting starting point for us because this path between the executable we just generated and the cartridge will gradually introduce us to the world of bare-metal programming, but it will also help us understand a bit more the world we come from (here, Linux on x86-64).

So, if our executable is not a GBA cartridge, then what is it?

$ file zig-out/bin/mygbarom
mygbarom: ELF 64-bit LSB executable, x86-64, version 1 (SYSV),
statically linked, with debug_info, not stripped

An ELF file? Let’s see what that is.

The ELF File Format #

ELF (Executable and Linkable Format) is the standard executable file format on Linux and many modern UNIX systems. It is a format that stores not only the executable code of a program but also data, symbols, and information on how the program should be loaded into memory.

An ELF file is divided into several sections. The main ones are:

  • .text: contains the program’s executable code
  • .data: contains initialized variables
  • .bss: reserves space for uninitialized variables
  • .rodata: read-only data (like constant strings)

There are also more specific sections, such as .ARM.exidx, which we will encounter once we target ARM and which holds exception-handling information for that architecture.

When you execute a program on Linux, the operating system loads the ELF file into memory, following the instructions contained in its header, then positions the program counter (PC) register at the specified entry point.

However, the GBA does not understand this format. It expects raw code that it can execute directly, without a complex loading phase. This is why we will need to extract only the necessary code and data from our ELF file to create our GBA ROM.

Let’s compare this with our goal: on a GBA, we need our code to be placed directly in the cartridge’s ROM memory, at specific addresses and in a very specific format. The ELF format is far too complex with its headers and multiple sections to be used directly.

Let’s see how to transform our ELF file into a raw binary suitable for the GBA.

From ELF File to .gba #

So what is wrong?

  1. Our executable contains x86-64 machine code, not ARM.
  2. Our executable uses ELF, a format not recognized by our GBA.

Setting the Target #

Let’s first resolve the assembly issue.

In ./build.zig, replace target with:

const target = std.Target.Query{
    .cpu_arch = .thumb,
    .cpu_model = .{ .explicit = &std.Target.arm.cpu.arm7tdmi },
    .os_tag = .freestanding,
};

Now, the Zig compiler will target the ARM7TDMI processor and write in THUMB assembly, which, as a reminder, is a compact ARM assembly with 16-bit instructions, very practical for GBA programming.

Next, update the executable module:

const exe_mod = b.createModule(.{
    .root_source_file = b.path("src/main.zig"),
    .target = b.resolveTargetQuery(target), // here
    .optimize = optimize,
});

Let’s try to build:

$ zig build
install
└─ install mygbarom
   └─ zig build-exe foo Debug thumb-freestanding failure
error: warning(link): unexpected LLD stderr:
ld.lld: warning: cannot find entry symbol _start; not setting start address

Ah, an error. Here we are confronted with the first lie of the wonderful world of programming: main is not the program’s entry point. By convention, the entry point is _start. This symbol represents the address where the program execution begins. The code at this address, after potentially performing the program’s initialization, hands over to the main function.

Here, the linker cannot find the _start symbol in our program. The reason is simple: we asked Zig to use the freestanding flag for the OS, which means we are not targeting any OS. Zig will therefore let us manually add _start to our program and export it.

So how do we add our entry point?

export fn _start() noreturn {
    while (true) {}
}

We just replaced main with _start. We also added export, which makes this symbol visible and allows it to be considered the entry point. The function’s return type is now noreturn, which allows the Zig compiler to verify that our function never returns.

noreturn ensures that our pointer in the PC register does not “escape.” Hypothetically, PC could continue to increment after our program, and thus fall on random values until it crashes. We must ensure that whatever happens, PC remains confined to the boundaries of our code (for example, with an infinite loop).

If we zig build, the build should finish successfully! If we use readelf, we have:

$ readelf mygbarom -h
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0x54261
  Start of program headers:          52 (bytes into file)
  Start of section headers:          1456952 (bytes into file)
  Flags:                             0x5000200, Version5 EABI, soft-float ABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         6
  Size of section headers:           40 (bytes)
  Number of section headers:         21
  Section header string table index: 19

We can see that we indeed have an ARM file! Congratulations!

Cleaning Up Our ELF File #

Let’s take a closer look at our ELF file.

$ readelf --sections zig-out/bin/mygbarom
There are 21 section headers, starting at offset 0x163b38:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .ARM.exidx        ARM_EXIDX       000100f4 0000f4 000d90 00  AL  4   0  4
  [ 2] .ARM.extab        PROGBITS        00010e84 000e84 000888 00   A  0   0  4
  [ 3] .rodata           PROGBITS        00011710 001710 0066d8 00 AMS  0   0  8
  [ 4] .text             PROGBITS        00027de8 007de8 0587f2 00  AX  0   0  4
  [ 5] .data             PROGBITS        000905dc 0605dc 000004 00  WA  0   0  4
  [ 6] .bss              NOBITS          00090600 0605e0 001000 00  WA  0   0 64
  [ 7] .debug_loc        PROGBITS        00000000 0605e0 0775ed 00      0   0  1
  [ 8] .debug_abbrev     PROGBITS        00000000 0d7bcd 0006e7 00      0   0  1
  [ 9] .debug_info       PROGBITS        00000000 0d82b4 03434c 00      0   0  1
  [10] .debug_ranges     PROGBITS        00000000 10c600 008938 00      0   0  1
  [11] .debug_str        PROGBITS        00000000 114f38 00ee7d 01  MS  0   0  1
  [12] .debug_pubnames   PROGBITS        00000000 123db5 0051ea 00      0   0  1
  [13] .debug_pubtypes   PROGBITS        00000000 128f9f 0019d6 00      0   0  1
  [14] .ARM.attributes   ARM_ATTRIBUTES  00000000 12a975 00003a 00      0   0  1
  [15] .debug_frame      PROGBITS        00000000 12a9b0 005e70 00      0   0  4
  [16] .debug_line       PROGBITS        00000000 130820 024313 00      0   0  1
  [17] .comment          PROGBITS        00000000 154b33 000013 01  MS  0   0  1
  [18] .symtab           SYMTAB          00000000 154b48 009c00 10     20 1988  4
  [19] .shstrtab         STRTAB          00000000 15e748 0000da 00      0   0  1
  [20] .strtab           STRTAB          00000000 15e822 005314 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  D (mbind), y (purecode), p (processor specific)

As a reminder: the .text section contains the code, while the .data, .rodata, and .bss sections contain the data.

Let’s look at our .text section: size = 0x0587f2? That’s huge! How is it that our code section is so large when we just implemented a while loop?

The answer is simple: we ran zig build in debug mode. We told Zig to keep debug symbols, safety checks, and library symbols, and not to optimize.

Here’s how to tell the Zig compiler to remove all that, in ./build.zig:

pub fn build(b: *std.Build) void {
  // ...
  const optimize = .ReleaseSmall;
  // ...
}

The .ReleaseSmall optimization allows us to optimize the size of our code, minimizing the number of instructions.

I tested other optimization modes (notably .ReleaseSafe), and I encountered errors and unrecognized instructions. I haven’t identified the problem, but everything works well in .ReleaseSmall, so I won’t change it for now. I’ll do some research later.

Let’s build:

$ zig build
$ readelf --sections zig-out/bin/mygbarom
There are 6 section headers, starting at offset 0x16c:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .ARM.exidx        ARM_EXIDX       000100d4 0000d4 000010 00  AL  2   0  4
  [ 2] .text             PROGBITS        000200e4 0000e4 000004 00  AX  0   0  4
  [ 3] .ARM.attributes   ARM_ATTRIBUTES  00000000 0000e8 00003a 00      0   0  1
  [ 4] .comment          PROGBITS        00000000 000122 000013 01  MS  0   0  1
  [ 5] .shstrtab         STRTAB          00000000 000135 000035 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  D (mbind), y (purecode), p (processor specific)

And there we go! We now have a clean ELF file!

From ELF File to Binary #

Now that we have a cleaner ELF file, will it be recognized by mGBA?

$ mgba zig-out/bin/mygbarom.gba
Could not run game. Are you sure the file exists and is a compatible game?

As mentioned earlier, the Game Boy Advance does not recognize the ELF format. Why would it? Reminder: the Game Boy Advance has no OS. It doesn’t need a standard portable format, it doesn’t need a section table, relocation table, dynamic libraries, etc. We need to remove all the superfluous from the ELF format to keep only the essential: our program.

Note: mGBA has an option to load ELF files. Of course, this is not a native capability of the GBA, so we will not use it.

Here’s how to do it, in ./build.zig:

pub fn build(b: *std.Build) void {
  // ...

  const objcopy_step = exe.addObjCopy(.{ .format = .bin });
  const install_bin_step = b.addInstallBinFile(objcopy_step.getOutput(), "mygbarom.gba");

  install_bin_step.step.dependOn(&objcopy_step.step);
  b.getInstallStep().dependOn(&install_bin_step.step);

  // ...
}

We use the objcopy tool to extract our code from the ELF file, in the .bin format, i.e., raw code without file headers or unloaded sections. In our case, only the .text and .ARM.exidx sections will be copied into mygbarom.gba.

Normally, we can find the binary in ./zig-out/bin/mygbarom.gba. Let’s see what we find in there.

$ hexdump zig-out/bin/mygbarom.gba
0000000 0010 0001 b0b0 80b0 000c 0001 0001 0000
0000010 0000 0000 0000 0000 0000 0000 0000 0000
*
0010010 e7fe 4770
0010014

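For those unfamiliar with hexdump, it’s a tool that allows you to see the contents of a file in hexadecimal. So what do we see? The left column shows the offset, a bit like a line number. The other eight columns show the contents of the binary as 16-bit words. Because hexdump reads each word in little-endian order here, the two bytes of a word appear swapped compared to their order in the file (the bytes fe e7 are displayed as e7fe). The asterisk stands for repeated identical lines, here lines filled with zeros. But then why does our binary look like this? Let’s recheck our ELF file.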

$ readelf zig-out/bin/mygbarom -x 1

Hex dump of section '.ARM.exidx':
  0x000100d4 10000100 b0b0b080 0c000100 01000000 ................
$ readelf zig-out/bin/mygbarom -x 2

Hex dump of section '.text':
  0x000200e4 fee77047                            ..pG

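So what we see is that our binary consists of the contents of .ARM.exidx, followed by a long run of padding, then the contents of .text. The padding comes from the section addresses: the raw binary keeps the distance between sections, and .text (0x000200e4) sits 0x10010 bytes after .ARM.exidx (0x000100d4), which is exactly the offset where e7fe appears in the dump.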
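Both observations can be checked with a few lines of Zig. This is only a sketch: fileOffset is a hypothetical helper, the addresses are copied from the readelf output above, and it assumes that .ARM.exidx is the first section written to the file.

```zig
const std = @import("std");

// Section addresses taken from the readelf output above.
const exidx_addr: u32 = 0x0001_00d4;
const text_addr: u32 = 0x0002_00e4;

// In a raw binary, a section's file offset is its address minus the
// lowest section address (assumption: no loaded section comes before it).
fn fileOffset(section_addr: u32, lowest_addr: u32) u32 {
    return section_addr - lowest_addr;
}

test "offset of .text in mygbarom.gba" {
    try std.testing.expectEqual(@as(u32, 0x10010), fileOffset(text_addr, exidx_addr));
}

test "hexdump shows little-endian 16-bit words" {
    const bytes = [2]u8{ 0xfe, 0xe7 }; // first two bytes of .text, in file order
    try std.testing.expectEqual(@as(u16, 0xe7fe), std.mem.readInt(u16, &bytes, .little));
}
```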

Good! We have our binary!

The Cartridge Format #

Let’s see how mGBA reacts to our masterpiece:

$ mgba zig-out/bin/mygbarom.gba
Could not run game. Are you sure the file exists and is a compatible game?

Failed again. So what’s the problem? Once again, I lied when I said the Game Boy Advance doesn’t expect a standard format. That’s false. The console, before launching the game, will check the contents of a header, which must precede our program. Here is its structure:

const Header = extern struct {
    entry_point: u32,
    nintendo_logo: [156]u8,
    game_name: [12]u8,
    game_code: [4]u8,
    maker_code: [2]u8,
    fixed_value: u8,
    unit_code: u8,
    device_type: u8,
    reserved1: [7]u8,
    software_version: u8,
    complement_check: u8,
    reserved2: [2]u8,
};

The first entry of the header is entry_point: it’s an ARM instruction, the first one that will be executed by the processor. This instruction must be positioned at 0x0800_0000 in the GBA’s memory. This instruction is often a simple “jump,” a jump to an address located after the end of the header.
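To give an idea of what this jump looks like, here is a minimal sketch that builds the instruction word, assuming the standard encoding of the ARM B instruction (condition "always" in the top bits, then a 24-bit offset counted in 4-byte words, relative to a PC that reads two instructions ahead). armBranch is a name chosen for this example. With a header of 192 bytes (the sum of the field sizes above), the first byte after the header is at 0x0800_00C0.

```zig
const std = @import("std");

// Encodes "B <to>" for an ARM instruction located at address `from`.
fn armBranch(from: u32, to: u32) u32 {
    // In ARM state, PC reads as `from + 8` when the branch executes.
    const delta: i32 = @bitCast(to -% from -% 8);
    // The offset is stored in words (bytes / 4), truncated to 24 bits.
    const imm24: u32 = @as(u32, @bitCast(delta >> 2)) & 0x00FF_FFFF;
    return 0xEA00_0000 | imm24;
}

test "jump over a 192-byte header" {
    try std.testing.expectEqual(@as(u32, 0xEA00_002E), armBranch(0x0800_0000, 0x0800_00C0));
}
```

Since the GBA is little-endian, the 0xEA byte of this word ends up as the fourth byte of the file.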

At startup, the GBA will check two values in the header: nintendo_logo and complement_check. nintendo_logo must contain the Nintendo logo bitmap, bit for bit. This is a way to prevent the commercialization of unofficial cartridges because the logo is copyrighted, and Nintendo can sue companies distributing their logo without authorization. complement_check is a checksum, a value calculated from part of the header (bytes 0xA0 to 0xBC, from game_name to software_version), which allows the console to detect a corrupted or modified header.
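The checksum itself is short to compute. The sketch below follows the algorithm documented in GBATEK (subtract every byte from 0xA0 to 0xBC from zero, then subtract 0x19, keeping only 8 bits). complementCheck is a hypothetical helper name, and the function assumes the slice holds at least the 192-byte header.

```zig
const std = @import("std");

// Header checksum over bytes 0xA0..0xBC (inclusive), as described in GBATEK.
// Assumption: `rom` is at least 0xBD bytes long.
fn complementCheck(rom: []const u8) u8 {
    var chk: u8 = 0;
    for (rom[0xA0..0xBD]) |byte| chk -%= byte; // wrapping subtraction
    return chk -% 0x19;
}

test "checksum of an otherwise empty header" {
    var header = [_]u8{0} ** 192;
    header[0xB2] = 0x96; // fixed_value
    try std.testing.expectEqual(@as(u8, 0x51), complementCheck(&header));
}
```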

Most emulators do not check these values (to be able to read unofficial ROMs and for convenience for those like us who just want to debug their game without calculating a checksum). However, mGBA checks two values to determine if what we give it is indeed a GBA cartridge:

  1. The fourth byte of entry_point (the most significant byte of the instruction, since the GBA is little-endian), which should be equal to 0xEA.
  2. fixed_value, which should be equal to 0x96.
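Expressed as code, these two checks could look like the sketch below. It only mirrors the description above and is not mGBA’s actual implementation; the name looksLikeGbaRom and the minimum-length test are assumptions added for the example.

```zig
const std = @import("std");

// Mirrors the two checks described above; not mGBA's actual code.
fn looksLikeGbaRom(rom: []const u8) bool {
    if (rom.len < 0xC0) return false; // assumption: too short to hold a header
    return rom[3] == 0xEA and rom[0xB2] == 0x96;
}

test "minimal header passes both checks" {
    var rom = [_]u8{0} ** 192;
    try std.testing.expect(!looksLikeGbaRom(&rom));

    std.mem.writeInt(u32, rom[0..4], 0xEA00_002E, .little); // entry_point
    rom[0xB2] = 0x96; // fixed_value
    try std.testing.expect(looksLikeGbaRom(&rom));
}
```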

Let’s try adding this header to our source file ./src/main.zig:

const Header = extern struct {
    entry_point: u32 = 0xEA00002E,
    nintendo_logo: [156]u8 = @splat(0x00),
    game_name: [12]u8 = @splat(0x00),
    game_code: [4]u8 = @splat(0x00),
    maker_code: [2]u8 = @splat(0x00),
    fixed_value: u8 = 0x96,
    unit_code: u8 = 0x00,
    device_type: u8 = 0x00,
    reserved1: [7]u8 = @splat(0x00),
    software_version: u8 = 0x00,
    complement_check: u8 = 0x00,
    reserved2: [2]u8 = @splat(0x00),
};

const header = Header{};

export fn _start() noreturn {
    while (true) {}
}

So we declared a Header structure, with the extern keyword that tells the compiler to respect the order of the fields.

To be more precise, extern allows respecting what is called the “C ABI.” ABI stands for Application Binary Interface. It’s a convention, a standard that allows sharing (among other things) the same method of struct construction, ensuring that two libraries possibly compiled separately can still share the same structures. When we use extern in Zig, we are actually telling the compiler to respect a standard that will allow other programs (possibly not written in Zig) to interact with ours.

If we zig build, we can see that absolutely nothing has changed in our ELF file or binary file.

I imagine the Zig compiler does not include the header in the program because it is never used.

We need to force the compiler and linker to place this header exactly at the very beginning of the file, then our code right after. To do this, we will need to give directives to the linker using a linker script.

Alignment #

We talked a bit about struct construction standards in the previous section, but we need to go further and address alignment.

By default, in the C ABI, structures are not necessarily laid out to minimize the space they occupy. They rather adopt a layout that optimizes access speed.

In general, memory accesses for data larger than 8 bits must be aligned, meaning the address must be a multiple of the data size in bytes.

For example, a 16-bit integer (2 bytes) must be written at an address divisible by 2 (0, 2, 4, 0x100, and so on).

Imagine the following structure:

const Example = extern struct {
    a: u8,
    b: u16,
    c: u32,
};

How will this structure be represented in memory?

Naively, we could construct it like this:

block-beta
  a:2
  b:4
  c:8

But let’s see how this structure can be viewed when aligned to 32 bits.

block-beta
  columns 8
  a:2
  b:4
  c1["c"]:2
  c2["c"]:6

Imagine we want to access b. Since it’s 16-bit data stored at an odd byte offset, we cannot access it directly: we need to use an aligned address. We must therefore read the 32 bits that include a, b, and a part of c.

block-beta
  columns 8
  a:2
  b:4
  c1["c"]:2

Then we need to shift the bits.

block-beta
  columns 8
  b:4
  c:2
  z["0"]:2

Finally, we need to zero out (or ignore) the 16 insignificant bits.

block-beta
  columns 8
  b:4
  z["0"]:4

That is a lot of steps to access a single value. And here, we only needed one memory access. For c, which straddles two 32-bit words, it would have required two.
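The steps above can be sketched in Zig. This is only an illustration: extractB is a hypothetical helper, and it assumes little-endian byte order (as on the GBA), so a lives in the lowest 8 bits of the word and the shift goes to the right.

```zig
const std = @import("std");

/// Extracts the 16-bit field `b` from the aligned 32-bit word that holds
/// `a`, `b` and the first byte of `c` in the unpadded layout.
/// Assumption: little-endian byte order, so `a` is the low byte of the word.
fn extractB(word: u32) u16 {
    // Shift `a` out, then keep only the low 16 bits (this drops the byte of `c`).
    return @truncate(word >> 8);
}

test "read an unaligned u16 with one aligned load, a shift and a mask" {
    // Bytes in memory: a = 0xAA, b = 0x1234 (low byte first), first byte of c.
    const bytes = [4]u8{ 0xAA, 0x34, 0x12, 0xCC };
    const word = std.mem.readInt(u32, &bytes, .little);
    try std.testing.expectEqual(@as(u16, 0x1234), extractB(word));
}
```

Saved in a file, this can be checked with zig test.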

How do compilers solve this problem? They add padding.

block-beta
  columns 4
  a:1
  p1["padding"]:1
  b:2
  c:4

Now, each field is aligned and accessible in a single instruction!
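We can ask the compiler where it actually put each field. A minimal sketch, assuming a target with the usual C ABI alignments (1 byte for u8, 2 for u16, 4 for u32):

```zig
const std = @import("std");

const Example = extern struct {
    a: u8,
    b: u16,
    c: u32,
};

test "extern struct layout includes one byte of padding" {
    try std.testing.expectEqual(@as(usize, 0), @offsetOf(Example, "a"));
    // `b` starts at offset 2, not 1: one padding byte was inserted after `a`.
    try std.testing.expectEqual(@as(usize, 2), @offsetOf(Example, "b"));
    try std.testing.expectEqual(@as(usize, 4), @offsetOf(Example, "c"));
    // 7 bytes of data, 8 bytes in memory.
    try std.testing.expectEqual(@as(usize, 8), @sizeOf(Example));
}
```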

How does this concern us? This way of adding padding in structures could make our header incorrect, non-compliant with what the Game Boy Advance expects.

We must therefore specify to the compiler not to modify the alignment of our data.

const Header = extern struct {
    entry_point: u32 align(1) = 0xEA00002E,
    nintendo_logo: [156]u8 align(1) = @splat(0),
    game_name: [12]u8 align(1) = @splat(0),
    game_code: [4]u8 align(1) = @splat(0),
    maker_code: [2]u8 align(1) = @splat(0),
    fixed_value: u8 align(1) = 0x96,
    unit_code: u8 align(1) = 0,
    device_type: u8 align(1) = 0,
    reserved1: [7]u8 align(1) = @splat(0),
    software_version: u8 align(1) = 0,
    complement_check: u8 align(1) = 0,
    reserved2: [2]u8 align(1) = @splat(0),
};

Note that on my machine, both versions of Header produce exactly the same result. This is (I think) because this particular header never runs into the alignment problems we discussed: the only u32 sits at offset 0, and every other field is a byte or an array of bytes, which can live at any address. But that is a bit of an “accident.” Adding align(1) ensures that our header will remain correct regardless of the compiler’s default behavior.
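Rather than trusting the layout, we can let the compiler check it for us. Here is a possible sketch: a comptime block that refuses to build if the header is not exactly 192 bytes (0xC0), or if fixed_value is not at offset 0xB2. Both numbers are simply the sums of the field sizes above. The error messages are my own wording.

```zig
const Header = extern struct {
    entry_point: u32 align(1) = 0xEA00002E,
    // `[_]u8{0} ** N` builds the same zero-filled arrays as `@splat(0)`.
    nintendo_logo: [156]u8 align(1) = [_]u8{0} ** 156,
    game_name: [12]u8 align(1) = [_]u8{0} ** 12,
    game_code: [4]u8 align(1) = [_]u8{0} ** 4,
    maker_code: [2]u8 align(1) = [_]u8{0} ** 2,
    fixed_value: u8 align(1) = 0x96,
    unit_code: u8 align(1) = 0,
    device_type: u8 align(1) = 0,
    reserved1: [7]u8 align(1) = [_]u8{0} ** 7,
    software_version: u8 align(1) = 0,
    complement_check: u8 align(1) = 0,
    reserved2: [2]u8 align(1) = [_]u8{0} ** 2,
};

// Evaluated at compile time: a wrong layout becomes a build error.
comptime {
    if (@sizeOf(Header) != 0xC0) {
        @compileError("the GBA header must be exactly 192 bytes");
    }
    if (@offsetOf(Header, "fixed_value") != 0xB2) {
        @compileError("fixed_value must sit at offset 0xB2");
    }
}
```

If a field is later resized by mistake, zig build fails immediately instead of producing a ROM that the console rejects.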

The Linker Script #

A quick point on the build process.

When we call zig build, we first call the compiler on each source file, which compiles it into an object file, an incomplete ELF file that only contains information specific to the source file. Then we call the linker, which will take all the object files and merge them into a single executable ELF file.

flowchart LR
        z1("main.zig")
        z2("foo.zig")
        z3("bar.zig")
        z1 -- "compile" --> o1("main.o")
        z2 -- "compile" --> o2("foo.o")
        z3 -- "compile" --> o3("bar.o")
        o1 -- link --> elf("out.elf")
        o2 -- link --> elf
        o3 -- link --> elf

The linker script is the set of instructions for this final link step. It’s a file that tells the linker how to construct an ELF file from the sections of the intermediate object files. A linker script is not mandatory: in most cases, the linker can manage without it. But in our case, we are forced to tell the linker that we want our header first.

Our goal will be to declare a .gbaheader section containing an instance of our Header structure and place this section first, before the .text section.

Let’s start by telling the Zig compiler that we want to place our header in our .gbaheader section, in ./src/main.zig:

export const header linksection(".gbaheader") = Header{};

Next, let’s create our linker script, ./gba.ld.

SECTIONS {
  .gbaheader : {
    KEEP(*(.gbaheader))
  }
  .text : {
    *(.text)
  }
  .ARM.exidx : {
    *(.ARM.exidx)
  }
}

So what did we just do? We declare the contents of the sections that our final ELF file will contain. For example, .text will contain all the .text sections from all the object files we are linking. In our case, there is only the one produced from main.zig, but there could be others that the linker assembles into a single ELF file. Same for .ARM.exidx, which the linker would otherwise insist on placing before our header.

For .gbaheader, we created a custom section in main.zig, which for now only exists in the generated object file. We declare a .gbaheader section in our final file, which contains all the .gbaheader input sections we have declared (normally just one). A small subtlety: we use the “function” KEEP to tell the linker to keep this section in the final file, even if nothing else references it.

The declaration order is important: we declare that .gbaheader is before .text and .ARM.exidx.

Last step: tell Zig that we are using a linker script, in ./build.zig:

pub fn build(b: *std.Build) void {
    // ...

    exe.setLinkerScript(.{ .src_path = .{
        .owner = b,
        .sub_path = "gba.ld",
    } });

    // ...
}

And there we go! Normally, after a zig build, you should be able to find:

$ readelf zig-out/bin/mygbarom --sections
There are 8 section headers, starting at offset 0x10180:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .gbaheader        PROGBITS        00010000 010000 0000c0 00   A  0   0  4
  [ 2] .text             PROGBITS        000100c0 0100c0 000002 00  AX  0   0  4
  [ 3] .text.__aeab[...] PROGBITS        000100c2 0100c2 000002 00  AX  0   0  2
  [ 4] .ARM.exidx        ARM_EXIDX       000100c4 0100c4 000010 00  AL  2   0  4
  [ 5] .ARM.attributes   ARM_ATTRIBUTES  00000000 0100d4 00003a 00      0   0  1
  [ 6] .comment          PROGBITS        00000000 01010e 000013 01  MS  0   0  1
  [ 7] .shstrtab         STRTAB          00000000 010121 00005d 00      0   0  1

Our section exists! Let’s see what it contains:

$ readelf zig-out/bin/mygbarom -x 1

Hex dump of section '.gbaheader':
  0x00010000 2e0000ea 00000000 00000000 00000000 ................
  0x00010010 00000000 00000000 00000000 00000000 ................
  0x00010020 00000000 00000000 00000000 00000000 ................
  0x00010030 00000000 00000000 00000000 00000000 ................
  0x00010040 00000000 00000000 00000000 00000000 ................
  0x00010050 00000000 00000000 00000000 00000000 ................
  0x00010060 00000000 00000000 00000000 00000000 ................
  0x00010070 00000000 00000000 00000000 00000000 ................
  0x00010080 00000000 00000000 00000000 00000000 ................
  0x00010090 00000000 00000000 00000000 00000000 ................
  0x000100a0 00000000 00000000 00000000 00000000 ................
  0x000100b0 00009600 00000000 00000000 00000000 ................

It’s indeed our header: we can clearly see the entry instruction 0xEA00002E (displayed here byte by byte, in little-endian order), and 0x96 near the end, in the last row.
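If the byte order looks confusing, here is a small sketch of what “little endian” means for our entry instruction. littleEndianBytes is a hypothetical helper written for this illustration; it only shows how a u32 is laid out in memory, least significant byte first.

```zig
const std = @import("std");

/// Returns the four bytes of `value` in the order they are stored in memory
/// on a little-endian machine such as the GBA.
fn littleEndianBytes(value: u32) [4]u8 {
    var buf: [4]u8 = undefined;
    std.mem.writeInt(u32, &buf, value, .little);
    return buf;
}

test "0xEA00002E is stored low byte first" {
    const bytes = littleEndianBytes(0xEA00002E);
    // Same order as the first four bytes of the readelf dump: 2e 00 00 ea.
    try std.testing.expectEqualSlices(u8, &[_]u8{ 0x2E, 0x00, 0x00, 0xEA }, &bytes);
}
```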

Now let’s look at our .gba binary:

$ hexdump zig-out/bin/mygbarom.gba
0000000 002e ea00 0000 0000 0000 0000 0000 0000
0000010 0000 0000 0000 0000 0000 0000 0000 0000
*
00000b0 0000 0096 0000 0000 0000 0000 0000 0000
00000c0 e7fe 4770 fffc 7fff b0b0 80b0 fff8 7fff
00000d0 0001 0000
00000d4

Total victory! We can see our file starting with 0xEA00_002E (still in little endian), our 0x96, then starting from offset 0x0000_00C0, our code!
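As a sanity check, offset 0xC0 is also where the branch stored in entry_point is supposed to land. Here is a minimal sketch of decoding an ARM B instruction, assuming the standard encoding: a signed 24-bit offset in the low bits, counted in 4-byte words, relative to the address of the instruction plus 8. armBranchTarget is a hypothetical helper, not something provided by Zig.

```zig
const std = @import("std");

/// Computes the destination of an ARM `B` instruction located at `addr`.
fn armBranchTarget(addr: u32, instr: u32) u32 {
    const imm24: u24 = @truncate(instr); // low 24 bits: the encoded offset
    const signed: i24 = @bitCast(imm24); // the offset is a signed value
    const offset: i32 = @as(i32, signed) * 4; // counted in 4-byte words
    // In ARM state, pc reads as the instruction address + 8.
    return addr +% 8 +% @as(u32, @bitCast(offset));
}

test "the header's entry instruction jumps just past the header" {
    // Relative to the start of the file...
    try std.testing.expectEqual(@as(u32, 0xC0), armBranchTarget(0, 0xEA00002E));
    // ...and once the cartridge is mapped at 0x0800_0000.
    try std.testing.expectEqual(@as(u32, 0x0800_00C0), armBranchTarget(0x0800_0000, 0xEA00002E));
}
```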

Last step, the most crucial: how will mGBA react?

mgba zig-out/bin/mygbarom.gba

If everything goes well, you should see a new white screen. This is the mGBA emulator successfully launched! Congratulations, mGBA has identified our .gba file as a cartridge!

At first glance, one might think everything went well: mGBA seems stuck simply because our code is an infinite loop.

But if we launch:

mgba-qt zig-out/bin/mygbarom.gba

…we encounter a mysterious error.

This game uses a BIOS call that is not implemented.
Please use the official BIOS for best experience.

then…

The game has crashed with the following error:

Jumped to invalid address: 0E0000C0

Now, that’s strange. If you already have the solution to this problem, congratulations, because it’s not so intuitive.

Switching from ARM to THUMB #

As mentioned earlier, the ARM7TDMI can execute two instruction sets:

  • ARM
  • THUMB

As a reminder, THUMB instructions are more compact (16 bits instead of 32) and therefore take up less space and less time to load.

To understand this last point, you need to understand that the cartridge’s ROM is accessible via a 16-bit bus. This means it can only be read 16 bits at a time. So an ARM instruction that takes 32 bits takes twice as long as a THUMB instruction. This is not the case for most other GBA memories.

And if we look at our file ./build.zig:

pub fn build(b: *std.Build) void {
    const target = std.Target.Query{
        .cpu_arch = .thumb, // it's thumb!
        .cpu_model = .{ .explicit = &std.Target.arm.cpu.arm7tdmi },
        .os_tag = .freestanding,
    };
    // ...
}

Our code compiles in THUMB! Our infinite while loop is made of THUMB instructions, but our processor starts out reading ARM instructions. We need a specific ARM instruction to switch to THUMB mode.

BX: branch and exchange instruction set.

Syntax: BX Rm where:

Rm is a register containing an address to branch to.

BX Rm derives the target instruction set from bit[0] of Rm:

  • If bit[0] of Rm is 0, the processor changes to, or remains in, ARM state.
  • If bit[0] of Rm is 1, the processor changes to, or remains in, Thumb state.

So to understand this well, this instruction is used to jump to a certain address while changing the instruction mode. BX relies on the fact that instruction addresses are always aligned to at least 2 bytes (4 in ARM, 2 in THUMB), so bit[0] of a valid address is always 0. That leaves this last bit free to choose between ARM and THUMB.

Example:

Imagine we want to jump to address 0xCAFE_0000 in THUMB. We will therefore need to put the value 0xCAFE_0001 in a register. This is unambiguous because 0xCAFE_0001 is odd, so it can never be the address of a real instruction, whether 32-bit or 16-bit: the processor clears bit[0] and jumps to 0xCAFE_0000.
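To make the rule concrete, here is a small model of what BX does with its operand. It is only a sketch of the behavior quoted above, not of the real hardware: State, BxResult and decodeBx are names invented for this illustration.

```zig
const std = @import("std");

const State = enum { arm, thumb };
const BxResult = struct { target: u32, state: State };

/// Models `BX Rm`: bit[0] selects the instruction set and is not part of
/// the address that is actually jumped to.
fn decodeBx(rm: u32) BxResult {
    if ((rm & 1) == 1) {
        // Thumb state: clear bit[0] to get the real, 2-byte-aligned address.
        return .{ .target = rm & ~@as(u32, 1), .state = .thumb };
    }
    return .{ .target = rm, .state = .arm };
}

test "bit 0 of the BX operand selects the instruction set" {
    const to_thumb = decodeBx(0xCAFE_0001);
    try std.testing.expectEqual(@as(u32, 0xCAFE_0000), to_thumb.target);
    try std.testing.expectEqual(State.thumb, to_thumb.state);

    const to_arm = decodeBx(0xCAFE_0000);
    try std.testing.expectEqual(@as(u32, 0xCAFE_0000), to_arm.target);
    try std.testing.expectEqual(State.arm, to_arm.state);
}
```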

How do we write ARM in our project? There are several ways, but we will try to go the simplest route and use Zig’s inline assembler function.

So in ./src/main.zig:

export fn _start() noreturn {
    asm volatile (
        \\.arm
        \\.cpu arm7tdmi
        \\add r0, pc, #1
        \\bx r0
    );
    while (true) {}
}

asm allows us to write assembly in Zig. volatile is a directive for the compiler to not optimize or move this piece of assembly, leaving it as is.

  • .arm declares that we are writing ARM assembly.
  • .cpu arm7tdmi declares that we are targeting an ARM7TDMI CPU.
  • add r0, pc, #1 is equivalent to r0 = pc + 1.
  • bx r0 allows jumping to the value contained in r0.

To understand what is happening here, you need to remember what we saw about the PC register. It does not contain the address of the current instruction but the address of the instruction that follows the next instruction.

block-beta
  columns 3
  n["Current Instruction N"] n1["N + S"] n2["N + 2S"]

With S = 4 for ARM instructions and S = 2 for THUMB instructions. We will therefore have PC = N + 2S.

So in our example, we have:

block-beta
  n["add r0, pc, #1"] n1["bx r0"] n2["our code..."]

So r0 will indeed contain the address of the start of our THUMB code… plus 1. Why +1? Because we want to switch to THUMB mode, so bit[0] of r0 must be equal to 1.
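The arithmetic can be written down as a quick check. This sketch assumes that our add instruction ends up at 0x0800_00C0, that is, right after the 0xC0-byte header once the cartridge is mapped at 0x0800_0000. pcSeenAt is a hypothetical helper.

```zig
const std = @import("std");

const arm_instruction_size: u32 = 4; // S = 4 in ARM state

/// Value read from `pc` by an ARM instruction located at `addr`:
/// the address of the instruction two slots ahead (N + 2S).
fn pcSeenAt(addr: u32) u32 {
    return addr + 2 * arm_instruction_size;
}

test "r0 points at the THUMB code, with bit 0 set" {
    const add_addr: u32 = 0x0800_00C0; // assumed address of `add r0, pc, #1`
    const r0 = pcSeenAt(add_addr) + 1;
    // 0x0800_00C8 is the first instruction after `bx r0`; +1 selects THUMB.
    try std.testing.expectEqual(@as(u32, 0x0800_00C9), r0);
}
```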

That’s it! Now just zig build and open mGBA. Normally, you should see the same white screen, but without errors and without crashes.

But how can we be sure everything is working as we want? We can use the debugger integrated into mgba-qt!

mgba-qt -g zig-out/bin/mygbarom.gba

The -g allows launching a GDB session, the GNU debugger you may know. More precisely, mGBA will launch a GDB server, which you can connect to with a GDB client.

$ gdb
(gdb) target remote localhost:2345 # connect to mgba gdb server
(gdb) layout asm # show asm TUI window
(gdb) layout reg # show registers TUI window
(gdb) stepi # go to next assembly instruction

Now, you can repeat the stepi command and see the GBA execute. As you can see, it starts at 0x0000_0000, immediately jumps to 0x0000_0354 (in the BIOS!), then after a few instructions, jumps to 0x0800_0000 (it’s our cartridge!) then to 0x0800_00C0 (after our header) and executes add r0, pc, #1 then bx r0. We can see that r0 = 0x0800_00C9, which corresponds well to the address 0x0800_00C8 where our THUMB code is located, plus 1 to switch to THUMB mode.

After the jump, we are therefore at 0x0800_00C8, and we are… Stuck! The instruction we are on is b.n 0x0800_00C8, i.e., “jump to address 0x0800_00C8.” So it’s a jump in place, an infinite loop. Exactly what we wanted.
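We can even verify by hand that this instruction jumps to itself. The halfword e7fe that appeared at offset 0xC0 in the earlier hexdump is the THUMB encoding of such a branch. Here is a minimal sketch of the decoding, assuming the standard THUMB unconditional branch format: a signed 11-bit offset, counted in halfwords, relative to the instruction address plus 4. thumbBranchTarget is a hypothetical helper.

```zig
const std = @import("std");

/// Computes the destination of a THUMB unconditional branch at `addr`.
fn thumbBranchTarget(addr: u32, instr: u16) u32 {
    const imm11: u11 = @truncate(instr); // low 11 bits: the encoded offset
    const signed: i11 = @bitCast(imm11); // 0x7FE becomes -2
    const offset: i32 = @as(i32, signed) * 2; // counted in 2-byte halfwords
    // In THUMB state, pc reads as the instruction address + 4.
    return addr +% 4 +% @as(u32, @bitCast(offset));
}

test "0xE7FE branches to its own address" {
    try std.testing.expectEqual(@as(u32, 0x0800_00C8), thumbBranchTarget(0x0800_00C8, 0xE7FE));
}
```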

Victory!

Conclusion #

We have successfully created a program for the Game Boy Advance that runs in an emulator. Congratulations if you managed to read or follow this far!

I want to clarify that we went quickly. I did not cover all the subtleties of the ELF format, linker scripts, and the functioning of the Game Boy Advance.

Nevertheless, we were able to cover fairly advanced topics together, and it’s not over! I intend to continue this tutorial to introduce memory-mapped registers, hardware interrupts, and DMA. We are therefore entering the part that is also covered by the TONC tutorial, and I will draw heavily from it.

My goal is not to teach you how to create a GBA game (although now you have a base to continue!), but to help you understand low-level concepts that are often hidden behind abstraction layers set up by your OS.

My goal is also to learn these concepts more deeply. These are topics I explored during my studies, and I enjoy rediscovering and sharing them.

If you think I said something wrong, feel free to contact me! Similarly, if there are inaccuracies or points you didn’t understand, it would help me improve this article.

To go further, I invite you to check out:

  • GBATEK (documentation on the GBA)
  • TONC (on GBA game programming)
  • ZigGBA (for programming in Zig on GBA)

Here is the result of this first (long) chapter.

$ tree .
.
├── build.zig
├── build.zig.zon
├── gba.ld
├── src
│   └── main.zig
└── zig-out
    └── bin
        ├── mygbarom
        ├── mygbarom.gba
        └── mygbarom.sav

4 directories, 7 files

./src/main.zig:

const Header = extern struct {
    entry_point: u32 align(1) = 0xEA00002E,
    nintendo_logo: [156]u8 align(1) = @splat(0x00),
    game_name: [12]u8 align(1) = @splat(0x00),
    game_code: [4]u8 align(1) = @splat(0x00),
    maker_code: [2]u8 align(1) = @splat(0x00),
    fixed_value: u8 align(1) = 0x96,
    unit_code: u8 align(1) = 0x00,
    device_type: u8 align(1) = 0x00,
    reserved1: [7]u8 align(1) = @splat(0x00),
    software_version: u8 align(1) = 0x00,
    complement_check: u8 align(1) = 0x00,
    reserved2: [2]u8 align(1) = @splat(0x00),
};

export const header linksection(".gbaheader") = Header{};

export fn _start() noreturn {
    asm volatile (
        \\.arm
        \\.cpu arm7tdmi
        \\add r0, pc, #1
        \\bx r0
    );
    while (true) {}
}

./build.zig:

const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = std.Target.Query{
        .cpu_arch = .thumb,
        .cpu_model = .{ .explicit = &std.Target.arm.cpu.arm7tdmi },
        .os_tag = .freestanding,
    };
    const optimize = .ReleaseSmall;

    const exe_mod = b.createModule(.{
        .root_source_file = b.path("src/main.zig"),
        .target = b.resolveTargetQuery(target),
        .optimize = optimize,
    });

    const exe = b.addExecutable(.{
        .name = "mygbarom",
        .root_module = exe_mod,
    });

    exe.setLinkerScript(.{ .src_path = .{
        .owner = b,
        .sub_path = "gba.ld",
    } });

    const objcopy_step = exe.addObjCopy(.{ .format = .bin });
    const install_bin_step = b.addInstallBinFile(objcopy_step.getOutput(), "mygbarom.gba");

    install_bin_step.step.dependOn(&objcopy_step.step);
    b.getInstallStep().dependOn(&install_bin_step.step);

    b.installArtifact(exe);
}

./gba.ld:

SECTIONS {
  .gbaheader : {
    KEEP(*(.gbaheader))
  }
  .text : {
    *(.text)
  }
  .ARM.exidx : {
    *(.ARM.exidx)
  }
}