General Purpose I/O (GPIO)
In this chapter, I discuss the GPIO pins that the Raspberry Pi uses to communicate with the outside world. Conveniently, one GPIO is wired to an LED on the board itself, so we can do without external hardware at first and still see a result. Among bare-metal programmers, a blinking LED is the closest thing to a "Hello World" program. In the first subchapter, we make the LED blink. In the next one, we bring some structure to the source code, and then we improve the code so that its functions can be used more universally.
The Raspberry Pi has several GPIO pins that are particularly interesting for hardware enthusiasts, as they are easy to program and connect well with the outside world. The Raspberry Pi 4 has a 40-pin header, of which 26 pins can be programmed as either input or output. Here is a summary of the 40 pins and their assignments:
Additionally, there are other internal GPIO addresses that are not routed to the header but can be programmed through the same interface. One of these is the green LED, which is controlled through GPIO 42 on the Raspberry Pi 4.
Making the LED Light Up
As already mentioned, the Raspberry Pi 4 has a green LED that we can switch on and off in software. It is controlled via GPIO 42. On older models, the address is different; check the appendix for these addresses. With the functions from the chapter “Advanced Functions for the LED”, it is very easy to adapt the LED control accordingly, but do not forget to change the base address of the Raspberry Pi in the “base.inc” file.
First, the complete source code:
.equ RPI_BASE, 0xFE000000
.equ GPIO_BASE, RPI_BASE + 0x200000
@ GPIO function select (GPFSEL) registers have 3 bits per GPIO
.equ GPFSEL0, GPIO_BASE + 0x0 @GPIO select 0
.equ GPFSEL1, GPIO_BASE + 0x4 @GPIO select 1
.equ GPFSEL2, GPIO_BASE + 0x8 @GPIO select 2
.equ GPFSEL3, GPIO_BASE + 0xC @GPIO select 3
.equ GPFSEL4, GPIO_BASE + 0x10 @GPIO select 4
@GPIO SET/CLEAR registers have 1 bit per GPIO
.equ GPSET0, GPIO_BASE + 0x1C @set0 (GPIO 0 - 31)
.equ GPSET1, GPIO_BASE + 0x20 @set1 (GPIO 32 - 57)
.equ GPCLR0, GPIO_BASE + 0x28 @clear0 (GPIO 0 - 31)
.equ GPCLR1, GPIO_BASE + 0x2C @clear1 (GPIO 32 - 57)
This part of the source code is inserted into our initial file between lines 4 and 5. It defines symbols that the assembler can use while it translates the program.
/*
* LED is Pin42
* on the GPFSEL4 register Bits 8-6
* 001 = GPIO Pin 42 is an output
*/
mov r1, #1
lsl r1, #6 /* -> 001 000 000 */
/*
* 0x10 GPFSEL4 GPIO Function Select 4
*/
ldr r0, =GPFSEL4
str r1, [r0]
/*
* Build the bit mask for GPIO 42
* 25:0 SETn (n=32..57) 0 = No effect; 1 = Set GPIO pin n.
* 42 - 32 = 10
*/
mov r1, #1
lsl r1, #10
MainLoop: @Infinite loop
/*
* 0x2C GPCLR1 GPIO Pin Output Clear 1
*/
ldr r0, =GPCLR1 @LED is off
str r1, [r0]
/*
* Waiting
*/
mov r2, #0x3F0000
wait1$:
sub r2, #1
cmp r2, #0
bne wait1$
/*
* 0x20 GPSET1 GPIO Pin Output Set 1
*/
ldr r0, =GPSET1 @LED is on
str r1, [r0]
/*
* Waiting
*/
mov r2, #0x3F0000
wait2$:
sub r2, #1
cmp r2, #0
bne wait2$
b MainLoop
The listing above is the actual program.
If you don’t feel like typing all of this, you can also download the source as zweites.s.
Now let us see what happens. Assemble the file following the scheme described earlier, replacing “erstes” with “zweites” in the commands. After booting, the LED starts blinking.
First I describe the new commands used here, then how the program works.
New Commands
.equ
.equ itself is not an instruction for the processor but a directive for the assembler. It tells the assembler to define a symbol with a specific value, so the programmer can write a meaningful name instead of a raw number. Alternatively, the .set directive can be used, which has the same function. For example:
.equ RPI_BASE, 0xFE000000
Here, the symbol "RPI_BASE" is assigned the value "0xFE000000". The value can also be given as an expression that the assembler calculates.
.equ GPIO_BASE, RPI_BASE + 0x200000
This means that the value of "RPI_BASE" is added to the value "0x200000", resulting in "0xFE200000".
But what do these values mean?
The Raspberry Pi controls its peripherals through memory-mapped registers: fixed addresses in the Raspberry Pi's address space that we can read and write, each with a specific function. These registers and their use are described in the document BCM2711 ARM Peripherals. The first value we set (RPI_BASE) is the base address of the peripherals as seen by the ARM (Page 10). For the Raspberry Pi 4, this address is 0xFE000000. It is different for older models; I have listed the values in the appendix “Base addresses of the models”. If another model is used, this base address must be adjusted accordingly.
To control the LED, we use the GPIO registers. According to the documentation (Page 83), their offset would be "0x215000". My tests show, however, that this cannot be correct. On previous models, the offset was "0x200000", and this also works on the Raspberry Pi 4. Unfortunately, the documents contain some errors; in case of doubt, the Raspberry Pi community is helpful. The remaining values are taken from Chapter 5.2 of the document and defined as symbols.
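The address arithmetic behind these definitions can be checked on a PC before anything is written to the SD card. The following Python sketch is not part of the course sources; it only repeats the values from the listing and prints the absolute address of each register. The helper name register_address is an assumption made for this illustration.

```python
# Host-side check of the address arithmetic from the .equ definitions.
# Values are copied from the listing; nothing here touches real hardware.
RPI_BASE = 0xFE000000             # peripheral base on the Raspberry Pi 4
GPIO_BASE = RPI_BASE + 0x200000   # start of the GPIO register block

# Register offsets relative to GPIO_BASE, as used in the listing.
OFFSETS = {
    "GPFSEL0": 0x00, "GPFSEL1": 0x04, "GPFSEL2": 0x08,
    "GPFSEL3": 0x0C, "GPFSEL4": 0x10,
    "GPSET0": 0x1C, "GPSET1": 0x20,
    "GPCLR0": 0x28, "GPCLR1": 0x2C,
}


def register_address(name):
    """Return the absolute address of a GPIO register from the table (hypothetical helper)."""
    return GPIO_BASE + OFFSETS[name]


for name in OFFSETS:
    print(f"{name:8s} 0x{register_address(name):08X}")
```

For another Raspberry Pi model, only RPI_BASE would need to change, which mirrors the adjustment described above for the assembler source.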
Registers
Let's get to some fundamentals. A processor contains registers that allow it to compute quickly. On the ARM, there are 16 of these, R0 to R15, plus the status register CPSR. The registers R13 to R15 also have special functions: R13 is the stack pointer (SP), R14 the link register (LR), and R15 the program counter (PC). Generally, these three registers should only be used for these purposes. CPSR is the status register; the part of it that is visible to application code is also called APSR. It contains a copy of the status flags of the Arithmetic Logic Unit (ALU), which are also called condition code flags. The processor uses them to determine whether conditional instructions should be executed or not.
| R0 | R1 | R2 | R3 | R4 | R5 | R6 | R7 | R8 | R9 | R10 | R11 | R12 | R13 (SP) | R14 (LR) | R15 (PC) | CPSR |
ldr (Load Register)
The ldr command loads a 32-bit number from memory and writes it into a register.
ldr r0, =GPFSEL4
In our case, we load r0 with the value of GPFSEL4 (0xFE200010), which is the address of that register. However, an ARM instruction cannot carry an arbitrary 32-bit number directly. The form ldr r0, =value is therefore a pseudo-instruction of the assembler: the assembler stores the number in memory close to the code and generates a load from that location. We could do the same by hand by placing the number in memory ourselves and referencing it. This will come up in a later part of the course.
lsl (Logical Shift Left)
With lsl, you shift the bits of a register to the left. The positions that become free on the right are filled with zeros.
As an example from our code:
lsl r1, #6
Here, the content of r1 is shifted 6 places to the left.
For example, if we assign r1 with 0x1, this number looks as follows in binary code:
0000 0000 0000 0000 0000 0000 0000 0001
If we now shift it 6 places to the left as described above, the binary code becomes:
0000 0000 0000 0000 0000 0000 0100 0000
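The shift can be reproduced on a PC. The following Python sketch models lsl on a 32-bit register and prints the value in groups of four bits, in the same layout as above. The function names lsl and as_nibbles are chosen for this illustration only.

```python
def lsl(value, amount):
    """Logical shift left on 32 bits; bits shifted past bit 31 are lost."""
    return (value << amount) & 0xFFFFFFFF


def as_nibbles(value):
    """Format a 32-bit value as eight groups of four bits."""
    bits = f"{value:032b}"
    return " ".join(bits[i:i + 4] for i in range(0, 32, 4))


print(as_nibbles(1))           # r1 after mov r1, #1
print(as_nibbles(lsl(1, 6)))   # r1 after lsl r1, #6
print(hex(lsl(1, 6)))          # 0x40
```

Shifting left by 6 places multiplies the value by 2 to the power of 6, so 1 becomes 64 (0x40).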
sub (Subtract)
As the command suggests, something is subtracted here.
sub r2, #1
In our example, 1 is subtracted from the content of register r2.
cmp (Compare)
The cmp command compares two values and records the result in the flags of the CPSR register. The conditional branch that follows it evaluates these flags.
cmp r2, #0
Here in the example, r2 is compared with "0".
bne (Branch if Not Equal)
We already met the b command in the previous part of the course; b means that execution jumps to another address. Here, an ARM peculiarity is used: the ARM can execute most of its commands conditionally, controlled by the flags in the status register. The available conditions are described in the next section.
bne wait1$
In our example, the jump is only executed if the preceding comparison found the two values not equal, that is, if r2 is not yet 0. If they were equal, the branch is skipped and the next instruction is executed.
Condition Code
Most ARM commands can be given a condition code. The following suffixes are possible, which are simply appended to the corresponding command:
| SUFFIX | MEANING |
|---|---|
| EQ | Equal |
| NE | Not Equal |
| CS or HS | Higher or Same (unsigned) |
| CC or LO | Lower (unsigned) |
| MI | Negative |
| PL | Positive or Zero |
| VS | Overflow |
| VC | No Overflow |
| HI | Higher (unsigned) |
| LS | Lower or Same (unsigned) |
| GE | Greater Than or Equal (signed) |
| LT | Less Than (signed) |
| GT | Greater Than (signed) |
| LE | Less Than or Equal (signed) |
| AL | Always (usually not written) |
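To see how the suffixes relate to the flags, the comparison can be modelled on a PC. The following Python sketch is an illustration, not course code: cmp_flags computes the flags N, Z, C and V the way cmp does (a 32-bit subtraction whose result is discarded), and the table maps each suffix to its standard ARM flag test. The names cmp_flags, CONDITIONS and condition_passes are assumptions made for this example.

```python
MASK = 0xFFFFFFFF


def cmp_flags(a, b):
    """Model `cmp a, b`: subtract on 32 bits and return the flags (N, Z, C, V)."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    n = bool(result >> 31)                      # result is negative
    z = result == 0                             # result is zero
    c = a >= b                                  # carry set means "no borrow"
    v = bool(((a ^ b) & (a ^ result)) >> 31)    # signed overflow
    return n, z, c, v


CONDITIONS = {
    "EQ": lambda n, z, c, v: z,
    "NE": lambda n, z, c, v: not z,
    "HS": lambda n, z, c, v: c,
    "LO": lambda n, z, c, v: not c,
    "MI": lambda n, z, c, v: n,
    "PL": lambda n, z, c, v: not n,
    "VS": lambda n, z, c, v: v,
    "VC": lambda n, z, c, v: not v,
    "HI": lambda n, z, c, v: c and not z,
    "LS": lambda n, z, c, v: (not c) or z,
    "GE": lambda n, z, c, v: n == v,
    "LT": lambda n, z, c, v: n != v,
    "GT": lambda n, z, c, v: (not z) and n == v,
    "LE": lambda n, z, c, v: z or n != v,
    "AL": lambda n, z, c, v: True,
}
CONDITIONS["CS"] = CONDITIONS["HS"]   # two names for the same test
CONDITIONS["CC"] = CONDITIONS["LO"]


def condition_passes(suffix, a, b):
    """Would an instruction with this suffix execute after `cmp a, b`?"""
    return bool(CONDITIONS[suffix](*cmp_flags(a, b)))


print(condition_passes("NE", 5, 0))   # True: bne jumps while r2 is not 0
print(condition_passes("NE", 0, 0))   # False: the loop ends
```

The difference between the signed and unsigned suffixes shows up with a value such as 0xFFFFFFFF: compared with 1, it is "higher" as an unsigned number (HI passes) but "less than" as a signed number, where it stands for -1 (LT passes).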
The Program (zweites.s)
Our program follows this sequence:
- Prepare the Raspberry Pi to communicate with an LED
- Turn off the LED
- Wait
- Turn on the LED
- Wait
- Jump back to the start of the loop
Preparing the Raspberry Pi
First, we need to inform the Raspberry Pi about our intentions: we need to tell a GPFSEL (function select) register how the pin is to be used. I have already described which registers are involved under System Programming / Bare Metal. According to Chapter 5 of the documentation, there are six GPFSEL registers (GPFSEL0 to GPFSEL5), each responsible for a group of GPIO addresses; our listing defines the first five. GPFSEL0 is for GPIO addresses 0-9, GPFSEL1 for 10-19, and so on. Since our LED is addressed via GPIO 42, we need to use GPFSEL4. Within the register, each GPIO address has 3 bits available to define or query its function. These three bits are defined as follows:
| 000 | PIN is an input |
| 001 | PIN is an output |
| 100 | PIN has alternative function 0 |
| 101 | PIN has alternative function 1 |
| 110 | PIN has alternative function 2 |
| 111 | PIN has alternative function 3 |
| 011 | PIN has alternative function 4 |
| 010 | PIN has alternative function 5 |
Only two values have a fixed meaning: the pin is either an input or an output. All other values select alternative functions. The same alternative function number does not necessarily mean the same function on a different pin. You can refer to Chapter 5.3 "Alternative Function Assignments" of the documentation to see what these are. We will use them later.
Since we want to control an LED, we need the function "001 - PIN is an output". In register GPFSEL4, bits 8-6 are responsible for pin 42, so we need to enter the bit sequence "001" there. One way would be to write the bit pattern directly as "0b001000000", but since we want to understand the code a bit better, we set r1 to 1 (equivalent to 0b001) and shift its content left by 6 places using the lsl command. The result is the same.
mov r1, #1 @ equivalent to 0b001
lsl r1, #6 @ -> 0b001 000 000
The function registers are nothing more than addresses where specific values are stored. The peripherals take these values from there and respond accordingly. To write our value to the correct position, we first load the address of the function register into a register and then transfer our value through this register.
ldr r0, =GPFSEL4 @Load address of GPFSEL4 into r0
str r1, [r0] @Store r1 in this address
The ARM processor cannot store a value directly at an arbitrary memory address. All ARM instructions are 32 bits long and, unlike on some other processors, cannot be extended. These 32 bits have to hold the operation itself, the registers used, the condition code, and so on, so few bits remain for an address. Short distances can be covered by addressing relative to the instruction itself, but since the peripheral register lies far outside this range, we need to take a detour. For ldr r0, =GPFSEL4, the assembler places the value of GPFSEL4 in memory near the code and encodes a load relative to the instruction, essentially stating that the value is stored, for example, 200 bytes further on. After this instruction, r0 contains the value of the symbol GPFSEL4, that is, the address of the function register. With the next command, str (store), we store the content of register r1 at the address held in r0. This way, we have told the function register GPFSEL4 how it should handle our pin. Note that this store overwrites the whole register, so the other pins covered by GPFSEL4 (GPIO 40 to 49) are set to "000", input.
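The register number and bit position for any pin follow from the layout described above: ten pins per register, three bits per pin. The following Python sketch works this out on a PC. It also shows a read-modify-write variant that changes only the three bits of one pin; this variant is an assumption added for illustration and is not what the listing does, since the listing writes the complete register. The names gpfsel_location and select_output are hypothetical.

```python
FUNCTION_OUTPUT = 0b001   # "PIN is an output" from the table


def gpfsel_location(pin):
    """Return (GPFSEL register number, bit shift) of the 3-bit field for a pin."""
    return pin // 10, (pin % 10) * 3


def select_output(old_value, pin):
    """Return old_value with this pin's field set to output, other fields unchanged."""
    _, shift = gpfsel_location(pin)
    cleared = old_value & ~(0b111 << shift) & 0xFFFFFFFF
    return cleared | (FUNCTION_OUTPUT << shift)


register, shift = gpfsel_location(42)
print(register, shift)                    # 4 6 -> GPFSEL4, bits 8-6
print(bin(FUNCTION_OUTPUT << shift))      # 0b1000000, the value the listing stores
print(hex(select_output(0xFFFFFFFF, 42)))
```

In assembler, the read-modify-write variant would need an additional load of the current register value before the store; for a single LED on an otherwise unused register, the direct store from the listing is sufficient.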
Turning the LED On and Off
To turn the LED on and off, we use two kinds of registers that the Raspberry Pi provides: the "GPSET" and "GPCLR" registers, which set or clear individual GPIO pins. According to the documentation on page 89, each bit is responsible for one pin; writing a 1 acts on that pin, and writing a 0 has no effect. First, let's look at GPSET. GPSET0 is for pins 0-31, and GPSET1 is for pins 32-57, so our pin 42 is addressed through GPSET1. To calculate the bit position, we write a 1 into register r1 and shift it left by 10 positions (42 - 32 = 10). Since the same rule applies to GPCLR1, we will use this value from r1 there as well. The register GPSET1 itself is used later in our infinite loop.
mov r1, #1
lsl r1, #10
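The same calculation works for every pin. As a cross-check on a PC, the following Python sketch returns which of the two register banks a pin belongs to and the bit mask to write. The function name set_clear_location is an assumption for this illustration.

```python
def set_clear_location(pin):
    """Return (bank, mask): bank 0 means GPSET0/GPCLR0, bank 1 means GPSET1/GPCLR1."""
    if not 0 <= pin <= 57:
        raise ValueError("the GPSET/GPCLR registers cover pins 0 to 57")
    return pin // 32, 1 << (pin % 32)


bank, mask = set_clear_location(42)
print(bank)        # 1 -> GPSET1 / GPCLR1
print(hex(mask))   # 0x400, which is 1 shifted left by 10
```

Because a written 0 has no effect in these registers, the mask can be stored directly without reading the register first.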
The Infinite Loop
Now we come to our loop. We simply label it as "MainLoop":
MainLoop:
As mentioned earlier, we use the GPCLR1 register to first turn off our LED. To do this, we load the address of GPCLR1 into r0 with ldr and then store r1 (the bit mask for our pin) at this address.
ldr r0, =GPCLR1 @LED is off
str r1, [r0]
If we turned the LED back on immediately, we would not notice the effect because it would happen too quickly. To make it visible, we need to keep the processor busy for a while. We set register r2 to 0x3F0000 and create a loop (wait1$). Within the loop, we decrement the value by 1 and check whether r2 is zero. As long as it is not, the loop runs again. Once r2 reaches zero, the program continues.
mov r2, #0x3F0000 @Set r2 to 3F0000
wait1$:
sub r2, #1 @Subtract 1 from r2
cmp r2, #0 @Compare the value with "0"
bne wait1$ @If not equal, repeat the loop
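How often the loop body runs can be worked out on a PC. The following Python sketch models the three instructions with 32-bit wrap-around; the function name delay_iterations is hypothetical, and the sketch is only called with small values here because a full run in Python would itself take noticeable time.

```python
def delay_iterations(start):
    """Count how often the sub/cmp/bne loop body runs for a non-zero start value."""
    r2 = start & 0xFFFFFFFF
    count = 0
    while True:
        r2 = (r2 - 1) & 0xFFFFFFFF   # sub r2, #1 (wraps around on 32 bits)
        count += 1
        if r2 == 0:                  # cmp r2, #0 ; bne not taken, loop ends
            return count


print(delay_iterations(5))   # 5 passes for a start value of 5
print(0x3F0000)              # 4128768 passes for the value used in the listing
```

The loop runs exactly as many times as the start value says, so 0x3F0000 gives 4,128,768 passes of three instructions each. How long that takes in real time depends on the clock frequency and the memory system, which the text does not specify. A start value of 0 would be a mistake: the first subtraction wraps around to 0xFFFFFFFF and the loop would run 2 to the power of 32 times.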
In the next step, we use the GPSET1 register in the same way as the GPCLR1 register earlier. This turns the LED on.
ldr r0, =GPSET1 @LED is on
str r1, [r0]
And we wait again for a while with the loop wait2$.
mov r2, #0x3F0000 @Set r2 to 3F0000
wait2$:
sub r2, #1 @Subtract 1 from r2
cmp r2, #0 @Compare the value with "0"
bne wait2$ @If not equal, repeat the loop
Now we jump back to the start of our infinite loop, and the LED blinks.
b MainLoop
Using Includes
When we look at our code more closely, we notice something: if our program is supposed to do more than turn an LED on and off, the code will eventually become very complex. The assembler lets us move parts of the code into separate files. Our current code consists of two parts: a declaration section and the actual code. To demonstrate the principle of "includes", we first move our declarations to another file. It is best to structure this file so that we can reuse it regardless of what we write, since the declarations for the Raspberry Pi remain the same and can be used in other source files.
So, the first include we create is a collection of declarations. We simply call it "base.inc".
/* Raspberry PI 4 Declarations
* File: base.inc
* Created: 17.10.2020
*/
.equ RPI_BASE, 0xFE000000
.equ GPIO_BASE, RPI_BASE + 0x200000
@ GPIO function select (GPFSEL) registers have 3 bits per GPIO
.equ GPFSEL0, GPIO_BASE + 0x0 @GPIO select 0
.equ GPFSEL1, GPIO_BASE + 0x4 @GPIO select 1
.equ GPFSEL2, GPIO_BASE + 0x8 @GPIO select 2
.equ GPFSEL3, GPIO_BASE + 0xC @GPIO select 3
.equ GPFSEL4, GPIO_BASE + 0x10 @GPIO select 4
@GPIO SET/CLEAR registers have 1 bit per GPIO
.equ GPSET0, GPIO_BASE + 0x1C @set0 (GPIO 0 - 31)
.equ GPSET1, GPIO_BASE + 0x20 @set1 (GPIO 32 - 57)
.equ GPCLR0, GPIO_BASE + 0x28 @clear0 (GPIO 0 - 31)
.equ GPCLR1, GPIO_BASE + 0x2C @clear1 (GPIO 32 - 57)
In our main program, we add this file with the ".include" directive. The assembler treats the ".include" line as if it were replaced by the contents of the included file. In case of an error, the assembler reports the corresponding line within the include, so we can quickly see where the error is.
.include "base.inc"
The source code is now available as a zip file because our projects will contain multiple files from now on.
2.2.zip
Using Functions
Functions are used for tasks that are repeated, but also to move parts out of the main program and improve readability. As an example, I show a function here that we will not use immediately but that is well suited for explanation. We now create a function named “Get_GPIO_Base”.
/*
* int (r0) Get_GPIO_Base(void)
* returns the GPIO address in r0
*/
Get_GPIO_Base:
push {lr}
ldr r0, =GPIO_BASE
pop {pc}
Since functions are typically used in other programs as well, each function should have a description so that later on, you can always trace what the function is used for. It is also important to know what data the function expects and what is returned when it is called.
This function will return the GPIO address in r0 and requires no parameters.
A label is used to declare the function. In our case, the function is named Get_GPIO_Base. This will also be the name we use to call this function later.
When a function is called, the link register (lr) receives the return address: the address of the instruction that follows the call. To be able to return later, we must preserve this address. This is done with the "push" command, which writes the address to the stack (refer also to new commands for functions). When the function is finished, we load the saved address into the program counter (pc) with "pop", and the program that called this function continues.
Our function is now ready to be called from our main program.
bl Get_GPIO_Base
A simple "b" (branch) would also jump to our function. However, a simple branch does not write a return address to lr, so the function would not know where it was called from. The extended command bl (branch with link) writes this address to the lr register. The function later uses it to continue with the instruction that follows the call.
As mentioned, this is only an example and is of little practical use to us. It is meant to illustrate how functions are written and how they work. The following examples use functions of this kind, but there are still some things to note, which I will now explain.
Quasi-Standard for Registers in Functions
The problem with functions is that their user does not know which registers are used inside the function. This can lead to unintended side effects. Suppose I keep an intermediate result in register r5 that I want to use later: if the function also uses r5, it overwrites the register and my value is lost. To prevent this, rules define which registers a function may overwrite and which must have the same value after the function returns.
Assembly functions can also be called from high-level languages. Since the high-level language does not know what happens inside the function, both sides must follow the same rules.
Arm has defined a standard for this, which we will follow here: the procedure call standard that is part of the “Application Binary Interface (ABI)” for the ARM architecture. Among other things, it contains the following guidelines:
The standard states that r0, r1, r2, and r3 are used, in this order, as inputs for a function. If a function requires no inputs, it doesn’t matter what values these registers contain. The return value is always in register r0. If a function returns nothing, it doesn’t matter what value r0 takes. Additionally, r4 to r11 must have the same values after the function returns as they had when it was called. This means that the caller can rely on the values of r4 to r11, while the values of r0 to r3 may have changed. Strictly speaking, r12 is a scratch register in the standard that a function may also change, so it is safest not to rely on it across a call.
If registers higher than r3 are needed within the function, their values need to be saved and restored at the end of the function. To accomplish this, we use the stack defined at the beginning of the main program, and we use push and pop commands for this purpose.
Example:
push {r5, lr}
This command places the registers lr and r5 on the stack. With
pop {r5, pc}
the data is loaded from the stack again. The stack pointer is adjusted as well, so the data is no longer available on the stack. In this example, the value of lr that was pushed is popped directly into the pc register, which makes the function return to its caller.
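What push and pop do to the stack can be followed step by step in a small model. The following Python sketch is an illustration under assumptions: the class Cpu, the stack address 0x8000 and the return address 0x8010 are invented for this example and do not come from the course sources. It stores registers below the stack pointer with the lowest-numbered register at the lowest address, which is how push and pop behave on the ARM.

```python
class Cpu:
    """Tiny model: a few registers plus a memory dictionary (hypothetical helper)."""

    def __init__(self, sp):
        self.regs = {"r5": 0, "lr": 0, "pc": 0, "sp": sp}
        self.memory = {}

    def push(self, names):
        """Model push {..}: names must be given in ascending register order."""
        self.regs["sp"] -= 4 * len(names)
        for i, name in enumerate(names):
            self.memory[self.regs["sp"] + 4 * i] = self.regs[name]

    def pop(self, names):
        """Model pop {..}: load the registers back and release the stack space."""
        for i, name in enumerate(names):
            self.regs[name] = self.memory[self.regs["sp"] + 4 * i]
        self.regs["sp"] += 4 * len(names)


cpu = Cpu(sp=0x8000)          # assumed stack start, for illustration only
cpu.regs["r5"] = 1234         # value the caller wants to keep
cpu.regs["lr"] = 0x8010       # assumed return address set by bl

cpu.push(["r5", "lr"])        # function entry: push {r5, lr}
cpu.regs["r5"] = 99           # the function uses r5 for its own purposes
cpu.pop(["r5", "pc"])         # function exit: pop {r5, pc}

print(cpu.regs["r5"])         # 1234, the caller's value is back
print(hex(cpu.regs["pc"]))    # 0x8010, execution continues after the call
print(hex(cpu.regs["sp"]))    # 0x8000, the stack pointer is where it started
```

The model shows why the same register list is used for both commands except for the last entry: the saved lr value lands in pc, so restoring r5 and returning to the caller happen in one instruction.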