Ok, this one is on a really old topic. As you may already know, x86-based machines have several different modes of operation. When engineers had to make the leap from 16 to 32 bits they had to figure out a way to avoid breaking old software and came up with what we call protected mode. From then on the old 16 bit mode was called real mode.
There are many places on the Internet where you can find good resources to learn how real mode works and how to switch between modes, which can be a bit tricky sometimes. Here are a couple of links to pages I found particularly useful:
- What is Real Mode?
- General description of Real Mode, Protected and V86 Mode
In this post we will go through a small boot sector I wrote that wipes all the memory above 1MB with known values. As we will see, to do this it is necessary to switch between real and protected modes several times. This will prove to be a very important part of the procedure I described in my previous post about dumping the physical RAM of a virtual machine. First of all, here is the code:
On x86 machines there are two ways to map hardware resources: memory and I/O ports. In theory a 32-bit x86 CPU could address up to 4GB of RAM but this is usually not the case, because part of the address space is reserved for devices. On any desktop computer you can find devices connected to several buses, like PCI, SMBus, I2C, etc. What all these devices have in common is that there are fixed memory and port mappings set on startup so that system software can access them. Some of these mappings can be reconfigured and others are fixed. The system BIOS is the firmware responsible for setting everything up so that the operating system does not need to bother with things that are too specific to the chipset. What we need to do now is query the BIOS about the memory ranges in use and their types. The standard method to do this is software interrupt 0x15. When we trigger this interrupt with the value 0xe820 in EAX the BIOS returns an entry of what we call an E820 memory map.
    mov EDX, 0x0534d4150        ; Place "SMAP" into EDX
    mov EAX, 0xe820
    mov [DI + 20], dword 1      ; force a valid ACPI 3.X entry
    mov ECX, 24                 ; ask for 24 bytes
    int 0x15
The memory map can be very long so we request only one entry (24 bytes) at a time. Each entry defines the start and length of a memory zone as well as its type. We are only interested in zones of type 1 (usable RAM).
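To make the entry layout concrete, here is a minimal Python sketch of how one 24-byte entry could be decoded once it has been read out of memory. It assumes the usual E820 layout (64-bit base, 64-bit length, 32-bit type, 32-bit ACPI 3.x attributes, all little-endian). The names `parse_e820_entry` and `is_usable_ram` and the sample values are made up for illustration and are not part of the boot sector.

```python
import struct

# Assumed layout of one 24-byte E820 entry (little-endian):
# 64-bit base, 64-bit length, 32-bit type, 32-bit ACPI 3.x attributes.
E820_ENTRY = struct.Struct("<QQII")
E820_TYPE_RAM = 1


def parse_e820_entry(raw):
    """Decode one raw 24-byte entry into a small dict (hypothetical helper)."""
    base, length, zone_type, acpi_attrs = E820_ENTRY.unpack(raw)
    return {"base": base, "length": length, "type": zone_type, "acpi": acpi_attrs}


def is_usable_ram(entry):
    """Only type 1 zones are wiped by the boot sector."""
    return entry["type"] == E820_TYPE_RAM


# Made-up sample entry: 127MB of RAM starting at the 1MB mark.
sample = E820_ENTRY.pack(0x100000, 0x7F00000, E820_TYPE_RAM, 1)
entry = parse_e820_entry(sample)
```

Note that the boot sector itself only looks at the low 32 bits of the base and length fields.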
When we get a new memory zone we wipe each of its pages with a value equal to that page's absolute offset. To make it easier to identify the pages we wiped we put the string "SIGN" at the beginning of each page. For example, at offset 0x1000000 the wiped page will look like this:
'S' 'I' 'G' 'N' 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x01 ... 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x01
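The same page layout can be reproduced outside the virtual machine, which is handy when checking a RAM dump. The following Python sketch builds the bytes a wiped page should contain, assuming 4096-byte pages and little-endian dwords as on x86. `expected_page` is a hypothetical helper name, not something from the boot sector.

```python
import struct

PAGE_SIZE = 4096


def expected_page(page_base):
    """Return the 4096 bytes a wiped page at page_base should contain."""
    fill = struct.pack("<I", page_base)  # page offset as a little-endian dword
    # The first dword is the "SIGN" marker, the remaining 1023 repeat the offset.
    return b"SIGN" + fill * (PAGE_SIZE // 4 - 1)


page = expected_page(0x1000000)
```

Comparing `expected_page(offset)` against the bytes found at `offset` in a dump tells us whether that page was wiped and never touched again.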
This sounds simple but when we get down to writing the boot sector we face a pretty basic problem: real mode can only access 1MB of memory. In real mode memory addresses are formed by combining a 16-bit segment and a 16-bit offset: the physical address is obtained by shifting the segment 4 bits to the left and adding the offset to the result. This way old x86 CPUs could access much more memory than could be addressed with a simple 16-bit offset. There is a lot of information on this on the Internet so I will not go into details here.
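As a quick illustration of the segment:offset arithmetic, here is a small Python sketch. The function name `real_mode_address` is made up for this example. It computes the physical address as the segment shifted left by 4 bits plus the offset.

```python
def real_mode_address(segment, offset):
    """Physical address for a real-mode segment:offset pair."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)


# Two different pairs that point at the same byte.
boot_a = real_mode_address(0x07C0, 0x0000)
boot_b = real_mode_address(0x0000, 0x7C00)

# The highest address that can be formed this way.
highest = real_mode_address(0xFFFF, 0xFFFF)
```

Both `boot_a` and `boot_b` evaluate to 0x7C00, and `highest` is 0x10FFEF, only slightly above the 1MB mark. Everything beyond that is out of reach from real mode, which is why the boot sector has to switch modes.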
To wipe the full memory map we will need to call the BIOS interrupt 0x15 from real mode, then switch to protected mode, wipe the zone we just obtained and then go back to real mode to request another one. We will repeat this process until the BIOS tells us there are no memory zones left.
To switch to protected mode we need a global descriptor table (GDT). This table is pointed to by one of the special processor registers (GDTR), which can be loaded with the lgdt machine instruction. The GDT specifies which segments can be used, including their base addresses, lengths and attributes. In this boot sector we will map the full 4GB address space linearly from address 0 and allow reads, writes and code execution. The format of the GDT and all the details about protected mode memory management can be found in Intel's Architectures Software Developer's Manual in chapters 3, 4 and 5. Here is the GDT we will use in our jumps into protected mode:
    align 16
    gdt:
        dw gdt_end-gdt-1    ; gdt limit
        dd gdt              ; gdt base = %cs<<4 + offset
        dw 0                ; dummy
    ;CS_SEG:
        dw 0xffff           ; 4Gb - (0x100000*0x1000 = 4Gb)
        dw 0x0000           ; base address=0
        dw 0x9a00           ; code read/exec
        dw 0x00cf           ; granularity=4096, 386 (+5th nibble of limit)
    ;DS_SEG:
        dw 0xffff           ; 4Gb - (0x100000*0x1000 = 4Gb)
        dw 0x0000           ; base address=0
        dw 0x9200           ; data read/write
        dw 0x00cf           ; granularity=4096, 386 (+5th nibble of limit)
    gdt_end:
We set up two overlapping segments that span from 0 to 4GB. The first segment allows for code execution (i.e. it is a code segment) and the second one allows for reading and writing data but not for executing code (i.e. it is a data segment). Because these two segments overlap completely, together they give us a memory area with no restrictions on reading, writing and executing code.
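To see how the four words of each descriptor encode a flat 4GB segment, here is a Python sketch that decodes them following the standard x86 segment descriptor layout. `decode_descriptor` and the field names in the returned dict are hypothetical names chosen for this example.

```python
def decode_descriptor(w0, w1, w2, w3):
    """Decode the four 16-bit words of a GDT entry, lowest word first."""
    limit = w0 | ((w3 & 0x000F) << 16)           # 20-bit limit
    base = w1 | ((w2 & 0x00FF) << 16) | ((w3 & 0xFF00) << 16)
    access = (w2 >> 8) & 0xFF                    # present, DPL, type bits
    flags = (w3 >> 4) & 0x0F                     # granularity, default size
    page_granular = bool(flags & 0x8)
    # With 4096-byte granularity the limit counts pages, not bytes.
    last_offset = ((limit << 12) | 0xFFF) if page_granular else limit
    return {
        "base": base,
        "last_offset": last_offset,
        "present": bool(access & 0x80),
        "executable": bool(access & 0x08),
        "read_write": bool(access & 0x02),       # readable code / writable data
        "is_32bit": bool(flags & 0x4),
    }


code_seg = decode_descriptor(0xFFFF, 0x0000, 0x9A00, 0x00CF)
data_seg = decode_descriptor(0xFFFF, 0x0000, 0x9200, 0x00CF)
```

Both descriptors decode to base 0 and a last valid offset of 0xFFFFFFFF. The only difference is the executable bit, which is set for the code segment and clear for the data segment.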
Once we are in protected mode we first need to read the starting offset and the length of the memory zone from the structure we obtained with interrupt 0x15. Then we just need to go through each page in the zone, write the string "SIGN" to the first 4 bytes and fill the rest of the page with the offset of its first byte. Here is how I implemented it:
    wipe_zone:
        cli                         ; Disable interrupts
        pushad
        xor EBX, EBX
        mov EBX, DS
        shl EBX, 4
        add EBX, [e820current]      ; Move zone descriptor address into EBX
        mov EDI, [EBX]              ; EDI = Offset
        cmp EDI, 0x100000
        jb .real                    ; We only do zones over 1MB (unsigned compare)
        lgdt [CS:gdt]
        mov EAX, CR0
        or EAX, 0x00000001
        mov CR0, EAX
        jmp .protected              ; Switch to protected mode
    .protected:
        mov AX, 16
        mov ES, AX                  ; Use segment at offset 16
    .loop:
        mov EAX, EDI                ; EAX = Offset
        mov ECX, [EBX + 8]          ; ECX = Length
        sub EAX, [EBX + 8]
        cmp EAX, [EBX]              ; Stop when EDI = zone start + length
        je .end_loop
        mov ECX, 4096 >> 2          ; ECX = Length in double words
        sub ECX, 1                  ; Subtract a dword from the length
        mov EAX, "SIGN"             ; Sign page
        mov [ES:EDI], EAX           ; Write through the flat data segment in ES
        mov EAX, EDI                ; Use pointer value as wipe value
        add EDI, 4                  ; Start one dword further
        a32 rep stosd
        jmp .loop
    .end_loop:
        mov EAX, CR0
        and EAX, ~0x00000001
        mov CR0, EAX
        jmp .real                   ; Switch to real mode
    .real:
        popad
        sti                         ; Enable interrupts
        ret
With all this in place, the only bit left is how we actually switch between modes. This is achieved by modifying the contents of the register CR0. The only tricky thing is that the processor does not change its addressing mode until we do a far (long) jump, even after modifying CR0. This can be used to work in some "synthetic" modes that were mainly used by old-school DOS programmers. These modes are a very interesting topic but they are out of scope for this post. The code I used to switch from real to protected mode is as follows:
    lgdt [CS:gdt]
    mov EAX, CR0
    or EAX, 0x00000001
    mov CR0, EAX
    jmp .protected              ; Switch to protected mode
And the opposite is done with these lines:
    mov EAX, CR0
    and EAX, ~0x00000001
    mov CR0, EAX
    jmp .real                   ; Switch to real mode
After wiping all the memory ranges reported by the BIOS the boot sector arrives at this chunk of code:
    .done:
        cmp byte [hflag], 0x00
        jz .done
        int 0x18                ; Boot next drive
This is a loop that might seem useless at first sight. It spins until the value at [hflag] becomes different from 0x00 and then it calls interrupt 0x18, which tells the BIOS that it should try to boot from the next available drive. Here is how the flag is defined in the assembly code:
    org StdBase
        jmp 0x0000:Boot         ; Make sure CS is 0
        dd "WIPE"               ; Signature
        dw 0
        dd "SIGN"               ; Signature
        dw 1
        dd "WIPE"               ; Signature
        dw 2
        dd "SIGN"               ; Signature
        dw 3
    hflag:
        db 1                    ; Hang flag
    Boot:
The hang flag together with the signature before it will be used when we boot a virtual machine with our boot sector so that we can let it continue running after wiping the whole RAM.
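As a rough idea of how an outside program could locate the flag, here is a Python sketch that searches a buffer of guest memory for the 24-byte signature and returns the offset of the byte right after it. This is only an assumption about how the signature might be used: the helper names (`build_signature`, `find_hang_flag`) and the fake memory buffer are invented for the example.

```python
import struct


def build_signature():
    """Rebuild the 24 bytes emitted by the dd/dw pairs before hflag."""
    parts = []
    for index, tag in enumerate([b"WIPE", b"SIGN", b"WIPE", b"SIGN"]):
        # dd "WIPE" stores the four characters in order; dw N is a 16-bit word.
        parts.append(tag + struct.pack("<H", index))
    return b"".join(parts)


SIGNATURE = build_signature()


def find_hang_flag(memory):
    """Return the offset of the hflag byte in memory, or -1 if not found."""
    pos = memory.find(SIGNATURE)
    return -1 if pos < 0 else pos + len(SIGNATURE)


# Fake memory: 5 filler bytes standing in for the far jump, then the
# signature followed by the flag byte.
fake_memory = bytes(5) + SIGNATURE + b"\x01"
flag_offset = find_hang_flag(fake_memory)
```

Alternating "WIPE" and "SIGN" with a running counter makes the 24-byte pattern very unlikely to appear anywhere else in memory by accident, so a plain byte search should be enough to find it.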
I think this boot sector is pretty interesting, especially if you have never coded anything in real mode before. The code is heavily commented but please do not hesitate to ask if you have any questions. Some of the things I have done in the boot sector might seem arbitrary and some others can be optimized a lot. The only requirement for this one is that it fits in 512 bytes, which was relatively easy to achieve. Apart from that I only tried to make the code as easy to understand as possible. I believe the arbitrary parts will make more sense when I publish my next post on how to interface this boot sector with a userspace program outside the virtual machine we are booting.
Happy hacking!
Paul (11:43 pm, October 25, 2010):
Interesting article, but the explanation at the beginning on the motivation for protected mode is a bit off. The protected/real mode distinction was not motivated by a switch to 32 bit registers and a desire to avoid breaking 16 bit software. The advent of protected mode predates 32 bit registers by several years. The 80286 chips, for example, are 16 bit and also support protected mode.
It is true that the power-on state being real mode is for backwards compatibility. This does not have anything to do with a shift to 32 bit registers per se, but rather with the privilege levels and memory addressing modes.
digital (12:10 am, October 26, 2010):
Thanks for your comment Paul. I was pretty sure about the motivation for the protected/real mode distinction but after going through the older Intel manuals I believe you are right. However I don’t particularly like editing posts to hide my own mistakes so I will leave this one as is and these two comments will hopefully be enough to clarify things.
force (6:13 am, October 27, 2010):
Hi Digital,
Thanks for the information, I especially enjoyed this article as you included the asm overview. If you have any further information on the startup processes of the x86 arch, I would very much enjoy reading it.
Regards,
Force