System memory dumps on Linux

Information leaks are pretty common in today’s software. For some reason people get really scared when they are told they have a buffer overflow, even if it’s not exploitable, but they don’t care at all where the data goes when their program dies.

Well, if you know a bit about OS development you know that memory doesn’t just disappear when a program finishes its execution. RAM is an expensive and scarce resource and, as such, it gets reused much more aggressively than other resources. Until a given page is actually reused, though, whatever the program left in it is usually still sitting there.

In theory it should be fairly easy to dump the memory from a Linux box by just grabbing the contents of /dev/mem. The practical problem is that /dev/mem gives you access to much more than the system RAM. The main use for /dev/mem is mapping PCI resources to be used by the X server, so if you just go and dd it to a file you’ll probably end up with a hung box. Randomly reading hardware registers has never been a good idea.

So what can we do? Luckily on Linux we have a file named /proc/iomem which contains the system’s memory map. At this precise moment mine looks like this:

00000000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000ef000-000fffff : reserved
00100000-b8f43fff : System RAM
  01000000-0139b823 : Kernel code
  0139b824-01514597 : Kernel data
  0157e000-015ef4ab : Kernel bss
b8f44000-b8f45fff : reserved
b8f46000-b9d6ffff : System RAM
b9d70000-b9d7ffff : ACPI Non-volatile Storage
b9d80000-bc4c0fff : System RAM
bc4c1000-bc6c0fff : ACPI Non-volatile Storage
bc6c1000-bde91fff : System RAM
bde92000-bde99fff : reserved
bde9a000-bdebefff : System RAM
bdebf000-bdecefff : reserved
bdecf000-bdfcefff : ACPI Non-volatile Storage
bdfcf000-bdffefff : ACPI Tables
bdfff000-bdffffff : System RAM
be000000-bfffffff : reserved
c0000000-cfffffff : PCI Bus 0000:01
  c0000000-cfffffff : 0000:01:00.0
    c0000000-c0ffffff : vesafb
d0000000-d00fffff : PCI Bus 0000:86
  d0000000-d0000fff : 0000:86:09.3
  d0001000-d00017ff : 0000:86:09.0
    d0001000-d00017ff : firewire_ohci
  d0001800-d00018ff : 0000:86:09.2
  d0001900-d00019ff : 0000:86:09.1
    d0001900-d00019ff : mmc0
d0100000-d40fffff : PCI Bus 0000:45
d4100000-d80fffff : PCI Bus 0000:04
d8100000-d81fffff : PCI Bus 0000:03
  d8100000-d8101fff : 0000:03:00.0
    d8100000-d8101fff : iwlagn
d8200000-d82fffff : PCI Bus 0000:02
d8300000-d83fffff : PCI Bus 0000:01
  d8300000-d830ffff : 0000:01:00.0
  d8320000-d833ffff : 0000:01:00.0
d8400000-d841ffff : 0000:00:19.0
  d8400000-d841ffff : e1000e
d8420000-d8423fff : 0000:00:1b.0
  d8420000-d8423fff : ICH HD audio
d8424000-d8424fff : 0000:00:19.0
  d8424000-d8424fff : e1000e
d8425000-d8425fff : 0000:00:03.3
d8426000-d84267ff : 0000:00:1f.2
  d8426000-d84267ff : ahci
d8426800-d8426bff : 0000:00:1d.7
  d8426800-d8426bff : ehci_hcd
d8426c00-d8426fff : 0000:00:1a.7
  d8426c00-d8426fff : ehci_hcd
d8427000-d842700f : 0000:00:03.0
d8500000-d86fffff : PCI Bus 0000:02
d8700000-d88fffff : PCI Bus 0000:03
d8900000-d8afffff : PCI Bus 0000:04
d8b00000-d8cfffff : PCI Bus 0000:45
dc000000-dfffffff : PCI Bus 0000:86
  dc000000-dfffffff : PCI CardBus 0000:87
e0000000-efffffff : PCI MMCONFIG 0 [00-ff]
  e0000000-efffffff : reserved
    e0000000-efffffff : pnp 00:01
f0000000-f3ffffff : PCI CardBus 0000:87
fec00000-fec00fff : IOAPIC 0
  fec00000-fec00fff : reserved
fed00000-fed003ff : HPET 0
  fed00000-fed003ff : pnp 00:05
fed10000-fed13fff : reserved
  fed10000-fed13fff : pnp 00:01
fed18000-fed19fff : reserved
  fed18000-fed18fff : pnp 00:01
  fed19000-fed19fff : pnp 00:01
fed1c000-fed1ffff : reserved
  fed1c000-fed1ffff : pnp 00:01
fed20000-fed3ffff : pnp 00:01
fed45000-fed8ffff : pnp 00:01
fee00000-fee00fff : Local APIC
  fee00000-fee00fff : reserved
ffe80000-ffffffff : reserved
100000000-13bffffff : System RAM

We are interested in the regions labelled as System RAM. All other regions are used to map hardware resources, which depend on the hardware you have and the device drivers you are using. If you want to dump the whole RAM, the theory is pretty easy:

a) Open /dev/mem
b) Open /proc/iomem
c) Go through each RAM region in /proc/iomem and dump the corresponding offsets from /dev/mem
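Before dumping anything it can be handy to know how much data step c) is going to produce. Here is a minimal sketch, assuming the input has the same "START-END : label" layout as the listing above. The function name ram_regions is made up for this example, and it reads the map from stdin so it can be tried on a saved copy.

```shell
#!/bin/sh
# ram_regions (hypothetical helper): read /proc/iomem-style text on stdin and
# print "0xSTART 0xEND SIZE" for every top-level "System RAM" line, then a total.
ram_regions() {
	total=0
	while IFS= read -r line; do
		case "$line" in
			" "*) continue ;;        # indented line: nested sub-region, skip
			*"System RAM"*) ;;       # top-level RAM region, keep it
			*) continue ;;           # hardware mappings, reserved areas, etc.
		esac
		range=${line%% *}            # "START-END"
		start=${range%-*}
		end=${range#*-}
		size=$(( 0x$end - 0x$start + 1 ))   # END is inclusive
		total=$(( total + size ))
		echo "0x$start 0x$end $size"
	done
	echo "total $total"
}
```

Run it as root with `ram_regions < /proc/iomem`; the last line is the number of bytes the regions cover before any page rounding.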

Here is a little bash script that I wrote to carry out those three steps. It works pretty well, but it has a couple of caveats that we will see in a moment.

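#!/bin/bash
# Dump every top-level "System RAM" region of /proc/iomem from /dev/mem to stdout.

if [[ $# -ne 0 ]]; then
	echo "USAGE: $0" >&2
	exit 1
fi

# Top-level entries start in column 0; indented lines are nested sub-regions.
grep '^[^ ]' /proc/iomem | grep 'System RAM' | while read -r LINE; do
	# Each line looks like "START-END : System RAM" (hex, END inclusive).
	X0="0x$(echo "$LINE" | sed 's|^\([^-]*\)-.*|\1|')"
	X1="0x$(echo "$LINE" | sed 's|^[^-]*-\([^ ]*\) .*|\1|')"

	# R0 is the first page of the region, R1 is one past the page that
	# holds the last byte (bash arithmetic understands the 0x prefix).
	R0=$(( X0 / 4096 ))
	R1=$(( X1 / 4096 + 1 ))

	# Progress goes to stderr so it does not end up inside the dump.
	echo "CHUNK: $X0-$X1" >&2
	dd if=/dev/mem bs=4096 skip="$R0" count=$(( R1 - R0 ))
done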

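You might wonder what the rounding is for. On all the hardware architectures that I know of, memory gets mapped in pages. On x86 a standard page is 4096 bytes but, as you can see in my /proc/iomem file, the Linux kernel doesn’t always give aligned offsets for the upper limit of RAM areas: the first region ends at 0009fbff, in the middle of a page. Keep in mind that the upper limits are inclusive, so a region that ends exactly on a page boundary is listed as ending in fff. Anyway, we know that at the lowest level both limits are aligned, so we just round the non-aligned limits we find up to the next page boundary.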
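To make the arithmetic concrete, here is one way to express the rounding as a tiny helper. It is only a sketch: page_span is a hypothetical name, the 4096-byte page size is the x86 value mentioned above, and the limits are passed the way /proc/iomem prints them (hex, no 0x prefix, upper limit inclusive).

```shell
#!/bin/sh
# page_span START END (hypothetical helper): print the dd "skip" and "count"
# values, in 4096-byte pages, that cover the inclusive range START..END.
page_span() {
	first=$(( 0x$1 / 4096 ))    # page holding the first byte (rounds down)
	last=$(( 0x$2 / 4096 ))     # page holding the last byte
	echo "$first $(( last - first + 1 ))"
}
```

For the first region of my map, `page_span 00000000 0009fbff` prints `0 160`: the region stops 1 KiB short of the end of page 159, and we read that whole page anyway.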

Now the issues. This little bash script just spits the contents of the memory to stdout, which doesn’t sound very useful. We need to store the dump somewhere so we can analyse it later, searching for interesting data. Our first idea could be to just redirect the dump to a file using the shell, but that would have the undesired effect of filling up the kernel’s internal caches, thus wiping the data we are interested in. So if we can’t put the dump on the hard drive, why don’t we store it somewhere else on our local network? A very simple way to achieve this is to listen for a connection on a different host and then run our script, piping the output to netcat.

Dest$ nc -l -p 1337 > dump
Orig$ ./mcat_poc.sh | nc dest 1337

When we do this at least two programs get loaded into memory and run on the origin host, i.e. netcat and a bash interpreter for mcat_poc.sh. Each of these programs has its in-memory structures in user and kernel space and depends on a bunch of shared libraries that might or might not be already loaded by another program. Things are even worse if we have a look into mcat_poc.sh. Because of the way bash works some more processes are spawned (grep, sed, dd and a subshell for every command substitution), making this implementation particularly noisy.

Ideally, we would like to dump all the memory in the system without modifying anything. Of course this is infeasible, as we need to load the dumper itself into memory, but we should try to keep this process as small as possible. My approach has been to implement the dumper in C, including the networking code. It would have been better to write it in pure assembly to get rid of libc and the crt, but I think this solution is a reasonable compromise between simplicity and effectiveness. If anybody is interested in the pure assembly solution just let me know and if I have the time I will write it as well.

mcat.c
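If you would rather stay in the shell, part of the noise can be trimmed by doing the parsing with shell builtins only, so that dd is the only extra process spawned per region. This is a rough sketch and not a replacement for the C dumper: the name dump_ram and its two optional arguments are assumptions added here so the function can be tried on ordinary files instead of the real devices.

```shell
#!/bin/sh
# dump_ram [IOMEM] [MEM] (hypothetical helper): write every top-level
# "System RAM" region listed in IOMEM to stdout, reading the pages from MEM.
# Defaults are /proc/iomem and /dev/mem; parsing uses builtins only.
dump_ram() {
	iomem=${1:-/proc/iomem}
	mem=${2:-/dev/mem}
	while IFS= read -r line; do
		case "$line" in
			" "*) continue ;;        # nested sub-region
			*"System RAM"*) ;;       # region we want
			*) continue ;;           # everything else
		esac
		range=${line%% *}                        # "START-END"
		first=$(( 0x${range%-*} / 4096 ))        # first page (round down)
		last=$(( 0x${range#*-} / 4096 ))         # page holding the last byte
		dd if="$mem" bs=4096 skip="$first" count=$(( last - first + 1 ))
	done < "$iomem"
}
```

It is used the same way as the original script, e.g. `dump_ram | nc dest 1337` as root. The shell and netcat still get loaded, so this only reduces the footprint.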

When you have the dump you can look for some memorable strings in it. I found my password in 3 different places when I ran it the first time :-). It’s incredibly difficult to keep track of every data buffer in any reasonably large program, and thus many commonly used applications have information leaks. Many library functions copy data internally, thus generating hidden duplicates that are really hard to spot in a code review. We all know coders are too lazy to test these things thoroughly. Most people don’t even test their own code, so what about a library written by a guy they’ll never meet? Well, this is good for you if you are going to go bug-hunting anyway.
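One simple way to do that search on the destination host is sketched below. It assumes GNU grep, since the -a, -b and -o options are not in POSIX, and find_in_dump is just a name I made up for the example.

```shell
#!/bin/sh
# find_in_dump PATTERN FILE (hypothetical helper): print one "offset:match"
# line for every occurrence of the fixed string PATTERN in a binary FILE.
find_in_dump() {
	# -a: treat binary data as text, -b: print byte offsets,
	# -o: report each match on its own line, -F: no regex, literal string
	LC_ALL=C grep -a -b -o -F -- "$1" "$2"
}
```

Something like `find_in_dump 'some memorable string' dump | wc -l` then tells you how many copies survived in RAM, and the offsets show where they are in the dump.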

Happy hacking!

Comments (2)

  1. Jari, 3:23 pm, June 16, 2010

    You say that if you write to a local file you risk modifying memory by filling the kernel’s buffer cache. Couldn’t you still write to a local disk without polluting the buffer cache by using direct I/O? Wouldn’t that have the desired effect?

  2. digital, 12:15 am, June 22, 2010

    I suppose what you would do is open the file with the O_DIRECT flag. This way you wouldn’t be going through the Linux kernel cache, that’s right. However, you are faced with another problem. We don’t want to use the cache because we want to corrupt as little memory as possible, not because caching is bad by itself. If you do direct I/O you still need to create the buffer to hold the data in userspace, which means that you need to call brk(). This will grab some more pages from the free page list and some memory will get corrupted anyway.

    This is not really a good argument, but Linux kernel developers recommend against using O_DIRECT (see http://kerneltrap.org/node/7563). Maybe this case is just an acceptable exception.

    The way I see it there is no good solution for this: whatever you do, you are going to corrupt some pages. The O_DIRECT solution is definitely better than just opening a file for writing without the flag, but it’s still not perfect.

    Nice suggestion and thanks for commenting :-)
