ELF Binary Mangling Pt. 2: Golfin



Greetings everyone, and welcome to part two of the Binary Mangling series. In our last installment, we took a look at the basics of what an ELF binary is, how it’s laid out, and the bare minimum needed to execute some raw machine code. We also did a little bit of mangling, hand-optimizing our binary in a hex editor to put things where they aren’t supposed to go.

In this installment, we are going much, much deeper, to challenge the kernel with a clown car of barely valid bytes to test the limits of the ELF format itself.

The third part should be coming out at around the same time as this, with a practical example of binary golf, the art of executing a binary in as few moves as possible.

That Shrinking Feeling

For a while now, I have wanted to figure out the absolute smallest ELF64 binary I could manage to create.

By normal means, using the GNU assembler and linker, you can create pretty small binaries that have the basic data structures needed to dictate how the executable is parsed by the OS.

In the previous write up, we showed the two structures required to execute your code: the ELF header and the program header. For more info on these, please refer to [part one].

In my initial investigation, I created a 120 byte ELF64 that included the exit syscall in part of the ELF header, meaning the binary was exactly the size of the ELF and Program Headers. After playing around a bit more, I discovered that you can actually end the ELF header early, at 0x3A instead of 0x40, meaning you could save a whole 6 bytes by moving the program header up into the ELF header.
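
The sizes quoted above can be checked with a quick sketch. The format strings below are my own transcription of the Elf64_Ehdr and Elf64_Phdr field widths from elf.h into Python's struct notation, so treat them as an assumption rather than an official definition.

```python
import struct

# Little-endian, unpadded ("<") transcriptions of the elf.h structs.
# Elf64_Ehdr: e_ident[16], e_type, e_machine, e_version, e_entry, e_phoff,
#             e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
#             e_shentsize, e_shnum, e_shstrndx
EHDR_FMT = "<16sHHIQQQIHHHHHH"
# Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr,
#             p_filesz, p_memsz, p_align
PHDR_FMT = "<IIQQQQQQ"

ehdr_size = struct.calcsize(EHDR_FMT)
phdr_size = struct.calcsize(PHDR_FMT)

print(hex(ehdr_size), hex(phdr_size))  # 0x40 0x38
print(ehdr_size + phdr_size)           # 120, the two headers back to back
print(0x40 - 0x3A)                     # 6, bytes saved by ending the ELF header early
```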

I explored the possibility of overlaying the program and ELF headers, but due to the 8 byte addresses of ELF64, I couldn’t figure out a proper way to do it. I tried again months later and was almost there, but still no dice. After a bunch of messed up nasm files, and some botched hex editing sessions, @_veekun came through and found the proper alignment (big big shoutout).

Here is a breakdown of how we are going about this.

Overlay

The basic principle of overlaying these two header structures relies on some very precise pieces being in place.

It’s probably best explained with this nasm file that you can pop into your fav text editor.

;                                                                        exit.asm
;────────────────────────────────────────────────────────────────────────────────
BITS 64

        org 0x100000000  ; Where to load this into memory

;----------------------+------+-------------+----------+--------------------
; ELF Header struct    | OFFS | ELFHDR      | PHDR     | ASSEMBLY OUTPUT
;----------------------+------+-------------+----------+--------------------
        db 0x7F, "ELF" ; 0x00 | e_ident     |          | 7f45 4c46
_start: mov    al,0x3c ; 0x04 | ei_class    |          | b0
                       ; 0x05 | ei_data     |          | 3c
        xor    rdi,rdi ; 0x06 | ei_version  |          | 4831 ff
        syscall        ; 0x09 |   u         |          | 0f05 
        nop            ; 0x0b |    n        |          | 90
        nop            ; 0x0c |     u       |          | 90
        nop            ; 0x0d |      s      |          | 90
        nop            ; 0x0e |       e     |          | 90
        nop            ; 0x0f |        d    |          | 90
;----------------------+------+-------------+----------+--------------------
; ELF Header struct ct.| OFFS | ELFHDR      | PHDR     | ASSEMBLY OUTPUT
;----------------------+------+-------------+----------+--------------------
        dw 2           ; 0x10 | e_type      |          | 0200
        dw 0x3e        ; 0x12 | e_machine   |          | 3e00
        dd 1           ; 0x14 | e_version   |          | 0100 0000
        dd _start - $$ ; 0x18 | e_entry     |          | 0400 0000 
;----------------------+------+-------------+----------+--------------------
; Program Header Begin | OFFS | ELFHDR      | PHDR     | ASSEMBLY OUTPUT
;----------------------+------+-------------+----------+--------------------
phdr:   dd 1           ; 0x1C |   ...       | p_type   | 0100 0000 
        dd phdr - $$   ; 0x20 | e_phoff     | p_flags  | 1c00 0000
        dd 0           ; 0x24 |   ...       | p_offset | 0000 0000
        dd 0           ; 0x28 | e_shoff     |   ...    | 0000 0000
        dq $$          ; 0x2C |   ...       | p_vaddr  | 0000 0000 
                       ; 0x30 | e_flags     |   ...    | 0100 0000 
        dw 0x40        ; 0x34 | e_ehsize    | p_paddr  | 4000
        dw 0x38        ; 0x36 | e_phentsize |   ...    | 3800
        dw 1           ; 0x38 | e_phnum     |   ...    | 0100
        dw 2           ; 0x3A | e_shentsize |   ...    | 0200
        dq 2           ; 0x3C | e_shnum     | p_filesz | 0200 0000 0000 0000
        dq 2           ; 0x44 |             | p_memsz  | 0200 0000 0000 0000
        dq 2           ; 0x4C |             | p_align  | 0200 0000 0000 0000

Let’s go through this line by line

This is where things get interesting.

The p_type field denotes the type of segment, here a 1, which is a LOAD segment, meaning it will be loaded into memory for further usage (the org address near the top of exit.asm plays a key role in where we want to be loaded).

This field, containing a value of 1, also completes the second half of the e_entry field, making the full eight bytes 04 00 00 00 01 00 00 00. Read as a little-endian value, this gives the proper entry address of 0x100000004, which is 4 bytes past the address the binary is loaded at (file offset 0x4). This is precisely where our _start label is.
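
To make the shared bytes concrete, here is a small sketch that packs the two dwords from the listing and reads them back both ways. The variable names are only illustrative.

```python
import struct

# Bytes at file offsets 0x18..0x1F, as emitted by exit.asm:
#   dd _start - $$  -> 04 00 00 00   (low half of e_entry)
#   dd 1            -> 01 00 00 00   (p_type, doubling as the high half)
raw = struct.pack("<II", 0x4, 0x1)

(e_entry,) = struct.unpack("<Q", raw)         # ELF header view: one 8-byte field
low_half, p_type = struct.unpack("<II", raw)  # the same bytes as two 4-byte values

load_base = 0x100000000                       # the org address from exit.asm
print(raw.hex(" "))                # 04 00 00 00 01 00 00 00
print(hex(e_entry))                # 0x100000004
print(hex(e_entry - load_base))    # 0x4
```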

The next fields that overlap are e_phoff (the file offset where the program headers are located) and p_flags (the flags that determine the segment’s permissions). In this case the shared value is 0x1C, which is 00011100 in binary. The ABI only pays attention to the lowest three bits (PF_X = 1, PF_W = 2, PF_R = 4), which here are 100, so the segment is marked readable (PF_R); on the kernel used here that was enough for our code to run. Its value can be shared with the ELF header to designate that 0x1C is the offset where the program headers start (which is in the middle of the ELF header).
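
The same trick can be checked for the e_phoff / p_flags pair. This sketch assumes the permission bit values from elf.h (PF_X = 1, PF_W = 2, PF_R = 4); with those values, the low three bits of 0x1C decode to read permission only.

```python
import struct

PF_X, PF_W, PF_R = 0x1, 0x2, 0x4   # permission bits, as defined in elf.h

# Bytes at file offsets 0x20..0x27: dd phdr - $$ (0x1c), then dd 0.
raw = struct.pack("<II", 0x1C, 0x0)

(e_phoff,) = struct.unpack("<Q", raw)    # ELF header view: 8-byte file offset
p_flags, _ = struct.unpack("<II", raw)   # program header view: first 4 bytes only

# Keep only the flags whose bit is set in p_flags.
perms = "".join(
    name for bit, name in ((PF_R, "r"), (PF_W, "w"), (PF_X, "x")) if p_flags & bit
)
print(hex(e_phoff), format(p_flags, "08b"), perms)   # 0x1c 00011100 r
```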

The rest of the two structures continue to overlap, largely with dummy values, until the end.

Keeping these data sizes in mind and how they overlap will be useful in the next write up. For now, here’s a small annotated part of elf.h that shows how this is eventually laid out.
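
As a rough stand-in for that layout, the sketch below rebuilds the 84 bytes from the exit.asm listing and decodes the same buffer twice: once as an ELF header at offset 0, and once as a program header at offset 0x1C. The field names follow elf.h; the struct format strings and the show helper are my own and only illustrative.

```python
import struct

BASE = 0x100000000  # org address from exit.asm

# The file image, piece by piece, following the exit.asm listing.
image = b"".join([
    b"\x7fELF",                               # 0x00 magic
    bytes.fromhex("b03c4831ff0f05"),          # 0x04 mov al,0x3c / xor rdi,rdi / syscall
    b"\x90" * 5,                              # 0x0b nop padding
    struct.pack("<HHI", 2, 0x3E, 1),          # 0x10 e_type, e_machine, e_version
    struct.pack("<I", 0x4),                   # 0x18 low half of e_entry
    struct.pack("<III", 1, 0x1C, 0),          # 0x1c p_type, p_flags, low half of p_offset
    struct.pack("<I", 0),                     # 0x28 high half of p_offset
    struct.pack("<Q", BASE),                  # 0x2c p_vaddr
    struct.pack("<HHHH", 0x40, 0x38, 1, 2),   # 0x34 p_paddr, seen as four ELF header words
    struct.pack("<QQQ", 2, 2, 2),             # 0x3c p_filesz, p_memsz, p_align
])

EHDR_NAMES = ("e_ident e_type e_machine e_version e_entry e_phoff e_shoff "
              "e_flags e_ehsize e_phentsize e_phnum e_shentsize e_shnum "
              "e_shstrndx").split()
PHDR_NAMES = "p_type p_flags p_offset p_vaddr p_paddr p_filesz p_memsz p_align".split()

# Two views of the same bytes: ELF header at 0, program header at 0x1C.
ehdr = dict(zip(EHDR_NAMES, struct.unpack_from("<16sHHIQQQIHHHHHH", image, 0)))
phdr = dict(zip(PHDR_NAMES, struct.unpack_from("<IIQQQQQQ", image, 0x1C)))

def show(title, fields):
    """Print each field name next to its value in hex."""
    print(title)
    for name, value in fields.items():
        text = value.hex() if isinstance(value, bytes) else hex(value)
        print(f"  {name:<12}{text}")

print(len(image))   # 84
show("ELF header @ 0x00", ehdr)
show("Program header @ 0x1c", phdr)
```

Running it shows the side effects of the overlap, such as e_flags picking up the upper half of p_vaddr and p_paddr being assembled out of four unrelated ELF header fields.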


Putt Putt

So now we have our basic structure set up, let’s go through what happens when you run this.

First, let’s assemble and run exit.asm like so:

nasm -f bin -o exit exit.asm
chmod +x exit
./exit
echo $?

If all goes according to plan, you should see a 0 as the output from the last command.

When we run our ELF file, the kernel looks for the data structures we just went over in order to determine what to do with this particular binary. We want to load it into memory (at our specific address defined in p_vaddr) and execute at the entry point we defined at e_entry. Since the ELF and program headers overlap, the kernel jumps around our header and eventually loads our binary and figures out how to execute our code.

Our program is quite simple:

mov al,0x3c
xor rdi,rdi
syscall

All we are doing here is moving the value 0x3C into the lowest 8 bits of the RAX register, putting a value of zero in the RDI register, and calling the kernel. 0x3C is the syscall number for exit, which just exits a process. XORing RDI with itself creates a 0, which is our EXIT STATUS code, 0 meaning SUCCESS.
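
For a sense of how little room this takes, here is a small sketch using the bytes from the ASSEMBLY OUTPUT column of exit.asm. The SYS_EXIT name is my own label for the 0x3C value.

```python
# Machine code bytes taken from the ASSEMBLY OUTPUT column of exit.asm.
code = bytes.fromhex(
    "b03c"      # mov al,0x3c
    "4831ff"    # xor rdi,rdi
    "0f05"      # syscall
)

SYS_EXIT = 0x3C    # exit syscall number (60 decimal), the immediate in mov al
room = 16 - 4      # bytes of e_ident left after the 4-byte magic
padding = room - len(code)

print(len(code), room, padding)   # 7 12 5
print(code[1] == SYS_EXIT)        # True
```

The 5 bytes of padding match the five nop instructions that fill out the rest of e_ident in the listing.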

So it’s running just fine, despite being completely mangled, but what happens when we try and analyze this code?

It appears that objdump has no idea what to do with it, and readelf gives some very bizarre values as well. Other debuggers display some interesting results. You should play around!

What’s next?

Now that we’ve successfully created an 84 byte ELF binary, the next step is seeing what else we can pack into this header, and doing some more binary golfing. The next write up will describe applying these same techniques to create a VPS nuking one liner containing an 84 byte binary. It will also provide a more in depth look at specific data structures that can be reused to hold code, and how to jump around between them. Thanks for reading!

Greetz to: hermit, +Eevee, dnz, readme, rqu, decoded, notpike, skelsec, jinn, notdan, MG, phaith, nux, zuph, sshell, def_hand, cedric, protoxin, xero and the rest of the Thugcrowd crew.