BGGP6: REVIVING RDOFF PART 1



Motivations & Goals

This year’s BGGP challenge is “RECYCLE”. Players are encouraged to recycle old ideas, lost projects, and obscure file formats to complete any of the previous 5 BGGP challenges, or just create the smallest file that prints/returns/displays the number “6”. After looking through some old notes on file formats I wanted to play with, I remembered a format called RDOFF. Years ago I wrote to myself “what are rdoff tools?” after learning that nasm shipped with support for it. There was even a package called nasm-rdoff. I didn’t know much about this format, other than it was some sort of internal object file. This sounded like a perfect fit for BGGP6!

My main goal is to figure out how to generate and execute an RDOFF file using nasm’s rdx interpreter. Once this is complete, I want to golf the file to make it as small as possible.

This writeup focuses on understanding the RDOFF format, poking at nasm internals, and crafting files using a custom scapy library.

To follow along, you can clone and build the nasm 2.15 repo!

git clone --branch nasm-2.15 https://github.com/netwide-assembler/nasm

Everything in this writeup was built and tested on an x86_64 Ubuntu 24.04 system.

What is RDOFF?

RDOFF, or the Relocatable Dynamic Object File Format, is an object file format that was first released as a part of nasm 0.91 in November 1997. It was created to provide a reference 32 bit object file format for nasm developers to use to test the assembler’s object file generation, as the only other supported format at the time was the 16 bit DOS .OBJ format. Per the original release doc, the nasm devs realized that RDOFF could be useful to OS and assembler devs, and decided to standardize it.

The first version of RDOFF only supported 4 header record types: “Relocation”, “Symbol Import”, “Symbol Export”, and “Dynamic Link Library”. Over time, RDOFF was expanded to include other record types, turning it into a reference implementation that generalized many of the core structure types used in object files.

On December 3rd 2002, this commit added RDOFF2 support to nasm. The biggest changes to the format were adding length fields to most structures, to make it easier to support custom header records and sections, as well as skip over any unsupported types.

From v1-v2.txt:

This isn’t as pointless as it sounds; I’m using RDOFF in a microkernel operating system, and this is the ideal way of loading multiple driver modules at boot time.

RDOFF2 also deprecated Header Record Type 9 or RDFREC_MULTIBOOTHDR, which was never reused. From this point on, most of the changes to the core format involved internal nasm code, as many things used in RDOFF were mapped to core code in the nasmlib, but there weren’t any more major updates to the format or the toolchain.

20 years later, in 2022, all RDOFF code was removed from nasm 2.15.4 (and 2.16).

Per the release notes:

As of Version 2.16: Support for the rdf format has been discontinued and all the RDOFF utilities has been removed.

What happened?

The Death of RDOFF

Around the time that RDOFF2 was added to nasm, there were at least two known projects using the RDOFF format in some capacity. One of them was the MOSCOW operating system, written by nasm developer Julian Hall. It is referenced in the nasm source, in both comments and code (segment types between 0x20 and 0x1000 are reserved for MOSCOW). I wasn’t able to find any other references to this operating system online, so I assume it’s no longer in active development.

The other project was a clone of QNX Neutrino 6.1 called radiantOS or “radiOS”. radiOS was written mostly in nasm assembly, with some C for its boot time module linker “btl”. It actually uses RDOFF in btl and in other modules, and even has custom headers. The last update to the site had an announcement that “The fourth pre-release of RadiOS-0.0.1.7 is being prepared.” on Jun 11, 2004. The last available source is radios-0.0.1.7-pre4a which is what I reviewed.

Beyond these two projects, it doesn’t seem like RDOFF was ever used anywhere else, except for nasm compatibility. Through the years, the documentation and code reflected that.

From the nasm 2.15 docs:

RDOFF is not used by any well-known operating systems. Those writing their own systems, however, may well wish to use RDOFF as their object format, on the grounds that it is designed primarily for simplicity and contains very little file-header bureaucracy.

The documentation doesn’t describe what is required to write assembly that will generate an rdf file, only how to include an already built .rdf file. The spec itself is a texi file buried in the repo, and not referenced in the docs. The only hints about using nasm to create RDOFF files are the test programs in rdoff/test/, which have examples of some features like calling external functions. The other scattered bits of info describe the tooling and note slight changes between versions, rather than the files themselves.

Online searches yielded few results. I tried to find any example RDF file, but none were available. There are tons of random links that weren’t helpful, with people asking similar questions about RDOFF back in 2004. I did see this random CVE for the nasm-rdoff package, but the bug was unrelated to the file format. I also saw that yasm supports generating RDOFF, but it seems to just be there for compatibility. The manpage says that nasm contains the toolchain and the loader for RDOFF. The only non-nasm tool I could find for analyzing the format was from a repo called objview. Written in TurboPascal, the RDF viewer shows information about existing RDOFF files, but doesn’t contain any files to test with.

Since there were no example files I could find, I decided to just try to use nasm to generate one.

Failing To Generate An RDF

I took the source of a small x86 assembly program and tried to output to the rdf format. I got this error when assembling:

2025-11-22 19:22:38 ~/projects/binarygolf/bggp6/nasm-nasm-2.15 
▶ ./nasm -f rdf 6.asm
panic: 6.asm: rdf segment numbers not allocated as expected (2,4,6)

I spent some time trying to get it to build, looking around the docs for info on what I needed to include in my assembly to make it work. Sometimes different formats have specific keywords to use, or need to include a special symbol, but there wasn’t much info to go off of.

I formatted my program in the same way as the example code in rdoff/test/. An RDF file seems to require 3 sections: .text, .data, and .bss. The only other real difference to regular code I write with nasm is that the code’s entrypoint is _main instead of _start.

From the manpage:

rdx loads an RDOFF object, and then calls ‘_main’, which it expects to be a C-style function, accepting two parameters, argc and argv in normal C style.

This is the RDF-friendly source I ended up with:

[SECTION .text] ; this is the text section
[BITS 64]       ; using 64 bit
[GLOBAL _main]  ; the _main function as a global

_main: 
  mov    rax,0x3c ; exit() syscall
  mov    rdi, 6   ; return value for BGGP6
  syscall         ; call the kernel and exit

[SECTION .data]
  db   0 ; 1 byte data section

[SECTION .bss]
  resb 1 ; resb is used to allocate in bss

When I tried to build it, I got the same error. Hrm…

For a sanity check, I tried to build the test cases, but none of those worked either!

2025-11-22 21:27:39 ~/projects/binarygolf/bggp6/nasm-nasm-2.15 
▶ ./nasm -f rdf rdoff/test/testlib.asm
panic: rdoff/test/testlib.asm: rdf segment numbers not allocated as expected (2,4,6)
2025-11-22 21:27:42 ~/projects/binarygolf/bggp6/nasm-nasm-2.15 
▶ ./nasm -f rdf rdoff/test/rdftest1.asm
panic: rdoff/test/rdftest1.asm: rdf segment numbers not allocated as expected (2,4,6)

At this point I was really confused. It should work, right? The docs say that nasm has a full toolchain for RDOFF files. Even other tools say to use nasm to test the RDF files they generate.

It’s also still in the help file in nasm 2.15. rdf was created to test the very output engine that generates these other file formats, so it is strange that it doesn’t seem to work at all.

    -f format     select output file format
       bin                  Flat raw binary (MS-DOS, embedded, ...) [default]
       ith                  Intel Hex encoded flat binary
       srec                 Motorola S-records encoded flat binary
       aout                 Linux a.out
       aoutb                NetBSD/FreeBSD a.out
       coff                 COFF (i386) (DJGPP, some Unix variants)
       elf32                ELF32 (i386) (Linux, most Unix variants)
       elf64                ELF64 (x86-64) (Linux, most Unix variants)
       elfx32               ELFx32 (ELF32 for x86-64) (Linux)
       as86                 as86 (bin86/dev86 toolchain)
       obj                  Intel/Microsoft OMF (MS-DOS, OS/2, Win16)
       win32                Microsoft extended COFF for Win32 (i386)
       win64                Microsoft extended COFF for Win64 (x86-64)
   --> rdf                  Relocatable Dynamic Object File Format v2.0      <--
       ieee                 IEEE-695 (LADsoft variant) object file format
       macho32              Mach-O i386 (Mach, including MacOS X and variants)
       macho64              Mach-O x86-64 (Mach, including MacOS X and variants)
       dbg                  Trace of all info passed to output stage
       elf                  Legacy alias for "elf32"
       macho                Legacy alias for "macho32"
       win                  Legacy alias for "win32"

I decided to ask Copilot on GitHub, since you can ask it questions about a repo. I asked about the errors I was seeing within the context of the nasm 2.15 repo, because I wanted to know if there was either something missing in my assembly source, or if nasm itself was broken. Copilot told me to try a bunch of things that didn’t work, and wasn’t aware of any issues regarding the rdf format. I ended up asking if there were any known RDF files published on GitHub. It gave me this answer:

Instead of trying to figure out what version of nasm can emit an RDOFF, or wasting more time trying to get an LLM to be useful, it seemed more fruitful to just look at the source.

Patching The nasm Loader

A quick grep for the error message leads to the outrdf2.c file. It contains the function rdf2_init().

This is where the error comes from:

    segtext = seg_alloc();
    segdata = seg_alloc();
    segbss = seg_alloc();
    if (segtext != 0 || segdata != 2 || segbss != 4)
        nasm_panic("rdf segment numbers not allocated as expected (%d,%d,%d)",
                   segtext, segdata, segbss);

There are 3 calls to seg_alloc() that populate the segtext, segdata, and segbss variables. This lines up with the 3 required sections included in the assembly source. nasm seems to only be returning “2,4,6” instead of “0,2,4” as expected. What is seg_alloc() doing then? Let’s take a look at segalloc.c!

Wait what…

static int32_t next_seg  = 2;

int32_t seg_alloc(void)
{
    int32_t this_seg = next_seg;

    next_seg += 2;
    return this_seg;
}

This isn’t allocating anything! It’s incrementing a global value starting at 2. Bizarre, but it lines up with what we saw, as next_seg is initialized to 2, not 0, so it will always start at 2. The failing check is in rdf2_init(), so this hasn’t worked in a while, huh?
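To make the mismatch concrete, here is a minimal Python sketch that mimics the two C snippets above. The helper names `make_seg_alloc` and `rdf2_init_check` are made up for illustration; only the counter logic and the (0, 2, 4) check come from the nasm source quoted here.

```python
from itertools import count

def make_seg_alloc(start):
    """Mimic seg_alloc(): hand out even numbers, beginning at `start`."""
    counter = count(start, 2)
    return lambda: next(counter)

def rdf2_init_check(seg_alloc):
    """Mimic the check in rdf2_init(): text, data, bss must be 0, 2, 4."""
    segs = (seg_alloc(), seg_alloc(), seg_alloc())
    return segs, segs == (0, 2, 4)

print(rdf2_init_check(make_seg_alloc(2)))  # next_seg = 2, as in nasm 2.15
print(rdf2_init_check(make_seg_alloc(0)))  # next_seg = 0, the value the check expects
```

With a starting value of 2 the three calls return 2, 4, 6 and the check fails, which is exactly the tuple printed in the panic message.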

The fix seems obvious then, just change next_seg back to 0 in segalloc.c and see what happens.

static int32_t next_seg  = 0; // 2;

The most recent commit from 2018-06-14 has some words about this:

segalloc: DO NOT reset segment numbers

We are not supposed to reset the segment numbers; this was an
attempted fix for a convergence bug that didn't actually exist. The
backend is required to return the same segment number for the same
segment; if it does not, the front end will not converge, but that is
in fact the correct behavior.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

None of the code in this commit relates to RDOFF at all. The changes just remove seg_alloc_setup_done and seg_alloc_reset from the segalloc.c file and the few refs to them.

The previous commit from 2018-06-01 is more consequential, but the reasoning is also unrelated to RDOFF:

Cleanup of label renaming infrastructure, add subsection support

In order to support Mach-O better, add support for subsections, as
used by Mach-O "subsections_via_symbols". We also want to add
infrastructure to support this by downcalling to the backend to
indicate if a new subsection is needed.

Currently this supports a maximum of 2^14 subsections per section for
Mach-O; this can be addressed by adding a level of indirection (or
cleaning up the handling of sections so we have an actual data
structure.)

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

In this commit, next_seg was bumped from 0 to 2 in asm/segalloc.c, shown here:

Several output formats were updated, but the RDOFF2 generator outrdf2.c was not. It appears that nasm hasn’t been able to produce an RDF since at least June 2018.

Let’s apply the patch, rebuild using make everything, and test!

(Tragic!): There was a PR to the nasm repo from 2021 by objview creator DosWorld, which suggested this same patch. It wasn’t accepted because the format was being deprecated!

Generating A Broken RDOFF

With a patched segalloc.c, I built testlib.asm as a starting point. Incredibly, nasm didn’t complain, and created an RDF file for the first time this decade!

2025-11-22 22:01:40 ~/projects/binarygolf/bggp6/nasm-nasm-2.15 
▶ nasm -f rdf testlib.asm -o testlib.rdf
2025-11-22 22:01:47 ~/projects/binarygolf/bggp6/nasm-nasm-2.15 
▶ xxd testlib.rdf 
00000000: 5244 4f46 4632 7300 0000 3600 0000 020b  RDOFF2s...6.....
00000010: 0003 005f 7374 7263 6d70 0002 0900 0400  ..._strcmp......
00000020: 5f6d 6169 6e00 0108 0001 0000 0004 0100  _main...........
00000030: 0108 0006 0000 0004 0100 0108 400b 0000  ............@...
00000040: 0004 0300 0100 0000 0000 1300 0000 6800  ..............h.
00000050: 0000 0068 0400 0000 e8f1 ffff ff83 c408  ...h............
00000060: c302 0001 0000 0008 0000 0061 6263 0061  ...........abc.a
00000070: 6264 0000 0000 0000 0000 0000 00         bd...........

This is the output from rdfdump, a tool included with nasm.

2025-11-22 22:01:56 ~/projects/binarygolf/bggp6/nasm-nasm-2.15 
▶ rdoff/rdfdump rdoff/test/testlib.rdf 
RDOFF dump utility, version 2.3
RDOFF2 revision 0.6.1
Copyright (c) 1996,99 Julian R Hall
Improvements and fixes (c) 2002-2004 RET & COM Research.
File rdoff/test/testlib.rdf: RDOFF version 2

Object content size: 115 bytes
Header (54 bytes):
  extern: segment 0003 = _strcmp
  extern: segment 0004 = _main
  relocation: location (0000:00000001), length 4, referred seg 0001
  relocation: location (0000:00000006), length 4, referred seg 0001
  relocation: location (0040:0000000b), length 4, referred seg 0003

Segment:
  Type   = 0001 (text)
  Number = 0000
  Resrvd = 0000
  Length = 19 bytes

Segment:
  Type   = 0002 (data)
  Number = 0001
  Resrvd = 0000
  Length = 8 bytes

NULL segment

Total number of segments: 2
Total segment content length: 27 bytes

Sick!!
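As a rough cross-check of how the hex dump maps to the rdfdump output, the three relocation records can be decoded by hand. This is only a sketch: the field layout (type, record length, segment, 32-bit little-endian offset, length, referred segment) is my inference from lining up the bytes at offsets 0x26 to 0x43 with the three "relocation:" lines above, not something taken from the spec.

```python
import struct

# The 30 bytes at offsets 0x26-0x43 of the testlib.rdf hex dump above.
RELOCS = bytes.fromhex(
    "01080001000000040100"
    "01080006000000040100"
    "0108400b000000040300"
)

# Assumed layout: type, reclen, segment, offset, length, referred segment.
RELOC = struct.Struct("<BBBiBH")

for rec_type, reclen, seg, off, length, refseg in RELOC.iter_unpack(RELOCS):
    print(f"relocation: location ({seg:04x}:{off:08x}), "
          f"length {length}, referred seg {refseg:04x}")
```

Under that assumed layout, each 10-byte record reproduces one of the rdfdump lines, including the odd-looking segment value 0x40 on the third one.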

Let’s try building 6.asm since the code is simpler and also targets x86_64.

./nasm -f rdf 6.asm -o 6.rdf

This also works! What happens when you run it?

2025-11-22 22:06:23 ~/projects/binarygolf/bggp6/nasm-nasm-2.15/rdoff 
▶ ./rdx ../6.rdf
rdx: could not find symbol '_main' in '../6.rdf'

Uh oh, it says it can’t find the _main symbol. The same thing happens with testlib.rdf too… We DID just do the one thing they told us not to: reset the segment numbers.

It’s likely that nasm incorrectly generated the file, and any further modifications to the nasm source could create even more problems. Let’s analyze the file ourselves to see how the generated structures actually line up from a binary perspective.

RDOFF File Analysis

This is a hex dump of 6.rdf:

2025-11-22 22:05:56 ~/projects/binarygolf/bggp6/nasm-nasm-2.15 
▶ xxd 6.rdf
00000000: 5244 4f46 4632 4000 0000 1100 0000 0209  RDOFF2@.........
00000010: 0003 005f 6d61 696e 0005 0401 0000 0001  ..._main........
00000020: 0000 0000 000c 0000 00b8 3c00 0000 bf06  ..........<.....
00000030: 0000 000f 0502 0001 0000 0001 0000 0000  ................
00000040: 0000 0000 0000 0000 0000                 ..........

There are two strings in the hex dump: the RDOFF2 file signature and the global function _main. The code section starts at offset 0x29.
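Here is a small sketch of how that 0x29 offset falls out of the file layout. It assumes the layout visible in the dump: a 6-byte signature, two 32-bit little-endian lengths, the header records, then segments that each start with a 10-byte header (type, number, reserved, length). The function name `parse_layout` is made up for this example.

```python
import struct

# Bytes copied from the xxd output of 6.rdf above.
RDF = bytes.fromhex(
    "5244 4f46 4632 4000 0000 1100 0000 0209"
    "0003 005f 6d61 696e 0005 0401 0000 0001"
    "0000 0000 000c 0000 00b8 3c00 0000 bf06"
    "0000 000f 0502 0001 0000 0001 0000 0000"
    "0000 0000 0000 0000 0000"
)

def parse_layout(buf):
    """Return (magic, obj_len, hdr_len, segments) for an RDOFF2 buffer."""
    magic = buf[:6]
    obj_len, hdr_len = struct.unpack_from("<II", buf, 6)
    pos = 14 + hdr_len  # skip signature, both lengths, and the header records
    segments = []
    while pos + 10 <= len(buf):
        seg_type, number, _reserved, length = struct.unpack_from("<HHHI", buf, pos)
        data_off = pos + 10
        segments.append((seg_type, number, length, data_off))
        if seg_type == 0:  # assumed: a type-0 (NULL) segment ends the list
            break
        pos = data_off + length
    return magic, obj_len, hdr_len, segments

magic, obj_len, hdr_len, segments = parse_layout(RDF)
print(magic, obj_len, hdr_len)
for seg_type, number, length, data_off in segments:
    print(f"type={seg_type} number={number} length={length} data at {data_off:#x}")
```

The first segment's data lands at 14 + 17 + 10 = 41, which is 0x29, and the object length of 64 is the file size minus the 10 bytes of signature and length field.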

The rdfdump utility uses the same file parsing functions as rdx. It shows information about the RDF file’s contents.

2025-11-22 22:06:39 ~/projects/binarygolf/bggp6/nasm-nasm-2.15 
▶ ./rdoff/rdfdump 6.rdf
RDOFF dump utility, version 2.3
RDOFF2 revision 0.6.1
Copyright (c) 1996,99 Julian R Hall
Improvements and fixes (c) 2002-2004 RET & COM Research.
File 6.rdf: RDOFF version 2

Object content size: 64 bytes
Header (17 bytes):
  extern: segment 0003 = _main
  bss reservation: 00000001 bytes

Segment:
  Type   = 0001 (text)
  Number = 0000
  Resrvd = 0000
  Length = 12 bytes

Segment:
  Type   = 0002 (data)
  Number = 0001
  Resrvd = 0000
  Length = 1 bytes

NULL segment

Total number of segments: 2
Total segment content length: 13 bytes

Weird, the header appears to have _main listed as an extern, and in segment 3? I thought it was supposed to be GLOBAL? Also, where does the segment number come from? The .data and .bss look like what I would expect, and my .text is indeed 12 bytes long. If there are no errors, why doesn’t it run?

With a valid RDF that can be parsed by rdfdump, it felt like the right idea to start writing my own generator that implements the file format, and use rdfdump to validate it against the loader implementation shared with rdx.

RDOFF Scapy Implementation

scapy is a packet manipulation library written in Python, and is quite good at that! The cool part about scapy is that it has a lot of nice built in data types for you to use, from all the weirdo data types needed in all the protocols it supports. As a result, it makes a fantastic library for parsing and crafting files as well. Who says you have to send() the “packets” you construct anyways? What is a packet if not a file in motion?

Using scapy’s APIs, I wrote an RDOFF implementation here that consists of a header, and the two main record types:

I won’t go into full details about writing in scapy, but I do plan to write a blog about using scapy for file format hax soon! I mainly just went through the RDOFF code and found the relevant structures, then translated them to scapy fields of the correct type. I did my best to keep most of the naming from the original source, but some structs and fields have different names that refer to the same thing. I also read through the loader code to verify how certain things were processed, if at all.

After some time (and a bit of rubber ducking in the binary golf discord voice chat), I got a full parser working for 6.rdf.

▶ python3 RDOFF.py 
###[ RDOFF ]###
 magic     = b'RDOFF2'
 obj_len   = 64
 hdr_len   = 17
 \hdr       \
  |###[ Header ]###
  |  \hdr_records\
  |   |###[ RDFHDR_TLVs ]###
  |   |  type      = IMPORT
  |   |  length    = 9
  |   |  \value     \
  |   |   |###[ RDFREC_IMPORT ]###
  |   |   |  flags     = 0
  |   |   |  segment   = 3
  |   |   |  label     = b'_main'
  |   |###[ RDFHDR_TLVs ]###
  |   |  type      = BSS
  |   |  length    = 4
  |   |  \value     \
  |   |   |###[ RDFREC_BSS ]###
  |   |   |  bss_size  = 1
 \segs      \
  |###[ Header ]###
  |  \rdf_segs  \
  |   |###[ RDFHDR_TLVs ]###
  |   |  type      = TEXT
  |   |  number    = 0
  |   |  resrvd    = 0
  |   |  length    = 12
  |   |  data      = b83c000000bf060000000f05
  |   |###[ RDFHDR_TLVs ]###
  |   |  type      = DATA
  |   |  number    = 1
  |   |  resrvd    = 0
  |   |  length    = 1
  |   |  data      = 00
  |   |###[ RDFHDR_TLVs ]###
  |   |  type      = NULL
  |   |  number    = 0
  |   |  resrvd    = 0
  |   |  length    = 0
  |   |  data      = 

It matches up pretty well with the output from rdfdump! I also used the library to generate an exact copy of the original RDF that nasm made me :))

Satisfying The Loader: rdx Analysis

Previously, I noticed that the Header Record Type for the section pointing to _main was IMPORT (2) (or EXTERN in the rdfdump output) instead of what I would expect it to be, EXPORT (3) (also called GLOBAL in the nasm code). Header Record Type GLOBAL (3) has an extra field describing the offset of the text segment. I didn’t know where the offset field was supposed to point to. Is it relative to the start of the file, or the .text segment? Same with the segment field which was set to 3 before. To be safe, I just left both at 0.

Why did the nasm generator swap my global to an import and put it at segment 3 in the first place? This would actually be asking the linker for the _main function instead of specifying it as a global symbol.

The more I thought about it, the more questions I had. Where would _main even come from if I didn’t specify any other modules? What linker is it using? Where is the program being loaded in the first place? How is the image set up in the process? I had to re-focus.

The core issue was that _main can’t be found:

rdx: could not find symbol '_main' in '../6.rdf'

Let’s take a look at rdx.c to see where the error is coming from. rdx is a short program, really just meant to demonstrate the rdf loader capabilities. I adore the funny C shellcode loader style trick to execute the code. It only executes code after it is able to find _main in the symbol table and resolve its offset.

    s = symtabFind(m->symtab, "_main");
    if (!s) {
        fprintf(stderr, "rdx: could not find symbol '_main' in '%s'\n",
                argv[1]);
        exit(255);
    }

    code = (main_fn)(size_t) s->offset;

    argv++, argc--;             /* remove 'rdx' from command line */

    return code(argc, argv);    /* execute */
}

This is a diagram of the various calls that process the incoming .rdf file and set up the code to execute.

flowchart LR
classDef success stroke:#0f0
classDef important stroke:#0ef
rdx("rdx")
rdfload("rdfload")
rdfopen("rdfopen")
rdfopenhere("rdfopenhere")
rdfrelocate("rdfrelocate")
rdfgetheaderrec("rdfgetheaderrec")
symtabInsert("symtabInsert")
symtabFind("symtabFind")
code("code()"):::success

rdx --> rdfload
rdfload --> rdfopen
rdfopen --> rdfopenhere:::important

rdx --> rdfrelocate:::important
rdfrelocate --> rdfgetheaderrec
rdfrelocate -->|if type 3| symtabInsert

rdx --> symtabFind:::important

rdx -->|if _main found| code

The initial parsing is done by rdfopenhere, which reads the file and sets up internal structures based on the first header. Notably, it rejects the RDOFF1 type altogether! If successful, rdx calls the rdfrelocate function, and iterates over the headers looking for two types: Relocate (1) and Export (3). If it finds an Export header record, it adds its name to the symbol table, which is where symtabFind looks for _main.

This means that our Import (2) Header Record Type never gets processed! In fact, load-time linkage is not supported at all. From rdfload.c:

/* We currently do not support load-time linkage.
   This should be added some time soon... */
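The effect on the symbol table can be sketched in a few lines of Python. This is a simplified stand-in for what rdfrelocate() feeds to symtabInsert(), not a port of it: relocation handling is left out, and `build_symtab` is a made-up name. The first header is the one nasm generated for 6.rdf; the second is the same header with the Import record swapped for an Export record laid out like the ExportRec struct in rdoff.h (flags, segment, 32-bit offset, label).

```python
import struct

# 6.rdf header as generated by nasm: Import (2) record for _main, then BSS (5).
HDR_IMPORT = bytes.fromhex("020900" "0300" "5f6d61696e00" "050401000000")
# Same header with an Export (3) record: flags=0, segment=0, offset=0, "_main".
HDR_EXPORT = bytes.fromhex("030c0000" "00000000" "5f6d61696e00" "050401000000")

def build_symtab(header):
    """Walk type/length/value records and keep only Export (3) labels."""
    symtab = {}
    pos = 0
    while pos + 2 <= len(header):
        rec_type, reclen = header[pos], header[pos + 1]
        body = header[pos + 2 : pos + 2 + reclen]
        if rec_type == 3:
            _flags, segment, offset = struct.unpack_from("<BBi", body)
            label = body[6:].split(b"\0", 1)[0].decode()
            symtab[label] = (segment, offset)
        # Import (2) and every other type are skipped using the length byte.
        # (Relocation records are out of scope for this sketch.)
        pos += 2 + reclen
    return symtab

print("_main" in build_symtab(HDR_IMPORT))  # lookup fails, like rdx's error
print(build_symtab(HDR_EXPORT))
```

Only the second header leaves a `_main` entry behind, which is the behavior the rdx error message points at.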

I created a test RDOFF with my scapy lib, using a GLOBAL type instead of an IMPORT type.

def test3_global():
    # This generates an RDF with an Global/Export/Public header type that can be loaded by rdx called global.rdf
    # It doesn't use any tricks to make it small
    # df5a07fb3400e95d4c68cb36a10451f15bd1ca536b50993709d7a534a3b3f4a6  global.rdf
    # (venv-2025) 2025-11-24 14:10:16 ~/projects/binarygolf/bggp6 
    # ▶ xxd global.rdf 
    # 00000000: 5244 4f46 4632 4300 0000 1400 0000 030c  RDOFF2C.........
    # 00000010: 0000 0000 0000 5f6d 6169 6e00 0504 0100  ......_main.....
    # 00000020: 0000 0100 0000 0000 0c00 0000 b83c 0000  .............<..
    # 00000030: 00bf 0600 0000 0f05 0200 0100 0000 0100  ................
    # 00000040: 0000 6600 0000 0000 0000 0000 00         ..f..........

    # Setting up the header - a GLOBAL, which points to the code section
    hdr_global = RDFREC_GLOBAL(segment=0,label="_main")
    # This one BSS segment is also needed
    hdr_bss = RDFREC_BSS(bss_size=1)

    # Now we construct the header and add our records to the list
    rdf_hdr = RDFHDR()
    rdf_hdr.hdr_recs.append(RDFRECS(type=RDFREC.GLOBAL, length=len(hdr_global), value=hdr_global))
    rdf_hdr.hdr_recs.append(RDFRECS(type=RDFREC.BSS, length=len(hdr_bss), value=hdr_bss))

    # This is the code we want to run, here it just exits 6 using a syscall
    mycode = bytes.fromhex("b83c000000bf060000000f05")

    # construct the segments
    rdf_segs = RDFSEGS()
    rdf_segs.rdf_segs.append(RDFSEG_TLV(type=RDFSEG.TEXT, length=len(mycode), data=mycode))
    rdf_segs.rdf_segs.append(RDFSEG_TLV(type=RDFSEG.DATA, number=1, length=1, data=b"\x66"))

    # This padding was added by nasm, not sure if it was intentional but we account for it anyways
    padding = b"\x00" * 10 # unknown usage
    obj_len = len(rdf_hdr) + len(rdf_segs) + len(padding) + 4 # the +4 is for the hdr_len for now

    # Now to initialize the entire file buffer and add our segments, appending the padding
    rdoff = RDOFF(hdr_len=len(rdf_hdr),hdr=rdf_hdr, segs=rdf_segs, obj_len=obj_len) / padding
    
    rdoff.show2() # show the dissected fields

    # the with block closes the file on exit, so no explicit close() is needed
    with open("global.rdf","wb") as f:
        f.write(bytes(rdoff))

rdfdump doesn’t seem to have a problem with this, a good sign!

(venv-2025) 2025-11-23 18:43:46 ~/projects/binarygolf/bggp6 
▶ ./nasm-nasm-2.15/rdoff/rdfdump global.rdf 
RDOFF dump utility, version 2.3
RDOFF2 revision 0.6.1
Copyright (c) 1996,99 Julian R Hall
Improvements and fixes (c) 2002-2004 RET & COM Research.
File global.rdf: RDOFF version 2

Object content size: 67 bytes
Header (20 bytes):
  public: (0000:00000000) = _main
  bss reservation: 00000001 bytes

Segment:
  Type   = 0001 (text)
  Number = 0000
  Resrvd = 0000
  Length = 12 bytes

Segment:
  Type   = 0002 (data)
  Number = 0001
  Resrvd = 0000
  Length = 1 bytes

NULL segment

Total number of segments: 2
Total segment content length: 13 bytes

Oh neat, new way to refer to Header Record Type 3 (GLOBAL or EXPORT) just dropped: PUBLIC !!
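For anyone who wants to double check the layout without scapy, the same bytes can be assembled with nothing but `struct`. This is a sketch under a couple of assumptions: the helper names are invented for the example, and the ten trailing zero bytes (treated as padding in the scapy script) are modelled here as an empty type-0 segment header, since rdfdump prints "NULL segment" at that point. The expected bytes are copied from the xxd output in the comments of test3_global().

```python
import struct

def record(rec_type, body):
    """Header record: 1-byte type, 1-byte content length, then the content."""
    return bytes([rec_type, len(body)]) + body

def segment(seg_type, number, data):
    """Segment: type, number, reserved (0), 32-bit length, then the data."""
    return struct.pack("<HHHI", seg_type, number, 0, len(data)) + data

def build_global_rdf():
    header = (
        record(3, struct.pack("<BBi", 0, 0, 0) + b"_main\0")  # export: flags, segment, offset
        + record(5, struct.pack("<I", 1))                     # bss reservation of 1 byte
    )
    code = bytes.fromhex("b83c000000bf060000000f05")
    segs = (
        segment(1, 0, code)        # text
        + segment(2, 1, b"\x66")   # data
        + segment(0, 0, b"")       # assumed NULL terminator (10 zero bytes)
    )
    body = struct.pack("<I", len(header)) + header + segs
    return b"RDOFF2" + struct.pack("<I", len(body)) + body

EXPECTED = bytes.fromhex(
    "5244 4f46 4632 4300 0000 1400 0000 030c"
    "0000 0000 0000 5f6d 6169 6e00 0504 0100"
    "0000 0100 0000 0000 0c00 0000 b83c 0000"
    "00bf 0600 0000 0f05 0200 0100 0000 0100"
    "0000 6600 0000 0000 0000 0000 00"
)

print(build_global_rdf() == EXPECTED)
```

The object length works out to 4 + 20 + 43 = 67 bytes, matching both the 0x43 in the dump and the "Object content size" that rdfdump reports.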

Anyways, this file segfaulted rdx, but for a curious reason.

I re-ran rdx in gdb and noticed that the value in *s->offset looks correct-ish, but it was missing something. I added the top half of the base address to *s->offset and then did a hex dump on that address. The code was in the correct place, but the address was truncated to 32 bit!

Whoops, need a 32bit executable heap

I took a look at the code for the record types and found the definition for “Public/export record” (no love for “GLOBAL” here). The offset is indeed an int32_t.

include/rdoff.h

/*
 * Public/export record
 */
struct ExportRec {
    uint8_t type;                  /* must be 3 */
    uint8_t reclen;                /* content length */
    uint8_t flags;                 /* SYM_* flags (see below) */
    uint8_t segment;               /* segment referred to (0/1/2) */
    int32_t offset;                /* offset within segment */
    char label[EXIM_LABEL_MAX]; /* zero terminated as in import */
};

If you recall rdx.c, the code is provided by s->offset, meaning that this is the actual address we control here.

    code = (main_fn)(size_t) s->offset;
    argv++, argc--;             /* remove 'rdx' from command line */
    return code(argc, argv);    /* execute */
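A quick arithmetic sketch of why that goes wrong on a 64 bit system. The heap address below is hypothetical (picked to look like a typical x86_64 heap pointer), and `store_as_int32` is a made-up helper that keeps only the low 32 bits, the way a field declared int32_t would.

```python
import struct

def store_as_int32(addr):
    """Keep only the low 32 bits of addr, read back as a signed 32-bit value."""
    return struct.unpack("<i", struct.pack("<Q", addr)[:4])[0]

heap_addr = 0x0000_5555_5556_B2A0        # hypothetical 64-bit heap address
stored = store_as_int32(heap_addr)
top_half = heap_addr & ~0xFFFF_FFFF      # the part that gets lost

print(hex(stored))                       # what the 32-bit field can hold
print(hex(top_half + stored))            # adding the top half back by hand

low_addr = 0x4000_1000                   # hypothetical address below 2 GiB
print(store_as_int32(low_addr) == low_addr)
```

The first value is the truncated pointer that rdx ends up jumping to, the second is the original address recovered the same way I did it in gdb, and the last line shows that an address below 2 GiB would survive the round trip unchanged.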

With execution being tantalizingly close, I decided to try to make the code support 64 bit addresses.

These are all the places I changed from int32 to int64:

I re-ran the code, and rdx segfaulted again, but for a different reason.

It got to the entrypoint, but it can’t execute because it needs an executable heap…

This tells me that rdx hasn’t worked on a 64 bit system, probably ever. The way the process image is set up just calls nasm_malloc(), and doesn’t set up any sort of sandbox or tweak any permissions for the code it tries to load and run. It just allocates a chunk in the existing nasm heap and lets it rip I guess. This is pretty cool, it’s basically just a little shellcode loader! All these nerds decided the heap needed to be protected and ruined the fun. :(

iximeow saves the day

While I was debugging this issue in voice chat, iximeow suggested I rewrite the nasm_malloc() calls in rdfload with mmap() instead, so that I can just reuse the existing 32 bit values by mapping the allocated memory to the 32 bit address space. This solves both problems and simplifies it greatly!!

While I was busy getting pwned by the nx bit, ixi actually wrote the patch for rdfload.c, which is here:

diff --git a/rdoff/rdfload.c b/rdoff/rdfload.c
index 1c24f2fc..1a1803cd 100644
--- a/rdoff/rdfload.c
+++ b/rdoff/rdfload.c
@@ -51,6 +51,8 @@
 #include "symtab.h"
 #include "collectn.h"

+#include <sys/mman.h>
+
 extern int rdf_errno;

 rdfmodule *rdfload(const char *filename)
@@ -81,17 +83,26 @@ rdfmodule *rdfload(const char *filename)

     /* read in text and data segments, and header */

+    /*
+     *  (((ULONG_PTR)(x)) + PAGE_SIZE-1)  & (~(PAGE_SIZE-1)) ) // useless comment 
+     * want 32-bit offsets and permissions to work with, so mmap instead
     f->t = nasm_malloc(f->f.seg[0].length);
-    f->d = nasm_malloc(f->f.seg[1].length);  /* BSS seg allocated later */
+    f->d = nasm_malloc(f->f.seg[1].length);  // BSS seg allocated later
+    */
+    #define PAGE_CEIL(x) ((x & !0xfff) + 0x1000)
+    f->t = mmap(0, PAGE_CEIL(f->f.seg[0].length), PROT_READ | PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANON | MAP_32BIT, 0, 0);
+    f->d = mmap(0, PAGE_CEIL(f->f.seg[1].length), PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON | MAP_32BIT, 0, 0);
     hdr = nasm_malloc(f->f.header_len);

     if (!f->t || !f->d || !hdr) {
         rdf_errno = RDF_ERR_NOMEM;
         rdfclose(&f->f);
         if (f->t)
-            nasm_free(f->t);
+            munmap(f->t, PAGE_CEIL(f->f.seg[0].length));
+            // nasm_free(f->t);
         if (f->d)
-            nasm_free(f->d);
+            munmap(f->d, PAGE_CEIL(f->f.seg[1].length));
+            // nasm_free(f->d);
         nasm_free(f);
         nasm_free(hdr);
         return NULL;
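One detail worth knowing if you reuse this patch for bigger payloads: in C, `!` is logical NOT, so `!0xfff` evaluates to 0 and the PAGE_CEIL macro as written always yields 0x1000. That is plenty for the tiny segments in this writeup. The Python sketch below just replays the arithmetic; `page_round_up` is a conventional round-up shown for comparison and is not part of the patch.

```python
def c_logical_not(value):
    """C's `!` operator: 1 for zero, 0 for anything else."""
    return 0 if value else 1

def page_ceil_as_written(x):
    # (x & !0xfff) + 0x1000 from the patch; !0xfff evaluates to 0
    return (x & c_logical_not(0xFFF)) + 0x1000

def page_round_up(x, page=0x1000):
    # conventional round-up to a multiple of the page size, for comparison
    return (x + page - 1) & ~(page - 1)

for length in (12, 0x1000, 0x2345):
    print(length, hex(page_ceil_as_written(length)), hex(page_round_up(length)))
```

For the 12 byte text segment both versions give one page, so nothing changes for global.rdf; the two only diverge once a segment grows past 4096 bytes.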

Executing RDOFF

After applying the patch, global.rdf runs! It exits 6 as expected!!!

It even works outside of gdb (the real test)

Hell yeah!!!

Pouring One Out For RDOFF

What began life as an internal test case for nasm’s object file generator ended as a forgotten file format. By the time it was removed from nasm, RDOFF was a broken shell of its former self, its machinery unceremoniously optimized out.

The key (only?) benefit of RDOFF was its simplicity, but even with its barebones spec, the reference implementation was never fully fleshed out. Many features which were part of the original spec, such as dynamic linking, remained incomplete or undefined through its entire lifespan. The desire to keep the format flexible for “someone” to use left key design decisions up to whoever dared to. RDOFF came out in 1997, 2 years before ELF was standardized as the default binary format for Linux. By the time RDOFF2 came out in late 2002, many compilers and assemblers were already targeting ELF, PE, and their associated object files. It feels like RDOFF2 was more of an idea than a file format. A dream, yearning for a simple, flexible format, based on common features of existing formats. The only problem was that there was no one to build it. I can see why the nasm team would finally remove it and refactor all the surrounding library code.

To be honest, I am glad that they removed something that is no longer supported. Removing a random file format that no one knows anything about, which can be executed by a loader that most people don’t realize ships with nasm, seems like the right thing to do. I can appreciate its quirkiness, its 90s style C, its undefined behavior. But for a general use format, I would def stick with object files targeting ELF or PE :) unless u a fweak !!

What’s Next?

I wanted to put out this writeup to see if anyone would come out of the woodwork in defense of RDOFF. I would love to see any other examples of custom usages, or formats that used RDOFF as a basis or inspiration. I hope this writeup serves as a case study on how to preserve endangered file formats as well.

In Part 2, I am going to go over golfing RDOFF files. As of this writing, the smallest spec conforming RDF I could make (that still returns 6) is 62 bytes! SHA256: 68554fbb087180353ab2ef6b10bf91613dc31206824c11baccea7b0705d3c449

Given the weirdness of the loader, it feels like it’s possible to go even smaller. I will also test other RDOFF loaders and generators mentioned in this writeup.

RDOFF alt defs

