Handcrafting object files

The Problem:

Imagine you’ve got a C file with the following code:

 

unsigned char text[] = { 0x53, 0x6f, 0x20, 0x69, 0x6d, 0x61, 0x67, 0x65, 0x20, 0x79, 0x6f, 0x75,
 			 0x27, 0x76, 0x65, 0x20, 0x67, 0x6f, 0x74, 0x20, 0x61, 0x20, 0x63, 0x20,
 			 0x66, 0x69, 0x6c, 0x65, 0x20, 0x77, 0x69, 0x74, 0x68, 0x20, 0x74, 0x68,
 			 0x65, 0x20, 0x66, 0x6f, 0x6c, 0x6c, 0x6f, 0x77, 0x69, 0x6e, 0x67, 0x20,
 			 0x63, 0x6f, 0x64, 0x65, 0x3a, 0x0a, 0x0a };
long textSize = sizeof(text);

Not too hard to imagine, is it? Now imagine you’ve got a bit more data, say 20 MB. Storing 20 MB of data in a C file is far from compact: every byte turns into several characters of source text (“0x53, ” is six of them), so you end up with a text file of 100 MB or more. Compiling such a file takes quite some time and, more importantly, plenty of RAM. Dealing with files like that isn’t much fun, so I thought about a better solution.

It would be much better to store the binary file in… a binary file ;-). But you can’t just link against an arbitrary binary file and access its data. You also want the data to be reachable through a variable.

The solution

…is actually pretty easy: store the data in a static library and link your app against that library. The library can export a symbol or two: one for the data, one for its size.
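From the app’s point of view, those two symbols could look like the sketch below. It reuses the names text and textSize from the snippet above. The stand-in definitions and the helper sum_embedded_bytes are made up for illustration, only so that the example links on its own; in the real setup the definitions would come from the static library.

```c
#include <assert.h>
#include <stddef.h>

/* What the application would declare, typically in a small header. */
extern unsigned char text[];
extern long textSize;

/* Stand-in definitions so this sketch links on its own (assumption:
 * in the real setup these two symbols live in the static library). */
unsigned char text[] = { 0x53, 0x6f, 0x20 };
long textSize = sizeof(text);

/* Hypothetical consumer: walks the embedded bytes using only the
 * two exported symbols and returns the sum of all byte values. */
unsigned long sum_embedded_bytes(void)
{
    assert(textSize >= 0);
    unsigned long sum = 0;
    for (long i = 0; i < textSize; i++)
        sum += text[i];
    return sum;
}
```

The app never needs to know how the bytes got into the library; it only sees an array and a length.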

But as easy as the solution sounds, how do you get a static library out of a binary file? Or more importantly, how do you generate an object file? That’s pretty easy with a compiler. One could use gcc to compile the object file. The only problem is that gcc takes C files as input, which kind of defeats the purpose.

I’ve chosen a different tool: as. For those who don’t know, as is the assembler that gcc uses. It takes CPU instructions and converts them to bytes in an object file. But it can do a lot more. There are plenty of special instructions (directives) that don’t translate to machine code but instead control where things go in the resulting object file. There are directives to declare a global symbol, to define data, and so on. A very handy one is the .space size, fill directive. It reserves size bytes, each set to the value fill. The directive is just one line, but it can reserve many megabytes if you like. Using .space we can generate object files of a chosen size, with a placeholder that is free to be filled with whatever we like.
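To make the directives a bit more concrete, here is a minimal sketch of a C helper that writes such an assembler file into a buffer. It is not taken from DataLibraryCreator. The function name write_placeholder_asm is hypothetical, the leading underscore on the symbols assumes a Mach-O target, and the .quad for the size assumes a 64-bit long.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes assembler source for a placeholder of `size` bytes into buf.
 * The source defines two global symbols: _<symbol> (the data) and
 * _<symbol>Size (its length). Returns the number of characters
 * written, or -1 if buf is too small. */
int write_placeholder_asm(char *buf, size_t buf_len, const char *symbol,
                          unsigned long size, unsigned char fill)
{
    assert(buf != NULL && symbol != NULL);
    assert(strlen(symbol) > 0);

    int n = snprintf(buf, buf_len,
        "\t.data\n"
        "\t.globl _%s\n"        /* export the data symbol */
        "_%s:\n"
        "\t.space %lu, %u\n"    /* reserve size bytes, each set to fill */
        "\t.p2align 3\n"        /* 8-byte alignment for the size value */
        "\t.globl _%sSize\n"    /* export the size symbol */
        "_%sSize:\n"
        "\t.quad %lu\n",        /* assumption: long is 64 bits wide */
        symbol, symbol, size, (unsigned)fill, symbol, symbol, size);

    if (n < 0 || (size_t)n >= buf_len)
        return -1;
    return n;
}
```

Calling write_placeholder_asm(buf, sizeof buf, "text", 16, 0xAA) would produce a file that reserves 16 bytes of 0xAA under _text and stores 16 under _textSize. A fill value other than zero makes the placeholder easier to spot in the object file later.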

The workflow for putting a binary file inside a static library goes like this: you generate the assembler file, assemble it into an object file, turn that into a static library via libtool, and then replace the placeholder inside the static library with the contents of the binary file.
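The last step of that workflow, replacing the placeholder, could be sketched roughly like this. It assumes the library has already been read into memory and that the placeholder is a run of one known fill byte; how DataLibraryCreator actually locates the placeholder is not shown here, and the name patch_placeholder is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Finds the first run of payload_len consecutive `fill` bytes in image
 * and overwrites it with payload. Returns the offset of the patched
 * region, or -1 if no run of that length exists. */
long patch_placeholder(unsigned char *image, size_t image_len,
                       unsigned char fill,
                       const unsigned char *payload, size_t payload_len)
{
    assert(image != NULL && payload != NULL);
    if (payload_len == 0 || payload_len > image_len)
        return -1;

    size_t run = 0; /* length of the current run of fill bytes */
    for (size_t i = 0; i < image_len; i++) {
        run = (image[i] == fill) ? run + 1 : 0;
        if (run == payload_len) {
            size_t start = i + 1 - payload_len;
            memcpy(image + start, payload, payload_len);
            return (long)start;
        }
    }
    return -1;
}
```

A plain byte search like this can hit the wrong spot if the same byte run shows up elsewhere in the file (zero padding, for example), so a distinctive fill value, or looking up the symbol’s offset instead, is the safer choice.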

Don’t worry. There’s no need to do that by hand. I’ve released DataLibraryCreator which does exactly that.

Using this technique you can do other pretty cool stuff, too, like creating symbol files for when you want to debug C tools that don’t ship symbols (like here). If you compile a C file via

cc -S myFile.c

you’ll get a myFile.s, which contains the assembler code generated from the C file. If you aren’t too sure about which instructions you can use in the assembler file, just try it and see what the C compiler generates.

@_karsten_

 

7 Responses to “Handcrafting object files”

  1. Paul Says:

    Ermmm, can you use the incbin directive in an asm file?

  2. Karsten Says:

    ha, nice find… didn’t see this directive. I think that could remove the need to search/replace the bytes manually. thanks a lot!

  3. Marko Says:

    If your goal is simply to create an object file which contains a reserved location where you can put the data, couldn’t you simply create a c file with a global initialized char array so that you can reserve the space in the .data section without having to fiddle with as?

    char data[1024*1024*100] = {1};

    will create a 100 MB object file.

  4. ken Says:

    Another thing you can do is tack on a binary file as a mach segment using a linker argument.

    See http://stackoverflow.com/questions/1604673/how-do-i-embed-data-into-a-mac-os-x-mach-o-binary-files-text-section/1605237#1605237

  5. Karsten Says:

    good point, but how do you set a variable to the start of a section? Doesn’t that involve some mach-api stuff? Same goes for the size of the segment. How do you easily get the size of that segment?

  6. ken Says:

    Try ‘man getsectdata’ for the suite of functions. It returns the data and the size.

  7. Karsten Says:

    awesome info, thanks!