The smallest C++ binary

35 points by weineng a day ago on lobsters | 12 comments

raymii | a day ago

Why even use a C compiler if all we're writing is assembly?

[OP] weineng | a day ago

it's just a fun exercise :)

breadbox | a day ago

It's a very fun exercise!

I collected a list of "smallest" executables with slightly differing constraints here: https://www.muppetlabs.com/~breadbox/software/tiny/return42.html -- but I didn't think to include one with the constraint of having to use gcc to produce the binary. I like it -- although as @sammko points out here, you can use the --oformat=binary linker flag to devolve the constraint more or less completely.

Lilian | a day ago

The assembly is a really good place to start. I have a 231 byte hello world binary compiled from this: https://github.com/Cons-Cat/libCat/blob/main/examples%2Fhello.cpp

I started there from a similar tutorial a few years ago, and then factored the code out better and incrementally built up a lot of technology around it while keeping the overhead in the simple case as low as possible. I even have a CI test to ensure it stays at 231 bytes because that matters to me.

EDIT: Oops I somehow left an unneeded include in there. Gotta fix that.

peter-leonov | a day ago

Agreed. But it still has some nice C-only tricks and without some asm the picture would've been incomplete.

majaha | a day ago

sammko | a day ago

Indeed, there is a 45B binary here. Surely it is possible to encode that in assembly (just a sequence of db's in the extreme) and get gcc to assemble that back into the 45B "raw" file. It will coincidentally be an ELF, but gcc doesn't need to know. Would this satisfy the OP's rules?

It would cease to be a "C binary" though, for most reasonable definitions of that term.

sammko | a day ago

gcc -Wa,-mx86-used-note=no -Wl,--oformat=binary -nostdlib a.S suffices

donio | 19 hours ago

Meta: a few hours ago the Lobsters title was changed from the original "C" to "C++" and the "c" tag was dropped even though the post is very clearly about C. The mod log says it was an automatic change based on user suggestions. I wonder how exactly it happened.

fanf | a day ago

There’s an intermediate stage between the C++ program that calls exit(3) and the assembler call to SYS_exit: as you can tell from the manual section number, exit(3) is a library function that pulls in loads of libc (the atexit(3) machinery amongst other stuff). The standard way to call the raw exit system call is _exit(2), and if you put that in _start() and statically link then you should get something reasonably small. You can reduce the size of the compiler invocation and the source code by writing in C instead of C++.
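To make the exit(3) versus _exit(2) distinction concrete, here is a minimal sketch, assuming a POSIX system. It forks a child that registers an atexit(3) handler and then terminates one way or the other; the parent checks whether the handler ran. The names handler_ran, on_exit_handler and report_fd are made up for illustration.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of the pipe, used by the handler (hypothetical name). */
static int report_fd = -1;

/* atexit handler: leaves one byte in the pipe so the parent can see it ran. */
static void on_exit_handler(void)
{
    ssize_t n = write(report_fd, "x", 1);
    (void)n;
}

/* Returns 1 if the atexit handler ran in a child that terminated via
   exit(3) (use_raw == 0) or _exit(2) (use_raw != 0), 0 if it did not,
   and -1 if pipe() or fork() failed. */
int handler_ran(int use_raw)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                 /* child */
        close(fds[0]);
        report_fd = fds[1];
        atexit(on_exit_handler);
        if (use_raw)
            _exit(0);               /* raw system call wrapper: skips handlers */
        exit(0);                    /* library function: runs handlers first */
    }

    close(fds[1]);                  /* parent */
    char c;
    ssize_t n = read(fds[0], &c, 1);  /* 0 (EOF) means the handler never wrote */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n == 1 ? 1 : 0;
}
```

Calling handler_ran(0) exercises the exit(3) path and handler_ran(1) the _exit(2) path. Only the first one needs the atexit machinery at termination, which is part of what gets linked in when a program calls exit(3).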

spc476 | a day ago

I did just that.

#include <stdlib.h>
void _start(void)
{
  _Exit(0); /* C99 function to call SYS_exit() */
}

And compiled with gcc -Os -nostdlib -static -o x x.c -lc. I ended up with a stripped executable size of 8912 bytes, but the actual code generated was only 96 bytes in size, as it included a generic syscall() function for _Exit().
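One way to avoid pulling in libc's generic syscall() function is to issue the system call from GCC-style inline assembly. This is only a sketch, assuming x86-64 Linux and a compiler that accepts GCC extended asm; raw_syscall1 and raw_exit are hypothetical helper names, not anything from the post.

```c
#include <sys/syscall.h>   /* SYS_exit, SYS_getpid numbers */

/* Issue a one-argument Linux system call on x86-64 (assumed target).
   Number goes in rax, first argument in rdi, result comes back in rax;
   the syscall instruction clobbers rcx and r11. */
static inline long raw_syscall1(long nr, long arg)
{
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(nr), "D"(arg)
                      : "rcx", "r11", "memory");
    return ret;
}

/* Terminate the process without going through libc. */
static inline void raw_exit(int status)
{
    raw_syscall1(SYS_exit, status);
    for (;;) { }   /* not reached; SYS_exit does not return */
}
```

With these helpers, a _start() that just calls raw_exit(0) should build with -nostdlib -static and no -lc at all. I haven't measured what size that gives, and it is no longer portable C, since it depends on the compiler's asm extension and the target's syscall ABI.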

singpolyma | a day ago

I think the answer depends on the compiler. I'm not sure resorting to non-C code which some C compilers happen to accept counts though 😉