liballocs Better high-level overview via a more up-to-date research paper?

The whole codebase really needs me to do a brain dump in each .c file, explaining what goes on in it. A lot of non-obvious things that are 'obvious to me' are not stated anywhere.

For the 'overall' picture, the Onward! 2015 paper in theory covers this, but it doesn't do a great job and some of its details are outdated. It may be time to write a follow-up paper focusing on experience. Some things I can think of are the following.

initialization / bootstrapping being non-obviously hard
uniqtypes are allocators
memtables not being a good idea (yet) in practice
the generalised notion of reference
array uniqtypes going away? the story of that
stability in extensions of system interfaces: why trailers are better than headers
richer versions of system interfaces (libunwind, dladdr, maybe libdlbind counts here?)
the story of the custom dynamic linker
some of the content from my [forthcoming] "how to hook malloc, really, really, really" blog post
something about rejigging how allocation sites are classified, if I get on to that
something about my 'working underneath libc' and the contrast with sanitizer runtimes
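
To make the 'trailers are better than headers' item concrete, here is a minimal sketch, not liballocs's actual code. It assumes glibc's malloc_usable_size(); the names struct trailer, malloc_with_trailer and trailer_site are hypothetical. The point it illustrates: with a trailer, the pointer handed to the caller is exactly the pointer the underlying malloc returned, so uninstrumented code can still pass it to free() or realloc().

```c
#include <assert.h>
#include <malloc.h>   /* malloc_usable_size (glibc extension) */
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-chunk metadata; not liballocs's real layout. */
struct trailer { const void *alloc_site; };

/* Plain byte copy: the trailer may sit in slack space beyond the
 * requested size, which size-checked library copies might reject. */
static void copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--) *d++ = *s++;
}

/* The trailer occupies the last bytes of the chunk's usable space. */
static char *trailer_addr(void *chunk)
{
    return (char *) chunk + malloc_usable_size(chunk) - sizeof (struct trailer);
}

void *malloc_with_trailer(size_t size, const void *site)
{
    if (size > SIZE_MAX - sizeof (struct trailer)) return NULL;
    void *chunk = malloc(size + sizeof (struct trailer));
    if (!chunk) return NULL;
    /* The usable size is at least what we asked for, so the trailer
     * cannot overlap the caller's first 'size' bytes. */
    assert(malloc_usable_size(chunk) >= size + sizeof (struct trailer));
    struct trailer t = { site };
    copy_bytes(trailer_addr(chunk), &t, sizeof t);
    return chunk; /* same pointer the underlying malloc returned */
}

const void *trailer_site(void *chunk)
{
    struct trailer t;
    copy_bytes(&t, trailer_addr(chunk), sizeof t);
    return t.alloc_site;
}
```

A header-based design would instead return chunk + sizeof(header), so any free() call that bypassed the wrapper would hand the underlying allocator a pointer it never issued.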

Mar 09 '22 16:03 stephenrkell

Another one:

why make_precise wasn't the right interface, and how to do it better

Mar 09 '22 16:03 stephenrkell

More

why 'typedefs as aliases' isn't quite right, but is fixable
why 'global-symbol uniqueness' isn't quite right, for aliased uniqtypes

Mar 09 '22 16:03 stephenrkell

why 'allocation hierarchy' isn't quite accurate (mmaps are a patchwork) but seems to be recoverable

Mar 09 '22 16:03 stephenrkell

'there is no static' / the dynamic linker as an allocator (arguably covered in 2015 paper, but good to go into)

Mar 09 '22 16:03 stephenrkell

the motivation for fake_dlsym(), i.e. avoiding allocation where possible
the bootstrapping issues around use of malloc both 'early' (before systrapping: our private mallocs should not grow the 'maps' file while we're walking it) and later (the mapping_sequence 'pool' / reentrance issues handling mmap)
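
As a rough illustration of the 'early' constraint, the sketch below walks /proc/self/maps using only fixed-size stack buffers and raw read(), so the walk itself is not expected to create new mappings. It is an assumption-laden sketch, not the liballocs code: the names walk_maps_noalloc, address_is_mapped and struct probe are hypothetical, and it relies on strtoul() not allocating, which holds for glibc as far as I know.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>   /* strtoul */
#include <unistd.h>   /* read, close, ssize_t */

typedef void maps_cb(unsigned long start, unsigned long end, void *arg);

/* Call cb for each mapping; returns the number of mappings seen, or -1. */
int walk_maps_noalloc(maps_cb *cb, void *arg)
{
    int fd = open("/proc/self/maps", O_RDONLY);
    if (fd < 0) return -1;
    char buf[4096];   /* raw read buffer, on the stack */
    char line[512];   /* current line; over-long lines are truncated */
    size_t len = 0;
    int count = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; ++i) {
            if (buf[i] != '\n') {
                if (len < sizeof line - 1) line[len++] = buf[i];
                continue;
            }
            assert(len < sizeof line);
            line[len] = '\0';
            len = 0;
            /* Each line begins "start-end " in hexadecimal. */
            char *dash;
            unsigned long start = strtoul(line, &dash, 16);
            if (*dash != '-') continue;
            unsigned long end = strtoul(dash + 1, NULL, 16);
            cb(start, end, arg);
            ++count;
        }
    }
    close(fd);
    return count;
}

/* Hypothetical client: count the mappings that contain one address. */
struct probe { unsigned long addr; int hits; };

static void probe_cb(unsigned long start, unsigned long end, void *arg)
{
    struct probe *p = arg;
    if (p->addr >= start && p->addr < end) ++p->hits;
}

int address_is_mapped(const void *addr)
{
    struct probe p = { (unsigned long) addr, 0 };
    if (walk_maps_noalloc(probe_cb, &p) < 0) return -1;
    return p.hits;
}
```

Using stdio (fopen/getline) here would be simpler, but it may call malloc, which is exactly what the early bootstrapping phase has to avoid.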

Mar 30 '22 20:03 stephenrkell

something about the practicalities of working in-process, making explicit the ptrace() contrast, i.e. why it's preferable, but harder, to get reflection happening in-process

Mar 30 '22 20:03 stephenrkell

sizeofness analysis generalised, if I get around to that (e.g. the perlbench case: sizeofness in fields / as a conceptually dynamic quantity, but one that happens to be mostly static/fixed)

Mar 30 '22 20:03 stephenrkell

recap the -ffunction-sections need, then cover how we eliminated the custom binutils (maybe borrowing the also-forthcoming blog post about more robust symbol interposition)

Mar 31 '22 10:03 stephenrkell

reentrancy redux: we had malloc-malloc reentrancy via dlsym (eliminated by fake_dlsym) but also malloc-mmap-malloc reentrancy via sys_alloc (eliminated by a never-mmapping private malloc).

Mar 31 '22 15:03 stephenrkell

also see malloc_hooks_stubs_preload.c for an interesting lock reentrancy issue

Mar 31 '22 16:03 stephenrkell

Could make a point that all this reentrancy-avoidance is about 'stratifying' (in the sense of Bracha and Ungar) the memory allocation system in the process. We rely on the lower stratum, i.e. the kernel's page allocator.

Mar 31 '22 16:03 stephenrkell

The old mallochooks approach diverted reentrant calls to a sideline malloc. That requires free() to have a way to tell sideline chunks apart from mainline ones, because a chunk allocated in a reentrant context needn't be freed in a reentrant context. The new way is to avoid the reentrancy altogether.
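
The old approach can be sketched roughly as follows. This is a reconstruction for illustration only, not the mallochooks code: every name in it (sideline_arena, in_hook, hooked_malloc, hooked_free, record_allocation) is hypothetical, and the sideline allocator is a trivial bump allocator that is not thread-safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SIDELINE_SIZE ((size_t) 1 << 16)
static _Alignas(16) unsigned char sideline_arena[SIDELINE_SIZE];
static size_t sideline_used;
static __thread int in_hook;   /* set while the hook's own code runs */
void *last_metadata;           /* stands in for the hook's bookkeeping */

/* Bump allocator used only for reentrant requests; it never frees. */
static void *sideline_malloc(size_t size)
{
    assert(in_hook);
    if (size > SIDELINE_SIZE) return NULL;
    size_t rounded = (size + 15) & ~(size_t) 15;
    if (rounded > SIDELINE_SIZE - sideline_used) return NULL;
    void *p = sideline_arena + sideline_used;
    sideline_used += rounded;
    return p;
}

static int is_sideline_chunk(const void *p)
{
    uintptr_t a = (uintptr_t) p, base = (uintptr_t) sideline_arena;
    return a >= base && a < base + SIDELINE_SIZE;
}

void *hooked_malloc(size_t size);

/* Instrumentation that itself needs memory, hence the reentrancy. */
static void record_allocation(void *chunk)
{
    (void) chunk;
    last_metadata = hooked_malloc(32);
}

void *hooked_malloc(size_t size)
{
    if (in_hook) return sideline_malloc(size);   /* reentrant call */
    in_hook = 1;
    void *chunk = malloc(size);
    if (chunk) record_allocation(chunk);
    in_hook = 0;
    return chunk;
}

void hooked_free(void *p)
{
    if (!p) return;
    /* This call is not reentrant, yet p may be a sideline chunk,
     * so the flag is no help: we must classify the pointer itself. */
    if (is_sideline_chunk(p)) return;
    free(p);
}
```

The address-range test in hooked_free() is the extra machinery the paragraph above refers to; avoiding the reentrancy in the first place makes it unnecessary.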

Mar 31 '22 16:03 stephenrkell

Another thing to mention: the mess with aliases (see comments in allocs-cflags). Neither GRP_COMDAT section groups nor global-symbol uniquing do the right thing in the presence of aliases; they can (respectively) discard aliases or break (de-alias) them.
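
For anyone unfamiliar with the kind of aliasing at issue, here is a minimal sketch using the GCC/Clang alias attribute on ELF. The struct and the symbol names are hypothetical stand-ins, not real uniqtype definitions. What matters is that two symbol names denote one object, so anything that discards one name, or gives each name its own copy, changes what pointer comparisons see.

```c
#include <assert.h>

/* Hypothetical stand-in for a type descriptor. */
struct fake_uniqtype { const char *name; unsigned size_in_bytes; };

/* One definition... */
struct fake_uniqtype uniqtype_int = { "int", sizeof (int) };

/* ...and a second symbol naming the very same object. */
extern struct fake_uniqtype uniqtype_signed_int
    __attribute__((alias("uniqtype_int")));

/* Code comparing descriptors by address relies on this identity. */
void check_alias_identity(void)
{
    assert(&uniqtype_int == &uniqtype_signed_int);
}
```

If a linking step 'de-aliased' these symbols into two separate objects, the assertion would fail even though both descriptors would still have identical contents.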

Jan 23 '23 14:01 stephenrkell

Another thing was my aborted attempt at using strongly connected components (SCCs) in the DWARF to generate a stable notion of type identity for recursive types that may be defined independently multiple times. This 'obviously' doesn't work because of opaque structs and the like, since these cut the graph in different ways in different contexts.

For some reason, this has turned out not to be important, even though for a while I was sure it would be.

Jan 31 '23 05:01 stephenrkell

Continuing the 'opaque struct' thing: one could perhaps relate it to the idea in C that "types have no linkage" and the definition of "compatible type". I think the short summary (check this!) is that there's no nominal distinction of struct types having 'the same' definition... whereas in effect we are forced to create one. The reason it doesn't bite is that "independently multiple times" doesn't happen [much].

Jan 31 '23 05:01 stephenrkell

Continuing again: we are only forced to create one if we let synthetic file/line information creep into the summary codes of the struct or any of its (transitive) constituents. But we might be able to avoid that. Need to re-check what I did about this.

Jan 31 '23 10:01 stephenrkell

Another topic is the fun had with gdb during chain loading... e.g. how gdb will scan the .dynamic section in the file on disk, and only access inferior memory in very specific places. It even caches the location of _dl_debug_state, leading to our amusing gyrations in allocsld. I guess this belongs with discussion of the ELF zygote in libdlbind... overall, the mechanisms can support the dynamism but the assumptions made by tools sometimes partially undo that support.

Jun 12 '23 18:06 stephenrkell

One gyration (in allocsld/chain.c) involves padding .interp so we can later clobber it (if 'requested') or else simply later swap argv[0] to point to the inferior ld.so (if 'invoked'), to avoid the real (inferior) ld.so getting confused about its own name on disk.

But the main point continuing the preceding: you could call it a 'disk--memory assumption', about what is equal between a file's image on disk and its image in memory. One cannot simply overwrite stuff in memory and have that take precedence. That's why we end up clobbering _dl_debug_state in the inferior ld.so, so that instead of a no-op it calls our own _dl_debug_state which is the One True debugger-please-breakpoint-me function. The debugger will call the thing whose address it looks up on disk, not in memory.

It's not 100% clear that we need the debugger-please-breakpoint-me function to be in allocsld.so, if gdb will reliably find the real (inferior) ld.so when it looks.
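
For comparison with the on-disk lookup, the in-memory side of the debugger rendezvous can be inspected from inside the process. The sketch below assumes a dynamically linked executable on a glibc-style system, where the dynamic linker fills in the DT_DEBUG entry of the executable's dynamic section with the address of a struct r_debug; find_r_debug and rendezvous_breakpoint are hypothetical names. Its r_brk field is the address the dynamic linker advertises as the place for a debugger to set its breakpoint.

```c
#include <assert.h>
#include <link.h>     /* ElfW, struct r_debug, DT_DEBUG */
#include <stddef.h>

/* The executable's own dynamic section, as mapped in memory. */
extern ElfW(Dyn) _DYNAMIC[];

/* Find the rendezvous structure via DT_DEBUG, or NULL if absent. */
struct r_debug *find_r_debug(void)
{
    for (ElfW(Dyn) *d = _DYNAMIC; d->d_tag != DT_NULL; ++d)
        if (d->d_tag == DT_DEBUG)
            return (struct r_debug *) d->d_un.d_ptr;
    return NULL;
}

/* Address of the breakpoint function as advertised in memory. */
ElfW(Addr) rendezvous_breakpoint(void)
{
    struct r_debug *r = find_r_debug();
    if (!r) return 0;
    assert(r->r_version >= 1);
    return r->r_brk;
}
```

Nothing here says which of the two views a given debugger consults; the point is only that the in-memory value and the value derived from the file on disk are separate things that can disagree once memory has been overwritten.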

Nov 20 '23 11:11 stephenrkell

The n-way comparison of indexers mooted in #67 could be a good experimental addition to a new paper.

Dec 12 '23 17:12 stephenrkell
