Make literals chunk uncompressed on Erlang/OTP 28
If compression is important, it can be enabled for the whole .beam.
See https://github.com/erlang/otp/pull/8940#issuecomment-2426460703.
Also, to quote @pguyot from that thread:
For what it's worth, not having compressed chunks allows more easily to mmap beams and read from there. AtomVM runs on microcontrollers that can map part of the flash to the address space, either at fixed offset (Pico with XIP) or with some address translation (ESP32 families). These microcontrollers use some cache for this but it's well optimized including in regards to interruptions and multicore.
Thanks for the suggestion! I've asked the OTP team for feedback on this idea.
Here is a WIP branch that no longer compresses the literal chunk:
https://github.com/bjorng/otp/tree/bjorn/uncompressed-literals/GH-8967/rfc
I have calculated the total size of all stripped BEAM files in OTP with and without this branch. Note that an undocumented feature of the beam_lib:strip functions is that the resulting BEAM file is compressed as well as stripped.
- Before: 9009382
- After: 8926893
- Difference: 82489 (0.9 percent)
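For anyone who wants to reproduce totals like these, here is a minimal sketch. It assumes the measurement is a plain sum over every .beam file below a directory; the module and function names (BeamSizes, stripped_size, totals) are made up for illustration and are not taken from the branch. It passes binaries to :beam_lib.strip/1 so that the files on disk are left alone.

```elixir
defmodule BeamSizes do
  # Hypothetical helper, not part of OTP or of the branch above.

  # Size in bytes of a BEAM binary after :beam_lib.strip/1.
  # Passing a binary (rather than a path) leaves the file on disk untouched.
  def stripped_size(beam) when is_binary(beam) do
    {:ok, {_module, stripped}} = :beam_lib.strip(beam)
    byte_size(stripped)
  end

  # Returns {raw_total, stripped_total} for every .beam file below `path`.
  def totals(path) do
    path
    |> Path.join("**/*.beam")
    |> Path.wildcard()
    |> Enum.reduce({0, 0}, fn file, {raw, stripped} ->
      beam = File.read!(file)
      {raw + byte_size(beam), stripped + stripped_size(beam)}
    end)
  end
end
```

Calling BeamSizes.totals("path/to/rootdir") once per OTP build gives the two numbers to compare.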
@josevalim Do you have any benchmark to test whether this change would make any difference in load times?
@pguyot You will still have to convert all literals from the external term format to the internal format. Do you think that not having to uncompress the literal chunk will noticeably decrease load times?
Here is something I used the last time around:
defmodule Run do
  def run([path]) do
    entries =
      for beam <- Path.wildcard(Path.join(path, "**/*.beam")) do
        module = beam |> Path.basename() |> Path.rootname() |> String.to_atom()
        {module, File.read!(beam)}
      end

    IO.puts("Loading #{length(entries)}")

    # Warm-up run; its timing is printed but should be ignored.
    :timer.tc(:lists, :foreach, [
      entries,
      fn {module, binary} -> :erlang.prepare_loading(module, binary) end
    ])
    |> elem(0)
    |> IO.inspect()

    # Measured run, in microseconds.
    :timer.tc(:lists, :foreach, [
      entries,
      fn {module, binary} -> :erlang.prepare_loading(module, binary) end
    ])
    |> elem(0)
    |> IO.inspect()
  end
end

Run.run(System.argv())
You can call it as elixir script.exs path/to/rootdir. It will get all .beam files within the given path and print the time taken to prepare all of them for loading. The first measurement is a warm-up. I couldn't measure any difference before.
@pguyot You will still have to convert all literals from the external term format to the internal format. Do you think that not having to uncompress the literal chunk will noticeably decrease load times?
True. If the literals are compressed (LITT), AtomVM uncompresses them when the module is loaded, allocating a copy in RAM. If they are not compressed (the LITU chunk that the PackBeam tool creates on the desktop), AtomVM just maps them, allocating no additional RAM at all. Then, when a literal is referenced, it is copied onto the heap, except for binaries (which we can do as long as modules are not unloaded in AtomVM).
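To make the compressed versus uncompressed distinction concrete, here is a rough sketch of how a tool could read the literal chunk of a standard BEAM file (chunk ID "LitT", written LITT above). It assumes the chunk begins with a 32-bit size field followed by zlib-compressed data, and that a size field of zero marks uncompressed data in the new format. That zero-size convention is my assumption, not something stated in this thread, so check it against the OTP sources. The module name LiteralChunk is made up for illustration.

```elixir
defmodule LiteralChunk do
  # Hypothetical helper, not part of OTP or AtomVM.
  # Returns :none, {:uncompressed, data} or {:compressed, inflated_data}.
  def read(beam) do
    case :beam_lib.chunks(beam, [~c"LitT"], [:allow_missing_chunks]) do
      {:ok, {_module, [{~c"LitT", :missing_chunk}]}} ->
        :none

      # Assumption: a zero size field means the payload is stored as-is.
      {:ok, {_module, [{~c"LitT", <<0::32, raw::binary>>}]}} ->
        {:uncompressed, raw}

      # Otherwise the payload is zlib-compressed and must be inflated,
      # which is the extra allocation discussed above.
      {:ok, {_module, [{~c"LitT", <<_size::32, packed::binary>>}]}} ->
        {:compressed, :zlib.uncompress(packed)}
    end
  end
end
```

Only the :compressed branch needs a second buffer; the :uncompressed branch returns a sub-binary of the chunk that was read.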
Regarding the load times, I guess this question was not for AtomVM. Not having to uncompress the literal chunks reduces the load time as we just mmap it as opposed to allocating and inflating it, but we're not measuring this on microcontrollers.
Thanks!
I couldn't find any difference in load times.
I also calculated the size of the untouched BEAM files (that is, with all chunks present).
- Before: 46171884
- After: 47351240
- Difference: 1179356 (2.6 percent)
That slight size increase seems acceptable to me.
Therefore, it seems to me that the positive aspects of this change slightly outweigh the negative aspects.
I've finished the implementation and published a pull request.