otp Make literals chunk uncompressed on Erlang/OTP 28

If compression is important, it can be enabled for the whole .beam.

See https://github.com/erlang/otp/pull/8940#issuecomment-2426460703.

Also, to quote @pguyot from that thread:

For what it's worth, not having compressed chunks allows more easily to mmap beams and read from there. AtomVM runs on microcontrollers that can map part of the flash to the address space, either at fixed offset (Pico with XIP) or with some address translation (ESP32 families). These microcontrollers use some cache for this but it's well optimized including in regards to interruptions and multicore.

Oct 21 '24 12:10 josevalim

Thanks for the suggestion! I've asked the OTP team for feedback on this idea.

Oct 22 '24 07:10 bjorng

Here is a WIP branch that no longer compresses the literal chunk:

https://github.com/bjorng/otp/tree/bjorn/uncompressed-literals/GH-8967/rfc

I have calculated the total size of all stripped BEAM files in OTP with and without this branch. Note that an undocumented feature of the beam_lib:strip functions is that the resulting BEAM file is compressed as well as stripped.

Before: 9009382
After: 8926893
Difference: 82489 (0.9 percent)
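
For anyone who wants to repeat this kind of measurement, here is a minimal sketch of how the totals could be computed. It is not the script used for the numbers above; the module name BeamSize and its function names are made up for illustration. It relies on :beam_lib.strip/1 returning the stripped (and compressed) module as a binary when it is given a binary rather than a file name.

```elixir
defmodule BeamSize do
  # Hypothetical helper for illustration, not part of OTP or the WIP branch.

  # Size in bytes of one BEAM file after stripping. Passing the file
  # contents (a binary) makes :beam_lib.strip/1 return the stripped
  # module instead of rewriting the file on disk.
  def stripped_size(path) do
    {:ok, {_module, stripped}} = :beam_lib.strip(File.read!(path))
    byte_size(stripped)
  end

  # Total stripped size of every .beam file below `root`.
  def total_stripped(root) do
    root
    |> Path.join("**/*.beam")
    |> Path.wildcard()
    |> Enum.map(&stripped_size/1)
    |> Enum.sum()
  end

  # Absolute change between two totals, as a percentage of `before`.
  def percent_change(before, after_size) do
    abs(before - after_size) / before * 100
  end
end
```

Running BeamSize.total_stripped/1 against an OTP tree built with and without the branch would give the two totals, and BeamSize.percent_change(9_009_382, 8_926_893) |> Float.round(1) gives 0.9 for the figures above.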

@josevalim Do you have any benchmark to test whether this change would make any difference in load times?

@pguyot You will still have to convert all literals from the external term format to the internal format. Do you think that not having to uncompress the literal chunk will noticeably decrease load times?

Oct 22 '24 13:10 bjorng

Here is something I used the last time around:

defmodule Run do
  def run([path]) do
    entries =
      for beam <- Path.wildcard(Path.join(path, "**/*.beam")) do
        module = beam |> Path.basename() |> Path.rootname() |> String.to_atom()
        {module, File.read!(beam)}
      end

    IO.puts("Loading #{length(entries)}")

    # First pass: warm-up only, so its timing should be ignored.
    :timer.tc(:lists, :foreach, [
      entries,
      fn {module, binary} -> :erlang.prepare_loading(module, binary) end
    ])
    |> elem(0)
    |> IO.inspect()

    # Second pass: the measurement to compare, in microseconds.
    :timer.tc(:lists, :foreach, [
      entries,
      fn {module, binary} -> :erlang.prepare_loading(module, binary) end
    ])
    |> elem(0)
    |> IO.inspect()
  end
end

Run.run(System.argv())

You can call it as elixir script.exs path/to/rootdir. It will get all .beam files within the given path and print the measurements for preparing all of them. The first measurement is a warm-up. I couldn't measure any difference before.

Oct 22 '24 13:10 josevalim

@pguyot You will still have to convert all literals from the external term format to the internal format. Do you think that not having to uncompress the literal chunk will noticeably decrease load times?

True. If the literals are compressed (LITT), AtomVM uncompresses them when the module is loaded, allocating a copy in RAM. If they are not compressed (the LITU chunk that the PackBeam tool creates on the desktop), AtomVM just maps them, allocating no additional RAM at all. Then, when a literal is referenced, it is copied onto the heap, except for binaries (which we can do as long as modules are not unloaded in AtomVM).
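
To illustrate the difference between the two cases, here is a rough sketch in Elixir rather than AtomVM's actual C code. The module name LiteralChunk and the function table/1 are hypothetical. It assumes the chunk payload starts with a 32-bit uncompressed size followed by the literal table, and it further assumes that a size of 0 marks a table stored uncompressed; both should be checked against the loader you are targeting. AtomVM's LITU chunk is a separate format that this sketch does not model.

```elixir
defmodule LiteralChunk do
  # Hypothetical helper for illustration; neither OTP's loader nor AtomVM's.

  # `payload` is the raw content of a literal chunk: a 32-bit big-endian
  # uncompressed size followed by the literal table.
  # Assumption: a size field of 0 means the table is stored uncompressed.
  def table(<<0::32, table::binary>>) do
    # Uncompressed case: `table` refers to the bytes of the original
    # payload, so nothing has to be inflated.
    {:uncompressed, table}
  end

  def table(<<size::32, deflated::binary>>) do
    # Compressed case: inflating allocates a fresh copy of the table.
    inflated = :zlib.uncompress(deflated)
    # Sanity check: the header must agree with the inflated size.
    ^size = byte_size(inflated)
    {:compressed, inflated}
  end
end
```

In the uncompressed branch the table can be used where it already sits, which is what makes mapping it straight from flash possible; the compressed branch always needs a second buffer.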

Regarding the load times, I guess this question was not for AtomVM. Not having to uncompress the literal chunks reduces the load time as we just mmap it as opposed to allocating and inflating it, but we're not measuring this on microcontrollers.

Oct 22 '24 19:10 pguyot

Thanks!

I couldn't find any difference in load times.

I also calculated the size of the untouched BEAM files (that is, with all chunks present).

Before: 46171884
After: 47351240
Difference: 1179356 (2.6 percent)

That slight size increase seems acceptable to me.

Therefore, it seems to me that the positive aspects of this change slightly outweigh the negative aspects.

Oct 23 '24 05:10 bjorng

I've finished the implementation and published a pull request.

Oct 25 '24 12:10 bjorng