Several misaligned pointer dereference in the library
Unsoundness
Hi, it seems that three are several misaligned pointer created from transmute are used in the library. For example, in the module machine::system_calls::<impl machine::Machine>::socket_server_accept/close
https://github.com/mthom/scryer-prolog/blob/9837187183cf2d7071dd94601c530afc2e280321/src/machine/system_calls.rs#L6563-L6594
At line 6593, the macro match_untyped_arena_ptr will match ArenaHeaderTag and transmute the u8 raw pointer to the raw pointer of TcpListener, which is aligned to 4 bytes. The undefined behavior caused by the misaligned pointer dereference can lead to unexpected behaviors that we should avoid. The runtime panic when the system doesn't tolerate the misalignment can even cause the socket operation to fail.
The similar issues could also occur in machine::system_calls::<impl machine::Machine>::http_accept/http_answer
There is a lot of unsafe stuff happening all over the place in Scryer. It's kind of scary that a lot of them are behind macros which aren't easy to debug. I contributed initial support for running the unit tests with Miri in #2281, which would make debugging such things much easier, but there are a lot of things in streams.rs, arena.rs and atom_table.rs that need to be fixed before Miri can go deeper into these issues. I think most of the current Miri blockers aren't really anything wrong, but just provenance issues that make it so that Miri can't do it's job (relevant: #2019).
The macros seem to be a really foundational part of Scryer, so changing them to be safer will probably be really hard and/or introduce a big performance regression. Relevant quote from the Nomicon:
[...] unsafe code does more than pollute a whole function: it pollutes a whole module. Generally, the only bullet-proof way to limit the scope of unsafe code is at the module boundary with privacy.
Unsafe should be completely encapsulated at module level, but using unsafe in macros like Scryer does ends up putting unsafe everywhere even in modules that could and should probably be 100% safe Rust, which is a big nightmare for debugging Undefined Behavior.
What I can do here is to list out potential issues I considered to be unsafe (also related to misalignment)
machine::loader::<impl machine::Machine>::use_moduleIn this function,transmutewas wrapped in macrocell_as_load_state_payload, which will convert au8raw pointer into raw pointer ofmachine::term_stream::LoadStatePayload<machine::term_stream::LiveTermStream>in line 1472. This could lead to misalinged pointer sinceLoadStatePayloadhas aligned bytes larger than 1 byte (e.g., The struct could be aligned to the field ofretraction_info).heap_print::HCPrinter::<‘a Outputter>::handle_heap_termIn this function, misaligned pointers were created while transmutingu8pointer into the pointer ofstd::net::TcpListeneranddashu::rational::RBig.
@bakaq I agreed with your point. However, after we compile the code to MIR, all the macros have been expanded. For example, it would be clear to find that transmute is used in the function. This method could help us debug more efficiently.
@shinmao I tried to address some of these criticisms in https://github.com/mthom/scryer-prolog/pull/2393 if you don't mind taking a look and perhaps making further suggestions for improvement.
Sorry, mentioned the wrong PR Number