encoding_rs icon indicating copy to clipboard operation
encoding_rs copied to clipboard

Unsafe review notes

Open Manishearth opened this issue 4 months ago • 5 comments

I'm performing an unsafe review of encoding_rs. The hope is by the end of this I will have safety documentation for all unsafe blocks in the crate.

Opening this issue to keep track of things I have found, especially things I do not plan to fix with documentation (but may fix later).

The branch with the documentation is https://github.com/manishearth/encoding_rs/tree/unsafe (compare)

Manishearth avatar Feb 14 '24 20:02 Manishearth

Existing UB: https://github.com/hsivonen/encoding_rs/issues/79

Manishearth avatar Feb 14 '24 20:02 Manishearth

One thing I think that can be done is that the constants in ascii.rs can be cleaned up greatly.

I believe the following works:

pub const ALU_STRIDE_SIZE: usize = mem::size_of::<usize> * 2;
pub const MAX_STRIDE_SIZE: usize = ALU_STRIDE_SIZE;
pub const ALU_ALIGNMENT: usize = mem::align_of::<usize>();
pub const ALU_ALIGNMENT_MASK: usize = ALU_ALIGNMENT - 1;

For SIMD, I think we can do

pub const SIMD_ALIGNMENT = align_of::<u16x8>();

and assert in a constant that it is equivalent to align_of::<u8x16>()

Manishearth avatar Feb 14 '24 20:02 Manishearth

The store8_unaligned in copy_bmp_to could probably be aligned since we're writing to an aligned u16 slice

if SIMD_STRIDE_SIZE is a multiple of 4 (it is).

Manishearth avatar Feb 15 '24 00:02 Manishearth

Utf16Destination/Utf8Destination are extremely leaky abstractions, safety wise. Their safety depends heavily on how, precisely, they are used, but they're not unsafe.

I'm not sure if they can be properly reviewed without adding unsafe there

Manishearth avatar Feb 15 '24 02:02 Manishearth

I think the way to handle Utf16/Utf8Destination is to change the handle types to be "checked for one write" and "checked for two writes" (etc), and have them call each other. I think I'll do this as a separate PR, but I am convinced these are safe now.

They're fiddly enough that I think a refactoring would help in the long run. The invariants are currently scattered across the file.

Manishearth avatar Mar 06 '24 17:03 Manishearth