num-bigint

Expose `BigDigit` and `IntDigits`

Open little-dude opened this issue 4 years ago • 3 comments

Hello,

I'd like to implement a custom version of biguint::to_bitwise_digits_le that takes a &mut [u8] instead of allocating a Vec. For this I'd need access to BigDigit, but it's not exposed. Would it be possible to make this type and the IntDigits trait public?

If this is something you would consider, I can make a PR to document IntDigits, since it's currently undocumented.

little-dude • May 15 '20 08:05

Ah, I just saw your comment on a different issue about exposing BigDigit (https://github.com/rust-num/num-bigint/issues/118#issuecomment-565744651):

> That requires exposing the digit size, which I explicitly don't want.

~~What is the reason for this?~~ edit: found the response here

> I explicitly don't want methods with BigDigit in the public API. This was specifically cleaned up in 0.2 to prepare for supporting different sizes. IMO it's too much of a portability footgun for someone to write code that works fine on their main target, but becomes totally wrong on a different target.

That's a really good point. I am actually targeting different platforms. So now I'm wondering: if I encode a BigUint with to_bytes_le() on a platform with BigDigit::BITS = 64, can I decode it on a platform where BigDigit::BITS = 32?


Maybe there's another way around my initial problem, since exposing BigDigit seems like a bad idea. Basically, I have a Vec<BigUint> with 1 to 10 million items that I need to serialize and send over the network. I don't want to use BigUint.to_bytes_le() here because it allocates a Vec for each item, which is costly and which I don't need.

Maybe BigUint could have a method that just takes a buffer and writes the integer into it as little-endian bytes:

// returns how many bytes were written
fn write_le(&self, buf: &mut [u8]) -> Result<usize> {
    // ...
}
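A minimal sketch of what such a method could do, written as a free function over a plain `&[u32]` digit slice (least significant digit first) because the real internals of BigUint are not public. The `BufferTooSmall` error type and the digit layout are assumptions made for illustration, not part of num-bigint. This sketch writes zero as a single 0 byte.

```rust
/// Hypothetical error: the caller's buffer cannot hold the value.
#[derive(Debug, PartialEq, Eq)]
pub struct BufferTooSmall {
    /// Number of bytes the value needs.
    pub needed: usize,
}

/// Writes the value stored in `digits` (32-bit digits, least significant
/// first) into `buf` as little-endian bytes, without allocating.
/// Returns how many bytes were written.
pub fn write_le(digits: &[u32], buf: &mut [u8]) -> Result<usize, BufferTooSmall> {
    // Ignore high-order zero digits so the output has no trailing zero bytes.
    let len = digits.iter().rposition(|&d| d != 0).map_or(0, |i| i + 1);
    let digits = &digits[..len];

    // Zero is written as a single 0 byte in this sketch.
    let Some(&top) = digits.last() else {
        if buf.is_empty() {
            return Err(BufferTooSmall { needed: 1 });
        }
        buf[0] = 0;
        return Ok(1);
    };

    // The top digit only contributes its significant bytes (1 to 4).
    let top_bytes = 4 - (top.leading_zeros() as usize) / 8;
    let needed = (len - 1) * 4 + top_bytes;
    if buf.len() < needed {
        return Err(BufferTooSmall { needed });
    }

    // Each digit fills one 4-byte chunk; the last chunk may be shorter.
    for (chunk, d) in buf[..needed].chunks_mut(4).zip(digits) {
        let bytes = d.to_le_bytes();
        chunk.copy_from_slice(&bytes[..chunk.len()]);
    }
    Ok(needed)
}
```

With a function shaped like this, one scratch buffer could be reused across every item of the vector, so serializing millions of values would not need a per-item allocation.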

little-dude • May 15 '20 08:05

> So now I'm wondering: if I encode a BigUint with to_bytes_le() on a platform with BigDigit::BITS = 64, can I decode it on a platform where BigDigit::BITS = 32?

The byte representation should be independent of whatever digit size was used internally. Why do you suspect there could be a problem here?
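To illustrate why the digit size should not leak into the bytes, here is a small sketch that stores the same value once as 32-bit digits and once as 64-bit digits, then serializes both to little-endian bytes. The function names and the plain-slice storage are hypothetical stand-ins, not the num-bigint implementation.

```rust
/// Little-endian bytes of a value stored as 32-bit digits
/// (least significant digit first).
pub fn bytes_from_u32_digits(digits: &[u32]) -> Vec<u8> {
    trim(digits.iter().flat_map(|d| d.to_le_bytes()).collect())
}

/// Little-endian bytes of a value stored as 64-bit digits
/// (least significant digit first).
pub fn bytes_from_u64_digits(digits: &[u64]) -> Vec<u8> {
    trim(digits.iter().flat_map(|d| d.to_le_bytes()).collect())
}

/// Drops high-order zero bytes, keeping at least one byte when any exist.
fn trim(mut bytes: Vec<u8>) -> Vec<u8> {
    while bytes.len() > 1 && bytes.last() == Some(&0) {
        bytes.pop();
    }
    bytes
}
```

The value 0x1_0000_0002 is `[2, 1]` as 32-bit digits and `[0x1_0000_0002]` as a single 64-bit digit. Both functions flatten their digits into the same byte stream, and trimming removes the padding that differs between the two layouts, so the wire format stays the same.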

> Basically, I have a Vec<BigUint> with 1 to 10 million items that I need to serialize and send over the network. I don't want to use BigUint.to_bytes_le() here because it allocates a Vec for each item, which is costly and which I don't need.

Maybe you could use the serde support here with bincode? I think you could then serialize your whole vector in one shot, maybe even as part of some surrounding data too.

Otherwise, another suggestion in #12 was to add iterators. I think a u8 iterator would be the MVP, but it should also be possible to iterate u32/u64 digits, regardless of the internal digit size.
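As a rough sketch of the digit-iterator idea, the function below yields 32-bit digits from storage that happens to use 64-bit digits. It is a hypothetical free function over a plain slice and assumes normalized storage (the last `u64` is non-zero unless the slice is empty). It is not the num-bigint API.

```rust
/// Yields 32-bit digits (least significant first) from a value stored as
/// 64-bit digits, hiding the storage width from the caller.
pub fn iter_u32_digits(digits: &[u64]) -> impl Iterator<Item = u32> + '_ {
    // If the upper half of the top digit is zero, skip it so the caller
    // never sees a trailing zero digit.
    let count = match digits.last() {
        Some(&top) if top >> 32 == 0 => digits.len() * 2 - 1,
        _ => digits.len() * 2,
    };
    digits
        .iter()
        // Split each 64-bit digit into its low and high 32-bit halves.
        .flat_map(|&d| [d as u32, (d >> 32) as u32])
        .take(count)
}
```

A caller would see the same sequence of `u32` digits whether the storage used 32-bit or 64-bit digits, which is the portability property the public API is meant to preserve.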

cuviper • May 15 '20 16:05

I currently have to use a fork of bigint that just exposes the internal slice.

Iterator support would be awesome.

I guess something like:

fn iter_bytes_le(&self) -> impl Iterator<Item = u8>; // (+ ExactSize + DoubleEnded + Fused)
fn iter_bytes_be(&self) -> impl Iterator<Item = u8>;
fn iter_u32_digits(&self) -> impl Iterator<Item = u32>;
fn iter_u64_digits(&self) -> impl Iterator<Item = u64>;

Speedy37 • Jun 27 '20 22:06