zstd-rs icon indicating copy to clipboard operation
zstd-rs copied to clipboard

Support for no_std?

Open tomaka opened this issue 4 years ago • 15 comments

:wave:

As far as I know, decoding (and encoding) zstd is in principle a pure CPU operation that does not require any help from the operating system.

As such, it would make sense to me to mark the crate as #[no_std].

Pragmatically-speaking, this would mean exposing an API that does not use anything from std::io. The existing API (that uses std::io::Read) could be put behind an std feature-gate, like many crates in the ecosystem do.

tomaka avatar May 14 '21 08:05 tomaka

I theory you are right, zstd could be implemented using none of the std lib. This implementation does use std::vec::Vec in many places, because it is

  1. convenient
  2. Probably saves memory for the user, outside of worst-case szenarios

Iirc this could also be used with only the alloc part of the std lib, but I am no expert on #[no_std] rust. If this is feasible I would review a PR for it.

Dou you have a realworld usecase for #[no_std] zstd decoding?

KillingSpark avatar May 14 '21 15:05 KillingSpark

Inside a wasm blockchain is one example, but I could certainly expect it to be useful in browser contexts - I've not yet had to unzstd something in the browser but I've had to do various other decodings in browser so I think it's just a matter of time before I bump into this in a browser context.

gilescope avatar Jun 28 '21 10:06 gilescope

Right I didn't think about wasm, that is a pretty big thing. If I understand correctly, the only thing that would need to be changed would be the imports of Vec and Hashmap to use the alloc:: imports. Can you confirm that? I have no real idea on how to test this quickly.

KillingSpark avatar Jun 28 '21 13:06 KillingSpark

At minimum you'd have to replace all std:: imports by core:: and alloc::.

Additional challenges:

  • std::io::{Read, Error, Write} -- Not in alloc
  • implementing std::Error -- Can probably be removed via conditional compilation
  • HashMap -- Not in alloc. Could use BTreeMap instead.
  • VERBOSE + println! -- Not in alloc. Should be easy to fix, by defining a custom verbose_println! macro which conditionally expands to nothing or println!
  • Tests need std -- Probably easy to fix via conditional compilation

CodesInChaos avatar Jul 01 '21 08:07 CodesInChaos

I am a bit sketchy - think there is std wasm and no-std wasm. For std wasm there might be little to be done.

On Thu, 1 Jul 2021 at 09:23, CodesInChaos @.***> wrote:

At minimum you'd have to replace all std:: imports by core:: and alloc::. But it looks like you're using other std features, std::io::Read is a critical one.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/KillingSpark/zstd-rs/issues/15#issuecomment-872036532, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAGEJCEFXTBND6OVMENZFHTTVQQYRANCNFSM4437OJDA .

gilescope avatar Jul 01 '21 08:07 gilescope

Replacing the Hashmap is no problem, it's just used for the dictionary lookup once per zstd stream (and only if a dict is used). I could just make this a Vec or whatever it doesn't really matter for the performance.

the std::Error thing should be relatively easy. If I am at that it would probably be worthwhile to convert to something like thiserror. Do you know if it is possible to use thiserror in the context of [no_std] or if it can be applied only conditionally?

Making the test use std and make them conditional shouldn't be a big problem.

VERBOSE + println! would be better if they were wrapped in a macro anyways. The way it's currently done is a bit ugly. Are the logging macros like info! and warn! a thing in [no_std] environments?

KillingSpark avatar Jul 03 '21 06:07 KillingSpark

HashMap is usually replaced with hashbrown::HashMap. The hashbrown library is actually used by the Rust standard library for its own implementation.

For the Error trait, personally I just don't bother implementing this trait anymore, but it can be implemented conditionally with #[cfg(feature = "std")] or something like #[cfg_attr(feature = "std", derive(thiserror::Error)]. You just need to add a std feature in your Cargo.toml.

As for the log macros, yes they are available.

To me the major difficulty is the usage of the std::io::Read trait.

tomaka avatar Jul 03 '21 10:07 tomaka

You are right using the std::io::Read makes it hard to port to no_std. But I really don't want to replace it. Maybe the errorhandling working group will succeed in pushing std::error::Error into libcore at some point?

KillingSpark avatar Jul 05 '21 19:07 KillingSpark

Any progress for no_std or wasm? I need it too. Thx!

yjhmelody avatar Aug 03 '22 03:08 yjhmelody

Well there is still the issue of std::io::Read/Write being used a lot throughout the codebase.

These do not exist in core afaik. I am not really involved in no_std development, so I'd need some help here. Is there some best practice to replace these traits with something more appropriate?

If the answer is: we don't use these traits, what API would a no_std user of this crate want, if not Read/Write based?

KillingSpark avatar Aug 03 '22 09:08 KillingSpark

Well there is still the issue of std::io::Read/Write being used a lot throughout the codebase.

These do not exist in core afaik. I am not really involved in no_std development, so I'd need some help here. Is there some best practice to replace these traits with something more appropriate?

If the answer is: we don't use these traits, what API would a no_std user of this crate want, if not Read/Write based?

I think you could define some simple traits that similar to std::io::Read/Write but could used in no_std.

Or just use bytes lib https://docs.rs/bytes/latest/bytes/ ?

yjhmelody avatar Aug 03 '22 10:08 yjhmelody

I think you could define some simple traits that similar to std::io::Read/Write but could used in no_std.

Sure, but I would imagine that the no_std community has some go-to replacement for these traits? Also using the std traits does provide some benefits, so I'd have to conditionally use one or the other trait depending on the std feature which seems complicated.

Using bytes would mean a relatively big change to the mechanisms this crate uses to get input and write output. It relies relatively heavily on the read/write traits.

Maybe it would be worth to fork this project and see how complicated it would be to just do a no_std version without keeping the benefits of using std and go from there?

KillingSpark avatar Aug 03 '22 10:08 KillingSpark

Ok, so I played around with this a little bit.

AFAICT the best course of action would be:

  1. Add a std feature that is on by default
  2. A lot of #[cfg(std)] and #[cfg_attr(std, )] for all the this_error stuff
  3. Make Read/Write traits in this crate that have a generic error type and blanket impl it for std::io::Read/Write if the std feature is on
  4. Figure out best way of integrating these error types with the existing error enums without losing the this_error ergonomics
  5. Figure out what to do with the println! passages. I would REALLY like to keep them for debugging in the future. It's probably just a short macro_rules! block that can be left empty in case the std feature is off... Debugging is likely happening in std environments
  6. Add some form of breakage detection in the CI. Probably a second build + tests run with the std feature turned off?
  7. Find solutions to all the problems I did not see yet while toying around

Tbh I don't know when and if I will find the motivation to do this. PRs are very much welcome though if you want to attempt this @yjhmelody

KillingSpark avatar Aug 09 '22 18:08 KillingSpark

Fine, I will have a try if I have time :). But I have no knowledge about zstd.

yjhmelody avatar Aug 10 '22 09:08 yjhmelody

You don't have to, just wanted to let you know :)

Actual knowledge about the internals of the zstd format should not be necessary though.

KillingSpark avatar Aug 10 '22 09:08 KillingSpark