Add Serde, Strum and From features

Open FelixEngl opened this issue 1 year ago • 4 comments

Hello Mickaël,

this adds various functions that I found useful while working with file-format. To sum it up:

  • Adds optional non-std serde support for the enums FileFormat and Kind.
  • Adds optional extensions from strum. IntoEnumIterator and AsRef<str> are especially useful when writing autogenerated code.
  • Adds the optional features from_extension and from_media_type. Both return a slice of the file formats that may be associated with a given media type or extension. I also included tests for all existing file formats, as well as some health checks to ensure that newly added file formats are also valid.
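To illustrate the slice-returning lookup described in the last bullet, here is a minimal sketch. The `Format` enum, its three variants, and the extension strings are hypothetical stand-ins chosen for illustration; they are not the crate's actual `FileFormat` definitions or the PR's code.

```rust
// Hypothetical stand-in for the crate's much larger `FileFormat` enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    Jpeg,
    MpegTransportStream,
    TypeScript,
}

impl Format {
    /// Returns every format that may use `extension` (compared case-insensitively).
    /// An unknown extension yields an empty slice instead of an error.
    pub fn from_extension(extension: &str) -> &'static [Format] {
        match extension.to_ascii_lowercase().as_str() {
            "jpg" | "jpeg" => &[Format::Jpeg],
            // One extension can belong to several unrelated formats,
            // which is why the return type is a slice and not an Option.
            "ts" => &[Format::MpegTransportStream, Format::TypeScript],
            _ => &[],
        }
    }
}
```

Returning a `&'static` slice keeps the lookup allocation-free and lets the caller decide how to handle ambiguous extensions.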

I know it looks like a big pull request, but that is mostly due to the necessary tests and the macro calls that generate the from_* methods. The real code changes are only around 80 to 90 lines. (To be honest, if we accept the strum features, we could probably drop all of the new from_* tests except the three necessary ones, because we could use FileFormat::iter() to write the tests.)
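The iterator-based test idea could look roughly like the sketch below. It assumes a hypothetical two-variant `Format` enum, and the hand-written `ALL` constant stands in for what `FileFormat::iter()` would provide through strum; none of these names are taken from the PR.

```rust
// Hypothetical stand-in for the crate's `FileFormat` enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    Jpeg,
    Png,
}

impl Format {
    // Assumption: stands in for the iterator strum would derive.
    pub const ALL: [Format; 2] = [Format::Jpeg, Format::Png];

    /// The format's own extension.
    pub fn extension(self) -> &'static str {
        match self {
            Format::Jpeg => "jpg",
            Format::Png => "png",
        }
    }

    /// All formats that may use the given extension.
    pub fn from_extension(extension: &str) -> &'static [Format] {
        match extension {
            "jpg" | "jpeg" => &[Format::Jpeg],
            "png" => &[Format::Png],
            _ => &[],
        }
    }
}

/// Health check: collects every format whose own extension does not map back to it.
pub fn unmapped_formats() -> Vec<Format> {
    Format::ALL
        .iter()
        .copied()
        .filter(|f| !Format::from_extension(f.extension()).contains(f))
        .collect()
}
```

A single test asserting that `unmapped_formats()` is empty would then cover every variant, including ones added later.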

Best regards, Felix

FelixEngl avatar Aug 09 '24 10:08 FelixEngl

Hello Felix,

Thank you very much for your PR. I'm currently on holiday, but I'll have a look at it when I get back.

mmalecot avatar Aug 27 '24 12:08 mmalecot

I have checked out the proposed implementation, and the approach of manually authoring the mapping seems suboptimal. It is very easy to make a mistake and for the mappings to go out of sync with the actual formats. I propose adding a build script instead that would generate the source code files on the fly by collecting indexes from the all-formats enumerator provided by strum.
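As a rough sketch of the generation step such a build script might perform, the function below groups variants by extension and emits the source of a lookup function. The function name `generate_from_extension`, the `(variant, extension)` input table, and the emitted code shape are all assumptions made for illustration; how the build script would obtain the list of formats is left open here.

```rust
use std::collections::BTreeMap;
use std::fmt::Write;

/// Builds the source text of a `from_extension` function
/// from hypothetical (variant name, extension) pairs.
pub fn generate_from_extension(pairs: &[(&str, &str)]) -> String {
    // Group variants by extension; a BTreeMap keeps the output deterministic.
    let mut by_ext: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &(variant, ext) in pairs {
        by_ext.entry(ext).or_default().push(variant);
    }

    let mut out = String::from(
        "pub fn from_extension(ext: &str) -> &'static [FileFormat] {\n    match ext {\n",
    );
    for (ext, variants) in &by_ext {
        let list: Vec<String> = variants
            .iter()
            .map(|v| format!("FileFormat::{v}"))
            .collect();
        // Writing into a String cannot fail, so unwrap is safe here.
        writeln!(out, "        {ext:?} => &[{}],", list.join(", ")).unwrap();
    }
    out.push_str("        _ => &[],\n    }\n}\n");
    out
}
```

In an actual `build.rs`, the returned string would typically be written to a file under the `OUT_DIR` directory and pulled into the crate with `include!`, so the mapping cannot drift from its single source of truth.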

MOZGIII avatar Oct 16 '24 12:10 MOZGIII

> I have checked out the proposed implementation, and the approach of manually authoring the mapping seems suboptimal. It is very easy to make a mistake and for the mappings to go out of sync with the actual formats. I propose adding a build script instead that would generate the source code files on the fly by collecting indexes from the all-formats enumerator provided by strum.

Sounds good. 👍 Sadly I don't have the time to take care of that at the moment. ☹️

But if we do that, we could also use https://docs.rs/phf/latest/phf/ for improved performance? 🤔

FelixEngl avatar Oct 21 '24 14:10 FelixEngl

> But if we do that, we could also use docs.rs/phf/latest/phf for improved performance? 🤔

We'd need to bench

MOZGIII avatar Oct 21 '24 16:10 MOZGIII

Hi!

I'm currently working on optimizing the crate. I have several ideas, but there's still some work to be done.

I like the idea of using phf to optimize detection with several signatures, but also to collect all file formats for a given extension or media type.

The PR is a bit old, and it took me a while to get back to it, but I'm leaving it open to see what can be kept.

mmalecot avatar Aug 15 '25 12:08 mmalecot