rust-metaflac

Performance improvements

Open SimonTeixidor opened this issue 4 years ago • 0 comments

Hi!

First and foremost - thank you for working on this library! I've been using it in a project I'm working on, and it quickly got me to a working prototype.

My prototype did not perform as well as I needed it to, so I started to optimise things. I ended up rewriting the logic for parsing out VORBIS_COMMENT blocks, which made my benchmark roughly 3.7x faster. See a benchmark here. The benchmark walks through a directory tree of flac files and counts the number of times "Miles Davis" shows up in any tag. My results are here:

~/flac-benchmark $ cargo run --release -- ~/music
metaflac_count returns 1045 results in 0.442017544s.
metaflac_streaming_count returns 1045 results in 0.120263128s.
~/flac-benchmark $

(This is on a 300GB collection. I'm amazed that either solution is this fast, to be honest :))

I think the speedup comes from these factors, more or less:

  • I ignore all blocks that are not Vorbis comments
  • No per-file or per-tag allocations
    • A single Vec<u8> buffer is reused between files
    • Returned values are &str slices into the shared buffer
    • Values are returned with a "streaming iterator" instead of being copied into a HashMap
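To make the list above concrete, here is a minimal sketch of a borrowing parser for the body of one VORBIS_COMMENT block. It is not the code from my benchmark and not part of the rust-metaflac API: the names `Comments`, `take_bytes`, `take_u32` and `count_matches` are made up for illustration. It assumes the caller has already read the block body into a reusable buffer, and it relies on the Vorbis comment layout (little-endian u32 length prefixes, a vendor string, then `KEY=value` entries).

```rust
/// Borrowing iterator over the `KEY=value` entries of one VORBIS_COMMENT
/// block body. Hypothetical type, for illustration only.
pub struct Comments<'a> {
    rest: &'a [u8],
    remaining: u32,
}

/// Splits `n` bytes off the front of `input`, or returns None if too short.
fn take_bytes<'a>(input: &mut &'a [u8], n: usize) -> Option<&'a [u8]> {
    let data: &'a [u8] = *input;
    if data.len() < n {
        return None;
    }
    let (head, tail) = data.split_at(n);
    *input = tail;
    Some(head)
}

/// Reads a little-endian u32 length prefix and advances `input`.
fn take_u32(input: &mut &[u8]) -> Option<u32> {
    let b = take_bytes(input, 4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

impl<'a> Comments<'a> {
    /// `block` is the block body, with the 4-byte metadata header stripped.
    pub fn new(mut block: &'a [u8]) -> Option<Self> {
        let vendor_len = take_u32(&mut block)? as usize;
        take_bytes(&mut block, vendor_len)?; // skip the vendor string
        let remaining = take_u32(&mut block)?;
        Some(Comments { rest: block, remaining })
    }
}

impl<'a> Iterator for Comments<'a> {
    // Both halves are slices into the caller's buffer: nothing is copied.
    type Item = (&'a str, &'a str);

    fn next(&mut self) -> Option<Self::Item> {
        while self.remaining > 0 {
            self.remaining -= 1;
            let len = take_u32(&mut self.rest)? as usize;
            let raw = take_bytes(&mut self.rest, len)?;
            // Entries that are not UTF-8 or have no '=' are skipped.
            if let Some(pair) = std::str::from_utf8(raw).ok().and_then(|s| s.split_once('=')) {
                return Some(pair);
            }
        }
        None
    }
}

/// Counts the tags whose value contains `needle`, without allocating.
pub fn count_matches(block: &[u8], needle: &str) -> usize {
    Comments::new(block).map_or(0, |c| c.filter(|(_, v)| v.contains(needle)).count())
}
```

Because the items borrow from a buffer owned by the caller, a plain `Iterator` is enough here. A parser that owns its buffer and refills it between files would need a lending ("streaming") iterator instead.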

The point of this issue is to ask whether you would be interested in performance improvements. I would be happy to discuss, and to work on an implementation, if you are. I think it should be possible to implement the current API on top of a streaming parser. Consumers of this library could then choose whether to work directly with the parser or to use the higher-level API that you currently expose. Of course, my use case might be a bit special, so I'm also happy to just hand-roll an optimised version for my project only.
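As a rough illustration of the layering idea, an owning map can be built from any source of borrowed `(key, value)` pairs. The function name `collect_comments` and the `HashMap<String, Vec<String>>` shape are assumptions for this sketch, not a statement about how the library is or should be implemented.

```rust
use std::collections::HashMap;

/// Copies borrowed (key, value) pairs into an owned multi-map.
/// Hypothetical helper: allocation happens only here, so callers that
/// want zero-copy access can consume the pairs directly instead.
pub fn collect_comments<'a, I>(pairs: I) -> HashMap<String, Vec<String>>
where
    I: IntoIterator<Item = (&'a str, &'a str)>,
{
    let mut map: HashMap<String, Vec<String>> = HashMap::new();
    for (key, value) in pairs {
        // A key may appear several times, so values accumulate in a Vec.
        map.entry(key.to_owned()).or_default().push(value.to_owned());
    }
    map
}
```

With this split, the cost of copying is paid only by consumers that ask for the owned map.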

SimonTeixidor, Apr 08 '20 21:04