ogg icon indicating copy to clipboard operation
ogg copied to clipboard

Simple header search and remove dependency to io::Seek

Open TonalidadeHidrica opened this issue 5 years ago • 7 comments

closes #12

I simplified the algorithm of finding the page header. This method requires a lot of read() function called, but the performance degression can be avoided by wrapping the underlying reader with BufReader or some other buffering reader.

The algorithm of searching header is intended to be used only for continued sequence of pages. I am going to introduce another algorithm dedicated for page searching after seek later.

TonalidadeHidrica avatar Nov 21 '20 21:11 TonalidadeHidrica

This uses read_exact and isn't async friendly. Also, I'm not sure I consider #12 to be a bug. The best resolution to #12 is a helper type that takes a Read implementation and converts it into a Read + Seek implementation by offering some limited relative seeking.

est31 avatar Nov 21 '20 22:11 est31

I agree that read_exact blocks the entire thread and not async friendly, but I don't understand the difference between the original implementation. Does using read function make it async-friendly?

I don't strongly think that #12 is a bug that has to be handled right away, but something that improves the usefulness. Still, I don't think it's a good idea to enforce Seek via helper type, because that's what we can do it under the hood.

TonalidadeHidrica avatar Nov 21 '20 22:11 TonalidadeHidrica

Does using read function make it async-friendly?

~Yes. read itself can return a WouldBlock error. It's the pre-async model but is iirc somewhat compatible with the async/await features. Need to look up the details.~ Edit: see below.

that's what we can do it under the hood.

If it's done under the hood, it's done unconditionally, even though it adds a non zero overhead. So it's a bad idea.

est31 avatar Nov 21 '20 22:11 est31

Wait disregard that, all of PacketReader is non-async-ready.

est31 avatar Nov 21 '20 22:11 est31

So I don't have to care about async for this implementation? If so, what are other problems left?

TonalidadeHidrica avatar Nov 21 '20 22:11 TonalidadeHidrica

My previous code returns Error when failed to find a header, which is different from what is written in the document, so I corrected. As an side effect, the code does no longer use read_exact.

TonalidadeHidrica avatar Nov 23 '20 13:11 TonalidadeHidrica

Hmm thinking about this some more, it basically adds the requirement to copy data once more if you want to retain performance, which isn't optimal for performance due to the additional copy/buffering. Avoiding this was a deliberate design choice. See my comment above.

Maybe the new behaviour could be exposed under a different API. Then users can choose themselves which approach they want (lots of Read calls or occasional seeking).

est31 avatar May 10 '21 23:05 est31