rodio icon indicating copy to clipboard operation
rodio copied to clipboard

Seeking

Open mainrs opened this issue 5 years ago • 42 comments

Looks like seeking isn't supported for now. Do the used libraries allow seeking in general? It's a feature I would need for an application and I would be willing to try to implement it. Not sure where though. The sink looks like a good place for it.

mainrs avatar Aug 16 '18 08:08 mainrs

Do the used libraries allow seeking in general?

Lewton does. Dunno about the other libraries.

est31 avatar Aug 16 '18 17:08 est31

I guess it should first and foremost be implemented on the decoder struct, if the underlying libraries support it (which I'm not sure of).

tomaka avatar Aug 16 '18 17:08 tomaka

Anyone get anywhere with this?
I need to skip 30 secs at a time through an MP3. Would it be possible to do this by Seeking in the underlying decoder?

michaelmarconi avatar Sep 05 '19 13:09 michaelmarconi

I couldn't really get it working and gave up on it 😢

mainrs avatar Sep 05 '19 13:09 mainrs

It's a feature I need too, unfortunately I don't have much time to try to implement it.

Lisible avatar Dec 25 '19 10:12 Lisible

@recapitalverb I only had success with seeking with hound (hound::WavReader::seek) + cpal, fitting this into rodio might be some work, you can probably add a seek method to Source and only implement if for WavDecoder

slmjkdbtl avatar Aug 03 '20 18:08 slmjkdbtl

@michaelmarconi Yes you can seek any format that way but it'll take a lot of time to parse the whole audio file, it's the best way if you're fine with that

slmjkdbtl avatar Aug 04 '20 07:08 slmjkdbtl

Yeah it'd be very CPU and RAM intensive and not a good idea. Some people may want to play back 10 hours of background music or something all in one file...

est31 avatar Aug 04 '20 07:08 est31

Yeah it'd be very CPU and RAM intensive and not a good idea. Some people may want to play back 10 hours of background music or something all in one file...

Aren't (almost) all audio formats streamable? At least that's what I always thought... You could then load an interval of like 1min into RAM and keep it there at any given time and just continuously load new content and drop the old one.

mainrs avatar Aug 04 '20 09:08 mainrs

Already 1 minute of samples at 48000 samples per seconds with 2 bytes per sample would mean 5760k or 6 mb of memory use. If it's an opt-in feature, people can use it if they want but optimally one would have seeking support in all of the libraries that rodio uses, then make rodio abstract over it.

est31 avatar Aug 04 '20 10:08 est31

Eager for the seeking feature for this library, I looked up for each underlying decoder library:

MP3

minimp3-rs, has no feature for seeking, though the original C library minimp3 does.

High-level mp3dec_ex_seek function supports precise seek to sample (MP3D_SEEK_TO_SAMPLE) using index and binary search

There is a fork that adds a wrapper to the seek function in the original library, last updated on May 2020, a half year ago. The implementation wraps mp3dec_ex_seek, which enables sample-level seeking.

WAV

hound does provide sample-level seeking.

Vorbis

lewton provides a seeking function OggStreamReader::seek_absgp_pg, which provides with per-page seeking.

Related: RustAudio/lewton#73, and maybe RustAudio/lewton#16 as well? Any updates? @est31

FLAC

claxon is used. According to this issue comment, "Claxon currently does not support seeking".

Some FLAC files have "SEEKTABLE" metadata, so this may help for implementing the feature. Even if there is no such metadata, one can still seek to arbitrary position and look for the frame header, which contains the frame position, so it is easy to implement binary searching.

TonalidadeHidrica avatar Nov 04 '20 23:11 TonalidadeHidrica

I needed seeking for my podcast player build on rodio. I now have it working (in the most basic way possible) for mp3 on my fork. For that I added:

  • a new member function for sources that can seek called request_pos which takes a float that is the duration from the start of a source to seek to (in seconds).
  • an Option<f32> field set_pos to the controls struct that is used by the sink to modify playback.
  • a member function set_pos() to the Sink that is used to set the field to some position.
  • a check if the field is not None within the periodic_access closure. It takes care of calling the member function request_pos() of the source and setting the field back to None.

However sources might not support seeking thus we can not simply add request_pos() to the Source trait. For now I only added seeking to the mp3 decoder. Sources that build upon a source, think of the Mix struct, complicate this further. What to do if only one of the two sources mixed supports seeking? For now I added the request_pos() function under a new trait SourceExt. I implemented if for the structs:

  • Decoder<I>
  • Pausable<I>
  • Stopable<I>
  • Amplify<I>

where <I> has the trait SourceExt. I do not know if something like this would work for the other sources/structs. I suspect not. Has anyone got an idea how I could improve this? Should I open a PR and work on it there?

PS: I forked the fork of minimp3-rs mentioned above, brought it up to date with master and extended it to allow decoding a whole frame. It is used as a git dependency here instead of minimp3-rs on crates.io

dvdsk avatar Jan 13 '21 23:01 dvdsk

@est31 Any word on @dskleingeld's implementation?

@dskleingeld Do you think the seeking method could be added in a PR to the official minimp3-rs?

probablykasper avatar Jun 05 '21 21:06 probablykasper

@dskleingeld Do you think the seeking method could be added in a PR to the official minimp3-rs?

Should be doable, it adds a seperate Decoder object (SeekDecoder) that wraps around the high-level mp3dec_ex functions. However I feel some additional tests are needed. I am only using it for a side project right now so it has not been rigorously tested.

dvdsk avatar Jun 06 '21 00:06 dvdsk

@est31 Any word on @dskleingeld's implementation?

Haven't looked at it but in general seeking support would be a great addition. If it needs upstream changes, those need to be upstreamed first before it can be merged in rodio.

est31 avatar Jun 06 '21 04:06 est31

I have a branch here with a number of changes including seeking working. I was getting some strange runtime errors on Windows with the minimp3 fork with seek support (been a while and I can't remember what it was exactly. I think it was an access violation.) so I ended up using Symphonia instead. Symphonia supports MP3, FLAC, and MP4/AAC so I switched the existing MP3 and FLAC decoders over and also gained AAC support as a result.

Unfortunately Symphonia doesn't have seek implemented for AAC so I had to fork Symphonia as well and implement it there. The implementation is quick, hacky, and not particularly precise if you need sub-second precision as I don't really know much about audio codecs, but it seems to be fast enough and works okay if you can accept the limitations.

I don't think any of my changes are mergeable as is because I haven't thoroughly tested it aside from just using the basic playback capabilities so some features may not work as it is right now. I have been using it in my audio player project and the basic functionality seems to be working at least, but it should be considered very much a POC and not optimized.

The other change I added to this branch is discarding the silent samples from the very beginning of the stream to enable psuedo-gapless playback (seems to be gapless enough for the tracks I tested so far). I plan on making that an opt in feature, but haven't gotten around to it yet. I hope I can make all these changes mergeable eventually, but getting a non-hacky seek solution for all the codecs in Symphonia is an obstacle, there's some changes to Lewton that need to be merged first, and I'll need to do a lot more testing on the seek changes in Rodio to check for any regressions.

Here are all the changes I made in case anyone's curious: Rodio Changes Symphonia Changes

This also required this fork of Lewton with better seek support

Anyway, sorry for the essay, but I figured I'd let people know in case they really want this feature right now and are okay with a hacky solution until I can make this better or someone else creates a better solution.

aschey avatar Jun 07 '21 00:06 aschey

@aschey This sounds awesome, excited to see progress on it!

probablykasper avatar Jun 07 '21 03:06 probablykasper

@aschey, seeking support for all formats, that is great! I see you added a seek method to the Source trait, that looks way cleaner then adding a new trait like I did. I thought this would be a problem for sources that can/do not support seeking. However right now these could fake a seek implementation (always returning Ok(current_pos)?).

How can I help get this in rodio?

dvdsk avatar Jun 07 '21 10:06 dvdsk

@dskleingeld Yeah, that's basically how I handled it. I figured it's fine for a seek to be a no-op on situations where it doesn't make sense like a sine wave. If you want to run my branch and test it out a bit, that'd help me out a lot. PRs are welcome as well if you see something that can be improved.

Most of the source implementations are totally untested right now as I only used it for basic playback, none of the fancy effects or looping or anything. I just made my best guess as to how seek might work for the sources that I haven't used so there's a good chance that a lot of them still need tweaking and some definitely need to be cleaned up with proper error handling and whatnot.

In terms of getting this in for real, I think this needs to happen in a few phases. I was lazy and put all of this in one branch just to see if it would work at all, so I'll copy the different parts of my changes into new branches so they can be merged separately.

  1. Symphonia's decoder is not Send so I had to make a few modifications to get it to fit in with Rodio's requirements. Unfortunately the new Send requirement on MediaSource is technically a breaking change since that's a public trait that can be implemented by the user so I'll see if I can find a way to make this work without changing MediaSource directly. Looks like there's already a PR to add Send to MediaSource, but it hasn't gotten any response yet.
  2. Refactor Rodio to replace the existing MP3 and FLAC decoders with Symphonia.
  3. Add the seek function to the Source trait. Seek won't work for AAC and Ogg yet. I couldn't get seek to work in Lewton at all without TonalidadeHidrica's fork that I mentioned above, but I might've been using it wrong.
  4. Update Lewton and Symphonia with proper seek support for AAC and Ogg.

I've got some life stuff going on right now, so I might not make too much progress in the next few weeks, but I'll start working on part 1 and 2.

aschey avatar Jun 07 '21 14:06 aschey

I'm positive towards symphonia integration in general, but some users of rodio might have concerns about the copyleft license that symphonia uses (MPL-2.0). Many gamedevs have a disdain towards anything copyleft related, even if it's the weak copyleft of MPL-2.0. I think an optional backend based on symphonia would work (or maybe multiple, idk what's more elegant), but it shouldn't replace any existing "permissively" licensed backends.

est31 avatar Jun 07 '21 16:06 est31

Okay, fair enough. I can probably make them interchangeable and have it opt-in via a feature flag. Seek can just be a no-op for any decoders that don't support it.

aschey avatar Jun 07 '21 16:06 aschey

Hey all, developer of Symphonia here. I stumbled across this thanks to @aschey's recent PRs. I'm happy that Symphonia is getting some attention and thought I could address some of your concerns and add a few thoughts.

I'll preface this by saying that I am having difficultly finding time to work on Symphonia, hence the low activity on the repo, but I do hope to dedicate a few hours each week to fix bugs, merge PRs, and refactor things as necessary to improve the library and support different users. In the longer term, my goal is to complete the Vorbis decoder, and implement support for Opus.

So onto some thoughts:

  1. I believe most of the core traits: FormatReader, MetadataReader, Decoder, MediaSource, etc. can be marked Send without issue.
  2. I'm sitting on a couple commits that refactors the seek interface and improves MP3 seeking. These haven't been pushed to the public repository yet. I'll be bumping the versions of all crates to 0.3 before publishing so marking the traits in point 1 as Send should be fine.
  3. On the topic of seeking, to seek what users typically consider an AAC file (*.aac, *.m4a) would require both the ADTS and MP4 readers to implement seeking. The former would roughly be implemented the same way as MP3, while the latter is considerably more complex. This is something I am starting to look into to support the improvements in point 2.
  4. Just a heads up, the AAC decoder is not perfect. This may be something you want to consider since users will likely be filing the bugs to you and I don't think I'll be able to address them that quickly.
  5. Just a suggestion, but Symphonia currently supports all of Rodio's codecs except Vorbis. Perhaps it makes sense to just wrap lewton in a Symphonia Decoder and use Symphonia directly? This buys you a few things such as intelligent format detection (i.e., you can pipe bytes into Symphonia and it'll figure out which format and decoder to use), pretty thorough metadata support (including weird combos like ID3v2 on FLAC -- yes, I've seen this!), and support for many format-codec combinations such as FLAC-in-OGG. The downside is the risk of regressions.

Cheers!

pdeljanov avatar Jun 10 '21 02:06 pdeljanov

Thanks for reaching out @pdeljanov !

To 4: interesting, will keep it in mind. To 5: wrapping lewton might be a good idea anyways, but as I've said above, symphonia should rather be an option instead of a hard dependency. Having it as an option would be awesome.

est31 avatar Jun 10 '21 08:06 est31

Thanks for getting my PRs merged @pdeljanov. Wrapping lewton in a Decoder trait is an interesting idea that I'll definitely take a look at. I have a working branch with optional Symphonia integration here that I plan to make a PR for once the next Symphonia version is released.

aschey avatar Jun 11 '21 04:06 aschey

Would want to have seeking support in redlux (which glues together rust-mp4 and fdk-aac c-bindings with rodio) for .aac and .m4a, which @pdeljanov made me realize should at least be doable for AAC by parsing the ADTS headers while keeping track of where in the file we are. For m4a I'm pretty clueless though, anybody got any ideas?

probablykasper avatar Jun 11 '21 05:06 probablykasper

Hey @aschey, just wanted to give you a heads up that I published 0.3.0 which includes your changes.

@probablykasper, for basic (non-fragmented) MP4 files, all you have to do is use the STTS atom to convert a sample timestamp to a sample number. Once you have the sample number, you just need to start reading from that sample number forward. I implemented seeking in MP4 (and ADTS) for v0.3 of Symphonia so you could get some inspiration from that albeit it's a bit more complicated because it also supports fragmented streams.

pdeljanov avatar Jun 26 '21 17:06 pdeljanov

Hey @aschey, just wanted to give you a heads up that I published 0.3.0 which includes your changes.

@probablykasper, for basic (non-fragmented) MP4 files, all you have to do is use the STTS atom to convert a sample timestamp to a sample number. Once you have the sample number, you just need to start reading from that sample number forward. I implemented seeking in MP4 (and ADTS) for v0.3 of Symphonia so you could get some inspiration from that albeit it's a bit more complicated because it also supports fragmented streams.

Great, thanks for the heads up! Just created the PR to add Symphonia integration here: https://github.com/RustAudio/rodio/pull/376. Happy to accept feedback if you have ideas for improvements. I did briefly look into wrapping Lewton in a decoder trait, but it didn't seem like Lewton's API would be compatible with the way the decoder trait works since Lewton requires access to the underlying read/seek source.

Bringing this back to the intent of the original issue, I'll work on another PR to expose Symphonia's seek functionality to Rodio once the previously mentioned PR is merged.

aschey avatar Jun 27 '21 05:06 aschey

I did briefly look into wrapping Lewton in a decoder trait, but it didn't seem like Lewton's API would be compatible with the way the decoder trait works since Lewton requires access to the underlying read/seek source.

Lewton has two different APIs. A high level one in the inside_ogg module that indeed needs Read/Seek, as well as a lower level one which only needs access to &[u8] packets.

est31 avatar Jun 27 '21 06:06 est31

Patching Lewton might no longer be necessary, since Symphonia now has a Vorbis decoder.

It's 100% safe code and is at around 10% slower than ffmpeg. Correctness should be quite good as well - I've tested it against the reference implementation on over 200Gb of Vorbis from various sources, and all the mismatches with the reference implementation have been fixed.

Shnatsel avatar Jan 24 '22 00:01 Shnatsel

I've been wondering why have rodio at all, given that symphonia exists, has more codecs, etc. Why not use symphonia directly? There is one big difference: Symphonia has a copyleft license. For some people that's good. But I think many don't like copyleft.

So the target market of rodio is everything that's left. People who want to use copyleft free licensing. This is a big thing in e.g. games for example. This extends to the libraries that rodio is using. So whatever libraries rodio uses, they should be patched, if possible.

Generally it seems that symphonia has more maintainer momentum right now than rodio and related projects. The approach of having a monorepo is quite interesting, compared to rodio's approach of one-codec-per-crate. I still think that, beyond inertia, it's still valuable to maintain rodio and the libraries that rodio uses.

est31 avatar Jan 24 '22 01:01 est31