wahwah icon indicating copy to clipboard operation
wahwah copied to clipboard

Parse Ogg duration lazily to allow for streaming Oggs without downloading the whole file

Open obskyr opened this issue 4 months ago • 2 comments

This is the second part to my pull request #40 – thank you for facilitating that, and for all your lovely feedback!

Building on the ability to supply IO objects directly to WahWah, this pull request changes the behavior slightly so that when a WahWah object is initialized using an existing IO (and only then), some tags are loaded lazily. This allows files to be streamed from the internet without downloading more than needed! The behavior remains unchanged when supplying a path, so this is not a breaking change in relation to the latest release.

Also added in this pull request is Tag#load_fully, which allows the user to load all the tags eagerly, in case they wish to avoid this behavior. An example:

require "down"

stream = Down.open("https://github.com/aidewoode/wahwah/raw/master/test/files/vorbis_tag.ogg")
tag = WahWah.open(stream)
tag.duration # => 8.0
stream.close
tag.duration # => 8.0 – Accessible after the stream was closed thanks to being cached from the previous access


stream = Down.open("https://github.com/aidewoode/wahwah/raw/master/test/files/vorbis_tag.ogg")
tag = WahWah.open(stream)
tag.load_fully
stream.close

# Accessible after the stream was closed thanks to `load_fully`.
tag.duration # => 8.0


stream = Down.open("https://github.com/aidewoode/wahwah/raw/master/test/files/vorbis_tag.ogg")
tag = WahWah.open(stream)
stream.close

tag.duration # => IOError! No longer open for reading.

The only tags loaded lazily as of this pull request are the duration and bitrate of Oggs, as they require reading to the end of the file – I don't see a need to make all tags lazy-loaded, as the gain is negligible for metadata that appears near the start of the file. The lazy-loading facilities, found in /lib/wahwah/lazy_tag_attributes.rb, are however generic and reusable, so they can easily be applied to other formats should it turn out to be helpful for any other format.

obskyr avatar Feb 15 '24 20:02 obskyr

Pull Request Test Coverage Report for Build 7922796668

Details

  • 46 of 46 (100.0%) changed or added relevant lines in 4 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.1%) to 96.744%

Totals Coverage Status
Change from base Build 7916820046: 0.1%
Covered Lines: 921
Relevant Lines: 952

💛 - Coveralls

coveralls avatar Feb 15 '24 20:02 coveralls

Thanks for the great work, but there is one thing I doubted whether the load_fully method is really needed. The only scenario I could think of is when the user closes the IO before accessing the tag attribute. Then, they can call load_fully to eagerly load all tags. However, they can also just change the order of the code to avoid closing the IO before accessing the attribute. I think it's fair enough just raise IOError to notify the user should keep the IO open. And for support load_fully method also adds a lot of complication in the code. Like we can simply use ||= instead of meta programming for memoization.

aidewoode avatar Feb 19 '24 09:02 aidewoode