fluidsynth
Support for FLAC compressed samples?
SF3 uses the Vorbis lossy format, which, whilst offering a reduced file size, comes with a reduced sample quality that would often be unsuitable for a professional audio environment.
SF4 is essentially a similar modern variation on SF2, albeit using losslessly compressed FLAC audio samples, offering a balanced middle-ground between SF2 and SF3.
(The SFZ format already supports FLAC samples.)
There is scant support for SF4 as of yet (there is a conversion script available, https://github.com/cognitone/sf2convert), but it does seem an eminently sensible idea.
SF4 seems to be very unknown. So far we only have one reply:
https://lists.nongnu.org/archive/html/fluid-dev/2020-01/msg00007.html
At least it's a positive one. I'm also positive because it should integrate into the current code base very easily. In case anybody is willing to draft a PR, you're welcome. (Not sure when/if I find time for that.)
As already written in the thread on fluid-dev, I retract my positive reply. I thought I could reduce my loading times with FLAC compressed soundfonts, but that doesn't actually work.
@mxmilkb Do you have a good use-case for FLAC compressed soundfonts? I agree that it seems like a sensible idea in principle... but I think that's not enough to create yet another non-standard SF2 variant.
So great to catch a live discussion related to sfz-ext standards. As far as I understand, adding compressed samples to SFZ is just a matter of grep/sed-ing through its text description, e.g. to replace the file extensions.
Just look at this FR: https://github.com/davy7125/polyphone/issues/7 In short: linuxsampler is able to load SFZ with any sample format supported by libsndfile, not just Ogg/FLAC. I filed that FR after briefly playing with a self-written script that takes any *.sfz file and replaces the audio extensions. That way I tried to convert some of the banks on my PC to FLAC.
Unfortunately, the idea of supporting every possible format via libsndfile did not even get a comment. Any pros and cons on that? Why should it require a separate (sub-)format version just to add a single audio file format?
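For illustration, the extension swap described above could look roughly like the following. This is a minimal sketch and not the poster's actual script: the function name is made up, it assumes the samples are referenced through the SFZ `sample=` opcode with a `.wav` extension, and it only rewrites the text file (the audio files themselves would still have to be converted separately).

```python
import re

# Matches "sample=<anything on the same line>.wav", case-insensitively.
# The lazy quantifier stops at the first ".wav" after "sample=".
_SAMPLE_WAV = re.compile(r"(?i)(sample=[^\r\n]*?)\.wav\b")


def retarget_sfz_samples(sfz_text: str, new_ext: str = "flac") -> str:
    """Return the SFZ text with .wav sample references pointing at new_ext files."""
    return _SAMPLE_WAV.sub(lambda m: f"{m.group(1)}.{new_ext}", sfz_text)


example = "<region> sample=Piano C4.wav lokey=60 hikey=60\n"
converted = retarget_sfz_samples(example)
print(converted, end="")
```

Other opcodes on the same line are left untouched, since only the text between `sample=` and the extension is matched.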
This thread is about extending SF2, not SFZ.
There's further discussion on https://github.com/divideconcept/FluidLite/issues/16
I really don't mind by what mechanism, but the use case for FLAC sound banks is plain and simple; smaller yet still lossless files.
Just visited the sf2convert source page. Now I'm lost as to what SF3 is. MuseScore seems to use that name for an extended SFZ that differs only by Ogg support. And when I read the start of this FR, I assumed both SF3 and SF4 were SFZ extensions. Did not know about FluidLite :]
Edit: Another probably good format, which just came to my mind, for sf2/sfz support - opus.
SF3 and the proposed SF4 are SF2 extensions. They have nothing to do with SFZ whatsoever.
Well, thanks. Meanwhile I visited musescore soundfonts page, and it states same :/ (how could I miss that)
Given that we went from SF2 (no compression) to SF3 (Vorbis compression), it seems a bit weird to then go to SF4 (FLAC compression). This means an application could conceivably support SF2 and SF4, but not SF3, just because it doesn't support Vorbis compression. Also, where do we go for Opus compression? SF5 anyone?
So we're already thinking about SF5, and yet nothing has changed since SF2 except the type of compression!
Perhaps it would be better to do something like SF2X where X is the format. E.g.:
- SF2 - No compression
- SF2F - FLAC
- SF2V - Vorbis
- SF2O - Opus
This makes it almost like .tar.gz: undo the compression and you are left with a standard container.
It's unfortunate that SF3 already breaks this trend, but it's non-standard anyway so it doesn't really matter.
As already posted on the mailing list: This is not about bumping the Soundfont's major version every single time we want to support a new compression! Bumping to version 3 was necessary due to breaking changes in the sample indexing. No further bumps are required just to support additional compressions. Please scratch that thought from your brains. I changed the title to clarify that.
What we're really looking for are people who have a use-case for FLAC, OPUS, or what-so-ever compressed samples.
> Perhaps it would be better to do something like SF2X where X is the format.
The compression is sample-specific. That is, each sample may be compressed differently than the others. Thus, you cannot name the whole file after its compression.
> I thought I could reduce my loading times with FLAC compressed soundfonts, but that doesn't actually work.
When I experimented with converting my banks (sorry, SFZ again) to FLAC, one thing was clear: decompression costs some CPU time during loading, much like unpacking a compressed Linux kernel image at boot. Why it really matters to me is the much lower disk usage. Just as WAV is the worst way to store music, a sampler bank collection also needs at least some compression. Even the 2x to 3x space reduction that is typical for FLAC looks valuable.
We are looking for real-world use cases for FLAC (or opus, or whatever) compressed samples in SF3 soundfonts.
I think we all agree that compression is a good concept in general. But so far nobody has come forward stating an actual problem that FLAC would solve (again: for SF3 soundfonts).
I thought it might solve my loading time problem, but that turned out to be not true.
> SF2F - FLAC, SF2V - Vorbis, SF2O - Opus
For SF2 there could be an SF2C (SF2 compressed) file extension or similar. Or even no extra letter at all: I noticed that WAV as a container allows a great many compression formats, at least in ffmpeg. I looked at the ffmpeg audio export options in Audacity, and they even include FLAC, Vorbis, MP3, MP2 and WMA.
Edit:
> We are looking for real-world use cases for FLAC (or opus, or whatever) compressed samples in SF3 soundfonts.
Ok, but I noticed some discussion about possible solutions and proposed own).
> Ok, but I noticed some discussion about possible solutions and proposed own).
Thanks for that. But the technical aspect is not really the issue here. Adding FLAC support to SF3 is nearly trivially easy to implement. The question is: does anyone have a really good use-case that justifies creating yet another non-standard extension to the already non-standard and undocumented SF3 format?
Hm... If SF3 is not a standard, then why worry? But even then, if the number is treated like a standard version (v2 for SF2, v3 for SF3), this could be something like v3.1: still SF3, just further improved. Adding support for more sample formats doesn't look like a breaking change, at least as long as SF3 isn't set in stone as a standard.
Because we are not alone in the world but part of a larger community of projects that read and write SoundFonts. Of course we could simply create our own format and write a tool that converts SF2 to FLAC compressed SF3. But even for internal use we would probably want to properly document and specify it. And maintain it for the foreseeable future.
Regardless of the standardisation question... the question is: why should we do that, why spend time and effort on it, what is the real-world use-case here?
And limiting it to a FluidSynth-only extension doesn't remove that question. Now the question would be: who has a use-case for FLAC compressed SF3 SoundFonts that only work with FluidSynth?
Oh, interesting to see this. I’ve requested FLAC-compressed samples (for its lossless property) in SF3 soundfonts on the MuseScore side years ago.
This could be done in SF3 because you can put FLAC into an Ogg container, and SF3 basically specifies Ogg containers (or could be read as doing so).
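To illustrate the point that an Ogg container can carry more than Vorbis, here is a minimal sketch of how a reader might tell the codecs apart by looking at the first packet of an Ogg stream. It relies on the Ogg page layout (a 27-byte header starting with `OggS`, followed by the segment table) and on the usual first-packet signatures for Vorbis, Ogg FLAC and Opus. The function names are hypothetical, and the helper that builds a demo page leaves the CRC at zero, so its output is only good for this demonstration, not for a real decoder.

```python
def sniff_ogg_codec(data: bytes) -> str:
    """Guess the codec of an Ogg stream from the start of its first page."""
    if len(data) < 27 or data[:4] != b"OggS":
        return "not-ogg"
    n_segments = data[26]                 # number of entries in the segment table
    payload = data[27 + n_segments:]      # first packet starts after the table
    if payload.startswith(b"\x01vorbis"):
        return "vorbis"
    if payload.startswith(b"\x7fFLAC"):
        return "flac"
    if payload.startswith(b"OpusHead"):
        return "opus"
    return "unknown"


def make_demo_first_page(packet: bytes) -> bytes:
    """Build a minimal single-segment first page (CRC left at zero, demo only)."""
    assert len(packet) < 255
    header = (
        b"OggS"
        + bytes([0, 0x02])   # stream version, "beginning of stream" flag
        + bytes(8)           # granule position
        + bytes(4)           # stream serial number
        + bytes(4)           # page sequence number
        + bytes(4)           # CRC (not computed here)
        + bytes([1, len(packet)])  # one segment, followed by its lacing value
    )
    return header + packet


print(sniff_ogg_codec(make_demo_first_page(b"\x7fFLAC\x01\x00")))
print(sniff_ogg_codec(make_demo_first_page(b"\x01vorbis")))
```

A loader built along these lines could keep the SF3 container as it is and dispatch per sample on what the embedded Ogg stream actually contains.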
Sadly, nobody on the MuseScore side even understood why I would wish for that…
Use case is easy: soundfonts get huge (MuseScore_General_HQ is currently 467 MiB as SF2, 82 MiB as Vorbis-compressed SF3). I’d estimate it to need less than 200 MiB as FLAC-compressed SF3, and FLAC can be used as source format for editing, whereas Vorbis is lossy. But, of course, MuseScore (Cc’ing @anatoly-os because he could spread the word on their side) would have to support it in their internal fluid synthesiser as well.
(Also, Vorbis currently adds the hard-coded string Xiph.Org libVorbis I 20180316 (Now 100% fewer shells) (a whopping 53 bytes) as comment per sample (i.e. 1246 times, that is over 64K, in MuseScore_General_HQ); I’m hoping for the FLAC encoder to be not as stupid.)
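The overhead quoted above is easy to check. The sketch below only counts the vendor string itself, using the sample count given in the post; any length fields the encoder stores alongside the string are ignored.

```python
# Vendor string and sample count as quoted in the post above.
vendor = "Xiph.Org libVorbis I 20180316 (Now 100% fewer shells)"
samples = 1246

per_sample = len(vendor.encode("ascii"))   # bytes per embedded comment string
total = per_sample * samples               # bytes across the whole soundfont

print(per_sample, total, round(total / 1024, 1))
```

This gives 53 bytes per sample and 66038 bytes in total, i.e. roughly 64.5 KiB.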
But the question remains: what is the use-case of having a 200 MB vs. 467 MB Soundfont? I mean: why do you need smaller soundfonts? Do you have 200 different Soundfonts on your machine and have run out of disk space? Or do you need to transfer the soundfonts via low-bandwidth connections? (Not trying to be difficult here... I just want to make sure there is an actual use-case and not just the "smaller is always better" argument).
Marcus Weseloh dixit:
> Or do you need to transfer the soundfonts via low-bandwidth connections?
Ever been to Germany? ;-) So, yes.
(Also, been helping out some people stuck on crap devices like the Raspberry Pi recently. Anything to reduce size may help.)
bye, //mirabilos
> what is the use-case of having a 200 MB vs. 467 MB Soundfont? I just want to make sure there is an actual use-case and not just the "smaller is always better" argument.
I see, it supports libsndfile. Just like linuxsampler, which simply takes whatever libsndfile can read. If libsndfile fails, then that's a real obstacle.
> Ever been to Germany?
Yes, that's where I spend most of my time. :-)
> (Also, been helping out some people stuck on crap devices like the Raspberry Pi recently. Anything to reduce size may help.)
Not sure I follow: how would reducing on-disk size of Soundfonts help the Raspberry Pi?
Marcus Weseloh dixit:
> Not sure I follow: how would reducing on-disk size of Soundfonts help the Raspberry Pi?
I’ve hopes that a mode will be implemented in which not all samples for all instruments need to be uncompressed ahead of time (only for instruments actually needed).
Other than that, package download and SD card size of course, though that applies to other devices as well.
bye, //mirabilos
> I’ve hopes that a mode will be implemented in which not all samples for all instruments need to be uncompressed ahead of time (only for instruments actually needed).
If your goal is to reduce RAM consumption, we already have such a feature:
- http://www.fluidsynth.org/api/fluidsettings.xml#synth.dynamic-sample-loading
- See https://github.com/FluidSynth/fluidsynth/pull/366 for more details
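As a rough illustration of switching that setting on, the sketch below builds a `fluidsynth` command line that passes it through the `-o name=value` option. It assumes a FluidSynth build that includes the dynamic sample loading feature. The helper name and the soundfont path are placeholders, and the command is only printed, not executed.

```python
import shlex


def fluidsynth_command(soundfont_path: str, dynamic_loading: bool = True) -> list:
    """Build an argv list that enables or disables synth.dynamic-sample-loading."""
    value = 1 if dynamic_loading else 0
    return [
        "fluidsynth",
        "-o", f"synth.dynamic-sample-loading={value}",  # setting named in the link above
        soundfont_path,                                  # hypothetical path
    ]


cmd = fluidsynth_command("example.sf2")
print(shlex.join(cmd))
```

With the setting enabled, sample data is loaded only for presets that are actually selected, which is what reduces RAM use on small devices.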
> If your goal is to reduce RAM consumption
For the Pi specifically, yes. Network/disc usage is a more general goal. Both are somewhat orthogonal, admittedly.
> we already have such a feature:
Oh, nice! @anatoly_os we need that in MuseScore.
After testing out #652 and needing to download a 4GB soundfont, I do see some value in FLAC compression to reduce file size :-)
Ok... in my opinion, before we can even think about extending SF3 to support different encoding formats, we need a specification for the current SF3 format. I've created a first draft in the FluidSynth wiki here: https://github.com/FluidSynth/fluidsynth/wiki/SoundFont3Format
Please note that this page is meant to document the current SF3 format, not about extensions for other encoders or other changes. Any comments, clarifications, error corrections highly welcome. @derselbst is there a way to make this page editable for everybody? Or do you think there would be a better place to discuss and document the SF3 format?
Marcus Weseloh dixit:
> Please note that this page is meant to document the current SF3 format, not about extensions for other encoders or other changes. Any
Sure. We might wish to get @anatoly_os and @lasconic at the very least from the MuseScore side into the boat.
> comments, clarifications, error corrections highly welcome. @derselbst is there a way to make this page editable for everybody? Or do you think there would be a better place to discuss and document the SF3 format?
Unsure if there is one, but the musescore/sftools repository also contains hidden knowledge, and both @ChurchOrganist (I think) and @mrbumpy409 (definitely) have worked with and tweaked it.
Thanks for the wiki page @mawe42. I think that's a good place.
"All samples in an SF3 file MUST BE Ogg Vorbis compressed, there is no support for mixing uncompressed (SF2) and compressed (SF3) samples."
Marcus, it's not clear to me what makes you think so? Using dwStart and dwEnd as byte indices together with the indication of sfSampleType we should be able to mix up the SMPL chunk with OGG and PCM, shouldn't we?
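The idea in this question can be sketched as a per-sample dispatch on sfSampleType. This is only an illustration under stated assumptions: the flag value 0x10 for Vorbis-compressed samples is assumed here, the class and function names are made up, and uncompressed samples are taken to be 16-bit PCM addressed by sample-point index.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumption: this bit in sfSampleType marks an Ogg Vorbis compressed sample.
SAMPLETYPE_OGG_VORBIS = 0x10


@dataclass
class SampleHeader:
    name: str
    dw_start: int
    dw_end: int
    sample_type: int


def smpl_byte_range(header: SampleHeader) -> Tuple[int, int]:
    """Translate dwStart/dwEnd into byte offsets within the smpl chunk."""
    if header.sample_type & SAMPLETYPE_OGG_VORBIS:
        # Compressed: the indices are already byte offsets of the Ogg blob.
        return header.dw_start, header.dw_end
    # Uncompressed 16-bit PCM: the indices count sample points (2 bytes each).
    return header.dw_start * 2, header.dw_end * 2


print(smpl_byte_range(SampleHeader("compressed", 0, 1001, 0x11)))
print(smpl_byte_range(SampleHeader("pcm", 600, 700, 0x01)))
```

Whether such mixing is sound in practice depends on alignment between neighbouring samples, which this sketch does not check.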
> is there a way to make this page editable for everybody?
Every authenticated GH user should already be able to edit it.
> Marcus, it's not clear to me what makes you think so? Using dwStart and dwEnd as byte indices together with the indication of sfSampleType we should be able to mix up the SMPL chunk with OGG and PCM, shouldn't we?
Partially, and only in theory. My approach was to take the sf3convert utility from MuseScore as "ground truth", not the way we implemented SF3 support in FluidSynth. That tool compresses all samples; there is no option to selectively compress only some of them.
And if mixing OGG and PCM works (I haven't tried it), then I think only by accident. Consider the case where the first sample is OGG and the second is PCM. If the compressed OGG data has an uneven number of bytes, then the dwStart pointer to the PCM sample would be off-by-one (as dwStart for PCM uses word indices, not byte indices). Or did I miss something and the sf3convert tool or OGG Vorbis uses padding to 2-byte?
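The off-by-one concern can be made concrete with a small sketch. It assumes 16-bit PCM addressed by word index and a compressed blob that ends at an arbitrary byte offset in the same chunk. The function name is made up, and the optional padding byte is a hypothetical fix, not something the existing tools are known to do.

```python
def next_pcm_word_index(compressed_end_byte: int, pad_to_even: bool) -> int:
    """Word index where a PCM sample could start right after a compressed blob."""
    if pad_to_even and compressed_end_byte % 2:
        compressed_end_byte += 1  # hypothetical single padding byte
    if compressed_end_byte % 2:
        # An odd byte offset falls between two 16-bit words.
        raise ValueError("odd byte offset cannot be expressed as a word index")
    return compressed_end_byte // 2


print(next_pcm_word_index(1000, pad_to_even=False))  # even length: no problem
print(next_pcm_word_index(1001, pad_to_even=True))   # odd length, padded first
```

Without padding, an odd-length blob such as 1001 bytes leaves the following PCM data at a position that a word index cannot address, which is the case described above.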
Also consider the case of 24-bit PCM samples. Granted, I wrote that 24-bit samples are not supported. But the main reason for that is the OGG compression. If we could mix PCM and OGG, then PCM should be allowed to be 24-bit, I think. But the dwStart and dwEnd indices are also used as byte indices into the SM24 sub-chunk. And if the PCM sample is not the first, then SM24 would need to be zero-padded for the length of the previous OGG samples.
Oh, and please note that by writing it down like that, it doesn't mean that I think it should work this way. I just tried to document the implicit requirements for SF3 to actually work as it currently does. As soon as those requirements and corner-cases are clear, I think we should discuss if they make sense, are required and actually wanted. And then change them.
> Sure. We might wish to get @anatoly_os and @lasconic at the very least from the MuseScore side into the boat.
Yes, good idea. I reached out to MuseScore before, will ping them in the forum to come join the party. And also post on the fluid-dev mailing list to get more input.