
Improving sound quality using a wave table synth

Open ravi-annaswamy opened this issue 4 years ago • 12 comments

While the FM synth is lightweight and well implemented, this app could use a decent wavetable synth.

A couple of approaches:

  • use the patch-based approach from the bitmidi JS implementation
  • use fluidsynth-lite.

I will research both to add more references below.

Here is some discussion from bitmidi author: https://news.ycombinator.com/item?id=17930669

Background (pardon me if I am preaching to the choir):

FM sound quality is ultimately limited because the likeness of an instrument comes only partly from its harmonics, which FM can model somewhat OK; much of it comes from the widely different ADSR envelopes (attack, decay, sustain, release) that characterize notes from each instrument.

For instance, a piano has a quick attack, a decay, a slightly controlled sustain, and a somewhat slow release. A trumpet has a slower attack, since air has to fill the chamber, but can sustain for as long as the note is blown, and so on.

Wavetables maintain wave snippets for each instrument (usually multiple notes, to provide a 'model' across the various ranges in which the physical characteristics change).
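The attack/decay/sustain/release behaviour described above can be captured as a simple gain curve. The following is an illustrative, hypothetical sketch; the parameter values are made up for contrast and are not taken from any real instrument or from MuseTree's code:

```javascript
// Minimal linear ADSR envelope: returns a gain in [0, 1] at time t seconds
// after note-on; `heldFor` is how long the note was held before release.
// Assumes the note is held at least through the attack + decay phases.
function adsrGain(t, { attack, decay, sustain, release }, heldFor) {
  if (t < attack) return t / attack;              // ramp up
  if (t < attack + decay) {
    const d = (t - attack) / decay;
    return 1 - d * (1 - sustain);                 // fall to the sustain level
  }
  if (t < heldFor) return sustain;                // hold while the key is down
  const r = (t - heldFor) / release;
  return r < 1 ? sustain * (1 - r) : 0;           // fade out after release
}

// Illustrative (made-up) settings: a piano-like fast attack
// versus a brass-like slower attack with a strong sustain.
const piano = { attack: 0.01, decay: 0.3, sustain: 0.4, release: 0.5 };
const trumpet = { attack: 0.08, decay: 0.1, sustain: 0.8, release: 0.2 };
```

Sampling this function over time and multiplying it into the waveform is what gives two synths with identical harmonics a very different instrumental character.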

ravi-annaswamy avatar Jun 27 '20 15:06 ravi-annaswamy

Yes, I would like to second this request. The current sounds are, well, not pretty and do not resemble the intended instruments at all. It makes it very hard for the musical ear to assess the MuseNet compositions. I imagine there should be a simple solution available for rendering MIDI data as GM MIDI instruments in a browser.

For example, there is a Web MIDI API available which might solve the problem: https://www.keithmcmillen.com/blog/making-music-in-the-browser-web-midi-api/ https://www.w3.org/TR/webmidi/

FreddieM007 avatar Jun 27 '20 18:06 FreddieM007

The synths have always been a weak point, as I was completely learning on the job. It took me a couple of months in total, so trust me when I say that there's no silver bullet. For example, the Web MIDI API is just for controlling MIDI instruments and DAWs, rather than for actually playing MIDI files.

I do already apply AHDSR envelopes to each instrument, as you can see in https://github.com/stevenwaterman/musetree/blob/master/src/audio/instruments/bass.ts

However, I definitely agree that the sound is bad. I briefly looked into TiMidity and bitmidi today, and they seem semi-promising, but they are still sample-based, which means a medium-to-large download of audio files for the end user, which I'd rather avoid if possible.

I'm open to suggestions, and the audio quality is always in the back of my mind. I've been deliberately putting it off because the return on investment for work on the audio is so low. I would spend days and have barely any improvement, so I've focussed on the UI and other elements where I can have a bigger impact.

stevenwaterman avatar Jun 27 '20 19:06 stevenwaterman

Thanks Steven.

First, I apologize for suggesting ADSR without noticing that you already have an AHDSR envelope implementation.

As a developer, your effort/return reasoning makes sense, and your creative output on the workflow and app side is giving great returns; you are free to prioritize as you see fit. I can't argue with that, except that for a subset of serious users who really want to use this tool for real compositions, the sound quality during auditioning could be a yes-or-no factor. But let us not turn this discussion into a debate.

Instead, let us keep looking at this: a low priority for you, but a high priority for me to research options :) Maybe soon other JS-skilled enthusiasts could add this feature.

If we find a medium-effort alternative that delivers a big jump in sound quality, I feel many users will be delighted, and it would take MuseTree to another level.

Options:

  1. Can we download just 10 or 20 instruments from the freepats file instead of all 128 instruments for GM support? I notice, for instance, that Winds can be Clarinet or Flute, but I have not seen other instruments in that group.

  2. Along the same lines, the SoundFont format allows creating a subset of a GM soundfont as a musetree.sf2 file containing just a few instrument sounds.

  3. I also noticed that a few folks have tried this soundfont approach with MIDI.js: https://github.com/gleitz/midi-js-soundfonts (the author's blog: https://blog.gleitzman.com/post/63283830260/midijs-playing-audio-in-the-browser-with), and have even built cool demos, e.g. https://galactic.ink/piano/

  4. The entire FluidR3 GM soundfont in this approach is 140MB uncompressed, but each instrument is around 1MB js-encoded; for instance, nylon guitar encoded as a separate soundfont is 1.1MB. https://github.com/gleitz/midi-js-soundfonts/blob/gh-pages/FluidR3_GM/acoustic_guitar_nylon-mp3.js

So we could lazy-load the 10-20 instruments that MuseNet uses.
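The lazy-loading idea could look something like the sketch below. The URL pattern simply mirrors the gleitz/midi-js-soundfonts gh-pages layout linked above, and `fetchInstrument` is a placeholder for whatever loading strategy (script tag injection, fetch, dynamic import) ends up being used:

```javascript
// Build the URL for one js-encoded instrument, following the layout of the
// gleitz/midi-js-soundfonts repo (FluidR3_GM, mp3-encoded). The base URL
// is an assumption based on the gh-pages hosting of that repo.
const SOUNDFONT_BASE = "https://gleitz.github.io/midi-js-soundfonts/FluidR3_GM/";

function instrumentUrl(name) {
  return `${SOUNDFONT_BASE}${name}-mp3.js`;
}

// Cache in-flight loads so each instrument is downloaded at most once,
// no matter how many notes request it. `fetchInstrument` is injected so
// the actual loading mechanism stays swappable (and testable).
const instrumentCache = new Map();

function loadInstrument(name, fetchInstrument) {
  if (!instrumentCache.has(name)) {
    instrumentCache.set(name, fetchInstrument(instrumentUrl(name)));
  }
  return instrumentCache.get(name); // a Promise for the decoded instrument
}
```

With something like this, only the handful of instruments MuseNet actually emits would ever be fetched, keeping the download in the low single-digit MBs.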

ravi-annaswamy avatar Jun 27 '20 23:06 ravi-annaswamy

Sorry for jumping in again. Let me first say, this is an amazing project which has the potential to revolutionize how composers use AI as co-composers. Your tool is a huge step change over the MuseNet UI! I am extremely grateful that you took on this huge challenge!

I think your best bet is to piggy-back on already existing solutions.

  1. You may already know it, but there is another project on GitHub that wraps the MuseNet API: https://github.com/MrCheeze/musenet-midi I believe it uses the HTML5 audio element (audio controls="controls"), which seems to support midi, mp3, ogg, wav, etc. out of the box. Otherwise I think it is just HTML plus some JS, i.e., a lot simpler than your tool.

  2. MIDI.js: https://github.com/mudcube/MIDI.js is perhaps the easiest of all options to integrate into your code

  3. MuseNet itself seems to use the HTML5 audio element: <audio preload="auto" src="data:audio/mp3;base64,SUQzBAAAAAAAI1RTU0UAAAAPAAADTGF2ZjU2LjQwLjEwMQAAAAAAAAAAAAAA//tQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAASW5mbwAAAA8AAAL (...) Basically, it renders an MP3 and embeds it into the web site. That is probably the reason for its occasional crashes (after consuming several GB of memory)...

Thanks again for what you are doing here!!! Dirk

FreddieM007 avatar Jun 28 '20 01:06 FreddieM007

@ravi-annaswamy If there's a way to get good sound quality by downloading a few MBs of samples, I'm happy to do that. As it stands, the MuseNet requests are already many MBs in size, so it doesn't make much difference; we're already bandwidth-hungry.

@FreddieM007 With regards to 1 (and 3), I've chatted briefly with MrCheeze, and a lot of my code is based on his (including all the encoding/decoding code). His tool does the same thing that MuseTree version 1 did: it uses the MP3 sent by MuseNet in the HTTP response. That worked fine, but prevented a lot of features that have been added since v2 (like max encoding length and postprocessing). The relevant tickets with the background on why MuseTree now ignores MuseNet's MP3 are https://github.com/stevenwaterman/musetree/issues/1 and https://github.com/stevenwaterman/musetree/issues/33

The HTML5 audio control can't play MIDIs, MrCheeze's musenet-midi is using it to play MP3s. Otherwise, that would've been an ideal solution.

MIDI.js is similar to the Web MIDI API in that it doesn't actually play the MIDI files; it just parses them and provides an event-based interface for you to create your own audio.

Personally, I feel like my current solution of using the Web Audio API is probably the best, but we definitely need someone who's better at audio engineering to come in and design some better instrument patches. Mine were based on https://github.com/g200kg/webaudio-tinysynth, with the piano and strings being more advanced and hand-made, based on the Sound on Sound "Synth Secrets" magazine series. The drums are just sample-based, ripped straight from MuseNet.

stevenwaterman avatar Jun 28 '20 20:06 stevenwaterman

OK, makes perfect sense. Obviously, you have already done your homework and know the technology a lot better than I do. I am surprised and puzzled that the MIDI format is not supported by default, just like mp3, wav, ogg, etc., given that it is a lot older. I did not expect that.

For what it is worth, I came across this project: https://surikov.github.io/webaudiofont/ which uses various soundfonts. The example code looks fairly straightforward, and the sound quality is good enough for preview in MuseTree. I don't know if it is compatible with your code.

FreddieM007 avatar Jun 28 '20 23:06 FreddieM007

@stevenwaterman Thanks for the clarification. I am totally new to the JS workflow. Let me try to implement a soundfont based on the 10 instrument patches in Python and, if successful, we can explore it in JS.

ravi-annaswamy avatar Jun 29 '20 02:06 ravi-annaswamy

The WebAudioFont library linked above by @FreddieM007 seems very viable.

Here is a demo page to play midi songs.

https://surikov.github.io/webaudiofont/examples/midiplayer.html

Choose the Mozart string quartet and it downloads less than 800K. Choose Mission: Impossible, with 10+ instruments, and it still downloads only 1.2MB.

The code also seems self-contained.
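For reference, here is a rough sketch of how WebAudioFont is typically driven, based on the examples on that demo page. The function names `setupPlayer`/`playNote`, the preset URL, and the preset variable name are all placeholders; the exact library API should be double-checked against its documentation before relying on this:

```javascript
// Hypothetical wrapper around WebAudioFont (browser-only; assumes the
// library script has been loaded so WebAudioFontPlayer is a global).
function setupPlayer(audioContext, presetUrl, presetVariableName, onReady) {
  const player = new WebAudioFontPlayer();
  // startLoad fetches the js-encoded preset; waitLoad fires once it is decoded.
  player.loader.startLoad(audioContext, presetUrl, presetVariableName);
  player.loader.waitLoad(() => onReady(player));
}

function playNote(audioContext, player, preset, midiPitch, durationSeconds) {
  // queueWaveTable schedules one note against the AudioContext clock:
  // (ctx, destination, preset, when, pitch, duration)
  player.queueWaveTable(
    audioContext,
    audioContext.destination,
    preset,
    audioContext.currentTime, // start immediately
    midiPitch,                // e.g. 60 = middle C
    durationSeconds
  );
}
```

Driving this from MuseTree's existing note events would mean mapping each decoded note to one `playNote` call, so it could in principle slot in behind the current instrument abstraction.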

ravi-annaswamy avatar Jun 29 '20 03:06 ravi-annaswamy

Some further suggestions here: https://github.com/stevenwaterman/musetree/issues/85

stevenwaterman avatar Jul 30 '20 15:07 stevenwaterman

Totally agree with Alex's comments, and I am glad he was forceful in expressing them too. By the way, Alex, your tracks on SoundCloud are awesome!

In spite of the vastly superior workflow provided by MuseTree, I continue to use the original MuseNet, mainly because evaluating the quality of the returned completions is not practical in MuseTree with the current soundset; any prioritization of this feature will make a world of difference.

Thanks Ravi

ravi-annaswamy avatar Jul 30 '20 19:07 ravi-annaswamy

The Java class com.sun.media.sound.EmergencySoundbank uses an interesting technique to generate MIDI instruments (review the file's license, though). Apparently, the technique of EmergencySoundbank is to render the instrument's tone in the frequency domain and convert the samples using an inverse Fourier transform. Alternatively, instrument samples can be generated by somehow mixing the public-domain single-cycle waveforms found in the AdventureKid Wave Form Collection.

A related request of mine for public-domain software synthesis and instrument banks.

peteroupc avatar Sep 16 '23 14:09 peteroupc

Here is what I found out about how EmergencySoundbank works.

Most of the instruments in EmergencySoundbank are generated by:

  • rendering a weighted sum of Gaussian curves in the frequency domain,
  • sampling this weighted sum at 44100 Hz,
  • rotating each sample in the complex plane by a random angle,
  • doing an inverse Fourier transform, and
  • taking the real part of the result,

among other things (such as setting the volume and modulation envelopes as well as fading in the beginning of the sound and setting loop points).

A few EmergencySoundbank instruments (generally drum patches) are generated a little differently, namely as a bass part and a treble part.
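The steps above can be sketched as follows. This is a from-scratch illustration, not the actual Java code: it uses a naive O(n²) inverse DFT for clarity (a real synth would use an FFT), made-up harmonic weights, and omits the envelope, fade-in, and loop-point steps:

```javascript
// Render a magnitude spectrum as a weighted sum of Gaussian peaks centred
// on the harmonics of a fundamental frequency. Bins at and above n/2
// (the Nyquist limit) are left at zero for simplicity.
function gaussianSpectrum(n, sampleRate, fundamental, weights, width) {
  const mag = new Float64Array(n);
  for (let bin = 0; bin < n / 2; bin++) {
    const freq = (bin * sampleRate) / n;
    weights.forEach((w, i) => {
      const centre = fundamental * (i + 1); // i-th harmonic
      mag[bin] += w * Math.exp(-((freq - centre) ** 2) / (2 * width * width));
    });
  }
  return mag;
}

// Rotate each bin by a random phase, run an inverse DFT, and keep the
// real part of the result, as described in the steps above.
function spectrumToWave(mag) {
  const n = mag.length;
  const re = new Float64Array(n);
  const im = new Float64Array(n);
  for (let k = 0; k < n; k++) {
    const phase = Math.random() * 2 * Math.PI; // random rotation per bin
    re[k] = mag[k] * Math.cos(phase);
    im[k] = mag[k] * Math.sin(phase);
  }
  const wave = new Float64Array(n);
  for (let t = 0; t < n; t++) {
    let sum = 0;
    for (let k = 0; k < n; k++) {
      const angle = (2 * Math.PI * k * t) / n;
      // real part of (re[k] + i*im[k]) * e^(i*angle)
      sum += re[k] * Math.cos(angle) - im[k] * Math.sin(angle);
    }
    wave[t] = sum / n;
  }
  return wave;
}
```

The random phase rotation is what keeps the generated sample from sounding like a sterile, perfectly phase-aligned buzz; the Gaussian widths control how "breathy" versus "pure" each harmonic sounds.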

peteroupc avatar Oct 04 '23 13:10 peteroupc