New SF2 Synthesis Engine

Open spessasus opened this issue 7 months ago • 26 comments

Hi Matt,

Firstly I'd like to thank you for this project. The ability to play various music formats all in your browser is very cool, and the amount of songs available is very impressive!

However, with regard to MIDI support, the FluidLite library that is currently used doesn't seem to recognize a lot of common MIDI messages, such as bank select or drum channel sysEx. Here's an example. Even the default Windows synthesizer recognizes the sine wave patches and three drum channels.

This is why I propose to replace the current soundfont engine with a synthesis library I'm maintaining: spessasynth_lib.

Here's why I think this is a good idea:

  • Typescript - I recently did a full rewrite of the entire project in typescript, so the library comes with full typescript support and extensive JSDoc comments.
  • Extensive MIDI support - I battle-tested this library with a lot of different MIDIs, both for XG and GS (and GM2!). I've devised an extensive bank selection algorithm ensuring that "XG-Compatible" soundfonts do use XG presets in XG MIDIs and GS presets in GS MIDIs (if they are available), solving the issues above. I've also included support for a lot of important system exclusive messages such as XG program change, random panning and more!
  • Audio worklet support - I've noticed this issue and I'm happy to report that spessasynth_lib comes with native support for both AudioWorklets and Web Workers. However, if you'd like direct access to the audio samples, the underlying spessasynth_core library can be used directly.
  • Native DLS support with MobileBAE instrument aliasing - Conversions like these wouldn't be necessary anymore since the original .dls files can be uploaded and played accurately. This includes the infamous gm.dls file that powers the Microsoft GS Wavetable Synth!
  • RMIDI support - There was a format that contained both a DLS sound bank and a MIDI in one file. Some games used it and there's a small collection available. Spessasynth comes with full support for these, both DLS and SF2!
  • XMF support - Similar to RMIDI, and extensively used in phones for ringtones.
  • SF3 support - Compressed soundfonts! The website would be able to serve higher quality soundfonts while keeping the file size small.
  • Smart sequencer - Runs in the worklet/worker and can output to a Web MIDI output if needed. It automatically preloads needed samples before starting playback to prevent stutters!

I'd be happy to help you with integrating it into Chip Player if you're interested. Let me know what you think!

spessasus avatar Sep 11 '25 13:09 spessasus

uhh... wow!! This sounds cool. I'm somewhat amazed that the synth engine is written in TypeScript.

mmontag avatar Sep 17 '25 17:09 mmontag

I'm somewhat amazed that the synth engine is written in TypeScript.

Well, for most of the project's lifespan it was pure JS, but having tried TypeScript in another project, I liked it and rewrote everything in it for improved developer experience and easier use. The npm package is still JS so it can be used with JS projects, but it comes with full type definitions.

I also forgot to mention that the library I'm proposing of course includes full SF2 support, including modulators, matching fluidsynth in sf2 compliance.

BTW here's the demo if you want to try it out: https://spessasus.github.io/SpessaSynth/

And again, I'd be happy to help integrating the library into the project.

spessasus avatar Sep 18 '25 13:09 spessasus

I'd love to make this happen, but would probably need some help.

Right now, MIDIPlayer supports multiple engines through this ad-hoc synth object in MIDIPlayer.js: https://github.com/mmontag/chip-player-js/blob/master/src/players/MIDIPlayer.js#L167

And I have my rudimentary sequencer code (MIDIFilePlayer.js) that reads MIDI data and triggers these noteOn, noteOff, etc.

I would like to keep a shared sequencer among all the synth engines. (If you have suggestions here, I'm all ears.)

So if your core player can be adapted to this, that would be a start. But I may need to figure out a better abstraction/model around this. Oh, I'd eventually also like to add emusc or nuked-sc55, so whatever solution we come up with should support that too...

mmontag avatar Sep 19 '25 05:09 mmontag

I would like to keep a shared sequencer among all the synth engines. (If you have suggestions here, I'm all ears.)

So if your core player can be adapted to this, that would be a start.

While the synth of course supports methods like noteOn, noteOff, etc. (and even sendMessage, where you can pass a binary message directly), I noticed that your sequencer doesn't seem to support sysEx messages. This is not good: a lot of MIDI files use these messages for various things, for example for more drum channels (like the linked MIDI file), and even MIDIs made for the default Windows synth make use of them.

So while it's possible, I would suggest migrating to spessasynth's sequencer. Here's why:

Shared sequencer

The sequencer (both core and lib) supports sending external MIDI messages, i.e., as binary MIDI events, so they can be routed directly to a WebMIDI output (output.send()) or to any other software synthesizer. The new sequencer can therefore still be shared among all synths, including the new ones you're planning to add. The abstracted method could simply be sendMessage, which either calls exactly that in the synth if supported (most of them do, I believe) or calls a corresponding method, like this.

And regarding muting channels, the messages can be filtered to only allow enabled channels.
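A minimal sketch of the sendMessage abstraction described above, assuming a hypothetical engine object that exposes either a raw sendMessage method or per-event methods (noteOn, noteOff, controlChange, programChange, pitchBend). None of these names are taken from chip-player-js or spessasynth; they only illustrate the shape of the adapter. Muting is modeled by dropping note-ons for disabled channels, which is one possible filtering policy rather than the one either project uses.

```javascript
// Hypothetical adapter: routes one raw MIDI message to an engine.
// `engine` and its method names are illustrative assumptions.
function dispatchMessage(engine, message, enabledChannels) {
  const status = message[0];
  const hasRawInput = typeof engine.sendMessage === 'function';

  // System messages (0xF0-0xFF, including sysEx) carry no channel.
  // Engines without a raw input simply miss them in this sketch.
  if (status >= 0xf0) {
    if (hasRawInput) engine.sendMessage(message);
    return;
  }

  const type = status & 0xf0;
  const channel = status & 0x0f;

  // Drop note-ons for muted channels; let note-offs and controllers through
  // so the channel state stays in sync when it is unmuted.
  if (type === 0x90 && message[2] > 0 && !enabledChannels.has(channel)) return;

  // Prefer the raw entry point when the engine has one.
  if (hasRawInput) {
    engine.sendMessage(message);
    return;
  }

  switch (type) {
    case 0x80:
      engine.noteOff(channel, message[1]);
      break;
    case 0x90:
      // A note-on with velocity 0 is a note-off by convention.
      if (message[2] === 0) engine.noteOff(channel, message[1]);
      else engine.noteOn(channel, message[1], message[2]);
      break;
    case 0xb0:
      engine.controlChange(channel, message[1], message[2]);
      break;
    case 0xc0:
      engine.programChange(channel, message[1]);
      break;
    case 0xe0:
      // 14-bit pitch bend value: LSB first, then MSB.
      engine.pitchBend(channel, (message[2] << 7) | message[1]);
      break;
    default:
      break; // aftertouch is ignored in this sketch
  }
}
```

With this shape, a WebMIDI output could be wrapped as an engine whose sendMessage forwards to output.send(), while an engine that only has per-event functions would rely on the switch.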

The sequencer also has an option for disabling skipping to the first note to work with hardware synths like described here. I don't own any, but it works great with nuked SC-55 sent through web MIDI.

The playback rate is also fully supported, like in the current sequencer. You can test both in the demo page, open settings and select a MIDI device and it will be used instead of the built-in synth.

Loop points

Some MIDI files (for example touhou 6 MIDI files found in the Chip Player) set loop points for seamless playback. This is fully supported by the sequencer. You can try loading a MIDI file like th06_02.mid into the demo.

Preloading samples

While sf3 files are great for making high quality soundfonts small, they have a problem: decoding vorbis takes time. Spessasynth implements dynamic sample loading, i.e. samples are only loaded when they are needed. This makes loading sound banks very fast, but if a sample takes long to decode, the rendering has to wait until it's decoded.

To mitigate this, the sequencer implements a preloading system, which loads not just samples, but all the synthesis data for the entire MIDI file, so that every pressed key will already be loaded and cached, ensuring smooth playback.

Multi-Port MIDI support

You can read about it here and here.

Through my journey in the MIDI world, I've found quite a few MIDIs that utilize this feature (a lot of them come from MuseScore). (Here's a small collection.) The sequencer supports those as well. Unfortunately, this can only be used with spessasynth's engine; other players will still have only 16 channels, but that's just the limit of a meta event. You can't send it to a synthesizer.
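To illustrate why multi-port files need more than 16 channels, here is a small sketch of one way a (port, channel) pair could be flattened into a single channel index. The function name and the fold-back rule for players with fewer ports are assumptions made for illustration; spessasynth's actual mapping may differ.

```javascript
const CHANNELS_PER_PORT = 16;

// Maps a (port, MIDI channel) pair to a flat synth channel index.
// `portCount` is how many ports the receiving player can address.
function toSynthChannel(port, midiChannel, portCount) {
  // A player with fewer ports folds the extra ports onto the ones it has.
  const effectivePort = port % portCount;
  return effectivePort * CHANNELS_PER_PORT + midiChannel;
}
```

With two ports, port 1 channel 0 maps to index 16. With a single port it collapses onto index 0, so two tracks that were meant for different ports end up sharing one channel.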

Which library?

This is a very important question I need to ask before I can suggest actual implementation.

core

As mentioned before, core provides the direct PCM data. The synth has a renderAudio method and sequencer has the processTick method. This can simply be put into the ScriptProcessorNode you are currently using, but this means rendering in the main thread (potential interruptions as we have to wait for the DOM to render) and ScriptProcessor is deprecated. SpessaSynthProcessor also provides no effects. It routes them correctly, but they must be implemented manually via AudioNodes or similar.

lib

Lib is integrated into the Web Audio API. It runs on an AudioWorklet or a Web Worker in a separate thread. The main thread can even be frozen and it would still play smoothly! This is what the demo uses and this is what I propose. Firstly, the chorus and reverb effects are fully provided here. Other synths can still be in the main scriptProcessor for now, and the sequencer can still drive them from the worklet thread. In theory it should just work.

So, what do you think Matt? What's the correct choice here?

spessasus avatar Sep 19 '25 13:09 spessasus

Yeah, there are a lot of interesting MIDI extensions (GM2, loop points, multi-port) that I had not gotten around to implementing.

decoding vorbis takes time.

Yep. FluidLite supported SF3, but when I tried it, it introduced unacceptable latency and I completely gave up on the idea. Maybe if this decoding situation could be dramatically improved, it would make sense. Possibly incremental decoding. (Don't decode entire samples at once, but only the portion needed for the next audio buffer.) But for now, I simply avoid massive SoundFonts (I'm of the opinion that SoundFont craftsmanship is much more important, anyway. A 5 MB SF2 can sound better than an SF3 that unpacks to 500 MB).

So, what do you think Matt? What's the correct choice here?

I think the choice of library has to be core because this is the level of abstraction used by my Player interface. The players fill a buffer with audio samples.

Although I would like to migrate to AudioWorklet at some point, I believe this is something that all players should benefit from, and it would be a weird situation if only the MIDIPlayer used workers. I know scriptProcessorNode is deprecated, but I had a bad time when I tried to migrate off of it (years ago).

How about this:

  • Add support for spessasynth core to MIDIPlayer. This will require some improvements to the architecture (MIDI engine selection)
  • Replace my sequencer with spessasynth sequencer. The only thing I might miss is the "skip silence" feature (and "MIDI state restore" when switching engines). I also have some custom logic to flatten per-track loops in N64 MIDI files.
  • Later on, maybe I can tap your expertise for migrating the entire app to AudioWorklets.

I'm not sure how to break down responsibilities but we can figure something out. If all else fails, I'm happy to accept PRs.

mmontag avatar Sep 19 '25 21:09 mmontag

It's probably best to start rewriting MIDIPlayer side-by-side.

It should at least support choosing between the 3 fundamentally different engine types: spessasynth (JS softsynth), libadlmidi (WASM softsynth), and WebMIDI API. I want to delete tinyplayer.c (migrate its minimal logic to JS) so don't use tp_note_on etc, but instead call the libadlmidi exported functions directly (adl_rt_noteOn etc).

mmontag avatar Sep 19 '25 22:09 mmontag

Thanks for your response, Matt. Addressing your points:

Missing Sequencer features

The only thing I might miss is the "skip silence" feature (and "MIDI state restore" when switching engines).

I believe that these two are in my sequencer:

Skip Silence

Skipping silence is, in fact, enabled by default.

This quote:

the sequencer also has an option for disabling skipping to the first note to work with hardware synths

Mentions this feature.

skipToFirstNoteOn, as the name suggests, skips to the first note-on event. This means skipping all of the initial silence in the MIDI file until the first note is pressed. It doesn't literally skip the events in the file; it simply seeks to the time of the first key press if the requested seek time is earlier than that, or when a new song starts.

This brings me onto the second feature, which is closely related:

MIDI State Restore (seeking)

When you seek to a given time, the sequencer "plays back" the song from the start up to the given point internally. The sequencer saves the state of programs and controllers, pitch wheel and so on, restoring them after. It also emits all the system exclusive events up to that point.

For example:

  1. MIDI has a CC#7 (volume change) set to 100 at 2 seconds.
  2. Another change of CC#7 to 64 at 4 seconds.

If you seek to between 2 and 4 seconds, the CC#7 will be set to 100. If you seek to 4 or later, it will be set to 64.
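The CC#7 example above can be sketched as follows. This is only an illustration of the idea of replaying controller changes up to the seek point; the event shape ({ time, channel, controller, value }) and the function name are hypothetical and are not the sequencer's internal representation.

```javascript
// Replays controller changes up to `seekTime` and returns the state to restore.
// `events` must be sorted by time.
function controllerStateAt(events, seekTime) {
  const state = new Map(); // "channel:controller" -> last value seen
  for (const ev of events) {
    if (ev.time > seekTime) break; // sorted, so nothing later applies
    state.set(`${ev.channel}:${ev.controller}`, ev.value);
  }
  return state;
}

// The two volume changes from the example: 100 at 2 s, 64 at 4 s.
const volumeEvents = [
  { time: 2, channel: 0, controller: 7, value: 100 },
  { time: 4, channel: 0, controller: 7, value: 64 },
];
```

Calling controllerStateAt(volumeEvents, 3) yields 100 for CC#7 on channel 0, and any seek time of 4 or later yields 64. Only the final value per controller needs to be sent to the synth, once, before playback resumes.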

These messages are only restored once, i.e. when you seek and before the actual playback from the given time starts. I believe it works for all softsynth, both integrated and external. I've tested it with a BASSMIDI driver through WebMIDI and it works fine.

However, hardware synths like the SC-55 don't like having the sysEx events shoved into them all at once, which is why I mentioned skipToFirstNoteOn:

the sequencer also has an option for disabling skipping to the first note to work with hardware synths

When you disable it, the sequencer will play back the silence, which usually provides the pause needed to send all the system exclusive events to the synthesizer.

Spessasynth_core as the final choice

I think the choice of library has to be core because this is the level of abstraction used by my Player interface. The players fill a buffer with audio samples.

It can work for now. However, you must remember that:

SpessaSynthProcessor also provides no effects. It routes them correctly, but they must be implemented manually via AudioNodes or similar.

I didn't implement audio effects directly in the processor for two reasons:

  • Native AudioNodes are faster than a JS implementation, especially the reverb.
  • Allow the users to implement custom chorus and reverb.

AudioWorklet provides a nice way of solving this: declaring three outputs (dry, chorus and reverb). ScriptProcessorNode doesn't have that, I believe, so the effects must be implemented in the script processor itself. This public domain reverb can be ported to a script processor relatively easily. I'm not sure about chorus though.

What do you think?

spessasus avatar Sep 21 '25 12:09 spessasus

That's great, it sounds like your sequencer is full featured.

Does spessasynth_lib include reverb/chorus implementations? Can that be split to a reusable module?

Native AudioNodes are faster than a JS implementation, especially the reverb.

I'm not sure what you mean. There is no native reverb audio node. Maybe using a convolution node..? The Freeverb JS or WASM implementation would be really really fast. Chorus could be ported from FluidSynth. Anyway, I'm happy to punt this until later.

Just noticed:

Also keep in mind that the buffer size should not be larger than 256, as the renderAudio function calculates the envelopes and LFOs once, so buffer size represents the shortest amount of time between those changes.

It would be nice to decouple the control rate from the audio buffer, from a library consumer perspective. Chip Player JS doesn't have a fixed buffer size; it is negotiated by the browser. Ideally the control rate is tweakable under the hood of spessasynth_core and fully independent of the output buffer size. (I recall doing something like this for my old DX7 synth https://github.com/mmontag/dx7-synth-js/blob/master/src/config.js)

mmontag avatar Sep 25 '25 07:09 mmontag

I'm not sure what you mean. There is no native reverb audio node. Maybe using a convolution node..? The Freeverb JS or WASM implementation would be really really fast. Chorus could be ported from FluidSynth. Anyway, I'm happy to punt this until later.

Yes, I meant the convolver. I tried implementing freeverb once and it really hurt the performance. Not to mention that these engines can introduce distortion which isn't present with the convolution-based approach. FluidSynth is also affected.

It would be nice to decouple the control rate from the audio buffer, from a library consumer perspective - chip player JS doesn't have a fixed buffer size, but is negotiated by the browser. Ideally the control rate is tweakable under the hood of spessasynth_core and fully independent from output buffer size.

This is done to optimize the processor for the audioWorklet - its buffer size is fixed to 128 samples, making multiple calls to renderAudio and splitting redundant.

Fortunately, there are offset and length params that make using larger buffer sizes easy:

// Any multiple of 128
const BUFFER_SIZE = 1024;
// The output arrays
const dry = [new Float32Array(BUFFER_SIZE), new Float32Array(BUFFER_SIZE)];
const rev = [new Float32Array(BUFFER_SIZE), new Float32Array(BUFFER_SIZE)];
const chr = [new Float32Array(BUFFER_SIZE), new Float32Array(BUFFER_SIZE)];

const QUANTUM_SIZE = 128;
const RENDER_COUNT = BUFFER_SIZE / QUANTUM_SIZE;


for(let i = 0; i < RENDER_COUNT; i++) {
    seq.processTick();
    // render QUANTUM_SIZE samples at a "QUANTUM_SIZE * i" offset
    synth.renderAudio(dry, rev, chr, i * QUANTUM_SIZE, QUANTUM_SIZE);
}

Here's a more complete example.
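For buffer sizes that are not a multiple of 128 (for example when the browser negotiates the size), the same offset/length idea can be wrapped in a small helper. This is a sketch under assumptions: renderInChunks and its renderChunk callback are hypothetical names, and whether the sequencer should tick once per partial chunk is left to the caller.

```javascript
// Renders `total` frames in chunks of at most `quantum` frames.
// `renderChunk(offset, length)` is a caller-supplied callback.
function renderInChunks(total, quantum, renderChunk) {
  let offset = 0;
  while (offset < total) {
    // The last chunk may be shorter than the quantum.
    const length = Math.min(quantum, total - offset);
    renderChunk(offset, length);
    offset += length;
  }
}
```

Inside the callback, a consumer could run the two calls from the loop shown earlier, seq.processTick() followed by synth.renderAudio(dry, rev, chr, offset, length), so that envelopes and LFOs are still updated at least every 128 samples.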

spessasus avatar Sep 25 '25 13:09 spessasus

Alright, I can work with that.

Quick question. Does spessasynth support layering (overriding) instruments by loading multiple SoundFont banks?

For example, if I want to use GeneralUserGS and then override the Piano with Yamaha Grand Lite.

mmontag avatar Sep 26 '25 18:09 mmontag

Yes, of course.

Each sound bank has a unique id you give and even has a bank MSB offset. You can rearrange them (change which overrides which).

This is how the RMIDI support works: it loads an embedded sound bank as the one with the highest priority.
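The override behaviour can be pictured as a priority-ordered lookup. The sketch below is not spessasynth's sound bank manager: the data shape (banks with an id and a presets array) and the findPreset name are invented for illustration, and the bank MSB offset mentioned above is not modeled.

```javascript
// `banks` is ordered from highest to lowest priority.
// The first bank that contains the requested preset wins.
function findPreset(banks, bankNumber, program) {
  for (const bank of banks) {
    const preset = bank.presets.find(
      (p) => p.bank === bankNumber && p.program === program
    );
    if (preset) return { bankId: bank.id, name: preset.name };
  }
  return null; // no loaded bank provides this preset
}

// Illustrative data: a piano-only bank layered over a full GM bank.
const layeredBanks = [
  { id: 'piano-override', presets: [{ bank: 0, program: 0, name: 'Grand Piano' }] },
  {
    id: 'general',
    presets: [
      { bank: 0, program: 0, name: 'Piano 1' },
      { bank: 0, program: 48, name: 'Strings' },
    ],
  },
];
```

Here program 0 resolves to the override bank while program 48 falls through to the general bank, which is the behaviour asked about for GeneralUser GS plus a replacement piano.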

spessasus avatar Sep 26 '25 19:09 spessasus

Hey, I got the spessasynth_core plugged in, and it's pretty awesome to hear the difference in XG and GS songs, complete with filter sweeps and nonstandard drum tracks.

Sharing some speed bumps/thoughts so far:

1.

Failed to compile.

./node_modules/spessasynth_core/dist/index.js 13:15

Module parse failed: Unexpected token (13:15)

You may need an appropriate loader to handle this file type, currently no loaders are configured to process this file. See https://webpack.js.org/concepts#loaders

|    * The current index of the array.
|    */
>   currentIndex = 0;
|   /**
|    * Returns a section of an array.

Fixed this by including spessasynth_core in Babel transpilation, but doesn't seem like a great solution.

2.

Compiled with warnings.

./node_modules/spessasynth_core/dist/index.js
Critical dependency: require function is used in a way in which dependencies cannot be statically extracted

./node_modules/spessasynth_core/dist/index.js
Critical dependency: require function is used in a way in which dependencies cannot be statically extracted

Search for the keywords to learn more about each warning.
To ignore, add // eslint-disable-next-line to the line before.

Not sure about this one. I am on node 22.14/npm 10.9.2/Webpack 4.46 (and have no desire to change that right now).

3.

The default pitch vibrato rate seems to be about 2x faster than FluidLite/FluidSynth.

4.

It would be cool to have an isFinished property in the sequencer, rather than relying on the pause event.

5.

Setting playbackRate has an unfortunate side effect of killing all notes. I see that you update currentTime in the setter, maybe there is another way?

Personally I am not a fan of JS accessors; getting/setting a property can have surprising side effects for library consumers.

mmontag avatar Sep 28 '25 21:09 mmontag

I'm glad that you've managed to get it working!

1., 2.

Can you commit your code to a branch so I can take a look at it? The only thing I can note so far is that you seem to use require. This is the old CommonJS method. Spessasynth is an ES module, so you do:

import { SpessaSynthProcessor } from 'spessasynth_core';

And the bundler should take care of everything. Example.

3.

What do you mean by the "default vibrato pitch rate"? If you mean the modulation wheel, I have confirmed that the latest fluidsynth has the exact same rate in gm.dls organ 1 patch as spessasynth. What sound bank and what MIDI does this occur on?

There's also a possibility that the MIDI sets a custom vibrato with the GS NRPN (such as the touhou MIDIs). This can be disabled.

spessasus avatar Sep 29 '25 19:09 spessasus

1

I'll push a branch later tonight. I also write my project with ES module syntax... so I assume this warning is about some usage of require in the spessasynth_core compiled distributable.

3

Yea, I mean the mod wheel. It occurs on all songs. Even in the spessasynth demo site the vibrato sounds faster than what I am used to. But, who knows...

6

I also noticed there is some long (e.g. 200ms) blocking delay when parsing large MIDI files or loading them into the sequencer. I'm hopeful there's room to optimize this because I haven't had this issue (even though I scan through every MIDI event in JS before playback begins 😀). I tried A/B testing a couple files between my current sequencer and spessasynth and there is a significant difference.

Example:

  • https://chiptune.app/?play=%2FClassical%20MIDI%2FProkofiev%2FPiano%20Concerto%20No.%202%20in%20G%20minor%2C%20Op%2016%20-%20I.%20Cadenza%20(Ky6000).mid
  • https://chiptune.app/?play=%2FClassical%20MIDI%2FRachmaninoff%2FSymphony%20No.%202%20in%20E%20minor%2C%20Op.%2027%20%E2%80%93%20I.%20Largo.%20Allegro%20moderato%20(D.%20L.%20Viens).mid
  • https://chiptune.app/?play=%2FClassical%20MIDI%2FBrahms%2FSymphony%20No.%201%20in%20C%20minor%2C%20Op.%2068%20-%20IV.%20Adagio%20-%20Piu%CC%80%20andante%20-%20Allegro%20non%20troppo%2C%20ma%20con%20brio%20(Tatsu).mid
  • anything here https://chiptune.app/browse/Classical%20MIDI/Mahler 🙂

mmontag avatar Sep 29 '25 20:09 mmontag

3

I have compared polyphone, fluidsynth and spessasynth. All use the SF2 default 8.176 Hz. Note that a soundfont can override this value to something lower. Maybe you are using a different one?

6

I think I have diagnosed the issue. The first file crashes my MIDI viewer of choice, Falcosoft MIDI player. Sekaiju managed to open it and confirmed that there are a lot of tempo change messages. Sekaiju reports 46268 meta messages! Spessasynth detects exactly 46269 tempo changes. It first goes through them to determine the duration. This loop runs after the entire file has been parsed, so I think I can optimize it to calculate the duration while parsing.

What's also causing the problem are the metaEvent emitters. Tempo changes are technically meta events, so all of them up until the point of the seeked position are sent all at once. To be honest, I'm not sure what to do about this one. The only way I can think of would be only sending the last tempo change event... I need to look into it. I must admit that I've never seen a file with so many tempo changes!

spessasus avatar Sep 29 '25 22:09 spessasus

3

I'll look into this. Maybe something weird on my end. I'm using GeneralUser GS 2.0.1.

6

Haha, right. This huge number of tempo events might be the result of aligning a MIDI score to a performance or something.

Could I disable the event emitter and re-enable it after playback starts...?

My MIDI player currently relies on MIDIFile, which also calculates a playTime for each event.

mmontag avatar Sep 29 '25 22:09 mmontag

I have created a bug in core:

https://github.com/spessasus/spessasynth_core/issues/19

I'll start working on it soon.

spessasus avatar Sep 29 '25 23:09 spessasus

@mmontag It turns out that midiToSeconds was horribly inefficient. It called .find for every tempo change! Thanks for the heads up, this has been fixed! 4.0.11 now includes the fixed calculation time and optimized metaEvent calls. It loads pretty much instantly now.
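To show why a per-event search over tempo changes gets expensive with tens of thousands of them, and what a cheaper alternative can look like, here is a sketch of a cumulative tempo map built in one pass and queried with a binary search. This is not the actual fix that went into spessasynth_core; the function names and the { tick, microsPerQuarter } event shape are assumptions.

```javascript
// Builds a cumulative tempo map in a single pass.
// `tempoChanges` is a tick-sorted list of { tick, microsPerQuarter }.
function buildTempoMap(tempoChanges, ticksPerQuarter) {
  // MIDI files default to 500000 microseconds per quarter note (120 BPM).
  const map = [{ tick: 0, seconds: 0, microsPerQuarter: 500000 }];
  for (const change of tempoChanges) {
    const prev = map[map.length - 1];
    const elapsed =
      ((change.tick - prev.tick) * prev.microsPerQuarter) / (ticksPerQuarter * 1e6);
    map.push({
      tick: change.tick,
      seconds: prev.seconds + elapsed,
      microsPerQuarter: change.microsPerQuarter,
    });
  }
  return map;
}

// Converts a tick position to seconds using the prebuilt map.
function ticksToSeconds(map, ticksPerQuarter, tick) {
  // Binary search for the last map entry at or before `tick`.
  let lo = 0;
  let hi = map.length - 1;
  while (lo < hi) {
    const mid = (lo + hi + 1) >> 1;
    if (map[mid].tick <= tick) lo = mid;
    else hi = mid - 1;
  }
  const seg = map[lo];
  return seg.seconds + ((tick - seg.tick) * seg.microsPerQuarter) / (ticksPerQuarter * 1e6);
}
```

Building the map is linear in the number of tempo changes and each lookup is logarithmic, whereas a linear search per tempo change grows quadratically on files like the Prokofiev example.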

spessasus avatar Sep 30 '25 12:09 spessasus

Thanks for fixing that! I have updated to 4.0.11 and now I still notice some issues, but I can give a more detailed diagnosis.

Mahler Symphony 7

Here we see 146 ms for BasicMIDI.fromArrayBuffer() and 733 ms for sequencer.loadNewSongList().

It is also detected as multiport MIDI, which looks like a false positive for many MIDI files I have tried out.

Image

Here's my code for logging this out:

Image

Here is a perf snapshot of the entire loadData sequence:

Image

(FYI, my machine is Macbook Air M3.)

mmontag avatar Oct 01 '25 06:10 mmontag

I did a profile with my existing MIDIPlayer for comparison.

The same Mahler file takes about 156 ms to parse (with midifile npm module), which is faster, but nothing incredible. I know at least this parser is keeping track of tempo and calculating wall time for every event.

I understand spessasynth sequencer might be doing a lot more bookkeeping, but would you be willing to add options for tuning this? Think we can get a comparable speed?

Image

mmontag avatar Oct 01 '25 06:10 mmontag

It is also detected as multiport MIDI, which looks like a false positive for many MIDI files I have tried out.

Mahler is a 32 channel Multi-port file:

Image

You can see with the synthesizer controller. Multi-port files are often used for the exact purpose of complex compositions. Though even if a file isn't multi-port, it doesn't really matter as it would just use the first port (the regular 16 channels).

Regarding the vibrato, this file indeed uses the Roland GS NRPN vibrato:

Calling

manager.synth.setLogLevel(true, true, true)

In the console of the web app results in:

Image

This is probably why you heard the faster modulation. Fluidsynth does not support these messages, but spessasynth does.

Regarding the performance...

I wrote a simple test for measuring these two in node. I have removed a check that validates if all bytes are max 7-bit in voice messages, since I've found one file which declared a velocity of 255. But since the check is already in noteOn, I don't think it's needed.

Before:

Engine initialized, loading MIDI...
MIDI parsed in: 177.571ms
New song loaded in: 337.835ms

After

Engine initialized, loading MIDI...
MIDI parsed in: 77.631ms
New song loaded in: 146.742ms

I hope that's better :-)

And regarding the loadNewSongList time: it's the dynamic preloading. It doesn't just preload the samples, but it also preloads generators and modulators (searches for all matching zones and parameters), so on the actual note-on, it just pulls them from the cache. That's the tradeoff for smooth playback, regardless of compression.

For example the result with BasicSoundBank.getSampleSoundBankFile() is the above (146ms), while with a large SF3 bank it's:

Engine initialized, loading MIDI...
MIDI parsed in: 76.216ms
New song loaded in: 2.003s

This is because the samples are decompressed at load time. I think it's a good trade for being able to use SF3 files. (Regarding those, I might have a few good suggestions to add to chip player if you're interested. Though I'd need to know what the max size is.)

You can also see that if you call loadNewSongList on the same song again:

New song loaded in: 2.684s
New song loaded in: 329.263ms

The time is way shorter as there's nothing left to cache. I've also discovered an obscure bug where the velocity override generator would also influence the cache, occasionally causing it to cache twice. So thanks for that too!

The update is up as 4.0.12. Let me know how this goes.

PS: I still can't find the spessasynth branch, did you push it to github?

spessasus avatar Oct 01 '25 12:10 spessasus

BTW, comparing FluidSynth and Beatnik: in Beatnik, note length is about 105% of the real length. BASSMIDI partially supports this, but note-off needs to allow the extra 5% to overlap. Beatnik's sound algorithm is much better. It's also worth noting that FluidSynth has better rendering and modulator support than BASSMIDI.

MXMF55 avatar Oct 01 '25 12:10 MXMF55

Branch

Just pushed the branch: https://github.com/mmontag/chip-player-js/tree/spessa/src

Multiport

Wow, now I see. The file has tracks labeled [A01] to [A16] and [B01] to [B16].
The fallback behavior in single port players is that A01 and B01 share a single channel, so one of them might get the wrong instrument, I guess?

Preloading strategy

I updated to 4.0.12 and measured the same file total load time 450 to 500 ms, consistent with your results. Great improvement!

However, I still don't understand the dynamic preloading, but maybe I'm not thinking hard enough. I don't understand why preloading samples (and generators and modulators) is relevant on a per-midifile basis. Take my perspective; my application loads the entire SoundFont in memory upon application start, not waiting for a MIDI file to ask for Bright Piano and Slow Strings à la carte. (Similar to what a video game would do, it's a preloading strategy of larger batches.) Would be awesome if spessasynth can accommodate this strategy!

Vibrato

I made some recordings on hardware:

Spessasynth mahler-7-5--spessasynth.mp3

Roland SC-88 Pro mahler-7-5--sc-88pro.mp3

Yamaha MU1000 mahler-7-5--mu1000.mp3

The vibrato can't possibly be the author's intention, right? And this is not reflected in the hardware synths. What do you think? Where does the problem lie?

mmontag avatar Oct 02 '25 07:10 mmontag

Branch

I have taken a look at it and I have some suggestions:

 this.fileExtensions = ['mid', 'midi', 'smf'];

You can also add kar, rmi and xmf. All of these are supported by spessasynth (the same BasicMIDI.fromArrayBuffer, no need to change). RMI is SF2/DLS + MIDI in one file, and XMF is like RMI except that it was commonly used in older phones for ringtones.

// HACK: MIDI metadata is guessed from filepath
// based on the directory structure of Chip Player catalog.
// Ideally, this data should be embedded in the MIDI files.
if (parts.length >= 3) {
  meta.formatted = {
    title: `${parts[1]} - ${parts[len - 1]}`,
    subtitle: parts[0],
  };
} else if (parts.length === 2) {
  meta.formatted = {
    title: parts[1],
    subtitle: parts[0],
  };
} else {
  meta.formatted = {
    title: parts[0],
    subtitle: 'MIDI',
  };
}

I propose passing the guessed name as the altName in fromArrayBuffer (I forgot to document it, sorry!) and using the name detection. Spessasynth can detect a MIDI name from the file itself. If it fails, the altName will be used. The same goes for metadata, though it is mostly text that is considered "interesting": usually empty track names where authors often put their names or copyrights, but also copyright events and more.

You can try to see what these two detect by running your MIDIs in the web app and opening the "music player mode".

Preloading

However, I still don't understand the dynamic preloading, but maybe I'm not thinking hard enough. I don't understand why preloading samples (and generators and modulators) is relevant on a per-midifile basis.

The voices are cached for each preset's note and velocity. This means that caching the entire preset would be 128 keys * 127 velocities = 16,256 lookups + possible sample decoding! And that's per preset; there are sound banks (like HiDef.sf2) which have over a thousand of them! Each lookup is relatively expensive, because matching preset and instrument zones have to be found and their generators and modulators have to be summed up according to the SF2 spec. Back when spessasynth didn't have that, there were sometimes stutters with large presets with lots of zones when first playing a given key and velocity.

This is why spessasynth goes through the file to determine what keys are pressed with what velocities and with what presets, then only preloads these specific combinations. It used to load everything on bootup, but it took a very long time on weak devices, especially with SF3 (and ate up a lot of RAM too). So I have implemented the dynamic preloading system, i.e. only preloading used key combinations.
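The difference between preloading everything and preloading only what a song uses can be sketched like this. The noteOns list of { preset, key, velocity } objects and the usedVoiceKeys name are hypothetical; the real sequencer derives this information from the parsed MIDI file in its own way.

```javascript
// Upper bound per preset if every key/velocity pair were cached up front.
const FULL_PRESET_LOOKUPS = 128 * 127; // 16,256

// Collects the unique (preset, key, velocity) combinations a song uses.
function usedVoiceKeys(noteOns) {
  const keys = new Set();
  for (const n of noteOns) {
    keys.add(`${n.preset}:${n.key}:${n.velocity}`);
  }
  return keys;
}
```

A song that repeats the same notes at the same velocities produces a set far smaller than FULL_PRESET_LOOKUPS per preset, and only those entries need their zones resolved and samples decoded before playback.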

Technically you can force a complete preload during load time (iterate over each key-velocity combo in each preset). I haven't tried that though.

Vibrato

The Sound Canvas vibrato is relative. This means that the NRPN changes the existing vibrato parameters of a given instrument. SC instruments each have a set vibrato even if it's disabled by default, with params like delay or rate. Unfortunately, 99% of soundfonts leave it at the default, which is disabled, 8.176 Hz and no delay. BASSMIDI, for example, implements the relative change, which usually results in very fast vibrato that doesn't sound good. So I came up with an equation that puts a custom vibrato on an instrument which sounds "alright". This isn't accurate in the slightest, but it's something. However, it can be disabled permanently. I hope this explains things.

spessasus avatar Oct 02 '25 17:10 spessasus

Branch

I'll implement your suggestions, thanks :)

Preloading

The voices are cached for each preset's note and velocity. This means that caching the entire preset would be 128 keys * 127 velocities = 16,256 lookups + possible sample decoding!

Please bear with me...

What is a "voice" here (assuming a plain SF2, what data must be cached)? What if I am using spessasynth for live performance with a MIDI keyboard? How does fluidsynth manage without preloading?

mmontag avatar Oct 08 '25 00:10 mmontag

4.0.16 comes with two of your requested features:

  • an isFinished boolean.
  • a preload boolean with which you can disable the preloading of the sequencer if you'd like.

What is a "voice" here (assuming a plain SF2, what data must be cached)?

A voice represents a single SF2 synthesis model. That is:

  • a sample and the so-called "wavetable oscillator" that plays it back
  • the volume envelope
  • the modulation envelope
  • the LFOs
  • the low-pass filter
  • the generators (static parameters)
  • the modulators (dynamic parameters)

Most high-quality sound banks use more than one voice per note. For example, 2 voices for a stereo sample pair.

The cached data are the generators, modulators and of course the sample. Generators and modulators are relatively cheap to compute, but when a song starts with multiple chords (it often does), a lot of these have to be loaded at once, possibly leading to stuttering on weaker devices.

What if I am using spessasynth for live performance with a MIDI keyboard?

Then it will load it as you play. Works fine with SF2, not so much with SF3. You can preload all the samples and it would fix that but it takes time.

How does fluidsynth manage without preloading?

It doesn't. Quoting you:

Yep. FluidLite supported SF3, but when I tried it, it introduced unacceptable latency and I completely gave up on the idea.

Which is what preloading completely avoids, no matter how large the samples are. The chiptune.app website already takes quite a bit of time for a MIDI to start playing (at least on my end), presumably downloading the files. Which is why 500 ms of extra latency is worth the benefits of being able to use SF3s, I think :-)

Though again, you can now completely disable the preloading.

spessasus avatar Oct 08 '25 07:10 spessasus