sd-webui-deforum

[Feature Request]: a midi to keyframes tool

Open OrHavraPerry opened this issue 1 year ago • 9 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues and checked the recent builds/commits

What would your feature do ?

The tool would convert MIDI files into a keyframes JSON format, using the note parameters as the values to pass.

Proposed workflow

  1. Run analysis
  2. Filter MIDI by note, channel, gate, etc.
  3. Convert to keyframes, values, and function.
  4. Convert to JSON.
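
A minimal sketch of steps 2 to 4, assuming the analysis step has already produced a plain list of note events. The event shape `(time_seconds, channel, note, velocity)` and the function names are hypothetical, chosen only for illustration; the output string follows the `frame:(value)` schedule style used elsewhere in this thread.

```python
import json

# Hypothetical event shape: (time_seconds, channel, note, velocity).
def filter_events(events, note=None, channel=None, min_velocity=1):
    """Step 2: keep only events matching the requested note/channel/velocity."""
    return [
        e for e in events
        if (note is None or e[2] == note)
        and (channel is None or e[1] == channel)
        and e[3] >= min_velocity
    ]

def to_keyframes(events, fps):
    """Step 3: map each event to a frame number and a value scaled to 0..1."""
    keyframes = {}
    for time_s, _channel, _note, velocity in events:
        keyframes[round(time_s * fps)] = round(velocity / 127, 4)
    return dict(sorted(keyframes.items()))

def to_schedule(keyframes):
    """Render keyframes as a 'frame:(value)' schedule string."""
    return ", ".join(f"{frame}:({value})" for frame, value in keyframes.items())

events = [
    (0.0, 0, 60, 127),
    (0.5, 0, 62, 64),   # dropped: different note
    (1.0, 0, 60, 64),
    (2.0, 1, 60, 100),  # dropped: different channel
    (2.5, 0, 60, 0),    # dropped: velocity below min_velocity
]
selected = filter_events(events, note=60, channel=0)
keyframes = to_keyframes(selected, fps=24)
keyframes_json = json.dumps(keyframes)  # Step 4: JSON output
schedule = to_schedule(keyframes)
```

Reading the events out of an actual MIDI file is left out here on purpose; any MIDI parsing library could feed this shape.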

Additional information

This feature would be beneficial for working with musicians' tools, such as any DAW.

Are you going to help adding it?

While I do not code much lately, especially not web apps, I can assist with the feature design. I have experience working with MIDI in code and creating music from time to time.

OrHavraPerry avatar Mar 09 '23 17:03 OrHavraPerry

This is a cool idea! I would like to chat more about implementation ideas around this. How would timing work? Would it be based on the distance between beats vs. a keyframe max, or would this be a schedule? I think there are a lot of possibilities here. Do MIDI files encode timings? Would the overall timing of the frames be matched? https://pypi.org/project/MIDIFile/ seems like a good option. @OrHavraPerry, do you have any ideas on these questions? What would you like to see most?
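
On the timing question: standard MIDI files store event times in ticks, with the header giving ticks per quarter note and a tempo given in microseconds per quarter note (500000, i.e. 120 BPM, when the file sets none). A rough sketch of the tick to frame conversion, assuming a single constant tempo; the function names are made up for illustration.

```python
def ticks_to_seconds(ticks, ticks_per_beat, tempo_us_per_beat=500_000):
    """Convert MIDI ticks to seconds, assuming one constant tempo."""
    return ticks * tempo_us_per_beat / (ticks_per_beat * 1_000_000)

def seconds_to_frame(seconds, fps):
    """Snap a time in seconds to the nearest output frame."""
    return round(seconds * fps)

# 960 ticks at 480 ticks per beat is 2 beats, which is 1 second at 120 BPM.
one_second = ticks_to_seconds(960, ticks_per_beat=480)
frame = seconds_to_frame(one_second, fps=24)
```

A file with tempo changes would need the conversion applied segment by segment, which this sketch does not attempt.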

MatissesProjects avatar Mar 19 '23 08:03 MatissesProjects

This is super cool, as a musician this would be fantastic. The ability to read CC (Control Change) messages might be cool too. Timing in MIDI: https://mido.readthedocs.io/en/latest/midi_files.html

I'm not sure which keyframes to generate, and from what data. Maybe there should be a pre-prepared track in the file where you could assign a single note to a parameter in Deforum, and the note velocity would be scaled between 0-1 automatically? So you could designate, let's say, note 60 (middle C, IIRC) to be read for the zoom parameter, then write notes by hand on a separate track (let's assume track 1 is always read by Deforum) wherever you want the zoom speed to change. Then you could use note 61 for another parameter, etc.

Otherwise I find it hard to see how a polyphonic note stream could be automatically parsed into the useful, linear, single stream of numbers that the parameters expect.

Maybe a stream of CC values could also be enabled to be read as a parameter, again maybe scaled always between 0 and 1.
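
A small sketch of that note-per-parameter idea, under stated assumptions: the note numbers, parameter names, and the `(time_seconds, note, velocity)` event shape are all hypothetical. Velocity (0 to 127) is scaled to 0..1, and notes with no assigned parameter are ignored.

```python
# Hypothetical assignment of single notes to Deforum parameters.
NOTE_TO_PARAM = {60: "zoom", 61: "angle"}

def notes_to_param_keyframes(note_events, fps, note_to_param=NOTE_TO_PARAM):
    """Build one {frame: value} dict per parameter from (time_s, note, velocity)."""
    schedules = {param: {} for param in note_to_param.values()}
    for time_s, note, velocity in note_events:
        param = note_to_param.get(note)
        if param is None:
            continue  # note is not assigned to any parameter
        schedules[param][round(time_s * fps)] = velocity / 127
    return schedules

control_events = [
    (0.0, 60, 127),  # zoom at full value on frame 0
    (1.0, 61, 0),    # angle at zero on frame 10
    (2.0, 72, 90),   # unassigned note, skipped
    (2.0, 60, 0),    # zoom back to zero on frame 20
]
param_keyframes = notes_to_param_keyframes(control_events, fps=10)
```

CC values use the same 0 to 127 range, so the same scaling would apply to a CC stream.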

Taikakim avatar Mar 25 '23 21:03 Taikakim

I feel the last part is exactly how I pictured it. Not trying to over-complicate and letting the user select the control tracks they want with the CC events getting re-timed and then mapped to the timings selected in deforum (fps rate, e.g.).

I feel that parsing the music itself is an idea, but not one for a first pass at making MIDI control useful. You can still write control tracks on top of the MIDI track and have the events synced perfectly to the timing of the music, without needing to parse note events per se.

You could also support multiple control tracks, and have fades, and even CC messages that can control which tracks to switch/use in the control tracks themselves.

I pictured this working like any running track, like audio, that will just have a stack of values available when the user wants to use them at frame X for whatever control. The existing expression language would just need some core cmd/function that would expose the values from the current midi file.

I would also normalize to -1...+1 instead of just 0...1. With -1 you have the ability to scale backwards. As a simple example, think of the default zoom and how it uses sin (-1...+1) to produce its slow float-in effect. With 0 it will only ever flatten.

To recap the need for this idea:

  1. UI to load midi file for parsing, possibly in the Init tab?
  2. Define a new built-in function (midi) that takes a track name/number and returns the current frame's value for that control track, e.g. midi(0) [defaulting to track 0] would just take the CC values there and normalize them to -1...1
  3. Make a control track for say CFG Scale in my favorite DAW, sync'd to where I want it to flake out with the music
  4. Issue "midi" command with track you want to use in expression language on anything you want to control with the value of the current frame from said track: e.g. Zoom: 0:(1.0025+0.002 * midi(0))
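
A rough sketch of how such a `midi()` lookup could behave, written as plain Python rather than as part of Deforum's expression language. The `ControlTracks` class, its data layout, and the explicit `frame` argument are assumptions for illustration; the proposal above would supply the current frame implicitly. The latest CC value at or before the frame is held, then normalized from 0..127 to -1...1.

```python
from bisect import bisect_right

class ControlTracks:
    """Hypothetical holder for per-track (frame, cc_value) points."""

    def __init__(self, tracks):
        # tracks: {track_number: [(frame, cc_value in 0..127), ...]}
        self._tracks = {number: sorted(points) for number, points in tracks.items()}

    def midi(self, frame, track=0):
        """Return the held CC value for this frame, normalized to -1..1."""
        points = self._tracks[track]
        frames = [f for f, _ in points]
        index = bisect_right(frames, frame) - 1
        if index < 0:
            return 0.0  # assumption: neutral value before the first event
        return points[index][1] / 127 * 2 - 1

tracks = ControlTracks({0: [(0, 127), (10, 0)]})

def zoom(frame):
    # Mirrors the example expression: 1.0025 + 0.002 * midi(0)
    return 1.0025 + 0.002 * tracks.midi(frame, 0)
```

Holding the last value is only one option; interpolating between points would be an equally reasonable choice.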

I feel this fits with the existing system.

Amorano avatar Mar 25 '23 23:03 Amorano

@Amorano I really like that you have defined a method style for the midi track call. Does this make sense generally to people? It could possibly be scheduled; that way, as you mentioned, you can have multiple control tracks and multiple possible switches between them.

As far as scaling between 0 to 1 vs -1 to 1, this can also be a setting. Technically 0-1 can be multiplied by 2 and offset to become -1 to 1, so we might go with the superset and just add a multiplier or something, if that makes sense.
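
The rescaling itself is a one-liner. A sketch with a hypothetical helper name and an optional multiplier, assuming inputs already normalized to 0..1:

```python
def rescale(value, out_min=-1.0, out_max=1.0, multiplier=1.0):
    """Map a 0..1 value onto out_min..out_max, then apply a multiplier."""
    return (out_min + value * (out_max - out_min)) * multiplier

low = rescale(0.0)    # bottom of the 0..1 range maps to -1
mid = rescale(0.5)    # midpoint maps to 0
high = rescale(1.0)   # top of the range maps to +1
```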

A question about how to read the values in the midi(0) track: how would you imagine this getting interpreted? If I have a way to read the different channels, what data should I look for?

MatissesProjects avatar Mar 26 '23 18:03 MatissesProjects

OK, so I have a question on top of this: would interpolation between MIDI tracks be useful? Say you have two or more control tracks; instead of modifying the song in MIDI, you could control their scaling or something. Does that make sense for MIDI? I haven't used it directly.

MatissesProjects avatar Mar 26 '23 20:03 MatissesProjects

It might be worth sharing this with the dev of parseq; he's got an audio analyser, so parseq feels like a natural place for a MIDI thing.

andyxr avatar Mar 29 '23 17:03 andyxr

> OK, so I have a question on top of this: would interpolation between MIDI tracks be useful? Say you have two or more control tracks; instead of modifying the song in MIDI, you could control their scaling or something. Does that make sense for MIDI? I haven't used it directly.

Useful, maybe. Practical in the first pass of just getting a control value stream applied to an existing variable on a per frame pre-call... maybe not so much =D

Amorano avatar Mar 29 '23 17:03 Amorano

This is a good idea in principle - not sure if anyone needs to write anything though: https://www.visipiano.com/midi-to-json-converter/ and https://tonejs.github.io/Midi/ are just two examples of online (free) MIDI to JSON converters.

From there, you can convert it into parseq or schedules, based on whichever aspects of the MIDI data you're targeting to convert to Deforum events.
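
As a rough illustration of that conversion step, here is a sketch that turns converter output into a schedule string. It assumes a JSON layout like the one the Tone.js converter produces, with `tracks[].notes[]` entries carrying `time` in seconds and `velocity` already in 0..1; treat that shape as an assumption and check it against the actual file. The function name is hypothetical.

```python
import json

def json_notes_to_schedule(midi_json, track_index, fps):
    """Turn one track's notes into a 'frame:(value)' schedule string."""
    track = midi_json["tracks"][track_index]
    keyframes = {}
    for note in track["notes"]:
        # Assumed fields: "time" in seconds, "velocity" in 0..1.
        keyframes[round(note["time"] * fps)] = round(note["velocity"], 4)
    return ", ".join(f"{f}:({v})" for f, v in sorted(keyframes.items()))

sample = json.loads(
    '{"tracks": [{"notes": ['
    '{"midi": 60, "time": 0.0, "velocity": 0.8},'
    '{"midi": 60, "time": 1.5, "velocity": 0.25}'
    ']}]}'
)
note_schedule = json_notes_to_schedule(sample, track_index=0, fps=12)
```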

I might have a play and see if a good workflow drops out...

sashaagafonoff avatar Apr 14 '23 04:04 sashaagafonoff

I made a more specific version of it here for those who use Ableton. It basically hacks track automation to generate keyframes, which you copy-paste into the UI. It works pretty well, but there are some issues.

The biggest is that automation events are specified as fractional beats, so multiple events can be mapped to the same keyframe, and it's hard to disambiguate without knowing the artist's intent. I'll look to see if using MIDI instead could open this up to more DAWs and address that issue.
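
To make the collision problem concrete, here is a sketch that snaps fractional beats to frames and then applies an explicit policy when several events land on the same frame. The policies ("last", "mean", "max") and the function name are assumptions for illustration, not what the Ableton tool does; none of them can recover the artist's intent, they only make the choice visible.

```python
from collections import defaultdict

def beats_to_keyframes(events, bpm, fps, resolve="last"):
    """Map (beat, value) events to {frame: value}, resolving collisions."""
    buckets = defaultdict(list)
    for beat, value in sorted(events, key=lambda e: e[0]):
        frame = round(beat * 60 / bpm * fps)
        buckets[frame].append(value)

    if resolve == "last":
        pick = lambda values: values[-1]
    elif resolve == "mean":
        pick = lambda values: sum(values) / len(values)
    elif resolve == "max":
        pick = max
    else:
        raise ValueError(f"unknown resolve policy: {resolve}")

    return {frame: pick(values) for frame, values in sorted(buckets.items())}

# At 120 BPM and 12 fps one beat spans 6 frames, so beats 1.0 and 1.05
# both round to frame 6.
automation = [(1.0, 0.25), (1.05, 0.75), (2.0, 0.5)]
last_wins = beats_to_keyframes(automation, bpm=120, fps=12, resolve="last")
averaged = beats_to_keyframes(automation, bpm=120, fps=12, resolve="mean")
```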

ryg4 avatar Apr 21 '23 20:04 ryg4

Not within the scope of this repo

kabachuha avatar Aug 16 '23 11:08 kabachuha