
All BPM values in the MusicXML output of the detokenizer are 120

Chunyuan-Li opened this issue 2 years ago • 4 comments

I noticed that in MusicXML, a per-minute element inside the direction-type element (within a direction) describes beats per minute (BPM). However, this element is handled by neither the tokenizer nor the detokenizer. As a result, after detokenization, all BPM values in the resulting MXL file are set to 120, which is clearly problematic.
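For reference, a tempo marking in MusicXML typically looks like the fragment below; the tempo value of 96 is illustrative:

```xml
<!-- MusicXML tempo marking: per-minute sits inside metronome, inside
     direction-type; the sound element carries the playback tempo. -->
<direction placement="above">
  <direction-type>
    <metronome>
      <beat-unit>quarter</beat-unit>
      <per-minute>96</per-minute>
    </metronome>
  </direction-type>
  <sound tempo="96"/>
</direction>
```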

Chunyuan-Li avatar Jul 30 '23 08:07 Chunyuan-Li

Sorry for my super late response. Currently, the tools do not consider elements related to BPM or tempo, because these elements are not directly involved in the MIDI-to-Score conversion.

If these elements are necessary for your use case, I suggest considering custom extensions to the tokenizer and detokenizer:

  • Tokenizer (score_to_tokens.py): add the necessary elements to attribute_to_token(), and modify attributes_to_tokens() accordingly.
  • Detokenizer (tokens_to_score.py): add the conversion to music21 elements in single_token_to_obj().
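A hypothetical sketch of such an extension, assuming a string-based token vocabulary; the "bpm_<value>" format and both helper names are my own invention, not part of the ScoreTransformer API. On the detokenizer side, the decoded number would then be wrapped in a music21 tempo object inside single_token_to_obj():

```python
# Hypothetical tempo token round-trip for a string-based token vocabulary.
# The "bpm_<value>" format is an assumption, not the actual token scheme.

def bpm_to_token(bpm: float) -> str:
    """Encode a tempo as a token (tokenizer side)."""
    return f"bpm_{int(round(bpm))}"

def token_to_bpm(token: str):
    """Decode a 'bpm_*' token back to a number (detokenizer side);
    return None for any other token so the caller can fall through."""
    if token.startswith("bpm_"):
        return float(token.split("_", 1)[1])
    return None
```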

suzuqn avatar Dec 03 '23 10:12 suzuqn

Yes, I tried that as well. However, I found that the model struggles to predict the BPM accurately, regardless of whether I explicitly specify the BPM in the MIDI tokens (taken from the MIDI tempo changes). Eventually, I removed the BPM indicators.

Additionally, I encountered another issue on the model side: when training on pairs of non-standard MIDIs (typically converted from audio) and standard MusicXML, the model's prediction performance deteriorates significantly, often with missing notes. Do you have any suggestions?

Chunyuan-Li avatar Dec 06 '23 06:12 Chunyuan-Li

If your goal is to transcribe BPM from MIDI to Score, I think it's not necessary to include it in the token conversion process. Instead, you could simply append the BPM read from the MIDI as a tempo object to the transcribed score.
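A minimal sketch of that post-processing route: scan the raw MIDI bytes for the first set_tempo meta event (FF 51 03, carrying microseconds per quarter note) and convert it to BPM. This naive byte scan is not a full SMF parser and could in principle match inside note data, so treat it as illustrative:

```python
# Sketch: read the first tempo out of a Standard MIDI File and attach it
# to the transcribed score afterwards, instead of tokenizing it.
# Assumes tempo is stored as an FF 51 03 meta event; this is a naive
# byte scan, not a full SMF parser.

def first_bpm(midi_bytes: bytes, default: float = 120.0) -> float:
    """Return the BPM of the first set_tempo meta event, or `default`."""
    marker = b"\xff\x51\x03"  # meta event: set_tempo, payload length 3
    i = midi_bytes.find(marker)
    if i == -1 or i + 6 > len(midi_bytes):
        return default
    usec_per_quarter = int.from_bytes(midi_bytes[i + 3:i + 6], "big")
    return 60_000_000 / usec_per_quarter
```

The returned value could then be attached to the transcribed score, e.g. via music21's tempo.MetronomeMark: `score.insert(0, tempo.MetronomeMark(number=bpm))`.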

Your issue with "non-standard MIDIs" seems similar to the case of unquantized (noisy) input described in Section 6.5 of my paper. The key might be augmentation of note timing and duration.
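A minimal sketch of that kind of augmentation, jittering note onsets and durations with Gaussian noise; the (onset, duration, pitch) representation and the noise scales are illustrative choices, not values from the paper:

```python
import random

def jitter_notes(notes, timing_std=0.01, dur_std=0.02, rng=None):
    """Add Gaussian noise to note onsets and durations (in seconds).

    `notes` is a list of (onset, duration, pitch) tuples; the noise
    scales are illustrative, not values from the paper.
    """
    if rng is None:
        rng = random.Random(0)
    out = []
    for onset, dur, pitch in notes:
        onset = max(0.0, onset + rng.gauss(0.0, timing_std))  # clamp at 0
        dur = max(0.01, dur + rng.gauss(0.0, dur_std))        # keep positive
        out.append((onset, dur, pitch))
    return out
```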

suzuqn avatar Dec 16 '23 15:12 suzuqn

Indeed, it resembles the noise addition discussed in Section 6.5 of the paper. I ran a comparison using standard MIDI files, varying parameters such as the noise ratio and range (duration). I observed that as the noise increased, the model gradually started dropping notes from the MusicXML. Additionally, non-standard MIDI files contain note errors, which seemingly makes learning harder for the model. Given this situation, do you have any suggestions for effective solutions?

Chunyuan-Li avatar Dec 18 '23 03:12 Chunyuan-Li