Kinyugo comments

Results: 25 comments by Kinyugo

Add support for Modified Discrete Cosine Transform (MDCT)

Hi @mthrok & @carolineechen, I apologize for taking too long to get back to this. I have made an implementation of the MDCT algorithm in PyTorch [here](https://github.com/Kinyugo/torch_mdct). Kindly let me...

Implement L-BFGS-B optimizer and update InverseMelScale

I have been using an implementation where I compute the pseudo-inverse of the filterbanks. It seems to work nicely and gives results similar to `lstsq`. An upside of the...
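For readers unfamiliar with the pseudo-inverse idea mentioned in this comment, here is a minimal NumPy sketch. It is not the commenter's implementation and not torchaudio's `InverseMelScale`: the filterbank `fb` is a random stand-in, and the shapes (`n_mels`, `n_freqs`, `n_frames`) are assumptions chosen only to make the example run. It shows that, for this toy setup, applying the pseudo-inverse agrees with a least-squares solve.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy dimensions (hypothetical, for illustration only).
n_mels, n_freqs, n_frames = 8, 20, 5

# Random non-negative matrix standing in for a mel filterbank of shape
# (n_mels, n_freqs). A real filterbank would be triangular filters.
fb = rng.random((n_mels, n_freqs))

# A toy linear-frequency spectrogram and its mel projection.
spec = rng.random((n_freqs, n_frames))
mel = fb @ spec

# Pseudo-inverse approach: compute pinv(fb) once, then apply it with a
# single matrix product.
fb_pinv = np.linalg.pinv(fb)          # shape (n_freqs, n_mels)
spec_pinv = fb_pinv @ mel             # shape (n_freqs, n_frames)

# Least-squares approach: solve fb @ x = mel directly.
spec_lstsq, *_ = np.linalg.lstsq(fb, mel, rcond=None)

# Both give the minimum-norm solution of the underdetermined system.
print(np.allclose(spec_pinv, spec_lstsq))
```

Because the system is underdetermined (fewer mel bins than frequency bins), neither estimate recovers `spec` exactly; both reproduce `mel` when projected back through `fb`.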

Implement L-BFGS-B optimizer and update InverseMelScale

Hi @nateanl, using `L-BFGS-B` will be a great solution. I am not sure if there is a specific implementation of it in PyTorch. It could be useful to run tests...

Slow Performance of `tokenize_midi_dataset` Function

Thanks @Natooz. You are right, miditok v3.0 is much faster; I have seen about a 10x improvement. I'll consider tokenizing on the fly. That would use PyTorch dataset multiprocessing, right?...

Slow Performance of `tokenize_midi_dataset` Function

Interesting. If I get some time, I'll test this and share the results with you. That will help reduce complexity a lot. But I assume for BPE you will want...

Slow Performance of `tokenize_midi_dataset` Function

Hello. Thanks for your apt replies. I have a few questions though. > Yes indeed. Note that BPE training is also done by tokenizing MIDIs on the fly. The big interest...

Slow Performance of `tokenize_midi_dataset` Function

> No I just meant that when training the tokenizer, the training data (MIDIs) is tokenized on the fly, there is no need to pre-tokenise it. :) Nice. I had...

Slow Performance of `tokenize_midi_dataset` Function

> About full samples: I am currently experimenting with a TSD tokenizer, trained with BPE on the MetaMIDI Dataset. Nice! Looking forward to your findings. > The resulting tokens wouldn't...

Slow Performance of `tokenize_midi_dataset` Function

No problem. If I get time too, maybe I can contribute that. I am currently held up as well.

Slow Performance of `tokenize_midi_dataset` Function

Hello @Natooz, thanks for looking into the issue. I am currently running tests for this. However, I do notice an `IndexError: list index out of range` error with some files...