Nathan Fradet
@Yikai-Liao @lzqlzzq 👋 Sorry to ping you with another bug 😅 I'm not sure what the exact cause of the problem is here, only that symusic fails to bind a few...
@lzqlzzq Thank you for your help! @ojus1 Great! I actually just added a `filter_dataset` method in #160 that does exactly what its name suggests
Thank you for the fix!
Hello, Apologies for these errors; I do not always update the notebook when changes are made to the library. The error is occurring because the data collator is not...
Wonderful! And I would be happy to merge it! I just ran everything, and indeed there are a few other changes to make. Here is a notebook version with everything...
Hi, This is a design choice (i.e. to only alter the ids) as the main purpose of encoding the sequence is to feed the ids to a model. If you...
Hi, thank you for your kind comment! :) In most cases, models take input ids (i.e. integers which can be seen as the idx of the tokens in the vocabulary),...
Thank you for your message! I am not sure I understand exactly what you are trying to accomplish. Is Orchestra a specific dataset? Or do you mean you are using...
This is done automatically once the tokenizer has been trained. :) You can see the documentation of the `encode` (to tokenize files) and `encode_token_ids` (to apply BPE/Unigram to token ids...
Hello, thank you for reporting this error. Indeed this argument is expected to be an integer. I fixed it in #240 and added support for float values as well (i.e....