Christopher Schröder
Sorry for the long wait. I have been busy, and so have my GPU resources. The implementation is now almost done, except for some final tests and cleanup. I...
Yes, such errors may happen, as models can have arbitrary arguments. What you suggest here sounds like a good solution when the calling side passes more parameters than the models...
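One way to handle a caller that passes more arguments than a model accepts is to filter the keyword arguments against the callable's signature before the call. The following is only a minimal sketch of that general idea, not the change discussed in this thread: `filter_kwargs` and `toy_forward` are hypothetical names made up for illustration, and only the standard-library `inspect` module is used.

```python
import inspect


def filter_kwargs(func, kwargs):
    """Return only the entries of `kwargs` that `func` can accept by keyword.

    Hypothetical helper for illustration; not part of any library API.
    """
    params = inspect.signature(func).parameters
    # A callable that declares **kwargs accepts any keyword argument.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    # Otherwise keep only names that may legally be passed by keyword.
    by_keyword = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    allowed = {name for name, p in params.items() if p.kind in by_keyword}
    return {name: value for name, value in kwargs.items() if name in allowed}


def toy_forward(input_ids, attention_mask=None):
    # Hypothetical stand-in for a model's forward method.
    return input_ids, attention_mask


# The calling side passes one argument more than the "model" declares.
call_args = {
    "input_ids": [1, 2, 3],
    "attention_mask": [1, 1, 1],
    "token_type_ids": [0, 0, 0],
}
accepted = filter_kwargs(toy_forward, call_args)
result = toy_forward(**accepted)  # no TypeError: the extra argument was dropped
```

Silently dropping arguments can hide real mistakes on the calling side, so logging the discarded names may be worth considering.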
Thanks for reporting this! I will look into it.
@RaymondUoE With just the additional `tokenizer.add_special_tokens()` call, I cannot reproduce the error. Can you provide details on the assertion output?
Thanks for the ping, Tom! Depends on the problem. Do you know how many classes your dataset will have @vahuja4? How many samples does your dataset contain? Without having that...
> @kgourgou - thank you! I will give it a shot. @chschroeder - thank you for your reply! The number of classes is 74 and the size of the corpus...
> small-text looks amazing and fits a use-case I have! > > @vahuja4 I expect that @chschroeder can give more precise answers than me, but in general it depends on...
Thank you, @MosheWasserb! I am honored you already noticed. SetFit has been working great for me, thanks for that as well. Such a metric seems reasonable but then how do...
> @chschroeder Are there a few simple rules of thumb to choose the best strategy for a given data set, or should I try all of them? For example,...
Thank you for the tip! In the meantime, I have discovered a pre-parsed dataset on the Hugging Face Hub: [wikimedia/wikipedia](https://huggingface.co/datasets/wikimedia/wikipedia). They also seem to use this parser, so I will try using...