Roman Joeres
@conda-forge/help-python What do I need to do for you to merge this PR? I passed all checks and completed all tasks. About the build number, I'm not sure where to...
@shermansiu I checked the boxes and updated the recipe based on the linter's suggestions. Thank you for your help.
I assume all three prior occurrences of `- python {{ python_min }}` should be changed to `- python >={{ python_min }}` too. So, I changed the other two accordingly.
Ok, with the latest linter, I'm completely confused about where to put `python {{ python_min }}` and where (or why not) to put `python >={{ python_min }}`. Can you un-confuse...
Dear @isty2e, Thank you for your feedback and suggestions. We will definitely consider these for future versions and improvements of DataSAIL. Customized clustering is indeed something we haven't thought about...
Hi @atabaigi, you cannot use FoldSeek for protein sequences. FoldSeek is based on 3D structures. For sequences, you can use MMseqs2 or CD-HIT. But as input for MMseqs or CD-HIT,...
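To illustrate the rule above (FoldSeek needs 3D structures, while MMseqs2 and CD-HIT work on sequences), here is a minimal sketch of such a check. It is not DataSAIL's actual code: the function name `check_clustering_tool`, the tool-name strings, and the `"sequence"`/`"structure"` labels are all hypothetical and chosen only for this example.

```python
# Hypothetical mapping of clustering tools to the kind of input they accept.
# The names are illustrative and are not taken from DataSAIL's code base.
ACCEPTED_INPUT = {
    "foldseek": "structure",  # FoldSeek is based on 3D structures
    "mmseqs2": "sequence",    # MMseqs2 clusters sequences
    "cd-hit": "sequence",     # CD-HIT clusters sequences
}


def check_clustering_tool(tool: str, input_kind: str) -> None:
    """Raise ValueError if `tool` cannot cluster data of type `input_kind`."""
    key = tool.lower()
    if key not in ACCEPTED_INPUT:
        raise ValueError(f"Unknown clustering tool: {tool!r}")
    expected = ACCEPTED_INPUT[key]
    if expected != input_kind:
        raise ValueError(
            f"{tool} expects {expected} input, but got {input_kind} input."
        )


# Sequences with MMseqs2 pass silently; sequences with FoldSeek raise.
check_clustering_tool("MMseqs2", "sequence")
try:
    check_clustering_tool("FoldSeek", "sequence")
except ValueError as err:
    print(err)
```

Failing early like this gives a readable message instead of an error deep inside the external tool.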
Note to myself: Recheck input validation and raise an error in these cases.
Thank you, that works. There is only a small correction: it's `is_pretokenized=True` in the `tokenizer.encode(...)` call.
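For anyone reading along, this is roughly what the corrected call looks like. It is a minimal sketch that assumes the Hugging Face `tokenizers` package is installed; the toy `WordLevel` vocabulary is made up for the example and is not the setup from this thread.

```python
from tokenizers import Tokenizer
from tokenizers.models import WordLevel

# Toy vocabulary, invented for this example only.
vocab = {"[UNK]": 0, "hello": 1, "world": 2}
tokenizer = Tokenizer(WordLevel(vocab, unk_token="[UNK]"))

# With is_pretokenized=True the input is a list of already-split words,
# so no further splitting is applied before the model lookup.
encoding = tokenizer.encode(["hello", "world"], is_pretokenized=True)
print(encoding.tokens)
print(encoding.ids)
```

Note that the keyword belongs to `encode(...)`, and the input then has to be a list of strings rather than one string.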
This solved the problem of using a custom PreTokenizer in the tokenizers interface. But I still cannot train BPE (or other models) on it. Isn't there a way to get...
Hey, after my initial post, it's not a single point that breaks. It's more conceptual, and I don't see whether or how it is possible in the current implementation. In...