Roman Joeres

Results: 20 comments by Roman Joeres

@conda-forge/help-python What do I need to do for you to merge this PR? I passed all checks and completed all tasks. About the build number, I'm not sure where to...

@shermansiu I checked the boxes and updated the recipe based on the linter's suggestions. Thank you for your help.

I assume all three prior occurrences of `- python {{ python_min }}` should be changed to `- python >={{ python_min }}` too. So, I changed the other two accordingly.

OK, with the latest linter, I'm completely confused about where to put `python {{ python_min }}` and where (or why not) to put `python >={{ python_min }}`. Can you un-confuse...
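For readers following the `python_min` question, here is a minimal sketch of how the two spellings differ once the template variable is filled in. It assumes the usual conda-forge convention for noarch Python recipes as I understand it (exact `python {{ python_min }}` in `host` and in the test requirements, lower bound `python >={{ python_min }}` in `run`); the dictionary layout, the `render` helper, and the value `3.9` are hypothetical and only mimic simple Jinja substitution, not conda-build itself.

```python
import re

# Hypothetical excerpt of a noarch recipe's requirement lines, keyed by section.
# The section-to-spec mapping is an assumption based on the conda-forge
# noarch-Python convention, not taken from the PR under discussion.
RECIPE_REQUIREMENTS = {
    "host": "python {{ python_min }}",
    "run": "python >={{ python_min }}",
    "test.requires": "python {{ python_min }}",
}


def render(template: str, variables: dict) -> str:
    """Replace {{ name }} placeholders with values, like a tiny Jinja stand-in."""
    def lookup(match: re.Match) -> str:
        return variables[match.group(1)]

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lookup, template)


def render_requirements(python_min: str) -> dict:
    """Render every requirement line for a given minimum Python version."""
    return {
        section: render(spec, {"python_min": python_min})
        for section, spec in RECIPE_REQUIREMENTS.items()
    }


if __name__ == "__main__":
    for section, spec in render_requirements("3.9").items():
        print(f"{section}: {spec}")
```

Under that assumption, build and test use the minimum supported version itself, while the installed package only requires that version or newer.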

Dear @isty2e, Thank you for your feedback and suggestions. We will definitely consider these for future versions and improvements of DataSAIL. Customized clustering is indeed something we haven't thought about...

Hi @atabaigi, you cannot use FoldSeek for protein sequences. FoldSeek is based on 3D structures. For sequences, you can use MMseqs2 or CD-HIT. But as input for MMseqs or CD-HIT,...

Note to myself: Recheck input validation and raise an error in these cases.

Thank you, that works. There is only a small correction: it's `is_pretokenized=True` in the `tokenizer.encode(...)` call.

This solved the problem of using a custom PreTokenizer in the tokenizers interface. But I still cannot train BPE (or other models) on it. Isn't there a way to get...

Hey, after my initial post, it's not a single point that breaks. It's more conceptual, and I don't see if or how it is possible in the current implementation. In...