
Aggregate a plain non-synthetic dataset for Bio sequences

Open ashvardanian opened this issue 1 year ago • 0 comments

For fair benchmarks of Needleman-Wunsch scoring algorithms, we should find a real-world protein bank and ideally export it into a whitespace- or newline-delimited .txt file, which is easy to parse not only in Python but also in C++. Community contributions are more than welcome 🤗
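A minimal sketch of what such an export could look like, assuming the protein bank is distributed as a FASTA file (the issue does not name a source format, so that is an assumption). The function names `iter_fasta_sequences` and `export_newline_delimited` are hypothetical and not part of StringZilla. Each record's header line is dropped and its wrapped sequence lines are joined, so the output holds exactly one sequence per line.

```python
from typing import Iterable, Iterator


def iter_fasta_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Yield one sequence string per FASTA record, skipping '>' header lines."""
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between records
        if line.startswith(">"):
            # A new header closes the previous record, if it had any residues.
            if chunks:
                yield "".join(chunks)
                chunks = []
        else:
            chunks.append(line)
    if chunks:
        yield "".join(chunks)


def export_newline_delimited(fasta_path: str, txt_path: str) -> int:
    """Write one sequence per line to txt_path; return the number written."""
    count = 0
    with open(fasta_path, "r", encoding="ascii") as src, \
            open(txt_path, "w", encoding="ascii", newline="\n") as dst:
        for sequence in iter_fasta_sequences(src):
            dst.write(sequence + "\n")
            count += 1
    return count
```

A file in this shape can be read in C++ with a plain `std::getline` loop, and in Python with `open(path).read().splitlines()`, without any FASTA-specific parser on either side.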

ashvardanian • Feb 13 '24 21:02