Jan

Results 68 issues of Jan

## 🚀 Feature Recently, we implemented `bbs_database download` for multiple different sources. It might be a good idea to extend our integration test to actually use this download (rather than...

🧪 testing

## 🚀 Feature One can probably get big speedups if downloads are launched in parallel. However, it might require source-specific logic. We should evaluate what the speedup could be...

optimization
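As a rough way to estimate the possible speedup, here is a minimal sketch that runs several downloads through a thread pool and times them against a sequential loop. `fake_download` and the source names are hypothetical stand-ins (they sleep instead of touching the network); the real per-source logic of `bbs_database download` is not shown here and may need different handling.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(source: str) -> str:
    """Hypothetical stand-in for one source-specific download."""
    time.sleep(0.05)  # simulate network latency
    return f"{source}: done"


def download_sequential(sources):
    """Baseline: download one source after another."""
    return [fake_download(source) for source in sources]


def download_parallel(sources, max_workers=4):
    """Launch downloads in a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_download, sources))


def timed(fn, *args, **kwargs):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Hypothetical source names, for illustration only.
sources = ["source_a", "source_b", "source_c", "source_d"]
sequential_result, sequential_time = timed(download_sequential, sources)
parallel_result, parallel_time = timed(download_parallel, sources)
```

Threads suit I/O-bound work such as downloads; if a source enforces rate limits, `max_workers` would probably have to be set per source.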

## 🚀 Feature It seems that currently there are no SQL indexes created when `bbs_database init ...` is run. See the CORD-19 code to find out what exact indexes should...

optimization
🗄️ database
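For illustration, a minimal sketch of creating an index at initialization time, using the standard-library `sqlite3` module and an in-memory database. The `sentences` table, its columns, and the index name are hypothetical; which indexes are actually needed should come from the CORD-19 code mentioned above, and the database backend used by `bbs_database init` is an assumption here.

```python
import sqlite3


def create_index(conn: sqlite3.Connection, table: str, column: str) -> str:
    """Create a single-column index if it does not exist and return its name."""
    name = f"{table}_{column}_idx"
    conn.execute(f'CREATE INDEX IF NOT EXISTS "{name}" ON "{table}" ("{column}")')
    return name


conn = sqlite3.connect(":memory:")
# Hypothetical schema, for illustration only.
conn.execute(
    "CREATE TABLE sentences ("
    "sentence_id INTEGER PRIMARY KEY, article_id INTEGER, text TEXT)"
)
index_name = create_index(conn, "sentences", "article_id")

# PRAGMA index_list returns one row per index; column 1 holds the index name.
existing_indexes = [row[1] for row in conn.execute("PRAGMA index_list('sentences')")]
```

Because of `IF NOT EXISTS`, calling `create_index` again on an already initialized database is harmless.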

Are there any alternatives to GROBID and would there be any major advantages in using them?

### Alternatives (feel free to add new entries)

- https://github.com/pdfminer/pdfminer.six
- https://github.com/mstamy2/PyPDF2
- https://github.com/pymupdf/PyMuPDF

...

question

## 🚀 Feature Currently, all `ArticleParser` children are implemented inside of `bluesearch.database.article`. However, that means that to use one parser we need to import the dependencies of all existing parsers....

optimization
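One possible way to avoid importing every parser's dependencies is to resolve parser classes lazily from dotted paths. The sketch below is only an illustration of that idea: the registry, its keys, and `load_class` are hypothetical, and standard-library classes stand in for real `ArticleParser` children.

```python
import importlib

# Hypothetical registry: parser name -> dotted path of the class.
# Standard-library classes are used as stand-ins for real parsers.
PARSER_REGISTRY = {
    "json": "json.JSONDecoder",
    "xml": "xml.etree.ElementTree.XMLParser",
}


def load_class(dotted_path: str):
    """Import the module part of the path on demand and return the class."""
    module_name, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Only the module behind the requested parser is imported at this point.
decoder_cls = load_class(PARSER_REGISTRY["json"])
parsed = decoder_cls().decode('{"title": "example"}')
```

Splitting the parsers into separate modules would achieve the same effect with plain imports; the registry just keeps a single lookup point.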

## 🚀 Feature It would be very convenient to be able to "fetch" any article from the database based on its `article_id`. In the background the fetching would 1. Query...

new feature
🗄️ database

## 🚀 Feature I think we can incrementally start annotating existing code. It shouldn't really take that long and it could be a really nice way to practice and learn...

https://github.com/BlueBrain/Search/blob/78e82f4c0f8f04790f87424a80df622f500a9f5c/data_and_models/metrics/sentence_embedding/.gitignore#L27 It is still listing assets created by models that we do not have anymore. I would not be surprised if there are more gitignores around with a similar problem.