Jan

Results 68 issues of Jan

Would be nice to check whether it is faster:)

I suggest writing step-by-step documentation on what is going on behind `download_dataset`. The source code is very hard to follow (I am blaming myself). We had multiple discussions about...

We should be able to assign an embedding to the abstract. However, currently the abstract is a separate field/attribute of the `Article` class. **Todos** - [x] Decide on the best...

🗄️ database

## 🐛 Bug description We noticed that `find_files` can get really slow when we try to do recursive matching for a regex pattern. It would be nice to investigate whether...
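When investigating a slowdown like this, it can help to have a small baseline to time against. The sketch below is not the project's `find_files`: `find_files_sketch` and its parameters are hypothetical, and it simply assumes the goal is to collect files whose names match a regex, optionally descending into subdirectories.

```python
import os
import re
from pathlib import Path


def find_files_sketch(root, pattern, recursive=True):
    """Return sorted paths under `root` whose file name matches `pattern`.

    Hypothetical baseline for benchmarking; not the project's `find_files`.
    """
    regex = re.compile(pattern)  # compile once, reuse for every entry
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if regex.fullmatch(name):
                matches.append(Path(dirpath) / name)
        if not recursive:
            dirnames.clear()  # prune in place so os.walk does not descend
    return sorted(matches)
```

Timing this against the real function on the same directory (for example with `timeit`) would show whether the cost comes from the directory walk or from the matching itself.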

The following command:
```
bbs_database run -n --luigi-config-args "AddTask.db_url:my.sqlite"
```
gives this error:
```
change = re.split(r"[.:]", param, maxsplit=3)
TypeError: set() takes from 3 to 4 positional arguments but 5...
```
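One plausible reading of this traceback, assuming the split result is later unpacked into a `set(section, option, value)`-style call (the error message is truncated, so this is an assumption): with `maxsplit=3` the `.` inside `my.sqlite` is also treated as a separator, which yields four pieces instead of three. The sketch below only demonstrates the behavior of `re.split`; `param` is copied from the command in the report.

```python
import re

param = "AddTask.db_url:my.sqlite"

# maxsplit=3 allows up to four pieces, so the "." in the value is split too.
too_many = re.split(r"[.:]", param, maxsplit=3)

# maxsplit=2 stops after the task and parameter name, keeping the value intact.
section, option, value = re.split(r"[.:]", param, maxsplit=2)

print(too_many)
print(section, option, value)
```

If that reading is right, lowering `maxsplit` to 2 would keep values containing `.` or `:` in one piece.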

🐛 bug fix

## 🚀 Feature It would be very useful to be able to send a signal (e.g. SIGINT / keyboard interrupt) to a running `bbs_database topic-extract` process and have a guarantee...
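The exact guarantee requested is cut off in this snippet. A common pattern for interruptible batch jobs is to let the signal handler only set a flag, so the item being processed always finishes before the loop exits. The sketch below assumes that is the kind of behavior wanted; `StopFlag`, `process_all`, and `install_sigint_handler` are hypothetical names, not part of `bbs_database`.

```python
import signal


class StopFlag:
    """Records that a stop was requested instead of raising immediately."""

    def __init__(self):
        self.requested = False

    def __call__(self, signum, frame):
        # (signum, frame) is the signature signal.signal expects of handlers.
        self.requested = True


def process_all(items, process_one, stop):
    """Process items in order; stop cleanly before the next item if asked."""
    done = []
    for item in items:
        if stop.requested:
            break
        # The handler only sets a flag, so this call runs to completion.
        process_one(item)
        done.append(item)
    return done


def install_sigint_handler(stop):
    """Route SIGINT (Ctrl+C) to the flag; must be called from the main thread."""
    return signal.signal(signal.SIGINT, stop)
```

With the handler installed, pressing Ctrl+C during a long run would let the current item finish and return the list of items that were fully processed.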

🗄️ database

## 🐛 Bug description Currently, the `add` command does not put the abstract in the sentences table. One problematic implication of this behavior is that adding pubmed articles will always...

🐛 bug fix

## 🚀 Feature The `topic-extract` command seems to take all the files in the directory by default. However, it can totally happen that the user has some unwanted files inside of...

## 🚀 Feature Currently, `bbs_database add` errors out when an article is already in the database. When trying to add all articles in a folder, it...
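One way to make a bulk add tolerant of duplicates is to treat a uniqueness violation as "skip and continue" rather than a fatal error. The sketch below assumes a hypothetical `articles` table with `article_id` as primary key; the real schema and the `add` command's internals are not shown in this snippet, so every name here is illustrative.

```python
import sqlite3


def add_articles(conn, articles):
    """Insert (article_id, title) pairs, skipping ids that already exist.

    Returns (added, skipped) lists of article ids.
    Assumes a hypothetical table: articles(article_id PRIMARY KEY, title).
    """
    added, skipped = [], []
    for article_id, title in articles:
        try:
            with conn:  # commit on success, roll back this insert on error
                conn.execute(
                    "INSERT INTO articles (article_id, title) VALUES (?, ?)",
                    (article_id, title),
                )
            added.append(article_id)
        except sqlite3.IntegrityError:
            # The primary key already exists: record it and keep going.
            skipped.append(article_id)
    return added, skipped
```

Returning the skipped ids lets the caller report which articles were already present instead of silently ignoring them.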

## 🚀 Feature The `topic-extract` step extracts a lot of information about the topic of an article. However, in the current pipeline it is only used to decide whether a...

🗄️ database