Malte Pietsch

22 comments by Malte Pietsch

Hey @datistiquo, Yep, cosine similarity on plain BERT embeddings usually doesn't work very well. That's why we use a [sentence-bert](https://huggingface.co/deepset/sentence_bert) model for English questions, which was trained on an NLI...
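As a rough illustration of the similarity step mentioned above, here is a minimal stdlib-only sketch of cosine similarity over embedding vectors. The toy vectors are stand-ins; in practice the embeddings would come from a model such as sentence-bert:

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for sentence embeddings
query = [0.2, 0.8, 0.1]
doc = [0.25, 0.7, 0.05]
score = cosine_similarity(query, doc)
```

Identical directions score 1.0 and orthogonal directions score 0.0, which is why the metric only works well when the embedding space was trained so that semantically similar sentences point in similar directions.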

> Do you also have the experiments for German public on mlflow? Did you try fine-tuning on the German COVID QAs already? No, we focused our experiments on English...

> I meant something like scraping all corona-related text (in German). Ah okay. Yes, this could be helpful. However, it would require quite a substantial number of texts and...

Added a basic eval dataset in #16

Added a basic eval script: https://github.com/deepset-ai/COVID-QA/pull/23. Results will be tracked via MLflow: https://public-mlflow.deepset.ai/#/experiments/55. Now running evaluation for a few plain BERT models as a baseline ...
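For the retrieval part of such an evaluation, a common metric is recall@k. This is a hypothetical sketch (function names and the shape of the predictions are my own assumptions, not the actual eval script):

```python
def recall_at_k(ranked_doc_ids, relevant_id, k=5):
    """1.0 if the gold document appears in the top-k ranked results, else 0.0."""
    return 1.0 if relevant_id in ranked_doc_ids[:k] else 0.0

def mean_recall_at_k(predictions, k=5):
    """Average recall@k over (ranked_ids, gold_id) pairs, one pair per query."""
    scores = [recall_at_k(ranked, gold, k) for ranked, gold in predictions]
    return sum(scores) / len(scores)
```

A per-query metric like this averages cleanly across models, which makes it convenient to log as a single number per run in an experiment tracker such as MLflow.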

Added evaluations for some pretrained transformer models in #45:
- Plain bert-base + sentence-bert (no fine-tuning done yet)
- Extracting embeddings from the last or second-to-last layer; simple pooling methods...
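The layer-selection and pooling step above can be sketched as follows. This is a minimal illustration, assuming the per-layer hidden states have already been extracted as arrays of shape (seq_len, hidden_dim); the function name is hypothetical:

```python
import numpy as np

def pool_embeddings(hidden_states, layer=-1, method="mean"):
    """Pool token vectors from one transformer layer into a fixed-size embedding.

    hidden_states: sequence of per-layer arrays, each (seq_len, hidden_dim).
    layer=-1 selects the last layer, layer=-2 the second-to-last.
    """
    tokens = np.asarray(hidden_states[layer])
    if method == "mean":
        return tokens.mean(axis=0)   # average over the token dimension
    if method == "max":
        return tokens.max(axis=0)    # element-wise max over tokens
    raise ValueError(f"unknown pooling method: {method}")
```

Mean pooling over the second-to-last layer is a common baseline, since the very last layer is often more specialized toward the pretraining objective than toward general sentence semantics.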

Removed the "help wanted" label because we want help with all of the open issues right now anyway :)

We could also think of extractive QA on the [CORD-19 dataset](https://pages.semanticscholar.org/coronavirus-research). We should then probably separate the user groups (general public vs. researchers) in the UI.
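At the core of extractive QA is picking the answer span from start/end token scores. A minimal sketch of that selection step, assuming start and end logits are already produced by a model (the function name and constraints are illustrative, not a specific implementation):

```python
def best_answer_span(start_logits, end_logits, max_answer_len=30):
    """Return the (start, end) token indices maximizing start + end logit scores,
    subject to end >= start and a maximum answer length."""
    best_score = float("-inf")
    best_span = (0, 0)
    for i, s in enumerate(start_logits):
        # only consider ends at or after the start, within the length limit
        for j in range(i, min(i + max_answer_len, len(end_logits))):
            score = s + end_logits[j]
            if score > best_score:
                best_score, best_span = score, (i, j)
    return best_span, best_score
```

The chosen token span is then mapped back to the original character offsets in the document to produce the highlighted answer.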

Hi @DRMALEK, For now:
- We share a couple of high-level tasks at the bottom of the README (https://github.com/deepset-ai/COVID-QA#heart-how-you-can-help)
- We will create a few more issues around those...

Hey @trisongz, Thanks for sharing this! Did I understand correctly that you took the text corpus from BioASQ 7b and pretrained a BERT from scratch with the MLM and NSP objectives?...