Malte Pietsch

22 comments by Malte Pietsch

Hey @datistiquo, Yep, cosine similarity on plain BERT embeddings usually doesn't work very well. That's why we use a [sentence-bert](https://huggingface.co/deepset/sentence_bert) model for English questions, which was trained on an NLI...
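As a rough illustration of the similarity step mentioned above, here is a minimal stdlib-only sketch of cosine similarity over embedding vectors. The toy vectors are stand-ins; in practice the embeddings would come from a model such as sentence-bert:

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for sentence embeddings
query = [0.2, 0.8, 0.1]
doc = [0.25, 0.7, 0.05]
score = cosine_similarity(query, doc)
```

Identical directions score 1.0 and orthogonal directions score 0.0, which is why the metric only works well when the embedding space was trained so that semantically similar sentences point in similar directions.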

> Do you also have the experiments for German public on mlflow? Did you try fine-tuning on the German COVID QAs already? No, we focused our experiments on English...

> I meant something like scraping all corona-related text (in German). Ah okay. Yes, this could be helpful. However, it would require quite a substantial number of texts and...

Added a basic eval dataset in #16

Added a basic eval script: https://github.com/deepset-ai/COVID-QA/pull/23. Results will be tracked via MLflow: https://public-mlflow.deepset.ai/#/experiments/55. Now running evaluation for a few plain BERT models as a baseline ...
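For the retrieval part of such an evaluation, a common metric is recall@k. This is a hypothetical sketch (function names and the shape of the predictions are my own assumptions, not the actual eval script):

```python
def recall_at_k(ranked_doc_ids, relevant_id, k=5):
    """1.0 if the gold document appears in the top-k ranked results, else 0.0."""
    return 1.0 if relevant_id in ranked_doc_ids[:k] else 0.0

def mean_recall_at_k(predictions, k=5):
    """Average recall@k over (ranked_ids, gold_id) pairs, one pair per query."""
    scores = [recall_at_k(ranked, gold, k) for ranked, gold in predictions]
    return sum(scores) / len(scores)
```

A per-query metric like this averages cleanly across models, which makes it convenient to log as a single number per run in an experiment tracker such as MLflow.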

Added evaluations for some pretrained transformer models in #45:
- Plain bert-base + sentence-bert (no fine-tuning done yet)
- Extracting embeddings from the last or second-to-last layer; simple pooling methods...
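The layer-selection and pooling step above can be sketched as follows. This is a minimal illustration, assuming the per-layer hidden states have already been extracted as arrays of shape (seq_len, hidden_dim); the function name is hypothetical:

```python
import numpy as np

def pool_embeddings(hidden_states, layer=-1, method="mean"):
    """Pool token vectors from one transformer layer into a fixed-size embedding.

    hidden_states: sequence of per-layer arrays, each (seq_len, hidden_dim).
    layer=-1 selects the last layer, layer=-2 the second-to-last.
    """
    tokens = np.asarray(hidden_states[layer])
    if method == "mean":
        return tokens.mean(axis=0)   # average over the token dimension
    if method == "max":
        return tokens.max(axis=0)    # element-wise max over tokens
    raise ValueError(f"unknown pooling method: {method}")
```

Mean pooling over the second-to-last layer is a common baseline, since the very last layer is often more specialized toward the pretraining objective than toward general sentence semantics.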

Removed the "help wanted" label because we want help with all of the open issues right now anyway :)

We could also think of extractive QA on the [CORD-19 dataset](https://pages.semanticscholar.org/coronavirus-research). We should then probably separate the user groups (general public vs. researchers) in the UI.
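At the core of extractive QA is picking the answer span from start/end token scores. A minimal sketch of that selection step, assuming start and end logits are already produced by a model (the function name and constraints are illustrative, not a specific implementation):

```python
def best_answer_span(start_logits, end_logits, max_answer_len=30):
    """Return the (start, end) token indices maximizing start + end logit scores,
    subject to end >= start and a maximum answer length."""
    best_score = float("-inf")
    best_span = (0, 0)
    for i, s in enumerate(start_logits):
        # only consider ends at or after the start, within the length limit
        for j in range(i, min(i + max_answer_len, len(end_logits))):
            score = s + end_logits[j]
            if score > best_score:
                best_score, best_span = score, (i, j)
    return best_span, best_score
```

The chosen token span is then mapped back to the original character offsets in the document to produce the highlighted answer.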

Hi @DRMALEK, For now:
- We share a couple of high-level tasks at the bottom of the README (https://github.com/deepset-ai/COVID-QA#heart-how-you-can-help)
- We will create a few more issues around those...

Hey @trisongz, Thanks for sharing this! Did I understand correctly that you took the text corpus from BioASQ 7b and pretrained a BERT from scratch with the MLM and NSP objectives?...