covid-papers-browser
Train transformer models on MedNLI
Great work!
Here's some data that might be more domain-related. It's not open, but it might be helpful? https://jgc128.github.io/mednli/
Looks neat! Didn't know about datasets for medical NLI, that's perfect for our use case indeed!
If someone is interested in finetuning the three pretrained models (SciBERT, BioBERT and CovidBERT) on this MedNLI dataset using the finetune_nli.py
script and uploading them to the HuggingFace cloud, I'll add them to the list!
I'm changing the Issue title to make this visible for other contributors!
I'm interested in finetuning BioBERT on the MedNLI dataset. I need the following information:
a) Why did you choose a batch size of 64 instead of 16 to train all the NLI models (biobert-nli, scibert-nli and covidbert-nli)?
b) How many epochs did you train these models for? (The default number of epochs is 1 in the sentence-transformers library.)
Thanks in advance, @gsarti
Hi @kalyanks0611,
The choice of a larger batch size was only due to the intuition that it would limit noise during training; I have no empirical proof that it leads to better downstream performance in practice.
The NLI models were trained for different numbers of steps (20,000, 23,000 and 30,000 respectively): this was also dictated by GPU time allowances rather than set empirically. 30,000 steps at batch size 64 correspond to 1,920,000 examples, which is a bit less than two full epochs on MultiNLI + SNLI, which together account for roughly 1M sentence pairs.
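For reference, here is a minimal sketch (not the actual finetune_nli.py script) of where batch size and epochs are set when fine-tuning one of these checkpoints on NLI-style data with sentence-transformers; the BioBERT checkpoint name and the load_mednli_train() helper are placeholders:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, InputExample, losses

# Build a SentenceTransformer from a pretrained BERT-style checkpoint
# (placeholder name; swap in the BioBERT/SciBERT/CovidBERT checkpoint you use)
word_embedding_model = models.Transformer("dmis-lab/biobert-base-cased-v1.1", max_seq_length=128)
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# Premise/hypothesis pairs with the three NLI labels.
# load_mednli_train() is a hypothetical loader for the MedNLI jsonl files.
label2id = {"entailment": 0, "neutral": 1, "contradiction": 2}
train_examples = [
    InputExample(texts=[premise, hypothesis], label=label2id[label])
    for premise, hypothesis, label in load_mednli_train()
]

# Batch size is set on the DataLoader (64 here, as in the NLI models above)
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=64)
train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=3,
)

# Epochs default to 1 in sentence-transformers; the number of steps per epoch
# is then len(train_examples) / batch_size.
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    warmup_steps=1000,
)
model.save("biobert-mednli")
```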
Hope this helps!