
requirements for bert-large?

Open: rush86999 opened this issue 5 years ago • 1 comment

What issues, if any, would occur if bert-large were used? For example, what would the GPU requirements and training time be? Would it be too costly? Is there any reason why bert-base was used instead of bert-large?

rush86999, Jun 21 '19 12:06

I'm also guessing that Yang Liu used bert-base instead of bert-large because bert-large would require more GPUs, more memory, and more training time. It's also possible that bert-large wouldn't improve performance much, but I don't think the original paper discusses that. There are no ablation studies on this in particular, so this is just my guess.

jihun-hong, Aug 23 '19 09:08
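
To put a rough number on the "more memory" guess above, here is a minimal back-of-the-envelope sketch that counts encoder parameters from the published BERT configurations (bert-base: 12 layers, hidden size 768; bert-large: 24 layers, hidden size 1024). The function names, the 30522-token uncased vocabulary, and the 16 bytes per parameter figure (fp32 weights, gradients, and two Adam moment buffers) are assumptions for illustration. It ignores activations, which depend on batch size and sequence length, and it does not include any BertSum summarization layers, so treat it as a lower bound rather than a measurement.

```python
# Back-of-the-envelope estimate only; not taken from the BertSum code base.
VOCAB_SIZE = 30522     # assumption: uncased English WordPiece vocabulary
MAX_POSITIONS = 512    # learned position embeddings
TYPE_VOCAB = 2         # segment (token type) embeddings


def bert_param_count(hidden, layers, vocab=VOCAB_SIZE):
    """Count weights and biases of a standard BERT encoder plus pooler."""
    ffn = 4 * hidden  # intermediate size is 4x hidden in both configs
    # Word, position, and segment embeddings, plus one LayerNorm (gain and bias).
    embeddings = (vocab + MAX_POSITIONS + TYPE_VOCAB) * hidden + 2 * hidden
    # Q, K, V, and output projections with biases, plus one LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + 2 * hidden
    # Two dense layers with biases, plus one LayerNorm.
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden) + 2 * hidden
    pooler = hidden * hidden + hidden
    return embeddings + layers * (attention + feed_forward) + pooler


def training_state_gib(params, bytes_per_param=16):
    """Assumed fp32 Adam state: weights + gradients + two moments (4 x 4 bytes)."""
    return params * bytes_per_param / 2**30


if __name__ == "__main__":
    for name, hidden, layers in [("bert-base", 768, 12), ("bert-large", 1024, 24)]:
        n = bert_param_count(hidden, layers)
        print(f"{name}: {n / 1e6:.0f}M parameters, "
              f"about {training_state_gib(n):.1f} GiB of weight and optimizer state")
```

Under these assumptions bert-large has about three times as many parameters as bert-base, and with twice the layers and a wider hidden size its activation memory per batch would also grow, which is consistent with the guess that cost was a factor.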