BertSum requirements for bert-large?

Open rush86999 opened this issue 5 years ago • 1 comments

What issues, if any, would occur if bert-large were used? For example, what about GPU requirements and training time: would it be too costly? Is there any reason why bert-base was used instead of bert-large?

Jun 21 '19 12:06 rush86999

I'm also guessing that Yang Liu used bert-base instead of bert-large because bert-large would require more GPUs, memory, and training time. Using bert-large might not improve performance much, but I don't think the original paper discusses that. It has no ablation study on this in particular, so this is just my guess.
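To put a rough number on the memory gap, here is a minimal sketch that counts encoder parameters from the published BERT configurations (bert-base: 12 layers, hidden size 768; bert-large: 24 layers, hidden size 1024). The function names are hypothetical, the vocabulary size of 30522 assumes the uncased checkpoints, and the 16 bytes per parameter figure assumes fp32 training with Adam (weights, gradients, and two optimizer moments). It ignores activations and the BertSum summarization layers, so treat the result as a lower bound rather than a measurement.

```python
def bert_param_count(hidden, layers, vocab=30522, max_pos=512, type_vocab=2):
    """Approximate parameter count of a BERT encoder (hypothetical helper).

    Assumes the standard layout: feed-forward size is 4 * hidden,
    and a pooler layer sits on top of the encoder.
    """
    layer_norm = 2 * hidden  # scale and bias
    embeddings = (vocab + max_pos + type_vocab) * hidden + layer_norm
    # Q, K, V, and output projections, each with a bias
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    ffn = (hidden * 4 * hidden + 4 * hidden) + (4 * hidden * hidden + hidden) + layer_norm
    pooler = hidden * hidden + hidden
    return embeddings + layers * (attention + ffn) + pooler


def training_state_gib(params, bytes_per_param=16):
    """Weights + gradients + two Adam moments in fp32 (assumption), in GiB."""
    return params * bytes_per_param / 2**30


base = bert_param_count(hidden=768, layers=12)
large = bert_param_count(hidden=1024, layers=24)

for name, n in [("bert-base", base), ("bert-large", large)]:
    print(f"{name}: {n / 1e6:.1f}M params, ~{training_state_gib(n):.2f} GiB of training state")
print(f"ratio: {large / base:.2f}x")
```

Under these assumptions bert-large carries roughly three times the parameters of bert-base before any activations are counted, which is consistent with the guess that cost was a factor. Activation memory also grows with depth and hidden size, so the practical gap at a given batch size is likely larger.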

Aug 23 '19 09:08 jihun-hong
