GPT2
GPT vs. BERT: given the same compute and data resources, which one is better for downstream tasks like GLUE?
Thank you very much.