bigbird
What's the difference between bigbr_base and bigbr_base_tf2 at gs://bigbird-transformer/pretrain?
I found two bigbr_base pretrained weights in the Google Cloud Storage bucket: bigbr_base and bigbr_base_tf2. What is the difference between them? I checked with this script that their word embeddings are different, which means they differ in more than just the checkpoint format (TF1 vs. TF2).
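For reference, a check like the one described above could be sketched as follows. This is a minimal sketch, not the original script: the helper names (`list_checkpoint_variables`, `load_checkpoint_variable`, `compare_arrays`) are made up for illustration, and the checkpoint prefixes and variable names in the usage comment are placeholders, since the variable names in the two checkpoints may not match. TensorFlow is imported lazily, so the comparison helper itself only needs NumPy.

```python
import numpy as np


def list_checkpoint_variables(ckpt_path):
    """Return the (name, shape) pairs stored in a TensorFlow checkpoint."""
    import tensorflow as tf  # lazy import: only needed when reading real checkpoints
    return tf.train.list_variables(ckpt_path)


def load_checkpoint_variable(ckpt_path, name):
    """Load a single variable from a checkpoint as a NumPy array."""
    import tensorflow as tf
    return tf.train.load_variable(ckpt_path, name)


def compare_arrays(a, b, atol=1e-6):
    """Compare two arrays by shape and by largest element-wise difference."""
    a = np.asarray(a)
    b = np.asarray(b)
    if a.shape != b.shape:
        return {"same_shape": False, "max_abs_diff": None, "equal": False}
    max_abs_diff = float(np.max(np.abs(a - b))) if a.size else 0.0
    return {
        "same_shape": True,
        "max_abs_diff": max_abs_diff,
        "equal": max_abs_diff <= atol,
    }


# Hypothetical usage; fill in the real checkpoint prefixes and variable names
# (list_checkpoint_variables shows which names each checkpoint contains):
#
#   tf1_ckpt = "<checkpoint prefix under gs://bigbird-transformer/pretrain/bigbr_base>"
#   tf2_ckpt = "<checkpoint prefix under gs://bigbird-transformer/pretrain/bigbr_base_tf2>"
#   emb1 = load_checkpoint_variable(tf1_ckpt, "<word embedding variable name>")
#   emb2 = load_checkpoint_variable(tf2_ckpt, "<word embedding variable name>")
#   print(compare_arrays(emb1, emb2))
```

If the shapes match but `max_abs_diff` is well above floating-point noise, the two checkpoints hold different embedding values rather than the same weights in a different format.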