Matt
Hi @tmoroder, can you try on GPU with `jit_compile=True` in both 4.20 and 4.21? I believe the code had issues with XLA before 4.21, and TPU code is always compiled...
That makes sense - we made changes to the model to make it XLA-compatible in 4.21. XLA compatibility is necessary for TPU support, so the 4.20 model would never have...
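For anyone who wants to run the same kind of check locally, here is a minimal sketch of comparing an eager forward pass against an XLA-compiled one. The `forward` function and its weights are hypothetical stand-ins, not the actual model from this thread; the only real API used is `tf.function(..., jit_compile=True)`. The idea would be to swap in the real model call and run it under each library version.

```python
import tensorflow as tf

# Hypothetical stand-in weights; replace `forward` with a call to the real model.
tf.random.set_seed(0)
kernel = tf.Variable(tf.random.normal((3, 4)))
bias = tf.Variable(tf.zeros((4,)))


def forward(x):
    # A tiny dense layer standing in for the model's forward pass.
    return tf.nn.relu(tf.matmul(x, kernel) + bias)


# Wrapping with jit_compile=True forces XLA compilation, which is what
# TPU execution always does; ops that XLA cannot compile raise an error here.
forward_xla = tf.function(forward, jit_compile=True)

x = tf.random.normal((2, 3))
eager_out = forward(x)
xla_out = forward_xla(x)

# If compilation succeeds, the two paths should agree up to small numeric noise.
tf.debugging.assert_near(eager_out, xla_out, rtol=1e-4, atol=1e-4)
```

If the compiled call raises or the outputs diverge while the eager call works, that points at an XLA compatibility problem rather than a TPU-specific one.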
This is now ready for review @sgugger @gante! I'm tracking down a couple of remaining bugs in the tests and doing some final manual checks, but almost everything should be...
@sgugger Tests are now enabled in `config.yml` and everything still looks green!

@sgugger tests are now actually passing! I had to skip one - it fails because of a known issue with shape inference on small datasets in `to_tf_dataset`. There is a...
TF 2.3 is quite old by now, and I wouldn't make a special effort to support it. Several nice TF features (like the NumPy-like API) only arrived in TF 2.4,...
Hi @WissamAntoun, this is an interesting issue! I honestly have no idea what the cause could be, but the fact that it highlights that function is interesting. The reason is...
Also cc @sanchit-gandhi because I'm not a TPU expert - don't worry about investigating this deeply, but if anything comes to mind when you read it, let me know!
@WissamAntoun Confirmed reproduction of the issue here. Our TF DeBERTa implementation seems to have issues with XLA - I'm investigating now.