OhadRubin

Results 29 issues of OhadRubin

See [here](https://github.com/PyTorchLightning/pytorch-lightning/issues/11043)

Hey guys, just wanted to let you know about [this](https://github.com/illuhad/hipSYCL) project, which might allow you to cross-compile Alpa to AMD GPUs.

It would be really great if you guys had LLM.int8() support from https://github.com/TimDettmers/bitsandbytes/tree/main. As of now, the Hugging Face support doesn't include pipeline parallelism, so I can either do...

Are there any recommended settings for Transformer language modeling?

Can you guys add the number of parameters that each model in the results table uses?

@sanchit-gandhi

Hey, I've seen libraries like T5X that are able to read model configs from cloud buckets. As a product from Google, I kind of expected this feature to be implemented...

This may be kinda hacky, but what if I wanted to use some function and define it only via a gin file? This is the solution I came up with:...

Hey, there is an implementation of [this](https://arxiv.org/abs/2205.14135) memory-efficient version of attention [here](https://github.com/lucidrains/flash-attention-jax), and I was wondering whether it is possible to integrate it into Flax.

Priority: P2 - no schedule

Add Natural Questions in the closed-book setup. Btw, the current nq_open task you have is one derived from the original NQ, and contains only ~3k of the total ~8k...