Carlos Mocholí
Argh good point. This needs to be solved at the CLI level, but I'm not sure of the best way to do it. Opened https://github.com/omni-us/jsonargparse/issues/479 to ask.
Isn't this a duplicate of your https://github.com/Lightning-AI/litgpt/issues/1084?
Note that this needs to be done by defining two different base classes and having each file use only one of them in its type signatures.
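Purely as an illustration of that split (all names below are hypothetical, not litgpt's):

```python
# Hypothetical sketch of the two-base-class idea; names are made up for illustration.
class CLIBase:
    """Base for classes the CLI is allowed to expose."""

class InternalBase:
    """Base for classes that are only constructed programmatically."""

class Finetune(CLIBase):
    ...

# Each file then annotates against only one of the two bases in its signatures.
def main(entrypoint: CLIBase) -> None:
    ...
```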
This is feasible now that we get the list of bins or safetensors before running the download: https://github.com/Lightning-AI/litgpt/blob/main/litgpt/scripts/download.py#L54

What would you do if:
- The bins/safetensors exist but the lit_model.pth...
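To make the question concrete, a hypothetical sketch of the kind of check involved (`missing_files` and the file names are illustrative, not what `download.py` actually does):

```python
# Hypothetical helper, not the actual download.py logic: figure out which of the
# expected weight files still need to be fetched, so a partial download can resume.
from pathlib import Path

def missing_files(checkpoint_dir: Path, expected: list[str]) -> list[str]:
    return [name for name in expected if not (checkpoint_dir / name).exists()]

# The open question above: the .bin/.safetensors files may all be present while
# lit_model.pth (the converted weights) is not, or vice versa.
to_fetch = missing_files(Path("checkpoints/org/model"), ["model.safetensors"])
```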
Sounds good to me!
Hi @FlimFlamm! Thanks for working on this. I had this partially implemented but never pushed it and I might have lost it because I cannot find it in my stashes...
This would require adding support for ALiBi. This is not a priority at the moment. A related feature request is https://github.com/Lightning-AI/lit-gpt/issues/199
We don't support flash attention from `flash-attn`. Supporting ALiBi would warrant an entirely new `model_alibi.py` definition.
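For reference, a minimal sketch (not taken from litgpt) of the ALiBi bias such a definition would need to add to the attention scores, assuming a power-of-two number of heads:

```python
import torch

def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    # Head-specific slopes: the geometric sequence 2^(-8/n), 2^(-16/n), ..., 2^(-8),
    # the standard choice when the number of heads is a power of two.
    slopes = torch.tensor([2.0 ** (-8.0 * (i + 1) / num_heads) for i in range(num_heads)])
    # Relative key-query distances; past positions get an increasingly negative bias.
    positions = torch.arange(seq_len)
    distances = positions[None, :] - positions[:, None]  # (seq_len, seq_len), j - i
    # Shape (num_heads, seq_len, seq_len); added to the raw attention scores before softmax.
    return slopes[:, None, None] * distances[None, :, :].float()
```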
We support flash attention via PyTorch's `scaled_dot_product_attention`, just not via Tri Dao's `flash-attn`. The former uses one of the latter's implementations internally.
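For illustration, a minimal sketch (not litgpt code, needs a CUDA GPU) that restricts PyTorch's SDPA dispatch to its flash backend, which is how you can check that the flash kernel is being used without installing `flash-attn`:

```python
import torch
import torch.nn.functional as F

q, k, v = (torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16) for _ in range(3))

# Restrict dispatch to the flash backend; if this GPU/dtype combination is unsupported,
# the call raises instead of silently falling back to the math backend.
with torch.backends.cuda.sdp_kernel(enable_flash=True, enable_math=False, enable_mem_efficient=False):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```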
Two more which probably need to be fixed in PyTorch:

```python
/home/carlos/nightly-env/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/api.py:1132: UserWarning: Please use DTensor instead and we are deprecating ShardedTensor.
  warnings.warn(DEPRECATE_MSG)
```

From (`print_stack` added by me):

```python
...
```
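As an aside, an alternative way to locate where such a warning is emitted without editing the PyTorch sources by hand (the trace above came from a manually added `print_stack`) is to escalate just that warning to an error so it arrives with a full traceback:

```python
import warnings

# Turn only the ShardedTensor deprecation warning into an exception; the resulting
# traceback points at the call site that still goes through the deprecated path.
warnings.filterwarnings("error", message="Please use DTensor instead")
```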