Vitaliy Chiley

Results 64 comments of Vitaliy Chiley

Added lion8b to [this](https://github.com/mosaicml/llm-foundry/pull/271#issuecomment-1595337924) ![Screenshot 2023-06-16 at 4 46 54 PM](https://github.com/mosaicml/llm-foundry/assets/6439018/6b87e35c-e1f8-4119-886a-cbbbe0c1d8cc) ![Screenshot 2023-06-16 at 4 47 11 PM](https://github.com/mosaicml/llm-foundry/assets/6439018/e99ac3a0-8654-4c8d-b7ec-5e6c237c1f26) Lion8B does not hurt convergence at all. The current implementation is slightly slower.
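For readers unfamiliar with Lion: the update direction is just the sign of an interpolated momentum, which is why its optimizer state is cheap to keep in 8 bits. A minimal scalar sketch of the update rule, assuming the standard Lion formulation (the 8-bit state quantization that Lion8B adds between steps is omitted; the function name and defaults here are illustrative, not llm-foundry's API):

```python
def sign(x):
    # -1, 0, or +1
    return (x > 0) - (x < 0)

def lion_step(p, g, m, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.0):
    # Lion update: take the sign of an interpolation between the
    # momentum and the current gradient, apply decoupled weight decay,
    # then decay the momentum toward the gradient.
    u = sign(beta1 * m + (1 - beta1) * g)
    p = p * (1 - lr * wd) - lr * u
    m = beta2 * m + (1 - beta2) * g
    return p, m
```

Because the parameter update only uses the sign of the momentum, quantizing the stored momentum to int8 perturbs the update far less than it would for an Adam-style optimizer, which is consistent with the "does not hurt convergence" result above.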

Can you show more of the error printout? I'm trying to figure out which file throws this error. Note: for Triton, you should install this version of it: `triton-pre-mlir@git+https://github.com/vchiley/triton.git@triton_pre_mlir_sm90#subdirectory=python`...

https://github.com/mosaicml/llm-foundry/issues/59 and https://github.com/mosaicml/llm-foundry/issues/88 discuss eval results. Let us know if that answers your questions.

The ``` for name, non_tensor_value in object_state.non_tensors.items(): AttributeError: 'int' object has no attribute 'items' ``` error is a known issue when using torch 2, and it is fixed in Composer's...

@tginart I've been able to run the triton impl of flash attn (`attn_impl: triton`) on A100s and H100s since [this](https://github.com/mosaicml/llm-foundry/pull/260) was merged. I think people have run it on A10s...
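For reference, switching attention implementations is just a config change. A minimal sketch of the relevant YAML fragment, assuming the llm-foundry MPT config layout at the time (exact keys may differ across versions):

```yaml
model:
  name: mpt_causal_lm
  attn_config:
    attn_impl: triton  # alternatives: torch, flash
```

Everything else in the run config can stay identical, which is what makes apples-to-apples comparisons between attention implementations straightforward.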

> Triton Test Failing I've messed about with Triton, but am no expert. Not sure why the tests are or are not passing. We only use it for attn, which...

All tests pass as of the last PR merge.

With baseline usage, main and this branch have similar performance. MFU is also effectively identical; the only diff is that the update uses slightly less system memory: https://wandb.ai/mosaic-ml/streaming051_updt
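For context, MFU (model FLOPs utilization) is the ratio of achieved to peak hardware FLOPs. A rough sketch using the common ~6 FLOPs-per-parameter-per-token approximation for a forward+backward pass (the attention term is ignored, and the function name and example numbers are illustrative, not from the linked run):

```python
def mfu(n_params, tokens_per_sec, n_gpus, peak_flops_per_gpu):
    # Achieved FLOPs: ~6 FLOPs per parameter per token covers the
    # forward (2) plus backward (4) matmul work of a dense transformer.
    achieved = 6 * n_params * tokens_per_sec
    # Normalize by the aggregate peak throughput of the cluster.
    return achieved / (n_gpus * peak_flops_per_gpu)
```

Because the same formula is applied to both branches, equal throughput implies equal MFU, so matching MFU here is just a sanity check that the streaming update didn't slow training down.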

> It looks really strange, cause it's works fine with torch attention, can you help please? Can you clarify what you mean by "it's works fine with `torch` attention"? You...

I just want to verify that you used the exact same configuration for all 3 runs, and the only diff was the `attn_impl`.