Carlos Mocholí
The script is designed to print the inference results to stdout and everything else to stderr, in case you want to pipe them separately. There might be something wrong with...
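For example, to capture only the results while the logs still reach the terminal, you can keep the two streams separate when invoking the script. A minimal sketch (the script path and flags below are illustrative placeholders, not the exact command):

```python
# Minimal sketch: capture only the inference results (stdout) and let the
# logs (stderr) pass through to the terminal. The script path and flags are
# illustrative placeholders for whatever command you are running.
import subprocess

result = subprocess.run(
    ["python", "generate/base.py", "--prompt", "Hello"],
    stdout=subprocess.PIPE,  # inference results
    stderr=None,             # logs keep going to the terminal
    text=True,
)
print(result.stdout)
```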
Have a look at this pretraining tutorial: https://github.com/Lightning-AI/lit-gpt/blob/main/tutorials/pretrain_tinyllama.md

The only MoE model we support is Mixtral. You would need to replace TinyLlama with it.
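Concretely, model selection goes through the config name, so the swap would look roughly like this (a sketch; the config names assume a checkout that includes the Mixtral configs):

```python
# Sketch: select the Mixtral config by name instead of TinyLlama.
# Assumes a lit-gpt checkout that includes the Mixtral configs.
from lit_gpt import Config

config = Config.from_name("Mixtral-8x7B-v0.1")  # instead of "tiny-llama-1.1b"
print(config.n_layer, config.n_embd)
```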
I agree
Same issue as in https://github.com/Lightning-AI/litgpt/issues/1402 cc @awaelchli
Are you using Google Colab? You could try using https://lightning.ai while this gets fixed. It should work there without issues
You should accompany any decision with a proof of concept (PoC) of how to implement it. I say this because (to the best of my knowledge) a call like `litgpt finetune --method "lora"`...
See also my previous comment on this topic: https://github.com/Lightning-AI/litgpt/issues/996#issuecomment-1989618188
> The limitation you mentioned would be for selectively showing the LoRA args, correct?

Yes. But also for the `--data` argument, the `--generate` subcommand, etc. These are technical details...
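To illustrate the kind of limitation at play, here is a small sketch (illustrative only, not litgpt's actual parser code) of how jsonargparse handles subclass-typed arguments like `--data`, whose nested options depend on which class the user selects:

```python
# Sketch (illustrative, not litgpt's actual parser setup): with jsonargparse,
# an argument typed against a base class accepts any subclass, so the options
# nested under --data are only known once a concrete class is chosen and
# cannot all be rendered statically in --help.
from jsonargparse import ArgumentParser

class DataModule:
    pass

class Alpaca(DataModule):
    def __init__(self, mask_prompt: bool = False):
        self.mask_prompt = mask_prompt

parser = ArgumentParser()
parser.add_subclass_arguments(DataModule, "data")
# `--data.mask_prompt` only exists after the user picks `--data Alpaca`.
```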
I'm trying to install it on an A100 and this is the error I get:

```
error: identifier "CUDNN_DATA_FP8_E5M2" is undefined
```

Running:

```shell
NVTE_FRAMEWORK=pytorch pip install --upgrade git+https://github.com/NVIDIA/TransformerEngine.git@stable
```

```shell
ERROR: Command errored...
```
Upgrading to CUDA 12.1 allowed me to install it. Perhaps this line in the installation instructions is outdated and should be updated:

> Transformer Engine requires CUDA 11.8
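Before installing, it may save time to confirm which CUDA version your environment actually provides. A quick check, assuming PyTorch is installed (note that the Transformer Engine build also compiles against the system CUDA toolkit via nvcc, which is what matters here):

```python
# Quick check (assumes PyTorch is installed): prints the CUDA version PyTorch
# was built against. The Transformer Engine build additionally needs a matching
# system CUDA toolkit (nvcc), which per this thread should be 12.1+.
import torch

print(torch.version.cuda)
```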