li-yang23
In train.py, `mu.data.numpy()` is used to get `hidden_emb`, but `mu` is None when the model is GCNModelAE. `hidden_emb` should be obtained from `model.encode()` instead.
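A minimal sketch of the fallback described above, not the repository's actual code. The helper name `get_hidden_emb`, the `encode(features, adj_norm)` argument order, and the `StubAE` class are assumptions made for illustration; only `mu`, `hidden_emb`, and `model.encode()` come from the report.

```python
def get_hidden_emb(model, features, adj_norm, mu=None):
    """Return the latent embedding, falling back to model.encode() when mu is None.

    Hypothetical helper: the argument names and the encode() signature are assumed.
    """
    if mu is not None:
        # Variational case: the forward pass already produced mu.
        return mu
    # Non-variational case (e.g. GCNModelAE): ask the encoder directly.
    out = model.encode(features, adj_norm)
    # Assumption: a variational encoder may return (mu, logvar); keep the first item.
    return out[0] if isinstance(out, tuple) else out


# Hypothetical stand-in for a non-variational autoencoder, for demonstration only.
class StubAE:
    def encode(self, features, adj_norm):
        return [[0.1, 0.2], [0.3, 0.4]]


emb = get_hidden_emb(StubAE(), features=None, adj_norm=None, mu=None)
```

With a real model the returned tensor would still need the same conversion that train.py applies to `mu`.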
I just checked the code in `generation.py` and noticed that it imports `LLamaQaStoppingCriteria` from `transformers.generation.stopping_criteria`. However, I could not find this function (or class) in the transformers/generation/stopping_criteria.py script. (not from the...
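One way to check whether an installed module actually exposes a given name is to import it and test for the attribute. The helper name `module_exports` below is hypothetical, and the demonstration uses the standard-library `json` module so that it runs without any extra packages.

```python
import importlib


def module_exports(module_name, attr_name):
    """Return True if module_name can be imported and defines attr_name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing from the current environment.
        return False
    return hasattr(module, attr_name)


# Demonstration with the standard library only.
has_decoder = module_exports("json", "JSONDecoder")
has_missing = module_exports("json", "NoSuchName")
```

The same call with `"transformers.generation.stopping_criteria"` and `"LLamaQaStoppingCriteria"` would report whether the installed transformers build defines that name. A False result would suggest the project relies on a modified copy of the library, though that is only a guess from the outside.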
I downloaded LLaMA-MoE-v1-3_0B-2_16 from Hugging Face via `huggingface-cli download llama-moe/LLaMA-MoE-v1-3_0B-2_16 --local-dir /path/to/my/local/dir` and used the same quick-start inference code from the README, except that I replaced `model_dir` with my local...
`litgpt==0.5.11 litdata==0.2.58, OS: windows 10, python==3.13, torch==2.9.0+cu126` I'm following the tutorials to pretrain smollm2-135M on the TinyStories dataset, and my script looks like this: ```python import lightning as L import...