li-yang23 issues

Results: 4 issues of li-yang23

should use encode() to get hidden_emb

I found that `train.py` uses `mu.data.numpy()` to get `hidden_emb`, but this yields None when the model is `GCNModelAE`. `hidden_emb` should be obtained from `model.encode()` instead.
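A minimal sketch of the suggested change, using toy stand-ins rather than the repository's real classes. `ToyAE`, `ToyVAE`, `get_hidden_emb`, and the `encode(x, adj)` signature are all assumptions made for illustration; the actual models may differ.

```python
class ToyAE:
    """Hypothetical stand-in for a non-variational model (no mu/logvar)."""

    def encode(self, x, adj):
        # Toy embedding: one row-sum per node. A real model would apply GCN layers.
        return [[sum(row)] for row in x]

    def forward(self, x, adj):
        z = self.encode(x, adj)
        # Assumption mirroring the report: mu is None for the plain autoencoder.
        return z, None, None


class ToyVAE:
    """Hypothetical stand-in for a variational model: encode returns (mu, logvar)."""

    def encode(self, x, adj):
        mu = [[sum(row)] for row in x]
        logvar = [[0.0] for _ in x]
        return mu, logvar


def get_hidden_emb(model, x, adj):
    """Take the embedding from encode() instead of from forward()'s mu output."""
    out = model.encode(x, adj)
    # VAE-style encoders return a (mu, logvar) tuple; AE-style ones return the embedding.
    return out[0] if isinstance(out, tuple) else out


if __name__ == "__main__":
    features = [[1, 2], [3, 4]]
    print(get_hidden_emb(ToyAE(), features, adj=None))
    print(get_hidden_emb(ToyVAE(), features, adj=None))
```

With real PyTorch tensors, the returned value would still need converting to NumPy (for example with `.data.numpy()`, as the existing code does) before evaluation.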

LLamaQaStoppingCriteria not in transformers.generation.stopping_criteria?

I just checked the code in `generation.py` and noticed that it imports `LLamaQaStoppingCriteria` from `transformers.generation.stopping_criteria`. However, I could not find this function (or class) in the `transformers/generation/stopping_criteria.py` script. (not from the...
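One quick way to confirm whether an installed package exposes a given name is to import the module and test for the attribute. The helper `has_symbol` below is a hypothetical name introduced for this sketch; it uses only the standard library.

```python
import importlib


def has_symbol(module_name, symbol):
    """Return True/False if the module imports, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None  # the package (or this submodule) is not installed
    return hasattr(module, symbol)


if __name__ == "__main__":
    # Prints None when transformers is not installed in the current environment.
    print(has_symbol("transformers.generation.stopping_criteria",
                     "LLamaQaStoppingCriteria"))
```

A `False` result for the installed release would suggest the name comes from a modified copy of the library rather than the upstream package.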

Parameters of weight_gates are not initialized from the Hugging Face checkpoint

I downloaded LLaMA-MoE-v1-3_0B-2_16 from Hugging Face via `huggingface-cli download llama-moe/LLaMA-MoE-v1-3_0B-2_16 --local-dir /path/to/my/local/dir` and used the same quick-start inference code from the README, except for replacing `model_dir` with my local...
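When some parameters are reported as not initialized from a checkpoint, a common first step is to compare the parameter names the model expects with the names stored in the checkpoint. The sketch below does that with plain sets; `compare_keys` and the example key names are hypothetical and are not taken from the LLaMA-MoE code.

```python
def compare_keys(model_keys, checkpoint_keys):
    """Report names the model expects but the checkpoint lacks, and the reverse."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    return {
        # Parameters that would be left at their fresh initialization.
        "missing": sorted(model_keys - checkpoint_keys),
        # Checkpoint entries that no model parameter consumes.
        "unexpected": sorted(checkpoint_keys - model_keys),
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    expected = ["layers.0.gate.weight_gates", "layers.0.mlp.weight"]
    stored = ["layers.0.gate.weight", "layers.0.mlp.weight"]
    print(compare_keys(expected, stored))
```

With PyTorch, the two name lists would typically come from `model.state_dict().keys()` and from the keys of the loaded checkpoint dictionary. A name appearing under both "missing" and "unexpected" with a small difference often points to a renamed parameter.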

Code from `tutorials\examples\ptl-trainer\litgpt_ptl_small.py` not working, model not learning

`litgpt==0.5.11 litdata==0.2.58, OS: windows 10, python==3.13, torch==2.9.0+cu126` I'm following the tutorial to pretrain SmolLM2-135M on the TinyStories dataset, and my script looks like this: ```python import lightning as L import...
