lynn
> Did you solve the problem? I met the same problem. It turned out I had downloaded the wrong llama model: I downloaded it from ModelScope, but downloading it from Hugging Face works...
> > Guys, I have a question. Setup: pre-training Qwen3-32B with packing=false and a cutoff length of 2k; every sample is guaranteed to be at most 2k tokens, say 1000. So the input_ids handed to the model for each sample have length 1001 (eos appended at the end), i.e. shorter than 2k.
> >
> > Questions: 1. Do I need to manually append pad_token to input_ids to pad each sample to 2k? 2. Do I need to manually add the labels and attention_mask columns, with labels matching input_ids for the first 1001 positions and set to IGNORE_INDEX afterwards, and attention_mask set to 1 for the first 1001 positions and 0 afterwards?
>
> Neither 1 nor 2 is needed; they are filled in automatically: https://github.com/hiyouga/LLaMA-Factory/blob/main/src/llamafactory/data/processor/supervised.py#L32

Is this token something we need to add ourselves when building the dataset, or does LF append it?
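For illustration only, here is a minimal sketch of what that automatic right-padding looks like conceptually. This is not LLaMA-Factory's actual collator; the function name `pad_batch`, the `max_len=2048` default, and `IGNORE_INDEX = -100` are assumptions for the example.

```python
# Minimal sketch of right-padding a supervised batch; NOT LLaMA-Factory's
# actual implementation. Assumes IGNORE_INDEX = -100 and a pad_token_id
# taken from the tokenizer.
IGNORE_INDEX = -100

def pad_batch(samples, pad_token_id, max_len=2048):
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for input_ids in samples:
        seq_len = len(input_ids)          # e.g. 1001 = 1000 tokens + eos
        pad_len = max_len - seq_len
        batch["input_ids"].append(input_ids + [pad_token_id] * pad_len)
        # real tokens attend (1), padding does not (0)
        batch["attention_mask"].append([1] * seq_len + [0] * pad_len)
        # padded positions are excluded from the loss via IGNORE_INDEX
        batch["labels"].append(input_ids + [IGNORE_INDEX] * pad_len)
    return batch

# Example: a single 1001-token sample padded to 2048
example = [list(range(1000)) + [2]]       # 2 standing in for eos_token_id
out = pad_batch(example, pad_token_id=0)
assert len(out["input_ids"][0]) == 2048
```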
A similar problem occurred to me: `import selective_scan_cuda` fails with

`ImportError: /home/aiscuser/.conda/envs/mamba/lib/python3.11/site-packages/selective_scan_cuda.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZN3c1021throwNullDataPtrErrorEv`

Environment: python=3.11, pytorch==2.3.0, torchvision==0.18.0, torchaudio==2.3.0, pytorch-cuda=12.1
The commands below solved this problem for me (note: `cd` into the cloned repo before installing):

`git clone https://github.com/state-spaces/mamba.git`
`cd mamba`
`pip install . --no-build-isolation`
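A quick way to check that the rebuild fixed the symbol mismatch is to re-run the import that previously failed (a minimal check, assuming the same environment as above):

```python
# Sanity check after rebuilding mamba from source: this is the import that
# previously raised the undefined-symbol ImportError, so a clean import means
# the extension was compiled against the installed torch ABI.
import torch
import selective_scan_cuda  # noqa: F401

print(torch.__version__)
print("selective_scan_cuda imported successfully")
```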
It's short, no more than 150 words; maybe that explains your insights. Does Mamba2 solve this problem, i.e. this kind of performance degradation compared to the Transformer? I haven't tried Mamba2 yet.
> The transition from Mamba1 to Mamba2 does not show significant improvements for short sequence lengths. As seen in the attached image, Mamba2 still performs slower than Transformers for shorter...
> The FP16/BF16 **1979 TFLOPS** defined in the H200 spec is with sparsity, so I think the actual MFU should be `420/(1979/2)=42.45%`

Could you explain what `sparsity` means?
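For the record, the arithmetic above works out like this (a small sketch; the 1979 TFLOPS sparse peak and the 420 TFLOPS achieved figure are taken from the comment above, and halving the sparse peak for dense workloads is the assumption being made):

```python
# Worked example of the MFU correction discussed above.
# The spec-sheet FP16/BF16 figure includes structured sparsity, so dense
# workloads are compared against half of it.
sparse_peak_tflops = 1979.0      # H200 spec number (with sparsity)
dense_peak_tflops = sparse_peak_tflops / 2
achieved_tflops = 420.0          # measured throughput from the thread

mfu = achieved_tflops / dense_peak_tflops
print(f"MFU vs dense peak: {mfu:.2%}")   # ~42.45%
```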
> Please refer to https://developer.nvidia.com/blog/accelerating-inference-with-sparsity-using-ampere-and-tensorrt/

Thanks for the reference!
> No problem.
>
> 1. This seems a bit unlikely tbh. Have you ensured that `mamba-ssm` and `causal-conv1d` are installed? Maybe set `config.use_cache=False` during training at least. Otherwise, I'm...
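For reference, a minimal sketch of the `config.use_cache=False` suggestion above, assuming the Hugging Face `transformers` Mamba classes; the checkpoint name `state-spaces/mamba-130m-hf` is only an example:

```python
# Sketch: disable the cache during training, per the advice above.
# Assumes a transformers version with Mamba support; the checkpoint name
# is illustrative only.
from transformers import AutoTokenizer, MambaForCausalLM

model = MambaForCausalLM.from_pretrained("state-spaces/mamba-130m-hf")
tokenizer = AutoTokenizer.from_pretrained("state-spaces/mamba-130m-hf")

# use_cache only helps incremental decoding; turning it off during training
# keeps the model on the training code path.
model.config.use_cache = False

inputs = tokenizer("Hello Mamba", return_tensors="pt")
outputs = model(**inputs, labels=inputs["input_ids"])
print(outputs.loss)
```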
> @Lynnzake At least in HF, you guarantee that the inference path is avoided entirely, and in that case the code opts for the fused path, i.e. a kernel for conv combined...