Zhang Peiyuan

Results: 32 comments of Zhang Peiyuan

You should probably upgrade your transformers version.

https://github.com/jzhang38/TinyLlama/blob/main/lit_gpt/config.py You can pick a config from here or create your own.

I also encountered problem 3. Model: T5 (by passing "t5-base" to model_name_or_path). Dataset: NarrativeQA. Command: `CUDA_VISIBLE_DEVICES=0 python src/run.py --dataset_config_name narrative_qa --dataset_name tau/scrolls --drop_duplicates_in_eval True --greater_is_better False --metric_for_best_model loss...`

@LzhinFdu @GeneZC Yeah, you need to shard the sequence yourself before feeding it into ring-flash-attention. I have an implementation here: https://github.com/jzhang38/EasyContext
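A minimal sketch of what "shard the sequence yourself" could look like, assuming the simplest contiguous-chunk layout (real implementations such as EasyContext may use other layouts, e.g. zigzag sharding, for load balance; `rank`/`world_size` would come from `torch.distributed`):

```python
def shard_sequence(tokens, rank, world_size):
    """Return the contiguous chunk of `tokens` owned by `rank`.

    Each rank keeps only its own slice of the full sequence before
    the per-rank slices are fed into ring attention.
    """
    assert len(tokens) % world_size == 0, "pad the sequence first"
    chunk = len(tokens) // world_size
    return tokens[rank * chunk : (rank + 1) * chunk]

# Toy example: an 8-token sequence split across 4 ranks.
seq = list(range(8))
shards = [shard_sequence(seq, r, 4) for r in range(4)]
# shards -> [[0, 1], [2, 3], [4, 5], [6, 7]]
```

In an actual run, each rank would compute only its own shard (not all four) and pass it to the ring-attention kernel, which exchanges KV blocks between ranks internally.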

The value of the tensor x is already modified **in place** after the rotary_emb.apply_rotary function. It is just that we did not allocate separate memory to store the output value,...
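A toy illustration of the aliasing being described (plain Python rather than torch, and `apply_inplace_scale` is a made-up stand-in, not the real `apply_rotary`): the function writes results back into its argument and returns the same object, so no new memory is allocated for the output.

```python
def apply_inplace_scale(buf, factor):
    """Mutate `buf` in place and return it (no copy is made)."""
    for i in range(len(buf)):
        buf[i] *= factor
    return buf  # the same object that was passed in

x = [1.0, 2.0, 3.0]
y = apply_inplace_scale(x, 2.0)
assert y is x            # output aliases the input
assert x == [2.0, 4.0, 6.0]  # x itself was modified
```

This is why inspecting `x` after the call already shows the rotated values, even though the code never assigns the "output" to a fresh tensor.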

`x_1_approx_noised = (1 - reshape_t(renoise_timesteps)) * x_1_approx + reshape_t(renoise_timesteps) * x_0_latent` I believe this line is wrong. The correct version should be: `x_1_approx_noised = reshape_t(renoise_timesteps) * x_1_approx + ( 1 -...`
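A sketch of the interpolation under the convention the comment argues for, assuming the truncated second term is `(1 - t) * x_0_latent` (that completion is my assumption, and the `reshape_t` broadcasting helper is omitted; scalars stand in for tensors):

```python
def renoise(x1_approx, x0_latent, t):
    # Weight the approximate clean sample by t and the noise latent
    # by (1 - t), per the corrected version in the comment.
    return t * x1_approx + (1.0 - t) * x0_latent

# Sanity checks on the endpoints of the interpolation:
assert renoise(5.0, -3.0, 1.0) == 5.0    # t = 1 recovers x_1_approx
assert renoise(5.0, -3.0, 0.0) == -3.0   # t = 0 recovers x_0_latent
```

The endpoint behavior is the easiest way to see why the original line looks swapped: with the weights reversed, t = 1 would return pure latent noise instead of the approximate sample.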

Thanks for the quick answer! > The parallel decoding is done by feeding next-scale queries (e.g., 4x4=16) to the transformer decoder and getting 16 predicted token distributions simultaneously. This is...

https://github.com/hao-ai-lab/FastVideo/blob/bdfdf1dfeea2aebac8a462df9c3bcb2d1d11a01c/fastvideo/v1/models/loader/fsdp_load.py#L206 https://github.com/hao-ai-lab/FastVideo/blob/bdfdf1dfeea2aebac8a462df9c3bcb2d1d11a01c/fastvideo/v1/configs/models/dits/hunyuanvideo.py#L55

Please run pre-commit (`pre-commit run --all-files`) and make sure the pre-commit checks pass.