Zhang Peiyuan

Results: 32 comments of Zhang Peiyuan

You should probably upgrade your transformers version.

https://github.com/jzhang38/TinyLlama/blob/main/lit_gpt/config.py You can pick a config from here or create your own.

I also encountered problem 3. Model: T5 (by passing "t5-base" to model_name_or_path). Dataset: NarrativeQA. Command: `CUDA_VISIBLE_DEVICES=0 python src/run.py --dataset_config_name narrative_qa --dataset_name tau/scrolls --drop_duplicates_in_eval True --greater_is_better False --metric_for_best_model loss...`

@LzhinFdu @GeneZC Yeah, you need to shard the sequence yourself before feeding it into ring-flash-attention. I have an implementation here: https://github.com/jzhang38/EasyContext
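A minimal sketch of what "shard the sequence yourself" could look like, assuming the simplest contiguous-chunk layout (real implementations such as EasyContext may use other layouts, e.g. zigzag sharding, for load balance; `rank`/`world_size` would come from `torch.distributed`):

```python
def shard_sequence(tokens, rank, world_size):
    """Return the contiguous chunk of `tokens` owned by `rank`.

    Each rank keeps only its own slice of the full sequence before
    the per-rank slices are fed into ring attention.
    """
    assert len(tokens) % world_size == 0, "pad the sequence first"
    chunk = len(tokens) // world_size
    return tokens[rank * chunk : (rank + 1) * chunk]

# Toy example: an 8-token sequence split across 4 ranks.
seq = list(range(8))
shards = [shard_sequence(seq, r, 4) for r in range(4)]
# shards -> [[0, 1], [2, 3], [4, 5], [6, 7]]
```

In an actual run, each rank would compute only its own shard (not all four) and pass it to the ring-attention kernel, which exchanges KV blocks between ranks internally.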

The value of the tensor x is already modified **in place** after the rotary_emb.apply_rotary function. It is just that we did not allocate separate memory to store the output value,...
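A toy illustration of the aliasing being described (plain Python rather than torch, and `apply_inplace_scale` is a made-up stand-in, not the real `apply_rotary`): the function writes results back into its argument and returns the same object, so no new memory is allocated for the output.

```python
def apply_inplace_scale(buf, factor):
    """Mutate `buf` in place and return it (no copy is made)."""
    for i in range(len(buf)):
        buf[i] *= factor
    return buf  # the same object that was passed in

x = [1.0, 2.0, 3.0]
y = apply_inplace_scale(x, 2.0)
assert y is x            # output aliases the input
assert x == [2.0, 4.0, 6.0]  # x itself was modified
```

This is why inspecting `x` after the call already shows the rotated values, even though the code never assigns the "output" to a fresh tensor.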

`x_1_approx_noised = (1 - reshape_t(renoise_timesteps)) * x_1_approx + reshape_t(renoise_timesteps) * x_0_latent` I believe this line is wrong. The correct version should be: `x_1_approx_noised = reshape_t(renoise_timesteps) * x_1_approx + ( 1 -...`
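A sketch of the interpolation under the convention the comment argues for, assuming the truncated second term is `(1 - t) * x_0_latent` (that completion is my assumption, and the `reshape_t` broadcasting helper is omitted; scalars stand in for tensors):

```python
def renoise(x1_approx, x0_latent, t):
    # Weight the approximate clean sample by t and the noise latent
    # by (1 - t), per the corrected version in the comment.
    return t * x1_approx + (1.0 - t) * x0_latent

# Sanity checks on the endpoints of the interpolation:
assert renoise(5.0, -3.0, 1.0) == 5.0    # t = 1 recovers x_1_approx
assert renoise(5.0, -3.0, 0.0) == -3.0   # t = 0 recovers x_0_latent
```

The endpoint behavior is the easiest way to see why the original line looks swapped: with the weights reversed, t = 1 would return pure latent noise instead of the approximate sample.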

Thanks for the quick answer! > The parallel decoding is done by feeding next-scale queries (e.g., 4x4=16) to the transformer decoder and getting 16 predicted token distributions simultaneously. This is...

https://github.com/hao-ai-lab/FastVideo/blob/bdfdf1dfeea2aebac8a462df9c3bcb2d1d11a01c/fastvideo/v1/models/loader/fsdp_load.py#L206 https://github.com/hao-ai-lab/FastVideo/blob/bdfdf1dfeea2aebac8a462df9c3bcb2d1d11a01c/fastvideo/v1/configs/models/dits/hunyuanvideo.py#L55

Please run pre-commit (`pre-commit run --all-files`) and make sure the pre-commit checks pass.