
Generated video: only the first frame has an image, the other frames are random pixels

Open mengjiexu opened this issue 1 year ago • 0 comments

I run bash scripts/run_sample_video.sh with the LWM-Chat-1M-JAX model. The sh file is:

...

python3 -u -m lwm.vision_generation \
    --prompt='A long big pig is walking across the street' \
    --output_file='fireworks.mp4' \
    --temperature_image=1.0 \
    --temperature_video=1.0 \
    --top_k_image=8192 \
    --top_k_video=1000 \
    --cfg_scale_image=5.0 \
    --cfg_scale_video=1.0 \
    --vqgan_checkpoint="$vqgan_checkpoint" \
    --n_frames=8 \
    --mesh_dim='!1,1,2,1' \
    --dtype='bf16' \
    --load_llama_config='7b' \
    --update_llama_config="dict(sample_mode='vision',theta=50000000,max_sequence_length=32768,use_flash_attention=True,scan_attention=False,scan_query_chunk_size=256,scan_key_chunk_size=256,scan_mlp=False,scan_mlp_chunk_size=8192,scan_layers=True)" \
    --load_checkpoint="params::$lwm_checkpoint" \
    --tokenizer.vocab_file="$llama_tokenizer_path"
read

After generation, only the first frame of the output video is meaningful; all the other frames are random pixels.
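To make "random pixels" checkable rather than a visual impression, here is a minimal sketch that flags noise-like frames. It is not part of the LWM repo: the function names, the grayscale 2D-list frame format, and the threshold of 40 are all assumptions chosen for illustration. Decoding the mp4 into frames is left to whatever video library is at hand. The idea is that adjacent pixels in a real image are similar, while in uniform noise they differ by about 85 on average on a 0..255 scale.

```python
import random


def roughness(frame):
    """Mean absolute difference between horizontally adjacent pixels.

    `frame` is assumed to be a 2D list of grayscale values in 0..255.
    """
    total, count = 0, 0
    for row in frame:
        for left, right in zip(row, row[1:]):
            total += abs(left - right)
            count += 1
    return total / count if count else 0.0


def noisy_frame_indices(frames, threshold=40.0):
    """Indices of frames whose roughness exceeds `threshold`.

    The default threshold is an assumed cutoff, not a calibrated value.
    """
    return [i for i, frame in enumerate(frames) if roughness(frame) > threshold]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-ins: a smooth gradient and two uniform-noise frames.
    smooth = [[x + y for x in range(64)] for y in range(64)]
    noise_a = [[rng.randrange(256) for _ in range(64)] for _ in range(64)]
    noise_b = [[rng.randrange(256) for _ in range(64)] for _ in range(64)]
    print(noisy_frame_indices([smooth, noise_a, noise_b]))
```

Running this on the 8 decoded frames would be expected to flag frames 1 through 7 if the description above is accurate, though I have not tuned the threshold on real LWM output.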

mengjiexu, Feb 19 '24 08:02