
Loading checkpoint shards: 0%| | 0/2 [00:00<? What's wrong? I am running it on CPU

Open diabloaltman opened this issue 7 months ago • 2 comments

LTX-Video> python inference.py --prompt "一只可爱的小猫正在厨房里认真地做红烧肉,锅中冒着热气,小猫穿着厨师帽" --height 512 --width 512 --num_frames 24 --seed 42 --pipeline_config configs/ltxv-2b-0.9.6-distilled.yaml
Running generation with arguments: Namespace(output_path=None, seed=42, num_images_per_prompt=1, image_cond_noise_scale=0.15, height=512, width=512, num_frames=24, frame_rate=30, device=None, pipeline_config='configs/ltxv-2b-0.9.6-distilled.yaml', prompt='一只可爱的小猫正在厨房里认真地做红烧肉,锅中冒着热气,小猫穿着厨师帽', negative_prompt='worst quality, inconsistent motion, blurry, jittery, distorted', offload_to_cpu=False, input_media_path=None, conditioning_media_paths=None, conditioning_strengths=None, conditioning_start_frames=None)
Padded dimensions: 512x512x25
Loading checkpoint shards: 0%| | 0/2 [00:00<?

diabloaltman, May 16 '25 10:05

pipeline_type: base
checkpoint_path: "ltxv-2b-0.9.6-distilled-04-25.safetensors"
guidance_scale: 1
stg_scale: 0
rescaling_scale: 1
num_inference_steps: 8
stg_mode: "attention_values" # options: "attention_values", "attention_skip", "residual", "transformer_block"
decode_timestep: 0.05
decode_noise_scale: 0.025
text_encoder_model_name_or_path: "PixArt-alpha/PixArt-XL-2-1024-MS"
precision: "bfloat16"
sampler: "from_checkpoint" # options: "uniform", "linear-quadratic", "from_checkpoint"
prompt_enhancement_words_threshold: 120
prompt_enhancer_image_caption_model_name_or_path: "MiaoshouAI/Florence-2-large-PromptGen-v2.0"
prompt_enhancer_llm_model_name_or_path: "unsloth/Llama-3.2-3B-Instruct"
stochastic_sampling: true

diabloaltman, May 16 '25 10:05

FYI, prompts should be in English only. Can you describe what error you are getting?

ybitterman, May 21 '25 06:05
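
Editor's sketch, not part of the thread: since the reply above says prompts should be in English only, a quick pre-flight check can flag a prompt like the one in the original command before it is passed to inference.py. The helper names below (`non_ascii_chars`, `is_ascii_prompt`) are hypothetical and not part of LTX-Video. Testing for ASCII is only a rough proxy for "English": it will also flag curly quotes or accented letters, and it says nothing about why the checkpoint loading appeared to stall.

```python
def non_ascii_chars(prompt: str) -> list[str]:
    """Return the distinct non-ASCII characters in `prompt`, in order of first appearance."""
    seen: list[str] = []
    for ch in prompt:
        if not ch.isascii() and ch not in seen:
            seen.append(ch)
    return seen


def is_ascii_prompt(prompt: str) -> bool:
    """Return True if `prompt` is pure ASCII (a crude stand-in for 'English only')."""
    return not non_ascii_chars(prompt)


if __name__ == "__main__":
    # The prompt from the issue, followed by an English prompt written for comparison.
    for prompt in (
        "一只可爱的小猫正在厨房里认真地做红烧肉,锅中冒着热气,小猫穿着厨师帽",
        "A cute kitten in a chef's hat carefully cooking braised pork in a steaming pan",
    ):
        verdict = "looks fine" if is_ascii_prompt(prompt) else "contains non-ASCII characters"
        print(f"{verdict}: {prompt}")
```

If the check flags a prompt, translating it to English by hand before running the command is the simplest fix suggested by the reply.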