mengjiexu

Results: 4 issues opened by mengjiexu

Now that we can train the llama-7B model on one RTX 3090, can we train the llama-13B model on two RTX 3090s with model parallelism?
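
For reference, a minimal JAX sketch of the idea (this is not the repository's actual training entry point or flags, just an illustration): place the two GPUs on a device mesh and shard parameters along a "model" axis so each RTX 3090 holds half of each weight matrix.

```python
# Hypothetical illustration of 2-way model (tensor) parallelism in JAX;
# the real training script would expose this through its own mesh options.
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

devices = np.array(jax.devices()[:2]).reshape(1, 2)   # (data axis, model axis)
mesh = Mesh(devices, axis_names=("data", "model"))

# Example LLaMA-13B-sized projection (hidden 5120 -> intermediate 13824):
# shard the output dimension across the "model" axis so each GPU stores half.
w = jnp.zeros((5120, 13824))
w_sharded = jax.device_put(w, NamedSharding(mesh, P(None, "model")))
print(w_sharded.sharding)  # per-device layout of the parameter
```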

I see the training data is text only, and during training there seems to be no distinction between the instruction and the output. Are they simply concatenated together for training?
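
To make the question concrete, here is a hypothetical sketch of the two possibilities being asked about (the token ids and the -100 ignore index are illustrative, not taken from this repo's data pipeline): training on the full concatenation versus masking the loss on the instruction tokens.

```python
# Hypothetical example, not the repo's actual preprocessing.
instruction_ids = [11, 12, 13]   # tokenized instruction (made-up ids)
output_ids = [21, 22, 23]        # tokenized output (made-up ids)

# Direct concatenation: one sequence fed to the model.
input_ids = instruction_ids + output_ids

# Possibility A: compute the loss on every token of the concatenation.
labels_full = list(input_ids)

# Possibility B: mask instruction tokens (e.g. with an ignore index such as
# -100 in PyTorch-style losses) so only the output tokens contribute to the loss.
labels_masked = [-100] * len(instruction_ids) + output_ids
```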

I run `bash scripts/run_sample_video.sh`; the script uses the **LWM-Chat-1M-JAX** model:

```bash
python3 -u -m lwm.vision_generation \
    --prompt='A long big pig is walking across the street' \
    --output_file='fireworks.mp4' \
    ...
```

The error:

```
[2024-03-15 17:42:38,572] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Loading LLaVA from base model...
Special tokens have been added in the vocabulary, make sure the associated word...
```