MagicSource
I tried using Cambrian-7M with system prompts. I don't know why; maybe with the longer training process we shouldn't use the same warmup ratio? I tried the intermediate model, the inference...
May I ask which modification focuses on resolving this issue?
@yanwei-li This is my script:

```
AUX_SIZE=768
deepspeed train_xformers_gemini.py \
    --deepspeed ./scripts/gemini/zero2.json \
    --model_name_or_path ./checkpoints/$MODEL_VERSION \
    --version $PROMPT_VERSION \
    --data_path ./data/llava_0.1/pretrain_data.json \
    --image_folder ./data/images_all/pretrain_data \
    --vision_tower ./checkpoints/clip-vit-large-patch14-336 \
    --vision_tower_aux ./checkpoints/openclip-convnext-large-d-320-laion2B-s29B-b131K-ft-soup...
```
@yanwei-li Hi, why could a template affect the final performance so much? Is there any deeper reason?
Oh, so it might be better to try the same template in the pretrain stage. Actually, since the projector training stage doesn't unfreeze the LLM, it might be more suitable to keep the same...
Sorry, I posted a similar issue before, but I hit almost exactly the same issue with SQLAlchemy, and I used time.sleep(2) in my function. Is there any solution?
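For context, here is a minimal sketch (plain asyncio and stdlib only, not tied to any scheduler library; the job names are made up) of why a blocking `time.sleep(2)` inside an async scheduler's job can cause trouble: it stalls the event loop, so jobs that should overlap run back-to-back.

```python
import asyncio
import time

async def blocking_job():
    time.sleep(0.2)   # blocks the whole event loop while "working"

async def async_job():
    await asyncio.sleep(0.2)  # yields control back to the loop while waiting

def run_two(job):
    """Run two copies of `job` concurrently and return the elapsed wall time."""
    async def main():
        start = time.monotonic()
        await asyncio.gather(job(), job())
        return time.monotonic() - start
    return asyncio.run(main())

# Two async_job copies overlap (~0.2 s total); two blocking_job copies
# serialize (~0.4 s total) because time.sleep never yields to the loop.
print(run_two(async_job) < run_two(blocking_job))  # True
```

If the real job has to block, running it in a thread (e.g. `asyncio.to_thread`) usually avoids stalling the scheduler.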
Hi, any timeline for publishing the update to PyPI?
```
TypeError: Scheduler.add_schedule() got an unexpected keyword argument 'max_running_jobs'
```

Hi, just want to ask (thank you so much for the work), but can we make the API a little bit...
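When an alpha-stage API drifts between releases like this, one defensive option is to check whether a function still accepts a keyword before passing it. A hedged sketch using only the stdlib's `inspect.signature`; the `add_schedule` below is a stand-in function, not the library's actual signature:

```python
import inspect

def supports_kwarg(func, name):
    """Return True if `func` accepts a keyword argument called `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        return True
    # Also accepted if the function takes **kwargs.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())

# Stand-in for a scheduler method whose signature changed across versions:
def add_schedule(func, trigger, *, id=None):
    pass

print(supports_kwarg(add_schedule, "id"))                # True
print(supports_kwarg(add_schedule, "max_running_jobs"))  # False
```

With that check, the caller can branch and only pass the keyword on versions that support it, instead of crashing with a `TypeError`.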
Hopefully we won't have to change all the code again for the beta and final releases. -.-
Posting numerous different versions of the API is not beneficial for your project. Artificial intelligence is currently trending, and training AI on corpora from incorrect versions only leads to increased confusion among...