xmc-andy

Results: 18 comments of xmc-andy

> May I know your task type and which version of Otter model you are using for initialization?

I am doing a classification task, with multiple images and a single...

```shell
export PYTHONPATH=.
accelerate launch --config_file=./pipeline/accelerate_configs/accelerate_config_fsdp.yaml \
    pipeline/train/instruction_following.py \
    --pretrained_model_name_or_path /mnt/large_model/weights/OTTER-Image-MPT7B_git \
    --mimicit_vt_path /mnt/large_model/output/XX/SD_instruction.json \
    --images_vt_path /mnt/large_model/output/XX/SD.json \
    --external_save_dir /mnt/large_model/output/XX/OTTER-Identify-Image-MPT7B-BC4-partScale-negAug3 \
    --batch_size 1 \
    --num_epochs 15 \
    --run_name OTTER-Identify-Image-MPT7B-BC4-partScale-negAug3 \
    --workers...
```

When loading the pre-trained weights you posted or the baseline weights I trained, there is no log about missing weights, but there is one when loading the newly trained...
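For reference, one way to see exactly which weights trigger that log is to load the checkpoint with `strict=False` and print the reported keys. This is a minimal sketch; `model` and the checkpoint filename are assumptions for illustration, not the repo's actual loading code:

```python
import torch

# Minimal sketch: `model` is assumed to be the instantiated Otter model and
# final_weights.pt the checkpoint being inspected; both are placeholders here.
state_dict = torch.load("final_weights.pt", map_location="cpu")
result = model.load_state_dict(state_dict, strict=False)
print("missing keys:", result.missing_keys)        # parameters the checkpoint does not cover
print("unexpected keys:", result.unexpected_keys)  # checkpoint entries the model does not use
```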

By the way, due to network problems I cannot download tokenizer_config.json from Hugging Face's MPT, so I downloaded the files offline from "https://huggingface.co/mosaicml/mpt-7b-instruct" (everything except the bin file), and in modeling_otter.py...
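A minimal sketch of pointing the tokenizer at those locally downloaded files instead of fetching from the Hub; the local directory path here is hypothetical:

```python
from transformers import AutoTokenizer

# Hypothetical local directory holding the files downloaded from
# https://huggingface.co/mosaicml/mpt-7b-instruct (tokenizer_config.json, etc.).
tokenizer = AutoTokenizer.from_pretrained(
    "/path/to/local/mpt-7b-instruct",
    local_files_only=True,  # never try to reach the Hub
)
```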

I compared the config.json. Except for "_name_or_path" and "transformers_version", the rest is consistent with what you posted, so this should not be the problem. I converted the trained weights...
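A quick way to check that programmatically is to diff the two files while ignoring the fields that legitimately differ between checkpoints; a small sketch with hypothetical filenames:

```python
import json

# Hypothetical filenames for the two configs being compared.
IGNORED = {"_name_or_path", "transformers_version"}

with open("config_mine.json") as f:
    mine = json.load(f)
with open("config_reference.json") as f:
    ref = json.load(f)

diff = {
    key: (mine.get(key), ref.get(key))
    for key in (mine.keys() | ref.keys()) - IGNORED
    if mine.get(key) != ref.get(key)
}
print(diff or "configs match apart from the ignored fields")
```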

I checked the save_pretrained part as you said. I'm using the version from about a month ago, and the save code is as follows:

```python
unwrapped_model = accelerator.unwrap_model(model)
checkpoint_dict = get_checkpoint(model=unwrapped_model)
accelerator.save(
    checkpoint_dict,...
```
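For context, that snippet typically continues along these lines; the main-process guard and the output filename are assumptions, not the repo's exact code:

```python
# Sketch only: the guard, helper, and output path are assumptions based on the snippet above.
if accelerator.is_main_process:
    unwrapped_model = accelerator.unwrap_model(model)
    checkpoint_dict = get_checkpoint(model=unwrapped_model)
    accelerator.save(
        checkpoint_dict,
        f"{external_save_dir}/final_weights.pt",
    )
```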

> I would suggest you use the `save_pretrained` method (it's a function of Hugging Face Transformers). This method directly dumps everything of your current trained model to a `path`...
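A minimal sketch of that suggested `save_pretrained` route under FSDP, assuming the model is a Hugging Face `PreTrainedModel` wrapped by accelerate; the output directory is hypothetical:

```python
# Gather the full state dict from the (possibly FSDP-sharded) model, then let
# Hugging Face write config.json and the weight files together.
# `accelerator` and `model` come from the training loop; the path is hypothetical.
state_dict = accelerator.get_state_dict(model)
unwrapped_model = accelerator.unwrap_model(model)
unwrapped_model.save_pretrained(
    "/path/to/output/otter_hf_checkpoint",
    state_dict=state_dict,
)
```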

Hey, I converted the trained weights final_weights.pt with otter/converting_otter_pt_to_hf.py and then loaded the weights with `from_pretrained`. Please tell me if this is correct. I found that when converting...
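For completeness, loading the converted checkpoint back would look roughly like this; the import path and the converted directory are assumptions based on the repo layout mentioned in this thread:

```python
# Assumes converting_otter_pt_to_hf.py wrote a Hugging Face style directory
# (config.json plus weight files); the import path and directory are placeholders.
from otter.modeling_otter import OtterForConditionalGeneration

model = OtterForConditionalGeneration.from_pretrained(
    "/path/to/converted_hf_checkpoint",
    device_map="auto",
)
```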

> "transformers_version" Got it, the generated config.json only has "_name_or_path" and ""transformers_version"" different from what you posted.

> It could be right, if you confirm that the `config.json` is the same.

Thank you very much for your careful answer. I have solved this bug. The cause of...