param.grad is None
Hi! Thank you for your great work!
After preparing the datasets and the pretrained models, I trained the model with this command:
randport=$(shuf -i8000-9999 -n1)  # Generate a random port number
python -u main.py \
    --dist-url "tcp://127.0.0.1:${randport}" --dist-backend 'nccl' \
    --multiprocessing-distributed --world-size 1 --rank 0 --batch-size=256 \
    --dataset=cc3m --val-dataset=cc3m \
    --exp-name='gill_exp' --image-dir='/data/vol1/public-datasets/03-CC/cc3m/' --log-base-dir='runs/' \
    --precision='bf16' --print-freq=100 \
    --opt-version='/data/models/facebook/opt-6.7b' --visual-model='/data/pretrained_weights/openai/clip-vit-large-patch14' --workers=0
No code was modified except the data paths. However, I got this error:
Traceback (most recent call last):
  File "/root/miniconda3/envs/vlm/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 74, in _wrap
    fn(i, *args)
  File "/data/vol1/zky/methods/gill/main.py", line 402, in main_worker
    train(train_loader, model, tokenizer, criterion, optimizer, epoch, scheduler, args)
  File "/data/vol1/zky/methods/gill/main.py", line 586, in train
    assert param.grad.shape[0] == len(tokenizer)
AttributeError: 'NoneType' object has no attribute 'shape'
It seems that the parameters in model.module.model.input_embeddings.parameters() have no gradient (param.grad is None) when the assertion runs. Could you tell me how to solve this problem? Thank you!
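To narrow this down, a check along the following lines might help. It is only a minimal sketch on a toy model, not the GILL code: `Toy` and `params_without_grad` are hypothetical names made up for illustration. It shows two common reasons a parameter's `.grad` stays `None` after `backward()`: the parameter never took part in the forward pass, or it has `requires_grad=False`.

```python
import torch
import torch.nn as nn


def params_without_grad(model: nn.Module) -> list[str]:
    """Return the names of all parameters whose .grad is still None."""
    return [name for name, p in model.named_parameters() if p.grad is None]


class Toy(nn.Module):
    """Hypothetical stand-in model; layer names are illustrative only."""

    def __init__(self):
        super().__init__()
        self.input_embeddings = nn.Embedding(10, 4)  # 10 plays the role of len(tokenizer)
        self.unused = nn.Linear(4, 4)                # never called in forward()
        self.frozen = nn.Linear(4, 4)
        self.frozen.requires_grad_(False)            # frozen weights get no grad
        self.head = nn.Linear(4, 1)

    def forward(self, ids):
        hidden = self.frozen(self.input_embeddings(ids))
        return self.head(hidden).sum()


toy = Toy()
loss = toy(torch.tensor([1, 2, 3]))
loss.backward()

# Parameters that received no gradient in this step.
missing = params_without_grad(toy)
print(missing)

# The embedding was used in forward() and is trainable, so it does have a grad.
print(toy.input_embeddings.weight.grad.shape)
```

Calling the same kind of helper on the real model right after `loss.backward()` (and before any `optimizer.zero_grad()`) should show whether the input embeddings are the only parameters affected, which would point to either the embeddings being frozen or the loss not depending on them in that step.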