jdennis
The test error is easy to resolve; however, the model has very high error (mAP is below 0.1), while the Caffe version is one of the most accurate models. It seems the conv2_1/1/conv layer is...
The error seems related to using multiple GPUs. When I tried a single GPU (not every GPU ID works: GPU ID 1 is fine, but GPU ID 2 hits the same error as above), training...
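For anyone who wants to repeat the single-GPU check, here is a minimal sketch that launches a command with only one GPU visible. It relies on the standard `CUDA_VISIBLE_DEVICES` environment variable; the helper names `single_gpu_env` and `run_on_gpu` are made up for illustration, and the command in the demo is a placeholder rather than the actual training script from this thread.

```python
import os
import subprocess
import sys


def single_gpu_env(gpu_id, base_env=None):
    """Return a copy of the environment that exposes only one physical GPU."""
    env = dict(os.environ if base_env is None else base_env)
    # CUDA_VISIBLE_DEVICES hides every other device from the CUDA runtime;
    # inside the child process the chosen GPU is renumbered as device 0.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env


def run_on_gpu(command, gpu_id):
    """Run `command` (a list of arguments) with only `gpu_id` visible."""
    return subprocess.run(command, env=single_gpu_env(gpu_id), check=False)


if __name__ == "__main__":
    # Placeholder command: print the variable the child process receives.
    # Swap in the real training command to test each GPU ID in isolation.
    for gpu_id in (1, 2):
        run_on_gpu(
            [sys.executable, "-c",
             "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"],
            gpu_id,
        )
```

Running the real training command once per GPU ID this way should show whether the failure follows a specific card or only appears when several are used together.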
FYI: the COCO models seem to work fine (e.g. coco/res50-15s-800-fpn-cascade is fine; res101 runs out of GPU memory on a 1080 Ti). I suggest switching from the VOC flavor to the COCO flavor.
This seems to be caused by the old tokenizer bundled with decapoda-research/llama-7b-hf; see the details here: https://github.com/huggingface/transformers/issues/22762. I was able to resolve the issue by switching to [huggyllama/llama-7b](https://huggingface.co/huggyllama/llama-7b), which has newer, correct...
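A minimal sketch of the swap, assuming the Hugging Face `transformers` package is installed and the Hub is reachable. `TOKENIZER_REPLACEMENTS`, `resolve_tokenizer_repo`, and `load_tokenizer` are hypothetical names used only for illustration; the one real library call is `AutoTokenizer.from_pretrained`.

```python
# Checkpoints whose bundled tokenizer is known to be outdated, mapped to a
# replacement repo. Only the pair mentioned in this comment is listed.
TOKENIZER_REPLACEMENTS = {
    "decapoda-research/llama-7b-hf": "huggyllama/llama-7b",
}


def resolve_tokenizer_repo(repo_id):
    """Return the repo to load the tokenizer from, substituting known-bad ones."""
    return TOKENIZER_REPLACEMENTS.get(repo_id, repo_id)


def load_tokenizer(repo_id):
    """Load a tokenizer, redirecting outdated repos to their replacement."""
    # Imported lazily so the mapping can be used without transformers installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(resolve_tokenizer_repo(repo_id))
```

Calling `load_tokenizer("decapoda-research/llama-7b-hf")` would then fetch the tokenizer files from huggyllama/llama-7b instead; any other repo ID passes through unchanged.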
I reproduced the 7B training on an Nvidia A10 at AWS a couple of days ago without any error, using the AWS-supplied Ubuntu 20.04 PyTorch 2.0.0 AMI.