recognize-anything
Results from my own run are not as good as in the example image
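Great work! The tagging quality is a big improvement over BLIP. Nice!
For the image on the homepage, the RAM result includes the
lamp and door
tags, as you showed and pointed out, but they are missing from the results I get when I run it myself.
What is causing this?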
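To keep precision high, the demo uses a raised threshold, which sacrifices some recall. The Grounded-SAM pipeline uses a somewhat lower threshold because Grounding DINO acts as a safety net behind it. We are currently fine-tuning the threshold for each class.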
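The trade-off described above can be illustrated with a small sketch. The tag scores below are made-up toy values, not outputs of RAM or Tag2Text, and `select_tags` is a hypothetical helper. The only point is that a tag is kept when its score exceeds the threshold, so a higher threshold drops borderline tags.

```python
def select_tags(scores, threshold):
    """Return the tags whose score is strictly above the threshold."""
    return [tag for tag, score in scores.items() if score > threshold]


# Toy scores (hypothetical, for illustration only).
toy_scores = {"sofa": 0.91, "table": 0.80, "lamp": 0.635, "door": 0.60}

strict = select_tags(toy_scores, 0.68)   # higher threshold: fewer tags
relaxed = select_tags(toy_scores, 0.63)  # slightly lower: one more borderline tag
loose = select_tags(toy_scores, 0.55)    # lower still: all four toy tags

print(strict)
print(relaxed)
print(loose)
```

With these toy numbers, each step down in the threshold admits one more borderline tag, which is the recall gain the lower threshold buys.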
Is it model.threshold, lowered from 0.68 to 0.64? I just changed it, but it did not seem to have any effect. Or is it some other parameter? Thanks.
Hello, I get an error when I run the test command. Have you run into this?
python inference_tag2text.py --image 042.jpg --pretrained tag2text_swin_14m.pth
Error:
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.
I have not run into that error, but I did hit a different one:
Traceback (most recent call last):
File "inference_tag2text.py", line 94, in <module>
res = inference(image, model, args.specified_tags)
File "inference_tag2text.py", line 43, in inference
caption, tag_predict = model.generate(image,
File "/data2/home/tyu/stable_diffusion/promt_gen/Recognize_Anything-Tag2Text/models/tag2text.py", line 364, in generate
torch.sigmoid(logits) > self.class_threshold,
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
I hit this error because I am running on a GPU. It can be fixed by changing the corresponding code at https://github.com/xinyu1205/Recognize_Anything-Tag2Text/blob/ffd1a283caea70ab8436645c0fd0f366ae7de3f8/models/tag2text.py#L364
to
torch.sigmoid(logits) > self.class_threshold.to(image.device),
That is all it takes; it is a minor issue. @Coler1994 @xinyu1205
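As an alternative to calling `.to(image.device)` at the comparison, the threshold tensor can be registered as a buffer so that it follows the module whenever the module is moved. The sketch below is a toy module, not the repository's code: `ToyHead` and its sizes are hypothetical, and only `class_threshold` mirrors the name seen in the traceback.

```python
import torch
import torch.nn as nn


class ToyHead(nn.Module):
    """Hypothetical stand-in for the thresholding step of a tagging model."""

    def __init__(self, num_class=4, threshold=0.68):
        super().__init__()
        # A registered buffer is moved by .to() / .cuda() along with the
        # module's parameters, so it stays on the same device as the inputs.
        self.register_buffer("class_threshold", torch.ones(num_class) * threshold)

    def forward(self, logits):
        return torch.sigmoid(logits) > self.class_threshold


# Use the GPU when one is available; the sketch also runs on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
head = ToyHead().to(device)
logits = torch.tensor([2.0, 0.5, -1.0, 3.0], device=device)
mask = head(logits)
print(mask.tolist())
```

Either approach removes the mismatch; the buffer version simply keeps the device handling in one place.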
It should just be a threshold issue. On my side, lowering it to 0.63 brings out lamp; door needs an even lower value.
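The traceback in this thread shows that `generate` compares the scores against `self.class_threshold`. One possible reason that editing `model.threshold` appeared to have no effect is that the per-class values were already built from it when the model was constructed. This is an assumption, not something confirmed in the thread, and `ToyTagger` below is a hypothetical pure-Python stand-in rather than the repository's class.

```python
class ToyTagger:
    """Hypothetical stand-in for a tagging model; not the repository's class."""

    def __init__(self, num_class, threshold=0.68):
        self.threshold = threshold
        # Assumption: per-class thresholds are materialized once, here.
        self.class_threshold = [threshold] * num_class

    def predict(self, scores):
        # Only class_threshold is consulted, never self.threshold.
        return [s > t for s, t in zip(scores, self.class_threshold)]


model = ToyTagger(num_class=3)
scores = [0.90, 0.65, 0.50]  # toy scores, for illustration only

before = model.predict(scores)
model.threshold = 0.64              # changes the scalar only
stale = model.predict(scores)       # same result as before
model.class_threshold = [0.64] * 3  # update the values actually compared
after = model.predict(scores)

print(before, stale, after)
```

If the real model behaves like this toy, the value to change is the per-class threshold, or the threshold passed in when the model is created.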
Thanks, I found the problem: the model file had not been cloned properly. Thank you for your reply.
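One common way a checkpoint ends up "not cloned properly" is that the repository stores it with Git LFS and only the small text pointer file is fetched. Such a pointer begins with `version https://git-lfs.github.com/spec/v1`, which would be consistent with an `invalid load key, 'v'` error. The thread does not confirm that this was the cause here, so treat the check below as a sketch; `looks_like_lfs_pointer` is a hypothetical helper.

```python
import os
import tempfile

# First line of every Git LFS pointer file.
LFS_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def looks_like_lfs_pointer(path):
    """Return True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as f:
        return f.read(len(LFS_PREFIX)) == LFS_PREFIX


# Self-contained demo using temporary files instead of a real checkpoint.
with tempfile.TemporaryDirectory() as tmp:
    pointer_path = os.path.join(tmp, "pointer.pth")
    binary_path = os.path.join(tmp, "binary.pth")
    with open(pointer_path, "wb") as f:
        f.write(LFS_PREFIX + b"\n")
    with open(binary_path, "wb") as f:
        f.write(b"PK\x03\x04" + b"\x00" * 64)  # zip magic, as in recent torch checkpoints

    pointer_result = looks_like_lfs_pointer(pointer_path)
    binary_result = looks_like_lfs_pointer(binary_path)

print(pointer_result, binary_result)
```

If the check returns True for a downloaded checkpoint, the real weights still need to be fetched, for example with `git lfs pull` or by downloading the file directly.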
Thank you for the very valuable bug report. I have updated the corresponding code.