VLE
VLE: Vision-Language Encoder (a vision-language multimodal pre-trained model)
No gradients during backpropagation
Hello, why are there no gradients when backpropagating through cross_modal_image_layers and cross_modal_text_layers? When inspecting gradients in BERT's modeling_bert, they show up as missing. How can I get the gradients of each layer, e.g. the feature map and gradient of the last layer of cross_modal_image_layers? Thanks.
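The usual reason gradients come back empty is that the forward pass ran under torch.no_grad() (or the inputs had requires_grad=False); when gradients do flow, forward and backward hooks are the standard way to capture a layer's feature map and gradient. A minimal sketch of the hook pattern, where an nn.Linear stands in for cross_modal_image_layers[-1] (the real VLE module structure is assumed here, not verified):

```python
import torch
import torch.nn as nn

captured = {}

def save_activation(module, inputs, output):
    # Fires during forward: stores the layer's output feature map.
    captured["features"] = output

def save_gradient(module, grad_input, grad_output):
    # Fires during backward: stores the gradient w.r.t. the layer's output.
    captured["grad"] = grad_output[0]

# Stand-in for model.cross_modal_image_layers[-1]
block = nn.Linear(4, 4)
block.register_forward_hook(save_activation)
block.register_full_backward_hook(save_gradient)

x = torch.randn(2, 4, requires_grad=True)
loss = block(x).sum()
loss.backward()  # both hooks have fired by now

print(captured["features"].shape)  # feature map of the hooked layer
print(captured["grad"].shape)      # gradient at the hooked layer
```

On the real model you would register the same two hooks on model.cross_modal_image_layers[-1] and read the captured tensors after loss.backward().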
What datasets were used in the pre-training stage? Especially the patch box classification task.
System environment: Ubuntu 20.04, torch 2.0.0+cu118, torchvision 0.15.1+cu118
Command line: run_vqav2_ft.py --train_config_file=vqa_train_config.json
Error description:
/home/steven/anaconda3/envs/nlp/bin/python /home/steven/workstore/nlp/VLE-main/run_vqav2_ft.py --train_config_file=vqa_train_config.json
/home/steven/workstore/nlp/VLE-main/run_vqav2_ft.py:76: SyntaxWarning: "is" with a literal. Did you mean "=="?
max_epochs=_config["max_epoch"] if max_steps...
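The SyntaxWarning points at a comparison of a variable against a literal using `is`, which tests object identity rather than equality and is unreliable for literals. The actual condition on line 76 is truncated in the log, so the `-1` below is only illustrative; a minimal reproduction of the warning and its fix:

```python
max_steps = -1
_config = {"max_epoch": 10}

# Buggy form (triggers SyntaxWarning: "is" with a literal):
#   max_epochs = _config["max_epoch"] if max_steps is -1 else None
# Fixed form: compare values with "==", not identity with "is".
max_epochs = _config["max_epoch"] if max_steps == -1 else None
print(max_epochs)  # -> 10
```

The warning is non-fatal, but since identity checks on literals are implementation-dependent, the comparison should be changed to `==` in run_vqav2_ft.py.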
Can anyone provide a download link for the paper related to this project? I've searched the internet but still haven't found it.
Which specific large language model (LLM) is used in the demo?
from VLE import VLEForITM, VLEProcessor, VLEForITMPipeline
from PIL import Image

model_dir = "./pretrained/vle-base"
itm_text = ["a photo of a cat.", "a photo of dogs."]
itm_images = Image.open("pics/dogs.png")

print("Init ITM model")
...