Youku-mPLUG
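Has anyone successfully trained or fine-tuned this model?
I ran into some problems while following the repository's instructions to run the code and download the model. Has anyone managed to reproduce it?
I would appreciate any help!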
Could you describe the problems you ran into in more detail?
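@auhowielau When fine-tuning, I could not find the code that processes the video dataset. Could you provide the code for fine-tuning mplug-owl on a video dataset? I modified the process_data function in /mPLUG-Owl/mPLUG-Owl/pipeline/data_utils/xgpt3_dataset.py, following MplugOwlProcessor in /mPLUG-Owl/mPLUG-Owl/mplug_owl_video/processing_mplug_owl.py, to change how video data is loaded, but the shape of the input data does not match what is expected. How should the video loading be changed here? This is my modified version: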
def process_data(self, data, processor=None):
    # Process Image if exists
    if 'image' in data and len(data['image']) > 0:
        video_features = []
        for video in data['image']:
            video_frames = load_video(video, num_frames=4)  # hard-coded to 4 frames for now
            if processor:
                video_feature = [processor(image=video_frame, text=None)[0] for video_frame in video_frames]
                video_features.extend(video_feature)
        images = torch.stack(video_features, dim=0)
        images = images.permute(0, 2, 1, 3)
This is the version before my changes:
def process_data(self, data, processor=None):
    # Process Image if exists
    if 'image' in data and len(data['image']) > 0:
        if 'image_data' in data:
            images = data['image_data']
        else:
            image_urls = data['image']
            images = self._load_img(image_urls)
        if processor:
            images = [processor(image=image, text=None)[0]
                      for image in images]
        images = torch.stack(images, dim=0)
    else:
        images = None
I would like to understand how your team handles video data at this step. Any suggestions would be much appreciated. Thank you!
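One possible source of the shape mismatch in the modified version: `extend` flattens the frames of all videos into a single list, so `torch.stack` yields a 4-D tensor, and `permute(0, 2, 1, 3)` then swaps the channel and height axes rather than introducing a time axis. The sketch below keeps the frames of each video together instead. It assumes each processed frame is a `(C, H, W)` tensor and that the video model expects a `(batch, channels, frames, height, width)` layout; that layout is an assumption and should be checked against what the video model's forward pass actually consumes. `stack_video_frames` is a hypothetical helper, not part of the repository.

```python
import torch


def stack_video_frames(per_video_frames):
    """Stack per-frame tensors into one video batch.

    per_video_frames: list of videos, each a list of T tensors shaped (C, H, W).
    Returns a tensor shaped (B, C, T, H, W) (assumed target layout).
    """
    videos = []
    for frames in per_video_frames:
        # Keep the frames of one video together: (T, C, H, W)
        videos.append(torch.stack(frames, dim=0))
    batch = torch.stack(videos, dim=0)      # (B, T, C, H, W)
    return batch.permute(0, 2, 1, 3, 4)     # (B, C, T, H, W)


# Dummy data standing in for processor output: 2 videos, 4 frames each.
dummy = [[torch.zeros(3, 224, 224) for _ in range(4)] for _ in range(2)]
video_batch = stack_video_frames(dummy)
print(video_batch.shape)  # torch.Size([2, 3, 4, 224, 224])
```

Inside `process_data`, this would correspond to appending one stacked tensor per video rather than extending a flat list of frames, and to permuting five axes rather than four.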