Youku-mPLUG

Has anyone successfully trained or fine-tuned this?

Open haidequanbu opened this issue 1 year ago • 2 comments

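I ran into some problems while following the repository's instructions to run the code and download the model. Has anyone managed to reproduce the results?

I would really appreciate an answer from anyone who can help!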

haidequanbu, Feb 05 '24 06:02

Could you describe the problems you ran into in more detail?

auhowielau, Feb 22 '24 03:02

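@auhowielau When fine-tuning, I could not find any code for processing video datasets. Could you provide the code for fine-tuning mplug-owl on a video dataset? I modified the process_data function in /mPLUG-Owl/mPLUG-Owl/pipeline/data_utils/xgpt3_dataset.py, using MplugOwlProcessor in /mPLUG-Owl/mPLUG-Owl/mplug_owl_video/processing_mplug_owl.py as a reference, to change how video data is loaded, but the shape of the input data does not match. How should the video loading be changed here? This is my modified version: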

    def process_data(self, data, processor=None):
        # Process Image if exists
        if 'image' in data and len(data['image']) > 0:
            video_features = []
            for video in data['image']:
                video_frames = load_video(video, num_frames=4) # hard-coded to 4 frames for now
                if processor:
                    video_feature = [processor(image=video_frame, text=None)[0] for video_frame in video_frames]
                    video_features.extend(video_feature)
            images = torch.stack(video_features, dim=0)
            images = images.permute(0, 2, 1, 3)

This is the original version:

    def process_data(self, data, processor=None):
        # Process Image if exists
        if 'image' in data and len(data['image']) > 0:
            if 'image_data' in data:
                images = data['image_data']
            else:
                image_urls = data['image']
                images = self._load_img(image_urls)
            if processor:
                images = [processor(image=image, text=None)[0]
                          for image in images]
                images = torch.stack(images, dim=0)
        else:
            images = None

I would like to understand how your team handles video data at this step. Could you give me some suggestions? Thank you very much.

novaliulan, Jan 06 '25 08:01
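
For reference, here is a minimal sketch of one way to turn per-frame tensors into a batched video tensor. It is not the team's code. It assumes each processor call returns a single frame of shape (C, H, W) and that the model wants (num_videos, C, num_frames, H, W); both assumptions should be checked against the model's forward pass. The helper name stack_video_frames and the dummy inputs are hypothetical. Under the same per-frame assumption, the stacked tensor in the modified snippet is 4-D (frames, C, H, W), so permute(0, 2, 1, 3) swaps the channel and height axes instead of introducing a time axis, which may explain the shape mismatch.

```python
import torch


def stack_video_frames(frame_tensors, num_frames):
    """Group per-frame tensors of shape (C, H, W) into videos.

    Returns a tensor of shape (num_videos, C, num_frames, H, W).
    This target layout is an assumption, not a confirmed model requirement.
    """
    if len(frame_tensors) % num_frames != 0:
        raise ValueError("frame count must be a multiple of num_frames")
    frames = torch.stack(frame_tensors, dim=0)  # (N*T, C, H, W)
    num_videos = frames.shape[0] // num_frames
    # Split the flat frame axis into (video, time).
    videos = frames.reshape(num_videos, num_frames, *frames.shape[1:])  # (N, T, C, H, W)
    # Move channels ahead of time.
    return videos.permute(0, 2, 1, 3, 4).contiguous()  # (N, C, T, H, W)


# Dummy stand-ins for processor output: 2 videos x 4 frames, each (3, 224, 224).
dummy_frames = [torch.full((3, 224, 224), float(i)) for i in range(8)]
batch = stack_video_frames(dummy_frames, num_frames=4)
print(batch.shape)  # torch.Size([2, 3, 4, 224, 224])
```

If the model instead expects (num_videos, num_frames, C, H, W), the final permute can simply be dropped.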