CarlCloudWang

Results 6 comments of CarlCloudWang

Sample some frames per second...

    def load_video(video_file, skip=1):
        from decord import VideoReader
        vr = VideoReader(video_file)
        # Get video frame rate
        fps = vr.get_avg_fps()
        # Calculate frame indices for 1fps
        frame_indices...
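Since the snippet above is cut off, here is a minimal sketch of how the index calculation for "N frames per second" sampling could look. It is not the original comment's code: the helper name `sample_frame_indices` and its parameters are hypothetical, and it works on plain numbers so it can be tried without a video file.

```python
import math


def sample_frame_indices(num_frames, fps, samples_per_second=1.0):
    """Return frame indices that pick roughly `samples_per_second` frames
    for every second of video.

    Hypothetical helper for illustration only. `num_frames` is the total
    frame count and `fps` is the average frame rate of the video.
    """
    if num_frames <= 0 or fps <= 0 or samples_per_second <= 0:
        return []
    # Distance between sampled frames; never less than one frame, so the
    # indices stay strictly increasing even for very dense sampling.
    step = max(fps / samples_per_second, 1.0)
    count = math.ceil(num_frames / step)
    # Clamp to the last valid index as a guard against float rounding.
    return [min(int(i * step), num_frames - 1) for i in range(count)]


# With decord, the inputs would typically come from the reader, e.g.
#   vr = VideoReader(video_file)
#   indices = sample_frame_indices(len(vr), vr.get_avg_fps())
#   frames = vr.get_batch(indices).asnumpy()
```

Computing each index as `int(i * step)` rather than adding a rounded step repeatedly keeps the sampling from drifting on non-integer frame rates such as 29.97.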

Did you use a pretrained model? If so, the loss will certainly be too small to display.

Try taking the model apart and writing your own __init__ to load it. Judging from scorer.py, the model really consists of just three parts:

    tokenizer = AutoTokenizer.from_pretrained(model_path, model_path=model_path, use_fast=False, cache_dir=cache_dir)
    model = MPLUGOwl2LlamaForCausalLM.from_pretrained(model_path, model_path=model_path, local_files_only=True, cache_dir=cache_dir, low_cpu_mem_usage=True, device_map="auto")
    image_processor = CLIPImageProcessor.from_pretrained(model_path)

At the end of Modeling_llama.py, the author cleverly patches in the llama2 functions dynamically, so the weights can be loaded freely.
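The "dynamically loaded" trick mentioned above is a form of monkey-patching: a class on an already imported module is replaced before the model is built, so anything constructed afterwards picks up the custom version. The following is only a sketch of that general pattern, assuming a stand-in module; `library`, `BaseAttention`, `CustomAttention`, and `replace_attention` are hypothetical names, not the actual contents of Modeling_llama.py.

```python
import types

# Stand-in for a third-party library module (hypothetical, for illustration).
library = types.ModuleType("library")


class BaseAttention:
    def forward(self, x):
        return x


class BaseModel:
    def __init__(self):
        # The attention class is looked up on the module at construction
        # time, so replacing the module attribute changes what gets built.
        self.attn = library.Attention()

    def run(self, x):
        return self.attn.forward(x)


library.Attention = BaseAttention
library.Model = BaseModel


class CustomAttention:
    def forward(self, x):
        return x * 2


def replace_attention():
    """Swap the library's attention class for the custom one."""
    library.Attention = CustomAttention


before = library.Model().run(3)  # built with the original class
replace_attention()
after = library.Model().run(3)   # built with the patched class
```

Only objects created after the patch are affected, which is why such a replacement is usually run at import time, before any `from_pretrained` call.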

In my survey, I found that expand2square does not improve the capacity of the whole model... you can use only the preprocessing from CLIPImageProcessor, and that one does bring a real improvement.