Bo He
I think there are some non-deterministic operations in the PyTorch code; it is not an issue with the seed setting.
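For anyone who wants to track this down, here is a minimal sketch (not part of the repository) that seeds everything and forces PyTorch to raise an error on non-deterministic ops so the offending operation can be identified:

```python
import os
import random
import numpy as np
import torch

def enforce_determinism(seed: int = 42):
    # Seed every RNG (seeding alone does not remove all non-determinism)
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

    # Some CUDA ops (e.g. cuBLAS matmuls) need this workspace setting
    # before they can run deterministically
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

    # Use deterministic cuDNN kernels and disable autotuning
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

    # Raise an error whenever an op without a deterministic
    # implementation is hit, which pinpoints the culprit
    torch.use_deterministic_algorithms(True)
```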
Hello, I have updated the demo.ipynb. The model loads the default config from lavis/configs/models/blip2/blip2_instruct_vicuna7b.yaml. If you want to load a finetuned checkpoint, you need to first set load_finetuned=True...
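As a rough sketch (the model name follows the default LAVIS config above; the checkpoint path and the "model" key are assumptions based on the usual LAVIS checkpoint format), loading a finetuned checkpoint on top of the default weights could look like:

```python
import torch
from lavis.models import load_model_and_preprocess

# Build the architecture from the default config
# (lavis/configs/models/blip2/blip2_instruct_vicuna7b.yaml)
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_vicuna_instruct",
    model_type="vicuna7b",
    is_eval=True,
    device="cuda",
)

# Placeholder path; alternatively set load_finetuned=True and point the
# "finetuned" field of the yaml config at this file.
ckpt = torch.load("path/to/finetuned_checkpoint.pth", map_location="cpu")
# LAVIS checkpoints usually store the weights under the "model" key
model.load_state_dict(ckpt["model"], strict=False)
```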
Hello! Currently, the downloading script only supports the MSRVTT and MSVD datasets. To obtain other datasets, please refer to the provided links and download the videos using the official download...
Thanks for pointing out this bug. I fixed this error and updated it in the latest commit. For the query memory bank, you can check the detailed code here https://github.com/boheumd/MA-LMM/blob/main/lavis/models/blip2_models/blip2.py#L166
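If it helps, here is a simplified sketch of the memory-bank compression idea (merge the most similar temporally adjacent pair whenever the bank overflows); the linked blip2.py code is the authoritative implementation:

```python
import torch
import torch.nn.functional as F

def compress_memory_bank(bank: torch.Tensor, max_len: int) -> torch.Tensor:
    """Simplified sketch: bank is a (T, D) tensor of per-frame features.
    While the bank is longer than max_len, average the two temporally
    adjacent entries that are most similar to each other."""
    while bank.size(0) > max_len:
        # Cosine similarity between each entry and its next neighbor: (T-1,)
        sim = F.cosine_similarity(bank[:-1], bank[1:], dim=-1)
        i = int(sim.argmax())                 # most redundant adjacent pair
        merged = (bank[i] + bank[i + 1]) / 2  # merge the pair by averaging
        bank = torch.cat([bank[:i], merged.unsqueeze(0), bank[i + 2:]], dim=0)
    return bank
```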
Hello, I have updated the demo.ipynb in the latest version. You can easily specify the memory_bank_length and num_frames when loading the model. Please note that every time you change the...
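One way to do this (only a sketch: memory_bank_length and num_frames are assumed config keys, and the registered model name is the one implied by the default config path) is to override the model config before instantiating the model:

```python
from omegaconf import OmegaConf
from lavis.common.registry import registry

# Load the default model config and override the assumed keys
cfg = OmegaConf.load("lavis/configs/models/blip2/blip2_instruct_vicuna7b.yaml")
cfg.model.memory_bank_length = 10   # assumed config key
cfg.model.num_frames = 20           # assumed config key

# Build the model from the modified config
model_cls = registry.get_model_class("blip2_vicuna_instruct")
model = model_cls.from_config(cfg.model)
```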
Hello, please check the latest code update. Currently, max_num_frames is set to 120 by default. If you need to test the model on long videos, you need to set...
Hi, did you follow the instructions from https://github.com/salesforce/LAVIS/tree/main/projects/instructblip to download vicuna-7b v1.1 and apply the delta weights to the original LLaMA weights? Or, according to this [issue](https://github.com/salesforce/LAVIS/issues/365#issuecomment-1593017454), you can...
I have not come across the same problem before, but you can follow the instructions [here](https://github.com/lm-sys/FastChat/blob/main/docs/vicuna_weights_version.md) to prepare the correct vicuna-v1.1 weights. You can clone the FastChat repository outside MA-LMM,...
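For reference, applying the v1.1 delta to the original LLaMA weights goes through FastChat's apply_delta entry point; a hedged sketch with placeholder paths (the flags follow the FastChat documentation linked above):

```python
import subprocess

# Apply the vicuna-7b v1.1 delta weights to the original LLaMA-7B weights.
# Both local paths below are placeholders.
subprocess.run(
    [
        "python", "-m", "fastchat.model.apply_delta",
        "--base-model-path", "/path/to/llama-7b",
        "--target-model-path", "/path/to/vicuna-7b-v1.1",
        "--delta-path", "lmsys/vicuna-7b-delta-v1.1",
    ],
    check=True,
)
```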
Hello. You can refer to the following code to calculate the rho and tau results:
https://github.com/e-apostolidis/CA-SUM/blob/main/evaluation/choose_best_model.py#L67
https://gitlab.uni-hannover.de/hussainkanafani/unsupervised-video-summarization/-/blob/master/src/evaluation/BaseEvaluator.py#L7
https://github.com/TIBHannover/MSVA/blob/master/train.py#L430
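The common recipe, roughly what those scripts do, is to compute Spearman's rho and Kendall's tau between the predicted frame-importance scores and each human annotation, then average over annotators. A minimal sketch with SciPy (function and variable names are mine, not from those repositories):

```python
import numpy as np
from scipy import stats

def rank_correlations(pred_scores: np.ndarray, human_scores: np.ndarray):
    """pred_scores:  (num_frames,) predicted importance scores
    human_scores: (num_annotators, num_frames) per-annotator ground truth"""
    rhos, taus = [], []
    for gt in human_scores:
        rho, _ = stats.spearmanr(pred_scores, gt)   # Spearman's rho
        tau, _ = stats.kendalltau(pred_scores, gt)  # Kendall's tau
        rhos.append(rho)
        taus.append(tau)
    # Average the correlations across annotators
    return float(np.mean(rhos)), float(np.mean(taus))
```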
Hi, you can refer to the implementation details section in the main paper. For the SumMe and TVSum datasets, we adopt the pre-trained image captioning model GPT-2 to generate the...