ArrowLuo comments

21 comments by ArrowLuo

Joint loss in pretraining

Hi @zhangliang-04, we use the masked sequences for consistency with the other losses. A more elaborate design for the retrieval task might benefit from a non-masked version; however, we have not...

Run Without Distributed

Hi @Maddy12, what is your error when you run with distributed launch? It can be run on only one GPU. Otherwise, I think you should modify the code to the...

Hi @Maddy12, what is the error? Can you post it here? Alternatively, you can test `x = np.concatenate(tuple(x), axis=0)` as follows:

```
if isinstance(x, list):
    x = np.concatenate(tuple(x), axis=0)
sx = np.sort(-x,...
```
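For context, the quoted snippet appears to come from a retrieval metric routine that ranks candidates from a similarity matrix. The following is a minimal sketch of that kind of computation, not the repository's code. It assumes a square matrix in which `sim[i, j]` scores query `i` against candidate `j` and the matching pair sits on the diagonal. The function name `recall_at_k` and the returned keys are illustrative.

```python
import numpy as np


def recall_at_k(sim, ks=(1, 5, 10)):
    """Recall@k and median rank for a square similarity matrix.

    Assumption: the correct candidate for query i is candidate i.
    """
    # Accept a list of row blocks, mirroring the isinstance check quoted above.
    if isinstance(sim, list):
        sim = np.concatenate(tuple(sim), axis=0)
    sim = np.asarray(sim, dtype=float)
    if sim.ndim != 2 or sim.shape[0] != sim.shape[1]:
        raise ValueError(f"expected a square matrix, got shape {sim.shape}")

    # Zero-based rank of the ground truth: how many candidates score strictly higher.
    diag = np.diag(sim)[:, np.newaxis]
    ranks = (sim > diag).sum(axis=1)

    metrics = {f"R@{k}": float(np.mean(ranks < k)) * 100.0 for k in ks}
    metrics["MedianR"] = float(np.median(ranks)) + 1.0  # reported one-based
    return metrics
```

Passing either a single array or a list of row blocks should give the same result, which is a quick way to check whether a list-typed input is the source of an error.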

MSVD Weights

Hi @ntseng450, sorry for the delayed reply. We had no plan to release the weights. Sorry for that.

how long have you been training

Hi @flyinghpluo, sorry for the delayed reply. Training takes from several hours to more than ten hours, depending on the dataset.

Estimate of zero-shot performance

Hi @bpiyush, sorry for my delayed reply. I am also sorry that we have no results on the zero-shot performance.

How to train and evaluate the model on the Training-7k split?

Hi @tiesanguaixia, I think the question is how to obtain the dataset of MSRVTT. Please find the MSRVTT_train.7k.csv in the msrvtt_data.zip from the [readme](https://github.com/ArrowLuo/CLIP4Clip).

About tightTrans and type embedding

Hi @PanZP-CUC, we did not omit the type embedding; it is initialized directly in the code. Please find it [here](https://github.com/ArrowLuo/CLIP4Clip/blob/master/modules/modeling.py#L326). Best~

run simple inference

Hi @jdso1988,

1. We have not written such a tool, but it is easy to implement by following the `--args.do_eval` branch in main_task_retrieval.py#L577.
2. I think it is the...

train on DiDeMo

Hi @qjyyyy, I suppose that it is a problem with video decoding, but I am unsure of the cause. One suggestion is to scan all videos via `rawvideo_util.py#L25` offline...
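An offline scan of this kind could be sketched as below. This is an assumption-laden illustration rather than the repository's routine: it assumes OpenCV (`cv2`) is the decoder, treats a file as readable if its first frame decodes, and uses hypothetical names (`opencv_can_decode`, `find_undecodable`). Corruption later in a file would not be caught by a first-frame check.

```python
from pathlib import Path


def opencv_can_decode(path):
    """Return True if OpenCV opens the file and reads its first frame."""
    import cv2  # imported lazily so the scan logic works without OpenCV installed

    cap = cv2.VideoCapture(str(path))
    try:
        if not cap.isOpened():
            return False
        ok, _frame = cap.read()
        return bool(ok)
    finally:
        cap.release()


def find_undecodable(video_dir, can_decode=opencv_can_decode,
                     suffixes=(".mp4", ".avi", ".mkv", ".webm")):
    """List video files under video_dir that the probe cannot decode."""
    bad = []
    for path in sorted(Path(video_dir).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        try:
            ok = can_decode(path)
        except Exception:
            # A probe that raises is treated the same as a failed decode.
            ok = False
        if not ok:
            bad.append(path)
    return bad
```

Running `find_undecodable` once over the dataset directory before training gives a list of files to re-download or exclude, instead of discovering them through a crash mid-epoch.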