ArrowLuo

21 comments by ArrowLuo

Hi @zhaoying9105, sorry for the delayed reply. The README says the transcription `needs to be generated by extra ASR tool from speech` (e.g., [Azure service](https://azure.microsoft.com/en-us/services/cognitive-services/speech-to-text/#overview)). We cannot distribute the...

@sen0902, sorry, we have no plans to release the other code and features.

Hi @Davidyao99, I think you should use `python -m torch.distributed.launch --nproc_per_node=1` for 1 GPU instead of `python -m torch.distributed.launch --nproc_per_node=4`. If it still does not work after that, printing more logs here...
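To make the point about `--nproc_per_node` concrete, here is a minimal sketch that builds the launch command with one process per GPU. The helper name `build_launch_command` is hypothetical and not part of UniVL; the script name is only an example. In practice the GPU count could come from `torch.cuda.device_count()`.

```python
import sys


def build_launch_command(num_gpus, script, script_args=()):
    """Build a torch.distributed.launch command with one process per GPU.

    Hypothetical helper for illustration; not part of the UniVL repository.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        sys.executable, "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",  # must match the number of GPUs available
        script,
        *script_args,
    ]


# On a single-GPU machine, launch exactly one process.
cmd = build_launch_command(1, "main_task_retrieval.py")
print(" ".join(cmd[1:]))
```

The resulting list can be passed to `subprocess.run(cmd)`; asking for more processes than there are GPUs is the mismatch the comment above warns about.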

Hi @Davidyao99, what is your whole command?

Hi @jxrloveyou, what is your NumPy version? Version 1.19.5 works.

Hi @onlyonewater, @jxrloveyou, thanks for your attention to this issue. There is indeed a bug at line [#L442](https://github.com/microsoft/UniVL/blob/0a7c07f566a3b220731f4abcaa6e1ee59a686596/main_task_retrieval.py#L442). This line will return a list instead of a NumPy array....
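As a generic illustration of the list-versus-array problem (this is not the repository's actual code or its fix at that line), the sketch below shows how a Python list of per-batch arrays differs from a single NumPy array, and one common way to convert it. The helper `stack_batches` and the example shapes are assumptions made for illustration.

```python
import numpy as np


def stack_batches(batch_results):
    """Join per-batch 2-D arrays into one NumPy array along the first axis.

    Hypothetical helper; assumes every batch has the same number of columns.
    """
    return np.concatenate([np.asarray(b) for b in batch_results], axis=0)


batches = [np.ones((2, 3)), np.zeros((1, 3))]

as_list = batches                  # a plain list: no .shape, no array indexing
as_array = stack_batches(batches)  # a single ndarray of shape (3, 3)

print(type(as_list).__name__, type(as_array).__name__, as_array.shape)
```

Downstream code that expects an array (for example, anything that reads `.shape` or sorts along an axis) works on `as_array` but fails or behaves differently on `as_list`.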

Hi @ShinJQ, I am afraid that I cannot share the file now. The official CSV contains about 1.2M video ids, so you can generate the HowTo100M.csv easily. Best.
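A minimal sketch of generating such a CSV from a list of video ids follows. The column names `video_id` and `feature_file`, the `.npy` suffix, and the helper name `write_feature_csv` are assumptions here; check the layout that the repository's dataloader documentation expects before relying on it.

```python
import csv
import io


def write_feature_csv(video_ids, out, suffix=".npy"):
    """Write one row per video id, pairing it with a feature filename.

    Column names are an assumed layout, not confirmed by the comment above.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["video_id", "feature_file"])
    for vid in video_ids:
        writer.writerow([vid, f"{vid}{suffix}"])


# Placeholder ids; in practice these would be read from the official CSV.
buf = io.StringIO()
write_feature_csv(["vid_a", "vid_b"], buf)
print(buf.getvalue())
```

To write to disk instead, pass a file opened with `open(path, "w", newline="")` as `out`.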

Hi @HuBot2020, either suffix is fine for the feature filename, e.g., '.mp4.npy' or '.npy'. We do no other processing on the extracted features. But I do not know...
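If you need to map feature files back to video ids regardless of which suffix was used, a small helper like the one below may be convenient. It is a hypothetical sketch and not part of the repository.

```python
def video_id_from_feature_name(filename):
    """Strip a '.mp4.npy' or '.npy' suffix to recover the video id.

    Hypothetical helper for illustration only.
    """
    # Check the longer suffix first so 'abc.mp4.npy' does not become 'abc.mp4'.
    for suffix in (".mp4.npy", ".npy"):
        if filename.endswith(suffix):
            return filename[: -len(suffix)]
    raise ValueError(f"unexpected feature filename: {filename!r}")


print(video_id_from_feature_name("vid_a.mp4.npy"))
print(video_id_from_feature_name("vid_a.npy"))
```

Both calls recover the same id, which matches the point that either naming scheme is acceptable.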

Hi @17321010162, please see [here](https://github.com/microsoft/UniVL/blob/main/dataloaders/README.md).

Hi @ting-chih, sorry for the delayed reply. The model still needs both T and V; either one can be masked if you only need to input the other. For example,...