Chuang Lin
[ICLR 2023] PyTorch implementation of VLDet (https://arxiv.org/abs/2211.14843)
clin1223
[ECCV 2022] Multimodal Transformer with Variable-length Memory for Vision-and-Language Navigation