GVT: Good Visual Tokenizer for LLMs

This repo contains assets for our paper "What Makes for Good Visual Tokenizers for Large Language Models?"

Model

We provide related details in gvt.

GVTBench

We provide the Object Counting (OC) and Multi-Class Identification (MCI) tasks on the MS-COCO and VCR datasets in GVTBench.

Acknowledgement

Our work is built on VLMo, LAVIS, EVA, and Vicuna.

Thanks for their great work!

Citation

If you find this work useful, please cite:

@misc{wang2023gvt,
      title={What Makes for Good Visual Tokenizers for Large Language Models?}, 
      author={Guangzhi Wang and Yixiao Ge and Xiaohan Ding and Mohan Kankanhalli and Ying Shan},
      year={2023},
      eprint={2305.12223},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
