wit

Published 20 hours ago

Metadata

WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages.

Readme
Issues

Results: 5 wit issues

Baseline Models in WIT Paper

2

Do you plan to release the baseline models pre-trained on WIT dataset (as mentioned in WIT paper)? Thank you!

iamjanvijay

Change "filering" to "filtering". Add missing comma.

1

Line 52. Change "filering" to "filtering". Add a missing comma to the same sentence.

johnnypeck

release date of murals

1

What is the plan for releasing the code or murals?

LeeRock

Article (or category) list of the dataset

1

Hi, great work. Is there an exhaustive list of the topics/categories covered by the dataset's contents?

shiv6891

Rest text for wikiweb2m

Thanks for your great work on WikiWeb2M. I just downloaded the WikiWeb2M dataset and found that it contains only the first section of each wiki page. I wonder where is the...

tattain404

About

machine-learning

nlp

wikipedia

multilingual

multimodal

cc-by-sa-3

961 Stars

39 Forks

Watchers

Owner: google-research-datasets
