Google Research Datasets

Results: 70 repositories owned by Google Research Datasets

turkish-treebanks

A human-annotated morphosyntactic treebank for Turkish.

google-research-datasets

TyDi QA contains 200k human-annotated question-answer pairs in 11 Typologically Diverse languages, written without seeing the answer and without the use of translation, and is designed for the trainin...

uibert

It includes two datasets used in the downstream tasks for evaluating UIBert: App Similar Element Retrieval data and Visual Item Selection (VIS) data. Both datasets are stored as TFRecords.

uninum

A database of number names for 186 languages, locales, and scripts

Video-Timeline-Tags-ViTT

A collection of videos annotated with timelines where each video is divided into segments, and each segment is labelled with a short free-text description

WebRED

WebRED is a large and diverse manually annotated dataset for extracting relationships from a variety of text found on the World Wide Web.

wiki-links

Automatically exported from code.google.com/p/wiki-links

wiki-reading

266 stars

This repository contains the three WikiReading datasets as used and described in WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia, Hewlett, et al, ACL 2016 (the English Wiki...

swim-ir

SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 languages, generated using PaLM 2 and summarize-then-ask promptin...

Topics: cross-lingual, datasets, deep-learning, information-retrieval

AIS

AIS is an evaluation framework for assessing whether the output of natural language models only contains information about the external world that is verifiable in source documents, or "Attributable t...
