Kyle Lo

Results 17 issues of Kyle Lo

project/data

Exact spec still WIP, but TODOs are basically: 1. Athena query to get Titles & Abstracts from S2AG. Form JSON blob per document of the form: ``` {"text": "...", "paper_id":...

project/data

**Describe the bug** I'm running:
```
nq = ir_datasets.load('natural-questions/dev')
nq_query_id_to_query = {q.query_id: q.text for q in nq.queries_iter()}
```
But it's not only grabbing Dev but also Train
```
[INFO] [starting]...
```

bug

Took a while to debug why `google/t5-v1_1-small` wasn't working even though it's registered in `models/__init__.py`. It's not obvious how the shortened name is particularly beneficial whereas the cost is that...

Rasterizers, DocMetadataExtractors, Parsers, etc. should all emit Documents. We should also have some sort of `Doc.update()` or `merge()` functionality to combine Documents.
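A rough sketch of what a `merge()` could look like, purely for discussion. The `Document` class below is a hypothetical stand-in with a plain `fields` dict, not the library's actual class, and raising on conflicting fields is just one assumed policy (overwriting or keeping the left side would be equally plausible):

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Document:
    """Hypothetical stand-in for the real Document; holds named fields only."""

    fields: Dict[str, Any] = field(default_factory=dict)

    def merge(self, other: "Document") -> "Document":
        """Return a new Document holding the fields of both inputs.

        Assumed policy: a field present in both with different values is an error.
        """
        overlap = self.fields.keys() & other.fields.keys()
        conflicts = sorted(k for k in overlap if self.fields[k] != other.fields[k])
        if conflicts:
            raise ValueError(f"conflicting fields: {conflicts}")
        return Document(fields={**self.fields, **other.fields})


# Each component emits its own Document; merge() combines them.
rasterized = Document(fields={"images": ["page-0.png"]})
parsed = Document(fields={"tokens": ["Hello", "world"]})
combined = rasterized.merge(parsed)
```

Returning a new object rather than mutating in place keeps each component's output reusable; an in-place `update()` could be layered on top if that turns out to be more convenient.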

1. To avoid @soldni type checking horrors, let's add something akin to `.get()`
2. We may still want to represent fields as `None` to separate them from `[]` or `Layer(entities=[])`
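One way the two points could fit together, as a sketch only. The `Document` class and its `_fields` storage are hypothetical, not the real implementation. The assumption here is that `.get()` falls back to a default when a field is unset (`None`) or missing, while an explicitly empty field such as `[]` is returned unchanged:

```python
from typing import Any, Dict, List, Optional


class Document:
    """Hypothetical stand-in; fields are stored as None (unset) or a list."""

    def __init__(self, **fields: Optional[List[Any]]) -> None:
        self._fields: Dict[str, Optional[List[Any]]] = dict(fields)

    def get(self, name: str, default: Any = None) -> Any:
        """Return the field if it is set, otherwise the default.

        An empty list counts as set, so it stays distinguishable from None.
        """
        value = self._fields.get(name)
        return default if value is None else value


doc = Document(tokens=[], images=None)
tokens = doc.get("tokens", default=["fallback"])  # field is set but empty
images = doc.get("images", default=[])            # field is unset, default used
```

Callers that only want to iterate can pass `default=[]`, while callers that care about the difference can compare the result against `None`.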