Sean MacAvaney
Thanks! A few points:
- Can we rename it to `trec-tot/2023/test`? This is more aligned with the typical naming convention.
- Are the `sentence_annotations` of a standard structure? If so,...
Hey @MathVast! Sorry for the delay -- the start of the semester is a busy time. Thanks for opening the issue. This seems doable and like a good addition to the...
Excellent, thanks @heinrichreimer! A while back I requested that they include offset files to facilitate random lookups, and it looks like it made it into the final spec! This will...
We're still in the process of requesting the data here. A sample would indeed be helpful for getting started. @heinrichreimer -- I've got a pretty busy couple of weeks coming...
Awesome, thanks! The most challenging bit is doing lookups, but with the offset file that's included, this should be much easier. Feel free to reach out if you have problems/questions/etc....
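To illustrate why an offset file makes lookups easier, here's a rough sketch of the general idea. Note that this is not the dataset's actual format: the record layout, the in-memory offset list, and the helper names `build_offsets` and `lookup` are all assumptions made up for this example.

```python
import io


def build_offsets(records):
    """Concatenate records; offsets[i] is the start byte of record i.

    A final entry marks the end of the data, so every record has a
    known (start, end) pair. Hypothetical layout, for illustration only.
    """
    data = bytearray()
    offsets = []
    for rec in records:
        offsets.append(len(data))
        data.extend(rec)
    offsets.append(len(data))
    return bytes(data), offsets


def lookup(stream, offsets, index):
    """Read record `index` by seeking straight to its byte range."""
    start, end = offsets[index], offsets[index + 1]
    stream.seek(start)
    return stream.read(end - start)


# Toy records standing in for documents in one large file.
records = [
    b'{"id": "doc0"}\n',
    b'{"id": "doc1", "text": "hello"}\n',
    b'{"id": "doc2"}\n',
]
data, offsets = build_offsets(records)
stream = io.BytesIO(data)

# Jump directly to the second record without scanning the first.
second = lookup(stream, offsets, 1)
```

With offsets available, a lookup is a seek plus a bounded read, rather than a scan through everything that comes before the record.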
Great, thanks. This is aligned with `clueweb09/[lang]`
Good call -- I think that #103 (and [corresponding WIP branch](https://github.com/allenai/ir_datasets/tree/chaining_dlc)) should help with this, e.g., by adding base classes and such. Basically, a streamer is an object that provides...
Awesome! Let's sort out #213 first, then this should be easy.
The current structure actually already supports this! All you'd need to do is build a `RequestsDownload` object that **isn't** wrapped in `Cache`.
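As a rough sketch of the difference between a wrapped and an unwrapped download object: the classes below (`CountingSource`, `InMemoryCache`) are hypothetical stand-ins written for this example, not the library's `RequestsDownload` or `Cache`, whose real signatures and on-disk behavior are not shown here.

```python
import io


class CountingSource:
    """Hypothetical stand-in for a download object; counts each fetch."""

    def __init__(self, payload):
        self.payload = payload
        self.fetches = 0

    def stream(self):
        # Every call simulates going back to the remote source.
        self.fetches += 1
        return io.BytesIO(self.payload)


class InMemoryCache:
    """Hypothetical wrapper: fetches once, then serves the stored copy."""

    def __init__(self, source):
        self.source = source
        self._data = None

    def stream(self):
        if self._data is None:
            with self.source.stream() as f:
                self._data = f.read()
        return io.BytesIO(self._data)


# Unwrapped: each read hits the source again.
uncached = CountingSource(b"payload")
uncached.stream().read()
uncached.stream().read()

# Wrapped: only the first read hits the source.
inner = CountingSource(b"payload")
cached = InMemoryCache(inner)
cached.stream().read()
cached.stream().read()
```

The point is just that the caching behavior lives in the wrapper, so leaving the wrapper off gives a source that is fetched fresh on every read.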
What Python version are you using? I think some of the bdist wheel dependencies are not yet provided for some of the more recent versions of Python.