Request for Access to Wikitables Data Files

Open · contraexemplo-0 opened this issue 3 months ago · 1 comment

Hello,

First of all, I would like to compliment you on the Observatory project — I find your work extremely valuable for research on table representation and evaluation metrics.

I'm currently working on testing the ColBERT v2 model according to the metrics described in your paper, starting with the "Row Order Insignificance" measure. In doing so, I've been adapting the methodology you used with BERT and RoBERTa to my experiments.

However, I encountered a problem: the TURL link that used to contain the files data/entity_vocab.txt and data/test_tables.jsonl derived from Wikitables is no longer available. I was wondering if you still have these files or could provide access to them, as I need them to properly compare my results with the ones reported for other models in your paper.

Additionally, I noticed your Observatory Library and I'm curious if it would be feasible to use it to help test ColBERT v2 with your evaluation metrics. I would greatly appreciate any guidance or suggestions regarding this.

contraexemplo-0, Oct 08 '25 00:10

Hi @contraexemplo-0, thank you for your kind words and sorry about the slow response!

Regarding the TURL data, I verified that the link in their repo README is no longer valid. I also saw that you opened an issue in their repo, and it looks like they are working on a solution.

The Observatory Library is no longer under active development because of other priorities, but you are welcome to contribute to the project. I saw that ColBERT v2 can be accessed through HuggingFace. As long as the model exposes the encoder API, it should work with the existing code for embedding inference.
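To make the "pluggable encoder" idea concrete, here is a rough sketch of a row-order check written against a generic embedding callable. It is not the Observatory code: the names `row_shuffle_similarities` and `toy_embed_columns` are hypothetical, the toy embedder is only a stand-in for a real model such as ColBERT v2, and it assumes the measure compares column embeddings before and after row shuffles using cosine similarity. The exact procedure in the paper may differ.

```python
import math
import random
from typing import Callable, List

Table = List[List[str]]                         # rows of string cells
Embedder = Callable[[Table], List[List[float]]]  # one vector per column


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def row_shuffle_similarities(
    table: Table, embed_columns: Embedder, n_shuffles: int = 5, seed: int = 0
) -> List[List[float]]:
    """For each random row shuffle, compare every column embedding
    with the embedding of the same column in the original table."""
    rng = random.Random(seed)
    base = embed_columns(table)
    results = []
    for _ in range(n_shuffles):
        shuffled = table[:]          # copy so the input is left untouched
        rng.shuffle(shuffled)
        emb = embed_columns(shuffled)
        results.append([cosine(b, e) for b, e in zip(base, emb)])
    return results


def toy_embed_columns(table: Table, dim: int = 8) -> List[List[float]]:
    """Hypothetical stand-in encoder: a character-count vector per column.
    It ignores row order by construction; a real model would not."""
    n_cols = len(table[0]) if table else 0
    vectors = [[0.0] * dim for _ in range(n_cols)]
    for row in table:
        for j, cell in enumerate(row):
            for ch in cell:
                vectors[j][ord(ch) % dim] += 1.0
    return vectors


if __name__ == "__main__":
    demo = [["Paris", "France"], ["Lima", "Peru"], ["Oslo", "Norway"]]
    for sims in row_shuffle_similarities(demo, toy_embed_columns, n_shuffles=3):
        print(sims)
```

To try a real model, one would replace `toy_embed_columns` with a function that serializes the table, runs it through the model's encoder, and pools token embeddings into one vector per column. How that pooling is done is a separate design choice and is not covered by this sketch.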

superctj, Oct 13 '25 22:10