unitxt
Add a reduced version of ClapNQ
Currently, the full ClapNQ dataset contains thousands of documents, which makes it impractical for testing a simple end-to-end RAG flow.
We would like to create a reduced version of the corpus that keeps only the documents whose IDs appear as ground-truth doc IDs in the ClapNQ question-answer data.
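
A minimal sketch of the filtering step, to make the proposal concrete. The field names (`doc_ids` on the question-answer records, `id` on the documents) and the helper names are assumptions for illustration; they are not taken from the ClapNQ schema or from unitxt.

```python
from typing import Any, Dict, Iterable, List, Set


def collect_gold_doc_ids(
    qa_records: Iterable[Dict[str, Any]], id_field: str = "doc_ids"
) -> Set[str]:
    """Gather every ground-truth document ID referenced by the QA records.

    `id_field` is a hypothetical field name; adjust it to the real schema.
    """
    gold: Set[str] = set()
    for record in qa_records:
        # Records without the field contribute nothing.
        gold.update(str(doc_id) for doc_id in record.get(id_field, []))
    return gold


def reduce_corpus(
    documents: Iterable[Dict[str, Any]], gold_ids: Set[str], id_field: str = "id"
) -> List[Dict[str, Any]]:
    """Keep only documents whose ID is in `gold_ids`, preserving corpus order."""
    return [doc for doc in documents if str(doc[id_field]) in gold_ids]


# Toy data standing in for the QA set and the full corpus.
qa_records = [
    {"question": "q1", "doc_ids": ["d1"]},
    {"question": "q2", "doc_ids": ["d3", "d1"]},
]
documents = [
    {"id": "d1", "text": "first passage"},
    {"id": "d2", "text": "never referenced"},
    {"id": "d3", "text": "third passage"},
]

gold_ids = collect_gold_doc_ids(qa_records)
reduced = reduce_corpus(documents, gold_ids)
```

With the toy data above, `reduced` holds only the `d1` and `d3` documents. Every question in the reduced benchmark can still retrieve its ground-truth passage, while unreferenced documents are dropped.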