deep_qa
Tests and fixes to new data api
In summary, this PR does the following few things:
- Moves `sort_by_padding` from the `Dataset` to the `DataGenerator`.
- Adds tests for: `Vocabulary`, `Dataset`, all concrete `Fields`, `Embeddings`, `TokenIndexers`, `DatasetReaders`. (The reader ones could be more comprehensive, probably.)
- Roughly fixes the `DataGenerator` to work with the new `Dataset` class; I haven't fixed the tests for this, because MattG wrote them and they rely on some `FakeInstances`, which I'm not exactly sure how to translate to the new API.
- Adds getter methods to the `Field`s, `Instance` and `Dataset` classes, called things like `fields()` for the `Instance` class, and `tokens()` for the `TextField`, etc. I think it's important to be able to interact with the different classes so that writing functions for mapping predictions back to text etc. is easy.
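As a rough illustration of the getter point, here is a minimal sketch of mapping a predicted token span back to text. The `TextField` and `Instance` classes below are simplified stand-ins written for this example, not the real deep_qa classes; only the getter names `fields()` and `tokens()` come from this PR, and `span_to_text`, the `"passage"` field name, and the constructor signatures are assumptions.

```python
from typing import Dict, List


class TextField:
    """Simplified stand-in for the PR's TextField (assumed shape)."""

    def __init__(self, tokens: List[str]) -> None:
        self._tokens = tokens

    def tokens(self) -> List[str]:
        # Getter named after the one described in this PR.
        return self._tokens


class Instance:
    """Simplified stand-in for the PR's Instance (assumed shape)."""

    def __init__(self, fields: Dict[str, TextField]) -> None:
        self._fields = fields

    def fields(self) -> Dict[str, TextField]:
        # Getter named after the one described in this PR.
        return self._fields


def span_to_text(instance: Instance, field_name: str, start: int, end: int) -> str:
    """Hypothetical helper: turn an inclusive token span back into a string."""
    tokens = instance.fields()[field_name].tokens()
    return " ".join(tokens[start:end + 1])


instance = Instance({"passage": TextField(["the", "cat", "sat", "on", "the", "mat"])})
predicted_text = span_to_text(instance, "passage", 1, 2)
```

With getters like these, prediction code only needs the public accessors rather than the fields' internals.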
This looks great to me, from my somewhat quick look 👍 . I didn't see any big API changes in this, just testing and cleanup, right? It's nice to have usage examples in the test suite, thank you for adding them!