deep_qa

Tests and fixes to new data api

Open DeNeutoy opened this issue 7 years ago • 2 comments

DeNeutoy avatar Jun 20 '17 03:06 DeNeutoy

In summary, this PR does the following:

  • Moves sort_by_padding from the Dataset to the DataGenerator
  • Adds tests for Vocabulary, Dataset, all concrete Fields, Embeddings, TokenIndexers and DatasetReaders (the reader tests could probably be more comprehensive).
  • Roughly fixes the DataGenerator to work with the new Dataset class. I haven't fixed its tests, because MattG wrote them and they rely on some FakeInstances that I'm not sure how to translate to the new API.
  • Adds getter methods to the Fields, Instance and Dataset classes, such as fields() on the Instance class and tokens() on the TextField. I think it's important to be able to interact with the different classes, so that writing functions for things like mapping predictions back to text is easy.
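
To make the last point concrete, here is a rough sketch of how getters like fields() and tokens() could help map a prediction back to text. The classes below are minimal hypothetical stand-ins written for illustration, not the actual deep_qa implementations, and predicted_span_text is an assumed helper name that does not come from the PR.

```python
from typing import Dict, List


class TextField:
    """Hypothetical stand-in for a text field (not the real deep_qa class)."""

    def __init__(self, tokens: List[str]) -> None:
        self._tokens = tokens

    def tokens(self) -> List[str]:
        # Getter: expose the stored tokens to callers.
        return self._tokens


class Instance:
    """Hypothetical stand-in for an instance holding named fields."""

    def __init__(self, fields: Dict[str, TextField]) -> None:
        self._fields = fields

    def fields(self) -> Dict[str, TextField]:
        # Getter: expose the mapping from field name to field.
        return self._fields


def predicted_span_text(instance: Instance, field_name: str,
                        start: int, end: int) -> str:
    """Turn a predicted (start, end) token span, inclusive, back into text."""
    tokens = instance.fields()[field_name].tokens()
    return " ".join(tokens[start:end + 1])


instance = Instance({"passage": TextField(["the", "cat", "sat", "down"])})
span_text = predicted_span_text(instance, "passage", 1, 2)
```

With getters in place, a helper like this only needs the public methods and never has to reach into private attributes.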

DeNeutoy avatar Jun 22 '17 01:06 DeNeutoy

This looks great to me, from my somewhat quick look 👍. I didn't see any big API changes in this, just testing and cleanup, right? It's nice to have usage examples in the test suite, thank you for adding them!

matt-peters avatar Jun 23 '17 18:06 matt-peters