castor
PyTorch deep learning models for text processing
Currently, we're using torchtext 0.2.* -- we should update to the next major version, 0.3.
Need to remove a few characters (like `?`, `!`) from sentences. In other words, add a few relevant delimiters.
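A minimal sketch of the delimiter idea above, assuming a plain regex tokenizer rather than whatever tokenizer Castor actually uses. The names `DELIMITERS` and `tokenize` are hypothetical, and the delimiter set contains only the two characters named in the issue.

```python
import re

# Hypothetical delimiter set: the issue names only `?` and `!`.
DELIMITERS = "?!"

# Split on runs of whitespace and/or the extra delimiter characters.
_SPLIT_RE = re.compile(r"[\s" + re.escape(DELIMITERS) + r"]+")


def tokenize(sentence):
    """Split a sentence on whitespace and the extra delimiters, dropping empty tokens."""
    return [tok for tok in _SPLIT_RE.split(sentence) if tok]


if __name__ == "__main__":
    print(tokenize("Who wrote it? Nobody knows!"))
```

Treating these characters as delimiters removes them from the token stream, so `it?` and `it` map to the same vocabulary entry.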
As of 2018-08-18, the data paths used in Castor/sm_cnn/create_dataset.sh, such as `../../Castor-data/TrecQA`, do NOT match the actual paths in the Castor-data directory. Can you please check?
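One way to check a mismatch like the one reported above is to resolve each relative path against the script's directory and list the ones that do not exist. This is only a sketch: `missing_paths` is a hypothetical helper, not part of Castor, and the caller supplies the paths to check.

```python
import os


def missing_paths(base_dir, relative_paths):
    """Return the relative paths that do not exist when resolved against base_dir."""
    missing = []
    for rel in relative_paths:
        resolved = os.path.normpath(os.path.join(base_dir, rel))
        if not os.path.exists(resolved):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    # The path from the issue, resolved relative to the script's directory.
    print(missing_paths("Castor/sm_cnn", ["../../Castor-data/TrecQA"]))
```

An empty list means every path resolves; any entry returned is a path the script expects but the checkout does not contain.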
I saw that dataset loading is fixed to four datasets (sick, msrvid, trecqa, wikiqa). I would like to know how to train vdpwi with other datasets and, moreover, how to reasonably organize the dataset....
According to @daemon, VDPWI works (https://github.com/castorini/Castor/tree/master/vdpwi), but its effectiveness is still below SOTA because the hyperparameters haven't been tuned yet.
Will do this after #128
SM CNN needs to be refactored to match the API of MP CNN.
Ref #99 `conv_rnn` and `kim_cnn` are both sentence classification models - they should share the same API, and in general be structured the same way. @Impavidity @daemon please coordinate on...
@tuzhucheng Can you check in an MP-CNN model in `Castor-models/` and write up instructions on how to actually use it? I should be able to open a Python shell, copy-and-paste...
Given our code clean-up, it should be fairly straightforward to build a demo IPython notebook for MP-CNN to walk through its features. We can also try https://github.com/szagoruyko/pytorchviz for visualization, e.g., https://github.com/szagoruyko/pytorchviz/blob/master/examples.ipynb