doc2vec
Add CLI to the model
Add CLI support for the following commands:
- Pass a dataset to the model for training
- Pass a dataset to the model for testing given a trained model path
- Pass a dataset to the model, which splits it into training and testing sets, trains on the first and tests on the second
- Pass a single sentence to the model for prediction given a trained model path
Depends on #14
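As a rough illustration of the four commands listed above, here is a minimal sketch using only the standard library's argparse. The subcommand names (train, test, train-test, predict), the --model-path and --test-size flags, and the doc2vec-cli program name are all hypothetical and are not taken from the repository; the actual CLI may look different.

```python
import argparse


def build_parser():
    """Build a parser with one subcommand per requested CLI command.

    All command and flag names below are illustrative assumptions.
    """
    parser = argparse.ArgumentParser(prog="doc2vec-cli")
    sub = parser.add_subparsers(dest="command", required=True)

    # Train on a dataset.
    train = sub.add_parser("train", help="train the model on a dataset")
    train.add_argument("dataset", help="path to the dataset file")

    # Test a dataset against an already trained model.
    test = sub.add_parser("test", help="test a dataset with a trained model")
    test.add_argument("dataset", help="path to the dataset file")
    test.add_argument("--model-path", required=True, help="path to the trained model")

    # Split a dataset, train on one part, test on the other.
    train_test = sub.add_parser("train-test", help="split a dataset, then train and test")
    train_test.add_argument("dataset", help="path to the dataset file")
    train_test.add_argument("--test-size", type=float, default=0.2,
                            help="fraction of rows held out for testing")

    # Predict for a single sentence with an already trained model.
    predict = sub.add_parser("predict", help="predict for a single sentence")
    predict.add_argument("sentence", help="sentence to classify")
    predict.add_argument("--model-path", required=True, help="path to the trained model")

    return parser
```

Calling build_parser().parse_args() and dispatching on args.command would connect each subcommand to the corresponding model function.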
Hi, I am interested in helping out with this issue. I have already made some progress and built the CLI for training, testing, and single-sentence prediction with the Python Fire module. I just wanted to clarify the train/test split requirement. Are you expecting the command to save the complete data into files after the split and load them back for repeatability? Repeatability is already taken care of by the random state parameter, so it would be helpful if you could add some detail on the exact requirement for the train/test split command.
Thank you.
Hi @dheerajgattupalli, thanks for your collaboration. There's no need to save the train/test data into separate files on disk.
So what should that command do?
It would take a dataset path, read it into a pandas dataframe, split it into train/test sets using sklearn's train_test_split method, use the training data to train doc2vec and then the classifier, use the testing data to test the trained classifier, and report back the accuracy metrics.