Add CLI to the model

Open ibrahimsharaf opened this issue 5 years ago • 4 comments

Add CLI support for the following commands:

  • Pass a dataset to the model for training
  • Pass a dataset to the model for testing given a trained model path
  • Pass a dataset to the model which splits it for training then testing
  • Pass a single sentence to the model for prediction given a trained model path
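As a rough illustration of how these four commands could be laid out, here is a minimal sketch using the standard-library argparse module. This is not the project's actual CLI: the program name, the subcommand names (train, test, train-test, predict), and the --model-path and --test-size options are all hypothetical choices made for this example.

```python
import argparse


def build_parser():
    """Build a parser with one subcommand per requested CLI operation.

    All command and option names here are hypothetical, chosen for illustration.
    """
    parser = argparse.ArgumentParser(
        prog="doc2vec-cli",  # hypothetical program name
        description="Train, test, and query a doc2vec-based text classifier.",
    )
    sub = parser.add_subparsers(dest="command", required=True)

    # Pass a dataset to the model for training.
    train = sub.add_parser("train", help="train on a dataset")
    train.add_argument("dataset", help="path to the training dataset")

    # Pass a dataset for testing, given a trained model path.
    test = sub.add_parser("test", help="test a trained model on a dataset")
    test.add_argument("dataset", help="path to the testing dataset")
    test.add_argument("--model-path", required=True, help="path to a trained model")

    # Pass a dataset that is split for training, then testing.
    train_test = sub.add_parser("train-test", help="split a dataset, train, then test")
    train_test.add_argument("dataset", help="path to the full dataset")
    train_test.add_argument("--test-size", type=float, default=0.2,
                            help="fraction of rows held out for testing")

    # Pass a single sentence for prediction, given a trained model path.
    predict = sub.add_parser("predict", help="predict the label of one sentence")
    predict.add_argument("sentence", help="sentence to classify")
    predict.add_argument("--model-path", required=True, help="path to a trained model")

    return parser
```

Calling build_parser().parse_args(["predict", "some text", "--model-path", "model.bin"]) yields a namespace whose command attribute can be used to dispatch to the matching training, testing, or prediction routine.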

Depends on #14

ibrahimsharaf avatar Mar 29 '19 23:03 ibrahimsharaf

Hi, I'm interested in helping out with this issue. I've already made some progress: I built the CLI for training, testing, and single-sentence prediction with the Python Fire module. I just want to clarify the train/test split requirement. Are you expecting the complete data to be saved to files after the split and loaded back for repeatability? That is already taken care of by the random state parameter. Some more detail on the exact requirement for the train/test split command would be helpful.

Thank you.

dheerajgattupalli avatar Oct 20 '19 14:10 dheerajgattupalli

Hi @dheerajgattupalli, thanks for collaborating. There's no need to save the train/test data into separate files on disk.

ibrahimsharaf avatar Oct 20 '19 19:10 ibrahimsharaf

So what should that command do?

dheerajgattupalli avatar Oct 21 '19 00:10 dheerajgattupalli

It would take a dataset path, read it into a pandas DataFrame, split it into train/test sets using sklearn's train_test_split method, use the training data to train doc2vec and then the classifier, use the testing data to test the trained classifier, and report back the accuracy metrics.
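The steps described above could be sketched roughly as follows. This is an illustration under several assumptions, not the project's implementation: the dataset is assumed to be a CSV file with hypothetical "text" and "label" columns, LogisticRegression stands in for whichever classifier the project uses, and the function name and hyperparameters are made up for the example. The heavy dependencies (pandas, gensim, scikit-learn) are imported inside the function so that a CLI module defining it still loads quickly.

```python
def split_train_evaluate(dataset_path, text_col="text", label_col="label",
                         test_size=0.2, random_state=42):
    """Split a dataset, train doc2vec plus a classifier, and return test accuracy.

    Column names, classifier choice, and hyperparameters are assumptions.
    """
    # Imported lazily; these third-party packages must be installed.
    import pandas as pd
    from gensim.models.doc2vec import Doc2Vec, TaggedDocument
    from gensim.utils import simple_preprocess
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    # Read the dataset and split it in memory; nothing is written to disk.
    # A fixed random_state makes the split repeatable.
    df = pd.read_csv(dataset_path)
    train_df, test_df = train_test_split(
        df, test_size=test_size, random_state=random_state
    )

    # Tokenize each document into a list of lowercase tokens.
    train_tokens = [simple_preprocess(str(t)) for t in train_df[text_col]]
    test_tokens = [simple_preprocess(str(t)) for t in test_df[text_col]]

    # Train doc2vec on the training documents only.
    tagged = [TaggedDocument(words=toks, tags=[i])
              for i, toks in enumerate(train_tokens)]
    d2v = Doc2Vec(vector_size=100, min_count=2, epochs=20)
    d2v.build_vocab(tagged)
    d2v.train(tagged, total_examples=d2v.corpus_count, epochs=d2v.epochs)

    # Turn documents into fixed-length vectors for the classifier.
    x_train = [d2v.infer_vector(toks) for toks in train_tokens]
    x_test = [d2v.infer_vector(toks) for toks in test_tokens]

    # Train the classifier on the training vectors, then score the test set.
    clf = LogisticRegression(max_iter=1000)
    clf.fit(x_train, train_df[label_col])
    predictions = clf.predict(x_test)
    return accuracy_score(test_df[label_col], predictions)
```

A CLI command for this case would then only need to call the function with the dataset path and print the returned accuracy. Other metrics, such as those from sklearn.metrics.classification_report, could be reported the same way.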

ibrahimsharaf avatar Oct 21 '19 08:10 ibrahimsharaf