gateplugin-LearningFramework
Simplify DNN dense corpus and interaction with python backend
- see also https://github.com/GateNLP/gate-lf-python-data/issues/15
- Keep the option to have many features but make it easy to have just the simple one-feature approach.
- Store each dense corpus instance as a map, one per line, with standard keys for the label and, optionally, the features
- Unify this with the representation for unlabeled data (e.g. embedding creation or topic models) and for other kinds of supervised/unsupervised tasks, e.g. seq2seq or semantic similarity
- IMPORTANT: change the representation of sequences: instead of having one sequence of elements with multiple features each, have one sequence per feature. This makes it MUCH easier to create batches later.
- Make it easy to switch between our output and the torchnlp library in the python backend
This should become a project possibly with several subissues.
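
To make the proposed format concrete, here is a minimal sketch of what one-map-per-line instances with one sequence per feature could look like. The key names `label` and `features`, and the helper names `to_per_feature`, `instance_to_line` and `pad_batch`, are assumptions made for illustration only; nothing here is the existing file format or backend API.

```python
import json

# Hypothetical standard keys; the actual names are still to be decided.
LABEL_KEY = "label"
FEATURES_KEY = "features"


def to_per_feature(elements):
    """Convert a sequence of elements (one dict of features per element)
    into one sequence per feature."""
    names = list(elements[0].keys()) if elements else []
    return {name: [el[name] for el in elements] for name in names}


def instance_to_line(features, label=None):
    """Serialize one instance as a JSON map on a single line.
    Unlabeled instances simply omit the label key."""
    inst = {FEATURES_KEY: features}
    if label is not None:
        inst[LABEL_KEY] = label
    return json.dumps(inst, sort_keys=True)


def pad_batch(instances, feature, pad=""):
    """Build a padded batch for a single feature: with one sequence per
    feature this is just a lookup plus right-padding to the longest sequence."""
    seqs = [inst[FEATURES_KEY][feature] for inst in instances]
    maxlen = max(len(s) for s in seqs)
    return [s + [pad] * (maxlen - len(s)) for s in seqs]


if __name__ == "__main__":
    elements = [{"token": "The", "pos": "DT"}, {"token": "cat", "pos": "NN"}]
    print(instance_to_line(to_per_feature(elements), label=["O", "O"]))
```

With this layout the simple one-feature case is a map with a single entry under `features`, and labeled and unlabeled data share the same line format.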