How to run GA2M with FAST?
I've been reading the docs and trying to build an example of classification with GA2MLearner using the command-line tools.
I checked my attempt in at https://github.com/dfrankow/mltk/tree/master/examples as train_ga2m.sh. You should be able to check it out and run it. If I can get it to work, I'm happy to contribute it back as an example, as requested in #17.
Several questions:
- How do we generate a sensible pairwise terms file to pass to GA2MLearner instead of including all pairs? I think that would involve the FAST algorithm, but I don't know how to run it.
- Why does Evaluator not produce any output?
- Can I use the command line to run predictions (in this case classification output) on the test set?
@sds-dubois - any suggestions? It looks like you used GA2Ms in #7.
Here is a script that runs GA2M from end to end. You might find mltk.predictor.evaluation.Predictor and mltk.predictor.gam.interaction.FAST useful. I will update the wiki soon.
# Path to the MLTK jar; adjust to your local build.
MLTK=/Users/yin_lou/repos/mltk-github/mltk/target/mltk-0.1.0-SNAPSHOT.jar

# Discretize cal_housing.train.all and write the binned attribute file.
java -Xmx4g -cp "$MLTK" mltk.core.processor.Discretizer \
  -r cal_housing.attr \
  -t cal_housing.train.all \
  -m cal_housing_binned.attr \
  -i cal_housing.train.all \
  -o cal_housing_binned.train.all

# Apply the same binned attribute file to the train, valid, and test splits.
java -Xmx4g -cp "$MLTK" mltk.core.processor.Discretizer \
  -r cal_housing.attr \
  -d cal_housing_binned.attr \
  -i cal_housing.train \
  -o cal_housing_binned.train

java -Xmx4g -cp "$MLTK" mltk.core.processor.Discretizer \
  -r cal_housing.attr \
  -d cal_housing_binned.attr \
  -i cal_housing.valid \
  -o cal_housing_binned.valid

java -Xmx4g -cp "$MLTK" mltk.core.processor.Discretizer \
  -r cal_housing.attr \
  -d cal_housing_binned.attr \
  -i cal_housing.test \
  -o cal_housing_binned.test

# Train a GAM (no pairwise terms yet) and save it as gam.model.
java -Xmx4g -cp "$MLTK" mltk.predictor.gam.GAMLearner \
  -r cal_housing_binned.attr \
  -t cal_housing_binned.train \
  -v cal_housing_binned.valid \
  -m 1000 \
  -l 1 \
  -o gam.model

# Write the residuals of gam.model on train.all (regression task).
java -Xmx4g -cp "$MLTK" mltk.predictor.evaluation.Predictor \
  -r cal_housing_binned.attr \
  -d cal_housing_binned.train.all \
  -m gam.model \
  -g r \
  -R cal_housing_residual.txt

# Run FAST on those residuals to produce the pairwise terms file.
java -Xmx4g -cp "$MLTK" mltk.predictor.gam.interaction.FAST \
  -r cal_housing_binned.attr \
  -d cal_housing_binned.train.all \
  -R cal_housing_residual.txt \
  -o pairs.txt

# Train the GA2M, starting from gam.model and using the pairs in pairs.txt.
java -Xmx4g -cp "$MLTK" mltk.predictor.gam.GA2MLearner \
  -r cal_housing_binned.attr \
  -t cal_housing_binned.train \
  -v cal_housing_binned.valid \
  -I pairs.txt \
  -m 100 \
  -i gam.model \
  -o ga2m.model
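For the third question (running predictions on the test set), something along these lines might work once ga2m.model exists. Treat it as a sketch under assumptions, not a tested recipe: the `-p` flag (assumed here to be the prediction output path) does not appear in the script above, so check it against the usage message that Predictor prints. The helper names `run` and `predict`, the `DRY_RUN` switch, the default jar name, and the output file name are all made up for illustration. By default the snippet only prints the command it would run.

```shell
# Sketch only: -p is an ASSUMED flag; confirm it in Predictor's usage message.
MLTK="${MLTK:-mltk-0.1.0-SNAPSHOT.jar}"   # placeholder path to the MLTK jar
DRY_RUN="${DRY_RUN:-1}"                   # 1 = print the command, 0 = execute it

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper.
# $1 = binned dataset, $2 = model file, $3 = task (r or c), $4 = output path
predict() {
  run java -Xmx4g -cp "$MLTK" mltk.predictor.evaluation.Predictor \
    -r cal_housing_binned.attr \
    -d "$1" \
    -m "$2" \
    -g "$3" \
    -p "$4"
}

# Dry run by default: shows the command for scoring the test split.
predict cal_housing_binned.test ga2m.model r cal_housing_test_predictions.txt
```

Set `DRY_RUN=0` to actually invoke Java. For a classification dataset the task argument would presumably be `c` rather than `r`, matching the `-g` flag used in the residual step above.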