Some ideas about extending the modular GP framework
I plan to do most of the following, in this order:
- Sparse inference (batch update)
- Stochastic inference (online/streaming update)
- GPU related stuff
- Deep GP
- Inference for structured GPs
- EP inference
- Inference for multi-class classification
Work in progress based on the eigen3 backend (ref https://github.com/shogun-toolbox/shogun/issues/2779)
Sparse inference (batch update)
- sparse KL inference for regression (done)
- sparse KL inference for classification (this summer); will implement @emtiyaz's idea
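The low-rank structure that sparse inference exploits can be illustrated with a plain numpy subset-of-regressors (Nyström-style) predictor. This is only a sketch of the idea, not Shogun's API; the names (`rbf_kernel`, `sparse_gp_predict`) and defaults are mine.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0):
    """Squared-exponential kernel matrix between two point sets."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def sparse_gp_predict(X, y, Z, Xstar, noise=0.1, lengthscale=1.0):
    """Predictive mean under the subset-of-regressors approximation
    K ~ Knm Kmm^{-1} Kmn with m inducing points Z.  Cost is O(n m^2)
    per evaluation instead of the O(n^3) of exact GP regression."""
    Knm = rbf_kernel(X, Z, lengthscale)       # n x m cross-covariance
    Kmm = rbf_kernel(Z, Z, lengthscale)       # m x m inducing covariance
    Ksm = rbf_kernel(Xstar, Z, lengthscale)   # test/inducing covariance
    # Posterior mean: Ksm (noise^2 Kmm + Kmn Knm)^{-1} Kmn y
    A = noise ** 2 * Kmm + Knm.T @ Knm
    return Ksm @ np.linalg.solve(A, Knm.T @ y)
```

With the inducing points set equal to the training inputs this reduces to the exact GP mean; with m << n it is the cheap approximation the sparse methods build on.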
Stochastic inference (online/streaming update)
- stochastic KL inference for regression (this summer)
- stochastic KL inference for classification (this summer)
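The minibatch trick behind these stochastic updates can be shown on a toy model. The sketch below takes one SGD step on the negative log posterior of Bayesian linear regression, rescaling the minibatch likelihood gradient by N/|B| so it is an unbiased estimate of the full-data gradient — the same device stochastic variational GP inference uses for its data term. The toy model and all names are mine, not Shogun's API.

```python
import numpy as np

def stochastic_map_update(w, Xb, yb, N, noise=0.1, prior_var=1.0, lr=0.01):
    """One SGD step on the negative log posterior of Bayesian linear
    regression, given a minibatch (Xb, yb) drawn from N data points.
    The minibatch likelihood gradient is rescaled by N / batch_size;
    the prior term is not rescaled, since it appears once."""
    B = Xb.shape[0]
    resid = Xb @ w - yb
    grad_lik = (N / B) * Xb.T @ resid / noise ** 2  # rescaled data term
    grad_prior = w / prior_var                      # prior term
    return w - lr * (grad_lik + grad_prior)
```

Iterating this over random minibatches converges to (nearly) the same solution as a full-batch update, which is what makes online/streaming inference feasible.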
To do (next year)
GPU-related stuff
- GPU speedup for some inference methods (I need some help from @lambday)
Deep GP
- sparse KL inference for dimension reduction (GPLVM) (this summer)
- extending GPLVM (one hidden layer) to deep GP (many hidden layers) with GPU speedup
Inference for structured GPs
- implementing inference methods based on Andrew's work (http://www.cs.cmu.edu/~andrewgw/)
In the future (once I have time)
EP inference for the t distribution (fat-tailed distributions)
- Robust and parallel EP inference for t distribution (ref: GPStuff)
- EP inference for sparse GP (ref: GPML)
Inference for multi-class classification
@karlnapf take a look at this
I would like to add some non-algorithm related points.
- Benchmarks against other toolboxes. We want Shogun to be among the fastest ones, while having the most complete set of algorithms. This is really important.
- Notebooks for all methods, with clear examples that ideally reproduce experiments from the relevant papers. Users get confused if we just offer 20 different inference methods; we need a decision tree for which one to use.
- GPU and linalg stuff should be done on-the-fly. Chances are that most of the code is not touched again in quite a while, so I would rather go a bit slower but make full use of the linalg framework.
@karlnapf I agree with you. Yes, we do need some user guide.
A table on the accuracy and speed of different inference methods will be added.
A table on performance (speed) across the four GP tools (see below) for the same inference method will be added.
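For the speed table, a tiny harness along these lines keeps the measurements comparable across toolboxes (median over repeats, warm-up runs discarded). `benchmark` is a name I made up, not part of any of the toolboxes:

```python
import time
import statistics

def benchmark(fn, n_repeats=5, n_warmup=1):
    """Median wall-clock time of fn() over several repeats.
    Warm-up runs are discarded so caching effects do not skew the
    first measurement."""
    for _ in range(n_warmup):
        fn()
    times = []
    for _ in range(n_repeats):
        t0 = time.perf_counter()
        fn()
        times.append(time.perf_counter() - t0)
    return statistics.median(times)
```

Each toolbox's fit/predict call would then be wrapped in a closure over identical data, so only the inference method differs between timings.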
One question about benchmarks
We will compare the modular GP framework with three other GP tools (GPML, GPy and GPStuff).
Since I want users to be able to reproduce the benchmark results, I may need some blog or framework to share the benchmark Matlab code for GPML and GPStuff.
We can use IPython notebooks to share the Python code and run it on-the-fly.
I am not sure how to share the Matlab code, though. Any suggestions? A gist?
User guide: I suggest a tutorial style notebook, like the one you have, but with way less math and more hands on.
Benchmarks: @lambday did some benchmarking, and I think we could use his ideas to do this in a principled way. Thoughts @lambday ?
@karlnapf
@lambday
For GPU speedup, the first thing I need is the Cholesky decomposition.
@Ialong also needs this for some work he is doing. @yorkerlin, you could also try to add it yourself, or @lambday can do it; he is fast at this stuff :)
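For context on why the Cholesky factorisation is the piece worth porting to the GPU: exact GP regression spends its O(n^3) time factorising K + σ²I, and both the linear solve and the log-determinant fall out of the same factor. A plain numpy sketch (not Shogun code; the function name is mine):

```python
import numpy as np

def gp_log_marginal_likelihood(K, y, noise=0.1):
    """Log marginal likelihood of exact GP regression, computed via a
    single Cholesky factorisation -- the O(n^3) kernel of work that a
    GPU Cholesky would accelerate."""
    n = len(y)
    L = np.linalg.cholesky(K + noise ** 2 * np.eye(n))  # K + sigma^2 I = L L^T
    # alpha = (K + sigma^2 I)^{-1} y via two triangular solves
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # log det(K + sigma^2 I) = 2 * sum(log diag(L))
    return (-0.5 * y @ alpha
            - np.log(np.diag(L)).sum()
            - 0.5 * n * np.log(2 * np.pi))
```

Once the factor L lives on the GPU, the downstream triangular solves and the determinant come essentially for free, which is why most of the inference methods above would benefit from a single fast Cholesky.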