Some ideas about extending the GP modular

Open yorkerlin opened this issue 10 years ago • 10 comments

I will work on most of the following items, in this order.

  • Sparse inference (batch update)
  • Stochastic inference (online/streaming update)
  • GPU related stuff
  • deep GP
  • Inference for structured GPs
  • EP inference
  • Inference for multi-class classification

yorkerlin avatar Jul 08 '15 14:07 yorkerlin

Work in progress, based on the eigen3 backend (ref https://github.com/shogun-toolbox/shogun/issues/2779).

Sparse inference (batch update)

  • sparse KL inference for regression (done)
  • sparse KL inference for classification (this summer); I will implement @emtiyaz's idea

Stochastic inference (online/streaming update)

  • stochastic KL inference for regression (this summer)
  • stochastic KL inference for classification (this summer)
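To make the sparse batch update above concrete, here is a generic numpy sketch of a subset-of-regressors (Nyström-style) sparse GP predictive mean. This is only an illustration of why m inducing points cut the cost from O(n³) to O(nm²) — the function names, RBF kernel, and hyperparameter choices are mine, not Shogun's actual API:

```python
import numpy as np

def rbf(A, B, ell=1.0, sf2=1.0):
    """Squared-exponential kernel between row-stacked point sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return sf2 * np.exp(-0.5 * d2 / ell ** 2)

def sparse_gp_predict(X, y, Z, Xstar, noise=1e-2):
    """Sparse GP predictive mean with m inducing points Z
    (subset-of-regressors style):
        mu* = K*z (sigma^2 Kzz + Kzx Kxz)^{-1} Kzx y
    Only m x m systems are factorised, so the cost is O(n m^2)
    instead of the O(n^3) of exact inference."""
    Kzz = rbf(Z, Z)                  # m x m
    Kzx = rbf(Z, X)                  # m x n
    Ksz = rbf(Xstar, Z)              # n* x m
    A = noise * Kzz + Kzx @ Kzx.T    # m x m system matrix
    L = np.linalg.cholesky(A + 1e-8 * np.eye(len(Z)))  # jitter for stability
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, Kzx @ y))
    return Ksz @ alpha
```

With a smooth 1-D target, a couple of dozen inducing points recover the function almost exactly while never factorising anything larger than m × m.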

yorkerlin avatar Jul 08 '15 14:07 yorkerlin

To do (next year):

GPU related stuff

  • GPU speedup for some inference methods (will need some help from @lambday)

Deep GP

  • sparse KL inference for dimension reduction (GPLVM) (this summer)
  • extending GPLVM (one hidden layer) to deep GP (many hidden layers) with GPU speedup

Inference for structured GPs

  • implementing inference methods based on Andrew Gordon Wilson's work (http://www.cs.cmu.edu/~andrewgw/)

yorkerlin avatar Jul 08 '15 14:07 yorkerlin

In the future (once I have time):

EP inference for the t distribution (fat-tailed likelihoods)

  • Robust and parallel EP inference for t distribution (ref: GPStuff)
  • EP inference for sparse GP (ref: GPML)

Inference for multi-class classification

yorkerlin avatar Jul 08 '15 14:07 yorkerlin

@karlnapf take a look at this

yorkerlin avatar Jul 08 '15 14:07 yorkerlin

I would like to add some non-algorithm related points.

  • Benchmarks against other toolboxes. We want Shogun to be among the fastest toolboxes while having the most complete set of algorithms. This is really important.
  • Notebooks for all methods, with clear examples that ideally reproduce experiments from the relevant papers. Users get confused if we just offer 20 different inference methods; we need some decision tree for choosing which one to use.
  • GPU and linalg stuff should be done on the fly. Chances are that most of the code will not be touched again for quite a while, so I would rather go a bit more slowly but make full use of the linalg framework.

karlnapf avatar Jul 10 '15 09:07 karlnapf

@karlnapf I agree with you. Yes, we do need a user guide.

I will add a table comparing the accuracy and speed of the different inference methods, and another table comparing the speed of the four GP tools (see below) on the same inference method.

One question about benchmarks: we will compare the GP modular with three other GP tools (GPML, GPy and GPStuff).
Since I want users to be able to reproduce the benchmark results, I may need a blog or framework to share the Matlab benchmark code for GPML and GPStuff. We can use IPython notebooks to share the Python code and run it on the fly, but I am not sure how to share Matlab code. Any suggestions? A gist?
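For the Python side of the speed table, a minimal toolbox-agnostic timing harness is enough and can live directly in the shared notebook. This is only a sketch — the `benchmark` helper is a name I made up, and each toolbox's fit call would be wrapped in its own zero-argument function:

```python
import time

def benchmark(fn, repeats=5):
    """Return the best wall-clock time of fn() over several repeats.
    The minimum is more robust to transient background load than the
    mean, so it is a fairer basis for a cross-toolbox speed table."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - t0)
    return best
```

The same data and the same inference method would then be timed per toolbox, e.g. `benchmark(lambda: model.train())`, and the results collected into the comparison table.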

yorkerlin avatar Jul 11 '15 16:07 yorkerlin

User guide: I suggest a tutorial-style notebook, like the one you have, but with far less math and a more hands-on approach.

Benchmarks: @lambday did some benchmarking, and I think we could use his ideas to do this in a principled way. Thoughts @lambday ?

karlnapf avatar Jul 12 '15 15:07 karlnapf

@karlnapf @lambday For GPU speedup, the first thing I need is the Cholesky decomposition.
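To illustrate why the Cholesky decomposition is the natural first linalg primitive to target: in exact GP regression the O(n³) factorisation dominates the cost of computing the posterior, so it is the obvious kernel to offload to the GPU. A generic numpy sketch (the function name is mine, not Shogun's API):

```python
import numpy as np

def gp_posterior_mean(K, y, noise=1e-2):
    """Exact GP regression mean, K (K + sigma^2 I)^{-1} y, computed via
    a Cholesky factorisation and two triangular solves."""
    n = len(y)
    L = np.linalg.cholesky(K + noise * np.eye(n))        # O(n^3) bottleneck
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # two cheaper solves
    return K @ alpha
```

Everything after the factorisation is O(n²), so a GPU Cholesky alone already removes the dominant cost of this inference path.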

yorkerlin avatar Aug 01 '15 11:08 yorkerlin

@Ialong also needs this for some work he is doing. @yorkerlin, you could also try to add it yourself, or @lambday can do it; he is fast at this stuff :)

karlnapf avatar Aug 03 '15 12:08 karlnapf

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Mar 02 '20 14:03 stale[bot]