shogun
Tutorial for new users: Creating your own structured learning model
If a new user wants to quickly evaluate shogun for a problem they have, it's not clear where to start. It would be good to have a small tutorial for the website that shows how to deal with different problems.
It's similar to #2054
One example that a small tutorial could cover:
- explain the structured output API
- create your own model to do multiclass predictions, or multilabel predictions with calibrated label ranking (CLR)
- explain how to set up your own project (outside the shogun source tree) to develop and compile this
- apply the model to some numerical multiclass toy example
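To make the multiclass case concrete, here is a minimal sketch of what "multiclass as structured output" boils down to: a joint feature map that places the input in the block belonging to the class, and an argmax over classes as the decoding step. This is plain C++ and deliberately does not use any shogun classes; the names `joint_score` and `predict_class` are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Score of class y for input x under a linear model.
// Assumed joint feature map: psi(x, y) copies x into the y-th block of a
// vector of length num_classes * dim, so <w, psi(x, y)> = <w_y, x>.
double joint_score(const std::vector<double>& w,
                   const std::vector<double>& x,
                   std::size_t y)
{
    const std::size_t dim = x.size();
    assert(w.size() >= (y + 1) * dim);  // w must contain a block for class y
    double s = 0.0;
    for (std::size_t d = 0; d < dim; ++d)
        s += w[y * dim + d] * x[d];
    return s;
}

// Decoding step: return the class with the highest joint score.
// Ties are resolved in favour of the lowest class index.
std::size_t predict_class(const std::vector<double>& w,
                          const std::vector<double>& x,
                          std::size_t num_classes)
{
    assert(num_classes > 0);
    std::size_t best = 0;
    double best_score = joint_score(w, x, 0);
    for (std::size_t y = 1; y < num_classes; ++y)
    {
        const double s = joint_score(w, x, y);
        if (s > best_score)
        {
            best_score = s;
            best = y;
        }
    }
    return best;
}
```

In a real tutorial the same two pieces (joint feature map and argmax) would live inside the structured model class; the sketch only shows the logic the model has to provide.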
The second part of the tutorial could either be:
- explain how to apply it to hashed string features or how to use another kernel
- explain how to parallelize multiple evaluations using OpenMP (both multiclass and multilabel require multiple evaluations)
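As a rough illustration of the OpenMP point: the per-class evaluations are independent of each other, so the loop over classes can be parallelized with a single pragma. The sketch below is plain C++ without shogun classes; `class_scores` is a hypothetical helper name, and the block layout of `w` (one weight block per class) is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Compute the linear score <w_y, x> for every class y.
// Each iteration writes only to scores[y], so the iterations can run in
// parallel without any locking.
std::vector<double> class_scores(const std::vector<double>& w,
                                 const std::vector<double>& x,
                                 int num_classes)
{
    assert(num_classes > 0);
    const std::size_t dim = x.size();
    assert(w.size() == static_cast<std::size_t>(num_classes) * dim);

    std::vector<double> scores(static_cast<std::size_t>(num_classes), 0.0);

    // A signed loop index keeps the loop valid for older OpenMP versions.
    #pragma omp parallel for
    for (int y = 0; y < num_classes; ++y)
    {
        const std::size_t offset = static_cast<std::size_t>(y) * dim;
        double s = 0.0;
        for (std::size_t d = 0; d < dim; ++d)
            s += w[offset + d] * x[d];
        scores[static_cast<std::size_t>(y)] = s;
    }
    return scores;
}
```

With GCC or Clang the pragma takes effect when compiling with `-fopenmp`; without that flag it is ignored and the loop runs serially with the same result.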
Since we have Label classes for both Multiclass and Multilabels, it should be possible to create quick examples for Multiclass/Multilabel/CLR.
As in #2054, the focus shouldn't be on the machine learning part, but on how to use shogun and get it running. We assume that potential users know ML well and only want a quick intro on how to use SHOGUN.
I am attempting this one. @tklein23 I have a question: what file format should the tutorial use (C++ files or something else)?
Hey @abinashpanda - the task is yours!
A few comments that might answer your question:
- The tutorial should explain how to write a simple SO model on your own
- As markup language I'd suggest GitHub Markdown (.md files), so we don't have to deal with HTML
- The model can only be written in C++, so the language is fixed.
- The example program and the written model should live in a separate directory outside the shogun tree
- If you like, we can create a separate repository for the resulting code and link it from the tutorial (as a shortcut for the impatient ;))
Which structured output problem you choose as an example is up to you. But I recommend choosing something with a simple decoding like (1) multiclass or (2) multilabel.
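For the multilabel option, the decoding used by calibrated label ranking can be sketched in a few lines: every real label and one virtual label collect votes from pairwise comparisons, and the labels that collect strictly more votes than the virtual label are predicted as relevant. The code below is plain C++ without shogun classes; `clr_decode`, the preference-matrix input, and the convention that the last index is the virtual label are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// prefers[i][j] is true when the pairwise comparison favours label i over
// label j. Assumed convention: the last index is the virtual (calibration)
// label. Returns the indices of the real labels predicted as relevant.
std::vector<std::size_t> clr_decode(const std::vector<std::vector<bool>>& prefers)
{
    assert(!prefers.empty());
    const std::size_t n = prefers.size();      // real labels + 1 virtual label
    const std::size_t virtual_label = n - 1;

    // Count the pairwise wins of every label, including the virtual one.
    std::vector<std::size_t> votes(n, 0);
    for (std::size_t i = 0; i < n; ++i)
    {
        assert(prefers[i].size() == n);        // matrix must be square
        for (std::size_t j = 0; j < n; ++j)
            if (i != j && prefers[i][j])
                ++votes[i];
    }

    // A real label is relevant if it beats the virtual label in votes.
    // Ties count as not relevant in this sketch.
    std::vector<std::size_t> relevant;
    for (std::size_t i = 0; i < virtual_label; ++i)
        if (votes[i] > votes[virtual_label])
            relevant.push_back(i);
    return relevant;
}
```

How the pairwise preferences are produced (for example, by one binary classifier per label pair) is left out on purpose; the sketch only covers the decoding step.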
An example tutorial: http://docs.python.org/2/extending/extending.html
Note that you don't need to be as verbose; it's not necessary to explain everything. It's more about showing how to glue the different pieces (Makefile, program, reading inputs, SO model, decoding, SO labels) into a working example.
Just a small remark on what Thoralf said a couple of comments ago. In fact, thanks to the SWIG director classes, it is possible to create a structured output model from Python as well. For an example, see https://github.com/iglesias/linal/blob/master/graph/structure_grid_crf.py#L20.
Thanks for your comment, @iglesias -- I had forgotten that this is possible as well.
@abinashpanda - hope I'm not pushing you too hard, but let me know if you got stuck or need more information.
Hey, is it ok if I take this up as well?
@achintp - I think this task is in progress and is too small to split among more people. Let's discuss on the mailing list which task fits your skills.
@abinashpanda - your last comment was 10 days ago. Are you still working on this issue? Please give a quick status update.
@tklein23 Sorry for the delay on this task. I have been sick for the past 2-3 days, so I have been unable to make progress. I am fine with someone else taking this task if they are interested, as I will not be able to work for the next 3-4 days.
@abinashpanda - thanks for your response. I hope you recover soon, but take the time you need.
@achintp - feel free to start working on this task. If something is missing, please let us know.
Cool!
I'm on it!
@achintp, it seems you're stuck. Do you need assistance on this issue?
Hey @tklein23 ,
Sorry, my finals were going on and I couldn't devote much time to this. Understandably, I missed the GSoC deadline. I still want to complete it though, so I should have it in a couple of days.
Great @achintp! We are looking forward to seeing this!