libdnn
Handle the multi-label task?
Hi Chou,
It seems that libdnn can't handle multi-label tasks. Would you consider adding this feature in the future?
Sounds like a good idea!! I'll add it in a future release. Thanks :)
I'm interested in this too. Have you already worked on this? If not, I forked your repository and am working on it right now.
That's great!! I haven't started yet.
Here's my idea. It's rough and maybe you can help me with this. Originally, the label comes along with the feature. Like this:
12 1:1 2:0.5 10:0.7 ...
where 12 is the label, and the rest, 1:1 2:0.5 10:0.7 ..., represents the feature.
For multi-labeled features, I was thinking about providing a separate label file. Like this:
# feature file
1:1 2:0.5 10:0.7 ...
where 12 is missing and 1:1 2:0.5 10:0.7 ... still represents the feature.
# label file
12 15 17
where the above feature is labeled not only 12, but also 15 and 17.
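A minimal sketch of how the two files proposed above could be read, one line at a time. The type names SparseFeature and LabelSet and both function names are hypothetical, chosen for illustration only; they are not taken from libdnn's code.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration; not libdnn's actual data structures.
using SparseFeature = std::vector<std::pair<std::size_t, double>>;
using LabelSet = std::vector<int>;

// Parse one line of the feature file, e.g. "1:1 2:0.5 10:0.7".
// Each token is "index:value"; tokens without a ':' are skipped.
SparseFeature parseFeatureLine(const std::string& line) {
  SparseFeature feature;
  std::istringstream iss(line);
  std::string token;
  while (iss >> token) {
    const std::size_t colon = token.find(':');
    if (colon == std::string::npos) continue;
    const std::size_t index = std::stoul(token.substr(0, colon));
    const double value = std::stod(token.substr(colon + 1));
    feature.emplace_back(index, value);
  }
  return feature;
}

// Parse one line of the label file, e.g. "12 15 17".
// Reading stops at the first token that is not an integer.
LabelSet parseLabelLine(const std::string& line) {
  LabelSet labels;
  std::istringstream iss(line);
  int label;
  while (iss >> label) labels.push_back(label);
  return labels;
}
```

Line i of the label file would then be paired with line i of the feature file, so both files need the same number of lines.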
Besides, nn-train and class BasicStream should tell the difference between the two formats and ask the user to provide an additional label file if the data is multi-labeled.
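One possible way to tell the two formats apart, sketched under the assumption that features are always written as index:value pairs as in the examples above. The function name hasLeadingLabel is hypothetical, and this is not how nn-train or BasicStream currently work.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns true when the first whitespace-separated token is a bare label
// (contains no ':'), as in "12 1:1 2:0.5". Returns false when the line
// starts with an index:value pair, as in "1:1 2:0.5", or when it is empty.
bool hasLeadingLabel(const std::string& line) {
  std::istringstream iss(line);
  std::string first;
  if (!(iss >> first)) return false;
  return first.find(':') == std::string::npos;
}
```

If the first data line has no leading label, the reader could treat the file as a multi-label feature file and require the separate label file.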
Any suggestion?
That was pretty much what I had in mind. I will work out some stuff (still reading up on your code) and will get back to you on this.
Hi~ @supergrover, @hsiangsky, have you guys started yet?
I've almost finished the multi-label support.
Lots of refactoring in src/data-io.cpp, src/dataset.cpp, include/data-io.h and include/dataset.h.
Guess it's going to be cleaner.
@botonchou working on it now
Because my data IO class sucks, I decided to refactor it first. It's on another, unpublished branch. (It's more readable and cleaner now.)
Most of the functions needed for multi-class classification are done (except for measuring multi-class accuracy). I'm going to publish it in about a week.
Do you want to start from that branch?
Yes, please push. I will play around with it, although for consistency I think you should finish it.
It's on branch feature/multi-label. Thanks :)