Jie Zhang

Results: 119 comments by Jie Zhang

df/dx is needed for backpropagation (BP).
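
For context, a minimal numpy sketch (a hypothetical toy layer, not any code from this repo) of why the backward pass needs the local derivative df/dx: the chain rule multiplies the upstream gradient dL/df by it.

```
import numpy as np

# Hypothetical toy layer: f(x) = x^2, so df/dx = 2x.
def forward(x):
    return x ** 2

def backward(x, grad_top):
    df_dx = 2 * x            # local derivative df/dx
    return grad_top * df_dx  # chain rule: dL/dx = dL/df * df/dx

x = np.array([1.0, -2.0, 3.0])
print(backward(x, np.ones_like(x)))  # [ 2. -4.  6.]
```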

The margin won't change the direction. I think the weight initialization causes this.

@bkj Hi, I'm not the author of the paper. The official implementation released by the author is [wy1iu/LargeMargin_Softmax_Loss](https://github.com/wy1iu/LargeMargin_Softmax_Loss). I personally tried `norm(w) * norm(x) * (cos(theta) / m)` and it plays...
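
For anyone comparing the two, here is a small numpy sketch (my own illustration; the names are assumptions, and the official code additionally uses a piecewise psi(theta) to keep cos(m*theta) monotonic, which is omitted here) contrasting the paper's `cos(m*theta)` margin with the `cos(theta)/m` variant above:

```
import numpy as np

def margin_logits(w, x, m):
    # w, x: weight and feature vectors; m: the margin parameter
    nw, nx = np.linalg.norm(w), np.linalg.norm(x)
    cos_t = np.dot(w, x) / (nw * nx)
    theta = np.arccos(np.clip(cos_t, -1.0, 1.0))
    paper = nw * nx * np.cos(m * theta)  # norm(w) * norm(x) * cos(m * theta)
    tried = nw * nx * cos_t / m          # norm(w) * norm(x) * (cos(theta) / m)
    return paper, tried

w = np.array([1.0, 0.5])
x = np.array([0.6, 0.8])
print(margin_logits(w, x, m=4))
```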

You wrote a wrong layer type. It should be:

```
layer {
  name: "loss"
  type: "OrdinalRegressionLoss"
  ordinal_regression_loss_param {
    k: 4
  }
  bottom: "fc8bisi"
  bottom: "label"
  top: "loss"
}
```

There are two kinds of weight. `inter weight` represents the importance of each task, and `outer weight` balances the training samples of each task. In this layer implementation, `inter weight`...
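
As a rough sketch of how the two kinds of weight could combine (my interpretation only; the names and shapes are assumptions, not the layer's actual code):

```
import numpy as np

def combined_loss(task_losses, inter_w, outer_w):
    # task_losses[t]: per-sample losses for task t, shape (N,)
    # inter_w[t]:     scalar importance of task t
    # outer_w[t]:     per-sample weights balancing the samples of task t
    total = 0.0
    for t, losses in enumerate(task_losses):
        total += inter_w[t] * np.sum(outer_w[t] * losses) / np.sum(outer_w[t])
    return total

task_losses = [np.array([0.2, 0.8]), np.array([1.0, 0.5])]
inter_w = [1.0, 0.5]
outer_w = [np.array([1.0, 1.0]), np.array([2.0, 1.0])]
print(combined_loss(task_losses, inter_w, outer_w))
```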

Something like below. This layer won't give the final output label (which is always an integer label for each sample) and can only be used in training. You should implement yourself...

This layer is just for testing the accuracy. You can remove it and still train the network.

The training details can be found in [kongsicong/Age_recognition_OR](https://github.com/kongsicong/Age_recognition_OR).

If classifiers C0 ~ Ci all give label 1, then the output label is i; we predict ages in [0, 99]. It's different from the original paper. I recommend you can...
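
A minimal sketch of that decoding rule (my own illustration, not code from the repo; the fallback to 0 when no classifier fires is an assumption):

```
import numpy as np

def decode_label(probs_pos, threshold=0.5):
    # probs_pos: shape (K,), probability that classifier Ck outputs label 1
    label = 0  # assumption: default to 0 if even C0 outputs 0
    for i, fired in enumerate(probs_pos > threshold):
        if not fired:  # stop at the first classifier that outputs 0
            break
        label = i      # C0 ~ Ci all gave label 1, so the output label is i
    return label

probs = np.array([0.9, 0.8, 0.7, 0.2, 0.6])  # C0..C4
print(decode_label(probs))                   # -> 2
```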

I don't know many details of the training. Maybe @kongsicong can help with this.