Rupesh K Srivastava
This requires converting models from Caffe's NCHW layout to Brainstorm's NHWC layout, so it's not straightforward, but it should still be possible.
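The axis shuffle itself is cheap; here's a minimal NumPy sketch, assuming the target layout on the NHWC side is `(out_channels, kH, kW, in_channels)` (a hypothetical choice — the layout Brainstorm's convolution actually expects should be verified):

```python
import numpy as np

# Caffe stores convolution filters as (out_channels, in_channels, kH, kW).
w_caffe = np.random.randn(64, 3, 5, 5)   # example Caffe weight blob

# Move the channel axis to the end to match an NHWC-style filter layout.
w_nhwc = w_caffe.transpose(0, 2, 3, 1)
assert w_nhwc.shape == (64, 5, 5, 3)
```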
Cool, looking forward to it! NHWC layout makes things like this a bit trickier, but we think it's the better format for the long run. Plus, cuDNN v4 will fully...
Does Keras also use NHWC? We'd like to have a more general approach (full DAG). It's fine to start by handling the simpler cases, with extensibility in mind. Brainstorm also works...
Good point. You're right, the layer implementation actually computes half of the squared difference (and the gradients accordingly), so `SquaredDifference` is a misnomer. We should do something about this. (CC...
The backward pass implementation could simply multiply the deltas by 2, so the gradient check would work fine. Edit: I meant to say, sure, we'll have to modify the backward pass,...
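To make the factor of 2 concrete: for the true squared difference e = (y − t)², we get de/dy = 2(y − t), whereas the current halved version e = ½(y − t)² gives de/dy = (y − t). A quick standalone numeric check (not Brainstorm code):

```python
import numpy as np

y, t = np.array([1.5]), np.array([1.0])
eps = 1e-6

f = lambda y: (y - t) ** 2                        # true squared difference
analytic = 2 * (y - t)                            # d/dy (y - t)^2 = 2(y - t)
numeric = (f(y + eps) - f(y - eps)) / (2 * eps)   # central finite difference
assert np.allclose(analytic, numeric)
```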
Here's a plan for this issue. We'll change the `SquaredDifference` layer so that it computes the correct squared difference. We'll add a new layer (let's call it `SquaredLoss`, subject to change)...
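Roughly, the split would compute the following (a plain-NumPy sketch of the intended math; the function names and signatures are placeholders, not Brainstorm's layer API):

```python
import numpy as np

def squared_difference_forward(x1, x2):
    # Corrected SquaredDifference: the actual squared difference
    return (x1 - x2) ** 2

def squared_difference_backward(x1, x2, out_deltas):
    # d/dx1 (x1 - x2)^2 = 2(x1 - x2); x2 receives the negated gradient
    d = 2 * (x1 - x2) * out_deltas
    return d, -d

def squared_loss_forward(y, t):
    # New loss layer keeps the conventional 1/2 factor,
    # so its gradient w.r.t. y is simply (y - t)
    return 0.5 * (y - t) ** 2

def squared_loss_backward(y, t):
    return y - t
```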
I've made the above change in a private branch. I've named the new layer `SquaredLoss`, but perhaps `SquaredError` or something else would be better? (Caffe calls it `EuclideanLoss`.)
I agree, Euclidean loss is not a name commonly used in the NN literature. `MSE` is. `CE` is a good suffix, but it's also not commonly used in a regression context. A...
I now realize that `EuclideanDistance` would clearly not be a correct name either.
:D Good point. However, I think that `SquaredError` is probably the best name for the new layer, even though it halves the error. The `Error` suffix can act as a...