Sander Dieleman
> But what if you stack two of those things on top of each other (with some convolutions in between) and then end in a dense layer? It's wasteful to...
> If the TransformerLayer is not the first thing in your pipeline, then you don't necessarily know the input shape, and you don't necessarily have a particular output shape in...
Regarding b), I guess it's better to support it, since it doesn't really add any overhead, does it? Nor does it make the code any more difficult to understand (or...
I see. In that case maybe we shouldn't support it. Maybe let's see if anyone can come up with a plausible use case within the next two days or so...
This looks great! If anyone wants to throw it into a PR, that would be very welcome :)
That's a good point. However, people are much more likely to change the nonlinearity of the layer than the initialization strategy, so I'm not sure if that would be a...
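To illustrate the point above, here is a minimal sketch (not Lasagne's actual implementation; the class and method names are hypothetical) of a layer where the nonlinearity is an easily swappable constructor keyword, while the initialization strategy stays an internal default that users rarely touch:

```python
import math

def rectify(x):
    # ReLU nonlinearity, the typical default
    return max(0.0, x)

class DenseLayer:
    """Hypothetical sketch: nonlinearity is a per-layer keyword,
    while the weight initialization strategy is a fixed default."""

    def __init__(self, num_units, nonlinearity=rectify):
        self.num_units = num_units
        self.nonlinearity = nonlinearity  # easy to swap per layer
        # initialization strategy is hard-coded here; changing it
        # would require subclassing or a separate mechanism
        self.W = [0.0] * num_units

    def get_output_for(self, x):
        return [self.nonlinearity(x + w) for w in self.W]

# swapping the nonlinearity is a one-keyword change
tanh_layer = DenseLayer(3, nonlinearity=math.tanh)
```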
I like `wrt`, it's consistent with Theano itself.
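For context, `wrt` ("with respect to") is the keyword Theano uses in `theano.grad(cost, wrt=...)` to name the variables to differentiate against. A small pure-Python sketch (hypothetical helper, not Theano's implementation) showing the naming convention with a central-difference gradient:

```python
def grad(cost_fn, wrt, eps=1e-6):
    """Numerical gradient of cost_fn at the point `wrt`.

    The keyword name `wrt` mirrors Theano's
    theano.grad(cost, wrt=...) convention; this helper itself
    is just an illustrative finite-difference sketch.
    """
    return [
        (cost_fn(wrt[:i] + [w + eps] + wrt[i + 1:])
         - cost_fn(wrt[:i] + [w - eps] + wrt[i + 1:])) / (2 * eps)
        for i, w in enumerate(wrt)
    ]

# gradient of sum of squares is 2*x per coordinate
g = grad(lambda p: sum(x * x for x in p), wrt=[1.0, -2.0])
```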
I've only glanced over the proposal so far, but it looks good to me. It complicates the code quite a bit, unfortunately, but I think the use cases for this...
> Maybe that's really something we shouldn't care too much about. If we decide not to worry about it, that would mean we are free to rename `Layer`, right? Or...
Right, makes sense. Bummer :)