David Warde-Farley
This should really conditionally sum over axis=1 if it exists. ~~Or maybe not at all, e.g. BinaryCrossEntropy doesn't.~~ https://github.com/mila-udem/blocks/blob/master/blocks/bricks/cost.py#L27
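For illustration, a minimal sketch of the kind of conditional reduction meant here (this is not the code at the linked line, just a stand-in cost):

```python
import theano.tensor as tensor

def per_example_cost(y, y_hat):
    # Stand-in element-wise cost; the actual expression doesn't matter here.
    elementwise = tensor.sqr(y - y_hat)
    # Only reduce over axis=1 when that axis exists; a cost that is already
    # one value per example (cf. the BinaryCrossEntropy remark) is left alone.
    if elementwise.ndim > 1:
        return elementwise.sum(axis=1)
    return elementwise
```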
@bartvm pointed out that it can be very helpful to have a trace of what is pushing initialization/allocation config where. The best way to do this is to just add...
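Something along these lines, purely as a sketch; `trace_config_push` and `_config_push_trace` are hypothetical names, not blocks API:

```python
import functools
import traceback

def trace_config_push(push_method):
    """Wrap a push_*_config method so each call records where it came from."""
    @functools.wraps(push_method)
    def wrapper(self, *args, **kwargs):
        # Hypothetical attribute: keep the call stack of the most recent push
        # so a surprising child configuration can be traced to its source.
        self._config_push_trace = ''.join(traceback.format_stack())
        return push_method(self, *args, **kwargs)
    return wrapper
```

During debugging one could wrap `push_initialization_config` / `push_allocation_config` with this and inspect the stored trace on the misconfigured brick.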
Just noticed this. Related to #1053, actually.
Of using aggregation schemes, etc., and an explanation of the default behaviour (is it just to drop everything but the most recently aggregated value on the floor? seems like it...
It used to, I think. Requiring an explicit file handle is more like pickle, but I don't think that's a great thing to emulate... numpy.load will accept either a str...
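For reference, `numpy.load` happily takes either form (the filename here is just for illustration):

```python
import numpy

numpy.save('weights.npy', numpy.arange(3))

a = numpy.load('weights.npy')         # path given as a str
with open('weights.npy', 'rb') as f:
    b = numpy.load(f)                 # or an already-open file handle
assert (a == b).all()
```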
From @cooijmanstim: > I tried implementing it [[recurrent batch normalization](https://arxiv.org/abs/1603.09025)] in blocks at some point, but it became a mess because the batch statistics are in the inner graph so...
It would be nice if `Parallel` bricks (and subclasses) would play a bit nicer with `Convolutional` bricks (and subclasses) as their prototype (as well as things like `Pooling` bricks). These...
We should really have this. One issue is that you need to be able to a) inspect the Brick that the parameter belongs to (not a problem with annotations) and...
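Purely illustrative sketch of (a); it assumes the owning brick is reachable from the parameter's annotations on `tag`, and the exact attribute layout is an assumption rather than confirmed blocks API:

```python
def owning_brick(parameter):
    # Hypothetical helper: look through the variable's annotations (assumed
    # to live on parameter.tag.annotations) for something brick-shaped.
    annotations = getattr(getattr(parameter, 'tag', None), 'annotations', [])
    bricks = [a for a in annotations if hasattr(a, 'children')]
    return bricks[0] if bricks else None
```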
It seems like we should

- extend the Feedforward interface to include shape _tuples_ like convolutions
- allow for programmatically determining whether a Feedforward `Brick` can compute its own `output_dim`...
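A rough sketch of the kind of interface this could mean; all the names here (`ShapedFeedforward`, `output_shape`, `can_infer_output_shape`) are hypothetical, not existing blocks API:

```python
class ShapedFeedforward(object):
    """Feedforward-like brick whose input/output dims may be shape tuples."""

    @property
    def input_shape(self):
        raise NotImplementedError

    @property
    def output_shape(self):
        # Bricks that can work out their own output shape (e.g. from kernel
        # size, stride and pooling) override this; others leave it to be set.
        raise NotImplementedError


def can_infer_output_shape(brick):
    """Programmatically check whether a brick can compute its own output shape."""
    try:
        brick.output_shape  # accessing the property is enough to probe it
    except NotImplementedError:
        return False
    return True
```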