All buffers have to be of the same dtype
Currently, all buffers (parameters, internals, gradients, ...) are assumed to have the same dtype (typically either float or double). This is a bit restrictive. For example, in a max-pooling operation, one would like to store which cell in the current window has the maximum value (as discussed in #29). Something similar comes up in a Maxout layer, or when implementing a Top-K autoencoder. I can work around this for the max-pooling op, but in general it would be nice to be able to specify an optional dtype for each ShapeTemplate.
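To make the max-pooling case concrete, here is a minimal NumPy sketch (not the library's actual layer code; the function names `max_pool_1d_forward` and `max_pool_1d_backward` are made up for illustration). It assumes non-overlapping 1D windows. The forward pass needs an integer buffer for the argmax positions, and the backward pass uses it to route gradients:

```python
import numpy as np


def max_pool_1d_forward(x, window):
    """Non-overlapping 1D max pooling (hypothetical helper).

    Returns the pooled values and an integer buffer holding, for each
    window, the position of the maximum inside that window.
    """
    n = x.shape[0] // window
    windows = x[:n * window].reshape(n, window)
    # This is the buffer that cannot sensibly share the float dtype.
    argmax = windows.argmax(axis=1).astype(np.int32)
    out = windows[np.arange(n), argmax]
    return out, argmax


def max_pool_1d_backward(out_grad, argmax, window):
    """Route each output gradient back to the cell that held the maximum."""
    n = out_grad.shape[0]
    in_grad = np.zeros((n, window), dtype=out_grad.dtype)
    in_grad[np.arange(n), argmax] = out_grad
    return in_grad.reshape(n * window)


x = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 0.0], dtype=np.float32)
out, argmax = max_pool_1d_forward(x, window=2)
in_grad = max_pool_1d_backward(np.ones_like(out), argmax, window=2)
```

Storing the indices in a float buffer would work as a workaround, but it costs a cast on every backward pass and hides the intent.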
Indeed. This is essentially because each layer has a handler, and each handler currently supports only one dtype. That is more for simplicity than anything else: it means we don't need separate kernels for int and float inputs, for example. In the future we can transition to a default_dtype for each handler along with other supported dtypes.
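As a rough sketch of what a default dtype plus other supported dtypes could look like (purely hypothetical: `SketchHandler`, `supported_dtypes`, and `allocate` are illustrative names, not the existing handler API), assuming a NumPy-backed handler:

```python
import numpy as np


class SketchHandler:
    """Hypothetical handler with one default dtype and extra supported dtypes."""

    def __init__(self, default_dtype=np.float32, supported_dtypes=(np.int32,)):
        self.default_dtype = np.dtype(default_dtype)
        # The default dtype is always supported.
        self.supported_dtypes = {self.default_dtype} | {
            np.dtype(d) for d in supported_dtypes
        }

    def allocate(self, shape, dtype=None):
        """Allocate a zeroed buffer, falling back to the default dtype."""
        dtype = self.default_dtype if dtype is None else np.dtype(dtype)
        if dtype not in self.supported_dtypes:
            raise TypeError("dtype %s is not supported by this handler" % dtype)
        return np.zeros(shape, dtype=dtype)


handler = SketchHandler()
weights = handler.allocate((3, 4))                   # uses the default dtype
argmax_buf = handler.allocate((3,), dtype=np.int32)  # explicitly requested dtype
```

Buffers that do not request a dtype behave exactly as they do now, so existing layers would not need to change.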
Let's set this aside as one of the first TODO items after 0.5? This will require some changes in buffer allocation, and probably several additions to PyCudaHandler, but no major reorganization.
Yes! Let's definitely not tackle that before the release.