flatten_parameters automatically for multi-GPU RNNs
I'm not entirely sure whether this is intentional, but perhaps adding a flatten_parameters() call to the underlying PyTorch RNN would be handy? Right now, multi-GPU RNNs throw the warning below, and I'm not sure what the impact on performance is. #2294 references this; the author worked around it by calling flatten_parameters() manually.
```
RuntimeWarning: RNN module weights are not part of single contiguous chunk of memory. This means they need to be compacted at every call, possibly greatly increasing memory usage. To compact weights again call flatten_parameters().
```
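For reference, a minimal sketch of the manual workaround in plain PyTorch (FlattenedLSTM is a hypothetical wrapper for illustration, not AllenNLP's API): re-flattening the weights at the start of forward() re-compacts them into a single contiguous chunk on each replica, which silences the warning.

```python
import torch

class FlattenedLSTM(torch.nn.Module):
    """Hypothetical wrapper: re-flattens LSTM weights on every forward pass."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.lstm = torch.nn.LSTM(input_size, hidden_size, batch_first=True)

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        # After DataParallel replicates the module onto each GPU, the weight
        # tensors may no longer be contiguous, so re-flatten before running.
        # (This is a no-op on CPU.)
        self.lstm.flatten_parameters()
        output, _ = self.lstm(inputs)
        return output

if __name__ == "__main__":
    model = FlattenedLSTM(input_size=8, hidden_size=16)
    if torch.cuda.device_count() > 1:
        model = torch.nn.DataParallel(model.cuda())
    device = next(model.parameters()).device
    out = model(torch.randn(4, 10, 8, device=device))
    print(out.shape)  # torch.Size([4, 10, 16])
```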
We hope to get to this soon, but it's not an explicit priority right now. A PR would be welcome.