
inference time

Open ak9250 opened this issue 4 years ago • 1 comment

Are there ways to reduce inference time? It currently takes about 13 minutes on a K80 for the Norway example at 432x288.

ak9250 avatar Jan 12 '21 00:01 ak9250

Hi. Yes, one way is to cache the already computed attention activations while generating a sequence. See for example https://huggingface.co/transformers/quickstart.html#using-the-past. Note that this is not currently implemented for our models, as we wanted to stick to the very hackable minGPT implementation, but it would definitely be worth looking into.

rromb avatar Feb 12 '21 09:02 rromb
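
For illustration only, here is a minimal PyTorch sketch of the key/value caching idea the comment above refers to, written in the style of a minGPT-like causal self-attention layer. This is not code from taming-transformers or minGPT: the class name `CachedCausalSelfAttention`, its layer layout, and the `cache` argument are hypothetical, and a real integration would also need to thread the cache through the transformer blocks and the sampling loop.

```python
import torch
import torch.nn.functional as F
from torch import nn


class CachedCausalSelfAttention(nn.Module):
    """Hypothetical causal self-attention layer with a key/value cache.

    During autoregressive sampling, only the newly sampled token's query has
    to attend over the keys/values of the prefix, so we return the projected
    keys/values and reuse them on the next step instead of re-running the
    whole prefix through the layer.
    """

    def __init__(self, n_embd, n_head):
        super().__init__()
        assert n_embd % n_head == 0
        self.n_head = n_head
        self.key = nn.Linear(n_embd, n_embd)
        self.query = nn.Linear(n_embd, n_embd)
        self.value = nn.Linear(n_embd, n_embd)
        self.proj = nn.Linear(n_embd, n_embd)

    def forward(self, x, cache=None):
        B, T, C = x.shape
        hs = C // self.n_head
        # project and split heads -> (B, n_head, T, head_size)
        k = self.key(x).view(B, T, self.n_head, hs).transpose(1, 2)
        q = self.query(x).view(B, T, self.n_head, hs).transpose(1, 2)
        v = self.value(x).view(B, T, self.n_head, hs).transpose(1, 2)

        if cache is not None:
            # prepend the cached keys/values along the time axis
            past_k, past_v = cache
            k = torch.cat([past_k, k], dim=2)
            v = torch.cat([past_v, v], dim=2)
        new_cache = (k, v)

        past_len = k.size(2) - T
        att = (q @ k.transpose(-2, -1)) / (hs ** 0.5)
        # causal mask: query position i may attend to keys 0 .. past_len + i
        mask = torch.tril(
            torch.ones(T, k.size(2), device=x.device), diagonal=past_len
        ).bool()
        att = att.masked_fill(~mask, float("-inf"))
        att = F.softmax(att, dim=-1)
        y = (att @ v).transpose(1, 2).contiguous().view(B, T, C)
        return self.proj(y), new_cache
```

A sampling loop would then run the full prompt through the layer once and afterwards feed only one new token per step, passing the cache back in each time:

```python
attn = CachedCausalSelfAttention(n_embd=64, n_head=4)
x = torch.randn(1, 10, 64)            # embedded prompt of 10 tokens (dummy data)
out, cache = attn(x)                  # one full pass over the prompt
for _ in range(20):
    next_x = torch.randn(1, 1, 64)    # embedding of the newly sampled token
    out, cache = attn(next_x, cache)  # attends over cached prefix, no recomputation
```

This trades memory for speed: per step, attention cost grows linearly with the prefix length instead of quadratically, which is where most of the sampling time goes for long token sequences like the 432x288 example.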