xihui-wu
I can reproduce the decoding failure with [OpenAI GPT-2](https://github.com/openai/gpt-2/blob/master/src/encoder.py#L105); however, instead of throwing an error it returns a replacement character. `byte_pair_encoder.get_encoder('117M', 'models').decode([129])` # 129 is the index of "Å" in the...
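For context, a minimal sketch of why a replacement character comes back instead of an error (assuming the linked `encoder.py` decodes the recovered bytes as UTF-8 with `errors='replace'`, which is its default): a byte that can't complete a valid UTF-8 sequence is substituted with U+FFFD rather than raising.

```python
# Sketch of the tail end of decode() in the linked encoder.py, which does
# bytearray(...).decode('utf-8', errors=self.errors) with errors='replace'.
# A lone UTF-8 lead byte such as 0xC5 is invalid on its own, so decoding
# yields the replacement character U+FFFD instead of raising an exception.
raw = bytes([0xC5])
print(raw.decode('utf-8', errors='replace'))  # prints the replacement character
```

Passing `errors='strict'` to the encoder instead would surface a `UnicodeDecodeError` at the same spot.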
When I tested on a 16GB GPU VM, I did see that during roughly the last two-thirds of the epochs (10 epochs in total), peak memory usage reached 9187MB...
I just verified again on a new 16GB GPU DLVM instance created today; the issue persists.
@xanderdunn thanks for your post! TrainingLoop is still being iterated on to cover more and more use cases. To answer your first question: in a supervised-learning scenario, generally...
Thanks for correcting the link. You are welcome to try whether making it into a struct, together with some changes in TrainingLoop, works. Again, our current TrainingLoop implementation isn't in...