xihui-wu

Results: 5 comments by xihui-wu

I can reproduce the decoding failure with [OpenAI GPT2](https://github.com/openai/gpt-2/blob/master/src/encoder.py#L105); however, instead of throwing an error, it returns a replacement character. `byte_pair_encoder.get_encoder('117M', 'models').decode([129])` # 129 is the index of "Å" in the...
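The difference between "throws an error" and "returns a replacement character" can be illustrated with plain Python byte decoding. This is only a sketch of the general behavior, not the GPT-2 encoder itself: the byte value `0xC5` and the helper names `decode_lenient` and `decode_strict` are assumptions chosen for illustration, on the premise that a single token can map to a byte that is not valid UTF-8 on its own.

```python
# Illustration only: a lone UTF-8 lead byte with no continuation byte.
# 0xC5 is an assumed example value, not taken from the original comment.
lone_byte = bytes([0xC5])


def decode_lenient(data: bytes) -> str:
    """Decode UTF-8, substituting U+FFFD for any invalid sequence."""
    return data.decode("utf-8", errors="replace")


def decode_strict(data: bytes) -> str:
    """Decode UTF-8, raising UnicodeDecodeError on any invalid sequence."""
    return data.decode("utf-8", errors="strict")


# Lenient decoding silently yields the replacement character.
replaced = decode_lenient(lone_byte)

# Strict decoding of the same bytes raises instead.
try:
    decode_strict(lone_byte)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

print(repr(replaced), strict_failed)
```

Whether a decoder should fail loudly or substitute a placeholder is a design choice: lenient decoding keeps generation running when a token sequence ends mid-character, while strict decoding surfaces the problem immediately.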

When I tested on a 16GB GPU VM, I saw that during roughly the last two-thirds of the epochs (10 epochs in total), a peak memory usage of 9187MB occurs...

I just verified again on a new 16GB GPU DLVM instance created today; the issue persists.

@xanderdunn thanks for your post! TrainingLoop is still being iterated on to cover more and more use cases. To answer your first question, in the supervised learning scenario, generally...

Thanks for correcting the link. You are welcome to try whether making it into a struct, together with some changes in TrainingLoop, works. Again, our current TrainingLoop implementation isn't in...