Jonathan Donnelly
Code to train this model 'from scratch' using data-parallelism across multiple GPUs: https://github.com/jonnykira/openai_reproduction
Hi @gitathrun. Are you using Python 2 or 3?
Cool, that should work then. For Python 2 you would also have to convert the UTF-8 string to a bytearray object within preprocess(). Out of curiosity, have you successfully trained...
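In case it helps, here is a rough sketch of the bytearray conversion I mean. The function below is a hypothetical stand-in, not the actual preprocess() from the repo, and it assumes the model just wants a list of byte values (0-255) as input.

```python
def preprocess(text):
    """Hypothetical stand-in for preprocess(): text -> list of byte values."""
    # Python 3 str (or Python 2 unicode) has to be encoded to UTF-8 bytes first.
    # A Python 2 str is already a byte string, so it passes through unchanged.
    if isinstance(text, bytes):
        data = text
    else:
        data = text.encode("utf-8")
    # Iterating a bytearray yields ints on both Python 2 and Python 3,
    # whereas iterating a Python 2 str would yield 1-character strings.
    return list(bytearray(data))


ascii_ids = preprocess("hi")
accent_ids = preprocess(u"\u00e9")  # "e" with acute accent, two bytes in UTF-8
```

Going through bytearray is what makes the same code behave identically on both Python versions.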
Hello, here is code to train a [multiplicative LSTM language model in Tensorflow](https://github.com/jonnykira/Tensorflow_mLSTM). Hope it works! Please feel free to leave feedback!
Hello, sorry for the delayed response. I have achieved pretty good performance using a normal distribution for the initial weights. Here is a link to my [Tensorflow Implementation](https://github.com/jonnykira/Tensorflow_mLSTM)
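To make "normal distribution for the initial weights" concrete, here is a minimal sketch in plain Python. The standard deviation of 0.02, the 64x64 shape, and the helper name init_normal are assumptions for illustration only, not the values or code used in the linked implementation.

```python
import random


def init_normal(rows, cols, stddev=0.02, seed=0):
    """Return a rows x cols weight matrix drawn from N(0, stddev^2).

    stddev and seed are illustrative assumptions, not values from the repo.
    """
    rng = random.Random(seed)  # local generator so the draw is reproducible
    return [[rng.gauss(0.0, stddev) for _ in range(cols)] for _ in range(rows)]


W = init_normal(64, 64)
flat = [w for row in W for w in row]
sample_mean = sum(flat) / len(flat)  # should sit close to 0
```

In TensorFlow itself the equivalent would be to pass something like tf.random_normal_initializer to the weight variables, with a stddev you would probably want to tune.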
Hello @jozi! Thank you for pointing this out. I have updated the extract_weights.py script to work with the train_mLSTM.py script as it is now. The Wmb variable was redundant...
Hello @athon-millane! Thank you for pointing out this typo! And yes, it should be relatively straightforward to initialize the variables in the training script with the pre-trained numpy...
Hello, sorry for the late reply! I have added a script called extract_weights.py to generate the .npy files in the format you want. All you need to do is pass...
To find the hidden neuron I would try to recreate figure 3 from the paper [Learning to Generate Reviews and Discovering Sentiment](https://arxiv.org/abs/1704.01444) by feeding in the positive and negative IMDB...
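One possible way to narrow down the candidate unit, sketched below under some assumptions: you have already run the model over labelled reviews and kept one final hidden-state vector per review, and "the" neuron is the unit whose values separate the two classes most. The function name most_separating_unit and the scoring rule (difference of class means divided by the overall spread) are my own illustration, not something taken from the paper or the repo.

```python
from statistics import mean, pstdev


def most_separating_unit(pos_states, neg_states):
    """Return (unit index, score) for the unit that best separates the classes.

    pos_states / neg_states: lists of hidden-state vectors (lists of floats),
    one per positive / negative review. Hypothetical helper for illustration.
    """
    n_units = len(pos_states[0])
    best_unit, best_score = None, -1.0
    for j in range(n_units):
        pos = [h[j] for h in pos_states]
        neg = [h[j] for h in neg_states]
        # Guard against a constant unit, which would give zero spread.
        spread = pstdev(pos + neg) or 1e-8
        score = abs(mean(pos) - mean(neg)) / spread
        if score > best_score:
            best_unit, best_score = j, score
    return best_unit, best_score


# Toy hidden states with 3 units; unit 1 flips sign with the label.
pos_states = [[0.10, 0.9, 0.00], [0.20, 0.8, 0.10]]
neg_states = [[0.15, -0.9, 0.05], [0.10, -0.8, 0.00]]
unit, score = most_separating_unit(pos_states, neg_states)
```

Once you have a candidate index, plotting histograms of that unit's value for the positive and negative reviews should show whether the two classes really fall into separate modes.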