char-rnn.pytorch issues

More modular approach and some novelties in params and training

- It's possible to add validation data and monitor overfitting by means of early stopping, as well as to save the model at its best checkpoints.
- It's possible to launch...

zutotonno

More training = much worse results?

1

I'm impressed with how you can get reasonable results with this after just a few minutes. However I tried running it overnight with 3 layers and 20'000 iterations and while...

xjcl

using lstm with cuda gives error

2

Command ``` python train.py runoja.txt --n_epochs 5000 --model lstm --cuda ``` gives ``` Traceback (most recent call last): File "train_.py", line 95, in loss = train(*random_training_set(args.chunk_len, args.batch_size)) File "train_.py", line...

htoyryla

Fix some bugs.

1

1. Fix out-of-array bug in training.
2. Allow generation on a different architecture.
3. Fix LSTM.

vnikme

Implementation change question

What's the difference between applying the loss function after each cell of the RNN vs. applying it to the entire sequence? In your example you feed only 1 character at a...

AlexandruGhiurutan

Tensor Size Mismatch During Training

3

While training at a seemingly random point, it fails with this error (both lstm and gru): ``` Traceback (most recent call last): File "train.py", line 98, in loss = train(*random_training_set(args.chunk_len,...

odhinnsrunes

The epoch argument is wrong

1

The `--n_epochs` argument is in fact an iteration count: each iteration updates the weights based on a random sample of data points. There is no conclusive definition, but the community generally defines...

wlnirvana
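
To make the iteration/epoch distinction above concrete, here is a minimal sketch that converts between the two. It assumes the common convention that one epoch corresponds to drawing about as many characters as the training file contains; the function names and example numbers are hypothetical and not part of the repository. Because chunks are sampled at random, an "epoch" in this sense does not guarantee that every character is actually seen.

```python
import math


def iterations_per_epoch(n_chars: int, chunk_len: int, batch_size: int) -> int:
    """Iterations needed to draw roughly n_chars characters in total.

    Assumption: each iteration consumes chunk_len * batch_size characters.
    """
    chars_per_iteration = chunk_len * batch_size
    return max(1, math.ceil(n_chars / chars_per_iteration))


def epoch_equivalents(n_iterations: int, n_chars: int,
                      chunk_len: int, batch_size: int) -> float:
    """How many passes over the file a given iteration count amounts to."""
    return n_iterations * chunk_len * batch_size / n_chars


if __name__ == "__main__":
    # Hypothetical numbers: a 1,000,000-character file, chunks of 200, batches of 100.
    print(iterations_per_epoch(1_000_000, 200, 100))
    print(epoch_equivalents(2000, 1_000_000, 200, 100))
```

Under these assumptions, a value passed to `--n_epochs` can be divided by `iterations_per_epoch(...)` to estimate how many passes over the data it represents.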

Refactor to allow for non-ASCII data

This modifies the `read_file` method to extract the character vocabulary from the input file, which is then passed as an argument to the `generate` and `char_tensor` methods. This also modifies...

codeman38
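
The idea of deriving the vocabulary from the input file can be sketched as follows. This is an illustration only: the helper names `build_vocab`, `encode`, and `decode` are hypothetical, the actual signatures of `read_file`, `generate`, and `char_tensor` in the pull request are not shown here, and plain Python lists stand in for tensors.

```python
def build_vocab(text: str):
    """Return the sorted list of distinct characters and a char -> index map."""
    chars = sorted(set(text))
    char_to_index = {ch: i for i, ch in enumerate(chars)}
    return chars, char_to_index


def encode(text: str, char_to_index: dict) -> list:
    """Map each character to its vocabulary index (KeyError if unseen)."""
    return [char_to_index[ch] for ch in text]


def decode(indices: list, chars: list) -> str:
    """Inverse of encode: map indices back to characters."""
    return "".join(chars[i] for i in indices)


if __name__ == "__main__":
    # Non-ASCII characters are handled like any other character.
    chars, char_to_index = build_vocab("héllo wörld")
    ids = encode("héllo", char_to_index)
    print(len(chars), ids, decode(ids, chars))
```

Since the vocabulary depends on the training file, the same character list would need to be stored with the model so that generation uses identical indices.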

Ak/fix train feeding

1

1. Fix index out of range error.
2. About 8x (for me) training speed-up by feeding whole sample sequence through CuDNN, not char-by-char.
3. Inference: GPU memory requirement not growing...

AlexeyKruglov
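
The preview above does not say what caused the index error. One common way such an error arises when sampling random training chunks is an off-by-one in the start index, so the sketch below shows a bounds-safe sampler under that assumption. The function `random_chunk` is hypothetical and works on plain strings rather than tensors; it is not the code from the pull request.

```python
import random


def random_chunk(text: str, chunk_len: int, rng=random):
    """Pick a random (input, target) pair of length chunk_len each.

    The target is the input shifted by one character, so chunk_len + 1
    characters are needed in total.
    """
    needed = chunk_len + 1
    if len(text) < needed:
        raise ValueError("text is shorter than chunk_len + 1")
    # randint is inclusive on both ends, so the last valid start is
    # len(text) - needed; anything larger would yield a short chunk.
    start = rng.randint(0, len(text) - needed)
    chunk = text[start:start + needed]
    return chunk[:-1], chunk[1:]


if __name__ == "__main__":
    inputs, targets = random_chunk("abcdefghij", 4)
    print(inputs, targets)
```

With this bound, every sampled pair has exactly `chunk_len` inputs and `chunk_len` targets, so per-position indexing cannot run past the end of the chunk.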

Using trained model in online manner for arbitrary length output.

1

Right now it seems like generate.py uses a lot of CUDA memory during inference. For example, I trained a small 2-layer GRU network with a hidden size of 150 on the Shakespeare...

filmo
