misssprite comments

Results: 4 comments by misssprite

How to use new snapshotting?

Net params in the snapshot function of SolverWrapper are first unnormalized, saved, and then restored to the normalized version. So the param version depends on when the snapshot in Caffe is called...
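The unnormalize, save, restore order described in this comment can be sketched roughly as follows. This is a minimal illustration and not the SolverWrapper code itself: a plain dict of NumPy arrays stands in for the Caffe net, and `snapshot_unnormalized`, `save_fn`, `means`, and `stds` are hypothetical names chosen for the example.

```python
import numpy as np

def snapshot_unnormalized(params, means, stds, save_fn):
    """Save un-normalized params, then put the normalized ones back.

    Hypothetical stand-in: `params` is a dict with "weights" (out x in)
    and "bias" (out,), not a real Caffe net.
    """
    # Keep copies of the normalized parameters that training uses.
    orig_w = params["weights"].copy()
    orig_b = params["bias"].copy()

    # Un-normalize, assuming targets were trained as (t - mean) / std.
    params["weights"] = orig_w * stds[:, np.newaxis]
    params["bias"] = orig_b * stds + means

    # The snapshot captures whatever `params` holds at this moment.
    save_fn({name: value.copy() for name, value in params.items()})

    # Restore the normalized version so training continues unchanged.
    params["weights"] = orig_w
    params["bias"] = orig_b
```

The point of the sketch is the ordering: anything that saves the net between the un-normalize and restore steps gets the un-normalized values, and anything that saves outside that window gets the normalized ones.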

Set Learning Rate

@Suyuanhang The default optimization algorithm used in config.lua is [`adadelta`](https://github.com/torch/optim/blob/master/doc/algos.md#optim.adadelta), which adjusts the learning rate automatically. You can try other algorithms like sgd with a learning rate parameter. Adadelta is...
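To make the difference concrete, here is a small NumPy sketch of the two update rules. It is written in Python for illustration and is not the Lua `optim` code; the function names, the `rho` and `eps` values, and the toy objective f(x) = x**2 are all assumptions made for the example.

```python
import numpy as np

def adadelta_step(x, grad, state, rho=0.9, eps=1e-6):
    """One Adadelta update; note that there is no learning rate argument."""
    # Running average of squared gradients.
    state["eg2"] = rho * state["eg2"] + (1 - rho) * grad ** 2
    # Step size is the ratio of RMS(previous updates) to RMS(gradients).
    dx = -np.sqrt(state["edx2"] + eps) / np.sqrt(state["eg2"] + eps) * grad
    # Running average of squared updates.
    state["edx2"] = rho * state["edx2"] + (1 - rho) * dx ** 2
    return x + dx

def sgd_step(x, grad, lr):
    """Plain SGD update; the step size is set explicitly through `lr`."""
    return x - lr * grad

def run(step, x0=5.0, n=200):
    """Minimize the toy objective f(x) = x**2 for n steps."""
    x = np.array(x0)
    for _ in range(n):
        x = step(x, 2 * x)  # gradient of x**2 is 2x
    return float(x)
```

A possible usage: `run(lambda x, g: sgd_step(x, g, lr=0.1))` versus `run(lambda x, g: adadelta_step(x, g, state))` with `state = {"eg2": 0.0, "edx2": 0.0}`. Only the SGD call exposes a learning rate to tune.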

The image of long width has a bad result, the short one does not

@Duum Why does the length of label have a negative effect on the precision of best path decoding? Is there any literature talking about this?
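For reference, best path decoding is usually understood as follows, assuming a CTC-style output with a blank symbol: take the most likely label at each frame, collapse consecutive repeats, then drop blanks. The sketch below is illustrative only; `best_path_decode` and the choice of `blank=0` are assumptions, not taken from the project under discussion.

```python
import numpy as np

def best_path_decode(frame_scores, blank=0):
    """Greedy (best path) decoding of a (frames x classes) score matrix."""
    # Most likely class at each frame.
    path = np.argmax(np.asarray(frame_scores), axis=1)

    decoded = []
    prev = None
    for label in path:
        # Keep a label only when it starts a new run and is not the blank.
        if label != prev and label != blank:
            decoded.append(int(label))
        prev = label
    return decoded
```

Because each output label needs at least one frame of its own, and repeated labels need a blank frame between them, longer labels leave less slack per label for a fixed number of frames.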

How to train models with attn_type=2 on wiki103 training set?

@HyacinthJingjing, besides the data iterator, for the pytorch code, set `ext_len=0, tgt_len=1, mem_len=`. You'll need a function to calculate logits (inheriting `ProjectedAdaptiveLogSoftmax` and calling `_compute_logit` is convenient). By the way, the `sample_softmax` option seems...
