wlhgtc

27 comments by wlhgtc

@jojonki Thanks for your reply. I debugged my code the whole day, testing my model layer by layer (I commented out the backward step and optimizer step). The "out of...
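A minimal sketch of that layer-by-layer check, assuming a PyTorch model on a CUDA device; the toy `nn.Sequential` model and tensor sizes here are stand-ins, not the model from the thread:

```python
import torch
import torch.nn as nn

# Toy stand-in model; the real model in the thread is much larger.
model = nn.Sequential(
    nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 64)
).cuda()
x = torch.randn(32, 128).cuda()

# Forward only, layer by layer; the backward and optimizer steps are
# commented out, so any steady growth here comes from the forward pass.
for name, layer in model.named_children():
    x = layer(x)
    print(f"layer {name}: {torch.cuda.memory_allocated() / 1024**2:.1f} MiB")
# loss.backward()   # intentionally disabled while isolating the leak
# optimizer.step()
```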

@Vimos There are some sentences with length > 500. You'd better truncate them to a fixed length (300 worked for me).
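A small sketch of what that truncation looks like; the cutoff of 300 is the value suggested above, and the helper name is just for illustration:

```python
MAX_LEN = 300  # fixed length suggested above; tune per dataset

def truncate(tokens, max_len=MAX_LEN):
    """Clip a tokenized sentence to at most max_len tokens."""
    return tokens[:max_len]

sentences = [["tok"] * 750, ["tok"] * 120]  # one sentence longer than 500
fixed = [truncate(s) for s in sentences]
print([len(s) for s in fixed])  # [300, 120]
```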

If the memory usage stays steady, it's fine; it's a large model.

@schmmd I compared allennlp with Open-NMT; it seems we use the same code for generation in both train and validation. This makes it hard to use a multi-layer decoder. Any idea about solving it...

@dirkgr @epwalsh Sorry to bother you, but could you suggest a way to **construct a 2-layer LSTM decoder** in allennlp-models? I tried but failed.
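For reference, a plain-PyTorch sketch of a stacked decoder step (this is not AllenNLP's `DecoderNet` API, just the underlying idea): `nn.LSTM` with `num_layers=2` carries a `(2, batch, hidden)` state per step, which is exactly the state a step-wise decoder has to thread through decoding.

```python
import torch
import torch.nn as nn

class TwoLayerLstmDecoder(nn.Module):
    def __init__(self, embed_dim, hidden_dim, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=2, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def step(self, prev_token, state):
        # prev_token: (batch,) token ids
        # state: ((2, batch, H), (2, batch, H)) -- one slice per layer
        emb = self.embed(prev_token).unsqueeze(1)   # (batch, 1, E)
        output, state = self.lstm(emb, state)       # carries both layers' state
        return self.out(output.squeeze(1)), state   # (batch, vocab), new state

decoder = TwoLayerLstmDecoder(embed_dim=32, hidden_dim=64, vocab_size=100)
h0 = torch.zeros(2, 4, 64)                          # e.g. tiled encoder state
state = (h0, h0.clone())
logits, state = decoder.step(torch.zeros(4, dtype=torch.long), state)
print(logits.shape)  # torch.Size([4, 100])
```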

Hi there, I added 2 new PRs for this feature, one for allennlp-models [#90](https://github.com/allenai/allennlp-models/pull/90) and the other for allennlp #4462. The changes in allennlp-models are easy. But I am...

@choosewhatulike With this kind of pretraining preprocessing, almost every punctuation mark is followed by `[SEP]`, but downstream task data is not processed this way. Following the earlier discussion, has the impact of this inconsistency simply not been evaluated, or is it assumed to be small?
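A toy illustration of the mismatch being described; the regex and punctuation set are hypothetical, not the project's actual pipeline:

```python
import re

text = "今天天气不错,我们去公园。好吗?"

# Pretraining-style preprocessing as described above: a [SEP] after
# (almost) every punctuation mark.
pretrain = re.sub(r"([,。?!;])", r"\1 [SEP] ", text).strip()

# Typical downstream processing: a single [SEP] at the end.
downstream = f"{text} [SEP]"

print(pretrain)    # a [SEP] after each of the three punctuation marks
print(downstream)  # the whole sentence followed by one [SEP]
```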

@choosewhatulike OK, thanks for the reply. I will test this on my own tasks later and report back here if it has an impact.

@HYPJUDY It seems the `XLMRobertaTokenizer` defined in the config file only accepts text as input, while the `LayoutLMv3Tokenizer` accepts both text and bboxes as input. So it could add bbox info...
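A minimal sketch of that difference using the Hugging Face `transformers` tokenizers; the checkpoint names are just the common defaults, and the words/boxes are made up (boxes on the 0-1000 normalized scale):

```python
from transformers import LayoutLMv3Tokenizer, XLMRobertaTokenizer

words = ["Invoice", "Total", "42.00"]
boxes = [[10, 10, 80, 30], [10, 40, 60, 60], [70, 40, 120, 60]]

# LayoutLMv3Tokenizer takes the words together with their bounding
# boxes, so layout information survives tokenization.
lv3 = LayoutLMv3Tokenizer.from_pretrained("microsoft/layoutlmv3-base")
enc = lv3(text=words, boxes=boxes)
print(enc.keys())  # includes 'bbox' alongside 'input_ids'

# XLMRobertaTokenizer only takes text, so any bbox info would have to
# be attached some other way -- the point of the comment above.
xlmr = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")
enc2 = xlmr(" ".join(words))
print(enc2.keys())  # just 'input_ids' and 'attention_mask'
```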