James Jing Tang


@atulkum One more question: in your code, step_coverage_loss is the sum of the element-wise minimum of attn_dist and coverage. https://github.com/atulkum/pointer_summarizer/blob/5e511697d5f00cc474370fd76ac1da450ffd4d2e/training_ptr_gen/train.py#L99 And coverage is updated as coverage + attn_dist. https://github.com/atulkum/pointer_summarizer/blob/5e511697d5f00cc474370fd76ac1da450ffd4d2e/training_ptr_gen/model.py#L124 So,...
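The two lines being discussed can be sketched together as a single per-step function. This is an illustrative reconstruction of the logic at the linked lines (sum of the element-wise minimum, then accumulate attention into coverage), not the repo's exact code:

```python
import torch

def step_coverage_loss(attn_dist, coverage):
    """Coverage loss for one decoder step (See et al., 2017):
    sum over source positions of min(attention, accumulated coverage),
    then accumulate this step's attention into the coverage vector."""
    loss = torch.sum(torch.min(attn_dist, coverage), dim=1)  # (batch,)
    next_coverage = coverage + attn_dist                     # (batch, src_len)
    return loss, next_coverage

# At the first step coverage is all zeros, so the loss is 0; attending to
# the same positions again at the next step is what gets penalized.
attn = torch.tensor([[0.7, 0.2, 0.1]])
cov = torch.zeros(1, 3)
loss1, cov = step_coverage_loss(attn, cov)
loss2, _ = step_coverage_loss(attn, cov)
```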

@atulkum Have you ever tried setting is_coverage to True? It very easily causes the loss to become NaN, and a smaller learning rate does not help with this issue.
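One common source of this kind of NaN (an assumption about the cause here, not a confirmed diagnosis of this repo) is taking the log of an attention or vocabulary probability that has underflowed to zero. A small clamp before the log keeps the loss finite:

```python
import torch

# Probabilities can underflow to exactly 0 after softmax + masking;
# log(0) = -inf, which turns into NaN once gradients flow through it.
probs = torch.tensor([0.0, 1e-30, 0.5])

bad = torch.log(probs)                    # contains -inf
safe = torch.log(probs.clamp(min=1e-12))  # finite everywhere
```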

@atulkum I think this operation may cause NaN: https://github.com/atulkum/pointer_summarizer/blob/fd8dda35390d058c1745b9495634ea0ddadf71ad/training_ptr_gen/model.py#L95 Computing the attention memory at every decoder step may create many computation-graph branches in the torch backend, but in fact...
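The inefficiency being pointed at can be sketched as follows. Since the encoder states are fixed during decoding, their linear projection can be computed once before the decode loop instead of inside it; names and shapes below are hypothetical, for illustration only:

```python
import torch
import torch.nn as nn

hidden = 8
W_h = nn.Linear(hidden, hidden, bias=False)
encoder_outputs = torch.randn(2, 5, hidden)  # (batch, src_len, hidden)

# Wasteful: re-projecting the fixed encoder states inside the decode loop
# rebuilds the same graph branch at every step.
# for t in range(max_steps):
#     enc_feats = W_h(encoder_outputs)  # recomputed each step
#     ...

# Better: project once before the loop and reuse the cached result.
enc_feats = W_h(encoder_outputs)
for t in range(3):
    # score attention at step t against the cached enc_feats
    pass
```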

@atulkum I set is_coverage to True after 500k steps, but at the beginning of retraining it always gives NaN. I'll continue to test, thank you again.

Thanks for the suggestion. I initialized model_file_path, but after no more than 100 iterations it gets NaN :(

> Does this hierarchical classification support BERT? If so, where should I modify it?

It wasn't directly supported at the time, but multiheadattn is already defined in model/layer.py, so by adding some FFN layers you could define a BERT without much extra development, and then use BERT for hierarchical classification. MRs are welcome.
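The suggestion above (multi-head attention plus a few FFN layers) amounts to stacking Transformer encoder blocks. A minimal sketch, using torch's built-in nn.MultiheadAttention as a stand-in for the repo's own multiheadattn layer; all names and sizes here are illustrative:

```python
import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    """One BERT-style encoder block: multi-head self-attention plus a
    position-wise FFN, each followed by residual + LayerNorm. A BERT
    encoder is simply a stack of such blocks."""
    def __init__(self, d_model=64, n_heads=4, d_ff=256):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        a, _ = self.attn(x, x, x)       # self-attention over the sequence
        x = self.norm1(x + a)           # residual + norm
        return self.norm2(x + self.ffn(x))

x = torch.randn(2, 10, 64)              # (batch, seq_len, d_model)
y = EncoderBlock()(x)                   # shape is preserved
```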

Also, I ran into some errors when extracting the map. How should I resolve these? ![image](https://user-images.githubusercontent.com/13873223/89280182-d53b2a00-d67a-11ea-908e-1a73d5bb21c7.png)

I've given up on compiling it... there are too many pitfalls. If I just use the release directly, how should I deploy it?