colinzhaoxp comments

Results: 16 comments by colinzhaoxp

indep_thres is defined by the top-90%? Why?

> Hello, has this been resolved?

No. It is probably just a manually chosen threshold.

Circular import

Hello, I met the same issue and successfully solved it. The reason for "circular import" is that the name of my py file is "onnx2torch", but the model's name is...
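The cause described above, a local file whose name matches the package being imported, can be checked for before running a script. The following is a minimal sketch under that assumption; the helper name `local_shadow_candidates` is hypothetical and not part of onnx2torch or any other library.

```python
from pathlib import Path
from typing import List


def local_shadow_candidates(module_name: str, directory: str = ".") -> List[str]:
    """Return local paths in `directory` that could shadow `module_name` on import.

    A file `<module_name>.py` or a folder `<module_name>/` next to the script is
    found before the installed package, which can surface as a circular import.
    """
    base = Path(directory)
    candidates = [base / f"{module_name}.py", base / module_name]
    return [str(path) for path in candidates if path.exists()]
```

If the returned list is not empty, renaming the local file (for example, a script called `onnx2torch.py`) is usually enough to let the installed package be imported again.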

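These are our reproduction results, and the performance is very low. The code and parameters are essentially unchanged. { "GreaterThan/1/spl": 0.027155465037338764, "GreaterThan/1/success": 0.08859470468431772, "GreaterThan/5/spl": 0.0012588116817724068, "GreaterThan/5/success": 0.0015105740181268882, "done_count": 1.0, "ep_length": 6.0, "spl": 0.02666666666666667, "success": 0.09, "total_reward": 0.39089999999999747, "total_time": 0.33500297951698305 } Also, during training I noticed the following line of code: os.environ["OMP_NUM_THREADS"] = "1" This line seems to appear in the pretraining, training, and testing files. I commented it out when training. Could that be the cause of the problem? I look forward to your reply!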

Could you provide the trained models from the paper?

os.environ["OMP_NUM_THREADS"] = "1" A question for the author: does this line of code affect the final performance of the model?
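For context, `OMP_NUM_THREADS` caps the number of threads that OpenMP-backed libraries use in a process, and it is typically read when such a library initializes, so it is set before the heavy imports. A minimal sketch of that ordering follows; the helper name `limit_omp_threads` is hypothetical, and the sketch says nothing about whether the setting changes the reported metrics.

```python
import os


def limit_omp_threads(num_threads: int = 1) -> str:
    """Set OMP_NUM_THREADS for this process and return the stored value.

    Environment variables are strings, so the integer is converted first.
    """
    os.environ["OMP_NUM_THREADS"] = str(num_threads)
    return os.environ["OMP_NUM_THREADS"]


# Call this before importing libraries that start an OpenMP thread pool.
limit_omp_threads(1)
```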

Could you provide the trained models from the paper?

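Thank you very much for your reply! I noticed in the paper that pretraining is very important for VTNet, so I wonder whether the problem lies in the pretraining. The accuracy of my pretrained model is around 66 on the training set, mostly stays around 66 on the validation set, and is a little above 60 on the test set. These accuracies change very little from epoch 0 to the end. Was your training process like this as well? I have attached my pretraining and training logs and look forward to your reply :) Pretraining log: [pretrain.txt](https://github.com/xiaobaishu0097/ICLR_VTNet/files/10100521/pretrain.txt) Training output: [train.txt](https://github.com/xiaobaishu0097/ICLR_VTNet/files/10100524/train.txt) TensorBoard file of the training output (GitHub would not accept the upload, so I put it on another platform): https://drive.google.com/file/d/1KapOsv-H5QyODzpMYzYGSQ6_plUxexdS/view?usp=share_link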

Could you provide the trained models from the paper?

Thank you for your suggestion. I will start the experiments right away :)

Could you provide the trained models from the paper?

> @colinzhaoxp Hello, I cannot run the test with the model I trained myself. The trained model is at trained_models/ XXX.dat, right? python full_eval.py --gpu-ids 0 --detr --save-model-dir {SAVE_MODEL_DIR} --results-json ./result.json --model VTNetModel --title a3c_previstrans_base After that I get an error: AttributeError: 'NoneType' object has no attribute 'seek'. You can...

Does it support beam search generation strategy?

Thanks for your reply! As for the beam search, do you mean this line of code? https://github.com/ML-GSAI/LLaDA/blob/4aa0dd2402fb9fec1137648cf768b56103a4a849/generate.py#L84 I also found that LLaDA supports two `remasking` strategies: `low_confidence` and `random`. If...
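To make the difference between the two `remasking` options concrete, here is a rough sketch of how such a choice could work in general. It is an illustration under stated assumptions, not LLaDA's actual code: the function `choose_remask_positions` and its parameters are hypothetical, and it assumes that `low_confidence` re-masks the least confident predictions while `random` ignores confidence.

```python
import random
from typing import List


def choose_remask_positions(confidences: List[float], num_remask: int,
                            strategy: str = "low_confidence",
                            seed: int = 0) -> List[int]:
    """Pick which predicted positions to mask again (illustrative only)."""
    positions = list(range(len(confidences)))
    if strategy == "low_confidence":
        # Assumed behavior: re-mask the positions with the lowest confidence.
        ranked = sorted(positions, key=lambda i: confidences[i])
        return sorted(ranked[:num_remask])
    if strategy == "random":
        # Assumed behavior: re-mask a uniformly random subset of positions.
        rng = random.Random(seed)
        return sorted(rng.sample(positions, num_remask))
    raise ValueError(f"unknown strategy: {strategy}")
```

With confidences `[0.9, 0.1, 0.5, 0.2]` and two positions to re-mask, the `low_confidence` branch selects indices 1 and 3, while the `random` branch depends only on the seed.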

indexing error

I have the same problem. Have you solved it?

should I provide a true attention mask?

same issue #89