tianxin comments

Results: 24 comments of tianxin

Few-shot EFL -- ValueError: Invalid name "my_dataset". Should be one of ['bustm', 'chid', 'iflytek', 'tnews', 'eprstmt', 'ocnli', 'csldcp', 'cluewsc', 'csl'].

Please make sure that the `BUILDER_CONFIGS` class member of the `FewCLUE` class in fewclue.py contains a key for your dataset name (here, `my_dataset`).
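A minimal sketch of why the error appears and how adding a key resolves it. The dict below is only a stand-in for `BUILDER_CONFIGS`: the field name `splits`, the file paths, and the helper `resolve_config` are hypothetical and are not PaddleNLP's actual code.

```python
# Stand-in for the `BUILDER_CONFIGS` class member in fewclue.py.
# Field names and paths are illustrative assumptions, not PaddleNLP's real values.
BUILDER_CONFIGS = {
    "bustm": {"splits": {"train_0": "fewclue_bustm/train_0.json"}},
    "tnews": {"splits": {"train_0": "fewclue_tnews/train_0.json"}},
}


def resolve_config(name, configs=BUILDER_CONFIGS):
    """Hypothetical lookup that mimics the name check behind the ValueError."""
    if name not in configs:
        raise ValueError(
            f'Invalid name "{name}". Should be one of {list(configs)}.'
        )
    return configs[name]


# Adding an entry keyed by the custom dataset name makes the lookup succeed.
BUILDER_CONFIGS["my_dataset"] = {
    "splits": {"train_0": "my_dataset/train_0.json"}
}
```

In the real file, the new entry would presumably need the same fields as the existing entries, so copying one of them and adjusting it is the safest starting point.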

paddleNLP 2.3.1 paddlepaddle2.2.2 RuntimeError: (PreconditionNotMet) The third-party dynamic library (libcublas.so) that Paddle depends on is not configured correctly. (error code is libcublas.so: cannot open shared object file: No such file or directory)

1. Please check whether the installed paddle build targets CUDA 11.0.
2. You can follow [this guide](https://blog.csdn.net/Grit_007/article/details/85000297) to configure the dynamic library path.
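As a quick way to carry out the second check, the sketch below scans the directories on `LD_LIBRARY_PATH` for `libcublas.so`. The helper name `find_shared_library` is hypothetical, and the script only inspects `LD_LIBRARY_PATH`, not the system linker cache, so an empty result is a hint rather than proof.

```python
import os
from pathlib import Path


def find_shared_library(lib_name, search_path):
    """Return files whose names start with lib_name in the given search path.

    search_path uses the same separator-delimited form as LD_LIBRARY_PATH.
    """
    hits = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue
        folder = Path(directory)
        if folder.is_dir():
            hits.extend(sorted(str(p) for p in folder.glob(lib_name + "*")))
    return hits


# Report where (if anywhere) libcublas is visible to the dynamic loader path.
found = find_shared_library("libcublas.so", os.environ.get("LD_LIBRARY_PATH", ""))
print(found if found else "libcublas.so not found on LD_LIBRARY_PATH")
```

If nothing is found, adding the CUDA library directory (commonly `/usr/local/cuda/lib64`, though the location depends on the installation) to `LD_LIBRARY_PATH` is the usual fix. For the first check, `paddle.version.cuda()` should report the CUDA version the installed build targets, and `paddle.utils.run_check()` can confirm the installation afterwards.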

【Hackathon + GradientCache】

The datasets used in this paper are aligned with those used in the DPR paper; see the [official DPR repo](https://github.com/facebookresearch/DPR) for data download and preprocessing.

【Hackathon + GradientCache】

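It is worth confirming the details in the Gradient Cache paper. My understanding is that, for a fair comparison with DPR, Gradient Cache differs only in its training strategy; everything else, such as the training data and the model architecture, should be identical to DPR.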
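For readers unfamiliar with the strategy being compared, here is a toy sketch of a gradient-cache style update. It assumes scalar "encoders" (`wq`, `wp`) and hand-written gradients in place of a real framework's autograd; all function names are hypothetical, and this is not the paper's or PaddleNLP's code. The point it illustrates is that splitting the batch into chunks changes memory use, not the loss or the gradients.

```python
import math


def encode(weight, inputs):
    """Toy encoder: one scalar weight, one scalar representation per input."""
    return [weight * x for x in inputs]


def contrastive_loss(q_reps, p_reps):
    """In-batch softmax loss; passage i is the positive for question i."""
    n = len(q_reps)
    total = 0.0
    for i in range(n):
        scores = [q_reps[i] * p for p in p_reps]
        total += -scores[i] + math.log(sum(math.exp(s) for s in scores))
    return total / n


def representation_grads(q_reps, p_reps):
    """Gradients of the loss w.r.t. every representation (the 'cache')."""
    n = len(q_reps)
    q_grads = [0.0] * n
    p_grads = [0.0] * n
    for i in range(n):
        scores = [q_reps[i] * p for p in p_reps]
        denom = sum(math.exp(s) for s in scores)
        probs = [math.exp(s) / denom for s in scores]
        q_grads[i] = (sum(a * p for a, p in zip(probs, p_reps)) - p_reps[i]) / n
        for j in range(n):
            p_grads[j] += probs[j] * q_reps[i] / n
        p_grads[i] -= q_reps[i] / n
    return q_grads, p_grads


def chunks(items, size):
    return [items[k:k + size] for k in range(0, len(items), size)]


def gradient_cache_step(wq, wp, questions, passages, chunk_size):
    # Step 1: encode chunk by chunk (a real framework would keep no graph here).
    q_reps = [r for c in chunks(questions, chunk_size) for r in encode(wq, c)]
    p_reps = [r for c in chunks(passages, chunk_size) for r in encode(wp, c)]
    # Step 2: full-batch loss and cached gradients w.r.t. the representations.
    loss = contrastive_loss(q_reps, p_reps)
    q_cache, p_cache = representation_grads(q_reps, p_reps)
    # Step 3: revisit each chunk and accumulate parameter gradients via the
    # chain rule; for this toy encoder d(w * x)/dw = x.
    grad_wq = 0.0
    for xs, gs in zip(chunks(questions, chunk_size), chunks(q_cache, chunk_size)):
        grad_wq += sum(g * x for g, x in zip(gs, xs))
    grad_wp = 0.0
    for xs, gs in zip(chunks(passages, chunk_size), chunks(p_cache, chunk_size)):
        grad_wp += sum(g * x for g, x in zip(gs, xs))
    return loss, grad_wq, grad_wp
```

Calling `gradient_cache_step` with `chunk_size` equal to the batch size and with a smaller `chunk_size` should give the same loss and gradients up to floating-point rounding, which is why the rest of the setup can stay aligned with DPR.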

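> While reading the DPR source code, I noticed that their bi-encoder uses two encoders, question_encoder and ctx_encoder, whereas the baseline, In-batch, and Hardest strategies in the PaddleNLP example all use a single encoder. Should I rebuild the model following DPR, or keep the baseline-based model I have already implemented?

The accuracy target for the GradientCache reproduction is to match the metrics reported in the paper, so the implementation should follow the DPR paper and the Gradient Cache paper. It does not need to be limited to the way the existing algorithms in PaddleNLP semantic_indexing are implemented.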

【Hackathon + GradientCache】

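Once the DPR and GradientCache implementations are aligned with the papers in data and strategy, please provide the resulting metrics as a reference for judging whether the implementation is correct.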

【Hackathon + GradientCache】

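> @tianxin1860 I have finished gradient_cache based on paddleNLP. In addition, I have finished the utility class for reading and processing Google's Natural Question dataset (in NQdataset.py), and finished DPR training with the DPR strategy and with the gradient_cache strategy (the core code is in biencoder_base_model.py and train_gradient_cache_DPR.py). I have tested the final results, which are 85.032 and 83.021 (a single test run takes too long, so I have only tested these two so far).

Do 85.032 and 83.021 correspond to the Top100 recall of DPR and DPR + GradientCache on the NQ dataset, respectively? @Elvisambition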

【Hackathon + GradientCache】

> @tianxin1860 Both results come from the DPR + gradient cache implementation; training DPR itself requires 8×V100 GPUs. What do the metrics mean?

【Hackathon + GradientCache】

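I suggest aligning the training hyperparameters and evaluation metrics with the paper, and reporting the Top5, Top20, and Top100 evaluation metrics. @Elvisambition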
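A minimal sketch of how such Top-k numbers could be computed, assuming the DPR-style definition: the percentage of questions for which at least one of the top k retrieved passages contains an answer. The function name and the input layout (`ranked_hits`) are hypothetical.

```python
def top_k_accuracy(ranked_hits, ks=(5, 20, 100)):
    """Compute Top-k retrieval accuracy in percent.

    ranked_hits[i][r] is True when the passage at rank r (0-based) retrieved
    for question i contains an answer to that question.
    """
    if not ranked_hits:
        raise ValueError("ranked_hits must contain at least one question")
    n = len(ranked_hits)
    return {
        k: 100.0 * sum(any(hits[:k]) for hits in ranked_hits) / n
        for k in ks
    }
```

Reporting all three values from the same retrieval run makes the numbers directly comparable with the tables in the papers.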