
tangent_cft_back_end.py==>ValueError: max() arg is an empty sequence

Open tobimichigan opened this issue 2 years ago • 11 comments

Hi, nice repo here. However, when I run

python3 tangent_cft_front_end.py -ds "./NTCIR-12_MathIR_Wikipedia_Corpus/MathTagArticles" -cid 1 -em slt_encoder.tsv --mp slt_model --rf slt_ret.tsv --qd "/TestQueries" --ri 1

I get the following error:

Traceback (most recent call last):
  File "tangent_cft_front_end.py", line 88, in <module>
    main()
  File "tangent_cft_front_end.py", line 62, in main
    tokenize_number=tokenize_number
  File "/home/TangentCFT/tangent_cft_back_end.py", line 32, in train_model
    self.__load_encoder_map(map_file_path)
  File "/home/TangentCFT/tangent_cft_back_end.py", line 175, in __load_encoder_map
    self.node_id = max(list(self.encoder_map_node.values())) + 1
ValueError: max() arg is an empty sequence

Any suggestions?

tobimichigan avatar Jun 28 '22 18:06 tobimichigan

I get the same error. If you find a solution, please let me know how to fix it as soon as possible.

sachinsourav avatar Jun 28 '22 20:06 sachinsourav

Thanks for the response, @sachinsourav. I encountered it after successfully setting up a conda environment locally. @BehroozMansouri @ARQMath @zanibbi, could you please look into this issue?

tobimichigan avatar Jun 28 '22 21:06 tobimichigan

@tobimichigan @sachinsourav @zanibbi The error occurs because your encoder file is empty, which leaves the encoder dictionary empty. We let this error be raised on purpose so that a model is never used with the wrong encoding of math tokens.

Please check whether the "slt_encoder.tsv" file is empty and let me know if that is the case.
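
The failure mode described above can be reproduced in a few lines. This is only an illustrative sketch, not the repository's code: `next_node_id` and `encoder_file_is_empty` are hypothetical helper names, and the only thing taken from the traceback is the `max(list(...values())) + 1` pattern.

```python
import os


def next_node_id(encoder_map):
    """Return the next free id using the max(...) + 1 pattern seen in the traceback."""
    # max() raises ValueError when the map has no values,
    # which is what happens when the encoder file was empty.
    return max(list(encoder_map.values())) + 1


def encoder_file_is_empty(path):
    """Hypothetical pre-flight check: True if the file exists but has zero bytes."""
    return os.path.isfile(path) and os.path.getsize(path) == 0
```

Calling `encoder_file_is_empty("slt_encoder.tsv")` before training is one way to confirm the diagnosis; `next_node_id({})` raises the same `ValueError` as in the report.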

BehroozMansouri avatar Jun 30 '22 05:06 BehroozMansouri

(1) There is no slt_encoder.tsv in Embedding_Preprocessing in the git repository, nor is there a slt_ret.tsv. (2) Are these files generated? If so, what are the commands? (3) If not, please add these files to the repository.

sachinsourav avatar Jun 30 '22 05:06 sachinsourav

The same goes for other files such as opt_encoder.tsv, opt_ret.tsv, slt_type_encoder.tsv, and slt_type_ret.tsv: they are not in the git repository either. Thanks for the response, @BehroozMansouri.

sachinsourav avatar Jun 30 '22 05:06 sachinsourav

Is there any update on how to solve the error above? Please help.

sachinsourav avatar Jul 03 '22 05:07 sachinsourav

I don't know why my encoder file (slt_encoder.tsv) is empty. It is empty after running python3 tangent_cft_front_end.py -ds "/NTCIR12_MathIR_WikiCorpus_v2.1.0/MathTagArticles" -cid 1 -em slt_encoder.tsv --mp slt_model --rf slt_ret.tsv --qd "/TestQueries" --ri 1. Are there any steps I need to run before this?

sachinsourav avatar Jul 08 '22 12:07 sachinsourav

To solve this issue, after downloading the NTCIR-12 collection, you need to extract the `wpmath00000xx.tar.bz2` files using the command: tar -xf wpmath00000xx.tar.bz2

We will add a command for this. We also modified the wiki_data_reader.py file, removing the Articles path on line 27.

The error you are seeing happens because no data was read, so an empty slt_encoder.tsv was created, and taking the max of its (empty) values raised the error. Therefore, after extracting the archives, make sure slt_encoder.tsv does not already exist, or use a different name for the encoder file to be generated.
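
For anyone who prefers to script the two steps above, here is a minimal sketch using only the Python standard library. It is an assumption-laden illustration rather than part of TangentCFT: `extract_archives` and `remove_if_empty` are hypothetical helper names, and it assumes the `.tar.bz2` archives sit directly inside the directory you pass in.

```python
import glob
import os
import tarfile


def extract_archives(corpus_dir):
    """Extract every *.tar.bz2 archive found directly in corpus_dir, in place."""
    extracted = []
    for archive in sorted(glob.glob(os.path.join(corpus_dir, "*.tar.bz2"))):
        # Equivalent in spirit to running `tar -xf <archive>` inside corpus_dir.
        with tarfile.open(archive, "r:bz2") as tar:
            tar.extractall(corpus_dir)
        extracted.append(archive)
    return extracted


def remove_if_empty(path):
    """Delete a stale zero-byte file; return True if it was removed."""
    if os.path.isfile(path) and os.path.getsize(path) == 0:
        os.remove(path)
        return True
    return False
```

You would call `extract_archives(...)` on the folder holding the downloaded archives, then `remove_if_empty("slt_encoder.tsv")` before rerunning the training command, so that a leftover empty encoder file from an earlier failed run is not loaded again.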

BehroozMansouri avatar Jul 14 '22 17:07 BehroozMansouri

Hello, I have the same problem as you. How can I solve it?

SmallBall8 avatar Oct 22 '22 11:10 SmallBall8

We are working on the Tangent-CFT code and will fix these issues in the coming week or two. We will then update the code so that it can easily be used with both the NTCIR and ARQMath test collections.

BehroozMansouri avatar Oct 22 '22 15:10 BehroozMansouri

Hello, sorry to bother you. Have the issues above been fixed?

SmallBall8 avatar Nov 08 '22 10:11 SmallBall8