SPARQA: questions about the `kb_freebase_en_2013` folder and relation matching

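Hello authors, I have some questions about the kb_freebase_en_2013 folder:

freebase_types_popularity: 16033 lines
freebase_types_reverse: 11576 lines
mediatortypes: 1144 lines

Q1: Why does freebase_types_reverse have about 5000 fewer lines than freebase_types_popularity? Could you explain how these three files differ and how they were generated?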

In addition, some relations do not appear in any of the three files above. One example from the train set is :base.cocktails.cocktail.standard_drinkware, which comes from:

# "what do people usually use to drink margarita?"

text = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX : <http://rdf.freebase.com/ns/> 
SELECT (?name AS ?value) WHERE {
SELECT DISTINCT ?name  WHERE { 
?x0 :type.object.name ?name .
?x0 :type.object.type :base.cocktails.cocktail_drinkware . 
VALUES ?x1 { :en.margarita } 
?x1 :base.cocktails.cocktail.standard_drinkware ?x0 . 
FILTER ( ?x0 != ?x1  )
}
}
"""

Q2: How is the matching step drink --> :base.cocktails.cocktail.standard_drinkware done?

Q3: The paper says:

Given a test question, we find its most similar question in the training set that has the same number of dummy tokens in their patterns.

If the relation a test question needs does not occur in the train set, how is it matched?

Thanks~

Nov 10 '20 14:11 JimXiongGM

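Hello,

Q1:
File 1, freebase_types_reverse, maps a type label to types. One type label can correspond to several types, for example compass direction {'base.mapcentral.compass_direction': 1.0, 'user.robert.earthquakes.compass_direction': 1.0}.
File 2, freebase_types_popularity, records the popularity of each type, that is, the number of instances of that type. For example, base.romanamphorae.type_series 7 means this type has 7 instances.
File 3, mediatortypes, lists the mediator types, officially called compound value types; Wen-tau Yih's ACL 2015 paper describes them in detail.
Files 1 and 2 are statistics we computed ourselves. File 3 comes from a related work (I do not remember which paper at the moment; I will follow up once I recall).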
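
To make the difference between files 1 and 2 concrete, here is a minimal sketch of how such statistics could be computed. It is not the authors' script: the input pairs, the function names, and the labelling rule in `label_of` (last dotted segment, underscores replaced by spaces) are assumptions chosen to match the examples above. It also suggests one plausible reason the label file is shorter: several types can share a single label.

```python
from collections import Counter, defaultdict

# Hypothetical (instance, type) pairs standing in for rows of a Freebase dump.
instance_types = [
    ("m.01", "base.mapcentral.compass_direction"),
    ("m.02", "base.mapcentral.compass_direction"),
    ("m.03", "user.robert.earthquakes.compass_direction"),
    ("m.04", "base.romanamphorae.type_series"),
]


def type_popularity(pairs):
    """Count instances per type (the freebase_types_popularity idea)."""
    return Counter(type_id for _, type_id in pairs)


def label_of(type_id):
    """Assumed labelling rule: last dotted segment, underscores to spaces."""
    return type_id.rsplit(".", 1)[-1].replace("_", " ")


def types_by_label(type_ids):
    """Map each label to every type carrying it (the freebase_types_reverse idea)."""
    index = defaultdict(dict)
    for type_id in type_ids:
        # 1.0 mirrors the value in the example above; its meaning is not stated in the thread.
        index[label_of(type_id)][type_id] = 1.0
    return dict(index)


popularity = type_popularity(instance_types)
reverse = types_by_label(popularity)
print(len(popularity), len(reverse))
```

In this toy input there are three types but only two labels, because both compass_direction types collapse onto the label "compass direction".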

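Q2: For drink --> :base.cocktails.cocktail.standard_drinkware, our paper uses a word-level scorer and a sentence-level scorer to try to handle this; see Section 4 of https://ojs.aaai.org//index.php/AAAI/article/view/6426 for details.

Q3: This is where the multi-granularity scorers come in. If the train set does not contain the relation, the sentence-level scorer is of no use, and we have to rely on the word-level scorer.

If you have further questions, please keep commenting. Thanks~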

Nov 10 '20 15:11 simba0626

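Hello, I understand Q1 now. Q2 and Q3 seem to be the same question, and I still do not understand it.

That is: if the edge the query needs is not in the candidate set, neither the Sentence-Level Scorer nor the Word-Level Scorer can score and retrieve it?

Is it like this: for example, the test query "to cook chicken and rice what tools are needed?" needs food.recipe.equipment, which does not exist in the train set or in the three files above, but ["food","recipe","equipment"] all appear in the freebase_types_popularity file, so the Word-Level Scorer scores them one by one (a bit like a decoder)? But a formal query does not necessarily have length 3, so that does not seem right either.

Could you please clarify? Thanks~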

Nov 11 '20 03:11 JimXiongGM

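Hello, sorry for the confusion. Let me lay it out. Step 1: generate the candidate set, which corresponds to the Grounded Queries in Figure 3. Step 2: score these queries with Multi-Strategy Scoring. If the query that is needed is not in the candidate set (Grounded Queries), then no scorer, whichever one it is, can score and retrieve it.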
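
The two steps can be illustrated with a toy sketch. This is not the SPARQA implementation: both scorers below are simple token-overlap stand-ins, candidates are reduced to single relation ids rather than full grounded queries, and all names and example data are hypothetical. It only shows two things from the thread: a relation unseen in training gets nothing from the sentence-level side and must rely on the word-level side, and nothing outside the candidate list can ever be returned.

```python
import re


def tokens(text):
    """Lowercase alphabetic tokens; relation ids are split on '.' and '_' as well."""
    return set(re.findall(r"[a-z]+", text.lower()))


def word_level_score(question, relation):
    """Toy stand-in: fraction of the relation's tokens that occur in the question."""
    rel = tokens(relation)
    return len(rel & tokens(question)) / len(rel)


def sentence_level_score(question, relation, train_examples):
    """Toy stand-in: best Jaccard overlap with training questions that use this relation.

    Returns 0.0 when the relation never occurs in the training examples.
    """
    q = tokens(question)
    scores = [
        len(q & tokens(train_q)) / len(q | tokens(train_q))
        for train_q, train_rel in train_examples
        if train_rel == relation
    ]
    return max(scores, default=0.0)


def best_candidate(question, candidates, train_examples):
    """Step 2: rank only the candidates produced by step 1."""
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda rel: word_level_score(question, rel)
        + sentence_level_score(question, rel, train_examples),
    )


# Hypothetical data for illustration only.
train = [("which ingredients go into a margarita?", "food.recipe.ingredients")]
question = "what equipment does this recipe need?"

print(best_candidate(question, ["food.recipe.equipment", "food.recipe.ingredients"], train))
print(best_candidate(question, ["food.recipe.ingredients"], train))
```

In the first call food.recipe.equipment wins on the word-level score alone, since it never occurs in `train`. In the second call it is absent from the candidate list, so it cannot be selected no matter how the scorers behave.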

Nov 11 '20 06:11 simba0626

Hello, I just checked and found that the kb_freebase_en_2013 folder is missing files that the multi-granularity scorers need. Sorry about that. I will update it shortly, and I think it will help you understand this example. If you have more questions, please keep commenting. Thanks~

Nov 11 '20 08:11 simba0626

Thanks for the reply! Your work is really great~

Also, could you release the Virtuoso dump file for Freebase latest?

Nov 14 '20 15:11 JimXiongGM

Thanks for your interest. Yes, we can. The file is rather large, so please watch for an update to README.md in a few days. Thanks.

Nov 14 '20 15:11 simba0626

Thank you very much~

Nov 14 '20 16:11 JimXiongGM

Does kg_75g.db on Baidu Cloud contain all Freebase triples, or only the subset of triples related to the datasets used in the experiments?

Mar 22 '21 09:03 zyn412

kg_75g.db contains all Freebase triples; it is the 2013 version.

Mar 22 '21 10:03 simba0626
