Section 4.1: what do you mean by "Each broad domain and its sub-domains share seed terms"?

Open lihuiliullh opened this issue 3 years ago • 5 comments

Suppose there are two seed terms, "A" and "B", and the hierarchy CS -> AI -> ML. Do you mean that "A" and "B" are connected to "CS", "AI", and "ML" at the same time?

image

lihuiliullh avatar Feb 13 '22 23:02 lihuiliullh

It is not that "A" and "B" are connected to "CS", "AI", and "ML" at the same time. Rather, in our setting, CS/AI/ML share the same set of seed terms (because we use the same domain corpus to extract the terms that initialize the graph). You may also refer to the data statistics in Table 1, where #terms is the number of "seed terms".

In this paper, we use "seed terms" to refer to a large set of terms extracted from the corpus. This may differ from the usage in graph mining, where the term often means a small number of seed nodes used to initialize an algorithm.

jeffhj avatar Feb 14 '22 02:02 jeffhj

Thanks. I have another question. In section 3.3, each core term has one or several categories, and all these categories can form a category tree. Does this mean all the core terms are from the same large area, like "computer science" or "physics"? In the picture below, in "for a given domain", does "domain" mean a small domain within the large area?

For example, in the "computer science" dataset, the root of the tree is "subfields of computer science". Suppose I want to find information about "deep learning", i.e., the given domain is "deep learning". Then in the sentence "For a given domain, we can first traverse from a root category and collect some gold subcategories.", does the root category mean "subfields of computer science" or "deep learning"?

image

lihuiliullh avatar Feb 14 '22 05:02 lihuiliullh

Yes. Your understanding is correct. The root category here means "Category:Subfields of computer science" (https://en.wikipedia.org/wiki/Category:Subfields_of_computer_science) or "Category:Machine learning" (https://en.wikipedia.org/wiki/Category:Machine_learning)

jeffhj avatar Feb 15 '22 04:02 jeffhj
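
For readers following the thread, the traversal discussed above (start from a root category and collect subcategories) can be sketched as a bounded breadth-first walk. This is only an illustrative sketch, not the authors' code: the function name `collect_subcategories`, the `max_depth` cutoff, and the toy category graph are all assumptions made for the example, and no real Wikipedia data or API is used.

```python
from collections import deque

# Hypothetical toy category graph (parent -> child categories).
# The edges are made up for illustration; they are not real Wikipedia data.
TOY_SUBCATEGORIES = {
    "Category:Subfields of computer science": [
        "Category:Artificial intelligence",
        "Category:Computer networking",
    ],
    "Category:Artificial intelligence": ["Category:Machine learning"],
    "Category:Machine learning": ["Category:Deep learning"],
}


def collect_subcategories(root, graph, max_depth):
    """Breadth-first walk from `root`.

    Returns the set of categories reachable within `max_depth` levels,
    including the root itself. `max_depth` is an assumed cutoff.
    """
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        category, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand below the depth limit
        for child in graph.get(category, []):
            if child not in seen:
                seen.add(child)
                queue.append((child, depth + 1))
    return seen


# Choosing a different root gives the category set for a different domain.
cs_categories = collect_subcategories(
    "Category:Subfields of computer science", TOY_SUBCATEGORIES, max_depth=2
)
ml_categories = collect_subcategories(
    "Category:Machine learning", TOY_SUBCATEGORIES, max_depth=1
)
```

Under this sketch, the broad domain and the sub-domain differ only in which root category the walk starts from.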

May I know whether the label here is a boolean value (e.g., 1 means related and 0 means unrelated) or a scalar (e.g., 0.8, 0.7)? If the domain changes, do all the terms need to be relabeled?

Also, in the paragraph above Equation (7), "all the core terms are labeled at each level of the hierarchy": can a term belong to several different hierarchies at the same time?

image

lihuiliullh avatar Feb 16 '22 05:02 lihuiliullh

Yes, the label here is a boolean value. Yes, all the terms need to be relabeled for a new domain (in our paper, we introduce an automatic approach to do this). And yes, a term can belong to several different hierarchies at the same time.

jeffhj avatar Feb 16 '22 06:02 jeffhj
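
To make the answers above concrete, here is a minimal sketch of boolean, per-domain labeling over a shared set of terms. It assumes a term is labeled 1 when any of its categories appears among the domain's gold categories; that rule, the function name `label_terms`, and the toy data are assumptions for illustration and may not match the paper's exact automatic procedure.

```python
def label_terms(term_categories, gold_categories):
    """Return a boolean (0/1) label per term for one domain.

    Assumed rule: a term is positive if it shares at least one
    category with the domain's gold category set.
    """
    return {
        term: int(bool(categories & gold_categories))
        for term, categories in term_categories.items()
    }


# Hypothetical toy data: the same terms are reused for every domain,
# and one term may carry categories from several hierarchies.
term_categories = {
    "neural network": {"Machine learning", "Artificial intelligence"},
    "router": {"Computer networking"},
}

# Changing the domain means recomputing the labels against new gold categories.
ml_labels = label_terms(term_categories, {"Machine learning"})
networking_labels = label_terms(term_categories, {"Computer networking"})
```

The term set stays fixed while the labels change with the domain, which is consistent with the replies about shared seed terms and relabeling.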