AutoNER

Questions about the unknown type high quality phrases.

Open failable opened this issue 5 years ago • 1 comments

Hi, the original paper says

In our AutoNER model, these “unknown” positions have undefined boundary and type losses, because (1) they make the boundary labels unclear; and (2) they have no type labels. Therefore, they are skipped.
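To make "skipped" concrete, here is a minimal sketch of a boundary loss that ignores unknown positions. It is not taken from the AutoNER code base: the function name `masked_boundary_loss`, the `UNKNOWN` marker, and the 1 = break / 0 = tie encoding are assumptions made only for illustration.

```python
import math

# Hypothetical marker for positions whose boundary label is unknown.
UNKNOWN = None


def masked_boundary_loss(probs, labels):
    """Mean binary cross-entropy over positions with a known label.

    probs[i]  : predicted probability that position i is a "break"
    labels[i] : 1 for break, 0 for tie, UNKNOWN for an unknown position
    Unknown positions contribute nothing to the loss.
    """
    total, count = 0.0, 0
    for p, y in zip(probs, labels):
        if y is UNKNOWN:
            continue  # skipped: no boundary label to compare against
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
        count += 1
    # If every position is unknown there is nothing to learn from.
    return total / count if count else 0.0


# The middle position is unknown, so its prediction is never penalized.
loss = masked_boundary_loss([0.5, 0.9, 0.5], [1, UNKNOWN, 0])
```

Under this reading, whatever the model predicts at an unknown position leaves the training loss unchanged, so the model is never explicitly pushed toward any particular label there.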

Does that mean a high-quality phrase should not have any of the entity types we are trying to identify? Otherwise, the model will predict it as Entity Type: None, as shown in Figure 2 for 8GB RAM. And if AutoNER is applied to the example in Figure 1, can it, and should it, identify prostaglandin synthesis as a named entity?

Thanks.

failable, Dec 06 '19 00:12

@isolet The related work says,

aspect term extraction, which can be viewed as an entity recognition task of a single type for business reviews. As shown in our experiments, our models can outperform Distant-LSTMCRF significantly on the laptop review dataset.

Because the laptop review dataset has no entity types for its entity mentions, I think there is only a single type for the laptop review dataset: ComputerTerm or not. That's why AutoNER can be compared with the aspect term extraction model.

houking-can, Dec 08 '19 02:12