
Incorrect POS tags for "38.5%"

Open timothywangdev opened this issue 11 years ago • 3 comments

Please check the parsing results of "There were 769 households out of which 38.5% had children under the age of 18 living with them."

(PartOfSpeechLink (stv 1.000000 1.000000) (WordInstanceNode "38.5@b13e52c0-9f9c-4305-a0be-bb306249009e") (DefinedLinguisticConceptNode "det") )

(PartOfSpeechLink (stv 1.000000 1.000000) (WordInstanceNode "%@2f68f0db-bfbb-4285-88ef-c3f53bb053a7") (DefinedLinguisticConceptNode "noun") )

I don't think that these two words are correctly tagged.

timothywangdev, Jul 08 '14 16:07

I think "noun" is correct. % is taken to be the same as the word "percent", which would be a noun,

and 38.5 is a modifier of % (how many percent is it? It's 38.5).

You could have said 38.5 miles or 38.5 days, etc.

The 38.5 is a "numeric determiner" -- relex should be creating a _quantity for this; it should be _quantity(%, 38.5)

Note that in link-grammar, Dmcn and ND are more or less the same thing.

linas, Jul 08 '14 16:07

Thanks for the explanation!

timothywangdev, Jul 08 '14 17:07

Re-opened: the first one is still wrong. It should be (DefinedLinguisticConceptNode "quantity"), not (DefinedLinguisticConceptNode "det").
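
For anyone wanting to check other sentences for the same problem, here is a minimal sketch of a checker. It is not part of relex: the function name `suspicious_numeric_tags` and both regular expressions are hypothetical, and the link pattern is assumed from the two PartOfSpeechLink lines quoted in this issue only. It flags numeric tokens whose tag is anything other than "quantity".

```python
import re

# Pattern for PartOfSpeechLink lines; an assumption based solely on the
# two lines quoted in this issue, not on any relex specification.
POS_LINK = re.compile(
    r'\(PartOfSpeechLink\s+\(stv [\d.]+ [\d.]+\)\s+'
    r'\(WordInstanceNode "(?P<word>[^@"]+)@(?P<uuid>[^"]+)"\)\s+'
    r'\(DefinedLinguisticConceptNode "(?P<tag>[^"]+)"\)\s*\)'
)

# Plain integers and decimals such as 769 or 38.5 (hypothetical definition
# of "numeric token" for this sketch).
NUMERIC = re.compile(r'[+-]?\d+(\.\d+)?')


def suspicious_numeric_tags(atomese_text):
    """Return (word, tag) pairs where a numeric token is not tagged "quantity"."""
    flagged = []
    for match in POS_LINK.finditer(atomese_text):
        word, tag = match.group("word"), match.group("tag")
        if NUMERIC.fullmatch(word) and tag != "quantity":
            flagged.append((word, tag))
    return flagged


sample = '''
(PartOfSpeechLink (stv 1.000000 1.000000) (WordInstanceNode "38.5@b13e52c0-9f9c-4305-a0be-bb306249009e") (DefinedLinguisticConceptNode "det") )
(PartOfSpeechLink (stv 1.000000 1.000000) (WordInstanceNode "%@2f68f0db-bfbb-4285-88ef-c3f53bb053a7") (DefinedLinguisticConceptNode "noun") )
'''

print(suspicious_numeric_tags(sample))
```

On the two links from this report it flags only the "38.5" entry; the "%" entry is left alone because it is not a numeric token.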

linas, Jul 08 '14 17:07