Question about Chinese entity linking

Open sxyao opened this issue 8 years ago • 3 comments

Is the following the right command to run fast entity linking on Chinese text?

mvn exec:java -Dexec.mainClass=com.yahoo.semsearch.fastlinking.FastEntityLinker -Dexec.args="zh/chinese-dec15.hash"

I ran that command and got into the interactive shell, but when I enter a sentence, it does not show any entities. I tried Spanish, and the same thing happened. What could be the problem? Thanks a lot!

sxyao avatar Jun 17 '17 02:06 sxyao

@sxyao Can you copy and paste your command's output?

aasish avatar Jul 05 '17 01:07 aasish

@aasish The problem comes from the fact that Chinese words/phrases are not separated by spaces.
For example,

I live in the New York city.
我住在纽约市。

However, if I pass 我住在纽约市 directly to FEL, no entity is found. If I segment the sentence myself:

I live in the New York city.
我 住在 纽约市

then FEL will return entities like new york city.

liehe avatar Jul 13 '17 13:07 liehe

@LiamHe The Chinese model expects tokenized text. Please feel free to create a pull request to handle Chinese tokenization.
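
For anyone who needs a workaround in the meantime, here is a minimal sketch of pre-tokenizing the input before it reaches FEL. It is not part of FEL: the class name MaxMatchSegmenter and the toy dictionary are made up for illustration, and it uses simple forward maximum matching, which is far less accurate than a dedicated Chinese word segmenter. It only shows the idea of inserting spaces between words.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of FEL: inserts spaces between words
// using forward maximum matching against a dictionary.
final class MaxMatchSegmenter {

    // Toy dictionary for illustration only; a real one would be much larger.
    static final Set<String> TOY_DICTIONARY =
            new HashSet<>(Arrays.asList("我", "住在", "纽约", "纽约市"));

    // Returns the input with single spaces between matched words.
    // Characters not covered by the dictionary become one-character tokens.
    // Works on Java chars, so it assumes text within the Basic Multilingual Plane.
    static String segment(String text, Set<String> dictionary) {
        int maxLen = 1;
        for (String word : dictionary) {
            maxLen = Math.max(maxLen, word.length());
        }

        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            if (Character.isWhitespace(text.charAt(i))) {
                i++; // existing whitespace is dropped and re-inserted uniformly
                continue;
            }
            // Try the longest candidate first, then shrink until a match
            // is found or only a single character is left.
            int end = Math.min(text.length(), i + maxLen);
            while (end > i + 1 && !dictionary.contains(text.substring(i, end))) {
                end--;
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(text, i, end);
            i = end;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints: 我 住在 纽约市
        System.out.println(segment("我住在纽约市", TOY_DICTIONARY));
    }
}
```

The segmented string can then be typed or piped into the interactive shell in place of the raw sentence. For real text, the dictionary lookup would need to be replaced with a proper segmentation library.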

aasish avatar Nov 05 '17 14:11 aasish