nltk_data
"murciélago", Spanish for "bat", is not found in WordNet (OMW)
grep murciélago
20274:02141611-n lemma murciélago ratonero
20285:02143142-n lemma murciélago trompudo mexicano
^ it exists in the omw folder
>>> print(wn.synsets("murciélago", lang="spa"))
[]
>>> print(wn.synsets("gato", lang="spa"))
[Synset('cat.n.01'), Synset('tom.n.02'), Synset('dodger.n.01')]
The Spanish word for "cat" ("gato") is found, but the word for "bat" ("murciélago") is not.
Hi,
Please try our new, improved interface: https://github.com/goodmami/wn
(In reply to Jamie Holland's message of Mon, Mar 1, 2021, 11:16 PM.)
-- Francis Bond http://www3.ntu.edu.sg/home/fcbond/ Division of Linguistics and Multilingual Studies Nanyang Technological University
@JamesArthurHolland the OMW lines that you quote don't mean that "murciélago" exists as a single word in OMW-1.4, but that it exists as a part of two compounds:
>>> from nltk.corpus import wordnet as wn
>>> print(wn.synsets("murciélago_ratonero", lang="spa"))
[Synset('mouse-eared_bat.n.01')]
>>> print(wn.synsets("murciélago_trompudo_mexicano", lang="spa"))
[Synset('hognose_bat.n.01')]
>>> print(wn.synsets("bat")[0].lemmas(lang="spa"))
[Lemma('bat.n.01.chiroptera')]
However, the word "murciélago" exists in the Spanish wordnet released by MCR in 2016, so it could help if OMW caught up with that data.
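To make the difference between a grep hit and an exact lemma match concrete, here is a minimal sketch that needs only the Python standard library. It uses the two lines from the grep output quoted in this thread as sample data. The helper names (`parse_line`, `classify`, `to_lookup_key`) are hypothetical, and the whitespace-separated field layout is an assumption based on the quoted output, not a description of the actual OMW file format.

```python
# Sample lines copied from the grep output quoted in this thread.
# Assumption: three whitespace-separated fields (offset, type, lemma);
# the real OMW data files may differ in detail.
SAMPLE_LINES = [
    "02141611-n lemma murciélago ratonero",
    "02143142-n lemma murciélago trompudo mexicano",
]


def parse_line(line):
    """Split one line into (synset offset, lemma text)."""
    offset, _kind, lemma = line.split(None, 2)
    return offset, lemma.strip()


def classify(word, lines):
    """Separate exact lemma matches from lemmas that only contain the word."""
    exact, partial = [], []
    for line in lines:
        offset, lemma = parse_line(line)
        if lemma == word:
            exact.append((offset, lemma))
        elif word in lemma.split():
            partial.append((offset, lemma))
    return exact, partial


def to_lookup_key(lemma):
    """Multiword lemmas are queried with underscores, as in the calls above."""
    return lemma.replace(" ", "_")


exact, partial = classify("murciélago", SAMPLE_LINES)
print("exact matches:", exact)
print("compound lemmas:", [to_lookup_key(lemma) for _, lemma in partial])
```

With these two sample lines, the exact-match list is empty and both entries land in the compound list, which mirrors why `wn.synsets("murciélago", lang="spa")` returns nothing while the underscore-joined compounds resolve.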