h2o-3
h2o.findSynonyms fails if the 'word' parameter is unknown to the word2vec model
I received the following error when executing print(h2o.findSynonyms(w2v_model, "National", count = 5)): Error in eval(substitute(expr), data, enclos = parent.frame()) : object 'score' not found. I'm curious why 'score' is missing.
In contrast, print(h2o.findSynonyms(w2v_model, "national", count = 5)) returns the scores as expected.
Hi @dmresearch15. Thanks for reporting this issue.
It looks like there is a bug: h2o.findSynonyms raises an error instead of returning a result for an unseen word.
We definitely need to fix it.
I reproduced the error with this code:
library(h2o)
h2o.init()
job_titles <- h2o.importFile("https://s3.amazonaws.com/h2o-public-test-data/smalldata/craigslistJobTitles.csv", col.names = c("category", "jobtitle"), col.types = c("String", "String"), header = TRUE)
words <- h2o.tokenize(job_titles, " ")
vec <- h2o.word2vec(training_frame = words)
# pass: "teacher" is a word the model has seen
syn <- h2o.findSynonyms(vec, "teacher", count = 20)
print(syn)
# fail: "Tteacher" is unknown to the model
syn2 <- h2o.findSynonyms(vec, "Tteacher", count = 20)
print(syn2)
I'm currently incorporating this into my project, so it would be helpful to have a timeframe for resolving this issue.
Hi @dmresearch15, I fixed the bug in the R API here: https://github.com/h2oai/h2o-3/pull/16280. Hopefully, this change will be included in the fix release at the end of the week.
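Until a release with the fix is installed, a small guard around the call may be enough as a workaround. The sketch below is only an illustration: `safe_find_synonyms` and its `finder` argument are hypothetical names, not part of the h2o package, and it assumes the failure always surfaces as the "object 'score' not found" message quoted above.

```r
# Hypothetical wrapper (not part of h2o): returns an empty result instead of
# failing when the word is unknown to the word2vec model.
# `finder` defaults to h2o::h2o.findSynonyms; it is a parameter only so the
# wrapper can be exercised without a running cluster.
safe_find_synonyms <- function(model, word, count = 20,
                               finder = h2o::h2o.findSynonyms) {
  tryCatch(
    finder(model, word, count = count),
    error = function(e) {
      # Assumption: an unknown word surfaces as this exact message.
      if (grepl("object 'score' not found", conditionMessage(e), fixed = TRUE)) {
        return(data.frame(synonym = character(0), score = numeric(0)))
      }
      stop(e)  # re-raise anything else so real problems are not hidden
    }
  )
}
```

With a trained model, `safe_find_synonyms(vec, "Tteacher", count = 20)` should then give a zero-row data frame with `synonym` and `score` columns rather than an error.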
Before the fix, the call failed with the error you shared whenever the model couldn't find synonyms. The question remains: why can your model find synonyms for "national" but not for "National"? You may need to tune your model a little.
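One possible cause, and this is a guess rather than something confirmed in this thread, is that the tokens were not lower-cased before training, so "National" and "national" are treated as different words and only one of them made it into the vocabulary. A minimal sketch of lower-casing both the training tokens and the query follows; `normalize_query` and `train_lowercased_w2v` are hypothetical helper names, and the training function assumes a running H2O cluster and a frame like `job_titles` from the reproduction.

```r
# Hypothetical helper: normalize a query word the same way as the training
# tokens (lower-cased, surrounding whitespace trimmed).
normalize_query <- function(word) tolower(trimws(word))

# Sketch of a training step that lower-cases tokens before word2vec.
# Assumes h2o.init() has been called and `frame` holds string columns.
train_lowercased_w2v <- function(frame) {
  words <- h2o::h2o.tokenize(frame, " ")
  words_lower <- h2o::h2o.tolower(words)
  h2o::h2o.word2vec(training_frame = words_lower)
}

# Usage (requires a running cluster):
# vec <- train_lowercased_w2v(job_titles)
# h2o.findSynonyms(vec, normalize_query("National"), count = 5)
```

If the casing carries meaning in your data, lower-casing may not be appropriate; in that case, checking how often each form of the word appears in the training tokens would be a reasonable first step.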