gnverifier

Wrong OTL ids returned

Open Adafede opened this issue 4 years ago • 3 comments

Hi,

I am currently comparing the Open Tree of Life IDs retrieved via GNVerify with those retrieved via rotl (the official Open Tree of Life API client for R). They give almost identical results (which is good!), but in some cases they differ.

Here is an example:

echo "Petroselinum crispum" | gnverify -s 179 -f pretty

giving

"recordId": "959097"

whereas doing it via rotl (in R):

library(rotl)

name <- "Petroselinum crispum"

tnrs_match_names(
  names = name,
  do_approximate_matching = FALSE,
  include_suppressed = FALSE
)

  search_string        unique_name          approximate_match ott_id is_synonym flags
1 petroselinum crispum Petroselinum crispum FALSE             2485   FALSE
  number_matches
1 1

This is confirmed by comparing:

https://tree.opentreeoflife.org/taxonomy/browse?name=2485

vs

https://tree.opentreeoflife.org/taxonomy/browse?name=959097

Thank you again for your wonderful work. I hope these issue reports help!

Adafede, Feb 08 '21 11:02

I wonder if they made updates to OTT but have not published them yet. It seems the version available for download is still OTT v3.2 from 2019:

https://tree.opentreeoflife.org/about/taxonomy-version/ott3.2

dimus, Feb 08 '21 17:02

When looking at both

https://tree.opentreeoflife.org/taxonomy/browse?name=2485

vs

https://tree.opentreeoflife.org/taxonomy/browse?name=959097

Could it be that your data contains only the first part of the name, "Petroselinum crispum", and that "Neapolitanum Group" was cropped? That might explain it...

I have the ott3.2 version locally, and both records are present there in exactly this form.
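
To illustrate the cropping hypothesis, here is a minimal Python sketch. It is not gnverify's actual code: the two-record table, the function names, and the label assumed for 959097 ("Petroselinum crispum Neapolitanum Group", taken from the comment above) are all hypothetical. It only shows how truncating names to the first two words could make both records match the same query.

```python
# Hypothetical miniature of the two OTT records discussed in this thread.
# The label for 959097 is assumed from the comment above, not verified here.
RECORDS = [
    ("959097", "Petroselinum crispum Neapolitanum Group"),
    ("2485", "Petroselinum crispum"),
]


def crop_to_binomial(name):
    """Keep only the first two words (genus and species epithet)."""
    return " ".join(name.split()[:2])


def matching_ids(query, crop):
    """Return IDs whose (optionally cropped) name equals the query, in storage order."""
    ids = []
    for record_id, name in RECORDS:
        key = crop_to_binomial(name) if crop else name
        if key == query:
            ids.append(record_id)
    return ids


full = matching_ids("Petroselinum crispum", crop=False)
cropped = matching_ids("Petroselinum crispum", crop=True)
print(full)     # only the record whose full name equals the query
print(cropped)  # both records collide once the group name is cropped
```

If names were stored in cropped form, whichever record happens to come first would win, which would be consistent with 959097 being returned instead of 2485.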

Adafede, Feb 08 '21 17:02

I see, I think it is a bug. gnverify should return as the best result https://tree.opentreeoflife.org/taxonomy/browse?name=2485, because it was parsed clearly.

dimus, Feb 08 '21 17:02