Suggest: more helpful error message for entrez_search with high retmax value

Open richelbilderbeek opened this issue 3 years ago • 4 comments

Dear rentrez maintainer,

Thanks (again!) for the awesome package. Here I submit a feature request and suggest a more helpful error message.

rentrez::entrez_search usually works fine, even when there are 48836 IDs. Setting retmax to a value above the number of IDs works as expected:

# Works, 48836 hits
rentrez::entrez_search( 
  db = "SNP",
  term = "EGFR[Gene Name]",
  retmax = 50000
)

However, when doubling the retmax value, I see an unhelpful error:

  # Error in ans[[1]] : subscript out of bounds
  rentrez::entrez_search(
    db = "SNP",
    term = "EGFR[Gene Name]",
    retmax = 100000
  )

I would prefer an error message that is more descriptive: now I can only guess what goes wrong.

I think this will make rentrez even nicer to use.

Cheers, Richel Bilderbeek

richelbilderbeek avatar Nov 16 '20 12:11 richelbilderbeek

We can try to catch these or think of a way around the problem. But for now, this is the result of R turning 100000 into 1e5. You could either set retmax to 100001 or use options("scipen"=15) to change how quickly R starts using scientific notation.
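To illustrate the behaviour described above, here is a minimal base R sketch. It does not call rentrez or NCBI; it only shows how R renders whole numbers as text, assuming the default scipen of 0 (set explicitly here so the result does not depend on the session).

```r
# Illustrative only: how R renders whole numbers as text.
# This does not call rentrez or NCBI.
old <- options(scipen = 0)      # R's default penalty, set explicitly

round_value  <- format(100000)  # "1e+05" is shorter than "100000", so R prefers it
offset_value <- format(100001)  # fixed notation is shorter here, so R keeps it

options(scipen = 15)            # penalise scientific notation
with_penalty <- format(100000)  # now rendered in fixed notation

options(old)                    # restore the previous setting

print(c(round_value, offset_value, with_penalty))
```

This is why 100001 works as a workaround while 100000 does not: only the round value is shorter in scientific form.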

Linking to #157, which also deals with catching NCBI reported ERROR fields in records returned with HTTP code 200

dwinter avatar Nov 16 '20 20:11 dwinter

Thanks for this useful workaround!

In my own packages, when I need to save a large number to file as-is (e.g. 1000000), I temporarily set scipen to a high value and then undo that change afterwards. It is a bit clumsy (a printf-like statement feels superior), but I felt it was best for the user.
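A minimal sketch of that set-and-restore pattern, assuming a hypothetical helper name (number_as_text is not part of rentrez). It uses on.exit so the caller's option is restored even if an error occurs.

```r
# Hypothetical helper (not part of rentrez): render a number as plain
# digits, then put the caller's scipen setting back.
number_as_text <- function(x) {
  old <- options(scipen = 999)        # strongly discourage scientific notation
  on.exit(options(old), add = TRUE)   # restore on exit, including on error
  format(x, trim = TRUE)
}

before <- getOption("scipen")
txt <- number_as_text(1000000)
after <- getOption("scipen")

print(txt)
print(identical(before, after))       # the option is unchanged for the caller
```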

I do volunteer to fix this Issue using your scipen approach by Pull Request. Let me know :+1:

richelbilderbeek avatar Nov 17 '20 06:11 richelbilderbeek

You can also convert it back to a number with a quick as.integer(retmax), which avoids messing with users' options.
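A short sketch of this suggestion in base R, under the assumption that the value is later turned into text with as.character or paste0 (how rentrez builds its request is not shown in this thread). Integer values are written out digit by digit, so no option needs to change.

```r
retmax <- 100000                    # a double, as usually typed by users
retmax_int <- as.integer(retmax)    # integer storage

int_text  <- as.character(retmax_int)
query_bit <- paste0("retmax=", retmax_int)  # illustrative string, not rentrez's real query

print(is.integer(retmax_int))
print(int_text)
print(query_bit)

# Caveat: values above .Machine$integer.max become NA with a warning.
too_big <- suppressWarnings(as.integer(3e9))
print(is.na(too_big))
```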

allenbaron avatar Aug 23 '21 18:08 allenbaron

IMHO, instead of manipulating scipen, it is better to use format(x, scientific=FALSE), which converts the number into text with no side effects. trim=TRUE is optional.
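A minimal sketch of this approach in base R. The variable names are illustrative; the vector example shows where trim=TRUE makes a difference, since format otherwise pads elements to a common width.

```r
retmax <- 100000

# Fixed notation regardless of the scipen option; no global state is touched.
single <- format(retmax, scientific = FALSE)

# With several values, format() pads to a common width unless trim = TRUE.
padded  <- format(c(1, 100000), scientific = FALSE)
trimmed <- format(c(1, 100000), scientific = FALSE, trim = TRUE)

print(single)
print(padded)
print(trimmed)
```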

J-Moravec avatar Sep 29 '22 01:09 J-Moravec