readxl icon indicating copy to clipboard operation
readxl copied to clipboard

cell range with column "CR" create an error.

Open ksetdekov opened this issue 6 years ago • 9 comments

Brief description of the problem While running read_xlsx on a list of files that have a range of required data similar to "B2:CR127642" get an error "Cell reference follows neither the A1 nor R1C1 format. Example:" instead of processing data.

for (i in 61:61) {
+   print(paste("Iteration:", i))
+   cur.file <- readxl::read_xlsx(paste(path, files[i], sep = "/"),
+                                 range = ranges[i])
+   print(object.size(cur.file) / 1024 / 1024)
+   size = size + object.size(cur.file) / 1024 / 1024
+ }

#>[1] "Iteration: 61" #> Error: Cell references aren't uniformly A1 or R1C1 format: #> B2:CR127642 #> In addition: Warning message: #> Cell reference follows neither the A1 nor R1C1 format. Example: #> CR127642 #> NAs generated.

ksetdekov avatar Feb 19 '19 13:02 ksetdekov

From the readxl Sheet Geometry documentation:

Excel-style range: Specify a fixed rectangle with range = "A1:D4" or range = "R1C1:R4C4". You can even prepend the worksheet name like so: range = "foofy!A1:D4" and it will be passed along to the sheet argument.

It's not parsing B2:CR127642 as either of these formats, which is why you're getting that error. Is CR the last column you want to select and 127,642 the last row?

batpigandme avatar Feb 19 '19 13:02 batpigandme

This was fixed in https://github.com/rsheets/cellranger/pull/28 which, as far as I can tell, hasn't been released to CRAN.

nacnudus avatar Feb 19 '19 13:02 nacnudus

@batpigandme Yes, this is exactly the case. By accident some of the data files I encountered had CR as last column.

Is CR the last column you want to select and 127,642 the last row?

ksetdekov avatar Feb 19 '19 15:02 ksetdekov

Hi there,

I am getting an error when using range to read data. Could you point me on how do I overcome this? Screenshot 2019-09-18 at 13 30 57

Kind Regards, Karthik

karthikkommalapati avatar Sep 18 '19 12:09 karthikkommalapati

Try installing cellranger from GitHub for the fix mentioned above, i.e. for it to recognize these larger row numbers. I suppose I need to do a cellranger release as this particular bug has come up a couple times in the past few weeks. Are spreadsheets getting bigger? 🤔

jennybc avatar Sep 18 '19 15:09 jennybc

Hi @jennybc,

thank you for getting back, just installed cellranger, but still have the same issue. No, the excel document is really small one.

I have a workaround for this issue (used anchored instead and it works fine), but at the same, this is an open issue I guess for someone who is using the range?

Kind Regards, Karthik

karthikkommalapati avatar Sep 18 '19 15:09 karthikkommalapati

No, the excel document is really small one.

But it has 100,000 rows? I'm getting that from your need to read the range C7:bb100000.

jennybc avatar Sep 18 '19 20:09 jennybc

Apologies for the late reply, not the spreadsheet only have a couple of hundred rows in one particular document and in there are thousands so for safe side I put 100k.

Kind Regards, Karthik

karthikkommalapati avatar Sep 20 '19 20:09 karthikkommalapati

I suggest you take a different approach to specifying the range then. Are you sure readxl can't figure it out automatically (or certain bounds of the box) just by looking at which cells are populated?

https://readxl.tidyverse.org/articles/sheet-geometry.html

jennybc avatar Sep 20 '19 22:09 jennybc