CytoExploreR
Cannot read files including "[ ]"
Describe the bug
Files whose names include "[ ]" (e.g., [test].fcs) cannot be read by cyto_setup(). The SONY MA900 cell sorter automatically includes "[ ]" in some file names (e.g., test_[15 mL Tubes] Data Source - 1.fcs), so I would be grateful if these files could be read without renaming.
To Reproduce
library(CytoExploreR)
filename <- list.files("./")
## load files
gs <- cyto_setup("./", select = filename, gatingTemplate = "Activation-gatingTemplate.csv")
print(filename)
cyto_names(gs)
output:
print(filename)
[1] "[test].fcs" "test.fcs" "test.R"
cyto_names(gs)
[1] "test.fcs"
Desktop (please complete the following information):
- OS: Windows 11, RStudio, R 4.1.0
Additional context
Thank you for your great packages!
@tamaikeiichi, can't say I have ever tried to read files with square brackets in the name. Are you able to share a file so I can track down and fix the problem?
Thank you for your quick reply. Attached files are the same records, but the file names are different (renamed). test.zip
I can confirm that you are able to read in the files in the coming version of CytoExploreR:
cs <- cyto_load("Debug-Files")
cyto_names(cs)
[1] "[test].fcs" "test.fcs"
I will switch to the version of CytoExploreR you are using and see if it is an easy fix.
It seems like the problem comes from the file selection through select:
cs <- cyto_load("Debug-Files", select = c("test.fcs", "[test].fcs"))
cyto_names(cs)
[1] "test.fcs"
I will take a closer look and report back soon.
Thank you for your kind reply.
I have also confirmed the problem with select. Without it, both files are read:
temp <- cyto_setup()
cyto_names(temp)
[1] "[test].fcs" "test.fcs"
I'd appreciate it if you could fix this problem.
Yeah, the problem is due to the fuzzy matching performed by select. It is trying to match [test].fcs in the file names, but this syntax has a different meaning in regular expressions (square brackets denote a character class), so it returns incorrect matches.
The solution would be for me to set fixed = TRUE when I perform the fuzzy matching so that it tries to match [test].fcs exactly. The problem is that if I switch to fixed matching, case becomes important, so if you accidentally type [Test].fcs you won't get a match to [test].fcs.
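The mismatch can be reproduced outside CytoExploreR with base R. The sketch below assumes that select is matched against file names with a grepl()-style regular expression, which is an assumption about the internals rather than something confirmed in this thread. The file names are the ones from the report.

```r
# File names from the report above.
files <- c("[test].fcs", "test.fcs", "test.R")

# Default matching treats the pattern as a regular expression:
# "[test]" is a character class (one of t, e, s) and "." is any character.
regex_hits <- grepl("[test].fcs", files)
files[regex_hits]   # "test.fcs"

# fixed = TRUE treats the pattern as a literal string.
fixed_hits <- grepl("[test].fcs", files, fixed = TRUE)
files[fixed_hits]   # "[test].fcs"

# Literal matching is case sensitive, so a typo in case finds nothing.
typo_hits <- grepl("[Test].fcs", files, fixed = TRUE)
any(typo_hits)      # FALSE
```

Under that assumption, the regular-expression call reproduces the reported result: only test.fcs is selected.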
Hmm... I might need to think about this a bit more, but one alternative is to add a fixed argument to cyto_load() and cyto_setup() so that you can have control over the matching.
Is there any reason why you can't do the following instead:
cs <- cyto_load(".", select = "test")
I would avoid passing special characters like [, {, ., :, !, | or * to the select and exclude arguments.
OK so I think I have a solution. I think I will convert the matching criteria and file names to lower case during matching so that the match is not case sensitive but still an exact match.
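A minimal sketch of that idea in base R follows. The helper name match_ignore_case() is hypothetical and is not part of CytoExploreR. It lower-cases both the selection criteria and the file names, then matches them as literal strings, so brackets need no escaping.

```r
# Hypothetical helper, not a CytoExploreR function: literal, case-insensitive
# matching of one or more selection strings against file names.
match_ignore_case <- function(files, select) {
  files_lower <- tolower(files)
  keep <- rep(FALSE, length(files))
  for (pattern in tolower(select)) {
    # fixed = TRUE disables regular expression syntax, so "[" and "." are literal.
    keep <- keep | grepl(pattern, files_lower, fixed = TRUE)
  }
  files[keep]
}

files <- c("[test].fcs", "test.fcs", "test.R")

bracket_match <- match_ignore_case(files, "[Test].fcs")
bracket_match   # "[test].fcs"

partial_match <- match_ignore_case(files, "test")
partial_match   # all three names contain "test"
```

The trade-off is that names differing only in case can no longer be told apart by select.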
This sort of matching would come in handy in a few places, so I may need to write a separate function to handle this.
I will let you know once I have pushed the fix to GitHub.
I really appreciate your cooperation. I'm looking forward to it.
@DillonHammill I trust you to make a good solution, but Linux, MacOS, and R itself are case sensitive environments so ignoring case could have more extreme consequences as a work-around for handling special characters. Could the brackets just be delivered to select as properly escaped regular expressions?
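For illustration, escaping could look like the sketch below. The helper escape_regex() is hypothetical and is not an existing CytoExploreR or base R function. It prefixes each regular expression metacharacter with a backslash so that the result can be passed to grepl() as an ordinary, case-sensitive pattern.

```r
# Hypothetical helper: backslash-escape regular expression metacharacters.
escape_regex <- function(x) {
  # perl = TRUE so that backslash escapes inside the character class are well defined.
  gsub("([\\\\\\[\\](){}.+*?^$|])", "\\\\\\1", x, perl = TRUE)
}

files <- c("[test].fcs", "test.fcs", "test.R")

pattern <- escape_regex("[test].fcs")
pattern   # the characters \[test\]\.fcs

# The escaped pattern matches only the literal file name and stays case sensitive.
exact_hits <- grepl(pattern, files)
files[exact_hits]   # "[test].fcs"

# Case-insensitive matching remains an explicit opt-in.
loose_hits <- grepl(escape_regex("[Test].fcs"), files, ignore.case = TRUE)
files[loose_hits]   # "[test].fcs"
```

This keeps regular expression matching available for callers who want it, while names containing brackets are still matched literally.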