Atlas-CNV
cp: cannot create regular file
Hello,
I've been getting the following error messages for every sample in my list:
cp: cannot create regular file ‘/62209052_S14-ready.DATA.sample_interval_summary’: Permission denied
cp: cannot stat ‘testmidpool’: No such file or directory
Error in file(file, "rt") : cannot open the connection
Calls: read.table -> file
In addition: Warning message:
In file(file, "rt") : cannot open file 'testmidpool': No such file or directory
Execution halted
(sample_list.txt).tfile RPKM_matrix.testmidpool
There are no permission issues: the DATA.sample_interval_summary files have rwx for both user and group.
I am running the tool using this command:
perl atlas_cnv.pl --config config --panel panel_file.txt --sample sample_list.txt
My config file looks like this:
GATKDIR=/data/cnv/gatk_doc/atlas_cnv/[SAMPLE_FCLBC].DATA.sample_interval_summary
ATLASCNV=/home/josianne/tools/Atlas-CNV
RPATH=/home/josianne/tools/R-3.6.2/bin/R
RSCRIPT=/home/josianne/tools/R-3.6.2/bin/Rscript
My panel file looks like this:
1:10356935-10357156 KIF1B_19 Y NM_015074.3
1:10357212-10357325 KIF1B_20 Y NM_015074.3
1:10380081-10380215 KIF1B_21 Y NM_015074.3
1:10381747-10381936 KIF1B_22 Y NM_015074.3
1:10383922-10384141 KIF1B_23 Y NM_015074.3
1:10384796-10384974 KIF1B_24 Y NM_015074.3
1:10386149-10386438 KIF1B_25 Y NM_015074.3
1:10394558-10394717 KIF1B_26 Y NM_015074.3
And my sample list looks like this:
10001798-1c_S1-ready F testmidpool
10001984_S3-ready_2078171086 M testmidpool
10002025-1c_S4-ready M testmidpool
10002032_S28-ready F testmidpool
10002045_S21-ready M testmidpool
10002159_S20-ready F testmidpool
10002291_S26-ready M testmidpool
10002299_S25-ready M testmidpool
The script used to create the testmidpool folder and then print the 'cp: cannot create regular file' messages. However, after I made a few changes, it no longer even creates the testmidpool folder.
Thank you for your help, Josianne
Hi Josianne,
I'm sorry about the issues. The basic idea is that the script copies files from a directory containing GATK DoC files (i.e. *.DATA.sample_interval_summary) to a midpool folder, i.e. 'testmidpool' as defined in your sample file.
The source folder is defined in the config: /data/cnv/gatk_doc/atlas_cnv/. You should have the following files:
/data/cnv/gatk_doc/atlas_cnv/10001798-1c_S1-ready.DATA.sample_interval_summary
/data/cnv/gatk_doc/atlas_cnv/10001984_S3-ready_2078171086.DATA.sample_interval_summary
/data/cnv/gatk_doc/atlas_cnv/10002025-1c_S4-ready.DATA.sample_interval_summary
etc...
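To confirm that every sample in the list has a matching source file, a loop over the sample list can help. This is only a minimal sketch: check_doc_files is a hypothetical helper, not part of Atlas-CNV, and it assumes the sample list is whitespace-delimited, has no header line, and carries the sample ID in its first column.

```shell
# Hypothetical helper (not part of Atlas-CNV): report samples whose
# GATK DoC file is missing or unreadable in the source directory.
# Assumes column 1 of the sample list is the sample ID and that the
# list has no header line.
check_doc_files() {
    sample_list=$1
    doc_dir=$2
    missing=0
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while read -r sample _rest || [ -n "$sample" ]; do
        [ -z "$sample" ] && continue   # skip blank lines
        f="$doc_dir/$sample.DATA.sample_interval_summary"
        if [ ! -r "$f" ]; then
            echo "missing or unreadable: $f"
            missing=$((missing + 1))
        fi
    done < "$sample_list"
    # Succeed only if every sample had a readable file.
    [ "$missing" -eq 0 ]
}
```

With the paths from this thread it could be called as: check_doc_files sample_list.txt /data/cnv/gatk_doc/atlas_cnv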
You should also check that you have permission to create folders in the directory from which you are running the perl atlas_cnv.pl command. Try the following commands. Do you see 'Permission denied' or some other error?
mkdir testmidpool
cp /data/cnv/gatk_doc/atlas_cnv/10001798-1c_S1-ready.DATA.sample_interval_summary testmidpool/
Hope this helps, Ted
I'm able to create a directory and copy the GATK DoC files into it without any issues. I decided to manually create the testmidpool directory and copy all the *DATA.sample_interval_summary files into it, and I commented out the parts of the script that do that.
However, I am now running into issues with the convert_GATK_DoC.R script. There seem to be errors related to the dimensions of the GATK DoC files and the panel file.
Error in rrr[1, which(PANEL$Call_CNV == "Y")] : subscript out of bounds
Execution halted
When I commented that line of code out (since all my target regions have Call_CNV = Y), I got the following error:
Error: convert_GATK_DoC.R: GATK_DoC Exon_Target coords do not match the panel design. Exit R and die.
I noticed that the GATK DoC files were of different lengths than the panel file, and fixed it by removing the duplicate rows and ROIs that had no data in the DoC files. But I am still getting error messages:
Warning message:
In cbind(as.vector(PANEL$Gene_Exon[which(PANEL$Call_CNV == "Y")]), :
  number of rows of result is not a multiple of vector length (arg 1)
I'm wondering if maybe there is something wrong with the DoC files to start with. Do you have any advice as to what could be going on here?
Thank you
The GATK DoC file should have exactly the same coordinates as the panel file.
That is, the first column of the *.DATA.sample_interval_summary file:
Target
1:10356935-10357156
1:10357212-10357325
...
should be the same as the first column of the panel file:
Exon_Target
1:10356935-10357156
1:10357212-10357325
...
You can do "cut -f1" on a GATK DoC and on the panel file and then do a "diff" to see if they are identical:
cut -f1 aaaa.DATA.sample_interval_summary > col1.gatk
cut -f1 panel_file.txt > p.col1
diff col1.gatk p.col1
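The same comparison can be repeated for every DoC file in the midpool folder. The sketch below is one possible way to do that, not something shipped with Atlas-CNV: compare_coords is a hypothetical helper, and it assumes both files are tab-delimited and each has exactly one header line (Target and Exon_Target), which is skipped so that the differing headers do not count as a mismatch.

```shell
# Hypothetical helper: compare column 1 of each GATK DoC file in a
# midpool directory against column 1 of the panel file.
# Assumes tab-delimited files with one header line each.
compare_coords() {
    panel=$1
    midpool=$2
    rc=0
    panel_cols=$(mktemp)
    doc_cols=$(mktemp)
    # Drop the header line, keep only the coordinate column.
    tail -n +2 "$panel" | cut -f1 > "$panel_cols"
    for doc in "$midpool"/*.DATA.sample_interval_summary; do
        [ -e "$doc" ] || continue      # no matching files at all
        tail -n +2 "$doc" | cut -f1 > "$doc_cols"
        # cmp -s is silent and exits non-zero when the files differ.
        if ! cmp -s "$doc_cols" "$panel_cols"; then
            echo "MISMATCH: $doc"
            rc=1
        fi
    done
    rm -f "$panel_cols" "$doc_cols"
    return "$rc"
}
```

For example: compare_coords panel_file.txt testmidpool. Any file reported as a mismatch can then be inspected with the cut and diff commands above.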
Also, I noticed that GATK DoC may merge targets that overlap (if I recall, vaguely), which may throw off the agreement with the coordinates in the panel file. The idea is to make sure both match.
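If merged or duplicated targets are suspected, it may help to list panel intervals that overlap each other before comparing. This is a rough sketch under stated assumptions: find_overlaps is a hypothetical helper, the panel is assumed to be tab-delimited with one header line, and coordinates are assumed to look like chr:start-end with no ':' or '-' inside the chromosome name.

```shell
# Hypothetical helper: list duplicate or overlapping targets in
# column 1 of a panel file. Assumes chr:start-end coordinates and
# one header line.
find_overlaps() {
    tail -n +2 "$1" | cut -f1 |
        awk -F'[:-]' '{ print $1 "\t" $2 "\t" $3 }' |
        sort -k1,1 -k2,2n -k3,3n |
        awk -F'\t' '{
            # Overlap: same chromosome and start not past the
            # furthest end seen so far on that chromosome.
            if ($1 == chr && $2 + 0 <= end) {
                print "overlap: " chr ":" start "-" end " and " $1 ":" $2 "-" $3
            }
            # Track the interval that reaches furthest on this chromosome.
            if ($1 != chr || $3 + 0 > end) {
                chr = $1; start = $2; end = $3 + 0
            }
        }'
}
```

An exact duplicate row is reported as an overlap too, since its start falls inside the previous interval. Running find_overlaps panel_file.txt and getting no output would suggest the panel itself has no overlapping targets.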
If the problem persists, you can post a sample GATK file and the panel file, and I can have a look. Or send them as attachments by email if you prefer.
Ted