slamdunk
error in alleyoop summary
Hi Tobias,
I'm having problems with alleyoop summary.
I'm using slamdunk through Docker and have run the following commands:
slamdunk all alleyoop summary
This is the command:
alleyoop summary -t /data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36/count/
-o /data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36/summary/out_file_summary
/data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36/filter/*_filtered.bam
The error I'm getting is:
Running alleyoop summary for 6 files
Traceback (most recent call last):
File "/opt/conda/envs/slamdunk/bin/alleyoop", line 8, in
log file contains:
/opt/conda/envs/slamdunk/lib/python3.7/site-packages/slamdunk/plot/PCAPlotter.R -f /tmp/tmpxclbfgl2 -O /data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36/summary/out_file_summary_PCA.pdf -P /data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36/summary/out_file_summary_PCA.txt b'Error in pca$x[, 2] : subscript out of bounds\n'b'Calls: data.frame\n'b'Execution halted\n'
Thanks for your help,
Felipe
Hi Felipe,
Do you have a file produced in /tmp/tmpxclbfgl2? And if yes, what does it look like?
Yes, I do have the file /tmp/tmpxclbfgl2.
This is what the file contains: the paths to the tsv files.
sample_0 /data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36/count/0d_1rep_R1.fastq_slamdunk_mapped_filtered_tcount.tsv
sample_0 /data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36/count/0d_2rep_R1.fastq_slamdunk_mapped_filtered_tcount.tsv
sample_0 /data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36/count/3d_1rep_R1.fastq_slamdunk_mapped_filtered_tcount.tsv
sample_0 /data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36/count/3d_2rep_R1.fastq_slamdunk_mapped_filtered_tcount.tsv
sample_0 /data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36/count/Wash_1rep_R1.fastq_slamdunk_mapped_filtered_tcount.tsv
sample_0 /data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36/count/Wash_2rep_R1.fastq_slamdunk_mapped_filtered_tcount.tsv
Hm, so a very basic thing to suggest is to start R from inside the container and step through the PCAPlotter.R script to see what the issue is. It doesn't really look like a library loading issue, but rather like something being malformed in the files themselves. Is that something you could check?
I went through PCAPlotter.R and I don't understand why the "countsList" output has 14 columns. It should have 6, because I have 6 tsv files.
countsList = list()
for (i in 1:nrow(samples)) {
curTab = read.delim(samples$file[i],stringsAsFactors=FALSE, comment.char="#")
countsList[[samples$sample[i]]] = curTab$TcReadCount
}
Samples looks correct.
samples
    sample
1 sample_0
2 sample_0
3 sample_0
4 sample_0
5 sample_0
6 sample_0
  file
1 /data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36/count/0d_1rep_R1.fastq_slamdunk_mapped_filtered_tcount.tsv
2 /data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36/count/0d_2rep_R1.fastq_slamdunk_mapped_filtered_tcount.tsv
3 /data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36/count/3d_1rep_R1.fastq_slamdunk_mapped_filtered_tcount.tsv
4 /data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36/count/3d_2rep_R1.fastq_slamdunk_mapped_filtered_tcount.tsv
5 /data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36/count/Wash_1rep_R1.fastq_slamdunk_mapped_filtered_tcount.tsv
6 /data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36/count/Wash_2rep_R1.fastq_slamdunk_mapped_filtered_tcount.tsv
However, when I print "countsList" I have 14 columns.
If I keep going in the code, "variances = apply(countMatrix, 1, var)" returns only NA values, which is going to affect the downstream code.
I couldn't find the cause of the error.
Is there any chance you could zip me the _tcount.tsv files as well as the /tmp/tmpxclbfgl2 file so I can check myself?
Yes, I can. How can I send the data to you privately?
You could email it - [email protected]
Ok now I know what's going on - sorry for taking so long.
sample_0 /groups/zuber/zubarchive/USERS/tobias/tmp/slamdunkdebug/felipe_tsv_files/0d_1rep_R1.fastq_slamdunk_mapped_filtered_tcount.tsv
sample_0 /groups/zuber/zubarchive/USERS/tobias/tmp/slamdunkdebug/felipe_tsv_files/0d_2rep_R1.fastq_slamdunk_mapped_filtered_tcount.tsv
sample_0 /groups/zuber/zubarchive/USERS/tobias/tmp/slamdunkdebug/felipe_tsv_files/3d_1rep_R1.fastq_slamdunk_mapped_filtered_tcount.tsv
sample_0 /groups/zuber/zubarchive/USERS/tobias/tmp/slamdunkdebug/felipe_tsv_files/3d_2rep_R1.fastq_slamdunk_mapped_filtered_tcount.tsv
sample_0 /groups/zuber/zubarchive/USERS/tobias/tmp/slamdunkdebug/felipe_tsv_files/Wash_1rep_R1.fastq_slamdunk_mapped_filtered_tcount.tsv
sample_0 /groups/zuber/zubarchive/USERS/tobias/tmp/slamdunkdebug/felipe_tsv_files/Wash_2rep_R1.fastq_slamdunk_mapped_filtered_tcount.tsv
The first column always has the same name. Did you name your samples differently, or always "sample_0"?
How do you run slamdunk?
Since there is only 1 sample name in there, a PCA naturally makes no sense and will crash.
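The diagnosis above can be checked before running alleyoop summary. What follows is a minimal Python sketch, not part of slamdunk. It assumes the sample sheet is whitespace-separated, with the sample name in the first column and the tcount path in the second, as in the listing above. The function names are hypothetical and the paths are shortened placeholders.

```python
from collections import Counter


def read_sample_sheet(text):
    """Parse 'name path' lines into (name, path) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            rows.append((parts[0], parts[1].strip()))
    return rows


def duplicated_names(rows):
    """Return the sample names that occur more than once, sorted."""
    counts = Counter(name for name, _ in rows)
    return sorted(name for name, n in counts.items() if n > 1)


def columns_after_keying(rows):
    """Mimic storing one entry per sample name: repeated names overwrite."""
    by_name = {}
    for name, path in rows:
        by_name[name] = path
    return by_name


# Placeholder paths, shortened for illustration only.
sheet = """sample_0 /count/0d_1rep_tcount.tsv
sample_0 /count/0d_2rep_tcount.tsv
sample_0 /count/3d_1rep_tcount.tsv
"""
rows = read_sample_sheet(sheet)
print(duplicated_names(rows))           # ['sample_0']
print(len(columns_after_keying(rows)))  # 1
```

Keying by a repeated name leaves a single entry, which mirrors what `countsList[[samples$sample[i]]]` does in the R loop quoted earlier. With only one column left, there is no second principal component, which is consistent with the "subscript out of bounds" error on `pca$x[, 2]`.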
slamdunk all -t 40 -o /data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36 -r /data/references/hg19/genes/gencode/male.hg19.fa -b /data/references/hg19/genes/gencode/gencode_v19_3utr_comprehensive_sorted_merged.bed /data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36/fastq_files/Wash_1rep_R1.fastq.gz
All other alleyoop functions work fine; only the PCA does not work.
How can I fix this issue?
What if you input all fastq files into the slamdunk all command via wildcard (*)?
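One way to assemble such a call is sketched below in Python. It uses only the flags that appear in the command quoted in this thread (-t, -o, -r, -b). The helper name `build_slamdunk_all` and the `*.fastq.gz` pattern are assumptions for illustration. Whether this yields distinct sample names is what the suggestion is meant to test, not something this sketch guarantees.

```python
import glob
import shlex


def build_slamdunk_all(fastq_dir, out_dir, reference, bed, threads=40):
    """Build one 'slamdunk all' call covering every fastq.gz file in fastq_dir."""
    fastqs = sorted(glob.glob(f"{fastq_dir}/*.fastq.gz"))
    return [
        "slamdunk", "all",
        "-t", str(threads),
        "-o", out_dir,
        "-r", reference,
        "-b", bed,
        *fastqs,  # all input files go into a single run
    ]


if __name__ == "__main__":
    cmd = build_slamdunk_all(
        "/data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36/fastq_files",
        "/data/users/felipe/data/rnaseq/slam_seq/lucas_data/h3k36",
        "/data/references/hg19/genes/gencode/male.hg19.fa",
        "/data/references/hg19/genes/gencode/gencode_v19_3utr_comprehensive_sorted_merged.bed",
    )
    # Print the command for inspection instead of executing it.
    print(shlex.join(cmd))
```

Printing the command first makes it easy to confirm that all six files are picked up before starting a long run.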
Well, from what I see, you ran it on only a single sample, correct? Then a PCA does not really make sense.