
SlamDunk for smallRNA data

Open jaqx008 opened this issue 4 years ago • 26 comments

Hello, I have a couple of questions:

  1. Am I able to use slamdunk with a small RNA library? I am trying to see nascent small RNAs using 4SU labelling.
  2. I am having difficulty with the usage, and none of the discussions I have seen have resolved my issue. Thank you.

@t-neumann

jaqx008 avatar Dec 09 '20 20:12 jaqx008

Hi,

  1. Not really, but there is a paper from the same lab dedicated to small RNAs https://pubmed.ncbi.nlm.nih.gov/31350118/
  2. What are you having difficulties with?

t-neumann avatar Dec 10 '20 08:12 t-neumann

Thank you for your response.

See my command below and the error I get afterwards.

Command:

```
slamdunk all -r PATHTO/genomic.fna -b HE.all.bed -o SLAMDUNK -5 0 -n 100 -t 8 -m -rl 32 PATHTOsamp1.R1_001.fastq PATHTOsamp1.R2_001.fastq PATHTOsamp1.R3_001.fastq PATHTOsamp1.R4_001.fastq PATHTOsamp2.R1_001.fastq PATHTOsamp2.R2_001.fastq PATHTOsamp2.R3_001.fastq PATHTOsamp2.R4_001.fastq
```

ERROR MESSAGE

```
slamdunk all
Running slamDunk map for 8 files (8 threads)
Traceback (most recent call last):
  File "/Users/Alex/miniconda3/bin/slamdunk", line 8, in <module>
    sys.exit(run())
  File "/Users/Alex/miniconda3/lib/python3.6/site-packages/slamdunk/slamdunk.py", line 520, in run
    runAll(args)
  File "/Users/Alex/miniconda3/lib/python3.6/site-packages/slamdunk/slamdunk.py", line 245, in runAll
    runMap(tid, bam, referenceFile, n, args.trim5, args.maxPolyA, args.quantseq, args.endtoend, args.topn, sampleInfo, dunkPath, args.skipSAM)
  File "/Users/Alex/miniconda3/lib/python3.6/site-packages/slamdunk/slamdunk.py", line 149, in runMap
    mapper.Map(inputBAM, referenceFile, outputSAM, getLogFile(outputLOG), quantseqMapping, endtoendMapping, threads=threads, trim5p=trim5p, maxPolyA=maxPolyA, topn=topn, sampleId=tid, sampleName=sampleName, sampleType=sampleType, sampleTime=sampleTime, printOnly=printOnly, verbose=verbose)
  File "/Users/Alex/miniconda3/lib/python3.6/site-packages/slamdunk/dunks/mapper.py", line 104, in Map
    run("ngm -r " + inputReference + " -q " + inputBAM + " -t " + str(threads) + " " + parameter + " -o " + outputSAM, log, verbose=verbose, dry=printOnly)
  File "/Users/Alex/miniconda3/lib/python3.6/site-packages/slamdunk/utils/misc.py", line 196, in run
    raise RuntimeError("Error while executing command: \"" + cmd + "\"")
RuntimeError: Error while executing command: "ngm -r PATHTO/genomic.fna -q PATHTOR1_001.fastq -t 8 --no-progress --slam-seq 2 --max-polya 4 -l --rg-id 0 --rg-sm PATHTOR1_001:pulse:0 -n 100 --strata -o SLAMDUNK/map/PATHTOR1_001_slamdunk_mapped.sam"
```

@t-neumann

jaqx008 avatar Dec 10 '20 15:12 jaqx008

Can you go to the map folder and have a look at the log files there - what do they say?

t-neumann avatar Dec 15 '20 09:12 t-neumann

This is what "myfastq.slamdunk_mapped.log" says.

```
/bin/sh: ngm: command not found
/bin/sh: ngm: command not found
```

The other map file, "files_slamdunk_mapped.log", is empty.

Also, here is what my slamdunk version.py says, in case you want to see it.

```python
# Overall slamDunk version
version = "0.4.3"

# File format version of BAM files from slamdunk filter
bam_version = "3"

# File format version of count files from slamdunk count
count_version = "3"

# Required NextGenMap version
ngm_version = "0.5.5"
```

@t-neumann

jaqx008 avatar Dec 15 '20 15:12 jaqx008

Hm how did you install slamdunk? Via conda? Or Docker?

Sounds like NGM was not installed.

t-neumann avatar Dec 15 '20 20:12 t-neumann
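A quick way to confirm this diagnosis is to check whether the `ngm` binary that slamdunk shells out to is actually on the PATH. A minimal sketch (the bioconda package name in the hint is an assumption about how ngm is usually provided):

```shell
# Check for the NextGenMap binary that slamdunk invokes as "ngm".
# If it is missing, the map step fails exactly as in the log above.
NGM_BIN="$(command -v ngm || true)"
if [ -n "$NGM_BIN" ]; then
    echo "ngm found at: $NGM_BIN"
else
    echo "ngm not found - try: conda install -c bioconda nextgenmap"
fi
```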

I used conda.

@t-neumann

jaqx008 avatar Dec 15 '20 23:12 jaqx008

Then this is weird, did you set up all channels correctly? What if you redo it like this?

```
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge

conda create --name slamdunk slamdunk

source activate slamdunk

export PYTHONNOUSERSITE=1
```

And afterwards run your command?

t-neumann avatar Dec 16 '20 11:12 t-neumann

> Then this is weird, did you set up all channels correctly? What if you redo it like this?
>
> ```
> conda config --add channels defaults
> conda config --add channels bioconda
> conda config --add channels conda-forge
>
> conda create --name slamdunk slamdunk
>
> source activate slamdunk
>
> export PYTHONNOUSERSITE=1
> ```
>
> And afterwards run your command?

Thank you very much for your help. It started running fine but terminated and produced the following in the terminal. It also generated 5 SAM outputs instead of 8; I'm not sure why.

```
slamdunk all
Running slamDunk map for 8 files (8 threads)
......
Traceback (most recent call last):
  File "/Users/Alex/miniconda3/envs/slamdunk/bin/slamdunk", line 10, in <module>
    sys.exit(run())
  File "/Users/Alex/miniconda3/envs/slamdunk/lib/python3.8/site-packages/slamdunk/slamdunk.py", line 520, in run
    runAll(args)
  File "/Users/Alex/miniconda3/envs/slamdunk/lib/python3.8/site-packages/slamdunk/slamdunk.py", line 245, in runAll
    runMap(tid, bam, referenceFile, n, args.trim5, args.maxPolyA, args.quantseq, args.endtoend, args.topn, sampleInfo, dunkPath, args.skipSAM)
  File "/Users/Alex/miniconda3/envs/slamdunk/lib/python3.8/site-packages/slamdunk/slamdunk.py", line 149, in runMap
    mapper.Map(inputBAM, referenceFile, outputSAM, getLogFile(outputLOG), quantseqMapping, endtoendMapping, threads=threads, trim5p=trim5p, maxPolyA=maxPolyA, topn=topn, sampleId=tid, sampleName=sampleName, sampleType=sampleType, sampleTime=sampleTime, printOnly=printOnly, verbose=verbose)
  File "/Users/Alex/miniconda3/envs/slamdunk/lib/python3.8/site-packages/slamdunk/dunks/mapper.py", line 104, in Map
    run("ngm -r " + inputReference + " -q " + inputBAM + " -t " + str(threads) + " " + parameter + " -o " + outputSAM, log, verbose=verbose, dry=printOnly)
  File "/Users/Alex/miniconda3/envs/slamdunk/lib/python3.8/site-packages/slamdunk/utils/misc.py", line 196, in run
    raise RuntimeError("Error while executing command: \"" + cmd + "\"")
RuntimeError: Error while executing command: "ngm -r /Users/Alex/Downloads/GCF_002022765.2_C_virginica-3.0_genomic.fna -q /Volumes/MasterBackUp/_A_Amanitin_S2_L003_R1_001.fastq.trimmed -t 8 --no-progress --slam-seq 2 --max-polya 4 -l --rg-id 6 --rg-sm _A_Amanitin_S2_L003_R1_001.fastq:pulse:0 -n 100 --strata -o SLAMDUNK/map/_A_Amanitin_S2_L003_R1_001.fastq_slamdunk_mapped.sam"
```

@t-neumann

jaqx008 avatar Dec 16 '20 16:12 jaqx008

Please, do you have any idea why the run is not completing, and what the error means? @t-neumann

jaqx008 avatar Dec 22 '20 12:12 jaqx008

Sorry, I lost track of this: can you let me know if there is anything written in the log files in the map folder?

t-neumann avatar Dec 22 '20 12:12 t-neumann

That's OK.

To recap: 5 SAM files were made (instead of 8), and 5 mapped log files were created alongside them. Below is what was written in one of the log files:

```
[MAIN] NextGenMap 0.5.5
[MAIN] Startup : x64 (build Mar 2 2019 13:21:35)
[MAIN] Starting time: 2020-12-16.09:46:31
[CONFIG] Parameter: --affine 0 --argos_min_score 0 --bin_size 2 --block_multiplier 2 --broken_pairs 0 --bs_cutoff 6 --bs_mapping 0 --cpu_threads 8 --dualstrand 1 --fast 0 --fast_pairing 0 --force_rlength_check 0 --format 1 --gap_extend_penalty 5 --gap_read_penalty 20 --gap_ref_penalty 20 --gpu 0 --hard_clip 0 --keep_tags 0 --kmer 13 --kmer_min 0 --kmer_skip 2 --local 1 --match_bonus 10 --match_bonus_tc 2 --match_bonus_tt 10 --max_cmrs 2147483647 --max_equal 1 --max_insert_size 1000 --max_polya 4 --max_read_length 0 --min_identity 0.650000 --min_insert_size 0 --min_mq 0 --min_residues 0.500000 --min_score 0.000000 --mismatch_penalty 15 --mode 0 --no_progress 1 --no_unal 0 --ocl_threads 1 --output SLAMDUNK/map/SeqdataID.fastq_slamdunk_mapped.sam --overwrite 1 --pair_score_cutoff 0.900000 --paired 0 --parse_all 1 --pe_delimiter / --qry /Volumes/MasterBackUp/SeqdataID.fastq.trimmed --qry_count -1 --qry_start 0 --ref /Users/Alex/Downloads/RefGENOMEfna --ref_mode -1 --rg_id 0 --rg_sm SeqdataID.fastq:pulse:0 --sensitive 0 --silent_clip 0 --skip_mate_check 0 --skip_save 0 --slam_seq 2 --step_count 4 --strata 1 --topn 100 --trim5 0 --update_check 0 --very_fast 0 --very_sensitive 0
[NGM] Opening for output (SAM): SLAMDUNK/map/SeqdataID.fastq_slamdunk_mapped.sam
[SEQPROV] Reading encoded reference from /Users/Alex/Downloads/RefGENOMEfna-enc.2.ngm
[SEQPROV] Reading 684 Mbp from disk took 0.16s
[PREPROCESS] Reading RefTable from /Users/Alex/Downloads/RefGENOMEfna-ht-13-2.3.ngm
[PREPROCESS] Reading from disk took 0.66s
[PREPROCESS] Max. k-mer frequency set to 179!
[INPUT] Input is single end data.
[INPUT] Opening file /Volumes/MasterBackUp/SeqdataID.fastq.trimmed for reading
[INPUT] Input is Fastq
[INPUT] Estimating parameter from data
[INPUT] Reads found in files: 5064335
[INPUT] Average read length: 24 (min: 18, max: 34)
[INPUT] Corridor width: 8
[INPUT] Average kmer hits pro read: 2.616784
[INPUT] Max possible kmer hit: 4
[INPUT] Estimated sensitivity: 0.654196
[INPUT] Estimating parameter took 2.739s
[INPUT] Input is Fastq
[CONFIG] Value gpu : 0 out of range [1, 32] - defaulting to 1
[CONFIG] No array declared for gpu
[OPENCL] Available platforms: 1
[OPENCL] Apple
[OPENCL] Selecting OpenCl platform: Apple
[OPENCL] Platform: OpenCL 1.2 (May 24 2018 20:07:03)
[CONFIG] Value gpu : 0 out of range [1, 32] - defaulting to 1
[CONFIG] No array declared for gpu
    (the two [CONFIG] lines above repeat 7 times in the log)
[OPENCL] Context for GPU devices created.
[OPENCL] 2 GPU device(s) found:
[OPENCL] Device 0: AMD Radeon HD - FirePro D300 Compute Engine (Driver: 1.2 (Jun 29 2018 18:33:51))
[OPENCL] Device 1: AMD Radeon HD - FirePro D300 Compute Engine (Driver: 1.2 (Jun 29 2018 18:33:51))
[Score (OpenCL)] Releasing pinned memory.
    (the [Score (OpenCL)] line above repeats 8 times in the log)
[MAIN] Alignments computed: 23734051
[MAIN] Done (4841107 reads mapped (95.59%), 223228 reads not mapped, 23920820 lines written)(elapsed: 162.368103s)
[UPDATE_CHECK] Your version of NGM is more than 6 months old - a newer version may be available. (For performing an automatic check use --update-check)
```

jaqx008 avatar Dec 22 '20 12:12 jaqx008

OK, I suppose those 5 samples worked, since I don't see an error in the log files. Can you try to run only the failed samples and see what happens there?

t-neumann avatar Dec 22 '20 12:12 t-neumann

OK. I will get back to you ASAP.

jaqx008 avatar Dec 22 '20 13:12 jaqx008

Hello. So I ran the entire slamdunk all command on my magnolia (cloud) account; the SAM files were created, and BAM files were created but not indexed. I am guessing the run is supposed to create indices. Also, every file in the count folder is empty, while the snp and filter folders have data in them. How do I create the index and also create the PDFs, please? Thanks.

@t-neumann

jaqx008 avatar Dec 23 '20 19:12 jaqx008
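For the missing indices specifically, a minimal sketch of indexing slamdunk's filtered BAMs by hand. Note that samtools is not mentioned in this thread; it is an assumption here as the standard tool for creating .bai indices, and the SLAMDUNK/filter path reuses the -o SLAMDUNK output directory from the command above:

```shell
# Index slamdunk's filtered BAM files manually (samtools is an assumption,
# e.g. installed via "conda install -c bioconda samtools").
indexed=0
if command -v samtools >/dev/null 2>&1; then
    for bam in SLAMDUNK/filter/*.bam; do
        [ -e "$bam" ] || continue   # skip if the glob matched nothing
        samtools index "$bam"       # writes "$bam".bai next to the BAM
        indexed=$((indexed + 1))
    done
else
    echo "samtools not found - install it, e.g. from bioconda"
fi
echo "indexed $indexed BAM file(s)"
```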

Did you get any error messages for those files? What do the input fastq files look like?

t-neumann avatar Dec 23 '20 20:12 t-neumann

I honestly don't see any error in the log files. However, the run terminated with the following message. Also, my fastq files are regular (uncompressed) fastq files.

```
slamdunk all
Running slamDunk map for 8 files (20 threads)
........
Running slamDunk sam2bam for 8 files (20 threads)
........
Running slamDunk filter for 8 files (20 threads)
........
Running slamDunk SNP for 8 files (10 threads)
........
Running slamDunk tcount for 8 files (20 threads)
joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/jopeter/miniconda2/envs/slamdunk/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 431, in _process_worker
    r = call_item()
  File "/home/jopeter/miniconda2/envs/slamdunk/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 285, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "/home/jopeter/miniconda2/envs/slamdunk/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 595, in __call__
    return self.func(*args, **kwargs)
  File "/home/jopeter/miniconda2/envs/slamdunk/lib/python3.8/site-packages/joblib/parallel.py", line 262, in __call__
    return [func(*args, **kwargs)
  File "/home/jopeter/miniconda2/envs/slamdunk/lib/python3.8/site-packages/joblib/parallel.py", line 262, in <listcomp>
    return [func(*args, **kwargs)
  File "/home/jopeter/miniconda2/envs/slamdunk/lib/python3.8/site-packages/slamdunk/slamdunk.py", line 202, in runCount
    tcounter.computeTconversions(ref, bed, inputSNP, bam, maxLength, minQual, outputCSV, outputBedgraphPlus, outputBedgraphMinus, conversionThreshold, log)
  File "/home/jopeter/miniconda2/envs/slamdunk/lib/python3.8/site-packages/slamdunk/dunks/tcounter.py", line 170, in computeTconversions
    raise RuntimeError("Input BED file does not contain stranded intervals.")
RuntimeError: Input BED file does not contain stranded intervals.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/jopeter/miniconda2/envs/slamdunk/bin/slamdunk", line 10, in <module>
    sys.exit(run())
  File "/home/jopeter/miniconda2/envs/slamdunk/lib/python3.8/site-packages/slamdunk/slamdunk.py", line 520, in run
    runAll(args)
  File "/home/jopeter/miniconda2/envs/slamdunk/lib/python3.8/site-packages/slamdunk/slamdunk.py", line 328, in runAll
    results = Parallel(n_jobs=n, verbose=verbose)(delayed(runCount)(tid, dunkbufferIn[tid], referenceFile, args.bed, args.maxLength, args.minQual, args.conversionThreshold, dunkPath, snpDirectory, vcfFile) for tid in range(0, len(samples)))
  File "/home/jopeter/miniconda2/envs/slamdunk/lib/python3.8/site-packages/joblib/parallel.py", line 1054, in __call__
    self.retrieve()
  File "/home/jopeter/miniconda2/envs/slamdunk/lib/python3.8/site-packages/joblib/parallel.py", line 933, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/home/jopeter/miniconda2/envs/slamdunk/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 542, in wrap_future_result
    return future.result(timeout=timeout)
```

@t-neumann

jaqx008 avatar Dec 23 '20 23:12 jaqx008

OK, what does your bed file look like? It looks like the strand information is missing.

t-neumann avatar Dec 24 '20 08:12 t-neumann

Hello, happy holidays. My bed file looks like this:

```
chr1  17625   17799   174
chr2  19196   19279   83
chr2  134518  134571  53
chr1  441385  441610  225
```

I tried to add strand information so that the lines look like this:

```
chr1  17625   17799   174  +
chr2  19196   19279   83   -
chr2  134518  134571  53   -
chr1  441385  441610  225  +
```

The run still terminated with the error "RuntimeError: Input BED file does not contain stranded intervals."

Am I doing it wrong? Is there another way to add the strand info?

@t-neumann

jaqx008 avatar Dec 27 '20 01:12 jaqx008

Yes, you are missing the score column after the gene id - check out the official BED format. You can set the score to 1000; slamdunk doesn't really look at it.

t-neumann avatar Dec 27 '20 14:12 t-neumann
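A minimal sketch of that fix, assuming the 5-column file shown above (chrom, start, end, name, strand) and splicing a placeholder score of 1000 in as the fifth column. The file names here are hypothetical:

```shell
# Hypothetical input: the 5-column file from above (chrom start end name strand).
# Insert a placeholder score of 1000 between name and strand to get valid BED6.
printf 'chr1\t17625\t17799\t174\t+\nchr2\t19196\t19279\t83\t-\n' > regions5col.bed
awk 'BEGIN{OFS="\t"} {print $1, $2, $3, $4, 1000, $5}' regions5col.bed > regions6col.bed
cat regions6col.bed
```

With all six tab-separated columns present, slamdunk's count step can read the strand from column 6, which should make the "stranded intervals" error go away.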

Hello. I added the score and was able to execute the command. Thank you very much for your help. Unrelated: I observed more T>C conversions in my untreated samples than in my 4SU-treated samples. Does this suggest the experiment did not work? Thanks.

@t-neumann

jaqx008 avatar Dec 31 '20 16:12 jaqx008

Can you send the QC files for this observation? This does not sound right.

t-neumann avatar Dec 31 '20 20:12 t-neumann

What do you mean by QC please?

jaqx008 avatar Dec 31 '20 20:12 jaqx008

@t-neumann

jaqx008 avatar Dec 31 '20 20:12 jaqx008

I mean: where do you see that you have more T>C conversions in your untreated sample compared to the labelled ones?

t-neumann avatar Dec 31 '20 20:12 t-neumann
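For locating such differences, slamdunk installs a QC companion tool, alleyoop, whose rates module plots overall conversion rates per sample. A sketch, guarded in case alleyoop is not on PATH, reusing the reference and output paths from earlier in this thread; the exact flags are an assumption and should be checked against `alleyoop rates -h`:

```shell
# Sketch: plot per-sample conversion rates with alleyoop (ships with slamdunk).
# Paths reuse the thread's earlier examples; flags are assumed, so verify
# against "alleyoop rates -h" before relying on them.
if command -v alleyoop >/dev/null 2>&1; then
    alleyoop rates -o SLAMDUNK/qc -r PATHTO/genomic.fna SLAMDUNK/filter/*.bam
    status=ran
else
    echo "alleyoop not found on PATH"
    status=skipped
fi
echo "alleyoop rates: $status"
```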

Yes, that indeed looks like you have way more conversions in your untreated sample. Are you sure there was no sample swap or anything?

t-neumann avatar Jan 01 '21 18:01 t-neumann