STAR-SEQR
nested renamer is not supported
Hi, I am running STAR-SEQR on my samples and am stuck on an error.
starseqr.py -1 sample_1.fastq.gz -2 sample_2.fastq.gz -m 1 -p starseqr_test -t 50 -i STAR_FUSION_LIB/ref_genome.fa.star.idx/ -g genomic.gtf -r genomic.fa -vv
2021-06-22 10:13 - INFO - STAR-SEQR***
2021-06-22 10:13 - INFO - CMD = /home/nipgr/software/STAR-SEQR/myenv/bin/starseqr.py -1 sample_1.fastq.gz -2 sample_2.fastq.gz -m 1 -p starseqr_test -t 50 -i STAR_FUSION_LIB/ref_genome.fa.star.idx/ -g genomic.gtf -r genomic.fa -vv
2021-06-22 10:13 - INFO - STAR-SEQR_version = 0.6.7
2021-06-22 10:13 - INFO - Starting to work on sample: /home/nipgr/Documents/chickpea/starseqr_test
2021-06-22 10:13 - INFO - Found input: sample_1.fastq.gz
2021-06-22 10:13 - INFO - Found input: sample_2.fastq.gz
2021-06-22 10:13 - INFO - Found input: genomic.fa
2021-06-22 10:13 - INFO - Found input: genomic.gtf
2021-06-22 10:13 - INFO - Starting STAR Alignment
2021-06-22 10:13 - INFO - *STAR Command: STAR --readFilesIn sample_1.fastq.gz sample_2.fastq.gz --readFilesCommand zcat --runThreadN 50 --genomeDir STAR_FUSION_LIB/ref_genome.fa.star.idx --outFileNamePrefix starseqr_test_STAR-SEQR/starseqr_test. --chimScoreJunctionNonGTAG -1 --outSAMtype None --chimOutType Junctions SeparateSAMold --alignSJDBoverhangMin 5 --outFilterMultimapScoreRange 1 --outFilterMultimapNmax 5 --outMultimapperOrder Random --outSAMattributes NH HI AS nM --chimSegmentMin 10 --chimJunctionOverhangMin 10 --chimScoreMin 1 --chimScoreDropMax 30 --chimScoreSeparation 7 --chimSegmentReadGapMax 3 --chimFilter None --twopassMode None --alignSJstitchMismatchNmax 5 -1 5 5 --chimMainSegmentMultNmax 10
2021-06-22 10:14 - INFO - b'Jun 22 10:13:02 ..... started STAR run\nJun 22 10:13:02 ..... loading genome\nJun 22 10:13:04 ..... started mapping\nJun 22 10:14:37 ..... finished mapping\nJun 22 10:14:37 ..... finished successfully\n'
2021-06-22 10:14 - INFO - STAR Alignment Finished!
2021-06-22 10:14 - INFO - Importing junctions
2021-06-22 10:14 - INFO - Number of candidates removed due to Mitochondria filter: 0
2021-06-22 10:14 - INFO - Removing duplicate reads
2021-06-22 10:14 - INFO - Begin multiprocessing of function apply_cigar_overhang in a pool of 50 workers using map_async protocol
2021-06-22 10:14 - INFO - Ordering junctions
2021-06-22 10:14 - INFO - Normalizing junctions
2021-06-22 10:14 - INFO - Begin multiprocessing of function apply_normalize_jxns in a pool of 50 workers using map_async protocol
2021-06-22 10:14 - INFO - Getting gene strand and flipping info as necessary
2021-06-22 10:14 - INFO - Begin multiprocessing of function apply_jxn_strand in a pool of 50 workers using map_async protocol
2021-06-22 10:15 - INFO - Begin multiprocessing of function apply_flip_func in a pool of 50 workers using map_async protocol
2021-06-22 10:15 - INFO - Aggregating junctions
Traceback (most recent call last):
File "/home/user/software/STAR-SEQR/myenv/bin/starseqr.py", line 622, in
I tried printing the DataFrame in "starseqr.py" just before it is passed to the function "count_jxns" in core.py.
It has the following columns:
index, chrom1, pos1, str1, chrom2, pos2, str2, jxntype, jxnleft, jxnright, readid, base1, cigar1, base2, cigar2, identity, overhang_len, order, name, test_strand, flip
So maybe count_jxns is trying to aggregate columns that do not exist, in this line of core.py:
new_df = grouped_df.agg(OrderedDict([('readid', OrderedDict([('reads', lambda col: ','.join( col)), ('counts', 'count')])), ('overhang_len', 'max')])).reset_index()
Or is it caused by something else?
I was able to work around this by installing pandas==0.25.0.
I'm not sure how good a solution this is, but it got the run past the error.
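For anyone who would rather not pin an old pandas: "nested renamer is not supported" is the message pandas 1.0 and later raise when `agg` is given a dict of dicts, which is what the `count_jxns` line quoted above does for `readid`. The columns themselves (`readid`, `overhang_len`) are present in the printed header, so missing columns are probably not the cause. Below is a minimal sketch of the same aggregation written with named aggregation (available since pandas 0.25). The function name `count_jxns_sketch`, the toy data, and grouping by `name` are assumptions for illustration only, not the actual STAR-SEQR code, and the resulting column layout may differ from what the rest of core.py expects.

```python
import pandas as pd


def count_jxns_sketch(df, group_cols):
    """Hypothetical stand-in for the aggregation in core.py's count_jxns.

    Uses named aggregation instead of the nested dict that newer pandas rejects.
    """
    grouped_df = df.groupby(group_cols)
    new_df = grouped_df.agg(
        reads=("readid", lambda col: ",".join(col)),  # comma-joined read IDs
        counts=("readid", "count"),                   # supporting reads per group
        overhang_len=("overhang_len", "max"),         # longest overhang per group
    ).reset_index()
    return new_df


# Toy input using three of the column names from the printed header.
toy = pd.DataFrame({
    "name": ["jxn_a", "jxn_a", "jxn_b"],
    "readid": ["r1", "r2", "r3"],
    "overhang_len": [10, 25, 17],
})

result = count_jxns_sketch(toy, ["name"])
print(result)
```

This yields flat columns (`reads`, `counts`, `overhang_len`) rather than nested ones, so any downstream code that relies on the old layout would need checking before using this as a patch.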