vdjer
Problem running vdjer on a BAM file from GTEx
We are having a hard time running the tool on BAM files from the GTEx project. We use BAM files produced by the GTEx consortium. Reads were mapped to hg19 using TopHat. The BAM files contain both mapped and unmapped reads.
This is how we run it:
vdjer --in ../G18051.GTEX-OHPM-0526-SM-2YUMJ.1.bam --ins 175 --chain IGH --ref-dir igh
And we are getting this error:
Increasing read_name bufIncreasing read_name bufIncreasing read_name bufSegmentation fault
The full log file is at the end of this message.
Do you have any idea why we are getting this error? Thanks, Serghei
ELAPSED_SECS START 0 0 PROC_STATUS START Name: vdjer PROC_STATUS START State: R (running) PROC_STATUS START Tgid: 14054 PROC_STATUS START Pid: 14054 PROC_STATUS START PPid: 12870 PROC_STATUS START TracerPid: 0 PROC_STATUS START Uid: 9872 9872 9872 9872 PROC_STATUS START Gid: 8164 8164 8164 8164 PROC_STATUS START Utrace: 0 PROC_STATUS START FDSize: 256 PROC_STATUS START Groups: 8164 11539 12325 PROC_STATUS START VmPeak: 19092 kB PROC_STATUS START VmSize: 19092 kB PROC_STATUS START VmLck: 0 kB PROC_STATUS START VmHWM: 1468 kB PROC_STATUS START VmRSS: 1468 kB PROC_STATUS START VmData: 2260 kB PROC_STATUS START VmStk: 92 kB PROC_STATUS START VmExe: 944 kB PROC_STATUS START VmLib: 3420 kB PROC_STATUS START VmPTE: 56 kB PROC_STATUS START VmSwap: 0 kB PROC_STATUS START Threads: 1 PROC_STATUS START SigQ: 2/96529 PROC_STATUS START SigPnd: 0000000000000000 PROC_STATUS START ShdPnd: 0000000000000000 PROC_STATUS START SigBlk: 0000000000000000 PROC_STATUS START SigIgn: 0000000000000000 PROC_STATUS START SigCgt: 0000000180000000 PROC_STATUS START CapInh: 0000000000000000 PROC_STATUS START CapPrm: 0000000000000000 PROC_STATUS START CapEff: 0000000000000000 PROC_STATUS START CapBnd: ffffffffffffffff PROC_STATUS START Cpus_allowed: fff PROC_STATUS START Cpus_allowed_list: 0-11 PROC_STATUS START Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000003 PROC_STATUS START Mems_allowed_list: 0-1 PROC_STATUS START voluntary_ctxt_switches: 47 PROC_STATUS START nonvoluntary_ctxt_switches: 1 BUDDY_INFO START Node 0, zone DMA 1 1 2 1 2 0 0 0 1 1 3 BUDDY_INFO START Node 0, zone DMA32 94266 38488 6900 980 66 9 1 1 0 1 0 BUDDY_INFO START Node 0, zone Normal 484059 45546 19 0 1 2 2 0 1 0 0 BUDDY_INFO START Node 1, zone Normal 285971 247761 167789 80793 19026 1751 180 58 16 1 0 STAT:1488256353 START 14054 (vdjer) R 12870 14054 12870 34843 14054 4202496 460 0 3 0 0 0 0 0 20 0 1 0 11708590 19550208 
369 1073741824 4194304 5160262 140722877782336 140722877776104 140603237045872 0 0 0 0 0 0 0 17 2 0 0 0 0 0 input ../G18051.GTEX-OHPM-0526-SM-2YUMJ.1.bam min node freq 3 min base qual 90 min contig score (log scaled) -5.000000 num threads 1 v anchor file /u/home/n/ngcrawfo/project-zarlab/igor/imrep_revision/tools/vdjer/igh//v_index j anchor file /u/home/n/ngcrawfo/project-zarlab/igor/imrep_revision/tools/vdjer/igh//j_index max anchor mismatches: 4 min V/J window 10 max V/J window 90 conserved J AA 87 total untrimmed contig length 486 extension beyond conserved J AA 162 fasta file containing functional V/D/J sequences /u/home/n/ngcrawfo/project-zarlab/igor/imrep_revision/tools/vdjer/igh//ig_vdj.fa variable region locus chr14:105566277-106879844 constant region locus chr14:105566277-105939754 median insert length 175 read coverage floor for contig filtering 1 kmer 35 source node similarity file /u/home/n/ngcrawfo/project-zarlab/igor/imrep_revision/tools/vdjer/igh//v_region.fa vregion kmer size 15 minimum homology score for source nodes 30 span for read base filtering 35 span for mate base filtering 48 start point for contig filtering 52 stop point for contig filtering 411 window overlap check size 320 read length: 76Loading vmers Loading jmers ELAPSED_SECS POST_VJF_INIT 228 228 PROC_STATUS POST_VJF_INIT Name: vdjer PROC_STATUS POST_VJF_INIT State: R (running) PROC_STATUS POST_VJF_INIT Tgid: 14054 PROC_STATUS POST_VJF_INIT Pid: 14054 PROC_STATUS POST_VJF_INIT PPid: 12870 PROC_STATUS POST_VJF_INIT TracerPid: 0 PROC_STATUS POST_VJF_INIT Uid: 9872 9872 9872 9872 PROC_STATUS POST_VJF_INIT Gid: 8164 8164 8164 8164 PROC_STATUS POST_VJF_INIT Utrace: 0 PROC_STATUS POST_VJF_INIT FDSize: 256 PROC_STATUS POST_VJF_INIT Groups: 8164 11539 12325 PROC_STATUS POST_VJF_INIT VmPeak: 289504 kB PROC_STATUS POST_VJF_INIT VmSize: 256732 kB PROC_STATUS POST_VJF_INIT VmLck: 0 kB PROC_STATUS POST_VJF_INIT VmHWM: 243432 kB PROC_STATUS POST_VJF_INIT VmRSS: 210784 kB PROC_STATUS POST_VJF_INIT 
VmData: 239900 kB PROC_STATUS POST_VJF_INIT VmStk: 92 kB PROC_STATUS POST_VJF_INIT VmExe: 944 kB PROC_STATUS POST_VJF_INIT VmLib: 3420 kB PROC_STATUS POST_VJF_INIT VmPTE: 480 kB PROC_STATUS POST_VJF_INIT VmSwap: 0 kB PROC_STATUS POST_VJF_INIT Threads: 1 PROC_STATUS POST_VJF_INIT SigQ: 2/96529 PROC_STATUS POST_VJF_INIT SigPnd: 0000000000000000 PROC_STATUS POST_VJF_INIT ShdPnd: 0000000000000000 PROC_STATUS POST_VJF_INIT SigBlk: 0000000000000000 PROC_STATUS POST_VJF_INIT SigIgn: 0000000000000000 PROC_STATUS POST_VJF_INIT SigCgt: 0000000180000000 PROC_STATUS POST_VJF_INIT CapInh: 0000000000000000 PROC_STATUS POST_VJF_INIT CapPrm: 0000000000000000 PROC_STATUS POST_VJF_INIT CapEff: 0000000000000000 PROC_STATUS POST_VJF_INIT CapBnd: ffffffffffffffff PROC_STATUS POST_VJF_INIT Cpus_allowed: fff PROC_STATUS POST_VJF_INIT Cpus_allowed_list: 0-11 PROC_STATUS POST_VJF_INIT Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000003 PROC_STATUS POST_VJF_INIT Mems_allowed_list: 0-1 PROC_STATUS POST_VJF_INIT voluntary_ctxt_switches: 137691 PROC_STATUS POST_VJF_INIT nonvoluntary_ctxt_switches: 11958 BUDDY_INFO POST_VJF_INIT Node 0, zone DMA 1 1 2 1 2 0 0 0 1 1 3 BUDDY_INFO POST_VJF_INIT Node 0, zone DMA32 94271 38488 6871 981 66 9 1 1 0 1 0 BUDDY_INFO POST_VJF_INIT Node 0, zone Normal 70236 22773 0 1 0 0 0 0 1 0 0 BUDDY_INFO POST_VJF_INIT Node 1, zone Normal 75966 223104 151832 73709 17664 2288 809 512 238 15 0 STAT:1488256581 POST_VJF_INIT 14054 (vdjer) R 12870 14054 12870 34843 14054 4202496 22385 0 3 0 11576 704 0 0 20 0 1 0 11708590 262893568 52696 1073741824 4194304 5160262 140722877782336 140722877780248 140603237045872 0 0 0 0 0 0 0 17 2 0 0 183 0 0 Extracting reads... primary_reads size1: [0] secondary_reads size1: [0] Increasing read_name bufIncreasing read_name bufIncreasing read_name bufSegmentation fault
The V'DJer indices were built from hg38. I would suggest aligning to hg38 with STAR; let me know if the problem persists.
could you provide a link to the location of the hg38 genome you used? I encountered a similar error when using vdjer: ELAPSED_SECS START 0 0 PROC_STATUS START Name: vdjer PROC_STATUS START State: R (running) PROC_STATUS START Tgid: 21922 PROC_STATUS START Pid: 21922 PROC_STATUS START PPid: 21829 PROC_STATUS START TracerPid: 0 PROC_STATUS START Uid: 4350 4350 4350 4350 PROC_STATUS START Gid: 100000 100000 100000 100000 PROC_STATUS START Utrace: 0 PROC_STATUS START FDSize: 64 PROC_STATUS START Groups: 1000 1005 20085 100000 PROC_STATUS START VmPeak: 19764 kB PROC_STATUS START VmSize: 19764 kB PROC_STATUS START VmLck: 0 kB PROC_STATUS START VmHWM: 1804 kB PROC_STATUS START VmRSS: 1804 kB PROC_STATUS START VmData: 2264 kB PROC_STATUS START VmStk: 96 kB PROC_STATUS START VmExe: 976 kB PROC_STATUS START VmLib: 4044 kB PROC_STATUS START VmPTE: 76 kB PROC_STATUS START VmSwap: 0 kB PROC_STATUS START Threads: 1 PROC_STATUS START SigQ: 0/1032505 PROC_STATUS START SigPnd: 0000000000000000 PROC_STATUS START ShdPnd: 0000000000000000 PROC_STATUS START SigBlk: 0000000000000000 PROC_STATUS START SigIgn: 0000000000380000 PROC_STATUS START SigCgt: 0000000180000000 PROC_STATUS START CapInh: 0000000000000000 PROC_STATUS START CapPrm: 0000000000000000 PROC_STATUS START CapEff: 0000000000000000 PROC_STATUS START CapBnd: ffffffffffffffff PROC_STATUS START Cpus_allowed: ffff,ffffffff PROC_STATUS START Cpus_allowed_list: 0-47 PROC_STATUS START Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00 000003 PROC_STATUS START Mems_allowed_list: 0-1 PROC_STATUS START voluntary_ctxt_switches: 12 PROC_STATUS START nonvoluntary_ctxt_switches: 3 BUDDY_INFO START Node 0, zone DMA 1 1 1 1 2 1 0 0 1 1 3 BUDDY_INFO START Node 0, zone DMA32 9706 4144 1591 758 336 169 78 37 16 33 43 BUDDY_INFO START Node 0, zone Normal 12402 1360 342 129 75 44 25 20 7 230 2732 BUDDY_INFO START Node 1, zone Normal 2099 997 406 
66310 15452 3302 1683 1032 557 305 5725 STAT:1495041123 START 21922 (vdjer) R 21829 21829 21829 0 -1 4202496 557 0 0 0 0 0 0 0 20 0 1 0 646319174 20238336 453 18446744073709551615 4194304 5191506 140734652054800 140 734652054008 217951614144 0 0 3670016 0 0 0 0 17 0 0 0 0 0 0 input /afs/cats.ucsc.edu/users/q/chkcole/TEST_RNASEQ/Cd27Cd38_Plate1_A10_S472_L008/1/Aligned.sortedByCoord.out.bam min node freq 3 min base qual 90 min contig score (log scaled) -5.000000 num threads 8 v anchor file /afs/cats.ucsc.edu/users/q/chkcole/vdjer/igh/v_index j anchor file /afs/cats.ucsc.edu/users/q/chkcole/vdjer/igh/j_index max anchor mismatches: 4 min V/J window 10 max V/J window 90 conserved J AA 87 total untrimmed contig length 486 extension beyond conserved J AA 162 fasta file containing functional V/D/J sequences /afs/cats.ucsc.edu/users/q/chkcole/vdjer/igh/ig_vdj.fa variable region locus chr14:105566277-106879844 constant region locus chr14:105566277-105939754 median insert length 150 read coverage floor for contig filtering 1 kmer 35 source node similarity file /afs/cats.ucsc.edu/users/q/chkcole/vdjer/igh/v_region.fa vregion kmer size 15 minimum homology score for source nodes 30 span for read base filtering 35 span for mate base filtering 48 start point for contig filtering 52 stop point for contig filtering 411 window overlap check size 320 read length: 151Loading vmers Loading jmers ELAPSED_SECS POST_VJF_INIT 11 11 PROC_STATUS POST_VJF_INIT Name: vdjer PROC_STATUS POST_VJF_INIT State: R (running) PROC_STATUS POST_VJF_INIT Tgid: 21922 PROC_STATUS POST_VJF_INIT Pid: 21922 PROC_STATUS POST_VJF_INIT PPid: 21829 PROC_STATUS POST_VJF_INIT TracerPid: 0 PROC_STATUS POST_VJF_INIT Uid: 4350 4350 4350 4350 PROC_STATUS POST_VJF_INIT Gid: 100000 100000 100000 100000 PROC_STATUS POST_VJF_INIT Utrace: 0 PROC_STATUS POST_VJF_INIT FDSize: 64 PROC_STATUS POST_VJF_INIT Groups: 1000 1005 20085 100000 PROC_STATUS POST_VJF_INIT VmPeak: 279064 kB PROC_STATUS POST_VJF_INIT VmSize: 246180 kB 
PROC_STATUS POST_VJF_INIT VmLck: 0 kB PROC_STATUS POST_VJF_INIT VmHWM: 233040 kB PROC_STATUS POST_VJF_INIT VmRSS: 200280 kB PROC_STATUS POST_VJF_INIT VmData: 228680 kB PROC_STATUS POST_VJF_INIT VmStk: 96 kB PROC_STATUS POST_VJF_INIT VmExe: 976 kB PROC_STATUS POST_VJF_INIT VmLib: 4044 kB PROC_STATUS POST_VJF_INIT VmPTE: 476 kB PROC_STATUS POST_VJF_INIT VmSwap: 0 kB PROC_STATUS POST_VJF_INIT Threads: 1 PROC_STATUS POST_VJF_INIT SigQ: 0/1032505 PROC_STATUS POST_VJF_INIT SigPnd: 0000000000000000 PROC_STATUS POST_VJF_INIT ShdPnd: 0000000000000000 PROC_STATUS POST_VJF_INIT SigBlk: 0000000000000000 PROC_STATUS POST_VJF_INIT SigIgn: 0000000000380000 PROC_STATUS POST_VJF_INIT SigCgt: 0000000180000000 PROC_STATUS POST_VJF_INIT CapInh: 0000000000000000 PROC_STATUS POST_VJF_INIT CapPrm: 0000000000000000 PROC_STATUS POST_VJF_INIT CapEff: 0000000000000000 PROC_STATUS POST_VJF_INIT CapBnd: ffffffffffffffff PROC_STATUS POST_VJF_INIT Cpus_allowed: ffff,ffffffff PROC_STATUS POST_VJF_INIT Cpus_allowed_list: 0-47 PROC_STATUS POST_VJF_INIT Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,000 00000,00000003 PROC_STATUS POST_VJF_INIT Mems_allowed_list: 0-1 PROC_STATUS POST_VJF_INIT voluntary_ctxt_switches: 32 PROC_STATUS POST_VJF_INIT nonvoluntary_ctxt_switches: 1095 BUDDY_INFO POST_VJF_INIT Node 0, zone DMA 1 1 1 1 2 1 0 0 1 1 3 BUDDY_INFO POST_VJF_INIT Node 0, zone DMA32 9706 4144 1591 758 336 169 78 37 16 33 43 BUDDY_INFO POST_VJF_INIT Node 0, zone Normal 12423 1095 361 146 88 54 30 13 7 168 2732 BUDDY_INFO POST_VJF_INIT Node 1, zone Normal 2262 999 406 66085 15452 3302 1683 1032 557 274 5725 STAT:1495041134 POST_VJF_INIT 21922 (vdjer) R 21829 21829 21829 0 -1 4202496 19605 0 0 0 1035 18 0 0 20 0 1 0 646319174 252088320 50070 18446744073709551615 4194304 5191506 14 0734652054800 140734652043592 217951614144 0 0 3670016 0 0 0 0 17 0 0 0 0 0 0 Extracting reads... 
primary_reads size1: [0] secondary_reads size1: [0] Segmentation fault (core dumped)
Thank you very much.
We used the UCSC analysis set: http://hgdownload.cse.ucsc.edu/goldenpath/hg38/bigZips/analysisSet/
including EBV, but not including the alt contigs. Be sure that the alt contigs are NOT included in your reference.
I can't think of a reason why any version of hg38 would cause a core dump, though. Can you confirm that you mapped using STAR, that the BAM file is sorted by coordinate, and that the BAM file has been indexed via samtools index?
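Most of the conditions asked about above (STAR alignment, coordinate sort order, no alt contigs) can be read off the BAM header. Below is a minimal sketch in Python, assuming the header text has already been obtained with samtools view -H. The names sam_header_problems and index_exists are hypothetical helpers, not part of V'DJer or samtools. The alt-contig test assumes UCSC-style names ending in "_alt" (for example chr1_KI270762v1_alt), and the index test assumes samtools index wrote a .bai file next to the BAM.

```python
import os


def sam_header_problems(header_text):
    """Return a list of problems found in SAM header text (from `samtools view -H`)."""
    sorted_by_coordinate = False
    has_star_record = False
    alt_contigs = []

    for line in header_text.splitlines():
        # Header fields are tab-separated; split() also copes with pasted, space-separated text.
        fields = line.split()
        if not fields:
            continue
        record = fields[0]
        if record == "@HD" and "SO:coordinate" in fields:
            sorted_by_coordinate = True
        elif record == "@SQ":
            for field in fields[1:]:
                # Assumption: alt contigs follow the UCSC "*_alt" naming scheme.
                if field.startswith("SN:") and field.endswith("_alt"):
                    alt_contigs.append(field[3:])
        elif record == "@PG" and ("ID:STAR" in fields or "PN:STAR" in fields):
            has_star_record = True

    problems = []
    if not sorted_by_coordinate:
        problems.append("no SO:coordinate in @HD (BAM may not be coordinate-sorted)")
    if not has_star_record:
        problems.append("no STAR @PG record found")
    if alt_contigs:
        problems.append("alt contigs present: " + ", ".join(alt_contigs))
    return problems


def index_exists(bam_path):
    """Assumption: `samtools index in.bam` produced `in.bam.bai` alongside the BAM."""
    return os.path.exists(bam_path + ".bai")


# A shortened header in the style of STAR output.
example = "@HD VN:1.4 SO:coordinate\n@SQ SN:14 LN:107043718\n@PG ID:STAR PN:STAR VN:STAR_2.5.2b\n"
print(sam_header_problems(example))  # prints []
```

An empty list only means these particular checks passed; it says nothing about whether the contig names match what V'DJer expects.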
The result I get from running
samtools view -H /afs/cats.ucsc.edu/users/q/chkcole/TEST_RNASEQ/Cd27Cd38_Plate1_A10_S472_L008/1/Aligned.sortedByCoord.out.bam
is:
@HD VN:1.4 SO:coordinate
@SQ SN:1 LN:248956422
@SQ SN:10 LN:133797422
@SQ SN:11 LN:135086622
@SQ SN:12 LN:133275309
@SQ SN:13 LN:114364328
@SQ SN:14 LN:107043718
@SQ SN:15 LN:101991189
@SQ SN:16 LN:90338345
@SQ SN:17 LN:83257441
@SQ SN:18 LN:80373285
@SQ SN:19 LN:58617616
@SQ SN:2 LN:242193529
@SQ SN:20 LN:64444167
@SQ SN:21 LN:46709983
@SQ SN:22 LN:50818468
@SQ SN:3 LN:198295559
@SQ SN:4 LN:190214555
@SQ SN:5 LN:181538259
@SQ SN:6 LN:170805979
@SQ SN:7 LN:159345973
@SQ SN:8 LN:145138636
@SQ SN:9 LN:138394717
@SQ SN:MT LN:16569
@SQ SN:X LN:156040895
@SQ SN:Y LN:57227415
@SQ SN:KI270728.1 LN:1872759
@SQ SN:KI270727.1 LN:448248
@SQ SN:KI270442.1 LN:392061
@SQ SN:KI270729.1 LN:280839
@SQ SN:GL000225.1 LN:211173
@SQ SN:KI270743.1 LN:210658
@SQ SN:GL000008.2 LN:209709
@SQ SN:GL000009.2 LN:201709
@SQ SN:KI270747.1 LN:198735
@SQ SN:KI270722.1 LN:194050
@SQ SN:GL000194.1 LN:191469
@SQ SN:KI270742.1 LN:186739
@SQ SN:GL000205.2 LN:185591
@SQ SN:GL000195.1 LN:182896
@SQ SN:KI270736.1 LN:181920
@SQ SN:KI270733.1 LN:179772
@SQ SN:GL000224.1 LN:179693
@SQ SN:GL000219.1 LN:179198
@SQ SN:KI270719.1 LN:176845
@SQ SN:GL000216.2 LN:176608
@SQ SN:KI270712.1 LN:176043
@SQ SN:KI270706.1 LN:175055
@SQ SN:KI270725.1 LN:172810
@SQ SN:KI270744.1 LN:168472
@SQ SN:KI270734.1 LN:165050
@SQ SN:GL000213.1 LN:164239
@SQ SN:GL000220.1 LN:161802
@SQ SN:KI270715.1 LN:161471
@SQ SN:GL000218.1 LN:161147
@SQ SN:KI270749.1 LN:158759
@SQ SN:KI270741.1 LN:157432
@SQ SN:GL000221.1 LN:155397
@SQ SN:KI270716.1 LN:153799
@SQ SN:KI270731.1 LN:150754
@SQ SN:KI270751.1 LN:150742
@SQ SN:KI270750.1 LN:148850
@SQ SN:KI270519.1 LN:138126
@SQ SN:GL000214.1 LN:137718
@SQ SN:KI270708.1 LN:127682
@SQ SN:KI270730.1 LN:112551
@SQ SN:KI270438.1 LN:112505
@SQ SN:KI270737.1 LN:103838
@SQ SN:KI270721.1 LN:100316
@SQ SN:KI270738.1 LN:99375
@SQ SN:KI270748.1 LN:93321
@SQ SN:KI270435.1 LN:92983
@SQ SN:GL000208.1 LN:92689
@SQ SN:KI270538.1 LN:91309
@SQ SN:KI270756.1 LN:79590
@SQ SN:KI270739.1 LN:73985
@SQ SN:KI270757.1 LN:71251
@SQ SN:KI270709.1 LN:66860
@SQ SN:KI270746.1 LN:66486
@SQ SN:KI270753.1 LN:62944
@SQ SN:KI270589.1 LN:44474
@SQ SN:KI270726.1 LN:43739
@SQ SN:KI270735.1 LN:42811
@SQ SN:KI270711.1 LN:42210
@SQ SN:KI270745.1 LN:41891
@SQ SN:KI270714.1 LN:41717
@SQ SN:KI270732.1 LN:41543
@SQ SN:KI270713.1 LN:40745
@SQ SN:KI270754.1 LN:40191
@SQ SN:KI270710.1 LN:40176
@SQ SN:KI270717.1 LN:40062
@SQ SN:KI270724.1 LN:39555
@SQ SN:KI270720.1 LN:39050
@SQ SN:KI270723.1 LN:38115
@SQ SN:KI270718.1 LN:38054
@SQ SN:KI270317.1 LN:37690
@SQ SN:KI270740.1 LN:37240
@SQ SN:KI270755.1 LN:36723
@SQ SN:KI270707.1 LN:32032
@SQ SN:KI270579.1 LN:31033
@SQ SN:KI270752.1 LN:27745
@SQ SN:KI270512.1 LN:22689
@SQ SN:KI270322.1 LN:21476
@SQ SN:GL000226.1 LN:15008
@SQ SN:KI270311.1 LN:12399
@SQ SN:KI270366.1 LN:8320
@SQ SN:KI270511.1 LN:8127
@SQ SN:KI270448.1 LN:7992
@SQ SN:KI270521.1 LN:7642
@SQ SN:KI270581.1 LN:7046
@SQ SN:KI270582.1 LN:6504
@SQ SN:KI270515.1 LN:6361
@SQ SN:KI270588.1 LN:6158
@SQ SN:KI270591.1 LN:5796
@SQ SN:KI270522.1 LN:5674
@SQ SN:KI270507.1 LN:5353
@SQ SN:KI270590.1 LN:4685
@SQ SN:KI270584.1 LN:4513
@SQ SN:KI270320.1 LN:4416
@SQ SN:KI270382.1 LN:4215
@SQ SN:KI270468.1 LN:4055
@SQ SN:KI270467.1 LN:3920
@SQ SN:KI270362.1 LN:3530
@SQ SN:KI270517.1 LN:3253
@SQ SN:KI270593.1 LN:3041
@SQ SN:KI270528.1 LN:2983
@SQ SN:KI270587.1 LN:2969
@SQ SN:KI270364.1 LN:2855
@SQ SN:KI270371.1 LN:2805
@SQ SN:KI270333.1 LN:2699
@SQ SN:KI270374.1 LN:2656
@SQ SN:KI270411.1 LN:2646
@SQ SN:KI270414.1 LN:2489
@SQ SN:KI270510.1 LN:2415
@SQ SN:KI270390.1 LN:2387
@SQ SN:KI270375.1 LN:2378
@SQ SN:KI270420.1 LN:2321
@SQ SN:KI270509.1 LN:2318
@SQ SN:KI270315.1 LN:2276
@SQ SN:KI270302.1 LN:2274
@SQ SN:KI270518.1 LN:2186
@SQ SN:KI270530.1 LN:2168
@SQ SN:KI270304.1 LN:2165
@SQ SN:KI270418.1 LN:2145
@SQ SN:KI270424.1 LN:2140
@SQ SN:KI270417.1 LN:2043
@SQ SN:KI270508.1 LN:1951
@SQ SN:KI270303.1 LN:1942
@SQ SN:KI270381.1 LN:1930
@SQ SN:KI270529.1 LN:1899
@SQ SN:KI270425.1 LN:1884
@SQ SN:KI270396.1 LN:1880
@SQ SN:KI270363.1 LN:1803
@SQ SN:KI270386.1 LN:1788
@SQ SN:KI270465.1 LN:1774
@SQ SN:KI270383.1 LN:1750
@SQ SN:KI270384.1 LN:1658
@SQ SN:KI270330.1 LN:1652
@SQ SN:KI270372.1 LN:1650
@SQ SN:KI270548.1 LN:1599
@SQ SN:KI270580.1 LN:1553
@SQ SN:KI270387.1 LN:1537
@SQ SN:KI270391.1 LN:1484
@SQ SN:KI270305.1 LN:1472
@SQ SN:KI270373.1 LN:1451
@SQ SN:KI270422.1 LN:1445
@SQ SN:KI270316.1 LN:1444
@SQ SN:KI270340.1 LN:1428
@SQ SN:KI270338.1 LN:1428
@SQ SN:KI270583.1 LN:1400
@SQ SN:KI270334.1 LN:1368
@SQ SN:KI270429.1 LN:1361
@SQ SN:KI270393.1 LN:1308
@SQ SN:KI270516.1 LN:1300
@SQ SN:KI270389.1 LN:1298
@SQ SN:KI270466.1 LN:1233
@SQ SN:KI270388.1 LN:1216
@SQ SN:KI270544.1 LN:1202
@SQ SN:KI270310.1 LN:1201
@SQ SN:KI270412.1 LN:1179
@SQ SN:KI270395.1 LN:1143
@SQ SN:KI270376.1 LN:1136
@SQ SN:KI270337.1 LN:1121
@SQ SN:KI270335.1 LN:1048
@SQ SN:KI270378.1 LN:1048
@SQ SN:KI270379.1 LN:1045
@SQ SN:KI270329.1 LN:1040
@SQ SN:KI270419.1 LN:1029
@SQ SN:KI270336.1 LN:1026
@SQ SN:KI270312.1 LN:998
@SQ SN:KI270539.1 LN:993
@SQ SN:KI270385.1 LN:990
@SQ SN:KI270423.1 LN:981
@SQ SN:KI270392.1 LN:971
@SQ SN:KI270394.1 LN:970
@PG ID:STAR PN:STAR VN:STAR_2.5.2b CL:/afs/cats.ucsc.edu/users/q/chkcole/STAR/bin/Linux_x86_64/STAR --runThreadN 8 --genomeDir /afs/cats.ucsc.edu/users/q/chkcole/REFERENCES/STAR_hg38 --readFilesIn /afs/cats.ucsc.edu/users/q/chkcole/TEST_RNASEQ/Cd27Cd38_Plate1_A10_S472_L008_R1_001.paired.1.fastq /afs/cats.ucsc.edu/users/q/chkcole/TEST_RNASEQ/Cd27Cd38_Plate1_A10_S472_L008_R2_001.paired.1.fastq --outFileNamePrefix /afs/cats.ucsc.edu/users/q/chkcole/TEST_RNASEQ/Cd27Cd38_Plate1_A10_S472_L008/1/ --outSAMtype BAM SortedByCoordinate --outSAMunmapped Within
@CO user command line: /afs/cats.ucsc.edu/users/q/chkcole/STAR/bin/Linux_x86_64/STAR --runThreadN 8 --genomeDir /afs/cats.ucsc.edu/users/q/chkcole/REFERENCES/STAR_hg38 --outSAMunmapped Within --outSAMtype BAM SortedByCoordinate --readFilesIn /afs/cats.ucsc.edu/users/q/chkcole/TEST_RNASEQ/Cd27Cd38_Plate1_A10_S472_L008_R1_001.paired.1.fastq /afs/cats.ucsc.edu/users/q/chkcole/TEST_RNASEQ/Cd27Cd38_Plate1_A10_S472_L008_R2_001.paired.1.fastq --outFileNamePrefix /afs/cats.ucsc.edu/users/q/chkcole/TEST_RNASEQ/Cd27Cd38_Plate1_A10_S472_L008/1/
The indexing was done with
samtools index /afs/cats.ucsc.edu/users/q/chkcole/TEST_RNASEQ/Cd27Cd38_Plate1_A10_S472_L008/1/Aligned.sortedByCoord.out.bam
and vdjer was run with the following command
~/vdjer/vdjer --in $BAM_FILE --t 8 --ins 150 --chain IGH --ref-dir ~/vdjer/igh > Cd27Cd38_Plate1_A10_1.sam
with the variable BAM_FILE set to "/afs/cats.ucsc.edu/users/q/chkcole/TEST_RNASEQ/Cd27Cd38_Plate1_A10_S472_L008/1/Aligned.sortedByCoord.out.bam"
I've taken a look at the source code and I noticed that the v segment parameters for IGH are set to "chr14:105566277-106879844".
If you pass this to samtools in the form samtools view [bam/sam file] chr14:105566277-106879844 and try to extract from a BAM file whose chromosomes are named simply "14", as is the case with Ensembl references, it won't return anything.
However, I'm not sure how you process this info downstream, so it might still be an issue on my end.
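The mismatch described above can be checked without running V'DJer at all. Here is a minimal sketch in Python, assuming the header text comes from samtools view -H; region_contig, header_contigs and region_in_header are hypothetical helper names, and the region string is the IGH locus quoted above. It only compares contig names and does not reproduce how V'DJer itself extracts reads.

```python
def region_contig(region):
    """Return the contig part of a region such as 'chr14:105566277-106879844'."""
    name, sep, _ = region.rpartition(":")
    return name if sep else region


def header_contigs(header_text):
    """Collect the SN: names from the @SQ lines of SAM header text."""
    names = set()
    for line in header_text.splitlines():
        fields = line.split()  # tolerate tabs or spaces between fields
        if fields and fields[0] == "@SQ":
            for field in fields[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def region_in_header(region, header_text):
    """True if the region's contig name appears verbatim among the header's @SQ names."""
    return region_contig(region) in header_contigs(header_text)


IGH_REGION = "chr14:105566277-106879844"
ensembl_style = "@SQ SN:14 LN:107043718"
ucsc_style = "@SQ SN:chr14 LN:107043718"

print(region_in_header(IGH_REGION, ensembl_style))  # False
print(region_in_header(IGH_REGION, ucsc_style))     # True
```

With the header posted earlier (contig "14"), the check returns False, which is consistent with the region query returning no reads.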
Yes, chr14 clearly won't match "14". My understanding is that hg38 has standardized on the "chr14" style notation. You'll need to use an hg38 reference in that format in order to run V'DJer.
Dear chkcole,
Using ImReP, we were able to profile B and T cell receptors across all GTEx samples. Some of the data is available as a resource here: https://smangul1.github.io/TheAIR/
We would be happy to share the data and collaborate on it if you are interested.
Thanks, Serghei