smoove call ends earlier than expected

Open tracychew opened this issue 5 years ago • 6 comments

Hi, I am using smoove to call SVs in ~170 samples. I ran through the whole pipeline and the final "square" file only contains variants up to chr 2. I went back to check the per-sample calls from the smoove call output and found 11 samples with calls only for the first few chromosomes. This seemed odd because most of the other samples have calls ending at chromosome MT or Y. I checked the bam files for these samples using samtools quickcheck and samtools view and they appear fine to me: there are reads across the whole genome.

I don't get any error messages for these samples. When I re-run the smoove call step for these 11 samples, the output looks something like this:

[smoove] 2019/05/23 00:22:36 starting with version 0.2.2
[smoove] 2019/05/23 00:22:36 calculating bam stats for 1 bams
[smoove] 2019/05/23 00:22:51 done calculating bam stats
[smoove] 2019/05/23 00:23:12 removed 0 alignments out of 255359 (0.00%) with low mapq, depth > 1000, or from excluded chroms from FD01115731.split.bam in 21 seconds
[smoove] 2019/05/23 00:23:12 removed 0 alignments out of 255359 (0.00%) that were bad interchromosomals or flanked-splitters from FD01115731.split.bam
[smoove] 2019/05/23 00:23:15 removed 0 alignments out of 525047 (0.00%) with low mapq, depth > 1000, or from excluded chroms from FD01115731.disc.bam in 24 seconds
[smoove] 2019/05/23 00:23:15 removed 0 alignments out of 525047 (0.00%) that were bad interchromosomals or flanked-splitters from FD01115731.disc.bam
[smoove] 2019/05/23 00:23:18 removed 0 singletons of 255359 reads (0.00%) from FD01115731.split.bam in 7 seconds
[smoove] 2019/05/23 00:23:18 255359 reads (100.00%) of the original 255359 remain from FD01115731.split.bam
[smoove] 2019/05/23 00:23:26 removed 1590 singletons and isolated interchromosomals of 525047 reads (0.30%) from FD01115731.disc.bam in 11 seconds
[smoove] 2019/05/23 00:23:26 523457 reads (99.70%) of the original 525047 remain from FD01115731.disc.bam
[smoove] 2019/05/23 00:23:26 starting lumpy
[smoove] 2019/05/23 00:23:26 wrote lumpy command to /scratch/skeldys/167_samples/smoove/results-smoove/FD01115731-lumpy-cmd.sh
[smoove] 2019/05/23 00:23:26 writing sorted, indexed file to /scratch/skeldys/167_samples/smoove/results-smoove/FD01115731-smoove.genotyped.vcf.gz
[smoove] 2019/05/23 00:23:26 excluding variants with all unknown or homozygous reference genotypes
[smoove] 2019/05/23 00:23:27 1 1000000
[smoove] 2019/05/23 00:23:27 > gsort version 0.0.6
[smoove] 2019/05/23 00:23:29 2 1000000
[smoove] 2019/05/23 00:23:31 3 1000000
[smoove] 2019/05/23 00:23:34 4 1000000
[smoove] 2019/05/23 00:23:37 5
[smoove] 2019/05/23 00:23:37
[smoove] 2019/05/23 00:23:37 1000000
[smoove] 2019/05/23 00:23:37
[smoove] 2019/05/23 00:23:39 6
[smoove] 2019/05/23 00:23:39
[smoove] 2019/05/23 00:23:39 1000000
[smoove] 2019/05/23 00:23:39
[smoove] 2019/05/23 00:23:42 7 1000000
[smoove] 2019/05/23 00:25:51 wrote sorted, indexed file to /smoove/results-smoove/1115731-smoove.genotyped.vcf.gz

I checked the split/disc files created by smoove and it seems that one or both of them also end where the VCFs end, but I can't figure out why. Any ideas?

Thanks.

tracychew • May 23 '19 01:05

can you remove the .split and .disc bams, start again and report the output?

before that, you could also check samtools view $sample.disc.bam | cut -f 3 | uniq -c and repeat for .split.bam and report the output.
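The check above can be wrapped in a small helper so that only the last reference name and its read count are printed. This is a minimal sketch: `last_contig` is a hypothetical name, and it assumes coordinate-sorted SAM records on stdin (column 3 of a SAM record is the reference name, and `*` there marks unmapped reads).

```shell
# Print the last reference name seen in a stream of SAM records,
# followed by a tab and the number of consecutive records on it.
# Assumes the input is coordinate-sorted, so equal names are adjacent.
last_contig() {
    cut -f 3 | uniq -c | tail -n 1 | awk '{print $2 "\t" $1}'
}

# Hypothetical usage (needs samtools on PATH; $sample is a placeholder):
#   for kind in split disc; do
#       samtools view "$sample.$kind.bam" | last_contig
#   done
```

If the name printed for a sample is an early chromosome rather than Y or MT, the filtered BAM was most likely truncated.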

I am guessing they ran out of memory or otherwise failed on the initial run.

Also, the square command should fail if there are any samples with differing numbers of variants, so it probably exited early, with an error as well.

I recommend using something like this: https://github.com/brwnj/smoove-nf which uses Nextflow to run smoove so that it catches errors and retries (that may not be the problem here, but it often helps).

brentp • May 23 '19 01:05

Hi Brent,

Thanks for your speedy reply! I did as you suggested: I removed all of these samples' outputs from smoove call (including the .split and .disc bams) and re-ran smoove call. I am getting the same problem, where the .split / .disc bams don't contain split/disc reads for the whole genome (only the first few chromosomes, checked with your samtools view command) and the calls only go as far as the split/disc bams do (although it did fix 1 sample).

Columns, in order: Sample; last line of samtools view *split.bam | cut -f 3 | uniq -c (count, chromosome); last line of samtools view 2525133.disc.bam | cut -f 3 | uniq -c (count, chromosome); output of zcat *-smoove.genotyped.vcf.gz | tail -1
2525133 13581 5 42543 5 5 105086482 1456 N <DEL> 218.86 . SVTYPE=DEL;SVLEN=-87;END=105086569;STRANDS=+-:9;CIPOS=-10,9;CIEND=-4,3;CIPOS95=0,0;CIEND95=0,0;SU=9;PE=1;SR=8;PRPOS=1.01311e-32,1.60364e-29,2.53823e-26,4.01761e-23,6.3587e-20,1.0064e-16,1.59272e-13,2.52053e-10,3.9885e-07,0.000631171,0.998739,0.000629161,3.96351e-07,2.49661e-10,1.57257e-13,9.90509e-17,6.23855e-20,3.92896e-23,2.47441e-26,1.55829e-29;PREND=1.5067e-13,2.41776e-10,3.87983e-07,0.000622509,0.998739,0.000637898,4.07438e-07,2.60543e-10;AC=1;AN=2 GT:GQ:SQ:GL:DP:RO:AO:QR:QA:RS:AS:ASC:RP:AP:AB 0/1:32:218.86:-23,-1,-4:18:8:9:8:9:8:7:1:0:0:0.53
2525135 319 16 208 16 16 876641 3540 N <DEL> 758.59 . SVTYPE=DEL;SVLEN=-317;END=876958;STRANDS=+-:22;CIPOS=-2,0;CIEND=-2,1;CIPOS95=0,0;CIEND95=0,0;SU=22;PE=3;SR=19;PRPOS=6.56017e-16,2.56245e-08,1;PREND=5.9838e-16,2.44672e-08,1,1.62785e-07;AC=1;AN=2 GT:GQ:SQ:GL:DP:RO:AO:QR:QA:RS:AS:ASC:RP:AP:AB 0/1:42:758.59:-78,-2,-6:49:17:31:17:30:0:18:5:17:7:0.64
2525141 418 MT 663 MT Y 28815222 5194 N <DUP> 87.29 . SVTYPE=DUP;SVLEN=2776;END=28817998;STRANDS=-+:20;IMPRECISE;CIPOS=-256,29;CIEND=-30,502;CIPOS95=-34,7;CIEND95=-6,31;SU=20;PE=20;SR=0;PRPOS=3.2757e-19,9.3472e-19,1.61081e-18,2.76669e-18,4.22725e-18,5.93702e-18,8.45672e-18,1.1361e-17,1.47884e-17,1.97707e-17,2.58543e-17,3.4311e-17,4.36735e-17,5.56445e-17,7.04388e-17,8.99407e-17,1.13374e-16,1.43733e-16,1.81162e-16,2.28803e-16,2.8455e-16,3.53087e-16,4.38752e-16,5.42168e-16,6.7137e-16,8.29637e-16,1.01823e-15,1.2547e-15,1.53486e-15,1.88879e-15,2.30425e-15,2.7997e-15,3.42213e-15,4.1669e-15,5.09874e-15,6.18454e-15,7.5556e-15,9.16447e-15,1.1122e-14,1.35823e-14,1.64326e-14,1.98958e-14,2.3973e-14,2.89705e-14,3.47849e-14,4.19417e-14,5.062e-14,6.11586e-14,7.35033e-14,8.82085e-14,1.06117e-13,1.27034e-13,1.52773e-13,1.8291e-13,2.18405e-13,2.62834e-13,3.13519e-13,3.77612e-13,4.52191e-13,5.42682e-13,6.48542e-13,7.73751e-13,9.24841e-13,1.10549e-12,1.31445e-12,1.56573e-12,1.86998e-12,2.22919e-12,2.66023e-12,3.16314e-12,3.75925e-12,4.46864e-12,5.30702e-12,6.29583e-12,7.50149e-12,8.90821e-12,1.0554e-11,1.25123e-11,1.48707e-11,1.7612e-11,2.09086e-11,2.47704e-11,2.93448e-11,3.4661e-11,4.10094e-11,4.84355e-11,5.72446e-11,6.7852e-11,8.00952e-11,9.47458e-11,1.11637e-10,1.31624e-10,1.55213e-10,1.8273e-10,2.15402e-10,2.53629e-10,2.97734e-10,3.51033e-10,4.14112e-10,4.86252e-10,5.70926e-10,6.71115e-10,7.892e-10,9.24939e-10,1.08674e-09,1.27366e-09,1.49109e-09,1.74863e-09,2.0419e-09,2.38939e-09,2.79827e-09,3.27084e-09,3.82338e-09,4.46518e-09,5.2211e-09,6.09027e-09,7.10729e-09,8.29792e-09,9.66619e-09,1.12689e-08,1.31386e-08,1.52924e-08,1.77779e-08,2.06656e-08,2.3999e-08,2.79181e-08,3.24322e-08,3.76412e-08,4.37461e-08,5.0749e-08,5.89601e-08,6.83791e-08,7.93144e-08,9.18294e-08,1.06322e-07,1.23048e-07,1.42091e-07,1.6432e-07,1.89978e-07,2.19627e-07,2.53911e-07,2.92759e-07,3.38309e-07,3.89356e-07,4.48668e-07,5.16664e-07,5.94981e-07,6.84509e-07,7.87871e-07,9.06269e-07,1.04299e-06,1.19932e-06,1.37739e-06
,1.57966e-06,1.81015e-06,2.07673e-06,2.37905e-06,2.72315e-06,3.11789e-06,3.56514e-06,4.07218e-06,4.64319e-06,5.29721e-06,6.05464e-06,6.90281e-06,7.87506e-06,8.96647e-06,1.01939e-05,1.15989e-05,1.31842e-05,1.50027e-05,1.70288e-05,1.93274e-05,2.19005e-05,2.48283e-05,2.81299e-05,3.18542e-05,3.59944e-05,4.07276e-05,4.60348e-05,5.19674e-05,5.86494e-05,6.61393e-05,7.45956e-05,8.39939e-05,9.43223e-05,0.000106105,0.00011928,0.000133966,0.000150457,0.000168576,0.000189095,0.000211942,0.000237181,0.000265552,0.000296484,0.000330909,0.000369429,0.000411026,0.000458483,0.000510347,0.000567579,0.000630573,0.000700785,0.000776988,0.000861188,0.000953942,0.00105642,0.00116822,0.00129046,0.00142634,0.00157388,0.00173631,0.00191149,0.0021056,0.00231664,0.002546,0.00279703,0.00306919,0.00336398,0.00368494,0.00403983,0.0044181,0.00482864,0.00528281,0.00576994,0.00629357,0.00686118,0.00746985,0.00812703,0.00883318,0.00959238,0.0104157,0.0113063,0.0122547,0.0132599,0.0143466,0.0155083,0.0167347,0.0180761,0.0194834,0.0210115,0.0226533,0.0243721,0.0262055,0.0281419,0.0302082,0.0323939,0.0346934,0.0371555,0.0397423,0.0424612,0.0453589,0.0483556,0.051565,0.0549517,0.0584578,0.0456839,0.0356831,0.0278667,0.0217321,0.0124507,0.00713135,0.00407774,0.00233202,0.00133206,0.000760187,0.000433694,0.000247275,0.0001408,8.01781e-05,4.56003e-05,2.59313e-05,1.47241e-05,8.35575e-06,4.74029e-06,2.68694e-06,1.52173e-06,8.6076e-07,3.58077e-07,1.48776e-07,4.54321e-08,1.38603e-08,4.22592e-09,1.28784e-09,3.92493e-10;PREND=1.60193e-12,5.18007e-12,1.67413e-11,5.4061e-11,1.74436e-10,5.62531e-10,1.81387e-09,5.84474e-09,1.88173e-08,6.05297e-08,1.43127e-07,3.38115e-07,7.97957e-07,1.88204e-06,4.43648e-06,1.04494e-05,2.45908e-05,5.79005e-05,0.00013611,0.000319673,0.000750721,0.00129525,0.00223117,0.00384047,0.00660536,0.0113616,0.0195284,0.0335439,0.0423628,0.0533669,0.0672428,0.0622589,0.0575776,0.0531833,0.0491057,0.0452988,0.0417556,0.0384549,0.0354093,0.032573,0.029938,0.0274987,0.0252549,0.023167,0.021229,0.019
4569,0.017797,0.0162904,0.014887,0.0136008,0.0124182,0.0113243,0.0103224,0.00939972,0.00854981,0.00778039,0.00707119,0.00642393,0.00583038,0.0052865,0.00478996,0.00433636,0.0039262,0.00355107,0.00320953,0.00289894,0.00261671,0.00235971,0.00212805,0.00191905,0.0017272,0.0015528,0.00139606,0.00125391,0.00112608,0.00101048,0.000906271,0.000812251,0.000727689,0.000651261,0.00058214,0.000520214,0.00046478,0.000415112,0.000370278,0.000329947,0.000293837,0.000261718,0.000233019,0.000207189,0.000184022,0.000163375,0.000145038,0.000128753,0.00011421,0.000101195,8.96188e-05,7.93438e-05,7.01566e-05,6.19925e-05,5.47747e-05,4.83698e-05,4.26697e-05,3.76327e-05,3.31922e-05,2.92288e-05,2.57505e-05,2.26643e-05,1.99416e-05,1.75297e-05,1.5395e-05,1.35272e-05,1.18773e-05,1.04193e-05,9.13594e-06,8.0033e-06,7.01169e-06,6.13713e-06,5.3698e-06,4.69423e-06,4.10265e-06,3.58705e-06,3.13366e-06,2.73663e-06,2.387e-06,2.08064e-06,1.81305e-06,1.58105e-06,1.37671e-06,1.19791e-06,1.04212e-06,9.06448e-07,7.87742e-07,6.84457e-07,5.94194e-07,5.15947e-07,4.47581e-07,3.88141e-07,3.36345e-07,2.9126e-07,2.52314e-07,2.18402e-07,1.8911e-07,1.63574e-07,1.41307e-07,1.22182e-07,1.05733e-07,9.12859e-08,7.87183e-08,6.79301e-08,5.85981e-08,5.05226e-08,4.35437e-08,3.75217e-08,3.23281e-08,2.78234e-08,2.39355e-08,2.06031e-08,1.76973e-08,1.5206e-08,1.30646e-08,1.12187e-08,9.63118e-09,8.26114e-09,7.0934e-09,6.08558e-09,5.2167e-09,4.467e-09,3.82696e-09,3.27546e-09,2.80612e-09,2.39831e-09,2.05156e-09,1.75554e-09,1.50087e-09,1.28281e-09,1.09554e-09,9.36035e-10,7.98629e-10,6.82095e-10,5.82046e-10,4.97066e-10,4.23932e-10,3.61466e-10,3.08024e-10,2.62589e-10,2.23632e-10,1.9033e-10,1.61951e-10,1.37804e-10,1.17067e-10,9.95012e-11,8.4706e-11,7.19582e-11,6.11198e-11,5.18409e-11,4.39357e-11,3.72782e-11,3.16406e-11,2.68325e-11,2.27593e-11,1.92872e-11,1.6336e-11,1.38408e-11,1.1717e-11,9.91132e-12,8.38126e-12,7.09106e-12,6.0057e-12,5.08058e-12,4.29323e-12,3.63194e-12,3.06933e-12,2.59102e-12,2.18659e-12,1.84451e-12,1.55687e-12,1.3149
8e-12,1.10834e-12,9.35663e-13,7.87535e-13,6.63875e-13,5.59393e-13,4.71019e-13,3.96396e-13,3.33211e-13,2.80144e-13,2.35783e-13,1.98417e-13,1.6685e-13,1.4046e-13,1.17943e-13,9.91202e-14,8.31848e-14,6.97942e-14,5.86448e-14,4.92577e-14,4.12685e-14,3.4642e-14,2.90557e-14,2.43507e-14,2.042e-14,1.7105e-14,1.43193e-14,1.19786e-14,1.00251e-14,8.37263e-15,7.00925e-15,5.85439e-15,4.89656e-15,4.09101e-15,3.42474e-15,2.85944e-15,2.38418e-15,1.99302e-15,1.66148e-15,1.38737e-15,1.15847e-15,9.67867e-16,8.082e-16,6.73619e-16,5.61311e-16,4.67994e-16,3.89583e-16,3.24318e-16,2.6999e-16,2.24679e-16,1.87224e-16,1.55969e-16,1.29689e-16,1.07785e-16,8.95971e-17,7.43238e-17,6.1753e-17,5.12035e-17,4.24748e-17,3.52623e-17,2.92407e-17,2.42776e-17,2.01211e-17,1.66583e-17,1.38308e-17,1.14576e-17,9.48621e-18,7.85862e-18,6.50925e-18,5.39166e-18,4.46394e-18,3.69015e-18,3.05323e-18,2.52433e-18,2.08649e-18,1.72404e-18,1.42229e-18,1.17583e-18,9.70346e-19,8.03039e-19,6.6136e-19,5.44927e-19,4.49521e-19,3.71153e-19,3.05471e-19,2.51627e-19,2.07704e-19,1.71243e-19,1.41079e-19,1.16331e-19,9.5849e-20,7.89676e-20,6.4893e-20,5.33113e-20,4.39641e-20,3.60963e-20,2.96158e-20,2.43642e-20,2.00127e-20,1.64693e-20,1.35192e-20,1.10936e-20,9.09555e-21,7.45941e-21,6.10823e-21,5.00241e-21,4.10152e-21,3.35282e-21,2.75234e-21,2.26057e-21,1.85466e-21,1.51827e-21,1.2407e-21,1.01423e-21,8.29474e-22,6.79287e-22,5.55649e-22,4.54426e-22,3.70363e-22,3.0218e-22,2.46788e-22,2.01477e-22,1.64255e-22,1.33476e-22,1.0869e-22,8.85899e-23,7.21341e-23,5.89693e-23,4.79873e-23,3.90965e-23,3.17868e-23,2.58412e-23,2.10672e-23,1.71156e-23,1.39069e-23,1.13335e-23,9.20458e-24,7.46174e-24,6.0523e-24,4.90822e-24,3.99264e-24,3.23634e-24,2.62319e-24,2.12139e-24,1.7195e-24,1.39674e-24,1.13301e-24,9.16356e-25,7.42783e-25,5.99952e-25,4.86564e-25,3.93139e-25,3.17798e-25,2.56392e-25,2.06965e-25,1.67725e-25,1.35037e-25,1.09179e-25,8.82044e-26,7.11224e-26,5.7362e-26,4.61769e-26,3.71669e-26,3.00257e-26,2.41903e-26,1.9517e-26,1.56394e-26,1.25985e-26,1.0108e-26
,8.10446e-27,6.52616e-27,5.23757e-27,4.20787e-27,3.3784e-27,2.70747e-27,2.17758e-27,1.74616e-27,1.39938e-27,1.12423e-27,8.98328e-28,7.2177e-28,5.77605e-28,4.62721e-28,3.72423e-28,2.98228e-28,2.38844e-28,1.90661e-28,1.52254e-28,1.21584e-28,9.66618e-29,7.75439e-29,6.18765e-29,4.91961e-29,3.91862e-29,3.12079e-29,2.48379e-29,1.97712e-29,1.57292e-29,1.24754e-29,9.91794e-30,7.87013e-30,6.27246e-30,4.98796e-30,3.96071e-30,3.12928e-30,2.47561e-30,1.96128e-30,1.55817e-30,1.23793e-30,9.78015e-31,7.71831e-31,6.08715e-31,4.79871e-31,3.78294e-31,2.98013e-31,2.34581e-31,1.84434e-31,1.4517e-31,1.1468e-31,9.00848e-32,7.06493e-32,5.56105e-32,4.38915e-32,3.4439e-32,2.7066e-32,2.12583e-32,1.67266e-32,1.31452e-32,1.0228e-32,7.97587e-33,6.22882e-33,4.88406e-33,3.83511e-33,2.98288e-33,2.32358e-33,1.79945e-33,1.41041e-33,1.09867e-33,8.55814e-34,6.65193e-34,5.14319e-34,3.99901e-34,3.09902e-34,2.39545e-34,1.86102e-34,1.44509e-34,1.1123e-34,8.61359e-35,6.66623e-35,5.13627e-35,3.94591e-35,3.03502e-35,2.34052e-35,1.79638e-35,1.3725e-35,1.05388e-35,8.09454e-36,6.21267e-36,4.71633e-36,3.60381e-36,2.74727e-36,2.10252e-36,1.60102e-36,1.22023e-36,9.22983e-37,6.96674e-37,5.27841e-37,4.01505e-37,3.05985e-37,2.31147e-37,1.73321e-37,1.30582e-37,9.89065e-38,7.44347e-38,5.58784e-38,4.23208e-38,3.14884e-38,2.36466e-38,1.73682e-38,1.28966e-38,9.70088e-39,7.16193e-39,5.29797e-39,3.88432e-39,2.88511e-39,2.11582e-39,1.55223e-39,1.14455e-39,8.31613e-40,6.04416e-40,4.39585e-40,3.20284e-40,2.31788e-40,1.67534e-40,1.20247e-40,8.63489e-41,6.16995e-41,4.38024e-41,3.07196e-41,2.16925e-41,1.51796e-41,1.04772e-41,7.24079e-42,4.98257e-42,3.4353e-42,2.33082e-42,1.50917e-42,1.01995e-42,6.5684e-43,4.36444e-43,2.81189e-43,1.68191e-43,1.04223e-43,5.87278e-44,2.94649e-44,1.46779e-44;AC=1;AN=2 GT:GQ:SQ:GL:DP:RO:AO:QR:QA:RS:AS:ASC:RP:AP:AB 0/1:87:87.29:-30,-22,-65:766:708:58:707:57:432:0:1:275:56:0.075
2525154 15983 3 40055 3 3 86868689 913 N <DEL> 107.38 . SVTYPE=DEL;SVLEN=-645;END=86869334;STRANDS=+-:8;IMPRECISE;CIPOS=-10,9;CIEND=-10,9;CIPOS95=-1,1;CIEND95=-1,1;SU=8;PE=6;SR=2;PRPOS=9.34835e-09,5.75816e-08,3.54552e-07,2.18138e-06,1.34225e-05,8.25465e-05,0.000507887,0.00312377,0.0192155,0.118162,0.726266,0.112159,0.0173089,0.00267141,0.000412313,6.36031e-05,9.8136e-06,1.51375e-06,2.33501e-07,3.5994e-08;PREND=7.2586e-09,4.6283e-08,2.95094e-07,1.88106e-06,1.19898e-05,7.63892e-05,0.000486692,0.00310025,0.0197451,0.125741,0.797024,0.050409,0.00318797,0.000201609,1.27494e-05,8.05027e-07,3.7366e-08,1.73438e-09,8.05027e-11,3.7366e-12;AC=1;AN=2 GT:GQ:SQ:GL:DP:RO:AO:QR:QA:RS:AS:ASC:RP:AP:AB 0/1:107:107.38:-24,-13,-66:95:81:14:81:13:55:3:4:26:5:0.14
FD02525167 1402 9 2836 9 9 8848770 2532 N <DEL> 311.72 . SVTYPE=DEL;SVLEN=-232;END=8849002;STRANDS=+-:13;CIPOS=-10,9;CIEND=-10,9;CIPOS95=0,0;CIEND95=0,0;SU=13;PE=0;SR=13;PRPOS=9.99987e-53,1.58487e-47,2.51185e-42,3.98102e-37,6.30949e-32,9.99987e-27,1.58487e-21,2.51185e-16,3.98102e-11,6.30949e-06,0.999987,6.30949e-06,3.98102e-11,2.51185e-16,1.58487e-21,9.99987e-27,6.30949e-32,3.98102e-37,2.51185e-42,1.58487e-47;PREND=9.99987e-53,1.58487e-47,2.51185e-42,3.98102e-37,6.30949e-32,9.99987e-27,1.58487e-21,2.51185e-16,3.98102e-11,6.30949e-06,0.999987,6.30949e-06,3.98102e-11,2.51185e-16,1.58487e-21,9.99987e-27,6.30949e-32,3.98102e-37,2.51185e-42,1.58487e-47;AC=1;AN=2 GT:GQ:SQ:GL:DP:RO:AO:QR:QA:RS:AS:ASC:RP:AP:AB 0/1:118:311.72:-32,-1,-13:37:22:14:22:14:22:12:1:0:0:0.39
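Because full VCF records like the ones above are very long, it can be easier to print only CHROM and POS of the last record in each per-sample VCF. A minimal sketch, where `vcf_last_site` is a hypothetical helper name and the input is uncompressed VCF text on stdin:

```shell
# Print CHROM and POS (tab-separated) of the last data record in a VCF stream.
# Header lines start with '#' and are skipped.
vcf_last_site() {
    grep -v '^#' | tail -n 1 | cut -f 1,2
}

# Hypothetical usage over the per-sample outputs:
#   for f in results-smoove/*-smoove.genotyped.vcf.gz; do
#       printf '%s\t' "$f"; zcat "$f" | vcf_last_site
#   done
```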

And I did check the bams for these samples, they are OK and have reads across the genome.

I checked the memory usage, and it's using less than what I've requested so there shouldn't be an issue there. Most samples use 1-2 CPUs and >50Gb of mem, I've requested 2 CPU and 96GB mem (below is an example).

Walltime requested: 10:00:00 : Walltime used: 00:12:07 : walltime percent: 2.0%
-- Nodes Summary -----------------------------------------------------
-- node hpc213 summary
Cpus requested: 2 : Cpus Used: 1.71
Cpu Time: 00:20:44 : Cpu percent: 85.6%
Mem requested: 96.0GB : Mem used: 48.7GB : Mem percent: 50.7%

-- WARNINGS ----------------------------------------------------------

tracychew • May 23 '19 04:05

I'm not sure how it would be using 48GB of memory. Can you show the stderr and stdout of a failing sample?

If the memory use is that high, the sample is likely low quality (assuming these are <60X genomes). If it continues to fail, you could run smoove call with -F to avoid the higher memory steps, but I'd be interested to see the log first.

brentp • May 23 '19 12:05

Hi Brent,

I had originally requested 64 GB of memory for the other samples, and sometimes this wasn't enough, so I started requesting 96 GB. Most of the samples are < 60X (about 45X).

Here are the stderr, stdout and usage files for the first sample, 2525133:

stderr:

[tche7417@login2 smoove_cont]$ cat smoove1.e460403.1
Singularity is a powerful tool for creating reproducible, portable software environments. However, this control comes with the responsibility to maintain your own containers. Due to the almost infinite possible number of containerised applications available, we cannot help troubleshoot or support applications running inside Singularity containers. If your containerised application is not working, we recommend contacting the container developer for assistance.
[smoove] 2019/05/23 01:51:22 starting with version 0.2.2
[smoove] 2019/05/23 01:51:22 calculating bam stats for 1 bams
[smoove] 2019/05/23 01:51:44 done calculating bam stats
[smoove]: 2019/05/23 02:00:30 finished process: lumpy-filter (set -eu; lumpy_filter -f /project/skeldys/assets/hs37d5.fasta /project/skeldys/R_170717_ANDZAN/Align) in user-time:15m6.200236s system-time:1m40.44073s
[smoove] 2019/05/23 02:00:55 removed 759267 alignments out of 1181736 (64.25%) with low mapq, depth > 1000, or from excluded chroms from 2525133.split.bam in 25 seconds
[smoove] 2019/05/23 02:00:55 removed 19416 alignments out of 1181736 (1.64%) that were bad interchromosomals or flanked-splitters from 2525133.split.bam
[smoove] 2019/05/23 02:01:04 removed 280013 singletons of 403053 reads (69.47%) from 2525133.split.bam in 9 seconds
[smoove] 2019/05/23 02:01:04 123040 reads (10.41%) of the original 1181736 remain from 2525133.split.bam
[smoove] 2019/05/23 02:02:01 removed 1054321 alignments out of 3494846 (30.17%) with low mapq, depth > 1000, or from excluded chroms from 2525133.disc.bam in 91 seconds
[smoove] 2019/05/23 02:02:01 removed 436819 alignments out of 3494846 (12.50%) that were bad interchromosomals or flanked-splitters from 2525133.disc.bam
[smoove] 2019/05/23 02:02:33 removed 1613617 singletons and isolated interchromosomals of 2003706 reads (80.53%) from 2525133.disc.bam in 32 seconds
[smoove] 2019/05/23 02:02:33 390089 reads (11.16%) of the original 3494846 remain from 2525133.disc.bam
[smoove] 2019/05/23 02:02:34 starting lumpy
[smoove] 2019/05/23 02:02:34 wrote lumpy command to /scratch/skeldys/167_samples/smoove/results-smoove/2525133-lumpy-cmd.sh
[smoove] 2019/05/23 02:02:34 writing sorted, indexed file to /scratch/skeldys/167_samples/smoove/results-smoove/2525133-smoove.genotyped.vcf.gz
[smoove] 2019/05/23 02:02:34 excluding variants with all unknown or homozygous reference genotypes
[smoove] 2019/05/23 02:02:35 > gsort version 0.0.6
[smoove] 2019/05/23 02:02:35 1 1000000
[smoove] 2019/05/23 02:02:36 2 1000000
[smoove] 2019/05/23 02:02:38 3 1000000
[smoove] 2019/05/23 02:02:38
[smoove] 2019/05/23 02:02:39 4 1000000
[smoove] 2019/05/23 02:02:41 5
[smoove] 2019/05/23 02:02:41
[smoove] 2019/05/23 02:02:41 1000000
[smoove] 2019/05/23 02:02:41
[smoove] 2019/05/23 02:03:17 wrote sorted, indexed file to /scratch/skeldys/167_samples/smoove/results-smoove/2525133-smoove.genotyped.vcf.gz

stdout (empty):

[tche7417@login2 smoove_cont]$ cat smoove1.o460403.1
[tche7417@login2 smoove_cont]$

usage:

[tche7417@login2 smoove_cont]$ cat smoove1.o460403.1_usage
-- Job Summary -------------------------------------------------------
Job Id: 460403[1].pbstraining for user tche7417 in queue training
Job Name: smoove1
Project: RDS-CORE-Training-RW
Exit Status: 0
Job run as chunks (hpc213:ncpus=2:mem=100663296kb)
Array id: 460403[].pbstraining array index: 1
Walltime requested: 10:00:00 : Walltime used: 00:12:07 : walltime percent: 2.0%
-- Nodes Summary -----------------------------------------------------
-- node hpc213 summary
Cpus requested: 2 : Cpus Used: 1.71
Cpu Time: 00:20:44 : Cpu percent: 85.6%
Mem requested: 96.0GB : Mem used: 48.7GB : Mem percent: 50.7%

The other failed samples look similar to the above, some stopping later/earlier than chromosome 5, some using much less memory than 48Gb but going further along the genome.

tracychew • May 24 '19 05:05

hmm, and you're sure the original bams are ok? I would:

  1. check if orig.disc.bam and orig.split.bam go past chr5
  2. remove *.disc and split.bam and run with -F which skips most of the filtering
  3. run indexcov on a couple of the failed samples? it's just:
goleft indexcov -d output-dir/  *.bam

or for crams:

goleft indexcov -d output-dir/ --fai $reference.fai *.crai

this will make an html, I just want to be sure that you have aligned data after chr5
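Step 2 above (remove the filtered bams, then re-run with -F) could be scripted roughly as follows. This is only a dry-run sketch that prints the commands instead of executing them. `print_rerun_cmds` is a hypothetical helper, all paths are placeholders, the `.split.bam` / `.disc.bam` names follow the log output earlier in this thread, and the `smoove call` option spellings other than `-F` are assumed from the smoove README, so check them against `smoove call -h` before running anything.

```shell
# Print (do not run) the commands for a clean re-run of one sample.
# Arguments: sample name, input BAM, reference FASTA, output directory.
print_rerun_cmds() {
    sample=$1; bam=$2; fasta=$3; outdir=$4
    # Remove the previously written split/disc BAMs so they are regenerated.
    printf 'rm -f %s/%s.split.bam %s/%s.disc.bam\n' \
        "$outdir" "$sample" "$outdir" "$sample"
    # Re-run smoove call with -F, as suggested above (options assumed, see lead-in).
    printf 'smoove call -F --outdir %s --name %s --fasta %s -p 1 --genotype %s\n' \
        "$outdir" "$sample" "$fasta" "$bam"
}

# Placeholder arguments for illustration only.
print_rerun_cmds 2525133 /path/to/2525133.bam /path/to/hs37d5.fasta results-smoove
```

Printing first makes it easy to inspect the exact file names before anything is deleted.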

brentp • May 24 '19 13:05

any update on this? if you can privately share (part of) the bam, I could also have a look.

brentp • May 30 '19 16:05