Is the SmallPedigree-WGS README.md example out of date?
The example in the README.md:
dotnet /CanvasDIR/Canvas.dll SmallPedigree-WGS --bam=/basespace/Projects/canvas/AppResults/bams/Files/father.bam --bam=/basespace/Projects/canvas/AppResults/bams/Files/mother.bam --bam=/basespace/Projects/canvas/AppResults/bams/Files/child1.bam --mother=mother --father=father --proband=child1 -r /basespace/Projects/canvas/AppResults/canvasdata/Files/kmer.fa -g /basespace/Projects/canvas/AppResults/canvasdata/Files/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta --sample-b-allele-vcf /basespace/Projects/canvas/AppResults/snvvcf/Files/Pedigree.vcf.gz -f /basespace/Projects/canvas/AppResults/canvasdata/Files/filter13.bed -o /tmp/gHapMixDemo --ploidy-vcf="/basespace/Projects/canvas/AppResults/snvvcf/Files/MultiSamplePloidy.vcf"
This seems out of date compared with the --help output:
Mode-specific options:
  -b, --bam=VALUE1 VALUE2 VALUE3
      bam [pedigree-member] [sample-name] Option can be specified multiple times. (required)
      bam: sample .bam file (required)
      pedigree-member: Pedigree member type (either proband, mother, father or other). Default is other
Also -- is proband even an option? I've been using it like so:
dotnet /misc/vcgs/exome/cpipe-2.3-research/tools/canvas/1.39.0.1598/Canvas.dll SmallPedigree-WGS --bam=/path/to/FATHERID.bam father FATHERID --bam=/path/to/MOTHERID.bam mother MOTHERID --bam=/path/to/PROBANDID.bam proband PROBANDID.......(+otherargs)
but I'm getting:
2018-11-24T16:36:08+11:00,Running checkpoint 01: Validate input
2018-11-24T16:36:08+11:00,ERROR: Error: found unexpected arguments '--proband=PROBANDID'
I was checking the source code in ModeParserTests.cs and couldn't see anything related to setting bamfile.FullName, "proband", "SampleID".
What am I doing wrong?
I managed to get this to run by not specifying any of the relationships next to the bam args and passing just the bam files, i.e. -b /path/to/mother.bam -b /path/to/father.bam -b /path/to/proband.bam
Will it infer the relationships from the multi-sample VCF?
Changing the issue: when running SmallPedigree-WGS, I'm now running into this error:
Job error message:
2018-11-27T12:23:43+11:00,ERROR: Exception caught in WorkDoerFactory. Cancelling all jobs. Exception:
Cannot calculate median of an empty SortedList.
System.Exception: Cannot calculate median of an empty SortedList.
at Illumina.Common.SortedListExtensions.Median[T](SortedList`1 list, Func`3 average)
at CanvasPedigreeCaller.SampleMetrics.GetSampleInfo(IReadOnlyList`1 segments, String ploidyBedPath, Int32 numberOfTrimmedBins, SampleId id) in D:\TeamCity\buildAgent\work\a29a190a11771d97\Src\Canvas\CanvasPedigreeCaller\SampleMetrics.cs:line 38
at CanvasPedigreeCaller.CanvasPedigreeCaller.CallVariants(List`1 variantFrequencyFiles, List`1 segmentFiles, IFileLocation outVcfFile, String ploidyBedPath, String referenceFolder, List`1 sampleNames, String commonCnvsBedPath, List`1 sampleTypes) in D:\TeamCity\buildAgent\work\a29a190a11771d97\Src\Canvas\CanvasPedigreeCaller\CanvasPedigreeCaller.cs:line 93
The --bam argument takes three values, for example -b=/path/to/my.bam,mother,MotherID. The default sample type is "OTHER" and the default sample ID is the ID specified in the SM tag of the bam.
Hi @eroller,
I ended up running SmallPedigree-WGS both with and without specifying the familial relationship and ID for each bam. In both cases I'm getting this error:
Job error message:
2018-11-28T10:48:44+11:00,ERROR: Exception caught in WorkDoerFactory. Cancelling all jobs. Exception:
Cannot calculate median of an empty SortedList.
System.Exception: Cannot calculate median of an empty SortedList.
at Illumina.Common.SortedListExtensions.Median[T](SortedList`1 list, Func`3 average)
at CanvasPedigreeCaller.SampleMetrics.GetSampleInfo(IReadOnlyList`1 segments, String ploidyBedPath, Int32 numberOfTrimmedBins, SampleId id) in D:\TeamCity\buildAgent\work\a29a190a11771d97\Src\Canvas\CanvasPedigreeCaller\SampleMetrics.cs:line 38
at CanvasPedigreeCaller.CanvasPedigreeCaller.CallVariants(List`1 variantFrequencyFiles, List`1 segmentFiles, IFileLocation outVcfFile, String ploidyBedPath, String referenceFolder, List`1 sampleNames, String commonCnvsBedPath, List`1 sampleTypes) in D:\TeamCity\buildAgent\work\a29a190a11771d97\Src\Canvas\CanvasPedigreeCaller\CanvasPedigreeCaller.cs:line 93
Am I missing something simple?
Make sure the sample IDs in the --sample-b-allele-vcf file match the SM tags in the bam headers. If they don't, you will need to specify the correct sample ID for each bam file on the command line.
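The check suggested above can be sketched in Python. This is only an illustration, not part of Canvas: it assumes you have the bam header as text (for example from `samtools view -H your.bam`) and the VCF header as text, and the function names and the sample names in the demo are made up for this example.

```python
def bam_sample_ids(header_text):
    """Collect the SM values from the @RG lines of a SAM/BAM header given as text."""
    ids = set()
    for line in header_text.splitlines():
        if line.startswith("@RG"):
            # @RG fields are tab-separated TAG:value pairs
            for field in line.split("\t")[1:]:
                if field.startswith("SM:"):
                    ids.add(field[3:])
    return ids


def vcf_sample_ids(vcf_text):
    """Return the sample column names from the #CHROM header line of a VCF."""
    for line in vcf_text.splitlines():
        if line.startswith("#CHROM"):
            # The first 9 columns are fixed (CHROM..FORMAT); samples follow
            return line.split("\t")[9:]
    return []


def missing_from_vcf(bam_ids, vcf_ids):
    """Sample IDs found in the bam header(s) but absent from the VCF."""
    return sorted(set(bam_ids) - set(vcf_ids))


# Made-up demo data, only to show the shape of the inputs.
demo_header = "@HD\tVN:1.6\n@RG\tID:rg1\tSM:sampleA\tPL:ILLUMINA\n"
demo_vcf = (
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsampleB\tsampleC\n"
)
print(missing_from_vcf(bam_sample_ids(demo_header), vcf_sample_ids(demo_vcf)))
```

Any ID this reports is one whose bam would need the matching sample ID given explicitly on the command line. For a gzipped VCF, the header text can be read with `gzip.open(path, "rt")`.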
I have another question. According to https://github.com/Illumina/canvas/wiki, comparing family samples should produce a DQ score. In my comparison, all DQ score fields are ".". Can you explain whether this is the correct behavior of the application? When is the DQ score calculated in the Small Pedigree workflow?
Canvas SmallPedigree-WGS --bam=./bam_fam/CF_6924.bam father CF_6924 --bam=./bam_fam/CF_6925.bam mother CF_6925 --bam=./bam_fam/CF_6916.bam proband CF_6916 --sample-b-allele-vcf=./temp2.vcf -o ./CNV/FAMILY -r ./data/kmers.fasta -g ./data/canFam3/ --filter-bed=./data/filter.bed --ploidy-vcf=./data/ploidy.vcf
Example:
9 19897945 Canvas:REF:9:19897945-31453316 N . . PASS END=31453316;CIPOS=-609,609;CIEND=-537,549 GT:RC:BC:CN:MCC:MCCQ:QS:FT:DQ ./.:105.97:9649:2:.:.:16.87:PASS:. ./.:104.99:9649:2:.:.:16.73:PASS:. ./.:102.00:9649:2:.:.:16.43:PASS:.
DQ is calculated when there is a conflicting set of copy number genotypes in the trio (Mother/Father/Child). In the example you showed, each sample has the reference copy number of 2, so there is no conflict. An example of a conflict would be a deletion in the child (e.g. CN=1) with the reference copy number in each parent (i.e. CN=2).
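To make that concrete, here is a minimal Python sketch that reads the CN and DQ fields from the record posted above and tests for the single conflict pattern given as an example (both parents at the reference copy number, child not). It is not Canvas's actual DQ logic, which is not shown in this thread. The function names are hypothetical, and the sample column order (father, mother, proband) is assumed to follow the order of the --bam arguments.

```python
RECORD = (
    "9 19897945 Canvas:REF:9:19897945-31453316 N . . PASS "
    "END=31453316;CIPOS=-609,609;CIEND=-537,549 "
    "GT:RC:BC:CN:MCC:MCCQ:QS:FT:DQ "
    "./.:105.97:9649:2:.:.:16.87:PASS:. "
    "./.:104.99:9649:2:.:.:16.73:PASS:. "
    "./.:102.00:9649:2:.:.:16.43:PASS:."
)


def parse_sample_fields(vcf_line):
    """Return one {FORMAT key: value} dict per sample column.

    Splits on any whitespace because the record above was pasted without
    tabs; a real VCF line is tab-delimited and should be split on tabs.
    """
    cols = vcf_line.split()
    keys = cols[8].split(":")  # column 9 is FORMAT
    return [dict(zip(keys, sample.split(":"))) for sample in cols[9:]]


def copy_number(sample):
    """CN as an int, or None when the field is missing ('.')."""
    value = sample.get("CN", ".")
    return None if value == "." else int(value)


def child_differs_from_reference_parents(father, mother, child, ref_cn=2):
    """True only for the example pattern: parents at ref_cn, child not."""
    cns = [copy_number(s) for s in (father, mother, child)]
    if None in cns:
        return False
    return cns[0] == ref_cn and cns[1] == ref_cn and cns[2] != ref_cn


# Assumed column order: father, mother, proband.
father, mother, child = parse_sample_fields(RECORD)
print([s["CN"] for s in (father, mother, child)])
print(child_differs_from_reference_parents(father, mother, child))
```

For the posted record all three samples have CN=2, so the check finds no conflict, which is consistent with DQ being "." in every sample column.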