gtc2vcf

Any suggestions on handling manifests with missing RefStrand and SourceSeq columns?

Open rajwanir opened this issue 3 months ago • 2 comments

Hi,

I am compiling a data catalog for my department in which the data sets span 27+ manifests from different periods, some of them from the pre-2009 era. Many of these manifests lack the RefStrand and SourceSeq information in both the binary bpm and the csv version. My goal is to use these manifests to convert GTCs to VCFs. I assume the gtc2vcf plugin absolutely requires these columns, or at least the SourceSeq column, to compute the RefStrand by aligning the flank sequences. Could you suggest a workaround in the absence of both columns?

I have the following columns consistently populated across the different manifests that I am working with:

address_a_id chr genome_build ilmn_id ilmn_strand map_info name ploidy snp source source_strand source_version species

I am working with the following set of manifests for now:

HumanOmni2.5-4v1_B.csv
GSAMD-24v1-0_20011747_A1.csv
GSAMD-24v2-0_20024620_B1.csv
Human610-Quadv1_B.csv
Human660W-Quad_v1_A.csv
Cardio-Metabo_Chip_11395247_C.csv
Rare_Cancer_272049_A.csv
HumanOmniExpress-12v1_A.csv
Peguses_FU_11602373_A.csv
Human1M-Duov3_B.csv
HumanHap550v3_B.csv
Consortium-OncoArray_15047405_A.csv
Cancer_BeadChip_11459870_B.csv
Immuno_BeadChip_11419691_B.csv
CGEMS_P_F2_272225_A.csv
Breast_Wide_Track_271628_A.csv
BDCHP-1X10-HUMANHAP550_11218540_C.csv
HumanOmni2.5S-8v1_B.csv
HumanOmni1-Quad_v1-0_B.csv
HumanExome-12v1_A.csv
HumanOmni1S-8v1_A.csv
GSAv3Confluence_20032937X371431_A1.csv
HumanOmni25-4v1_C.csv
HumanOmni2.5-8v1_A.csv
HumanOmni2.5-8v1_A.csv
HumanOmni5-4v1_B.csv
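For reference, a minimal sketch of how these manifests could be audited to confirm which ones lack the two columns. It is not part of gtc2vcf, and the names `REQUIRED`, `manifest_columns`, `missing_columns`, and `audit` are hypothetical. It assumes one of two layouts: the standard Illumina CSV layout, where the column header follows an `[Assay]` line, or a flat export whose first non-blank line is the header (as the lowercase column names above suggest). Column names are compared case-insensitively.

```python
import csv

# Columns in question, lower-cased for case-insensitive matching (assumption).
REQUIRED = ("refstrand", "sourceseq")


def _split(header_line):
    """Parse one CSV header line into lower-cased, stripped column names."""
    return [c.strip().lower() for c in next(csv.reader([header_line]))]


def manifest_columns(lines, max_scan=50):
    """Return the column names of a manifest given an iterable of lines.

    If an "[Assay]" marker is found within roughly the first `max_scan`
    lines, the next non-blank line is taken as the header. Otherwise the
    first non-blank line is assumed to be the header (flat export).
    """
    first = None
    after_assay = False
    for n, line in enumerate(lines):
        s = line.strip()
        if not s:
            continue
        if after_assay:
            return _split(s)
        if s.lower().startswith("[assay]"):
            after_assay = True
            continue
        if first is None:
            first = s
        if n >= max_scan:
            break
    if after_assay:
        return []  # "[Assay]" marker seen but nothing followed it
    return _split(first) if first else []


def missing_columns(lines, required=REQUIRED):
    """List the required columns that are absent from the manifest header."""
    cols = set(manifest_columns(lines))
    return [c for c in required if c not in cols]


def audit(paths):
    """Map each manifest path to the required columns it lacks."""
    report = {}
    for path in paths:
        with open(path, newline="", encoding="utf-8", errors="replace") as fh:
            report[path] = missing_columns(fh)
    return report
```

Calling `audit([...])` with the file names listed above would return, for each manifest, the subset of `refstrand` and `sourceseq` that is absent, which separates the manifests that can be used as they are from those that need a workaround.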

I would much appreciate it if you could share any possible workaround. Thank you.

rajwanir, Mar 15 '24 14:03