SeqArray
Create GDS file from imputed data using dosage || No variable 'annotation/format/DS' in the FORMAT field.
Hi @zhengxw-ab
I would like to create a GDS file from a VCF of imputed data, keeping the dosage information intact. I use the following command:
seqVCF2GDS("CHR22.recode.vcf.gz","check.gds",verbose=TRUE,genotype.var.name="annotation/format/DS",scenario=c("imputation"))
I get the following messages:
Wed Aug 4 13:14:47 2021
Variant Call Format (VCF) Import:
file(s):
CHR22_CHGWAS_rsq80_MAC10.recode.vcf.gz (1.8G)
file format: VCFv4.1
the number of sets of chromosomes (ploidy): 2
the number of samples: 12,508
genotype storage: bit2
compression method: LZMA_RA
# of samples: 12508
scenario: imputation
annotation/format/DS: packedreal16
annotation/format/GP: packedreal16
No variable 'annotation/format/DS' in the FORMAT field.
Output:
check.gds
Parsing 'CHR22_CHGWAS_rsq80_MAC10.recode.vcf.gz':
It says there is no variable 'annotation/format/DS' in the FORMAT field.
How do I get the seqVCF2GDS function to pick up the dosage values?
Thanks,
You misused "genotype.var.name"; dosages are always stored in 'annotation/format/DS'.
Remove the argument genotype.var.name="annotation/format/DS" from the call.
@zhengxwen thank you for your reply.
Would the following command be fine to tell seqVCF2GDS to use the dosage values?
SeqArray::seqVCF2GDS(vcf.fn,output_file, verbose=TRUE,scenario=c("imputation"))
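One way to check whether the dosages were imported is to open the resulting GDS file and read 'annotation/format/DS' back. The following is a minimal sketch, not a confirmed answer from the maintainer: the file names are placeholders taken from the first post, and it assumes the VCF really carries a DS entry in its FORMAT column. If it does not, the seqGetData call will fail.

```r
library(SeqArray)

# Placeholder paths (assumptions); substitute your own files.
vcf.fn <- "CHR22.recode.vcf.gz"
output_file <- "check.gds"

# Convert without genotype.var.name, so genotypes are read from the
# default GT field and DS is imported as a FORMAT annotation.
seqVCF2GDS(vcf.fn, output_file, verbose=TRUE, scenario="imputation")

# Open the GDS file and print its structure; a DS node should appear
# under annotation/format if the import picked it up.
f <- seqOpen(output_file)
print(f)

# Restrict to the first five variants to keep the read small,
# then pull the dosage values.
seqSetFilter(f, variant.sel=1:5)
ds <- seqGetData(f, "annotation/format/DS")
str(ds)

seqClose(f)
```

The shape of `ds` (a matrix, or a list with length and data components) can depend on the SeqArray version and on the field, so `str()` is used here instead of assuming one layout.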