
Error when converting AnnotSV TSV to VCF (single sample)

Open wwgordon opened this issue 1 year ago • 9 comments

Hello,

This is another issue relating to the AnnotSV -> variantconvert handoff for VCF creation. I have opened an issue in the variantconvert GitHub repository and am linking it here in case the problem arises from AnnotSV.

Essentially, I am having difficulty converting the AnnotSV TSV output for a single sample to a VCF with variantconvert. As far as I can tell the TSV looks OK, but the conversion consistently fails with the following error: ValueError: When using an AnnotSV file generated from a VCF, all samples in Samples_ID column are expected to have their own column in the input AnnotSV file.
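For anyone hitting the same error, the condition it describes can be checked directly on the TSV before running the conversion. The following is only a minimal sketch, not variantconvert's actual check: the function name `samples_missing_columns` is hypothetical, and it assumes that multiple sample names in Samples_ID are comma-separated.

```python
def samples_missing_columns(tsv_text, sep=","):
    """Return the sample names found in Samples_ID that have no column of their own.

    Assumption: multiple samples in Samples_ID are separated by `sep`.
    """
    lines = tsv_text.splitlines()
    header = lines[0].split("\t")
    idx = header.index("Samples_ID")  # raises ValueError if the column is absent
    columns = set(header)
    missing = set()
    for line in lines[1:]:
        fields = line.split("\t")
        if len(fields) <= idx:
            continue  # short or empty line, nothing to check
        for name in fields[idx].split(sep):
            name = name.strip()
            if name and name not in columns:
                missing.add(name)
    return sorted(missing)
```

An empty result means every sample named in Samples_ID has a matching column, so the cause of the error would have to be something else, such as empty Samples_ID cells.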

Thanks for all of your help and work! William

wwgordon avatar Feb 05 '24 18:02 wwgordon

This bothers me a lot, but I don't have the skills to debug this. It's really specific to variantconvert. Thank you for creating the variantconvert issue.

lgmgeo avatar Feb 05 '24 19:02 lgmgeo

Hi, I am facing the same problem and have just commented on the issue opened in the variantconvert GitHub repository. I actually think there is a bug in AnnotSV, as the root of the problem is the TSV file generated by AnnotSV. I am using single-sample VCFs as input, but the TSV generated by AnnotSV has missing values in some cells of the "Samples_ID" column:

image

It seems that variantconvert needs those values to be present to generate the VCF. If I manually modify the TSV reported by AnnotSV by writing "SAMPLE" in every cell under the column "Samples_ID", variantconvert is able to generate the VCF.
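The manual edit described above can be scripted. This is a rough sketch of that stopgap only, under a few assumptions: the input is a single-sample TSV, the helper name `fill_samples_id` is made up for illustration, and the sample name you pass in matches the sample column in your file. Note that it writes the sample name into every empty Samples_ID cell, which changes what the column says.

```python
def fill_samples_id(tsv_text, sample_name):
    """Fill empty Samples_ID cells with `sample_name` (single-sample files only)."""
    lines = tsv_text.splitlines()
    header = lines[0].split("\t")
    idx = header.index("Samples_ID")  # raises ValueError if the column is absent
    out = [lines[0]]
    for line in lines[1:]:
        fields = line.split("\t")
        # Only touch rows that have the column and where the cell is blank.
        if len(fields) > idx and not fields[idx].strip():
            fields[idx] = sample_name
        out.append("\t".join(fields))
    return "\n".join(out) + "\n"
```

Usage would be along the lines of reading the AnnotSV TSV into a string, calling `fill_samples_id(text, "SAMPLE")`, and writing the result to a new file so the original output is kept.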

So, I think the question is why AnnotSV is not including the sample ids for some entries in the reported TSV.

Edit: I've just seen an open issue about this here #180

fanavarro avatar Mar 07 '24 10:03 fanavarro

> So, I think the question is why AnnotSV is not including the sample ids for some entries in the reported TSV.

If the GT is 0/0 for a given sample, AnnotSV does not report the sample name in the Samples_ID column.

-- cf README:

FYI 1) Samples_ID: List of the samples ID for which the SV was called (according to the SV input file)

FYI 2) image

FYI 3) image
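To illustrate the rule, here is a small sketch. It is not AnnotSV's code: `samples_called` is a hypothetical helper, and only the 0/0 case stated above is modelled (how other genotype values are handled is not covered here).

```python
def samples_called(genotypes):
    """Return the samples that would be listed in Samples_ID.

    `genotypes` maps sample name -> GT string. Per the rule above,
    a sample whose GT is 0/0 is not reported.
    """
    return [sample for sample, gt in genotypes.items() if gt != "0/0"]


# Single-sample input where the only sample is 0/0: nothing is listed,
# which corresponds to an empty Samples_ID cell in the TSV.
single = samples_called({"SAMPLE": "0/0"})

# Two samples, one of them 0/0: only the other one is listed.
multi = samples_called({"S1": "0/1", "S2": "0/0"})
```

With a single-sample VCF this is why some Samples_ID cells end up empty: the SV is present in the file, but the sample's genotype is 0/0.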

lgmgeo avatar Mar 11 '24 08:03 lgmgeo

@lgmgeo thanks for the answer, now it's clear to me.

fanavarro avatar Mar 12 '24 10:03 fanavarro

Hi, has the bug in variantconvert been fixed yet?

gevro avatar Mar 19 '24 14:03 gevro

The developer is working on it but I don't dare give a deadline...

lgmgeo avatar Mar 19 '24 14:03 lgmgeo

variantconvert version 2.0.0 is posted. This should fix the bug. I'm going on vacation this evening. I'll take care of it when I get back.

lgmgeo avatar Apr 19 '24 09:04 lgmgeo

AnnotSV 3.4.1 is posted with an updated version of variantconvert (v2.0.1).

Can you confirm that this fixes your bug?

lgmgeo avatar May 06 '24 13:05 lgmgeo

Thanks! I will check

gevro avatar May 06 '24 13:05 gevro

Tested and working on my end. Thank you!

wwgordon avatar May 07 '24 00:05 wwgordon