NanoSV with truvari

Open Tintest opened this issue 5 years ago • 3 comments

Hello, I'm trying to benchmark NIST002 VCF files produced by NanoSV against the GIAB NIST002 SV truth set, but no variant is classified as a true positive.

Here are the logs:

2019-11-20 12:09:19,348 [INFO] Running /home/tintest/miniconda3/bin/truvari -b ../HG002_SVs_Tier1_v0.6.vcf.gz -c /home/tintest/bettik/SV/nanopore/Biomnis/vcf/nanosv/NIST-002_merged_ngmlr_sorted.vcf.gz -o NIST-002_merged_ngmlr_sorted -r 2000 --pctsim 0 --passonly --includebed ../HG002_SVs_Tier1_v0.6.bed --giabreport
2019-11-20 12:09:19,348 [INFO] Params:
{
    "sizemax": 50000,
    "reference": null,
    "noprog": false,
    "multimatch": false,
    "pctsize": 0.7,
    "cSample": null,
    "includebed": "../HG002_SVs_Tier1_v0.6.bed",
    "no_ref": false,
    "passonly": true,
    "pctsim": 0.0,
    "pctovl": 0.0,
    "comp": "/home/tintest/bettik/SV/nanopore/Biomnis/vcf/nanosv/NIST-002_merged_ngmlr_sorted.vcf.gz",
    "refdist": 2000,
    "base": "../HG002_SVs_Tier1_v0.6.vcf.gz",
    "giabreport": true,
    "sizefilt": 30,
    "typeignore": false,
    "gtcomp": false,
    "debug": false,
    "output": "NIST-002_merged_ngmlr_sorted",
    "bSample": null,
    "sizemin": 50
}
2019-11-20 12:09:20,646 [INFO] Including 34830 bed regions
2019-11-20 12:09:20,646 [INFO] Creating call interval tree for overlap search
2019-11-20 12:09:30,717 [INFO] 35145 call variants in total
2019-11-20 12:09:30,717 [INFO] 0 call variants within size range (30, 50000)
2019-11-20 12:09:49,403 [INFO] 20041 base variants
2019-11-20 12:09:49,423 [INFO] Matching base to calls
2019-11-20 12:10:09,183 [WARNING] No TP or FP calls in base!
2019-11-20 12:10:09,183 [INFO] Parsing FPs from calls
2019-11-20 12:10:18,714 [INFO] Stats: {
    "TP-base": 0,
    "TP-call": 0,
    "FP": 0,
    "FN": 9641,
    "precision": 0,
    "recall": 0,
    "f1": "NaN",
    "base cnt": 9641,
    "call cnt": 0,
    "base size filtered": 6309,
    "call size filtered": 0,
    "base gt filtered": 0,
    "call gt filtered": 0,
    "TP-call_TP-gt": 0,
    "TP-call_FP-gt": 0,
    "TP-base_TP-gt": 0,
    "TP-base_FP-gt": 0,
    "gt_precision": 0,
    "gt_recall": 0,
    "gt_f1": "NaN"
}
2019-11-20 12:10:18,715 [INFO] Creating GIAB report
2019-11-20 12:10:20,919 [INFO] Finished

Have you ever tried truvari with a VCF from NanoSV?

It works flawlessly with SV callers such as Svim, Pbsv or Sniffles.

Here is the link to the VCF: https://filesender.renater.fr/?s=download&token=8905688b-e98a-859c-c841-ad7e9088a2c6

Here is the NanoSV command used to produce the VCF file from a BAM produced by ngmlr:

singularity exec -B /bettik/tintest/:/mnt /home/tintest/bettik/SV/nanopore/Chaissonetal2019/nanosv.simg NanoSV --bed /mnt/SV/nanopore/human_hg19.bed -t 4 -s samtools /mnt/SV/nanopore/Biomnis/bam/ngmlr/NIST-002_merged_ngmlr.bam -o /mnt/SV/nanopore/Biomnis/vcf/nanosv/NIST-002_merged_ngmlr.vcf

I get the same problem with NanoSV VCFs made from BAM files produced by minimap2.

I guess the problem comes from the NanoSV VCF format.

Regards.

Tintest avatar Nov 20 '19 17:11 Tintest

I ran into the same issue: Truvari detected no SVs within the size range.

yekaizhou avatar Apr 11 '21 05:04 yekaizhou

You may need to remove the --passonly flag, since NanoSV does not use PASS in the VCF FILTER field. You also need to change RT=3 to RT=. (or RT=1) in the VCF header.
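
To check whether --passonly is what drops the calls, one option is to tally the FILTER column of the NanoSV VCF. This is only a sketch: the function names are hypothetical, and it assumes a standard tab-separated VCF in which FILTER is the seventh column.

```python
import gzip
from collections import Counter


def filter_counts(lines):
    """Tally the FILTER column (7th tab-separated field) of VCF record lines."""
    counts = Counter()
    for line in lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 7:
            counts[fields[6]] += 1
    return counts


def filter_counts_from_file(path):
    """Hypothetical helper: read a plain or gzip/bgzip-compressed VCF."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return filter_counts(handle)
```

If the resulting counter has no PASS key, every call is discarded by --passonly before any matching takes place.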

LYC-vio avatar Nov 24 '21 21:11 LYC-vio

Also change ##FILTER=<ID=Gap to ##FILTER=<ID=GAP in the header.
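
A minimal sketch of that header rename, assuming the VCF is available as uncompressed text lines. The function names are hypothetical. The RT change mentioned earlier is left out because the exact header line is not shown in this thread.

```python
import re

# Match the FILTER header whose ID is exactly "Gap" (followed by ',' or '>').
_GAP_FILTER = re.compile(r"^##FILTER=<ID=Gap(?=[,>])")


def fix_header_line(line):
    """Rename the Gap FILTER definition to GAP; leave other lines untouched."""
    return _GAP_FILTER.sub("##FILTER=<ID=GAP", line)


def fix_header(lines):
    """Yield VCF lines with the FILTER header renamed; records pass through."""
    for line in lines:
        yield fix_header_line(line) if line.startswith("##") else line
```

The fixed lines can then be written to a new file, which needs to be compressed with bgzip and indexed with tabix again before truvari can read it.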

LYC-vio avatar Jan 03 '22 23:01 LYC-vio