octopus
Is there an option to reduce the number of temporary VCF files?
Hi,
It seems that Octopus opens one temporary VCF file per contig/scaffold. Many non-model species have reference genomes with a very large number of contigs/scaffolds; for example, the assembly at https://www.ncbi.nlm.nih.gov/assembly/GCA_000966675.2/ has 4,464,856 contigs/scaffolds. I think Octopus cannot be used for these species because too many files need to be opened.
Are there any options to reduce the number of temporary VCF files, or could you please add one?
Best, Kun
Hi, you're correct that Octopus creates a temporary VCF for each contig in the input regions; this is to enable parallel processing of each contig. However, these temporary VCFs are opened dynamically, so there should only be one temporary VCF open at any one time. If you're running into problems, can you post the error you're seeing?
Hi,
Octopus itself reports no errors, but the number of files reached my system's limit, and I cannot write anything until those temporary VCFs are removed. I have 4000+ individuals to call. I think you could move each contig's temporary VCF into the individual's VCF file as soon as that contig is finished, and remove the temporary file immediately.
Best, Kun
Or could I join contigs/scaffolds with runs of Ns to construct longer scaffolds, and so reduce the number of temporary VCF files?
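To make the N-joining idea concrete, here is a minimal sketch of how contigs might be packed into longer "super-scaffolds" while recording where each original contig starts. This is not an Octopus feature. The function name `pack_contigs`, the `super_N` naming, and the `spacer_len` and `max_len` defaults are all hypothetical choices made for illustration, and any variants called against the packed reference would still need their positions translated back using the recorded offsets.

```python
from typing import Dict, Iterable, Tuple


def pack_contigs(
    contigs: Iterable[Tuple[str, str]],
    spacer_len: int = 100,        # assumed spacer size, not a recommended value
    max_len: int = 1_000_000,     # assumed cap on super-scaffold length
) -> Tuple[Dict[str, str], Dict[str, Tuple[str, int]]]:
    """Join (name, sequence) contigs with N spacers into super-scaffolds.

    Returns (scaffolds, offsets):
      scaffolds maps super-scaffold name -> joined sequence
      offsets maps contig name -> (super-scaffold name, 0-based start)
    """
    spacer = "N" * spacer_len
    scaffolds: Dict[str, str] = {}
    offsets: Dict[str, Tuple[str, int]] = {}
    parts = []   # contig sequences in the super-scaffold being built
    length = 0   # current length of that super-scaffold, spacers included

    for name, seq in contigs:
        added = len(seq) + (spacer_len if parts else 0)
        # Close the current super-scaffold if this contig would overflow it.
        if parts and length + added > max_len:
            scaffolds[f"super_{len(scaffolds) + 1}"] = spacer.join(parts)
            parts, length = [], 0
            added = len(seq)
        scaffold_name = f"super_{len(scaffolds) + 1}"
        start = length + (spacer_len if parts else 0)
        offsets[name] = (scaffold_name, start)
        parts.append(seq)
        length += added

    # Write out the last, partially filled super-scaffold.
    if parts:
        scaffolds[f"super_{len(scaffolds) + 1}"] = spacer.join(parts)
    return scaffolds, offsets


if __name__ == "__main__":
    toy = [("c1", "ACGT"), ("c2", "GG"), ("c3", "TTTTT")]
    scaffolds, offsets = pack_contigs(toy, spacer_len=3, max_len=10)
    print(scaffolds)
    print(offsets)
```

With a mapping like `offsets`, a position on a super-scaffold can be converted back to the original contig by subtracting that contig's start. A contig longer than `max_len` simply gets a super-scaffold of its own in this sketch.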