glactools icon indicating copy to clipboard operation
glactools copied to clipboard

epo file format

Open xinkaitong opened this issue 4 years ago • 27 comments

hello, grenaud! now, I just have an outgroup vcf file (other mammals not human), could you tell me how can I transform vcf file to epo file?

xinkaitong avatar Nov 26 '21 03:11 xinkaitong

That is very tough! What is much easier is to transform that VCF to ACF, and use the specific population as the root/anc. However be wary that you don't really have an ancestor, you have an outgroup :-)

On Fri, Nov 26, 2021 at 4:49 AM xinkaitong @.***> wrote:

hello, grenaud! now, I just have an outgroup vcf file (other mammals not human), could you tell me how can I transform vcf file to epo file?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/grenaud/glactools/issues/25, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAQRNIZXIMTP352MCJZF7ITUN37VBANCNFSM5IZYG5KQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

grenaud avatar Nov 26 '21 12:11 grenaud

Hello, If I want to calculate β2 using an outgroup to call substitutions and ancestral alleles, can I just get epo files from outgroup vcf file, without common ancestor? Thank you very much

DongmeiDai avatar Oct 06 '23 08:10 DongmeiDai

Yes! You need to convert VCF to ACF and run replaceancestor I believe. Let me know if that works!

grenaud avatar Oct 06 '23 09:10 grenaud

Thank you. I will try it ! If I use vcfm2acf to convert VCF to ACF , then I use usepopsrootanc to tell the root(outgroup), finally I use the replaceanc, the resulting file is the epo file that I need in betascan2 software? Does this epo file include all the individuals I calculate later?

DongmeiDai avatar Oct 06 '23 09:10 DongmeiDai

Hello, I have another question. Is the file2.acf.gz in https://github.com/grenaud/glactools the result using the glactools with vcfm2acf --onlyGT --fai parameter? I am not sure if using these parameters is correct?

Thanks

DongmeiDai avatar Oct 06 '23 10:10 DongmeiDai

Hello, the file1.acf.gz only have a outgroup individual: image when I use the "usepopsrootanc" parameter, I only tell the outgroup, then the error is as follows image I think it lack ancestral information, it is right? so the file1.acf.gz needs the root and ancestor information?

DongmeiDai avatar Oct 06 '23 13:10 DongmeiDai

Hello, What does "ChimpHumanAncestor " refer to specifically?

Thank you very much

DongmeiDai avatar Oct 07 '23 03:10 DongmeiDai

Thank you. I will try it ! If I use vcfm2acf to convert VCF to ACF , then I use usepopsrootanc to tell the root(outgroup), finally I use the replaceanc, the resulting file is the epo file that I need in betascan2 software? Does this epo file include all the individuals I calculate later?

Would you describe seems like it will work. The EPO is only human+chimp+other great apes and the inferred ancestral states, there is a single human "individual" and that is the reference

grenaud avatar Oct 07 '23 17:10 grenaud

Hello, I have another question. Is the file2.acf.gz in https://github.com/grenaud/glactools the result using the glactools with vcfm2acf --onlyGT --fai parameter? I am not sure if using these parameters is correct?

Thanks

I am not sure how this was generated but acf files can be generated in very different ways.

grenaud avatar Oct 07 '23 17:10 grenaud

There is root and ancestor. In the case of humans, we can think of the chimps as the "root" as all humans are equitistant to the chimp. However there is a chimp-human ancestor, we do not have access to the sequence of its genome as it died out 6 million years ago. However we can make a probabilistic inference of what this genome had in terms of alleles. This is precisely what EPO does.

grenaud avatar Oct 07 '23 17:10 grenaud

Hello, Thank you for your reply!

  1. May I ask if I have only root and ancestor, but no groups like other apes, is it OK to generate the follow file.acf.gz as my epo file for Betascan2? image

  2. After I get the epo file, I get an error adding the index file? How to create this index file? image

  3. The first line of the epo file seems to be different from the example you showed. Is it correct for my epo file? image Thanks

DongmeiDai avatar Oct 09 '23 09:10 DongmeiDai

How did you generate this file? Could you paste the command lines?

grenaud avatar Oct 11 '23 21:10 grenaud

I think I misunderstood your meaning. Is file4.acf.gz in your example (glactools replaceanc file2.acf.gz file3.acf.gz > file4.acf.gz) equal to the 1kg_chr6.AFRsubset.acf.gz in following code ( glactools vcfm2acf --onlyGT --epo all.epo.gz --fai human_g1k_v37.fasta.fai chr6.AFRsubset.recode.vcf > 1kg_chr6.AFRsubset.acf.gz)? These two files are actually equivalent, right?

DongmeiDai avatar Oct 12 '23 08:10 DongmeiDai

file4.acf.gz is just a placeholder name, it could be anything.

grenaud avatar Oct 12 '23 19:10 grenaud

sorry,

  1. glactools replaceanc file2.acf.gz <(glactools usepopsrootanc -u file1.acf.gz chimp ChimpHumanAncestor ) > file4.acf.gz
  2. glactools vcfm2acf --onlyGT --epo all.epo.gz --fai human_g1k_v37.fasta.fai chr6.AFRsubset.recode.vcf > 1kg_chr6.AFRsubset.acf.gz

What I understand is that if there is no epo file, file4 generated according to step 1 and 1kg_chr6.AFRsubset.acf.gz generated using epo in 2 are both acf files containing outgroup information. steps 1 and 2 can generate the same file, and the steps are equivalent, is it right?

DongmeiDai avatar Oct 13 '23 01:10 DongmeiDai

While I encourage people to use file descriptors, they are not great for debugging because you don't know which one fails.

Could you try to break down these commands into three separate ones, produce the output for each and inspect that using glactools view?

grenaud avatar Oct 15 '23 13:10 grenaud

Yes, I used 3 commands to generate the file successfully.

Since my species is not human, I want to know if the output file of 1 is the equivalent of the output file of 2 ?

Thank you very much

DongmeiDai avatar Oct 15 '23 13:10 DongmeiDai

Could you please paste the commands? I am not sure what is file 1 and file 2 here.

grenaud avatar Oct 15 '23 13:10 grenaud

image I mean these two ways above. Since my object is not human, there is no epo file, so I can only use the method of 1. I am not sure whether the file I generated is the input file of betascan2? the following is my output image

DongmeiDai avatar Oct 15 '23 13:10 DongmeiDai

Again, please no file descriptors, just single commands.

grenaud avatar Oct 15 '23 13:10 grenaud

The following is code of my test file. IN1 is outgroup, ZA1 is ancestor. test.chr24.vcf is my pop1 file.

glactools vcfm2acf --onlyGT --fai $ref_genome outgroup_root_test.chr24.vcf > outgroup_root_test.chr24.acf.gz glactools vcfm2acf --onlyGT --fai $ref_genome test.chr24.vcf > test.chr24.acf.gz glactools usepopsrootanc outgroup_root_test.chr24.acf.gz IN1 ZA1 > outgroup_root_test_identify.chr24.acf.gz glactools replaceanc test.chr24.acf.gz outgroup_root_test_identify.chr24.acf.gz > outgroup_root_test_merge.chr24.acf.gz

DongmeiDai avatar Oct 15 '23 14:10 DongmeiDai

ok thank you! which command fails?

grenaud avatar Oct 15 '23 14:10 grenaud

No error reported. I'm just not sure if the outgroup_root_test_merge.chr24.acf.gz is the betascan2 input file?

DongmeiDai avatar Oct 15 '23 14:10 DongmeiDai

ah no it is not! you need to use glactools acf2betascan

grenaud avatar Oct 15 '23 14:10 grenaud

glactools acf2betascan --useroot outgroup_root_test_merge.chr24.acf.gz | gzip > outgroup_root_test.chr24.beta.txt.gz

Is it right to do ?

DongmeiDai avatar Oct 15 '23 14:10 DongmeiDai

should be!

grenaud avatar Oct 15 '23 14:10 grenaud

Ok! Thank you very much !

DongmeiDai avatar Oct 15 '23 14:10 DongmeiDai