popstats
Question about 'Dp_main = sum(t_list) / sum(n_list)'
Hi Pontus, I am trying to run the D-statistic. I followed your suggestion to put the population name in the first column. When I feed popstats.py my .tped and .tfam files, it shows
########################################
Traceback (most recent call last):
File "popstats.py", line 2105, in
The command I used was
python2 popstats.py -p good.tped -f good.tfam --not23 -b 1000000 --pops pop1,pop2,pop3,pop4 --informative
What kind of error in my data could lead to this?
Thank you!
Hi!
Thanks for using popstats. You need to change how you specify the input files. Could you try
python2 popstats.py --tped good.tped --tfam good.tfam --not23 --pops pop1,pop2,pop3,pop4 --informative
?
Hi, thanks. I modified my command line, but it still shows the same error.
Below is how my tped file looks:
1 LG1:31384 0 31384 T T T C T T T T T T T T T T T T T T 0 T 0 0 T T T T T T T
1 LG1:31396 0 31396 A T T T G T T T T T T T T T T T T T T T T 0 T C T T T T T
And the tfam:
pop indv1 indv1 0 0 0 0
pop indv2 indv2 0 0 0 0
Thanks for help!
Hi,
Your tfam file should have e.g. "pop1" or "pop2" and not just "pop" in the first column. If changing this doesn't work, would it be possible for you to send your whole file to me by email?
Pontus
Hi, thanks. My tfam looks like this:
pop1 indv1 indv1 0 0 0 0
pop1 indv2 indv2 0 0 0 0
pop1 indv3 indv3 0 0 0 0
pop2 indv4 indv4 0 0 0 0
pop2 indv5 indv5 0 0 0 0
pop3 indv6 indv7 0 0 0 0
pop4 indv7 indv7 0 0 0 0
Of course I can send my tfam and tped to you. Can you give me your email address please?
Thanks for helping!
Hi, many thanks! I solved the problem: my outgroup data had been mistakenly removed during data filtering.
Great, thanks for your interest in popstats!
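For anyone hitting the same error, the two causes found above (a population label missing from the first tfam column, or a population whose genotypes were all filtered out) can be checked before running popstats. This is only a sketch in Python 3, separate from popstats itself. The helper names are hypothetical, and it assumes the PLINK convention that a missing allele is coded as 0.

```python
from collections import Counter

def count_individuals(tfam_lines):
    """Count individuals per population label (first column of a .tfam file)."""
    counts = Counter()
    for line in tfam_lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts

def missing_populations(tfam_lines, pops):
    """Return the requested labels that have no individuals in the .tfam."""
    counts = count_individuals(tfam_lines)
    return [p for p in pops if counts[p] == 0]

def called_alleles(tped_lines, tfam_lines, missing="0"):
    """Count non-missing alleles per population across all .tped rows.

    Assumes two allele columns per individual after the four leading
    .tped columns, in the same order as the .tfam rows.
    """
    pops = [line.split()[0] for line in tfam_lines if line.split()]
    called = Counter({p: 0 for p in pops})
    for line in tped_lines:
        alleles = line.split()[4:]
        for i, pop in enumerate(pops):
            pair = alleles[2 * i:2 * i + 2]
            called[pop] += sum(a != missing for a in pair)
    return called

# Made-up example: "pop4" is absent from the tfam, and "out" has no called alleles.
tfam = [
    "pop1 indv1 indv1 0 0 0 0",
    "pop2 indv2 indv2 0 0 0 0",
    "out indv3 indv3 0 0 0 0",
]
tped = [
    "1 snp1 0 100 A A A G 0 0",
    "1 snp2 0 200 C C C T 0 0",
]

print(missing_populations(tfam, ["pop1", "pop2", "pop4"]))  # ['pop4']
print(called_alleles(tped, tfam)["out"])                     # 0
```

A population that is listed in --pops but comes back as missing, or with zero called alleles, leaves no usable sites for the statistic, which is consistent with the division by zero reported above.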
Hi, I too am trying to run the D-statistic and get a similar error when running the command
python2 popstats.py --tped stacks10_r0.6_7pops.tped --tfam stacks10_r0.6_7pops.tfam --not23 --pops boutoniana,pterocalyx,nodosa,borbonica --informative
ERROR
Traceback (most recent call last):
File "popstats.py", line 1838, in
Dp_main = sum(t_list) / sum(n_list)
ZeroDivisionError: integer division or modulo by zero
My .tfam file looks similar. I was wondering what could be going wrong? @tomatomaolajiao, could my outgroup data have been mistakenly filtered in a similar way to yours?
boutoniana 10_01 0 0 0 0
boutoniana 10_02 0 0 0 0
boutoniana 10_03 0 0 0 0
boutoniana 10_04 0 0 0 0
pterocalyx 84_03 0 0 0 0
pterocalyx 84_04 0 0 0 0
nodosa 87_05 0 0 0 0
nodosa 87_07 0 0 0 0
etc.
My .tped file looks like
0 5_3 0 149 G G G G G G G G G A G A G
Hi,
Are all your chromosome names 0? Try changing them to 1 or >1. Also, depending on your genome size, you might need to change the default block size of 5 Mb in order to run a proper jackknife to estimate standard errors.
Best, Pontus
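The suggestion to rename chromosome 0 can be scripted rather than done by hand. Below is a minimal Python 3 sketch, not part of popstats. The function name is hypothetical, and it simply rewrites the first column of each tped row.

```python
def rename_chromosome(tped_lines, old="0", new="1"):
    """Replace the chromosome code in the first .tped column when it equals `old`."""
    fixed = []
    for line in tped_lines:
        fields = line.split()
        if fields and fields[0] == old:
            fields[0] = new
        fixed.append(" ".join(fields))
    return fixed

tped = ["0 5_3 0 149 G G G G G G G G G A G A G"]
print(rename_chromosome(tped)[0])
# 1 5_3 0 149 G G G G G G G G G A G A G
```

Note that this places every locus on the same chromosome, so the base-pair positions in column 4 then determine how markers fall into jackknife blocks.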
I have changed the chromosome names to 1 (this is a RADseq dataset). I am not sure how to change the default block size; none of the program options jumped out as being able to change that. When I run the same command as above with all chromosomes named 1, I get the following:
genocount != popcount, 634 176
I am not sure what this means, and no output file is produced.
Hi,
This error message suggests that you don't have the same number of individuals in your tped and tfam files. Perhaps try running them through plink to make sure they seem correct.
Pontus
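A quick way to test for the mismatch described above, besides running the files through plink, is to compare the number of allele columns in each tped row with twice the number of tfam rows. This is a rough Python 3 sketch with a hypothetical function name. It assumes the standard tped layout of four leading columns followed by two alleles per individual, and it does not try to reproduce how popstats itself computes the two numbers in its message.

```python
def check_counts(tped_lines, tfam_lines):
    """Return (row number, alleles found, alleles expected) for inconsistent .tped rows."""
    n_individuals = sum(1 for line in tfam_lines if line.strip())
    expected = 2 * n_individuals  # two alleles per diploid individual
    problems = []
    for number, line in enumerate(tped_lines, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        n_alleles = len(fields) - 4  # drop chromosome, SNP id, cM, position
        if n_alleles != expected:
            problems.append((number, n_alleles, expected))
    return problems

# Made-up example: four individuals in the tfam, but only two genotyped in the tped.
tfam = [
    "popA ind1 0 0 0 0",
    "popA ind2 0 0 0 0",
    "popB ind3 0 0 0 0",
    "popB ind4 0 0 0 0",
]
tped = ["1 snp1 0 100 A A A G"]

print(check_counts(tped, tfam))  # [(1, 4, 8)]
```

An empty list means every row has the expected number of alleles.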
I think I have corrected the error so that the same number of individuals are in both the tped and tfam files. However, when I run the program with the corrected files I still get an error:
Traceback (most recent call last):
File "popstats.py", line 2166, in
D_pop =sumt / sumn
ZeroDivisionError: float division by zero
Hi,
Can you please send me both your files or upload them somewhere?
Best, Pontus
Thanks for sending the files. Also posting my reply here with your population labels redacted.
The problem is that your data only span 750 kb, but the default block size for the jackknife is 5 Mb blocks (or separate chromosomes). If you truly believe that none of these markers are in LD with each other, you can set the block size to 1 (--block_size 1), so that each SNP is treated as independent. I ran this command; see also the output.
python popstats.py --tped NNN.tped --tfam NNN.tfam --not23 --pops A,B,X,Y --informative --block_size 1
A B X Y 0.271476194992 0.0818491532067 3.31678684942 137 137 50 42 46 38
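To illustrate why a single block breaks the jackknife, here is a generic delete-one-block sketch in Python 3. It is not the popstats implementation: the function names, the per-site (t, n) values, and the assignment of sites to blocks by integer division of the position are all assumptions made for the example. With only one block, leaving that block out removes every site, so the denominator becomes zero.

```python
from collections import defaultdict

def block_sums(sites, block_size):
    """Sum per-site numerators (t) and denominators (n) within genomic blocks.

    sites: iterable of (chromosome, position, t, n) tuples.
    """
    blocks = defaultdict(lambda: [0.0, 0.0])
    for chrom, pos, t, n in sites:
        key = (chrom, pos // block_size)  # hypothetical block assignment
        blocks[key][0] += t
        blocks[key][1] += n
    return list(blocks.values())

def jackknife_estimates(blocks):
    """Delete-one-block estimates of D = sum(t) / sum(n)."""
    total_t = sum(t for t, n in blocks)
    total_n = sum(n for t, n in blocks)
    return [(total_t - t) / (total_n - n) for t, n in blocks]

# Made-up sites spanning 750 kb on one chromosome.
sites = [
    ("1", 149, 1.0, 1.0),
    ("1", 300_000, -1.0, 1.0),
    ("1", 750_000, 1.0, 1.0),
]

print(len(block_sums(sites, 5_000_000)))  # 1: everything falls into one 5 Mb block
print(len(block_sums(sites, 1)))          # 3: each SNP is its own block

try:
    jackknife_estimates(block_sums(sites, 5_000_000))
except ZeroDivisionError as err:
    print("single block:", err)

print(jackknife_estimates(block_sums(sites, 1)))  # [0.0, 1.0, 0.0]
```

Treating each SNP as its own block only gives meaningful standard errors if the markers really are independent, as noted above.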
Thanks so much Pontus!
So, if I am interpreting this correctly, the positive D-stat indicates admixture between A and X? Also, given that I know the phylogeny of these species, I rearranged the populations and set pop Y to be the ancestor. I get a D-stat of -0.03. Given that this is close to 0, would I interpret this to mean there is admixture between pop X and Y?
Yes, the positive D-statistic suggests gene flow between A and X and/or between B and Y.
To see whether your D-statistic deviates significantly from 0, look at the Z-score in column 7. Common thresholds for significance are |Z| >2 or |Z|
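As a worked check on the output posted earlier in this thread, the Z-score in column 7 equals the D-statistic in column 5 divided by the standard error in column 6. The Python 3 snippet below is only a sketch with a hypothetical helper name; the threshold of 2 is the one mentioned above.

```python
def z_score(d, se):
    """Z-score of a D-statistic given its jackknife standard error."""
    return d / se

# Values taken from the A,B,X,Y output line earlier in the thread.
d = 0.271476194992
se = 0.0818491532067

z = z_score(d, se)
print(round(z, 3))  # about 3.317, matching the reported column 7
print(abs(z) > 2)   # True: exceeds the |Z| > 2 threshold
```

By the same logic, a D-statistic of -0.03 can only be judged against its own standard error, not by how close it looks to 0.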
Dear Pontus,
I also have exactly the same problem. Could you help me with it? I am sending you my tped and tfam files by e-mail.
Thank you.
I solved this problem. So please ignore my question. Thank you. emrah
Your tfam file describes four individuals but as you can see your tped file only has two chromosomes represented.