Question about 'Dp_main = sum(t_list) / sum(n_list)'

Open XueyunF opened this issue 7 years ago • 19 comments

Hi Pontus, I am trying to run the D-statistic. I followed your suggestion to put the population name in the first column. When I feed popstats.py my .tped and .tfam files, it shows

########################################
Traceback (most recent call last):
  File "popstats.py", line 2105, in <module>
    Dp_main = sum(t_list) / sum(n_list)
ZeroDivisionError: integer division or modulo by zero
########################################

The command I used was:

python2 popstats.py -p good.tped -f good.tfam --not23 -b 1000000 --pops pop1,pop2,pop3,pop4 --informative

What kind of error in my data could lead to this?

Thank you!

XueyunF avatar Mar 14 '17 13:03 XueyunF

Hi!

Thanks for using popstats. You need to change how you specify the input files. Could you try

python2 popstats.py --tped good.tped --tfam good.tfam --not23 --pops pop1,pop2,pop3,pop4 --informative

?

pontussk avatar Mar 20 '17 21:03 pontussk

Hi, thanks. I tried modifying my command line, but it still shows the same error.

Below is how my tped file looks:

1 LG1:31384 0 31384 T T T C T T T T T T T T T T T T T T 0 T 0 0 T T T T T T T
1 LG1:31396 0 31396 A T T T G T T T T T T T T T T T T T T T T 0 T C T T T T T

And tfam:

pop indv1 indv1 0 0 0 0
pop indv2 indv2 0 0 0 0

Thanks for help!

XueyunF avatar Mar 21 '17 11:03 XueyunF

Hi,

Your tfam file should have e.g. "pop1" or "pop2" and not just "pop" in the first column. If changing this doesn't work, would it be possible for you to send your whole file to me by email?
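A quick way to check this before running popstats is to compare the labels passed to --pops against the first column of the tfam. The sketch below is only an illustration: `missing_populations` is a hypothetical helper, not part of popstats, and the two tfam rows are the ones quoted earlier in this thread.

```python
from collections import Counter


def missing_populations(tfam_lines, requested):
    """Return the requested labels that never appear in column 1 of the tfam."""
    # Count how many individuals carry each label in the first column.
    counts = Counter(line.split()[0] for line in tfam_lines if line.strip())
    return [pop for pop in requested if counts[pop] == 0]


# Rows as posted above: every individual is labelled just "pop".
tfam = [
    "pop indv1 indv1 0 0 0 0",
    "pop indv2 indv2 0 0 0 0",
]
requested = ["pop1", "pop2", "pop3", "pop4"]

print(missing_populations(tfam, requested))
```

Any label returned here has no individuals in the tfam, so a statistic that needs that population would have nothing to count.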

Pontus

pontussk avatar Mar 22 '17 01:03 pontussk

Hi, thanks. My tfam looks like this:

pop1 indv1 indv1 0 0 0 0
pop1 indv2 indv2 0 0 0 0
pop1 indv3 indv3 0 0 0 0
pop2 indv4 indv4 0 0 0 0
pop2 indv5 indv5 0 0 0 0
pop3 indv6 indv7 0 0 0 0
pop4 indv7 indv7 0 0 0 0

Of course I can send my tfam and tped to you. Can you give me your email address please?

Thanks for helping!

XueyunF avatar Mar 22 '17 08:03 XueyunF

Hi, many thanks! I solved the problem: my outgroup data had been mistakenly removed during data filtering.

XueyunF avatar Mar 28 '17 20:03 XueyunF

Great, thanks for your interest in popstats!

pontussk avatar Mar 30 '17 03:03 pontussk

Hi, I too am trying to run the D-statistic and get a similar error when running the command

python2 popstats.py --tped stacks10_r0.6_7pops.tped --tfam stacks10_r0.6_7pops.tfam --not23 --pops boutoniana,pterocalyx,nodosa,borbonica --informative

ERROR
Traceback (most recent call last):
  File "popstats.py", line 1838, in <module>
    Dp_main = sum(t_list) / sum(n_list)
ZeroDivisionError: integer division or modulo by zero

My .tfam file looks similar, and I was wondering what could be going wrong. @tomatomaolajiao, could my outgroup data have been mistakenly filtered in a similar way to yours?

boutoniana 10_01 0 0 0 0
boutoniana 10_02 0 0 0 0
boutoniana 10_03 0 0 0 0
boutoniana 10_04 0 0 0 0
pterocalyx 84_03 0 0 0 0
pterocalyx 84_04 0 0 0 0
nodosa 87_05 0 0 0 0
nodosa 87_07 0 0 0 0
etc.

My .tped file looks like

0 5_3 0 149 G G G G G G G G G A G A G

alx552 avatar Jul 11 '17 17:07 alx552

Hi,

Are all your chromosome names 0? Try changing them to 1 or greater. Also, depending on your genome size, you might need to change the default block size of 5 Mb in order to run a proper jackknife to estimate standard errors.
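One rough way to judge whether a dataset can supply enough jackknife blocks is to count how many distinct (chromosome, window) pairs its SNPs fall into. The sketch below assumes fixed-size windows by base-pair position, which is a simplification and may not match how popstats actually delimits blocks. `count_blocks` and the three example rows are hypothetical.

```python
def count_blocks(tped_lines, block_size):
    """Count distinct (chromosome, window) pairs covered by the SNPs.

    Assumes the standard tped layout, where column 1 is the chromosome
    and column 4 is the base-pair position.
    """
    blocks = set()
    for line in tped_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed rows
        chrom, position = fields[0], int(fields[3])
        blocks.add((chrom, position // block_size))
    return len(blocks)


# Hypothetical SNPs on one chromosome, spanning well under 5 Mb.
example_tped = [
    "1 snpA 0 149 G G G A",
    "1 snpB 0 400000 G G A A",
    "1 snpC 0 749000 A A A G",
]

print(count_blocks(example_tped, 5_000_000))  # all SNPs share one window
print(count_blocks(example_tped, 1))          # each SNP is its own window
```

A jackknife over a single block cannot give a meaningful standard error, so a count of 1 at the default block size is a sign that the block size needs to be reduced.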

Best, Pontus

pontussk avatar Jul 11 '17 20:07 pontussk

I have changed the chromosome names to 1 (this is a RADseq dataset). I am not sure how to change the default block size; none of the program options jumped out as being able to change that. When I run the same command as above with all chromosomes named 1, I get the following:

genocount != popcount, 634 176

I am not sure what this means, and no output file is produced.

alx552 avatar Jul 11 '17 21:07 alx552

Hi,

This error message suggests that you don't have the same number of individuals in your tped and tfam files. Perhaps try running them through plink to make sure they seem correct.
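Besides plink, a small script can flag the mismatch directly. In the standard PLINK transposed format, each tped row has four leading columns (chromosome, SNP ID, genetic distance, position) followed by two alleles per individual, so a row should have 4 + 2N fields for a tfam with N lines. The helper name `check_tped_against_tfam` and the example rows below are hypothetical.

```python
def check_tped_against_tfam(tped_lines, tfam_lines):
    """Return (row number, alleles found, alleles expected) for each bad tped row."""
    n_individuals = sum(1 for line in tfam_lines if line.strip())
    expected = 2 * n_individuals  # two alleles per individual
    problems = []
    for row_number, line in enumerate(tped_lines, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        n_alleles = len(fields) - 4  # drop the four leading columns
        if n_alleles != expected:
            problems.append((row_number, n_alleles, expected))
    return problems


# Hypothetical example: four individuals in the tfam,
# but only four alleles (two individuals' worth) in the second tped row.
example_tfam = [
    "popA ind1 0 0 0 0",
    "popA ind2 0 0 0 0",
    "popB ind3 0 0 0 0",
    "popB ind4 0 0 0 0",
]
example_tped = [
    "1 snp1 0 100 A A A G G G A A",
    "1 snp2 0 200 A A G G",
]

print(check_tped_against_tfam(example_tped, example_tfam))
```

An empty list means every row has the expected number of alleles; anything else points at the rows to inspect.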

Pontus

pontussk avatar Jul 11 '17 21:07 pontussk

I think I have corrected the error so that the same number of individuals is in both the tped and tfam files. However, when I run the program I still get an error:

Traceback (most recent call last):
  File "popstats.py", line 2166, in <module>
    D_pop =sumt / sumn
ZeroDivisionError: float division by zero

alx552 avatar Jul 12 '17 21:07 alx552

Hi,

Can you please send me both your files or upload them somewhere?

Best, Pontus

pontussk avatar Jul 17 '17 19:07 pontussk

Thanks for sending the files. Also posting my reply here with your population labels redacted.

The problem is that your data only spans 750 kb, but the default block size for the jackknife is 5 Mb (or separate chromosomes). If you truly believe that none of these markers are in LD with each other, you can set the block size to 1 (--block_size 1), so that each SNP is treated as independent. I ran this command; see also the output.

python popstats.py --tped NNN.tped --tfam NNN.tfam --not23 --pops A,B,X,Y --informative --block_size 1

A B X Y 0.271476194992 0.0818491532067 3.31678684942 137 137 50 42 46 38
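The output line above can be read programmatically. The sketch below assumes that columns 1 to 4 are the population labels, column 5 is the D-statistic, column 6 is its jackknife standard error, and column 7 is the Z-score. Only the position of the Z-score is stated in this thread; the other column meanings are an assumption, although the numbers are consistent with Z = D / SE. The remaining columns are left uninterpreted.

```python
# Output line copied from the run above.
output_line = (
    "A B X Y 0.271476194992 0.0818491532067 3.31678684942 "
    "137 137 50 42 46 38"
)

fields = output_line.split()
populations = fields[:4]

# Assumed layout: D, standard error, Z-score in columns 5 to 7.
d_stat, std_err, z_reported = (float(value) for value in fields[4:7])

# Recompute Z from D and SE as a sanity check on the assumed layout.
z_recomputed = d_stat / std_err

print(populations)
print("D =", d_stat, "SE =", std_err, "Z =", z_reported)
print("Z recomputed from D / SE:", z_recomputed)
print("|Z| > 2:", abs(z_reported) > 2)
```

For this run the recomputed value agrees with the reported Z-score, and |Z| exceeds 2.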

pontussk avatar Jul 17 '17 20:07 pontussk

Thanks so much Pontus!

So, if I am interpreting this correctly, the positive D-stat indicates admixture between A and X? Also, given that I know the phylogeny of these species, I rearranged the populations and set pop Y to be the ancestor. I get a D-stat of -0.03. Given that this is close to 0, would I interpret this to mean there is admixture between pop X and Y?

alx552 avatar Jul 19 '17 19:07 alx552

Yes, the positive D-statistic suggests gene flow between A and X and/or between B and Y.

To see whether your D-statistic deviates significantly from 0, look at the Z-score in column 7. Common thresholds for significance are |Z| >2 or |Z|

pontussk avatar Jul 19 '17 20:07 pontussk

Dear Pontus,

I also have exactly the same problem. Could you help me with it? I am sending you the tped and tfam files by e-mail.

Thank you.

emrahkirdok avatar Jul 25 '18 15:07 emrahkirdok

I solved this problem. So please ignore my question. Thank you. emrah

emrahkirdok avatar Jul 26 '18 14:07 emrahkirdok

Your tfam file describes four individuals but as you can see your tped file only has two chromosomes represented.

pontussk avatar Sep 09 '18 15:09 pontussk