Dsuite showing error with tree file

Open BiodeB opened this issue 3 years ago • 16 comments

Dear Experts,

I'm trying to run Dsuite for the first time and I get the error below. Without the -t option and the input tree, the program runs fine.

Dsuite Dtrios GB_snpS.vcf SETS.txt -t astal_species_TRoot.nwk -o treeGB

There are 112 sets (excluding the Outgroup)
Going to calculate D and f4-ratio values for 227920 trios
Out of Range error: map::at
species[i]: Species1
It seems that this species is in the SETS.txt file but can't be found in the tree. Please check the spelling and completeness of your tree file.

I checked the tree for spelling mistakes but didn't find any. I also checked the tree for completeness, and the check reported that the binary tree is not complete. I would be very grateful if someone could point out what I am doing wrong and suggest how to overcome this issue.

Thanks, Debajyoti

BiodeB avatar Aug 25 '21 05:08 BiodeB

I also ran into this error. I checked my files as you did, but I have no idea how to solve it yet. Did you manage to solve it? I'd appreciate any information you could give me.

XiaXiaTianTian avatar Oct 30 '21 05:10 XiaXiaTianTian

Hi, I recently had this issue too and realised that the population/species names in my tree did not exactly match those I had specified in SETS.txt. Hope that helps.

RishiDeKayne avatar Nov 05 '21 08:11 RishiDeKayne

Thanks a lot. I will check my tree file again. Maybe something is wrong with it.

XiaXiaTianTian avatar Nov 05 '21 09:11 XiaXiaTianTian

Hi everyone, I am also running into the same problem. I double-checked names and files... Did you manage to sort it out?

SilvaFE avatar May 19 '22 19:05 SilvaFE

In my case I was able to sort it by making sure that the tree I provided was at the population level that I specified in the SETS.txt file. For example if I grouped three individuals indiv1, indiv2, and indiv3 into Species1 then my tree file had Species1 listed not indivs1-3 (this is also true of the outgroup which is called 'Outgroup' in my tree). I did this just by pruning the tree and re-naming nodes using the ape package in R.

RishiDeKayne avatar May 19 '22 20:05 RishiDeKayne

Thanks a lot @RishiDeKayne ! It works for me! best regards

SilvaFE avatar May 21 '22 13:05 SilvaFE

How did you do it, @RishiDeKayne? If possible, please help me solve my problem. When I changed the individual names to their corresponding species names, I got this error: ERROR: Duplicate value in the tree "Species1"

Thanks

RezaFahi avatar Sep 05 '22 10:09 RezaFahi

It sounds like you might have multiple individuals that are now all called "Species1" in your tree. This won't work. Instead, collapse the monophyletic node in your tree so that it corresponds to the population you specified in SETS.txt. E.g. if individuals 1, 2, and 3 are assigned to "Species1" in your SETS.txt, and they are each other's closest relatives (i.e. monophyletic) in your individual-level phylogenetic tree, then use the ape package in R to collapse this node so that, instead of individuals 1, 2, and 3, the tree has a single tip called "Species1".

RishiDeKayne avatar Sep 05 '22 10:09 RishiDeKayne
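
For anyone who would rather script the collapsing step, here is a minimal Python sketch of the idea described above. It is not the ape workflow used in this thread and it is not part of Dsuite; the function names (`parse_newick`, `collapse`, `collapse_to_sets`) and the sample names are made up for illustration. It assumes unquoted tip labels without whitespace, and it discards branch lengths and support values, so the result is topology only. If a set is not monophyletic, its name would appear more than once after renaming, so the sketch raises an error rather than writing a tree with duplicate tips.

```python
def parse_newick(newick):
    """Parse Newick text into nested lists; tips are plain strings."""
    s = "".join(newick.split()).rstrip(";")  # assumes labels contain no whitespace
    pos = 0

    def skip_annotation():
        # Skip branch lengths, support values, and internal node labels.
        nonlocal pos
        while pos < len(s) and s[pos] not in ",)":
            pos += 1

    def node():
        nonlocal pos
        if s[pos] == "(":
            pos += 1  # consume "("
            children = [node()]
            while s[pos] == ",":
                pos += 1
                children.append(node())
            pos += 1  # consume ")"
            skip_annotation()
            return children
        start = pos
        while pos < len(s) and s[pos] not in ",():":
            pos += 1
        name = s[start:pos]
        skip_annotation()
        return name

    return node()


def collapse(tree, set_of):
    """Rename tips to their set name and merge clades that hold a single set."""
    if isinstance(tree, str):
        return set_of[tree]  # a KeyError means this tip has no entry in SETS.txt
    children = [collapse(child, set_of) for child in tree]
    first = children[0]
    if isinstance(first, str) and all(c == first for c in children):
        return first  # every child is the same set: replace the clade by one tip
    return children


def tips(tree):
    return [tree] if isinstance(tree, str) else [t for c in tree for t in tips(c)]


def to_newick(tree):
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(to_newick(c) for c in tree) + ")"


def collapse_to_sets(newick, set_of):
    collapsed = collapse(parse_newick(newick), set_of)
    names = tips(collapsed)
    repeated = sorted({n for n in names if names.count(n) > 1})
    if repeated:
        raise ValueError(f"sets are not monophyletic in this tree: {repeated}")
    return to_newick(collapsed) + ";"


# Hypothetical mapping: sample ID (left column) -> set name (right column).
example_sets = {"indiv1": "Species1", "indiv2": "Species1", "indiv3": "Species1",
                "indiv5": "Species2", "indiv4": "Outgroup"}
print(collapse_to_sets("(((indiv1,indiv2),indiv3),indiv5,indiv4);", example_sets))
```

The mapping passed as `set_of` is the two columns of SETS.txt read into a dictionary. It is still worth viewing the output tree (for example in iTOL, as mentioned in this thread) to confirm the topology is what you expect.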

I did exactly the same as @RishiDeKayne. However, I didn't use the ape package. Instead, I wrote the tree out in a Newick file and opened it in iTOL to check that the topology was consistent with what we know for the study group. This was not a problem in my case because I had only a few species/populations, but it could be if you have many species/populations in your tree. Best wishes, Felipe

SilvaFE avatar Sep 05 '22 16:09 SilvaFE

Yes, and just to clarify: I think that if you did not group samples into populations, but just have an outgroup and individuals, you need to make sure every name in the tree is exactly the same as a name in the second column of your SETS.txt file. So if your tree has a topology something like (((indiv1, indiv2), indiv3), Outgroup), then make sure that your SETS.txt file looks like:

indiv1    indiv1
indiv2    indiv2
indiv3    indiv3
indiv4    Outgroup

Don't forget that, as per the manual (https://github.com/millanek/Dsuite): "For Dtrios, at least one individual needs to be specified to be the outgroup by using the Outgroup keyword as shown above."

RishiDeKayne avatar Sep 05 '22 17:09 RishiDeKayne
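
As an illustration of the individual-level layout described above, here is a small Python sketch that builds the SETS.txt text from a list of sample IDs. The function name `individual_level_sets` is hypothetical, and the tab separator between the two columns is an assumption based on my reading of the Dsuite README, so check it against the manual. Every sample maps to itself, except the outgroup samples, which map to the `Outgroup` keyword; the outgroup tip in the tree then has to be called `Outgroup` as well.

```python
def individual_level_sets(samples, outgroup_samples):
    """Return SETS.txt content with one set per individual.

    samples: sample IDs exactly as they appear in the VCF header.
    outgroup_samples: the subset of samples that forms the outgroup.
    """
    lines = []
    for sample in samples:
        # Outgroup samples get the Outgroup keyword; all others keep their ID.
        set_name = "Outgroup" if sample in outgroup_samples else sample
        lines.append(f"{sample}\t{set_name}")
    return "\n".join(lines) + "\n"


print(individual_level_sets(["indiv1", "indiv2", "indiv3", "indiv4"], {"indiv4"}), end="")
```

With these four sample IDs, the tree tips would need to be indiv1, indiv2, indiv3, and Outgroup, as in the topology given above.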

@RishiDeKayne Could you please let me know what functions you used in ape to prune your tree? I'm trying to do the same, but I'm unfamiliar with the package.

Thanks

declanjhoulihan avatar Sep 20 '22 21:09 declanjhoulihan

Yes, I think you're looking for the following: https://www.rdocumentation.org/packages/ape/versions/5.6-2/topics/drop.tip. There should be lots of tutorials for using ape in R online, so hopefully this will help once you have read in your tree.

RishiDeKayne avatar Sep 20 '22 21:09 RishiDeKayne

I'm having an issue with this same error as well. In my case, my dataset is very simple (1 individual per taxon) and I can't see a mistake anywhere. I've tried recreating the sets.txt and tree files several times and switching from the actual species name to "Species1, Species2, etc...", and I still get this error. The contents of my sets.txt and newick file are below. Like the OP, I'm able to run the analysis just fine without the -t parameter, but I really want to get this to work with the tree file. Any advice is much appreciated.

sets.txt:
PanObs     Outgroup
H16057     Species1
H21189     Species2
TJH3395    Species3

newick file: (((Species3,Species2),Species1),Outgroup);

Thanks!

thomlmarshall avatar Oct 26 '22 21:10 thomlmarshall

I am having this same issue. If I collapse nodes or change labels in the SETS file as per the suggestions above, I then get problems with samples from the VCF file not being represented.

mirandasherlock avatar Jul 24 '23 15:07 mirandasherlock

@mirandasherlock just to clarify what worked for me: in the sets.txt file, the left column must contain sample IDs that exactly match those in your VCF, and the right column must contain population names that exactly match the tip names in your tree file. I.e. if you are going to group multiple individuals into populations, then the tree must also be presented at the population level, with tip labels that exactly match the right column of the sets.txt file rather than each individual's name. The tree file should only include tip names present in the right column. I had to double-check this a few times to make sure there were no typos, and no individual names that had been left in the tree file by mistake when I carried out the pruning. E.g. if your outgroup sample has 'Outgroup' in the right column of the sets.txt file (as it should), it must also be called 'Outgroup' and be represented by a single tip in the tree file. Hope this helps!

RishiDeKayne avatar Aug 01 '23 16:08 RishiDeKayne
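
The three-way match described above (VCF sample IDs, sets.txt columns, tree tips) can be checked mechanically before running Dtrios. The following is a rough Python sketch, not a Dsuite feature; `tree_tips`, `vcf_samples_from_header`, and `check_inputs` are hypothetical helper names. It assumes a tab between the two sets.txt columns and unquoted Newick labels, and it compares names exactly as written, so trailing spaces or case differences show up as mismatches.

```python
import re


def tree_tips(newick):
    # A tip label is a name that directly follows "(" or ","; anything after
    # ")" is an internal node label or support value and is skipped.
    return re.findall(r"[(,]\s*([^\s():,;]+)", newick)


def vcf_samples_from_header(chrom_line):
    # In the "#CHROM" header line, sample IDs start after the 9 fixed columns.
    return chrom_line.rstrip("\r\n").split("\t")[9:]


def check_inputs(sets_text, newick, vcf_samples):
    """Report mismatches between sets.txt, the tree, and the VCF sample IDs."""
    samples, set_names, malformed = [], [], []
    for line in sets_text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        if len(fields) != 2:
            malformed.append(line)  # e.g. spaces used instead of a tab
            continue
        samples.append(fields[0])
        set_names.append(fields[1])
    tip_names = tree_tips(newick)
    return {
        "malformed_lines": malformed,
        "samples_not_in_vcf": sorted(set(samples) - set(vcf_samples)),
        "sets_not_in_tree": sorted(set(set_names) - set(tip_names)),
        "tips_not_in_sets": sorted(set(tip_names) - set(set_names)),
        "duplicate_tips": sorted({t for t in tip_names if tip_names.count(t) > 1}),
    }


# Hypothetical inputs; in practice read these from your own files.
report = check_inputs(
    "indiv1\tSpecies1\nindiv2\tSpecies1\nindiv4\tOutgroup\n",
    "((Species1,Species2),Outgroup);",
    ["indiv1", "indiv2", "indiv3"],
)
for problem, names in report.items():
    print(problem, names)
```

An empty list for every key means the names line up. A non-empty `sets_not_in_tree` corresponds to the situation in the original error message, where a set in SETS.txt cannot be found in the tree.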

Hi Rishi,

Thanks so much for replying - I forgot to update it but I got it working using the same method as you.

Thanks again,

Miranda

mirandasherlock avatar Aug 01 '23 16:08 mirandasherlock