adegenet DAPC returning 'k' rather than 'k-1' discriminant functions?


Open cizydorczyk opened this issue 7 years ago • 2 comments

Hello,

When I run the dapc() function on a SNP dataset of ~ 100k SNPs split into 31 clusters, at the stage where I get asked how many discriminant functions to retain, if I keep all then I am keeping 31.

Isn't DAPC (by definition) supposed to return k-1 (31-1) discriminant functions? How is it possible that the function returns k (31)?

I run dapc() like so:

dapc.100 <- dapc(snps, hcpc.pca1.clusters.factor, pca.select="percVar", perc.pca=100)

where hcpc.pca1.clusters.factor is a factor containing my isolate groupings (31 clusters).

When I try running DAPC with one of the sample datasets from the adegenet R package (e.g. dapcIllus$a), however, I get the proper number (5) of discriminant functions (since k=6 in that sample dataset).

Is it at all possible that, under certain conditions (e.g. given a particular dataset), the DAPC function will return k discriminant functions? I thought by definition there would only be k-1 such functions if using k population clusters.

Any help resolving this issue is much appreciated.

Thank you, Conrad Izydorczyk

Dec 15 '17 20:12 cizydorczyk

Hi Conrad,

You are correct that you should be getting 30 axes instead of 31. However, it's difficult to know why you are not without a reproducible example (https://stackoverflow.com/a/5963610/2752888), or at least the versions of R and adegenet you are using.

A few questions that may help get to the root problem:

Does this behavior occur if you subset your data to a smaller number of groups?
Does this behavior occur if you subset your data to a smaller number of loci?
How many samples are in your dataset?
Do any of your groups have one sample each?
Are there more levels in hcpc.pca1.clusters.factor than are actually represented by the data?
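
The last two questions can be checked directly on the grouping factor with base R. The following is a minimal sketch, not taken from the original report: `grp` is a hypothetical stand-in for hcpc.pca1.clusters.factor, built here with one deliberately empty level and one single-sample group.

```r
# Hypothetical grouping factor standing in for hcpc.pca1.clusters.factor.
# Level "d" is declared but has no samples; group "b" has a single sample.
grp <- factor(c("a", "a", "b", "c", "c", "c"),
              levels = c("a", "b", "c", "d"))

# Samples per level; table() also counts levels with zero members.
group_sizes <- table(grp)

# Levels declared in the factor but not represented in the data.
empty_levels <- names(group_sizes)[group_sizes == 0]

# Groups that contain exactly one sample.
singleton_levels <- names(group_sizes)[group_sizes == 1]

# Number of declared levels versus number of groups actually present.
n_levels <- nlevels(grp)
n_observed <- length(unique(as.character(grp)))

print(group_sizes)
print(empty_levels)
print(singleton_levels)
print(c(declared = n_levels, observed = n_observed))
```

If `n_levels` is larger than `n_observed` for the real factor, it declares more groups than the data contain.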

Dec 15 '17 20:12 zkamvar

To comment on this: there should be k-1 axes, without exception.
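
As an illustration of where the k-1 limit comes from, here is a small base R sketch on simulated data (the sizes, the seed, and all variable names are assumptions for illustration, and this is not adegenet's internal code). The group means, weighted by group size, sum to the grand mean, so the between-group scatter matrix built from k groups can have rank at most k-1.

```r
set.seed(1)

k <- 4            # number of groups (assumed for illustration)
n_per_group <- 10 # samples per group
p <- 5            # number of variables, e.g. retained principal components

grp <- factor(rep(seq_len(k), each = n_per_group))
X <- matrix(rnorm(k * n_per_group * p), ncol = p)

group_n <- as.vector(table(grp))

# k x p matrix of group means (rows follow the order of the factor levels).
group_means <- rowsum(X, grp) / group_n

# Deviations of each group mean from the grand mean.
centered <- sweep(group_means, 2, colMeans(X))

# Between-group scatter matrix: sum over groups of n_g * d_g %*% t(d_g).
B <- crossprod(centered * sqrt(group_n))

# Numerical rank of B; it cannot exceed k - 1.
rank_B <- qr(B)$rank
print(rank_B)
```

The discriminant functions are directions along which this between-group variation is non-zero, which is why at most k-1 of them can exist.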

One potential glitch would be that the factor actually contains one more group than expected, e.g. a 'ghost' group without any actual members, left over as a level after subsetting:

> a=factor(c("a", "b", "c"))
> a
[1] a b c
Levels: a b c
> a[1:2]
[1] a b
Levels: a b c
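
If a leftover level does turn out to be present, one way to remove it is base R's droplevels(). This is a minimal sketch continuing the toy factor above; whether an unused level is actually the cause in the reported dataset is not confirmed in this thread.

```r
a <- factor(c("a", "b", "c"))

# Subsetting keeps all three levels, including the now-empty "c".
sub <- a[1:2]
n_before <- nlevels(sub)

# droplevels() discards levels that have no members.
cleaned <- droplevels(sub)
n_after <- nlevels(cleaned)

print(c(before = n_before, after = n_after))
```

Applied to the call in the question, this would mean passing droplevels(hcpc.pca1.clusters.factor) as the grouping argument to dapc().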

Dec 20 '17 17:12 thibautjombart
