pegas
Subset a haplotype object based on names?
Hello,
Is it possible to subset a haplotype class object based on the names of the haplotypes? I have a region of interest that is 3593 bases long and has 700 haplotypes. When filtering with subset(h, minfreq=2) I am left with 142 haplotypes. I am trying to filter it down to ~50 of the most common haplotypes. However, I have two populations of interest with unique haplotypes (n=3) that I want to preserve, and these would not pass any filter based solely on frequency.
For example, I have object h, with haplotypes I, II, III, IV, V, and VI with frequencies of 1, 25, 1, 10, 2, 10. How would I subset this to keep only haplotypes I, II, III, and VI, for example?
I have tried subsetting the object as if it were a list, an atomic vector, and a matrix, but each time the haplotype frequencies end up set to 1. That is fine for generating the network, I guess, but it makes plotting any sort of frequency information impossible.
Thank you kindly,
Chris
Hello,
You can get the frequency of each haplotype with:
Nh <- summary(h)
Then you can define a selection with, for instance:
sel <- Nh == 2 | Nh > 25
Since the "haplotype" object is a matrix, you can subset it with:
h[sel, ]
and for the vector (no comma):
Nh[sel]
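As a minimal sketch of the logical selection above, using only base R: the named vector Nh and the matrix h below are made-up stand-ins for summary(h) and the haplotype matrix, filled with the frequencies from Chris's example (1, 25, 1, 10, 2, 10). The cutoffs are only for illustration.

```r
# Stand-in for summary(h): frequencies named by haplotype label
# (values from the example in the question, not real pegas output).
Nh <- c(I = 1, II = 25, III = 1, IV = 10, V = 2, VI = 10)

# Stand-in for the haplotype matrix: one row per haplotype, 4 made-up sites.
h <- matrix(c("a", "c", "g", "t"), nrow = length(Nh), ncol = 4, byrow = TRUE,
            dimnames = list(names(Nh), NULL))

# Keep singletons plus anything seen at least 10 times (illustrative cutoffs).
sel <- Nh == 1 | Nh >= 10

h_sub  <- h[sel, , drop = FALSE]  # rows of the matrix (note the comma)
Nh_sub <- Nh[sel]                 # matching frequencies (no comma)
```

Because the same logical vector indexes both objects, the rows kept in h_sub and the frequencies kept in Nh_sub stay aligned.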
Both objects have the same names (i.e., all(rownames(h) == names(Nh)) should return TRUE), so you can also subset with, e.g.:
sel <- c("I", "II", "VI")
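For the original goal (the most common haplotypes plus a few rare ones kept by name), a selection by name might be built as below. This is only a base-R sketch on made-up stand-ins for h and Nh; the names must_keep and n_top are hypothetical, and n_top = 3 stands in for the ~50 mentioned in the question.

```r
# Stand-ins for summary(h) and the haplotype matrix (made-up data).
Nh <- c(I = 1, II = 25, III = 1, IV = 10, V = 2, VI = 10)
h <- matrix(c("a", "c", "g", "t"), nrow = length(Nh), ncol = 4, byrow = TRUE,
            dimnames = list(names(Nh), NULL))

# Both objects must carry the same names for name-based subsetting to line up.
stopifnot(all(rownames(h) == names(Nh)))

n_top     <- 3              # hypothetical: number of most common haplotypes to keep
must_keep <- c("I", "III")  # hypothetical: rare haplotypes to preserve by name

# Names of the n_top most frequent haplotypes, plus the ones to preserve.
top  <- names(head(sort(Nh, decreasing = TRUE), n_top))
keep <- union(top, must_keep)

# Put the selection back in the original row order.
sel <- names(Nh)[names(Nh) %in% keep]

h_sub  <- h[sel, , drop = FALSE]
Nh_sub <- Nh[sel]
```

Nh_sub then holds the frequencies for exactly the rows retained in h_sub, so the frequency information is still available after subsetting.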
Cheers,
Emmanuel
Thank you kindly, Emmanuel. I will give this a try.
Chris