pegas

Subset a haplotype object based on names?

Open ChrisK1988 opened this issue 10 months ago • 2 comments

Hello,

Is it possible to subset a haplotype class object based on the names of the haplotypes? I have a region of interest that is 3593 bases long and has 700 haplotypes. When filtering with subset(h, minfreq=2), I am left with 142 haplotypes. I am trying to filter it down to ~50 of the most common haplotypes; however, I have two populations of interest with unique haplotypes (n=3) that I want to preserve, and these would not pass any filtering based solely on size.

For example, I have object h, with haplotypes I, II, III, IV, V, and VI with frequencies of 1, 25, 1, 10, 2, 10. How would I subset this to only take haplotypes I, II, III, and VI, for example?

I have tried using subset as if it was a list, an atomic vector, and a matrix, but every attempt sets the haplotype frequencies equal to 1. That is fine for generating the network, I guess, but plotting any sort of frequency information is impossible at that point.

Thank you kindly,

Chris

ChrisK1988, Apr 16 '24 18:04

Hello,

You can get the numbers of each haplotype with:

Nh <- summary(h)

Then you can define a selection with, for instance:

sel <- Nh == 2 | Nh > 25

Since the "haplotype" object is a matrix, you can subset it with:

h[sel, ]

and for the vector (no comma):

Nh[sel]
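As a rough illustration of the logical selection above, here is a minimal base R sketch. It does not use pegas: h_demo (a plain character matrix) and Nh_demo (a named integer vector) are hypothetical stand-ins for h and summary(h), filled with the frequencies from the question, and the threshold is made up for the example. With real data you would apply the same mask to h and Nh.

```r
# Hypothetical stand-ins: a plain character matrix plays the role of the
# "haplotype" object, a named integer vector plays the role of summary(h).
hap_names <- c("I", "II", "III", "IV", "V", "VI")
h_demo <- matrix(c("a", "c", "g", "t", "a", "c",
                   "t", "t", "a", "g", "c", "a"),
                 nrow = 6, dimnames = list(hap_names, NULL))
Nh_demo <- c(I = 1L, II = 25L, III = 1L, IV = 10L, V = 2L, VI = 10L)

# Logical selection: keep haplotypes seen exactly twice or at least 10 times.
# (The threshold is an assumption chosen for this example.)
sel <- Nh_demo == 2 | Nh_demo >= 10

# Apply the same mask to both objects so sequences and counts stay aligned.
# drop = FALSE keeps a plain matrix a matrix even if only one row is selected.
h_sel <- h_demo[sel, , drop = FALSE]
Nh_sel <- Nh_demo[sel]

print(rownames(h_sel))
print(Nh_sel)
```

Keeping the counts in a separate vector that is subset with the same selection is one way to retain the frequency information alongside the reduced set of haplotypes.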

Both objects have the same names (i.e., all(rownames(h) == names(Nh)) should return TRUE), so you can also subset by name, e.g.:

sel <- c("I", "II", "VI")
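The name-based selection can then be used in the same two places, h[sel, ] and Nh[sel]. The sketch below shows this in base R with hypothetical stand-ins (h_demo, Nh_demo) rather than a real pegas object, using the frequencies from the question. The second half is an assumed extension, not something from the reply: it combines the k most common haplotypes with a hand-picked list of rare ones to preserve, where k and must_keep are illustrative values.

```r
# Hypothetical stand-ins for summary(h) and the "haplotype" matrix.
Nh_demo <- c(I = 1L, II = 25L, III = 1L, IV = 10L, V = 2L, VI = 10L)
h_demo <- matrix(seq_len(12), nrow = 6,
                 dimnames = list(names(Nh_demo), NULL))

# Name-based selection, as asked in the question: keep I, II, III and VI.
sel <- c("I", "II", "III", "VI")
h_named <- h_demo[sel, , drop = FALSE]  # rows picked by row name
Nh_named <- Nh_demo[sel]                # counts picked by element name

# Assumed extension: the k most common haplotypes plus rare ones to preserve.
k <- 3
top_k <- names(sort(Nh_demo, decreasing = TRUE))[seq_len(k)]
must_keep <- c("I", "III")
keep <- union(top_k, must_keep)
# Restore the original haplotype order before subsetting.
keep <- names(Nh_demo)[names(Nh_demo) %in% keep]

h_keep <- h_demo[keep, , drop = FALSE]
Nh_keep <- Nh_demo[keep]

print(Nh_named)
print(Nh_keep)
```

Building the selection as a character vector makes it easy to merge a frequency-based shortlist with specific haplotypes of interest before subsetting both objects.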

Cheers,

Emmanuel

emmanuelparadis, Apr 21 '24 10:04

Thank you kindly, Emmanuel. I will give this a try.

Chris

ChrisK1988, Apr 26 '24 13:04