pegas icon indicating copy to clipboard operation
pegas copied to clipboard

Ops.factor(left, right) error

Open SimeoneCA opened this issue 10 months ago • 3 comments

Hi Emmanuel,

I am working with phased loci data in pegas - when I was trying to compute pairwise distances among haplotypes, using dist.haplotype.loci(), I came across the error:

"Error in Ops.factor(left, right) : level sets of factors are different"

The data seem to be imported and haplotype() created an object "haplotype.loci" class. Looking at my imported dataset, I noticed that some positions have a factor of 3 (i.e 0|1, 1|0, 1|1) or 4 (i.e 0|0, 0|1, 1|0, 1|1). I am not sure if this is the reason for the error, but is there any suggestion on how to work with phased data that may have variable factored data at certain positions?

Thank you, Chris

SimeoneCA avatar Apr 23 '24 18:04 SimeoneCA

Hi Chris,

You seem to have identified the problem correctly. You can try to edit the function yourself. First make a copy in your workspace:

mydist <- pegas::dist.haplotype.loci

You can then either print and copy the code in a a file[1] that you can source() later (a good solution so you add this in your script), or edit the code directly with fix(mydist):

### find this line
            d[k] <- sum(x[, i] != x[, j])
### to become:
            d[k] <- sum(as.character(x[, i]) != as.character(x[, j]))

You should be able to use mydist() or if you prefer copy it in your workspace (so you don't need to change your script):

dist.haplotype.loci <- mydist

You can still use the pegas version in the same session with pegas::dist.haplotype.loci().

Tell me how this works with your data.

Cheers, Emmanuel

[1] Or more directly:

cat(deparse(pegas::dist.haplotype.loci), sep = "\n", file = "mydist.R")

emmanuelparadis avatar May 02 '24 03:05 emmanuelparadis

Hi Emmanuel,

Thank you for your response! The changes above do allow for dist.haplotyp.loci (in this case above, mydist()) to proceed. However, all of the distances calculated across all variant pairwise comparisons is = 1. This command calculates the Hamming distance for pairwise comparisons, right? Is it taking account the differences for a single position, or is the distance calculated across the entire haplotype? Having all of my distances = 1 is unclear, though I may be misinterpreting the calculation.

Thank you, Chris

SimeoneCA avatar May 08 '24 15:05 SimeoneCA

Yes, this function computes the (unscaled) Hamming distance. Maybe each haplotype differs by one SNP?

emmanuelparadis avatar May 09 '24 03:05 emmanuelparadis