Allow parents dimension > 2 for Hamilton-Kerr methods

Open timothymillar opened this issue 3 years ago • 0 comments

Currently, most of the pedigree methods for estimating kinship, inbreeding, etc. require the parent dimension to be of length two. I ran into this limitation when working with pedigree data structured with three parent columns: Mother, Father, and Origin/Source (to indicate clonal propagation). These data could be manually re-coded into two columns, but this is cumbersome to do in xarray (it requires dimension resizing) and loses the clear distinction between parental types.

This limitation could be relaxed for the Hamilton-Kerr methods, where the 'tau' parameter explicitly indicates the contribution from each parent. These methods remain applicable so long as each individual has at most two contributing parents. Rather than generalizing the implementations of these methods to an arbitrary parents dimension (which would likely incur a performance cost), a better option may be to "compress" the parent array (and associated arrays) down to two columns on the fly. This is a simple O(N) operation which would only apply if the parents dimension is not of length 2.
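
A rough sketch of what that compression might look like with plain NumPy is below. It is only an illustration, not existing sgkit code: the function name `compress_parents` is hypothetical, and it assumes that unknown parents are coded as `-1`, that a parent "contributes" when its tau value is greater than zero, and that the tau and lambda arrays have the same shape as the parent array.

```python
import numpy as np


def compress_parents(parent, tau, lam):
    """Hypothetical helper: squeeze (samples, parents) arrays to two columns.

    Assumes -1 marks an unknown parent and that a parent column
    contributes to a sample when its tau value is > 0.
    """
    parent = np.asarray(parent)
    tau = np.asarray(tau)
    lam = np.asarray(lam)
    n_parents = parent.shape[1]
    if n_parents == 2:
        # Nothing to do, so the common case pays no extra cost.
        return parent, tau, lam
    if n_parents < 2:
        raise ValueError("Expected at least two parent columns")

    contributing = tau > 0
    if (contributing.sum(axis=1) > 2).any():
        raise ValueError("A sample has more than two contributing parents")

    # A stable sort on the negated mask moves contributing columns to the
    # front of each row while preserving their original relative order.
    order = np.argsort(~contributing, axis=1, kind="stable")[:, :2]
    keep = np.take_along_axis(contributing, order, axis=1)

    # Reorder each array the same way, then blank out any slot that was
    # filled by a non-contributing column.
    parent2 = np.where(keep, np.take_along_axis(parent, order, axis=1), -1)
    tau2 = np.where(keep, np.take_along_axis(tau, order, axis=1), 0)
    lam2 = np.where(keep, np.take_along_axis(lam, order, axis=1), 0.0)
    return parent2, tau2, lam2
```

Because the number of parent columns is small and fixed, the per-row sort is effectively constant time, so the whole operation stays linear in the number of samples. The tau and lambda values move together with their parent, so the distinction between parental types is only dropped in the temporary two-column view, not in the user's dataset.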

timothymillar • Sep 20 '22 09:09