sgkit

Add mean imputation function

Open tszfungc opened this issue 2 years ago • 9 comments

ref #609

Add a mean imputation function for call_dosage, call_genotype, and call_genotype_probability.

tszfungc avatar Aug 17 '22 04:08 tszfungc

Thanks for looking into this @tszfungc! I think this could be a great approach for imputing call_dosage and call_genotype_probability. However, I don't think it will produce the desired result for call_genotype.

The values in call_genotype are (potentially unsorted) alleles whose order along the ploidy dimension doesn't have any particular meaning. So, as far as I can tell, the mean of those alleles can't really be used for anything.
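
As a rough illustration of the point above (a toy sketch with made-up data, not code from the PR), the two arrays below encode the same unphased genotypes, yet a mean taken over samples separately for each ploidy position differs between them. A per-sample count of the alternate allele does not depend on the order. The array layout (variants, samples, ploidy) is assumed here.

```python
import numpy as np

# Hypothetical toy data: 1 variant, 3 diploid samples,
# laid out as (variants, samples, ploidy).
gt_a = np.array([[[0, 1], [1, 0], [0, 1]]])
# The same unphased genotypes, with the alleles of the second sample swapped.
gt_b = np.array([[[0, 1], [0, 1], [0, 1]]])

# Mean over samples, taken separately for each ploidy position.
# The result depends on the arbitrary allele order within each call.
mean_a = gt_a.mean(axis=1)
mean_b = gt_b.mean(axis=1)
print(mean_a, mean_b)

# Counting the alternate allele per sample ignores that order.
alt_count_a = (gt_a == 1).sum(axis=-1)
alt_count_b = (gt_b == 1).sum(axis=-1)
print(alt_count_a, alt_count_b)
```

The per-position means disagree even though the genotypes are equivalent, while the allele counts are identical.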

timothymillar avatar Aug 18 '22 09:08 timothymillar

Thanks for the review @tomwhite @timothymillar. I agree that the allele order doesn't have a particular meaning. The order along ploidy could be ignored by computing the mean along dim=['samples', 'ploidy'], but that also seems like an unusual use to me.
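
For reference, a minimal sketch of what a mean over both samples and ploidy would compute, written with plain NumPy rather than the PR's code. It assumes missing alleles are encoded as negative values and that the array is laid out as (variants, samples, ploidy). The toy data is made up.

```python
import numpy as np

# Hypothetical toy data: 2 variants, 3 diploid samples.
# Negative values are assumed to mark missing alleles.
gt = np.array([
    [[0, 1], [1, 1], [-1, -1]],
    [[0, 0], [0, 1], [0, -1]],
])

# Mask missing alleles so they do not contribute to the mean.
valid = np.where(gt < 0, np.nan, gt.astype(float))

# One mean per variant, pooled over the samples and ploidy axes.
mean_allele = np.nanmean(valid, axis=(1, 2))
print(mean_allele)
```

For biallelic variants this pooled mean is the alternate allele frequency among the non-missing calls. It is a fractional value, so it cannot be written back into call_genotype as an allele index, which is part of why this use is unusual.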

tszfungc avatar Aug 18 '22 19:08 tszfungc

@jeromekelleher the trade-off between returning new variables or replacing existing variables was previously discussed in https://github.com/pystatgen/sgkit/pull/308#issuecomment-705706571. I personally have a slight preference for replacing existing variables but there are some good points raised in that discussion. The primary concern seems to be that replacing existing variables is effectively a mutate operation, which goes against the general pattern of treating arrays as immutable.
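
To make the trade-off concrete, here is a minimal sketch of the "return a new variable" pattern. It uses a plain dict of NumPy arrays as a stand-in for a dataset, and the function name impute_dosage_mean and the output name call_dosage_imputed are hypothetical, not names from sgkit or the PR. Missing dosages are assumed to be NaN, with shape (variants, samples).

```python
import numpy as np

def impute_dosage_mean(variables, source="call_dosage",
                       target="call_dosage_imputed"):
    """Return a new mapping with a mean-imputed copy of `source`.

    Hypothetical helper: the input mapping and its arrays are left untouched.
    """
    dosage = variables[source]
    # Per-variant mean over samples, ignoring missing (NaN) entries.
    variant_mean = np.nanmean(dosage, axis=1, keepdims=True)
    # Fill only the missing entries with the mean of their variant.
    imputed = np.where(np.isnan(dosage), variant_mean, dosage)
    # Build a new mapping rather than overwriting the source variable.
    return {**variables, target: imputed}

ds = {"call_dosage": np.array([[0.0, 2.0, np.nan],
                               [1.0, np.nan, 1.0]])}
new_ds = impute_dosage_mean(ds)
print(new_ds["call_dosage_imputed"])
```

Passing target="call_dosage" would instead shadow the original variable in the returned mapping, which corresponds to the "replace" option. Either way the input arrays themselves are never modified in place.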

timothymillar avatar Sep 01 '22 21:09 timothymillar

I see, thanks. Hmm, not much choice other than to create a bunch of new variables then.

jeromekelleher avatar Sep 13 '22 08:09 jeromekelleher

This PR has conflicts, @tszfungc please rebase and push updated version 🙏

mergify[bot] avatar Mar 29 '23 13:03 mergify[bot]

This PR has conflicts, @tszfungc please rebase and push updated version 🙏

mergify[bot] avatar Sep 05 '23 12:09 mergify[bot]

This PR has conflicts, @tszfungc please rebase and push updated version 🙏

mergify[bot] avatar Nov 13 '23 14:11 mergify[bot]

This PR has conflicts, @tszfungc please rebase and push updated version 🙏

mergify[bot] avatar Feb 05 '24 16:02 mergify[bot]

This PR has conflicts, @tszfungc please rebase and push updated version 🙏

mergify[bot] avatar Jun 19 '24 09:06 mergify[bot]