sgkit
Is there a difference between variables `dosage` and `call_dosage`?
Are these duplicate variables? `call_dosage` seems to be more explicit with ndim=2 and the associated `call_dosage_mask` but isn't currently being used. Variable `dosage` doesn't specify dimensions and is used for LD pruning.
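To make the difference concrete, here is a minimal sketch of what declaring a dimensionality buys. `VariableSpec`, `nesting_depth`, and `validate` are hypothetical stand-ins written for this illustration, not sgkit's actual classes, and the data are plain nested lists rather than arrays.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VariableSpec:
    """Hypothetical stand-in for a variable declaration (not sgkit's API)."""
    name: str
    ndim: Optional[int] = None  # None means "dimensions not specified"


def nesting_depth(values) -> int:
    """Count list nesting levels, a rough analogue of an array's ndim."""
    depth = 0
    while isinstance(values, list):
        depth += 1
        values = values[0] if values else None
    return depth


def validate(spec: VariableSpec, values) -> bool:
    """Return True if values satisfy the spec; an unspecified ndim accepts anything."""
    return spec.ndim is None or nesting_depth(values) == spec.ndim


# Mirrors the two declarations discussed above; the (variants, samples)
# layout is an assumption made for this example.
call_dosage = VariableSpec("call_dosage", ndim=2)
dosage = VariableSpec("dosage")  # no dimensions declared

flat = [0.0, 1.0, 2.0]             # 1-D input
matrix = [[0.0, 1.0], [2.0, 1.5]]  # 2-D input

print(validate(call_dosage, flat), validate(dosage, flat))
```

With the stricter declaration a 1-D input is rejected, whereas the declaration without dimensions lets it through unchecked.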
@eric-czech you're probably best positioned to answer this one, would you mind taking a look?
Hm, I can't remember or think of a good reason why the `dosage` variable needs to exist separately from `call_dosage`. +1 to switching references of `dosage` to `call_dosage` like in regenie or the bgen reader.
Thanks @eric-czech, I missed the uses in regenie and the bgen reader. I can have a look at merging these variables when I get a chance. A couple of related questions:
- Would it make sense to replace the constants defined here with their equivalent variables from `variables.py`?
- Would it be worth having the dosage argument of `regenie` default to `variables.call_dosage`, or is this best left as an explicit choice?
Would it make sense to replace the constants defined here with their equivalent variables from variables.py?
Yep, I don't see why not. I believe I added that well before `variables.py` existed so it should use that convention instead.
Would it be worth having the dosage argument of regenie default to variables.call_dosage, or is this best left as an explicit choice?
I think I'm very mildly opposed to that since it wouldn't be uncommon to provide an array with imputed dosages or some other transformation that might get ignored if a user forgets to pass the new variable name, but I can see it either way.
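The trade-off can be sketched roughly as follows. Everything here is hypothetical: the dataset is a plain dict, `call_dosage_imputed` is a made-up variable name, and `pick_dosage_default` / `pick_dosage_explicit` are illustrative helpers, not the real `regenie` signature.

```python
def pick_dosage_default(ds: dict, dosage: str = "call_dosage") -> list:
    """Variant A: fall back to a default variable name when none is passed."""
    return ds[dosage]


def pick_dosage_explicit(ds: dict, *, dosage: str) -> list:
    """Variant B: the caller must always name the variable."""
    return ds[dosage]


# Hypothetical dataset holding both raw and imputed dosages.
ds = {
    "call_dosage": [[0.0, 1.0], [2.0, 1.0]],
    "call_dosage_imputed": [[0.1, 0.9], [1.8, 1.2]],
}

# With a default, forgetting the argument silently uses the raw values.
silently_raw = pick_dosage_default(ds)

# Without a default, the same omission fails loudly with a TypeError.
try:
    pick_dosage_explicit(ds)
    omission_detected = False
except TypeError:
    omission_detected = True
```

In variant A the imputed array is ignored without any warning, which is the failure mode described above; variant B trades a little verbosity for an immediate error.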