
Function to re-code data with outside good

Open jhelvy opened this issue 1 year ago • 0 comments

For experiments with outside goods ("none" options), the data need to be encoded in a particular way. I frequently see people make mistakes with this, so it's probably worth writing a function that handles this encoding for them. It needs to handle the following two conditions:

  • For continuous variables that don't already contain a 0 (e.g. price), subtract the lowest value from all the values. After the shift, 0 means something (e.g. for price, it is the lowest price), and every nonzero value is the difference from that lowest value. If you don't do this, the 0s in attributes like price essentially say that the alternative had a price of 0, which is not correct.
  • For categorical variables, it is best to dummy-code them manually and include those dummy-coded variables in pars. Then create a dummy-coded "no choice" column and include it in pars as well. This way you get a separate coefficient for the "no choice" option that isn't conflated with the other categorical variables (e.g. brand in the example yogurt data).
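
A rough sketch of the two steps above, written in plain Python only to pin down the recoding logic (an R helper for logitr would follow the same steps). Everything named here is hypothetical and not part of logitr: the function `recode_outside_good`, the `is_none` flag column that marks outside-good rows, the `no_choice` output column, and the toy yogurt-like rows. It also assumes the first level of each categorical variable (in sorted order) is dropped as the reference level, and that outside-good rows get 0 for every attribute.

```python
def recode_outside_good(rows, continuous, categorical, outside_flag="is_none"):
    """Return recoded copies of `rows` (a list of dicts, one per alternative).

    Hypothetical helper: `outside_flag` names a boolean column that is True
    for the outside good ("none") rows.
    """
    real = [r for r in rows if not r[outside_flag]]

    # Shift per continuous variable: the lowest observed value, unless the
    # variable already contains a 0 among the real alternatives.
    shifts = {}
    for var in continuous:
        values = [r[var] for r in real]
        shifts[var] = 0 if 0 in values else min(values)

    # Levels per categorical variable; the first sorted level is treated as
    # the reference level and gets no dummy column (an assumption).
    levels = {var: sorted({r[var] for r in real}) for var in categorical}

    recoded = []
    for r in rows:
        is_none = bool(r[outside_flag])
        # Carry over all other columns (IDs, choice indicator, ...).
        out = {k: v for k, v in r.items()
               if k not in categorical and k != outside_flag}
        for var in continuous:
            out[var] = 0 if is_none else r[var] - shifts[var]
        for var in categorical:
            for level in levels[var][1:]:
                out[f"{var}_{level}"] = 0 if is_none else int(r[var] == level)
        # Separate dummy so the outside good gets its own coefficient.
        out["no_choice"] = int(is_none)
        recoded.append(out)
    return recoded


# Toy choice set: two real alternatives plus one outside good row.
rows = [
    {"obs_id": 1, "brand": "dannon", "price": 8, "is_none": False},
    {"obs_id": 1, "brand": "yoplait", "price": 10, "is_none": False},
    {"obs_id": 1, "brand": None, "price": None, "is_none": True},
]
recoded = recode_outside_good(rows, continuous=["price"], categorical=["brand"])
```

With this layout, the shifted continuous columns, the generated dummy columns, and `no_choice` would be the names passed to pars. The lowest-priced reference-brand row and the outside good row then differ only in `no_choice`, which is what lets the model estimate a separate "no choice" coefficient.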

jhelvy, Oct 17 '23 13:10