stata-gtools
Some commands appear to ignore [w=weights]
I noticed that some commands, such as gegen, accept weights but do not appear to use them, for example when generating a count or total of a variable. In the example below the total is 5, but I expected 25 with the weights applied. Perhaps these commands should stop accepting weights and raise an error when the user supplies them.
set obs 5
g units=1
g weight=5
gegen total=total(units) [w=weight]
sum total
//     Variable |        Obs        Mean    Std. Dev.       Min        Max
// -------------+---------------------------------------------------------
//        total |          5           5           0          5          5
@erciomunoz This is not a gtools bug per se; to the extent it is a bug at all, it would be on Stata's side. As documented in the help files, weights in gegen and gcollapse are meant to mimic collapse (see the weights section of the collapse help PDF). For example:
clear
set obs 5
g units=1
g weight=5
collapse (sum) units [w = weight]
disp units
This gives 5 as the answer. The reason is that [w = weight] is treated as aweights here, and aweights are normalized to sum to the number of observations, so each weight of 5 is rescaled to 1. Try fweight, pweight, or iweight to see the difference. Also:
clear
set obs 5
g units=_n
g weight=_n
gegen total =total(units) [w=weight]
gegen totalu=total(units)
sum total*
This gives different answers for total and totalu, showing that weights are not being ignored (they just work differently from how you might intuit). Cheers.
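For anyone who wants to check the arithmetic by hand, here is a minimal sketch in Python of the normalization described above. It is not gtools or Stata code; the function names `aweight_total` and `raw_weight_total` are hypothetical, and the sketch assumes that aweights are rescaled to sum to the number of observations while unnormalized weights simply multiply each value.

```python
def aweight_total(values, weights):
    """Weighted total after rescaling the weights to sum to the number of observations.

    Assumption: this mirrors the aweight normalization described in the thread.
    """
    n = len(values)
    scale = n / sum(weights)  # normalized weight_i = weight_i * n / sum(weights)
    return sum(w * scale * x for x, w in zip(values, weights))


def raw_weight_total(values, weights):
    """Weighted total with the weights used as given (no rescaling)."""
    return sum(w * x for x, w in zip(values, weights))


# First example from the thread: units = 1, weight = 5, five observations.
units_const = [1] * 5
weight_const = [5] * 5
print(aweight_total(units_const, weight_const))     # normalized weights are all 1
print(raw_weight_total(units_const, weight_const))  # the 25 the original post expected

# Second example: units = _n, weight = _n.
units_n = [1, 2, 3, 4, 5]
weight_n = [1, 2, 3, 4, 5]
print(aweight_total(units_n, weight_n))  # (1 + 4 + 9 + 16 + 25) * 5 / 15
print(sum(units_n))                      # unweighted total
```

With constant weights the rescaling cancels the weights entirely, which is why the first example looks as if the weights were ignored. With weights that vary across observations, the normalized total (55/3) differs from the unweighted total (15).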
@erciomunoz I've decided to leave this issue up in case others have a similar question about how gegen and gcollapse weights are meant to work. I've changed the title to one I think is more precise but LMK if you disagree. Cheers.
Thanks for the quick reply.
This makes total sense. Now I regret a little that I did not spend more time thinking about it before posting.