ietoolkit
iesave: code breaks if too many unique values in a string variable
If a string/categorical variable has too many unique values, levelsof breaks with the error message "cannot compute". I just ran into this with a variable that had 700k+ unique values.
As a workaround, it runs once I replace the following lines
https://github.com/worldbank/ietoolkit/blob/fa1146ebd0c6f74cd5e2af1df84b717c9bf838b3/src/ado_files/iesave.ado#L599-L603
with
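* Number of levels
preserve
keep `var'
* Drop missing values so the result matches levelsof's default behavior
qui keep if !missing(`var')
qui duplicates drop
* count stores its result in r(N); r(r) is only set by levelsof
qui count
local varlevels = r(N)
restore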
* Number of complete observations
qui count if !missing(`var')
local varcomplete = r(N)
There may be a more elegant approach, though. If no one can think of one, I can open a PR with this one.
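For comparison, one possible alternative that avoids preserve/restore is to tag the first observation of each value and count the tags. This is only a sketch and has not been tested inside iesave: the auto dataset and the names `var' and `first' are stand-ins for illustration, and bysort re-sorts the data, which may or may not be acceptable at that point in the command.

```stata
* Sketch only: demo data and local names are assumptions, not iesave code
sysuse auto, clear
local var make

* Tag the first observation of each non-missing value of `var'
* Note: bysort changes the sort order of the data in memory
tempvar first
bysort `var': gen byte `first' = (_n == 1) & !missing(`var')

* Number of levels (missing values excluded, as with levelsof's default)
quietly count if `first'
local varlevels = r(N)

* Number of complete observations
quietly count if !missing(`var')
local varcomplete = r(N)

display "levels: `varlevels'  complete: `varcomplete'"
```

Unlike levelsof, neither version builds a macro holding every distinct value, so the number of levels should not run into the same limit.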