
iesave: code breaks if too many unique values in a string variable

Open luizaandrade opened this issue 11 months ago • 0 comments

If a string/categorical variable has too many unique values, levelsof fails with the error message "cannot compute". I have just run into this with a variable that had 700k+ unique values.

As a workaround, it now runs if I replace the following lines

https://github.com/worldbank/ietoolkit/blob/fa1146ebd0c6f74cd5e2af1df84b717c9bf838b3/src/ado_files/iesave.ado#L599-L603

with

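* Number of levels
* (unlike levelsof's default, a missing value counts as a level here)
preserve
	keep `var'
	duplicates drop
	* count stores its result in r(N); r(r) is only set by levelsof
	qui count

	local varlevels = r(N)
restore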

* Number of complete observations
qui count if !missing(`var')
local varcomplete = r(N)

There may be a more elegant approach, though. If no one can think of one, I can open a PR with this one.
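
One candidate, as a sketch only and not tested inside iesave: tag a single observation per distinct value with egen's tag() function and count the tags. This avoids both levelsof and preserve/restore. The tempvar name `tag' is introduced here for illustration, and `var' is assumed to be the same loop variable as in the snippet above. tag() skips missing values by default, which should match levelsof's default behavior.

```stata
* Sketch, assuming `var' holds the name of the variable being summarized.
* `tag' is a hypothetical temporary variable, dropped automatically
* when the program ends.
tempvar tag

* tag() marks exactly one observation per distinct non-missing value
qui egen byte `tag' = tag(`var')

* Number of levels = number of tagged observations
qui count if `tag' == 1
local varlevels = r(N)

* Number of complete observations (unchanged from the workaround)
qui count if !missing(`var')
local varcomplete = r(N)
```

Whether this is faster than the preserve/restore version on a dataset with 700k+ unique values would need to be checked before choosing between the two.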

luizaandrade, Mar 19 '24 22:03