taxdata
Impute non-top-coded XTOT in PUF
Would it be useful to un-top-code XTOT in the PUF using the CPS data?
There are currently 19 records in the PUF with XTOT>5, all of them nonfilers imputed from the CPS (data_source 0; see below). A similar imputation could un-top-code XTOT for the other records, unless XTOT needs to stay aligned with other variables that would be hard to impute simultaneously.
Based on https://github.com/PSLmodels/Tax-Calculator/blob/master/taxcalc/calcfunctions.py, XTOT is used in AGI() and ChildDepTaxCredit().
In [1]: pd.read_csv('~/puf.csv', usecols=['XTOT', 'data_source']).groupby(['XTOT', 'data_source']).size()
Out[1]:
XTOT  data_source
0     1               9583
1     0               4679
      1              81501
2     0               2218
      1              73602
3     0                291
      1              30266
4     0                 98
      1              29917
5     0                 41
      1              16376
6     0                 16
7     0                  3
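To make the idea concrete, here is a minimal sketch of one way the un-top-coding could work: redraw XTOT for each top-coded PUF record (data_source 1, XTOT == 5) from the empirical distribution of XTOT among CPS records (data_source 0) with XTOT >= 5. The helper name `untopcode_xtot`, the unweighted draw, and the toy frame are all assumptions for illustration. The toy CPS counts (41, 16, 3) come from the table above, but nonfilers may not resemble filers, so a real version would probably need weights and conditioning on other variables.

```python
import numpy as np
import pandas as pd


def untopcode_xtot(df, top_code=5, seed=0):
    """Hypothetical helper: redraw top-coded XTOT from the CPS donor distribution."""
    rng = np.random.default_rng(seed)
    out = df.copy()

    # Donors: CPS records (data_source == 0) at or above the top code.
    donors = out.loc[(out["data_source"] == 0) & (out["XTOT"] >= top_code), "XTOT"]
    probs = donors.value_counts(normalize=True).sort_index()

    # Recipients: PUF records (data_source == 1) sitting exactly at the top code.
    mask = (out["data_source"] == 1) & (out["XTOT"] == top_code)

    # Unweighted draw from the donor distribution (an assumption of this sketch).
    out.loc[mask, "XTOT"] = rng.choice(
        probs.index.to_numpy(), size=int(mask.sum()), p=probs.to_numpy()
    )
    return out


# Toy data only: CPS counts for XTOT 5/6/7 are taken from the table above;
# the PUF rows are made up and much smaller than the real file.
toy = pd.DataFrame({
    "XTOT": [5] * 41 + [6] * 16 + [7] * 3 + [5] * 1000 + [2] * 10,
    "data_source": [0] * 60 + [1] * 1010,
})
imputed = untopcode_xtot(toy)
print(imputed.groupby(["XTOT", "data_source"]).size())
```

Because the draw ignores every other variable, this would not keep XTOT aligned with related fields, which is the concern raised above.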