Impute non-top-coded XTOT in PUF

Open MaxGhenis opened this issue 6 years ago • 0 comments

Would it be useful to un-top-code XTOT in the PUF using the CPS data?

The PUF file currently has 19 records with XTOT > 5, all of them nonfilers imputed from the CPS (data_source 0 in the output below). A similar CPS-based imputation could un-top-code XTOT for the remaining PUF records, unless XTOT needs to stay aligned with other variables that would be hard to impute simultaneously.

Based on https://github.com/PSLmodels/Tax-Calculator/blob/master/taxcalc/calcfunctions.py, XTOT is used in AGI() and ChildDepTaxCredit().

In [1]: pd.read_csv('~/puf.csv', usecols=['XTOT', 'data_source']).groupby(['XTOT', 'data_source']).size()
Out[1]: 
XTOT  data_source
0     1               9583
1     0               4679
      1              81501
2     0               2218
      1              73602
3     0                291
      1              30266
4     0                 98
      1              29917
5     0                 41
      1              16376
6     0                 16
7     0                  3
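
As a rough illustration of the idea, the sketch below rebuilds the table above from its printed counts, confirms the 19 records with XTOT > 5, and then draws un-top-coded values for the PUF records sitting at XTOT = 5. It makes several assumptions that are not part of this issue: that data_source 1 records are top-coded at 5, that the CPS records with XTOT >= 5 are a reasonable donor distribution (they are nonfilers, so this is doubtful), and that XTOT can be drawn independently of other variables. The names `impute_xtot`, `donors`, and `probs` are hypothetical.

```python
import numpy as np
import pandas as pd

# (XTOT, data_source, record count), copied from the groupby output above.
rows = [
    (0, 1, 9583),
    (1, 0, 4679), (1, 1, 81501),
    (2, 0, 2218), (2, 1, 73602),
    (3, 0, 291), (3, 1, 30266),
    (4, 0, 98), (4, 1, 29917),
    (5, 0, 41), (5, 1, 16376),
    (6, 0, 16),
    (7, 0, 3),
]
tab = pd.DataFrame(rows, columns=["XTOT", "data_source", "n"])

# Records above the apparent top-code of 5, and which sources they come from.
above = tab[tab["XTOT"] > 5]
n_above = int(above["n"].sum())
sources_above = set(above["data_source"])

# ASSUMPTION: use CPS records (data_source 0) with XTOT >= 5 as donors.
donors = tab[(tab["data_source"] == 0) & (tab["XTOT"] >= 5)]
probs = donors["n"] / donors["n"].sum()


def impute_xtot(n_records, seed=0):
    """Draw XTOT values from the donor distribution (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    return rng.choice(donors["XTOT"].to_numpy(), size=n_records, p=probs.to_numpy())


# PUF records (data_source 1) currently at XTOT == 5.
n_topcoded = int(tab[(tab["XTOT"] == 5) & (tab["data_source"] == 1)]["n"].iloc[0])
imputed = impute_xtot(n_topcoded)
```

A real implementation would presumably condition the draw on other record characteristics (filing status, number of dependents, and so on) rather than sampling from a single marginal distribution.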

MaxGhenis · Feb 21 '19 22:02