Remove direct reliance on PUF and CPS files: Stage 1 (PUF)

Open chusloj opened this issue 4 years ago • 6 comments

This PR is the first in a pair of PRs that will remove Tax-Calculator's direct reliance on puf.csv and cps.csv files. This PR focuses on all content regarding the PUF, and the next will focus on the CPS file.

Each PR will be updated in stages:

  1. Functions (and docstrings) and the CLI
  2. Testing
  3. Documentation

chusloj avatar Jan 29 '21 19:01 chusloj

Just a small note: the only change this makes to the Tax-Calculator API is that users will be required to specify the paths to their data, weights, and adjust_ratios files. They'll also be required to provide the start_year for their data.

Currently, if a user has the PUF file, a Records object can be created with no arguments:

recs = Records()

Now, the user has to pass the four arguments mentioned above:

puf_data = 'puf_data.csv'
puf_weights = 'puf_weights.csv'
puf_ratios = 'puf_ratios.csv'
start_year = 2012

recs = Records(data=puf_data, weights=puf_weights, adjust_ratios=puf_ratios, start_year=start_year)
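Because the file paths are no longer implied, a missing or misnamed file now becomes the caller's problem. A minimal sketch of a pre-flight check is shown below; records_kwargs is a hypothetical helper, not part of taxcalc, and it only assumes the four keyword names used in the call above.

```python
from pathlib import Path


def records_kwargs(data, weights, adjust_ratios, start_year):
    """Check the four explicit inputs and return them as keyword arguments.

    Hypothetical helper for illustration only; it is not part of taxcalc.
    """
    paths = {"data": data, "weights": weights, "adjust_ratios": adjust_ratios}
    # Collect every path that does not point at an existing file.
    missing = [f"{name}={path}" for name, path in paths.items()
               if not Path(path).is_file()]
    if missing:
        raise FileNotFoundError("input file(s) not found: " + ", ".join(missing))
    if not isinstance(start_year, int):
        raise TypeError("start_year must be an int")
    return {**paths, "start_year": start_year}
```

With the files from the example in place, the result could be unpacked straight into the constructor, e.g. Records(**records_kwargs(puf_data, puf_weights, puf_ratios, start_year)).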

chusloj avatar Feb 02 '21 21:02 chusloj

The test suite has been updated. PUF data and related files have been replaced with synthetic data in the tests. test_compare, test_puf_var_stats and test_pufcsv will be moved to TaxData once this PR closes. Also, test_compatible_data will be moved to Tax-Brain.

chusloj avatar Feb 10 '21 21:02 chusloj

The CLI, the CLI docs, and the recipes have been updated to support the new API.

chusloj avatar Feb 12 '21 21:02 chusloj

A script to generate random test data has been included to provide test data to replace the PUF. The generated data supplements data produced by validation set 'c' from the TAXSIM32 PR.
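For readers who have not seen the script, the general idea of seeded random test data can be sketched as follows. This is not the script from the PR: the function names are hypothetical, and the handful of columns and their value ranges are assumptions chosen for illustration only.

```python
import csv
import random


def make_synthetic_records(n_records, seed=0):
    """Return a list of dicts holding random values for a few example columns."""
    rng = random.Random(seed)  # fixed seed keeps the test data reproducible
    rows = []
    for recid in range(1, n_records + 1):
        rows.append({
            "RECID": recid,                          # unique record id
            "MARS": rng.randint(1, 5),               # assumed filing-status code
            "e00200": rng.randint(0, 200_000),       # assumed wage amount
            "s006": round(rng.uniform(1, 1000), 2),  # assumed sampling weight
        })
    return rows


def write_csv(rows, path):
    """Write the generated rows to a CSV file with a header line."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
```

Seeding the generator means every run of the test suite sees the same synthetic file, so expected results can be checked in alongside the tests.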

All tests are passing. AppVeyor is failing because the test data generation script is only run in the GH action.

chusloj avatar Feb 18 '21 21:02 chusloj

As a check, I'll test the TAXSIM32 suite using the new API/CLI once that PR is merged.

chusloj avatar Feb 26 '21 21:02 chusloj

The only failing test here reports a difference between reforms_actual.csv and reforms_expect.csv. Locally, however, many tests fail (most of them in test_reforms.py) with the following error:

 IndexError: positional indexers are out-of-bounds

This might be related to a newly updated dependency, in either taxcalc or paramtools, that affects how JSON files are handled. The same dependency issue might be affecting the tests in #2588 as well.
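To narrow down the reforms_actual.csv versus reforms_expect.csv mismatch, it can help to look at exactly which lines differ. A minimal sketch using only the standard library follows; csv_diff is a hypothetical helper, and it treats both files as plain text without assuming anything about their columns.

```python
import difflib
from pathlib import Path


def csv_diff(actual_path, expect_path):
    """Return unified-diff lines between two text files (empty list if identical)."""
    actual = Path(actual_path).read_text().splitlines()
    expect = Path(expect_path).read_text().splitlines()
    return list(difflib.unified_diff(
        expect, actual,
        fromfile=str(expect_path),
        tofile=str(actual_path),
        lineterm="",  # lines were split already, so no trailing newline is added
    ))
```

Printing the result with print("\n".join(csv_diff("reforms_actual.csv", "reforms_expect.csv"))) shows expected lines prefixed with "-" and actual lines prefixed with "+".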

chusloj avatar Apr 29 '21 14:04 chusloj