
Implement population harmonization

Open • denised opened this issue 3 years ago • 1 comment

Population harmonization is a feature that underlies all the models. In the Excel models, it is implemented as a hidden part of the Data Interpolator tab. Basically, what happens is this:

  • Someone enters a new data source and does a data interpolation process on it. The data interpolator then records, in a hidden part of the tab, what population model was in effect at the time of the interpolation.
  • Later, as part of an integration update, a new population model is created. Every data source that is part of a TAM or adoption is then re-interpolated using the ratio of the new population model to the one used for the original interpolation. The newly interpolated sources are updated in the model to replace (or augment?) the older ones.
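
To make the ratio step concrete, here is a minimal sketch. It assumes the data and both population models are keyed by year, and that "using the ratio" means multiplying each value by new population / original population; the function name and all numbers are made up for illustration, not taken from the Excel.

```python
def rescale_by_population_ratio(values, original_pop, new_pop):
    """Scale each yearly value by new_pop[year] / original_pop[year].

    Assumption: all three arguments are dicts keyed by year, and every
    year in `values` is present in both population models.
    """
    rescaled = {}
    for year, value in values.items():
        rescaled[year] = value * new_pop[year] / original_pop[year]
    return rescaled


# Made-up numbers, for illustration only.
tam = {2020: 100.0, 2030: 120.0}
original_pop = {2020: 7.5, 2030: 8.0}   # model in effect at interpolation time
new_pop = {2020: 7.5, 2030: 8.4}        # model created by the integration update

harmonized = rescale_by_population_ratio(tam, original_pop, new_pop)
```

Years where the two population models agree are left unchanged; years where the new model is higher are scaled up proportionally.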

On the Python side, I expect that we would want to do this a little differently: each data source should have attached metadata that indicates which population assumption is associated with the data in that form. A harmonization process can then be performed automatically to reinterpret that data under any other population model. In an integration process, this updated data would be stored as a new/replacement source. But the harmonization could also be done dynamically, to reflect differing assumptions for different scenarios.
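
A rough sketch of what the metadata approach might look like, assuming population models can be looked up by name. `DataSource`, `POPULATION_MODELS`, `harmonize`, and the model names are all hypothetical and do not exist in the codebase; the ratio direction (new over original) is the same assumption as in the Excel description above.

```python
from dataclasses import dataclass

# Hypothetical registry of population models: name -> {year: population}.
POPULATION_MODELS = {
    "pop_2019": {2020: 7.5, 2030: 8.0},
    "pop_2021": {2020: 7.5, 2030: 8.4},
}


@dataclass(frozen=True)
class DataSource:
    name: str
    values: dict            # year -> value
    population_model: str   # metadata: which population assumption the values reflect


def harmonize(source, target_model, models=POPULATION_MODELS):
    """Return a new DataSource expressed under `target_model`."""
    if source.population_model == target_model:
        return source  # already consistent, nothing to do
    old = models[source.population_model]
    new = models[target_model]
    scaled = {year: v * new[year] / old[year] for year, v in source.values.items()}
    return DataSource(source.name, scaled, target_model)


original = DataSource("example source", {2020: 100.0, 2030: 120.0}, "pop_2019")
updated = harmonize(original, "pop_2021")
```

Because `harmonize` returns a new object and leaves the original untouched, the same function could serve both cases: the result can be stored as a replacement source during integration, or computed on the fly for a scenario with a different population assumption.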

For existing data sets, I would guess that there are cases in which the original population model was not recorded (data sources that never needed interpolation, for example). So we may need to make some assumptions about those.
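
One possible way to handle the missing cases, sketched under the assumption that an unrecorded model is represented as `None` and that we agree on a single fallback. `DEFAULT_POPULATION_MODEL` and `resolve_population_model` are hypothetical names, and which model to fall back to is an open question.

```python
from typing import Optional, Tuple

# Hypothetical fallback; the actual choice would need to be decided.
DEFAULT_POPULATION_MODEL = "legacy_default"


def resolve_population_model(recorded: Optional[str]) -> Tuple[str, bool]:
    """Return (model_name, was_assumed).

    The flag lets downstream code report which sources were harmonized
    against an assumed population model rather than a recorded one.
    """
    if recorded is None:
        return DEFAULT_POPULATION_MODEL, True
    return recorded, False


recorded_case = resolve_population_model("pop_2019")
missing_case = resolve_population_model(None)
```

Keeping the "assumed" flag alongside the name means the guess stays visible and can be corrected later if the original model is ever identified.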

denised · Sep 13 '21 21:09

Note that this functionality underlies #416

denised · Sep 13 '21 21:09