estimagic icon indicating copy to clipboard operation
estimagic copied to clipboard

Bootstrap weights

Open alanlujan91 opened this issue 11 months ago • 8 comments

Add weights kwarg to handle survey weights

alanlujan91 avatar Mar 04 '24 22:03 alanlujan91

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 93.11%. Comparing base (424fdda) to head (b33835d).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #485      +/-   ##
==========================================
+ Coverage   93.09%   93.11%   +0.01%     
==========================================
  Files         195      195              
  Lines       14718    14757      +39     
==========================================
+ Hits        13702    13741      +39     
  Misses       1016     1016              

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Mar 05 '24 14:03 codecov[bot]

Hey Alan! Thanks for opening another pull request.

Since this time, the PR is not related to a minor fix but an extension, could you extend the PR description? In particular, could you state the problem (i.e. the missing feature), how this is called in the bootstrap literature, and how you plan to implement it. If you know other libraries that implement this feature, this would also be great to know.

timmens avatar Mar 05 '24 18:03 timmens

Hey Alan! Thanks for opening another pull request.

Since this time, the PR is not related to a minor fix but an extension, could you extend the PR description? In particular, could you state the problem (i.e. the missing feature), how this is called in the bootstrap literature, and how you plan to implement it. If you know other libraries that implement this feature, this would also be great to know.

Will try to address all of this in the coming days! Thanks!

alanlujan91 avatar Mar 06 '24 13:03 alanlujan91

Hey @alanlujan91, thanks again for working on this feature. We are planning a major refactoring and want to close as many PRs as possible beforehand.

Are you planning to work on this PR in the near future?

timmens avatar May 02 '24 16:05 timmens

hi @timmens

I know this feature as proportional sampling, which is something we have done in Econ-ARK but I don't have a good link to the literature.

https://en.wikipedia.org/wiki/Probability-proportional-to-size_sampling

I can work on this more if you think this is a desired feature, I've just been pretty busy so this hasn't been a priority.

alanlujan91 avatar May 02 '24 16:05 alanlujan91

Hi Alan, thanks for the PR. I think this would be nice to have in estimagic. As Tim said, we are planning major refactorings and would like to have few PRs open when we start. Do you think you can finish within the next three weeks? Otherwise it might be a good idea to convert this to an issue and postpone the actual implementation until you have more free time.

janosg avatar May 03 '24 06:05 janosg

@janosg I can definitely finish this within 3 weeks

alanlujan91 avatar May 06 '24 20:05 alanlujan91

Modified code according to above comments from @timmens, except for

If the weights are uniform (as in your example), the bootstrap indices with and without the weights argument should be the same. Could you test that? You will need to create two random number generators with the same state.

This one is complicated because of the following note from numpy

"Setting user-specified probabilities through p uses a more general but less efficient sampler than the default. The general sampler produces a different sample than the optimized sampler even if each element of p is 1 / len(a)."

It can be done, but requires generating dummy p's as p = np.ones(n_obs)/n_obs when weights_by is None. Let me know if this is desirable

alanlujan91 avatar May 21 '24 15:05 alanlujan91