pymatch

Adding option to match without replacement.

Open tlooden opened this issue 6 years ago • 2 comments

Hi Ben,

Thanks for making this nice tool! If you like, I've implemented a new feature that is common in e.g. R packages for the same purpose: the option to match without replacement. The downsides can be slightly worse matching overall and possible order effects. However, for some types of analyses you really want unique subjects in each group. Now the user can make that decision! :)

I've also implemented (line 189) a randomization of the order in which the matching proceeds. This lets you check for said order effects, e.g. by running it a couple of times until the matching is at a desirable level.
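To make the idea concrete, here is a minimal standalone sketch of greedy 1:1 matching without replacement, with the processing order shuffled. It is not the code in this PR: the function name `match_without_replacement`, its `seed` parameter, and the toy scores are all made up for illustration, and it works on plain lists of scores rather than on a `Matcher` object.

```python
import random


def match_without_replacement(case_scores, control_scores, seed=None):
    """Greedy 1:1 nearest-score matching; each control is used at most once.

    Hypothetical helper for illustration only. Returns a list of
    (case_index, control_index) pairs.
    """
    rng = random.Random(seed)
    order = list(range(len(case_scores)))
    rng.shuffle(order)  # randomize processing order so order effects can show up

    available = set(range(len(control_scores)))
    pairs = []
    for i in order:
        if not available:
            break  # more cases than controls: the remaining cases stay unmatched
        # Closest remaining control by absolute score difference
        # (ties broken by the lower control index).
        j = min(available, key=lambda k: (abs(control_scores[k] - case_scores[i]), k))
        available.remove(j)  # "without replacement": this control is now used up
        pairs.append((i, j))
    return pairs


# Toy scores: which case is processed first can change the total distance.
cases = [0.30, 0.32, 0.80]
controls = [0.31, 0.33, 0.50, 0.79]
for seed in range(3):
    pairs = match_without_replacement(cases, controls, seed=seed)
    total = sum(abs(cases[i] - controls[j]) for i, j in pairs)
    print(seed, sorted(pairs), round(total, 3))
```

Running it with a few different seeds and comparing the total distance is one way to see whether the result depends on the order in which cases are processed.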

Please let me know if I can make anything clearer. It's my first GH pull request, so I hope I am following the right protocol.

All the best!

Tristan

tlooden • Aug 07 '19 13:08

Same as #13: this somehow does not work correctly, or did I do something wrong?

from pymatch.Matcher import Matcher
import pandas as pd

cases_ages = [23, 21, 26, 25, 23, 44, 24, 22, 46, 26]
controls_ages = [34, 30, 24, 25, 25, 27, 30, 33, 53, 27, 26, 28, 23, 23, 28, 23, 24, 22, 23, 25]
cases_group = [1] * len(cases_ages)
# one label per control row
controls_group = [0] * len(controls_ages)

df_cases = pd.DataFrame(list(zip(cases_ages, cases_group)), columns=['age', 'group'])
df_controls = pd.DataFrame(list(zip(controls_ages, controls_group)), columns=['age', 'group'])

m = Matcher(df_cases, df_controls, yvar='group')
m.fit_scores(balance=True, nmodels=100)
# with_replacement=False is the option added in this PR
m.match(method='min', nmatches=1, with_replacement=False)
print(m.matched_data)
# only 4 matches are found?
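One thing worth checking in a script like this is that each group's label list is as long as its data, because `zip()` stops at the shortest input and drops the rest without warning. A minimal sketch using the ages above; `labels_short` and `labels_full` are names introduced only for this illustration:

```python
cases_ages = [23, 21, 26, 25, 23, 44, 24, 22, 46, 26]
controls_ages = [34, 30, 24, 25, 25, 27, 30, 33, 53, 27,
                 26, 28, 23, 23, 28, 23, 24, 22, 23, 25]

# Labels sized by the number of cases (10) versus the number of controls (20).
labels_short = [0] * len(cases_ages)
labels_full = [0] * len(controls_ages)

rows_short = list(zip(controls_ages, labels_short))
rows_full = list(zip(controls_ages, labels_full))

print(len(controls_ages), len(rows_short), len(rows_full))  # 20 10 20
```

With the shorter label list, only the first 10 controls would reach the DataFrame, so the pool of candidate matches is half the intended size.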

skjerns • May 17 '20 10:05

@tlooden Thank you for this feature :)

harveyaa • Apr 28 '22 17:04