Filtering API for Python

Open eerolinna opened this issue 5 years ago • 0 comments

The R filtering API looks like this:

pos <- filter_posteriors(my_pdb, data_name == "eight_schools")

The exact same API cannot be implemented in Python*. I propose this instead:

pos = my_pdb.filter_posteriors(lambda posterior: posterior.data.name == "eight_schools")

In other words, filter_posteriors takes a function that takes a posterior object and returns a bool.

The function doesn't have to be given inline; the following is valid too:

def filter_function(posterior):
    return posterior.data.name == "eight_schools"
pos = my_pdb.filter_posteriors(filter_function)

The return value of filter_posteriors is a list of posterior objects.
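To make the proposal concrete, here is a minimal sketch of how filter_posteriors could behave. The Data, Posterior, and PosteriorDatabase classes and the sample entries below are hypothetical stand-ins written for this illustration, not the actual posteriordb classes or contents.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Data:
    name: str


@dataclass
class Posterior:
    name: str
    data: Data


class PosteriorDatabase:
    """Hypothetical stand-in for my_pdb; not the real posteriordb class."""

    def __init__(self, posteriors: List[Posterior]):
        self._posteriors = list(posteriors)

    def filter_posteriors(
        self, predicate: Callable[[Posterior], bool]
    ) -> List[Posterior]:
        # Keep every posterior for which the predicate returns a truthy value.
        return [p for p in self._posteriors if predicate(p)]


# Made-up sample entries, for illustration only.
my_pdb = PosteriorDatabase([
    Posterior("eight_schools-noncentered", Data("eight_schools")),
    Posterior("radon-pooled", Data("radon")),
])

pos = my_pdb.filter_posteriors(
    lambda posterior: posterior.data.name == "eight_schools"
)
```

Because the predicate is an ordinary callable, a lambda, a named function, or a functools.partial object would all work the same way here.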

I also propose adding filter_models and filter_data, which work like filter_posteriors but act on models and data respectively.

The following query finds the posteriors whose model has the keyword bda3_example:

filtered_models = my_pdb.filter_models(lambda model: "bda3_example" in model.information["keywords"])
filtered_posteriors = my_pdb.filter_posteriors(lambda posterior: posterior.model in filtered_models)
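The two-step query can be tried end to end with a small sketch. The Model, Posterior, and PosteriorDatabase classes and the sample entries are hypothetical and exist only for this example. One assumption worth flagging: the sketch uses information.get("keywords", []) so that a model without keywords is skipped rather than raising KeyError, which differs slightly from the indexing used in the query above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Model:
    name: str
    information: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Posterior:
    name: str
    model: Model


class PosteriorDatabase:
    """Hypothetical stand-in for my_pdb; not the real posteriordb class."""

    def __init__(self, models: List[Model], posteriors: List[Posterior]):
        self._models = list(models)
        self._posteriors = list(posteriors)

    def filter_models(self, predicate: Callable[[Model], bool]) -> List[Model]:
        return [m for m in self._models if predicate(m)]

    def filter_posteriors(
        self, predicate: Callable[[Posterior], bool]
    ) -> List[Posterior]:
        return [p for p in self._posteriors if predicate(p)]


# Made-up sample entries, for illustration only.
schools = Model("eight_schools_noncentered", {"keywords": ["bda3_example"]})
radon = Model("radon_pooled")  # no keywords recorded
my_pdb = PosteriorDatabase(
    models=[schools, radon],
    posteriors=[Posterior("p1", schools), Posterior("p2", radon)],
)

# Step 1: select the models carrying the keyword.
filtered_models = my_pdb.filter_models(
    lambda model: "bda3_example" in model.information.get("keywords", [])
)
# Step 2: select the posteriors whose model was selected in step 1.
filtered_posteriors = my_pdb.filter_posteriors(
    lambda posterior: posterior.model in filtered_models
)
```

The membership test in step 2 depends on model objects comparing equal (dataclasses generate __eq__ by default). If the real model class does not define equality, comparing by a unique name may be a safer choice.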

[*] This is because Python doesn't support unevaluated expressions like data_name == "eight_schools".
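A short illustration of the footnote, assuming no variable called data_name exists in the current scope: Python evaluates the comparison immediately and fails, whereas wrapping it in a lambda defers evaluation until the function is called.

```python
# Python evaluates the expression eagerly, so the undefined name fails at once.
try:
    condition = data_name == "eight_schools"
except NameError as error:
    outcome = f"NameError: {error}"
else:
    outcome = "no error"

# Wrapping the expression in a function delays evaluation until call time,
# when the argument supplies a value for data_name.
predicate = lambda data_name: data_name == "eight_schools"
matches = predicate("eight_schools")
```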

eerolinna, Jan 16 '20 13:01