Cardea
Compose migration
Prediction Engineering
How to use Compose to write the problem definition component in Cardea.
Compose is a machine learning tool for automated prediction engineering. It lets you structure prediction problems and generate labels for supervised learning. We can use Compose to search for the cutoff times of a specific prediction problem (e.g. length of stay) and return label_times.
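To make the idea of label times concrete, here is a minimal sketch in plain Python. It does not use the Compose API: `search_label_times` is a hypothetical stand-in, the records are made up, and taking the earliest record of each instance as the cutoff time is a simplifying assumption.

```python
from collections import defaultdict
from datetime import datetime


def search_label_times(records, entity, time_index, labeling_function):
    """Return one row per instance: its id, a cutoff time, and a label."""
    # Group the raw records by the instance they belong to.
    groups = defaultdict(list)
    for record in records:
        groups[record[entity]].append(record)

    label_times = []
    for instance_id in sorted(groups):
        rows = sorted(groups[instance_id], key=lambda row: row[time_index])
        label_times.append({
            entity: instance_id,
            # Simplifying assumption: the cutoff is the instance's earliest record.
            "time": rows[0][time_index],
            "label": labeling_function(rows),
        })
    return label_times


def missed(rows):
    """Label an appointment as missed if any of its records is a no-show."""
    return any(row["status"] == "noshow" for row in rows)


# Made-up records for illustration only.
records = [
    {"appointment": "a1", "created": datetime(2019, 1, 5), "status": "booked"},
    {"appointment": "a1", "created": datetime(2019, 1, 9), "status": "noshow"},
    {"appointment": "a2", "created": datetime(2019, 2, 1), "status": "fulfilled"},
]

label_times = search_label_times(records, "appointment", "created", missed)
for row in label_times:
    print(row)
```

Each resulting row pairs an instance with a cutoff time and a target label, which is the information the feature generation phase needs to avoid using data recorded after the cutoff.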
The component should be easily adaptable to support multiple prediction problems:
- appointment no show
- mortality prediction
- length of stay
- etc
Design
There are two main parts that we need to define, plus supporting helpers:
- A class whose main function is generating label times
- Functions that each define a specific prediction problem
- Helper functions used to create the prediction problem
Design of data_labeler.py
class DataLabeler:
    """Class that defines the prediction problem.

    This class supports the generation of `label_times`, which
    is fundamental to the feature generation phase as well
    as to specifying the target labels.

    Args:
        function (method):
            function that defines the prediction problem. It should return a
            tuple of the labeling function, the dataframe, and the metadata
            of the task (including the name of the target entity).
    """

    def __init__(self, function):
        self.function = function

    def generate_label_times(self, es, *args, **kwargs):
        """Searches the data to calculate label times.

        Args:
            es (featuretools.EntitySet):
                Entityset to search and extract labels from.

        Returns:
            composeml.LabelTimes:
                Calculated labels with cutoff times.
        """
        pass
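As a rough sketch of how `generate_label_times` might connect the two parts, the class below unpacks the tuple returned by the prediction function and delegates the search to a label maker. The names `SketchDataLabeler`, `make_label_maker`, `FakeLabelMaker`, and `toy_problem` are hypothetical and exist only for illustration. The fake label maker stands in for Compose so that the example runs on its own; the `search(data, num_examples_per_instance)` shape is an assumption about what the real label maker would expose.

```python
class SketchDataLabeler:
    """Wiring sketch: prediction function in, label times out."""

    def __init__(self, function, make_label_maker):
        self.function = function
        # Hypothetical hook, not part of the design above: builds the object
        # that performs the search (in Cardea, presumably a Compose label maker).
        self.make_label_maker = make_label_maker

    def generate_label_times(self, es, *args, **kwargs):
        # The prediction function supplies everything that is problem specific.
        labeling_function, df, meta = self.function(es)
        label_maker = self.make_label_maker(labeling_function, meta)
        return label_maker.search(
            df, meta["num_examples_per_instance"], *args, **kwargs)


class FakeLabelMaker:
    """Stand-in for the real label maker so the sketch runs on its own."""

    def __init__(self, labeling_function, meta):
        self.labeling_function = labeling_function
        self.meta = meta

    def search(self, data, num_examples_per_instance):
        # Treat the whole data set as a single instance.
        row = {"entity": self.meta["entity"],
               "label": self.labeling_function(data)}
        return [row] * num_examples_per_instance


def toy_problem(es):
    """Toy prediction function following the (function, data, meta) contract."""
    def missed(ds, **kwargs):
        return "noshow" in ds["status"]

    meta = {"entity": "appointment", "num_examples_per_instance": 1}
    return missed, es, meta  # the toy "entityset" is already flat data


labeler = SketchDataLabeler(toy_problem, FakeLabelMaker)
result = labeler.generate_label_times({"status": ["booked", "noshow"]})
print(result)
```

Keeping everything problem specific inside the prediction function means that supporting a new problem, such as mortality prediction, should only require writing another function that honors the same three-element contract.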
Design of a prediction function (e.g. appointment_no_show.py)
def appointment_no_show(es):
    def missed(ds, **kwargs):
        # Labeling function: True if the data slice contains a no-show.
        return 'noshow' in ds["status"].values

    meta = {
        # values to define prediction task
        "entity": "appointment",
        "time_index": "created",
        "type": "classification",
        "num_examples_per_instance": 1
    }

    # denormalize is one of the helper functions mentioned above; it
    # flattens the listed entities into a single dataframe.
    df = denormalize(es, entities=['Appointment'])
    return missed, df, meta