
Less awkward API for using simple stateful transforms?

Open shoyer opened this issue 10 years ago • 0 comments

I found the syntax for applying stateful transforms awkward, as shown by the tutorial example:

>>> build_design_matrices([mat.design_info.builder], new_data)[0]

Two reasons:

  1. Understanding the full expression entails a deep dive into patsy's API.
  2. As a code reviewer, I also worry when I see things like [0] at the end of an expression, because it looks like data might be thrown away. (I suppose one solution to this is explicit assignment, like new_mat, = ....)
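To illustrate the parenthetical in point 2, here is a minimal sketch in plain Python. `build_many` is a hypothetical stand-in for any function that returns a list of results; it is not part of patsy.

```python
def build_many(n):
    """Hypothetical stand-in: returns a list with n results."""
    return [f"matrix_{i}" for i in range(n)]


# Indexing with [0] keeps the first result and silently drops any others.
first = build_many(2)[0]

# One-element unpacking succeeds only when exactly one result comes back.
(only,) = build_many(1)

# With more than one result, unpacking raises instead of discarding data.
try:
    (only_again,) = build_many(2)
except ValueError as exc:
    unpack_error = str(exc)
else:
    unpack_error = None
```

The unpacking form turns "exactly one result expected" into something the interpreter checks, which is the property a reviewer is looking for.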

So instead, I wrote a helper function:

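import numpy as np
from patsy import build_design_matrices
from patsy.util import have_pandas
if have_pandas:
    import pandas

def updated_design_matrix(design_matrix, data, NA_action='drop'):
    """Shortcut to ``build_design_matrices`` with the builder from
    ``design_matrix.design_info.builder``
    """
    if have_pandas and isinstance(design_matrix, pandas.DataFrame):
        return_type = 'dataframe'
    else:
        return_type = 'matrix'
    # A DataFrame has no single ``dtype``; np.asarray() covers both cases.
    dtype = np.asarray(design_matrix).dtype
    return build_design_matrices([design_matrix.design_info.builder], data,
                                 NA_action, return_type, dtype)[0]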

This lets me write this instead:

>>> updated_design_matrix(mat, new_data)

...which looks much closer to the "high level" syntax.
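For readers less familiar with what the helper wraps, here is a minimal end-to-end sketch of the underlying call. It assumes a recent patsy release in which `build_design_matrices` accepts `DesignInfo` objects directly (the `.builder` attribute is the spelling used in this issue), and the data values are made up for illustration.

```python
import numpy as np
from patsy import build_design_matrices, dmatrix

# Illustrative data (made up): center(x) memorizes the mean of x, here 2.0.
train_data = {"x": [1, 2, 3]}
mat = dmatrix("center(x)", train_data)

# Reuse the memorized mean on new data. Explicit one-element unpacking
# makes it clear that exactly one matrix is expected back.
new_data = {"x": [10, 20]}
(new_mat,) = build_design_matrices([mat.design_info], new_data)

# Columns are the intercept and x minus the training mean.
print(np.asarray(new_mat))
```

The second column is computed with the mean learned from `train_data`, not the mean of `new_data`, which is the point of a stateful transform.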

Does something like this belong in core patsy?

Note: we will need a similarly named (but probably not the same) function to handle the "update" syntax of . (note #28).

P.S. In case it isn't obvious, it has been a pleasure for me to discover and use patsy over the past few weeks :).

shoyer, Oct 30 '13 07:10