Feature Request: Ability to provide detailed failure message for @require and @ensure

Open kot-behemoth opened this issue 5 years ago • 0 comments

Hi deadpixi,

First of all, just want to tell you that this library has been immensely helpful - small, simple, and works great. Awesome way to introduce great ideas into projects!

However, I found one thing that I feel could improve the user experience, and could lead to more widespread use of this library within the Python data community.

In short, a generic message saying that a particular contract failed is sometimes not enough. Usually there is a non-trivial amount of work left to figure out the root of the problem. So I think a mechanism for parameterising the message on failure would make the contracts even more useful.

Here's a concrete example. Say I'm writing some data-transformation functions on DataFrames using Pandas (a very common task in the Python data world). I want to codify the requirement that certain columns - say, a and c - need to be present in the input DataFrame.

I'm using Python 3.6.5, Pandas 0.23.4, but this should work on 3.6+, and any non-ancient Pandas.

The current way of implementing this:

import pandas as pd
from dpcontracts import require

# create our simple DataFrame with two columns - `a` and `b`.
df = pd.DataFrame([dict(a=1, b=2)])

def cols_present_simple(required_cols: set, cols):
    """Check if all columns in `required_cols` are present in `cols`."""
    return all(col in cols for col in required_cols)

@require('Certain columns need to be present', lambda a: cols_present_simple({'a', 'c'}, a.df.columns))
def func_simple(df):
    # do stuff
    pass

func_simple(df)

This errors out with PreconditionError: Certain columns need to be present, which is great, since we know the contract failed before the function ran. But now it's up to us to figure out which columns are not present. In this case it's trivial, but in the real world it's normal to see DataFrames with hundreds of columns, and requirements involving tens of columns.

Here is an approach inspired by the engarde library, which is similar but reports exactly what broke the contract in a clear, actionable way, so the user doesn't have to do any extra work.

def cols_present_with_msg(required_cols: set, cols: set):
    """Check if all columns in `required_cols` are present in `cols`.

    If not, raise a PreconditionError chained from an AssertionError that names the missing columns.
    """
    try:
        assert required_cols.issubset(cols)
        return True
    except AssertionError as e:
        from dpcontracts import PreconditionError
        missing_cols = required_cols - cols
        # Put the missing columns into the assertion message so they show up in the traceback.
        e.args = (f"These columns are missing: {missing_cols}",)
        raise PreconditionError from e

@require('Certain columns need to be present', lambda a: cols_present_with_msg({'a', 'c'}, set(a.df.columns)))
def func_with_msg(df):
    # do stuff
    pass

func_with_msg(df)

The traceback now includes AssertionError: These columns are missing: {'c'} as the cause of the PreconditionError, so it's obvious what needs fixing.
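To make the request more concrete, here is a rough sketch of what such a mechanism might look like from the user's side. It does not use dpcontracts at all: `require_with_detail`, `DetailedPreconditionError`, and `missing_columns` are hypothetical names made up for this illustration, and the check takes a plain list of column names so the example runs without Pandas.

```python
import functools
import inspect


class DetailedPreconditionError(AssertionError):
    """Hypothetical error type for this sketch; not part of dpcontracts."""


def require_with_detail(description, check):
    """Hypothetical decorator, not the dpcontracts API.

    `check` receives the bound call arguments as a mapping and returns
    None when the contract holds, or a detail string when it fails.
    """
    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Map positional and keyword arguments to parameter names.
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            detail = check(bound.arguments)
            if detail is not None:
                # Combine the static description with the computed detail.
                raise DetailedPreconditionError(f"{description}: {detail}")
            return func(*args, **kwargs)

        return wrapper

    return decorator


def missing_columns(required, columns):
    """Return a detail string naming missing columns, or None if all are present."""
    missing = set(required) - set(columns)
    if missing:
        return f"missing columns {sorted(missing)}"
    return None


@require_with_detail(
    "Certain columns need to be present",
    lambda args: missing_columns({"a", "c"}, args["columns"]),
)
def func_sketch(columns):
    # Stand-in for the real transformation; just counts the columns.
    return len(columns)
```

Calling `func_sketch(["a", "b"])` raises `DetailedPreconditionError` with the message `Certain columns need to be present: missing columns ['c']`, while `func_sketch(["a", "b", "c"])` runs normally. With a real DataFrame, one would presumably pass `df.columns` to the check instead.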

I should say that this idea came from using this library in a data-heavy context, so I would completely understand if this is out of scope for this library. The code was purely to illustrate the idea, and is most definitely not indicative of how it could be implemented.

kot-behemoth avatar Nov 13 '18 11:11 kot-behemoth