[FEATURE] Improve error messages around schema
Is your feature request related to a problem? Please describe.
When creating a transformer, the error message for invalid annotations is currently very vague. The following example:
from typing import List
from fugue import transform
import pandas as pd

def map_phone_to_location(df: List[List]) -> List[List]:
    for row in df:
        row.append(["test"])
    return df

data = pd.DataFrame({"phone": ["(217)-123-4567", "(217)-234-5678", "(407)-123-4567",
                              "(407)-234-5678", "(510)-123-4567"]})
transform(data, map_phone_to_location, schema="phone:str, new_col:str")
produces an error like:
FugueInterfacelessError: ('<function map_phone_to_location at 0x7fd1894c5b00> is not a valid transformer', TypeError("Input types not valid IndexedOrderedDict([('df', [Other])]) for <function map_phone_to_location at 0x7fd1894c5b00>"))
Describe the solution you'd like
Instead, I think this error should say something like:
"Detected type 'List[List]'. The valid type annotations are: ...."
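To make the request concrete, here is a minimal sketch of the kind of check I have in mind. It is not Fugue's code: `SUPPORTED_ANNOTATIONS` and `describe_invalid_annotations` are hypothetical names, and the allow-list is only illustrative, not the real set of annotations Fugue accepts.

```python
import inspect
from typing import Any, Callable, Dict, Iterable, List

# Hypothetical allow-list for illustration only; NOT Fugue's actual supported set.
SUPPORTED_ANNOTATIONS = (
    List[List[Any]],
    Iterable[List[Any]],
    List[Dict[str, Any]],
    Iterable[Dict[str, Any]],
)


def describe_invalid_annotations(func: Callable) -> List[str]:
    """Return one readable message per parameter whose annotation is not allowed."""
    supported = ", ".join(str(a) for a in SUPPORTED_ANNOTATIONS)
    problems = []
    for name, param in inspect.signature(func).parameters.items():
        # A missing annotation (inspect.Parameter.empty) is also reported here.
        if param.annotation not in SUPPORTED_ANNOTATIONS:
            problems.append(
                f'Parameter "{name}": detected type "{param.annotation}". '
                f"The valid type annotations are: {supported}"
            )
    return problems
```

Calling this on a function annotated with a bare `List[List]` returns a single message that names the offending parameter and lists the accepted annotations, instead of the `[Other]` placeholder.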
To add to this, missing type annotations also give an unhelpful error:
def clip(df: pd.DataFrame):
    df['value'] = df['value'].clip(1, 2)
    return df
Gives:
FugueInterfacelessError: ('<function clip at 0x7fe34365c710> is not a valid transformer', FugueInterfacelessError("* can't be used on cotransformer output schema"))
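For the missing-annotation case, a similar sketch could detect the absent return annotation up front and say so directly. Again, this is only an illustration under my own assumptions: `describe_missing_return_annotation` is a hypothetical helper, not part of Fugue, and the suggested wording is just an example.

```python
import inspect
from typing import Callable, Optional


def describe_missing_return_annotation(func: Callable) -> Optional[str]:
    """Return a readable message if func has no return annotation, else None."""
    signature = inspect.signature(func)
    if signature.return_annotation is inspect.Signature.empty:
        return (
            f'Function "{func.__name__}" has no return type annotation. '
            'Add one, for example "-> pd.DataFrame".'
        )
    return None
```

For the `clip` function above, this would produce a message pointing at the missing return annotation rather than mentioning cotransformer output schemas.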