
Static type check for DataFrame types

Open biosunsci opened this issue 4 months ago • 0 comments

Feature

We know that mypy and typing now support TypedDict:

from typing import TypedDict, Literal

class OverlapsDict(TypedDict):
    id: int
    seq_id: int
    pr_order: int
    pos1: int
    seq: str
    pos2: int
    seq_len: int
    repeat_info: float  # strictly speaking, should be np.nan
    repeat_type: float  # strictly speaking, should be np.nan
    item_type: Literal['overlap']
    devmode: str
    update_time: str

We can use OverlapsDict to restrict dict parameters, like this:

def myfunc(a: OverlapsDict) -> None:
    pass

But in many data science scenarios, this parameter needs to be a DataFrame with specific columns of specific dtypes. Would it be possible to add a new type class, TypedDataFrame, that could be used as in the following code?

class OverlapsDataFrame(TypedDataFrame):
    id: int
    seq_id: int
    pr_order: int
    pos1: int
    seq: str
    pos2: int
    seq_len: int
    repeat_info: float  # strictly speaking, should be np.nan
    repeat_type: float  # strictly speaking, should be np.nan
    item_type: Literal['overlap']
    devmode: str
    update_time: str

and then restrict DataFrame parameters with OverlapsDataFrame:

def myfunc2(a: OverlapsDataFrame) -> None:
    pass

The constraint is that a must be a DataFrame whose columns have the names and types defined by OverlapsDataFrame.
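
To make the intended constraint concrete, here is a minimal runtime sketch of the same check. It is not a proposed API and it is not static: OverlapsSchema, KIND_FOR_TYPE, expected_kind and schema_errors are hypothetical names, the schema is shortened to four columns, and the mapping from Python types to NumPy dtype kind codes is an assumption. To stay library-free it takes a plain {column name: dtype kind} mapping; with pandas, that mapping could presumably be built from each column's dtype.kind.

```python
from typing import Literal, Mapping, get_args, get_origin, get_type_hints

# Assumption: Python annotation -> NumPy dtype "kind" code
# ("i" signed integer, "f" floating point, "O" Python object such as str).
KIND_FOR_TYPE = {int: "i", float: "f", str: "O"}


class OverlapsSchema:
    """Hypothetical stand-in for the requested TypedDataFrame subclass."""
    id: int
    seq: str
    repeat_info: float
    item_type: Literal["overlap"]


def expected_kind(annotation) -> str:
    # Literal["overlap"] is treated as the type of its first value (str).
    if get_origin(annotation) is Literal:
        annotation = type(get_args(annotation)[0])
    return KIND_FOR_TYPE[annotation]


def schema_errors(schema: type, kinds: Mapping[str, str]) -> list:
    """Compare a schema class against a {column name: dtype kind} mapping."""
    errors = []
    for name, annotation in get_type_hints(schema).items():
        if name not in kinds:
            errors.append(f"missing column: {name}")
        elif kinds[name] != expected_kind(annotation):
            errors.append(f"wrong dtype kind for {name}: {kinds[name]}")
    return errors
```

An empty list means every annotated column is present with the expected kind. Extra columns are ignored, and the Literal value itself is not checked. The feature requested here would ideally report the same mismatches at type-check time instead.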

Pitch

biosunsci avatar Oct 14 '24 02:10 biosunsci