ENH: Is it possible to add support for static type hints for DataFrame and Series?

Open biosunsci opened this issue 1 year ago • 0 comments

Feature Type

  • [X] Adding new functionality to pandas

  • [ ] Changing existing functionality in pandas

  • [ ] Removing existing functionality in pandas

Problem Description

New Feature Wanted

We know that mypy and the typing module now support type hints for dicts through TypedDict:

from typing import TypedDict, Optional, Literal
class OverlapsDict(TypedDict):
    id: int
    seq_id: int
    pr_order: int
    pos1: int
    seq: str    
    pos2: int
    seq_len: int
    repeat_info: float  # strictly speaking, the value should be np.nan
    repeat_type: float  # strictly speaking, the value should be np.nan
    item_type: Literal['overlap']
    devmode: str
    update_time: str

We can then use OverlapsDict to constrain dict parameters, for example:

def myfunc(a: OverlapsDict):
    pass
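
For context, here is a minimal sketch of what such a TypedDict annotation does and does not do. It uses a shortened version of OverlapsDict, and the `describe` function and sample values are hypothetical. The point is that the check happens in a static checker such as mypy; at runtime the value is an ordinary dict.

```python
from typing import Literal, TypedDict


class OverlapsDict(TypedDict):
    # Shortened schema for illustration; the full one has more keys.
    id: int
    seq: str
    item_type: Literal["overlap"]


def describe(row: OverlapsDict) -> str:
    """Hypothetical consumer that relies on the declared keys."""
    return f"{row['id']}:{row['seq']}"


good: OverlapsDict = {"id": 1, "seq": "ACGT", "item_type": "overlap"}
print(describe(good))

# A static checker is expected to reject describe({"id": "oops"}) because
# keys are missing and "id" is not an int. Python itself performs no such
# check: a TypedDict value is a plain dict at runtime.
print(type(good) is dict)
```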

Could the DataFrame and Series types support a similar type hint, one that checks that DataFrame.columns or Series.index contains certain labels and that the corresponding values have certain dtypes?
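
To make the requested check concrete, here is a rough runtime sketch, not a static one and not an existing pandas API. It compares a TypedDict schema against a mapping of column name to dtype name. The helper names (`expected_dtypes`, `schema_errors`) and the `DTYPE_NAMES` table are assumptions made for illustration, and the schema is shortened.

```python
from typing import Literal, TypedDict, get_args, get_origin, get_type_hints


class OverlapsDict(TypedDict):
    # Shortened schema for illustration.
    id: int
    seq: str
    repeat_info: float
    item_type: Literal["overlap"]


# Assumed mapping from annotation types to dtype names (illustrative only;
# real dtype names depend on the data and the pandas version).
DTYPE_NAMES = {int: "int64", float: "float64", str: "object"}


def expected_dtypes(schema: type) -> dict[str, str]:
    """Translate a TypedDict's annotations into expected dtype names."""
    result = {}
    for name, hint in get_type_hints(schema).items():
        if get_origin(hint) is Literal:
            # Fall back to the Python type of the first literal value.
            hint = type(get_args(hint)[0])
        result[name] = DTYPE_NAMES[hint]
    return result


def schema_errors(schema: type, dtypes: dict[str, str]) -> list[str]:
    """List mismatches between the schema and the observed dtype names."""
    errors = []
    for name, want in expected_dtypes(schema).items():
        if name not in dtypes:
            errors.append(f"missing column: {name}")
        elif dtypes[name] != want:
            errors.append(f"{name}: expected {want}, got {dtypes[name]}")
    return errors


observed = {"id": "int64", "seq": "object", "repeat_info": "int64"}
print(schema_errors(OverlapsDict, observed))
```

With a real DataFrame df, the observed mapping could be built as `{col: str(dtype) for col, dtype in df.dtypes.items()}`. This only runs when called, so it does not replace the static checking the request is about.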

Feature Description

Is it possible to add a new class such as TypedDataFrame, or to make DataFrame and Series generic, so that they can be used as in the following code?

# define
class OverlapsDataFrame(TypedDataFrame):
    id: int
    seq_id: int
    pr_order: int
    pos1: int
    seq: str    
    pos2: int
    seq_len: int
    repeat_info: float  # strictly speaking, the value should be np.nan
    repeat_type: float  # strictly speaking, the value should be np.nan
    item_type: Literal['overlap']
    devmode: str
    update_time: str
# usage
def myfunc2(a: OverlapsDataFrame):
    pass

and / or

def myfunc2(a: pd.DataFrame[OverlapsDict]):
    pass

With either definition in place, DataFrame parameters can be constrained by OverlapsDataFrame. In the example, the constraint is that a must be a DataFrame whose columns have the names and types declared by OverlapsDataFrame or pd.DataFrame[OverlapsDict].
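
As a point of comparison, one way to attach a schema to a parameter with the typing module as it exists today is typing.Annotated. The sketch below is only an illustration, not the proposed feature: `Frame` is a hypothetical stand-in for pd.DataFrame so the code runs without pandas, and static checkers treat the annotation as plain `Frame`, ignoring the schema.

```python
from typing import Annotated, Literal, TypedDict, get_args, get_type_hints


class OverlapsDict(TypedDict):
    # Shortened schema for illustration.
    id: int
    seq: str
    item_type: Literal["overlap"]


class Frame:
    """Hypothetical stand-in for pd.DataFrame."""


# Assumption: the schema travels as Annotated metadata, not as a subscript.
OverlapsFrame = Annotated[Frame, OverlapsDict]


def myfunc2(a: OverlapsFrame) -> None:
    pass


# A (hypothetical) runtime checker could recover the schema from the signature.
hints = get_type_hints(myfunc2, include_extras=True)
base, schema = get_args(hints["a"])
print(base.__name__, sorted(get_type_hints(schema)))
```

The metadata is available to tools that choose to read it, but nothing enforces it by default, which is why first-class support along the lines of TypedDataFrame or pd.DataFrame[OverlapsDict] would still be needed for real static checking.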

Alternative Solutions

Supporting both forms would be best, but either one on its own would be fine.

Additional Context

No response

biosunsci, Oct 15 '24 12:10