[Bug Report] ValueError: NaTType does not support strftime

Open yuzeh opened this issue 2 years ago • 7 comments

Current Behaviour

Rendering a ProfileReport with tsmode=True crashes while rendering the time-series gaps. The stack trace points to _render_gap_tab.
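
For context, the message itself comes from pandas rather than from ydata-profiling: calling strftime on pd.NaT raises it. Below is a minimal sketch that isolates this, independent of the report code. The helper fmt_timestamp_or_nat is hypothetical and only illustrates the kind of guard that would avoid the crash; it is not what _render_gap_tab actually does.

```python
import pandas as pd

# Calling strftime on a missing timestamp (NaT) raises the error from the title.
message = None
try:
    pd.NaT.strftime("%Y-%m-%d")
except ValueError as exc:
    message = str(exc)
print(message)


def fmt_timestamp_or_nat(value, fmt="%Y-%m-%d %H:%M:%S"):
    """Hypothetical guard (not part of ydata-profiling): skip strftime for NaT."""
    if pd.isna(value):
        return "NaT"
    return value.strftime(fmt)


print(fmt_timestamp_or_nat(pd.NaT))
print(fmt_timestamp_or_nat(pd.Timestamp("2023-01-01")))
```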

Expected Behaviour

see code

Data Description

see code

Code that reproduces the bug

import pandas as pd
import numpy as np
from ydata_profiling import ProfileReport

df = pd.DataFrame({"dt": pd.date_range(pd.to_datetime("2023-01-01"), pd.to_datetime("2023-02-01")), "y": np.arange(32)})
profile = ProfileReport(
    df,
    tsmode=True,
    sortby="dt",
    type_schema={
        "dt": "datetime",
        "y": "timeseries",
    },
)
profile.widgets

pandas-profiling version

v4.5.1

Dependencies

pandas==2.0.3

OS

No response

Checklist

  • [X] There is not yet another bug report for this issue in the issue tracker
  • [X] The problem is reproducible from this bug report. This guide can help to craft a minimal bug report.
  • [X] The issue has not been resolved by the entries listed under Common Issues.

yuzeh, Sep 03 '23 00:09

+1 on this; I am experiencing the same issue.

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 477 entries, 0 to 476
Data columns (total 6 columns):
 #   Column        Non-Null Count  Dtype         
---  ------        --------------  -----         
 0   DT            477 non-null    datetime64[ns]
 1   CHANNEL       477 non-null    object        
 2   IMPRESSIONS   477 non-null    int64         
 3   CLICKS        477 non-null    int64         
 4   CONVERSIONS   477 non-null    int64         
 5   AD_SPEND_USD  477 non-null    float64       
dtypes: datetime64[ns](1), float64(1), int64(3), object(1)
memory usage: 22.5+ KB

profile = ProfileReport(
    site_df,
    tsmode=True,
    type_schema=type_schema,
    sortby="DT",
    title="Time-Series EDA for a channel",
)

profile.to_file("report_timeseries.html")

Getting the same error as @yuzeh: ValueError: NaTType does not support strftime

priamai, Sep 03 '23 23:09

+1, I am also having this issue. I don't have time for a proper fix, so I patched in a workaround.

If you are desperate, this will bypass the issue:

Change the function fmt_numeric in ydata_profiling/report/formatters.py, starting at line 236, from:

@list_args
def fmt_numeric(value: float, precision: int = 10) -> str:
    """Format any numeric value.

    Args:
        value: The numeric value to format.
        precision: The numeric precision

    Returns:
        The numeric value with the given precision.
    """

    fmtted = f"{{:.{precision}g}}".format(value)
   
    for v in ["e+", "e-"]:
        if v in fmtted:
            sign = "-" if v in "e-" else ""
            fmtted = fmtted.replace(v, " × 10<sup>") + "</sup>"
            fmtted = fmtted.replace("<sup>0", "<sup>")
            fmtted = fmtted.replace("<sup>", f"<sup>{sign}")

    return fmtted

to this patched version (consequences unknown):

@list_args
def fmt_numeric(value: float, precision: int = 10) -> str:
    """Format any numeric value.

    Args:
        value: The numeric value to format.
        precision: The numeric precision

    Returns:
        The numeric value with the given precision.
    """
    fmtted = None
    try:
        fmtted = f"{{:.{precision}g}}".format(value)
    except Exception as e:
        fmtted = str(value)+'e+1'

    for v in ["e+", "e-"]:
        if v in fmtted:
            sign = "-" if v in "e-" else ""
            fmtted = fmtted.replace(v, " × 10<sup>") + "</sup>"
            fmtted = fmtted.replace("<sup>0", "<sup>")
            fmtted = fmtted.replace("<sup>", f"<sup>{sign}")

    return fmtted

I think it might be reaching fmt_numeric wrongly, though, as the stack trace comes out of the catch-all "else" branch of one of the fmt_time functions... happy bug hunting.
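
A somewhat narrower variant of the patch above, as a rough sketch only: it catches just TypeError and ValueError and falls back to str(value) without appending 'e+1'. The name fmt_numeric_guarded is made up for illustration, the @list_args decorator is left out so the snippet runs on its own, and it has not been tested inside ydata-profiling.

```python
def fmt_numeric_guarded(value, precision: int = 10) -> str:
    """Hypothetical variant of fmt_numeric that tolerates non-numeric input."""
    try:
        fmtted = f"{{:.{precision}g}}".format(value)
    except (TypeError, ValueError):
        # Value does not accept a numeric format spec: return its plain
        # string form instead of forcing it into scientific notation.
        return str(value)

    # Same exponent rewriting as the original function.
    for v in ["e+", "e-"]:
        if v in fmtted:
            sign = "-" if v == "e-" else ""
            fmtted = fmtted.replace(v, " × 10<sup>") + "</sup>"
            fmtted = fmtted.replace("<sup>0", "<sup>")
            fmtted = fmtted.replace("<sup>", f"<sup>{sign}")

    return fmtted


print(fmt_numeric_guarded(1234.5))
print(fmt_numeric_guarded(1e20))
print(fmt_numeric_guarded(None))
```

Returning early in the except branch keeps the fallback text out of the exponent rewriting, so a value whose string form happens to contain "e+" is left alone.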

kylelt, Sep 13 '23 14:09

Hi @yuzeh

Thanks for creating this issue. Indeed, it seems to be something that only happens with pandas versions 2 and later. I've added it to the backlog of tasks for the next package release.

@kylelt would you be open to contributing a PR?

fabclmnt, Sep 19 '23 01:09

Yeah, though given my availability the ETA will be early December.

kylelt, Sep 20 '23 08:09

This is not isolated to pandas >= 2. I tested with pandas == 1.5.3 and see the same error.

mritonia, Oct 05 '23 13:10

We have been using pandas versions < 2 and weren't able to reproduce this error. It would be useful if you could share more details about your environment (Python version, package versions, etc.).
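
Something like the following would collect those details. It is a minimal sketch using only the standard library; the function name describe_environment and the default package list are just illustrative choices.

```python
import platform
from importlib import metadata


def describe_environment(packages=("ydata-profiling", "pandas", "numpy")):
    """Return one line per item: Python version, platform, package versions."""
    lines = [
        f"python: {platform.python_version()}",
        f"platform: {platform.platform()}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in the current environment.
            version = "not installed"
        lines.append(f"{name}: {version}")
    return lines


print("\n".join(describe_environment()))
```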

fabclmnt, Oct 09 '23 13:10

Running into this issue as well. Is there a fix on the horizon?

jmrichardson, Dec 28 '23 22:12