Loading report without access to the original dataframe

Open psorianom opened this issue 4 years ago • 4 comments

Missing functionality

Proposed feature

I am computing the profiles of several hundred CSV files. These files are not changing anytime soon, and accessing them is a little cumbersome. In this context, saving a profile via the dump() method is ideal for caching. That said, it seems that I still need the original dataframe when using the load() or loads() methods, which defeats the purpose of caching. Would it be possible to load the profile report without needing the original dataset file?

Alternatives considered

Loading a binary report without the need of the original dataset.

Additional context
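A minimal sketch of the caching workflow described above, not a confirmed recipe. The helper names `cache_path_for` and `get_profile` are hypothetical, and pandas is assumed for reading the CSV. It also assumes that a `ProfileReport` created without a dataframe can load a dump, which a later comment in this thread reports as working, and that dump() writes files with a `.pp` suffix.

```python
import hashlib
from pathlib import Path


def cache_path_for(csv_path, cache_dir):
    """Map a CSV path to a stable cache file name (hypothetical helper)."""
    csv_path = Path(csv_path)
    # Hash the resolved path so files with the same name in different
    # directories do not collide in the cache.
    digest = hashlib.sha256(str(csv_path.resolve()).encode()).hexdigest()[:16]
    # Assumption: dump() stores profiles with a ".pp" suffix.
    return Path(cache_dir) / f"{csv_path.stem}-{digest}.pp"


def get_profile(csv_path, cache_dir):
    """Return a cached profile if one exists, otherwise compute and cache it."""
    # Imported lazily so the path helper works without these packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    cache_file = cache_path_for(csv_path, cache_dir)
    if cache_file.exists():
        # Assumption: loading into an empty ProfileReport works in the
        # installed version, so the CSV is never touched on a cache hit.
        return ProfileReport().load(cache_file)

    report = ProfileReport(pd.read_csv(csv_path))
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    report.dump(cache_file)  # serializing computes the profile first
    return report
```

With this layout, the slow CSV access happens only on the first call for each file; later calls read the dump alone.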

psorianom, Feb 18 '21 16:02

Should be (made) possible. @psorianom are you interested in trying to contribute a PR?

sbrugman, Feb 22 '21 17:02

Yes, of course! I will look into it and propose a PR.

psorianom, Feb 22 '21 21:02

I see that it is currently possible to dump and load profile reports without the dataframe. However, a loaded profile cannot be used in the compare method.

I am dumping last month's profile and comparing it with this month's profile. This month's profile has the dataframe, but last month's loaded profile does not. Can we compare without the dataframe? Are there any alternatives? Please advise.

I am getting the error below:

compare_reports.py in validate_reports(reports, configs)
    185     is_df_available = [r.df is not None for r in reports]  # type: ignore
    186     if not all(is_df_available):
--> 187         raise ValueError("Reports where not initialized with a DataFrame.")
    188
    189     if isinstance(reports[0], ProfileReport):

ValueError: Reports where not initialized with a DataFrame.
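To make the failing condition concrete, here is a small stand-alone sketch that mirrors only the check visible in the traceback (lines 185 to 187). `StubReport` and `check_dataframes` are hypothetical stand-ins, not part of ydata-profiling; the real reports carry far more state, and only the `df` attribute matters for this check.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class StubReport:
    """Hypothetical stand-in for a profile report; only `df` is modeled."""
    name: str
    df: Optional[Any] = None


def check_dataframes(reports: List[StubReport]) -> None:
    # Same condition as in the traceback: every report must hold a dataframe.
    is_df_available = [r.df is not None for r in reports]
    if not all(is_df_available):
        raise ValueError("Reports where not initialized with a DataFrame.")


# A report built from data keeps its dataframe ...
this_month = StubReport("this month", df=[{"a": 1}])
# ... while one restored from a dump has df set to None.
last_month = StubReport("last month")

try:
    check_dataframes([this_month, last_month])
except ValueError as err:
    message = str(err)
```

Under this reading, a single report with `df is None` is enough to trigger the error, no matter how complete its stored description is.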

Ananthbabu86, Sep 21 '23 06:09

@sbrugman, @psorianom

Ananthbabu86, Sep 21 '23 06:09