Loading report without access to the original dataframe
Missing functionality
Proposed feature
I am computing profiles for several hundred CSV files. These files will not change anytime soon, and accessing them is somewhat cumbersome. In this context, saving a profile via the dump() method is ideal for caching. However, it seems that I still need the original dataframe when using the load() or loads() methods, which defeats the purpose of caching. Would it be possible to load a profile report without needing the original dataset file?
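To make the caching idea concrete, here is a minimal, standard-library-only sketch of the workflow being requested: compute a summary once, serialize it, and later restore it without touching the source data. This is not the library's dump()/load() implementation; the helper names `profile_csv`, `dump_profile`, and `load_profile` are hypothetical and exist only for illustration.

```python
import csv
import io
import pickle
from statistics import StatisticsError, mean


def profile_csv(text):
    """Compute a tiny per-column summary from CSV text (hypothetical helper)."""
    rows = list(csv.DictReader(io.StringIO(text)))
    summary = {}
    for column in (rows[0].keys() if rows else []):
        values = [row[column] for row in rows]
        present = [v for v in values if v != ""]
        stats = {"count": len(values), "missing": len(values) - len(present)}
        try:
            # Only numeric columns get a mean; other columns are skipped.
            stats["mean"] = mean(float(v) for v in present)
        except (ValueError, StatisticsError):
            pass
        summary[column] = stats
    return summary


def dump_profile(summary):
    """Serialize the summary only; the raw rows are never stored."""
    return pickle.dumps(summary)


def load_profile(blob):
    """Restore a summary from bytes without access to the original CSV."""
    return pickle.loads(blob)


csv_text = "a,b\n1,x\n3,\n"
cached = dump_profile(profile_csv(csv_text))

# Later, possibly in another process: the CSV is no longer needed.
restored = load_profile(cached)
```

The point of the sketch is that the serialized object carries everything needed to render or inspect the summary, so loading it should not depend on the dataframe that produced it.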
Alternatives considered
Loading a binary report without the need of the original dataset.
Additional context
Should be (made) possible. @psorianom are you interested in trying to contribute a PR?
Yes of course! I will look into it and propose a PR.
I see that we can currently dump and load profile reports without needing the dataframe. However, it is not possible to use a loaded profile in the compare method.
I am dumping last month's profile and comparing it with this month's profile. This month's profile has the dataframe, but last month's loaded profile does not. Can we compare without the dataframe? Are there any alternatives? Please advise.
I am getting the error below:
compare_reports.py in validate_reports(reports, configs)
    185     is_df_available = [r.df is not None for r in reports]  # type: ignore
    186     if not all(is_df_available):
--> 187         raise ValueError("Reports where not initialized with a DataFrame.")
    188
    189     if isinstance(reports[0], ProfileReport):
ValueError: Reports where not initialized with a DataFrame.
@sbrugman , @psorianom
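As a possible interim workaround for the month-over-month case above, one could compare the summary statistics directly instead of going through the compare method. The sketch below is an assumption-laden illustration, not part of the library: `compare_summaries` is a hypothetical helper, and the summaries are plain dictionaries of per-column statistics rather than ProfileReport objects.

```python
import pickle


def compare_summaries(previous, current):
    """Return numeric differences for columns present in both summaries.

    Hypothetical helper: works on plain dicts of per-column statistics,
    so neither side needs its original dataframe.
    """
    diff = {}
    for column in sorted(set(previous) & set(current)):
        changes = {}
        for stat, old in previous[column].items():
            new = current[column].get(stat)
            both_numeric = isinstance(old, (int, float)) and isinstance(new, (int, float))
            if both_numeric and new != old:
                changes[stat] = new - old
        if changes:
            diff[column] = changes
    return diff


# Last month's summary was cached as bytes; the data behind it is gone.
last_month_blob = pickle.dumps({"price": {"count": 100, "mean": 10.0}})
last_month = pickle.loads(last_month_blob)

# This month's summary was computed from data that is still available.
this_month = {
    "price": {"count": 120, "mean": 12.5},
    "qty": {"count": 120, "mean": 3.0},
}

diff = compare_summaries(last_month, this_month)
```

Columns that exist in only one of the two summaries (here `qty`) are ignored by this sketch; a fuller comparison would probably want to report them as added or removed.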