healthcareai-py
Add Pandas-Profiling data profiler
Include pandas-profiling
Desired User Experience (Open to Feedback)
- A user (who presumably has data in a dataframe) creates a trainer: `trainer = SupervisedModelTrainer(...all the args)`
- The user can then call the method `trainer.data_profile_report()` (or a better, more logical name).
- This method calls pandas-profiling and creates the report. By default it should save the HTML report with an ISO 8601 timestamp as the name, for example `profile_report_2017-12-10T05-33-53.html`.
- This method should print to the console the name of the saved file and its full path.
- If the user specifies a `filename=` argument, the method should save the report under that name instead.
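To make the naming and saving behavior concrete, here is a minimal sketch of that part only. The names `default_report_name` and `save_profile_report` are hypothetical, and the function assumes the report HTML has already been rendered by the profiler and is passed in as a string; how the trainer method obtains that HTML is left open.

```python
from datetime import datetime
from pathlib import Path


def default_report_name(now=None):
    """Build a name like profile_report_2017-12-10T05-33-53.html.

    Hypothetical helper. Hyphens stand in for the colons of the ISO 8601
    time so the name is also valid on Windows.
    """
    now = now or datetime.now()
    return "profile_report_{}.html".format(now.strftime("%Y-%m-%dT%H-%M-%S"))


def save_profile_report(html, filename=None):
    """Write already-rendered report HTML to disk and print where it went.

    Hypothetical helper: `html` is assumed to come from the profiler.
    If `filename` is omitted, a timestamped default name is used.
    """
    path = Path(filename or default_report_name()).resolve()
    path.write_text(html, encoding="utf-8")
    print("Saved profile report: {}".format(path.name))
    print("Full path: {}".format(path))
    return path
```

Keeping this as a module-level function rather than a method would also let a trainer method delegate to it, which fits the idea of calling the profiler without instantiating a trainer.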
Other Notes
- Document this new method (including a small interesting screenshot) in the Profiling Your Data document.
- Because this profile report is so awesome we should credit it in our readme (or other suitable place).
- It would be nice if the data profiler can be called after importing healthcareai without requiring the user to instantiate a trainer.
- Create a simple test to ensure that an html document is saved when calling the method. If there is anything else reasonable to test (without testing the profiler itself), feel free to do so.
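A rough sketch of what such a test could look like, using `unittest` and a temporary directory. `write_report` is a hypothetical stand-in for whatever the real method ends up being called; the test only checks that an HTML file appears on disk and does not inspect the profiler's output.

```python
import os
import tempfile
import unittest


def write_report(html, filename):
    """Hypothetical stand-in for the real report method: saves html to filename."""
    with open(filename, "w", encoding="utf-8") as handle:
        handle.write(html)
    return filename


class TestProfileReportIsSaved(unittest.TestCase):
    def test_html_file_is_created(self):
        # A temporary directory keeps the test from leaving files behind.
        with tempfile.TemporaryDirectory() as tmp:
            target = os.path.join(tmp, "report.html")
            write_report("<html><body>profile</body></html>", target)
            self.assertTrue(os.path.isfile(target))
            # Check only that something HTML-like was written,
            # not the content produced by the profiler itself.
            with open(target, encoding="utf-8") as handle:
                self.assertIn("<html", handle.read())
```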
Is there code-sharing potential with #420?