Provide an executive summary in the reports

Open richierocks opened this issue 8 years ago • 5 comments

It would be useful to have an executive summary page near the start of the report generated by clean that provides an overview of any problems found. (This is particularly useful for larger datasets.)

This summary could contain

  • How many columns of each type the dataset contains.
  • The distribution of the fraction of missing values for each column.
  • The names of the top 5 most problematic columns.
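As a rough sketch of what such a summary might compute, here is a base R version. It is not dataMaid's implementation: `summarize_problems` is a hypothetical helper name, and "most problematic" is assumed here to mean the highest fraction of missing values.

```r
# Hypothetical helper for illustration; not part of dataMaid.
summarize_problems <- function(data, n_top = 5) {
  # Count columns by their first class, e.g. "numeric" or "character".
  type_counts <- table(vapply(data, function(x) class(x)[1], character(1)))

  # Fraction of missing values in each column.
  missing_frac <- vapply(data, function(x) mean(is.na(x)), numeric(1))

  # Assumption: "most problematic" = highest fraction of missing values.
  worst <- head(sort(missing_frac, decreasing = TRUE), n_top)

  list(
    type_counts = type_counts,
    missing_frac = missing_frac,
    top_columns = names(worst)
  )
}

# Toy data, made up for this example.
toy <- data.frame(
  id = 1:4,
  age = c(21, NA, 35, NA),
  group = c("a", "b", NA, "b"),
  stringsAsFactors = FALSE
)
s <- summarize_problems(toy, n_top = 2)
print(s$type_counts)
print(s$top_columns)
```

A real report would presumably rank columns by failed checks rather than by missing values alone.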

richierocks avatar Mar 27 '17 06:03 richierocks

Something along those lines was added about a week ago. If you have suggestions or ideas for what to include in the table, please let us know.

ekstroem avatar Mar 27 '17 07:03 ekstroem

I've just installed the development version, and I see that the number of rows and columns in the dataset are shown, along with a table of which checks were performed.

The point of an executive summary would be to minimize the time for readers to find the big problems with their dataset. So a table of the top 5 columns ordered by most rows failing their checks would be useful.

You could also look at individual checks, and see which columns have the most rows failing that particular check. For example, the top 5 columns failing the "missing values" check.

Hyperlinks to the sections describing those columns would be a bonus.

richierocks avatar Mar 29 '17 14:03 richierocks

Oh, hang on, I see that there is a summary table with missing values by column now too.

I want something like this, but ordered in decreasing number of missing values. And the same for other things like number of duplicates.
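Roughly, the ordering I have in mind looks like the following base R sketch. `problem_table` is a hypothetical name, not a dataMaid function, and counting duplicates as repeated non-missing values is an assumption.

```r
# Hypothetical helper for illustration; not part of dataMaid.
problem_table <- function(data) {
  out <- data.frame(
    variable = names(data),
    # Number of missing values per column.
    n_missing = vapply(data, function(x) sum(is.na(x)), integer(1)),
    # Assumption: a duplicate is a non-missing value that repeats an
    # earlier value in the same column.
    n_duplicates = vapply(
      data,
      function(x) sum(duplicated(x[!is.na(x)])),
      integer(1)
    ),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
  # Worst columns first.
  out[order(out$n_missing, decreasing = TRUE), ]
}

# Toy data, made up for this example.
toy <- data.frame(
  id = 1:4,
  age = c(21, NA, 35, NA),
  group = c("a", "b", NA, "b"),
  stringsAsFactors = FALSE
)
tab <- problem_table(toy)
print(tab)
```

Sorting on `n_duplicates` instead gives the equivalent view for duplicates.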

richierocks avatar Mar 29 '17 14:03 richierocks

The names in the table are already hyperlinks to the relevant place in the document, but they aren't currently underlined. For HTML output, it might be an idea to render this table as an HTML datatable where the user can sort the table by different columns.

ekstroem avatar Mar 29 '17 20:03 ekstroem

I looked a bit into including this table as a DT widget, so that might be a path to pursue. However, inserting widgets didn't work out of the box, so this might take some extra time to implement.

ekstroem avatar Oct 03 '18 10:10 ekstroem