badfish
Plotting missing values
Code run: mf.plot(kind='pattern', norm = False, threshold=0.0)
This is difficult to interpret. An ideal arrangement of the rows would be as shown here: [image]
I am not sure I can spot what the difference is. R's pattern plot also seems to be sorted by counts. Are you suggesting I should add the perpendicular bar chart for counts?
It's a good idea! Not so straightforward, though. Would you like to send a PR?
We should also increase the font size of the counts on the left.
The differences I was pointing at:
(i) Could you remove the columns that don't have missing values from the graph? They just give more to look at and serve no purpose.
(ii) If Ozone and Temp both had 8 missing values together, then Temp should come before 'Ozone and Temp'. In our case, it seems 'Ozone and Temp' missing is 8 while just Temp is 6. Is that possible? Am I getting something wrong?
(iii) Would it be possible to sort the columns by number of missing values, highest first? So Ozone would come first, Temp perhaps second, etc.
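To make points (i) and (iii) concrete, here is a rough sketch in plain pandas of the table such a plot could be drawn from. It is not badfish code: `missing_patterns` is a hypothetical helper, and the small DataFrame is made-up data that only borrows the column names from this thread.

```python
import numpy as np
import pandas as pd


def missing_patterns(df: pd.DataFrame) -> pd.DataFrame:
    """Count rows per missing-value pattern (hypothetical helper, not part of badfish)."""
    mask = df.isnull()

    # (i) Keep only the columns that have at least one missing value.
    n_missing = mask.sum()
    n_missing = n_missing[n_missing > 0]

    # (iii) Order the columns by number of missing values, highest first.
    cols = n_missing.sort_values(ascending=False, kind="mergesort").index.tolist()
    if not cols:
        # Nothing is missing anywhere, so there is no pattern to tabulate.
        return pd.DataFrame(columns=["count"])

    # One row per distinct pattern (True = missing), with the number of rows showing it.
    patterns = mask[cols].groupby(cols).size().reset_index(name="count")

    # Most frequent patterns first.
    return patterns.sort_values(
        "count", ascending=False, kind="mergesort"
    ).reset_index(drop=True)


# Made-up example data; Wind has no missing values and should be dropped.
df = pd.DataFrame(
    {
        "Ozone": [np.nan, 41.0, np.nan, 18.0, np.nan, 28.0],
        "Temp": [67.0, 72.0, np.nan, 62.0, 56.0, np.nan],
        "Wind": [7.4, 8.0, 12.6, 11.5, 14.3, 14.9],
    }
)
patterns = missing_patterns(df)
print(patterns)
```

Each row of the result is one pattern, so "only Temp missing" and "Ozone and Temp missing" are counted separately. That may also help with (ii): a combined pattern can have a higher count than a single-column pattern, because the rows do not overlap.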
Note: the two images use different datasets.
Should the bar chart be shown with this plot or kept as a separate function? Whichever is functionally more useful.