splink
Profiling upgrades/fixes
Wrapping up a few other issues:
- [x] #969
- [ ] #856
- [ ] #131
And suggesting more:
- Profiling continuous/date columns (i.e. distributions rather than sorted histograms): dates are currently counted and sorted by frequency rather than chronologically. It would be more useful to identify a peak or a min/max range of plausible dates. The same applies to fields like "age" or "height", where individual value counts are less useful than the overall distribution. It might also help to include some measures of the average and spread of the distribution, where appropriate.
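
To make the suggestion concrete, here is a rough sketch of the kind of summary this could produce. It is plain Python and is not splink's API: the function names `summarise_continuous` and `summarise_dates` are hypothetical, and grouping dates by year is just one assumed choice of bin width.

```python
from collections import Counter
from datetime import date
from statistics import mean, median, pstdev


def summarise_continuous(values):
    """Min/max, centre and spread for a numeric column (None values are skipped)."""
    clean = sorted(v for v in values if v is not None)
    if not clean:
        return None
    return {
        "count": len(clean),
        "min": clean[0],
        "max": clean[-1],
        "mean": mean(clean),
        "median": median(clean),
        "std_dev": pstdev(clean),  # population standard deviation
    }


def summarise_dates(dates):
    """Date range, peak year, and per-year counts in chronological order."""
    clean = sorted(d for d in dates if d is not None)
    if not clean:
        return None
    per_year = Counter(d.year for d in clean)
    # Assumption: ties for the peak go to the earliest year.
    peak_year = max(per_year, key=per_year.get)
    return {
        "min": clean[0],
        "max": clean[-1],
        "peak_year": peak_year,
        # Sorted by year, not by frequency, so the shape of the distribution is visible.
        "counts_by_year": sorted(per_year.items()),
    }


ages = [20, 30, 40, None]
dobs = [date(1990, 1, 1), date(1985, 5, 5), date(1990, 7, 2), None]
age_summary = summarise_continuous(ages)
dob_summary = summarise_dates(dobs)
```

In practice this would presumably be computed in SQL on the backend rather than in Python, but the outputs (range, peak, chronologically ordered bins, mean/median/spread) are the pieces that would be helpful to chart.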