[WIP] Descriptive statistics

Open andrusha opened this issue 5 years ago • 3 comments

Hackday project in need of feedback.

When working with data, you're interested in:

  • Shape of data, which could be your schema
  • General case, which is addressed by bigSampler
  • Edge cases, which this PR tries to tackle

It's inspired by summary from R.

Todo:

  • [ ] Property and unit tests
  • [ ] Support booleans
  • [ ] Support floating point numbers
  • [ ] Support for different formats (protobuf)?

andrusha avatar Aug 30 '19 15:08 andrusha

Hey, thanks for taking the initiative! We have some internal stuff that overlaps a bit. I've been thinking a lot about the future of that, and it might be good to make sure we're on the same page.

idreeskhan avatar Sep 03 '19 21:09 idreeskhan

@idreeskhan may I ask, have the data profiling tools been open-sourced since then?

andrusha avatar Jun 11 '21 13:06 andrusha

Sorry, this comment got lost in email while I was on vacation back in June. They have not been open-sourced, and we are hesitant to merge this in. Internally, the data profiling tools fit our needs, and merging this would mean taking over its maintenance, which we don't really want to do at the moment.

idreeskhan avatar Aug 30 '21 14:08 idreeskhan