
Add functions for statistical analysis in SQL

Open jasperjiaguo opened this issue 2 years ago • 3 comments

As discussed with @siddharthteotia, consider adding some common statistical analysis methods to the SQL language.

A few examples:

  1. Pearson's coefficient
  2. Sampling (Bernoulli/stratified)
  3. Histogram
  4. Entropy
  5. Linear regression
  6. Logistic regression
  7. SVM
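
To make two of the simpler items concrete, here is a minimal Python sketch of Pearson's coefficient and Bernoulli sampling. It only illustrates the math, not how Pinot would implement either one; the function names `pearson` and `bernoulli_sample` are hypothetical, and stratified sampling is not covered.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered products; the 1/n factors cancel in the final ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def bernoulli_sample(rows, p, seed=None):
    """Keep each row independently with probability p (0 <= p <= 1)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    rng = random.Random(seed)  # a seed makes the sample reproducible
    return [row for row in rows if rng.random() < p]
```

Because each row is kept or dropped independently, the size of a Bernoulli sample varies from run to run and is only approximately `p` times the number of rows.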

jasperjiaguo Apr 08 '22 05:04

Designing a model in which one request runs multiple (sequential) queries for statistical functions. Planning to use mini-batch stochastic gradient descent for regression algorithms 2. 3. 4.
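
For readers unfamiliar with the technique, below is a rough Python sketch of mini-batch stochastic gradient descent fitting a one-variable linear regression. It is not the design proposed here: the function name `minibatch_sgd_linear`, the default hyperparameters, and the in-memory lists are all assumptions for illustration. In a one-request, multiple-queries setup, each mini-batch update could plausibly correspond to one of the sequential queries, but that mapping is also an assumption.

```python
import random


def minibatch_sgd_linear(xs, ys, lr=0.1, epochs=500, batch_size=4, seed=0):
    """Fit y ~ w * x + b by mini-batch SGD on mean squared error."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the rows in a new random order each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = 0.0
            grad_b = 0.0
            for i in batch:
                err = (w * xs[i] + b) - ys[i]
                grad_w += 2 * err * xs[i]  # d/dw of err^2
                grad_b += 2 * err          # d/db of err^2
            # Step along the negative gradient averaged over the batch.
            w -= lr * grad_w / len(batch)
            b -= lr * grad_b / len(batch)
    return w, b
```

For example, with `xs = [i / 10 for i in range(11)]` and `ys = [3 * x + 1 for x in xs]`, the returned `(w, b)` should approach `(3, 1)`. Features on a much larger scale would need a smaller learning rate or normalization first.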

jasperjiaguo Apr 08 '22 05:04

Supporting histogram- and entropy-like computations could also be useful.
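
As a rough illustration of what these two computations involve, here is a small Python sketch of an equal-width histogram and Shannon entropy over a column of values. The function names, the equal-width binning, and the choice of base-2 logarithm are assumptions made for the example, not a proposed Pinot API.

```python
import math
from collections import Counter


def histogram(values, lo, hi, num_bins):
    """Count values into num_bins equal-width bins covering [lo, hi]."""
    if num_bins <= 0 or hi <= lo:
        raise ValueError("need num_bins > 0 and hi > lo")
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        if lo <= v <= hi:  # values outside the range are ignored
            # The upper edge hi falls into the last bin.
            idx = min(int((v - lo) / width), num_bins - 1)
            counts[idx] += 1
    return counts


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of values."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("entropy is undefined for an empty input")
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Both functions need only per-bin or per-value counts, so in principle they can be computed from partial counts that are merged across data partitions.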

siddharthteotia Apr 08 '22 05:04

Is anyone working on supporting sampling? Do we know how much effort it is going to be? Will it be a few days or weeks?

@jasperjiaguo @siddharthteotia

shahharshil46 Apr 04 '24 18:04