
Why do the stdev and var aggregate functions not ignore null values like sum, min, max, and avg?

Open ngjaying opened this issue 1 year ago • 9 comments

Discussed in https://github.com/lf-edge/ekuiper/discussions/2589

Originally posted by EscanorUt January 26, 2024 Hello,

I've noticed that in eKuiper the stdev and var aggregate functions seem to include null values in their calculations, unlike other aggregate functions such as sum, min, max, and avg, which ignore null values. Can anyone shed light on the reason behind this behavior, and on whether there are any workarounds or alternative approaches for handling null values with the stdev and var functions in eKuiper?

Thank you

ngjaying avatar Jan 26 '24 13:01 ngjaying

Hello @ngjaying sir, I am Prabal Pratap Singh Rathore, a second-year B.Tech student in Artificial Intelligence and Data Science. I am good at Python and several libraries, with experience in ML and DL using Keras, TensorFlow, and PyTorch. I want to look into this issue. I am currently exploring eKuiper, so please assign this Good First Issue to me.

BNNARAJ avatar May 10 '24 06:05 BNNARAJ

We have to see how the stdev and var functions are implemented internally, because aggregate functions exclude null values from their calculations, but these functions involve statistical calculations that can be misled by null values. So, could you please point me to the directory or code file where they are implemented?

BNNARAJ avatar May 11 '24 10:05 BNNARAJ

@BNNARAJ Sorry for the late response. The functions are in https://github.com/lf-edge/ekuiper/blob/master/internal/binder/function/funcs_agg.go. Actually, you can do a search in the codebase to find it next time.

ngjaying avatar May 23 '24 01:05 ngjaying

Thank you sir

BNNARAJ avatar May 24 '24 07:05 BNNARAJ

Hello @ngjaying sir, in the definition of the stddev and var functions, we can see the cast call that rebuilds the float64 slice without null values: "float64Slice, err := cast.ToFloat64Slice(arg0, cast.CONVERT_SAMEKIND, cast.IGNORE_NIL)". So it looks like the functions do accept null values, but before calculating they rebuild the input array and drop the nulls by using cast.IGNORE_NIL. As I am a beginner, please confirm that my understanding and approach are correct.

BNNARAJ avatar May 25 '24 13:05 BNNARAJ

Hi @BNNARAJ, maybe it was fixed. Could you try adding a unit test case to confirm that? If that's the case, we can push the test case as a PR to close this issue. Thanks!

ngjaying avatar May 28 '24 08:05 ngjaying

Hello @ngjaying sir, I saw there is already a test case for that. I added one more nil input to it, added a comment to that test case to make it easier to recognize, and made a PR.

BNNARAJ avatar May 29 '24 06:05 BNNARAJ

@EscanorUt It looks like there is no problem with null values. Do you still encounter the issue? If so, could you please provide a test case?

ngjaying avatar May 29 '24 09:05 ngjaying

Hello @ngjaying This issue was fixed in https://github.com/lf-edge/ekuiper/pull/2748 so you can close it. Thank you

BlancoMY avatar May 30 '24 07:05 BlancoMY