soda-sql icon indicating copy to clipboard operation
soda-sql copied to clipboard

[Feature] Add support for boolean columns

Open HansBambel opened this issue 3 years ago • 4 comments

Is your feature request related to a problem? Please describe. Currently boolean columns are unsupported. I was not able to find a reason for this in the docs. Is there a particular reason why this is not supported?

Describe the solution you'd like Have metrics that are able to use boolean columns. Right now I am using a custom sql_metric such as:

SUM(CASE WHEN is_clean = true THEN 1 ELSE 0 END) as clean_amount,
SUM(CASE WHEN is_clean = true THEN 1 ELSE 0 END)/COUNT(*)::float as clean_percent

Soda SQL Version: 2.1.0b18

HansBambel avatar Nov 08 '21 08:11 HansBambel

Hi Hans,

I can't see what kind of metrics we can have on boolean columns. One thing I can think of is count of true/false. Are there any other aggregations you can think of here?

vijaykiran avatar Nov 19 '21 08:11 vijaykiran

Yes, I could think of the count as you mentioned and also the percentage of trueness.

For example we are having a column for a cleanly parsed document called is_clean. The metric for us would be that this column clean_percent is always above a certain threshold:

tests:
    - clean_percent > 80

HansBambel avatar Nov 19 '21 10:11 HansBambel

Okay. Got it. I think count and percentage go hand in hand.

vijaykiran avatar Nov 19 '21 10:11 vijaykiran

This is a good idea, I am currently using custom metrics here as well. There is the additional issue that sometimes we see different representations for True and False. ('T', 'true', 'True', 1, '1', etc). Also with respect to the percentages, one thing to think about is how you factor in the entries that are invalid or null. So if you have 10 values, 2 are invalid, 4 are true and 4 are false, is that 40% true? Or is it the percent of the valid values which is 50%?

mistynodine avatar Dec 01 '21 18:12 mistynodine