soda-sql
soda-sql copied to clipboard
[Feature] Add support for boolean columns
Is your feature request related to a problem? Please describe. Currently boolean columns are unsupported. I was not able to find a reason for this in the docs. Is there a particular reason why this is not supported?
Describe the solution you'd like Have metrics that are able to use boolean columns. Right now I am using a custom sql_metric such as:
SUM(CASE WHEN is_clean = true THEN 1 ELSE 0 END) as clean_amount,
SUM(CASE WHEN is_clean = true THEN 1 ELSE 0 END)/COUNT(*)::float as clean_percent
Soda SQL Version: 2.1.0b18
Hi Hans,
I can't see what kind of metrics we can have on boolean columns. One thing I can think of is count of true/false. Are there any other aggregations you can think of here?
Yes, I could think of the count as you mentioned and also the percentage of trueness.
For example we are having a column for a cleanly parsed document called is_clean
. The metric for us would be that this column clean_percent
is always above a certain threshold:
tests:
- clean_percent > 80
Okay. Got it. I think count and percentage go hand in hand.
This is a good idea, I am currently using custom metrics here as well. There is the additional issue that sometimes we see different representations for True and False. ('T', 'true', 'True', 1, '1', etc). Also with respect to the percentages, one thing to think about is how you factor in the entries that are invalid or null. So if you have 10 values, 2 are invalid, 4 are true and 4 are false, is that 40% true? Or is it the percent of the valid values which is 50%?