questdb
questdb copied to clipboard
coalesce returns null on result from division by zero
TLDR Update: in aggregate functions (AVG / SUM / MAX / etc.), QDB internally handles -Infinity and Infinity values as nulls and do not calculate them into the final result. Column with 1000 Infinity values and two values of 1,3 will have the AVG of 2. which is wrong.
I see now behavior is changed and its actually returning null when dividing by zero. which can be treaded with a simple coalesce.
Describe the bug
coalesce should convert the nulls to something else.. in this case 0 but the result is all nulls.
I know some of my tickers may seem weird but this is a real scenario that actually happened to me on my server, I have just reproduced it on demo for your convenience.
To reproduce
try on demo questdb
SELECT *,A/G,coalesce(A/G,0) FROM (SELECT galon_price A,coalesce((galon_price1),0) G FROM (SELECT * FROM gas_prices b lt join (SELECT * FROM gas_prices a) on timestamp))
Expected Behavior
result should be 0
Environment
- **QuestDB version**: demo
- **OS**: ?
- **Browser**: chrome
Additional context
No response
thanks @superichmann for reporting the issue and for the reproducer.
I used your reproducer to create a simpler one:
select coalesce(1/n, 0) from (
select 0::double as n from long_sequence(1)
);
I reckon it's related to the strategy to find the common type between double and long.
ok, it's not due to typecasting. I assume you are using the REST interface. Is that so?
Here is what is going on:
- Floating point division by
0results inInfinityas defined by IEEE 754. Infinityis notnullhence thecoalesce()function also returnsInfinity.- QuestDB REST interface uses JSON to serialize query results, but JSON as defined by ECMA-404 does not know
Infinity. This is what the JSON spec says:
Numeric values that cannot be represented as sequences of digits (such as Infinity and NaN) are not permitted
Hence QuestDB converts all Infinity values to null when serializing query results to JSON.
This is not ideal, but I am afraid the alternative is to use strings to represent double/float values, which is not great either.
As a workaround you can cast the result of the coalesce() function to string in your query:
SELECT *,A/G,coalesce(A/G,0)::string
FROM (
SELECT galon_price A,coalesce((galon_price1),0) G
FROM (SELECT * FROM gas_prices b lt join (SELECT * FROM gas_prices a) on timestamp)
);
Would this work for you?
I am using webui for my tests so same as demo.questdb.com. I have noticed your fix suggestion but the problem starts when I need to compare and convert all kinds of results to Infinity and -Infinity (some results are null as well but actually are -Infinity) so my SELECTs are too complex, such as:
SELECT coalesce(nullif(((COLUMN_1))::string,'Infinity')::double,99999) only to get one double value.. and as well I need to handle -Infinity
I have noticed that a conversion from double to INT gets back INT.MAX(2147483647), so why not do the same for double? see demo example:
SELECT *,(A/G)::char,(A/G)::int,(A/G)::string,coalesce(A/G,0),A/G FROM (SELECT galon_price A,coalesce((galon_price1),0) G FROM (SELECT * FROM gas_prices b lt join (SELECT * FROM gas_prices a) on timestamp))
hello @superichmann,
I'm not sure I understand your last message. You can work with double (and thus Infinity) in all layers of your query. You only cast from double to string at the outermost projection - right before QuestDB converts it to JSON - as that's the only part where Infinity is converted to null as I explained above.
You wrote:
I have noticed that a conversion from double to INT gets back INT.MAX(2147483647), so why not do the same for double?
I am not sure if I understand it well. Are you proposing the REST interface should not serialize Infinity as null, but as some kind of sentinel value like Integer.MAX_VALUE or Long.MAX_VALUE? Which one would you choose and why?
fwiw, in JavaScript JSON.stringify(Infinity) also returns null. I take it as an indication the current JSON serialization behaviour is not completely baseless:
- questdb serializes
Infinitytonullby default - as a user you can use explicit casting (to string, long, up to you) if you need a different behaviour.
to be honest I have no idea I just showing something I saw on WEBUI. As far as I see it if a value is not null then it should not be presented to the user as null.
You have much more understanding then me about it so feel free to handle this issue as you see :]
well, that's the limitation of the JSON format. It does not allow numeric values which cannot be represented as sequences of digits. so there is no good option here - it's all about trade-offs.
I'm closing this ticket for now, we might revisit this in the future. if anyone feels strongly about this then feel free to add +1 or re-open the ticket.
Hi,
Maybe I am still missing something but the main thing I am trying to say is that functions like AVG or SUM ignore nulls - ignoring missing data
I think that the internal mechanisms in QDB treat these Infinity values as null values and not calculating them, instead of calculating them as infinity or as a very high number.
Could you check this on your side? is this the expected behavior? the end result is that queries return unrealistic values according to the calculation.
I might as well not be using the correct aggregate function, please correct me if I am wrong (ksum nsum sum) maybe I need to use one of these.
I mean, you really don't think its weird?
try this query on https://demo.questdb.io/
SELECT extra, extra/0,CASE WHEN extra/0 > 1 THEN 99999 ELSE -111 END from trips
Can you please instruct me on how can I handle this situation on my queries? I run calculations on 3 million rows, each addition to the command such as coalesce or infinity or null or 0 checks is adding much time to the queries :[
How can I check for these Infinity -Infinity values in query?
there is as well different behavior for number sent through query and numbers retrieved from columns. when sending numbers division by zero is always null but when data is retrieved from qdb then sometimes its Infinity and sometimes null - if 0/0 then null, if 9/0 then infinity.
This is from my check..
now if COUNT(*) would have worked in all calculated queries and not only in top level query I could have circumvent this count only non nulls behavior..
SELECT COUNT(*)/AVG(dropoff_longitude)+COUNT(*)*SUM(dropoff_longitude) from trips