selectstarsql
selectstarsql copied to clipboard
Incorrect solution in GROUP BY section
Where
The problem exists at the page: https://selectstarsql.com/longtail.html
.
What
Current solution's code for reference:
SELECT
county,
last_statement IS NOT NULL AS has_last_statement,
COUNT(*)
FROM executions
GROUP BY county, has_last_statement
The given solution's code of the second task in the section The GROUP BY Block
does not, as the task states:
Modify this query to find the number of executions from each county with and without a last statement.
but instead it fits the following description:
- Column 1 is the county
- Column 2 is
1
or0
interpreted as boolean, whether there are anylast_statement is not NULL
for the correspoding county - Column 3 is the total amount of executions for the corresponding county
The post-run indicator Correct / Incorrect
is also wrong analog to the problem above.
Solution
Change the task description to fit the solution
or
change the solution's code to fit the task and change the expected dataset accordingly. An example for the solution's code using case-when could be:
SELECT
county,
COUNT(CASE WHEN last_statement IS NOT NULL THEN 1 ELSE NULL END) AS with_statement,
COUNT(CASE WHEN last_statement IS NULL THEN 1 ELSE NULL END) AS no_statement
FROM executions
GROUP BY county
Cheers and thank you a ton for a great tutorial!
Just want to say that after reading the problem description I came to the exact same solution as the one you've provided above @Kiraub and found myself rather confused after revealing the expected solution.
I'm guessing that the problem is meant to illustrate grouping by multiple columns? Not sure though as the expected solution is rather confusing without further explanation.
Coming here to say the same. This was my solution after interpreting the question:
SELECT
county,
SUM(CASE WHEN last_statement
IS NOT NULL THEN 1 ELSE 0 END)
AS has_last_statement,
SUM(CASE WHEN last_statement
IS NULL THEN 1 ELSE 0 END)
AS no_last_statement
FROM executions
GROUP BY county
When I look at the results of Brazos county, a manual count reports 10 with statement, 2 null. My SQL gives this value and the solution provided by the author provides 0 with a statement and 2 null.
Thanks for the feedback everyone! I see how the solution is confusing given the question. My read is that the question is too contrived because if you were performing this task with real interest in the output, you'd want all the data consolidated into one row per county.
However, @DanielBoa is spot on that the intent of the question is to show grouping by multiple columns. To me, figuring out the grouping structure is the harder part of learning GROUP BY, and once learners understand that, it's relatively easy to build aggregation functions on the group.
I've changed the question in https://github.com/zichongkao/selectstarsql/commit/dbf5cda793111be50590a9d44d6e8225a9161463 to hopefully clarify things.
What do y'all think? @samjewell @Kiraub @phillipamann
After rereading the section again i think that the rephrased task along with the provided column (which to me was a unintuitive to come up with as the learner) nudge the reader in the intended direction to solve the exercise correctly. So looks good to me @zichongkao !