selectstarsql icon indicating copy to clipboard operation
selectstarsql copied to clipboard

Incorrect solution in GROUP BY section

Open Kiraub opened this issue 5 years ago • 2 comments

Where

The problem exists at the page: https://selectstarsql.com/longtail.html.

What

Current solution's code for reference:

SELECT
  county,
  last_statement IS NOT NULL AS has_last_statement,
  COUNT(*)
FROM executions
GROUP BY county, has_last_statement

The given solution's code of the second task in the section The GROUP BY Block does not, as the task states:

Modify this query to find the number of executions from each county with and without a last statement.

but instead it fits the following description:

  • Column 1 is the county
  • Column 2 is 1 or 0 interpreted as boolean, whether there are any last_statement is not NULL for the correspoding county
  • Column 3 is the total amount of executions for the corresponding county

The post-run indicator Correct / Incorrect is also wrong analog to the problem above.

Solution

Change the task description to fit the solution

or

change the solution's code to fit the task and change the expected dataset accordingly. An example for the solution's code using case-when could be:

SELECT
  county,
  COUNT(CASE WHEN last_statement IS NOT NULL THEN 1 ELSE NULL END) AS with_statement,
  COUNT(CASE WHEN last_statement IS NULL THEN 1 ELSE NULL END) AS no_statement
FROM executions
GROUP BY county

Cheers and thank you a ton for a great tutorial!

Kiraub avatar Jan 15 '20 14:01 Kiraub

Just want to say that after reading the problem description I came to the exact same solution as the one you've provided above @Kiraub and found myself rather confused after revealing the expected solution.

I'm guessing that the problem is meant to illustrate grouping by multiple columns? Not sure though as the expected solution is rather confusing without further explanation.

DanielBoa avatar Mar 08 '20 18:03 DanielBoa

Coming here to say the same. This was my solution after interpreting the question:

SELECT
  county,
  SUM(CASE WHEN last_statement 
	  IS NOT NULL THEN 1 ELSE 0 END) 
	  AS has_last_statement,
  SUM(CASE WHEN last_statement 
	  IS NULL THEN 1 ELSE 0 END) 
	  AS no_last_statement
FROM executions
GROUP BY county

When I look at the results of Brazos county, a manual count reports 10 with statement, 2 null. My SQL gives this value and the solution provided by the author provides 0 with a statement and 2 null.

phillipamann avatar Jun 15 '20 15:06 phillipamann

Thanks for the feedback everyone! I see how the solution is confusing given the question. My read is that the question is too contrived because if you were performing this task with real interest in the output, you'd want all the data consolidated into one row per county.

However, @DanielBoa is spot on that the intent of the question is to show grouping by multiple columns. To me, figuring out the grouping structure is the harder part of learning GROUP BY, and once learners understand that, it's relatively easy to build aggregation functions on the group.

I've changed the question in https://github.com/zichongkao/selectstarsql/commit/dbf5cda793111be50590a9d44d6e8225a9161463 to hopefully clarify things.

What do y'all think? @samjewell @Kiraub @phillipamann

zichongkao avatar Dec 19 '22 16:12 zichongkao

After rereading the section again i think that the rephrased task along with the provided column (which to me was a unintuitive to come up with as the learner) nudge the reader in the intended direction to solve the exercise correctly. So looks good to me @zichongkao !

Kiraub avatar Dec 19 '22 19:12 Kiraub