sql-ecology-lesson icon indicating copy to clipboard operation
sql-ecology-lesson copied to clipboard

Introduce JOIN by showing a two table SELECT first

Open rsh249 opened this issue 6 years ago • 2 comments

This is referring to the material in Episode 3 (https://datacarpentry.org/sql-ecology-lesson/03-sql-joins/index.html).

In my experience with SQL it seems that the logical progression to introduce JOINs is to first introduce SELECT statements with more than one table as is done below with the brief reference to implicit JOINs in the WHERE clause.

Lines 88-96 in the .md file for this episode appear as:

Many databases, including SQLite, also support a join through the `WHERE` clause of a query.  
For example, you may see the query above written without an explicit JOIN.

	SELECT surveys.year, surveys.month, surveys.day, species.genus, species.species
	FROM surveys, species
	WHERE surveys.species_id = species.species_id;

For the remainder of this lesson, we'll stick with the explicit use of the `JOIN` keyword for 
joining tables in SQL.  

If this were moved to the beginning of the episode I would propose something like this as a transition towards JOIN syntax:

It is possible to extract data based on conditional matches between two database tables using the `WHERE` clause of a query.  This can be used to match data from to related tables into one result.

	SELECT *
	FROM surveys, species
	WHERE surveys.species_id = species.species_id;

The query above selects and merges rows from both surveys and species where the species_id field matches between the two tables.

SQL, however, provides a formal way to do this through the JOIN syntax. Using JOIN provides a safer, more structured way to bring together data from multiple tables

For me it's not that implicit JOINs are bad or problematic, but rather that explicit JOINs are more formal and probably a bit safer. In personal experience I find simple queries that link more than two tables easier to express as an implicit JOIN rather than a 3+ table explicit JOIN. But maybe that's just my preference.

Maybe if someone can set me fully straight here on why I should always use explicit JOIN statements then that conversation could be useful for developing this lesson. I've looked on Stackoverflow threads only to find that there is little actual difference in efficiency of implicit vs. explicit JOINs.

rsh249 avatar Aug 14 '18 19:08 rsh249