DNAunion2000
01-13-2004, 03:51 PM
In SQL, there is an old way to join tables that uses the WHERE clause, and a new way that uses the FROM clause. Here is an example of the old style.
SELECT publishers.name, publishers.id, books.title
FROM publishers, books
WHERE publishers.id = books.publisher_id
ORDER BY publishers.name, books.title ;
Specifying joins in the WHERE clause is considered, at least by some, to be poor style these days. First, if one or more restrictions are being applied to the rows being retrieved from one or both of the tables, then the WHERE clause gets cluttered, having both the join and restrictions bunched together: clarity is improved by moving the join condition out of the WHERE clause and into the FROM clause.
Here’s how this could be done
SELECT publishers.name, publishers.id, books.title
FROM publishers INNER JOIN books ON publishers.id = books.publisher_id
ORDER BY publishers.name, books.title ;
************************************************
NOTE: Even the above improved command can be improved, by using table aliases, such as:
SELECT p.name, p.id, b.title
FROM publishers p INNER JOIN books b ON p.id = b.publishers_id
ORDER BY p.name, b.title ;
The benefits of using table aliases will be taken up in a separate thread.
************************************************
For several reasons, most sources recommend using the newer syntax for joins.
”The join in Figure 7-2 is accomplished using the ANSI JOIN SQL syntax, introduced in SQL Server version 6.5. Many examples and applications still continue to use the old-style JOIN syntax, which is shown below. (The term “old-style JOIN” is actually used in the SQL-92 specification.) The ANSI JOIN syntax is based on ANSI SQL-92. The main differences between the two types of join formulations are:
[1]The ANSI JOIN actually uses the keyword JOIN.
[2]The ANSI JOIN segregates join conditions from search conditions.
The ANSI JOIN syntax specifies the JOIN conditions in the ON clause (one for each pair of tables), and the search conditions are specified in the WHERE clause – for example, WHERE state <> ‘CA’. Although slightly more verbose, the explicit JOIN syntax is more readable.
…
One of the most common errors new SQL users make when using the old-style JOIN syntax is not specifying a join condition. Omitting the WHERE clause is still a valid SQL request and causes an answer to be returned. However, the result set is likely not what the user wanted. In Figure 7-3, omitting the WHERE clause would return the Cartesian product of the three tables: it would generate every possible combination of rows between them. Although in a few unusual cases you might want all permutations, usually this is just a user error. The number of rows returned is huge and typically doesn’t represent anything meaningful. For example, the Cartesian product of the three small tables here (authors, titles, and titleauthor each has less than 26 rows) generates 10,350 rows of (probably) meaningless output.
Using the ANSI JOIN syntax, it’s impossible to accidentally return a Cartesian product – one reason to use ANSI JOINs exclusively.” (Inside Microsoft SQL Server 7.0, Ron Soukup and Kalen Delaney, Microsoft Press, 1999, p305-306)
“The concept of a joined table (or joined relation) was incorporated into SQL to permit users to specify a table resulting from a join operation in the FROM clause of a query. This construct may be easier to comprehend than mixing together all the select and join conditions in the WHERE clause.” (Fundamentals of Database Systems: Fourth Edition, Ramez Elmasri & Shamkant B. Navathe, Addison Wesley, 2004, p237)
”When you join tables, you can use either the older SQL ’89 ANSI syntax, or the newer SQL ’99 ANSI join syntax.
…
SQL ’99 ANSI Join Syntax
The join statements for the newer ANSI syntax show up in the FROM clause of the SELECT statement:
…
The WHERE clause selects rows from the joined rows to be returned. You can choose three types of ANSI join statements: INNERJOIN, OUTERJOIN, and CROSSJOIN.
SQL ’89 Join Syntax
Although SQL Server 2000 still supports the older SQL ’89 syntax, we suggest you use the newer ANSI standard syntax instead:” (Teach Yourself Microsoft SQL Server 2000 in 21 Days: Second Edition, Richard Waymire and Rick Sawtell, SAMS, 2003, p361-362)
Finally, in the section on joins (pages 195 – 201), the book “Mastering SQL Server 2000” (Mike Gunderloy & Joseph L. Jorden, Sybex, 2000) does not even mention the old-style join using a WHERE clause; all of its joins use the new-style join using the FROM clause.
So while it is not an error to implement joins using the WHERE clause, it (1) is the old style of doing joins, which if nothing else means it may be dropped from the SQL specification in the near future, (2) can lead to a cluttered up WHERE clause, making that clause more difficult to comprehend, (3) has reduced clarity compared to having the join condition on its own, and (4) makes the process of joining tables more error prone (at least for those fairly new to SQL).
SELECT publishers.name, publishers.id, books.title
FROM publishers, books
WHERE publishers.id = books.publisher_id
ORDER BY publishers.name, books.title ;
Specifying joins in the WHERE clause is considered, at least by some, to be poor style these days. First, if one or more restrictions are being applied to the rows being retrieved from one or both of the tables, then the WHERE clause gets cluttered, having both the join and restrictions bunched together: clarity is improved by moving the join condition out of the WHERE clause and into the FROM clause.
Here’s how this could be done
SELECT publishers.name, publishers.id, books.title
FROM publishers INNER JOIN books ON publishers.id = books.publisher_id
ORDER BY publishers.name, books.title ;
************************************************
NOTE: Even the above improved command can be improved, by using table aliases, such as:
SELECT p.name, p.id, b.title
FROM publishers p INNER JOIN books b ON p.id = b.publishers_id
ORDER BY p.name, b.title ;
The benefits of using table aliases will be taken up in a separate thread.
************************************************
For several reasons, most sources recommend using the newer syntax for joins.
”The join in Figure 7-2 is accomplished using the ANSI JOIN SQL syntax, introduced in SQL Server version 6.5. Many examples and applications still continue to use the old-style JOIN syntax, which is shown below. (The term “old-style JOIN” is actually used in the SQL-92 specification.) The ANSI JOIN syntax is based on ANSI SQL-92. The main differences between the two types of join formulations are:
[1]The ANSI JOIN actually uses the keyword JOIN.
[2]The ANSI JOIN segregates join conditions from search conditions.
The ANSI JOIN syntax specifies the JOIN conditions in the ON clause (one for each pair of tables), and the search conditions are specified in the WHERE clause – for example, WHERE state <> ‘CA’. Although slightly more verbose, the explicit JOIN syntax is more readable.
…
One of the most common errors new SQL users make when using the old-style JOIN syntax is not specifying a join condition. Omitting the WHERE clause is still a valid SQL request and causes an answer to be returned. However, the result set is likely not what the user wanted. In Figure 7-3, omitting the WHERE clause would return the Cartesian product of the three tables: it would generate every possible combination of rows between them. Although in a few unusual cases you might want all permutations, usually this is just a user error. The number of rows returned is huge and typically doesn’t represent anything meaningful. For example, the Cartesian product of the three small tables here (authors, titles, and titleauthor each has less than 26 rows) generates 10,350 rows of (probably) meaningless output.
Using the ANSI JOIN syntax, it’s impossible to accidentally return a Cartesian product – one reason to use ANSI JOINs exclusively.” (Inside Microsoft SQL Server 7.0, Ron Soukup and Kalen Delaney, Microsoft Press, 1999, p305-306)
“The concept of a joined table (or joined relation) was incorporated into SQL to permit users to specify a table resulting from a join operation in the FROM clause of a query. This construct may be easier to comprehend than mixing together all the select and join conditions in the WHERE clause.” (Fundamentals of Database Systems: Fourth Edition, Ramez Elmasri & Shamkant B. Navathe, Addison Wesley, 2004, p237)
”When you join tables, you can use either the older SQL ’89 ANSI syntax, or the newer SQL ’99 ANSI join syntax.
…
SQL ’99 ANSI Join Syntax
The join statements for the newer ANSI syntax show up in the FROM clause of the SELECT statement:
…
The WHERE clause selects rows from the joined rows to be returned. You can choose three types of ANSI join statements: INNERJOIN, OUTERJOIN, and CROSSJOIN.
SQL ’89 Join Syntax
Although SQL Server 2000 still supports the older SQL ’89 syntax, we suggest you use the newer ANSI standard syntax instead:” (Teach Yourself Microsoft SQL Server 2000 in 21 Days: Second Edition, Richard Waymire and Rick Sawtell, SAMS, 2003, p361-362)
Finally, in the section on joins (pages 195 – 201), the book “Mastering SQL Server 2000” (Mike Gunderloy & Joseph L. Jorden, Sybex, 2000) does not even mention the old-style join using a WHERE clause; all of its joins use the new-style join using the FROM clause.
So while it is not an error to implement joins using the WHERE clause, it (1) is the old style of doing joins, which if nothing else means it may be dropped from the SQL specification in the near future, (2) can lead to a cluttered up WHERE clause, making that clause more difficult to comprehend, (3) has reduced clarity compared to having the join condition on its own, and (4) makes the process of joining tables more error prone (at least for those fairly new to SQL).