When a query is submitted to the database, it is executed in the following order:
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
So why is it important to understand this?
When a query is executed,
First all the tables and their join conditions are executed filtering out invalid references between them.
Then the WHERE clause is applied which again filters the records based on the condition given.
Now you have handful of records which are GROUP-ed
And HAVING clause is applied on the result.
As soon as it is completed, the columns mentioned are selected from the corresponding tables.
And finally sorted using ORDER BY clause.
So when a query is written it should be verified based on this order, otherwise it will lead wrong resultsets.
Example:
Recently I came across the following problem.
The original query was
SELECT DISTINCT so.departure_airport, so.airport_name
FROM special_offers so, special_offers_groups_join sogj
WHERE so.resort_code = '000170' AND so.accomodation_code = '015397'
AND so.departure_date = '2008-01-13'
AND so.nights = '7'
AND sogj.group_id = '1'
AND sogj.special_offer_id = so.special_offer_id
ORDER BY price ASC
Table data:
departure_airport airport_name price
'MAN' 'Manchester Airport' 344
'MAN' 'Manchester Airport' 288
'MAN' 'Manchester Airport' 316
'BRS' 'Bristol' 289
'BRS' 'Bristol' 345
'BRS' 'Bristol' 317
'BHX' 'Birmingham Airport' 343
'BHX' 'Birmingham Airport' 287
'BHX' 'Birmingham Airport' 315
Here the query is executed as follows:
All the records from both the tables are taken.
Conditions mentioned in the WHERE clause is applied fetching say 9 records.
Now the distinct of departure_airport and airport_name alone is taken since price is not in SELECT clause. This gives you three rows randomly selected (mostly in the order of insertion to the table!)
And the result is sorted based on price (incorrect price column due to distinct keyword).
So the output is not ordered properly based on price!
So we need to rewrite the query as follows:
All the records from both the tables should be taken.
Conditions mentioned in the WHERE clause needs to applied on the resulting resultset.
Group the resultset based on departure_airport and airport_name to get the minimum price in each group.
Departure_airport airport_name minprice
'BHX' 'Birmingham Airport' 287
'BRS' 'Bristol' 289
'MAN' 'Manchester Airport' 288
Select the columns departure_airport, airport_name and min(so.price) from the resultset.
And finally apply the ORDER By clause which orders the resultset based on the column min(so.price).
So the query should be:
SELECT so.departure_airport, so.airport_name, MIN(so.price) AS minprice
FROM special_offers so, special_offers_groups_join sogj
WHERE so.resort_code = '000170'
AND so.accomodation_code = '015397'
AND so.departure_date = '2008-01-13'
AND so.nights = '7'
AND sogj.group_id = '1'
AND sogj.special_offer_id = so.special_offer_id
GROUP BY so.departure_airport,so.airport_name
ORDER BY minprice;
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
So why is it important to understand this?
When a query is executed,
First all the tables and their join conditions are executed filtering out invalid references between them.
Then the WHERE clause is applied which again filters the records based on the condition given.
Now you have handful of records which are GROUP-ed
And HAVING clause is applied on the result.
As soon as it is completed, the columns mentioned are selected from the corresponding tables.
And finally sorted using ORDER BY clause.
So when a query is written it should be verified based on this order, otherwise it will lead wrong resultsets.
Example:
Recently I came across the following problem.
The original query was
SELECT DISTINCT so.departure_airport, so.airport_name
FROM special_offers so, special_offers_groups_join sogj
WHERE so.resort_code = '000170' AND so.accomodation_code = '015397'
AND so.departure_date = '2008-01-13'
AND so.nights = '7'
AND sogj.group_id = '1'
AND sogj.special_offer_id = so.special_offer_id
ORDER BY price ASC
Table data:
departure_airport airport_name price
'MAN' 'Manchester Airport' 344
'MAN' 'Manchester Airport' 288
'MAN' 'Manchester Airport' 316
'BRS' 'Bristol' 289
'BRS' 'Bristol' 345
'BRS' 'Bristol' 317
'BHX' 'Birmingham Airport' 343
'BHX' 'Birmingham Airport' 287
'BHX' 'Birmingham Airport' 315
Here the query is executed as follows:
All the records from both the tables are taken.
Conditions mentioned in the WHERE clause is applied fetching say 9 records.
Now the distinct of departure_airport and airport_name alone is taken since price is not in SELECT clause. This gives you three rows randomly selected (mostly in the order of insertion to the table!)
And the result is sorted based on price (incorrect price column due to distinct keyword).
So the output is not ordered properly based on price!
So we need to rewrite the query as follows:
All the records from both the tables should be taken.
Conditions mentioned in the WHERE clause needs to applied on the resulting resultset.
Group the resultset based on departure_airport and airport_name to get the minimum price in each group.
Departure_airport airport_name minprice
'BHX' 'Birmingham Airport' 287
'BRS' 'Bristol' 289
'MAN' 'Manchester Airport' 288
Select the columns departure_airport, airport_name and min(so.price) from the resultset.
And finally apply the ORDER By clause which orders the resultset based on the column min(so.price).
So the query should be:
SELECT so.departure_airport, so.airport_name, MIN(so.price) AS minprice
FROM special_offers so, special_offers_groups_join sogj
WHERE so.resort_code = '000170'
AND so.accomodation_code = '015397'
AND so.departure_date = '2008-01-13'
AND so.nights = '7'
AND sogj.group_id = '1'
AND sogj.special_offer_id = so.special_offer_id
GROUP BY so.departure_airport,so.airport_name
ORDER BY minprice;