Choosing Rows with the HAVING Clause
The HAVING clause sets conditions on the GROUP BY clause similar to the way WHERE interacts with SELECT. The WHERE search condition is applied before the grouping operation occurs; the HAVING search condition is applied after the grouping operation occurs. The HAVING syntax is similar to the WHERE syntax, except HAVING can contain aggregate functions. HAVING clauses can reference any of the items that appear in the select list.
This query finds publishers whose year-to-date sales are greater than $40,000:
USE pubs
SELECT pub_id, total = SUM(ytd_sales)
FROM titles
GROUP BY pub_id
HAVING SUM(ytd_sales) > 40000
Here is the result set:
pub_id total
------ -----------
0877 44219
(1 row(s) affected)
To make sure there are at least six books involved in the calculations for each publisher, this example uses HAVING COUNT(*) > 5 to eliminate the publishers that return totals for fewer than six books:
USE pubs
SELECT pub_id, total = SUM(ytd_sales)
FROM titles
GROUP BY pub_id
HAVING COUNT(*) > 5
Here is the result set:
pub_id total
------ -----------
0877 44219
1389 24941
(2 row(s) affected)
Understanding the correct sequence in which the WHERE, GROUP BY, and HAVING clauses are applied helps in coding efficient queries:
- The WHERE clause is used to filter the rows that result from the operations specified in the FROM clause.
- The GROUP BY clause is used to group the output of the WHERE clause.
- The HAVING clause is used to filter rows from the grouped result.
For any search conditions that could be applied either before or after the grouping operation, it is more efficient to specify them in the WHERE clause. This reduces the number of rows that have to be grouped. The only search conditions that should be specified in the HAVING clause are those search conditions that must be applied after the grouping operation has been performed.
The Microsoft® SQL Server™ 2000 query optimizer can deal with most of these conditions. If the query optimizer determines that a HAVING search condition can be applied before the grouping operation, it will do so. The query optimizer might not be able to recognize all of the HAVING search conditions that can be applied before the grouping operation. It is recommended that you place all such search conditions in the WHERE clause instead of the HAVING clause.
The following query shows HAVING with an aggregate function. It groups the rows in the titles table by type and eliminates the groups that include only one book:
USE pubs
SELECT type
FROM titles
GROUP BY type
HAVING COUNT(*) > 1
Here is the result set:
type
------------------
business
mod_cook
popular_comp
psychology
trad_cook
(5 row(s) affected)
This is an example of a HAVING clause without aggregate functions. It groups the rows in titles by type and eliminates those types that do not start with the letter p.
USE pubs
SELECT type
FROM titles
GROUP BY type
HAVING type LIKE 'p%'
Here is the result set:
type
------------------
popular_comp
psychology
(2 row(s) affected)
When multiple conditions are included in HAVING, they are combined with AND, OR, or NOT. The following example shows how to group titles by publisher, including only those publishers with identification numbers greater than 0800, who have paid more than $15,000 in total advances, and who sell books for an average of less than $20.
SELECT pub_id, SUM(advance) AS AmountAdvanced,
AVG(price) AS AveragePrice
FROM pubs.dbo.titles
WHERE pub_id > '0800'
GROUP BY pub_id
HAVING SUM(advance) > $15000
AND AVG(price) < $20
ORDER BY can be used to order the output of a GROUP BY clause. This example shows using the ORDER BY clause to define the order in which the rows from a GROUP BY clause are returned:
SELECT pub_id, SUM(advance) AS AmountAdvanced,
AVG(price) AS AveragePrice
FROM pubs.dbo.titles
WHERE pub_id > '0800'
AND price >= $5
GROUP BY pub_id
HAVING SUM(advance) > $15000
AND AVG(price) < $20
ORDER BY pub_id DESC