05 September 2007

The Underestimated Power of HAVING

This is the query that just got me to thinking that I should blog about mySQL hints, tips, and ramblings. So... thinking turned into action and now I have a blog! The next step is to be able to keep it going so that it becomes useful to mySQL database administrators that are concerned about performance of mySQL and also need make sure that queries are written well.

Now... onward and forward...

The Challenge

Return results based on the sum of a column and user_id, where the sum of this column is greater than 10 for each user_id.

The Setup

Table: sample_table
Columns: user_id bigint not null auto_increment, date_added datetime not null, int_aggregate_column int not null

The Expected Finish Line

Result Format: user_id, SUM(int_aggregate_column)
Criteria: where SUM(int_aggregate_column) is greater than 10

The Challenge

The HAVING statement is powerful and underutilized. I cannot count the number of times I have seen php loops when a HAVING statement will take care of the issue. Mind you, there are times when having a php loop statement is better, but that is for another post.

We knew we would use the HAVING statement, but how?

The Answer

SELECT DISTINCT user_id, SUM(int_aggregate_column) AS sum FROM sample_table WHERE date_added >='2007-06-01' AND date_added < '2007-07-01' GROUP BY user_id HAVING sum > 10;

Fun, eh? (That' s my Canadian accent coming through)

As a test, you can limit the records returned to ensure that you are indeed getting the results you want.