30 November 2007

A Zip Code Solution - ...WHERE LEFT(zip_code,3) vs. This Article

The Price is MySql with Jay Pipes

Quick! Pick a curtain! Do it now!

Curtain #1:
SELECT zip_code FROM table_name WHERE LEFT(zip_code,3) IN (123,456,789);

What's under curtain #2 you ask?

Well, I'm glad you asked! It is...
ALTER TABLE table_name ADD COLUMN zip3 INTEGER UNSIGNED ZEROFILL NOT NULL, ADD INDEX idx_zip3 (zip3);

What?? Another column and another index? My table already has 7,351 columns? Well, vertically partition it! OK, off topic...

In the meantime, let me show you how to save minutes, if not hours in your query execution time!

The Problem

I have a three table join in a query that is also making mysql scan 17 million records. Where is this scan coming from? It is from this part of the query:

SELECT zip_code FROM table_name WHERE LEFT(zip_code,3) IN (123,456,789);

The Solution

It a nutshell, it is this:
ALTER TABLE table_name ADD COLUMN zip3 INTEGER UNSIGNED ZEROFILL NOT NULL, ADD INDEX idx_zip3 (zip3);
UPDATE table_name SET zip3 = LEFT(zip_code,3);

Just do it. It may take an hour and 37 minutes as it did for me, but it might not!

The New Query

SELECT zip_code FROM table_name WHERE zip3 IN (123,456,789);

The Results

I have shaved off time from a query that joins a 17 million row scan to a 330,000 row scan and a 17,500 record table in the magnitude of 51:1. Instead of 45 minute runs, we're looking at under a minute. I think I will go out for lunch 4 times today. Whoops, I didn't say that. Really!

Til next time,
David


No comments: