18 December 2007

How to Find Domain Names of Emails Without Using LIKE '%domain.com'

I like working at companies where there are "problems related to success". My colleague and I from a former company that I worked for in LA talked about two types of problems faced by business:
  1. Problems related to failure - the problems of having your doors open for business but few people know about it;
  2. Problems related to success -the problems of having so many people come through your doors that you are trying to figure out how to handle it
The world of web is very much like this. Rarely would one hear that a company is in the "sweet spot in the middle", and if a company is in such a position, everyone is very quiet about it as to not the jinx the atmosphere around it. Instead, you hear "We just need to bring our numbers up by promoting 1,000 more youtube videos", or the other end - "We just brought in three racks of servers and we can barely keep up".

And, as it happens, "problems of success" fall on the lap of the mysql DBA. As many DBAs should be aware, we need to prepare for the future, before people start calling you a Disaster Briefing Analyst instead of Database Administrator.

One of these moments concerns us with the LIKE statement. We all hope that we don't need to use LIKE with the wildcard % in the front. But, as it happens, with tables containing email addresses, an business analyst will request a count of all msn.com-based email addresses.

How to do this without LIKE '%msn.com'?

REVERSE to the Rescue

Just reverse the email address in the table and then reverse the domain you are searching for. Here's how to create a table for the testing:
CREATE TABLE rv_emails (rv_email_id INT UNSIGNED NOT NULL AUTO_INCREMENT, email CHAR(50) NOT NULL, rv_email CHAR(50) NOT NULL, PRIMARY KEY (rv_email_id), INDEX idx_email (email), INDEX idx_rv_email (rv_email));

When I SELECT COUNT(1) FROM rv_emails WHERE email LIKE '%msn.com' when this table has a million records, it takes 1.28 sec to retrieve a count 17,594 records. Now try this on for size:
SELECT COUNT(1) FROM rv_emails WHERE rv_email LIKE REVERSE('%msn.com');
How long did it take for you? A little better, eh? (That's my Canadian vernacular coming through) For me, it was .03 sec to achieve the identical result. An improvement of about 40 times.

Just for fun, here are the results when the table has 5 million records:
  • Old method: 6.69 sec
  • New method: 0.19 sec
Fun stuff!

Until next time,

13 December 2007

My One Night mySQL Road Trip

I like driving. It relaxes me. I especially like driving in the Carolinas versus driving the freeways in Los Angeles. In LA, you have to intentionally calm yourself down before going to the on ramp of the 101, 5 or the 10. Here, in South Carolina, it is possible to drive highway speeds on the highways (in LA, it is possible, but only at night time or early in the morning).

When I heard that the Durham mySQL meetup had Jay Pipes over, I had to go. Thus, my Carolinas road trip, from Columbia, SC to Durham, NC. That was about the length of three CDs. Nice.

And, as always, when you connect with like-minded individuals in the industry, I *always* pick up on a few new techniques. It's like the same idea Sheeri talked about when seeing another side of things when in the presence of other mySQL DBAs.

As a result, this has become my list of "To-Do's" at our shop here. As each one gets done, I'll blog about it and turn each of these items into a link. So, without further ado, here goes:

  • Vertical partition infrequently updated fields in one table and frequently updated fields in another. We have a couple of tables that could use this, but the programming ramifications could be more significant than anticipated;
  • Remove redundant/poor indexes;
  • Think in terms of sets instead of loops - get comfortable using derived tables;
  • Take out non-deterministic values in our reporting queries
  • Turn MyISAM DELETES and UPDATES into batches
  • Convert "email LIKE '%website.com'" queries into "rv_email LIKE (REVERSE('website.com'),'%')" queries. Already did that today - worked like a charm! I'll blog about it tomorrow.
That's all for today!

Til next time,

06 December 2007

Creating Copies of Tables: MS SQL vs. MySql

I was reminded the other day how SQL Server was annoying in regards to table structure changes. All I wanted to to was copy a table (with structure intact - keys, dependancies) with its data.

How hard could that be?

MS SQL in Action

First, I needed to do a complete DDL on TableOne and replace every instance of the word "TableOne" with "TableTwo". It works, even though it is bulky and takes time, especially when the DDL was 68 lines long.

The second step was not too bad.
SELECT * FROM TableOne INTO TableTwo
Ok - that did not seem too hard, until you see how this can be accomplished in MySql.

MySql's Turn
CREATE TABLE table_1 LIKE table_2;
INSERT INTO table_1 SELECT * FROM table_2;
Which looks easier to you?

Here's another wrinkle: what about table-level permissions? Which side is easier to manage?

Until next time,