04 September 2008

Case Sensitive Fields

Yes, it has been too long since I last posted on this blog (3 months) - that's my apology; let's move forward.

There's too many interesting MySQL bits and pieces that I still come across my desk that I cannot help but post these fantastic learning opportunities to my blog. Obviously, you can read about all things interesting from PlanetMySql.org, but I hope that my blog will help sift people through specific issues that pertain directly to them while working with MySQL.

Oh yes... I am now located in Denver and I work for a big startup that uses MySQL for data mining. Most excellent!

Querying Case Sensitive Fields

Issue: Users to be able to query for case sensitive field values. For example, 700g is not the same as 700G.

Answer: COLLATE latin1_general_cs

Example in action:
mysql> CREATE TABLE test (non_case_sensitive char(1) NOT NULL, case_sensitive char(1) COLLATE latin1_general_cs) ENGINE=MyISAM charset=latin1;

mysql> insert into test values ('x','x');
mysql> insert into test values ('X','X');
mysql> insert into test values ('y','Y');
mysql> insert into test values ('Y','y');
mysql> select non_case_sensitive, count(*) from test group by non_case_sensitive;
| non_case_sensitive | count(*) |
| x | 2 |
| y | 2 |
2 rows in set (0.00 sec)

mysql> select case_sensitive, count(*) from test group by case_sensitive;
| case_sensitive | count(*) |
| X | 1 |
| x | 1 |
| Y | 1 |
| y | 1 |
4 rows in set (0.00 sec)


Performance Considerations

There does not appear to be much of a performance hit as a result of this change. On a 4.5 million record table, there was no difference. In fact many of our comparison queries (with same number of results) on the case sensitive table were a bit faster than on the case insensitive table.

Have fun! I hope to post more frequently in the future as I continue to find out interesting tidbits along the way.

Until next time,