My Top 10 List For Developers

By David Holoboff on 8:04 AM

comments (2)

Filed Under:


An excellent question was posted in a MySQL group in LinkedIn the other day that I had to post it here.

What is your "Top 10 List for Programmers Working with MySQL"?

I started mine... Let me know your thoughts on my list, and feel free to add other items to the list. I interface with programmers on a daily basis, and our experience lends us to help them in certain ways, but I think a "Top 10" list will enable us to be vigilant in our day-to-day development interactions.

Oh yes - I have 14 items in my "Top 10 List", and I have no problem adding a few more... :)

Drum roll please... [cue David Letterman]


  1. DO know your tools, and do your testing on your own LAMP environment at your desk;
  2. DO document;
  3. DO ensure your queries use indexes and use them properly (there *is* a difference);
  4. DO understand the EXPLAIN statement - its benefits and shortcomings alike;
  5. DON'T use comma joins (you must explicitly help MySQL with execution plan on joins);
  6. DO normalize your tables for transaction processing; (I like 3rd normal, but it depends on the app);
  7. DO denormalize your tables for analytical processing; (the key is how much you denormalize!)
  8. DO NOT create indexes on columns with few unique values (like gender);
  9. DON'T open large recordsets (in a dev environment, this can be easily overlooked);
  10. DO always test with large datasets (that's why we have Dev, Test, and Prod servers!);
  11. DO always set numeric columns to NOT NULL unless NULL is absolutely required;
  12. DON'T use cursors (or better yet, skip stored procedures altogether and use an API based on php, perl, C, or python);
  13. DO understand the benefits and limitations of database abstraction and when to use it;
  14. DON'T oversize your varchar's, char's, and don't use TEXT type when other types will do.

KickFire is Back

By David Holoboff on 2:59 PM

comments (2)

Filed Under: , , , ,

After receiving an email about talking with Robert David, Director of Sales at KickFire, I checked out recent news on KickFire, as there had been little written about this company aside from the big splash they made at the MySQL Users Conference back in April 2008.

And, lo and behold, there was a piece of news, posted on October 14 - Kickfire Enters Into MySQL Enterprise Agreement With Sun Microsystems. It did not seem to garner any attention from the MySQL community yet, and the only source that picked up on this was "The Financial". Interestingly, the tagline of "The Financial" is "Open Source Information", which does not represent the same meaning to those of us working with open source tools. I don't know who is doing the PR for KickFire, but there might have been a skipped number on the speed dial somewhere.

This obviously was not as exciting as a headline like "Kickfire Machine Now in Production, Already Saving Dozens of Companies Thousands of Dollars". Or "Extended Data Warehouse Testing Beyond TPC-H headquarters Proves the Robustness of the First MySQL Data Appliance". Perhaps a review from Percona, Pythian, The 451 Group, or others listed along the panel of PlanetMySQL.org could add some fury to the data warehouse fire that is burning in the news lately.

Anyways, tomorrow I will find out what the latest news on this is; after all, since April 2008, it does seem like a race of turtles between MySQL 5.1, the KickFire Appliance, and the introduction of an affordable electric car with a decent range. I understand that quality of the product exceeds the quantity of time, so I am willing to deal with that. But for the last 2+ years, I have learned not to hold my breath over that.

Fixing InnoDB IMPORT TABLESPACE Error: ERROR 1030 (HY000): Got error -1 from storage engine

Setup
We have one InnoDB file per table on our database, which was set with the following option: innodb_file_per_table.

This allows me to portably transport Innodb files on a system level with minimal hassle, and is much faster than mysqldump, as these tables are several GB each.

Problem
When transporting an .idb file from one server to another, I ran into the following error:

ERROR 1030 (HY000): Got error -1 from storage engine
How did I get to this point? I did the following:
1) created a dummy table in the database;
2) discarded its tablespace;
3) moved the .ibd file into the database folder on the system;
4) attached the tablespace back to the table

This is how these actions were accomplished:
mysql> CREATE TABLE new_innodb_table (value1 INT UNSIGNED NOT NULL, value2 CHAR(5) NOT NULL);

mysql> ALTER TABLE new_innodb_table DISCARD TABLESPACE;

[sys_prompt]# mv new_innodb_table.ibd /var/lib/mysql/db_name/

mysql> ALTER TABLE new_innodb_table ATTACH TABLESPACE;
ERROR 1030 (HY000): Got error -1 from storage engine
The next thing you automatically do when you see an error in your mysql client is check your error log immediately. If this log is not enabled, please stop reading this post and go here.

Here is what the log recorded:
081009 14:21:02 InnoDB: Error: tablespace id in file './db_name/new_innodb_table.ibd' is 26, but in the InnoDB
InnoDB: data dictionary it is 28.
InnoDB: Have you moved InnoDB .ibd files around without using the
InnoDB: commands DISCARD TABLESPACE and IMPORT TABLESPACE?
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.0/en/innodb-troubleshooting.html
InnoDB: for how to resolve the issue.
081009 18:21:02 InnoDB: cannot find or open in the database directory the .ibd file of
InnoDB: table `stbHealth/sub_stb_health_associate`
InnoDB: in ALTER TABLE ... IMPORT TABLESPACE

From the error log we see that the error concerned mismatched tablespace ids.

If the tablespace id for my .ibd file was 26, then the tablespace would have been successfully imported. Also, if the tablespace id in the data dictionary 25, then all I would need to do is create a dummy innoDB table, thus incrementing the tablespace id in the data dictionary to 26, then I could try importing the tablespace again. Tricky, but it works (especially if that number was lower and you would make several dummy innodb tables).

Solution
However, the easier option is this: restart mysql, then do the same four steps listed near the beginning of the post. This way, the tablespace id on the data dictionary and the file matched; thus importing the tablespace succeeded.

This can give you greater confidence in dealing with some of the InnoDB "gotcha's" during the recovery process or even file transfers. It did with me!

Case Sensitive Fields

Yes, it has been too long since I last posted on this blog (3 months) - that's my apology; let's move forward.

There's too many interesting MySQL bits and pieces that I still come across my desk that I cannot help but post these fantastic learning opportunities to my blog. Obviously, you can read about all things interesting from PlanetMySql.org, but I hope that my blog will help sift people through specific issues that pertain directly to them while working with MySQL.

Oh yes... I am now located in Denver and I work for a big startup that uses MySQL for data mining. Most excellent!

Querying Case Sensitive Fields

Issue: Users to be able to query for case sensitive field values. For example, 700g is not the same as 700G.

Answer: COLLATE latin1_general_cs

Example in action:

mysql> CREATE TABLE test (non_case_sensitive char(1) NOT NULL, case_sensitive char(1) COLLATE latin1_general_cs) ENGINE=MyISAM charset=latin1;

mysql> insert into test values ('x','x');
mysql> insert into test values ('X','X');
mysql> insert into test values ('y','Y');
mysql> insert into test values ('Y','y');
mysql> select non_case_sensitive, count(*) from test group by non_case_sensitive;
+--------------------+----------+
| non_case_sensitive | count(*) |
+--------------------+----------+
| x | 2 |
| y | 2 |
+--------------------+----------+
2 rows in set (0.00 sec)

mysql> select case_sensitive, count(*) from test group by case_sensitive;
+----------------+----------+
| case_sensitive | count(*) |
+----------------+----------+
| X | 1 |
| x | 1 |
| Y | 1 |
| y | 1 |
+----------------+----------+
4 rows in set (0.00 sec)

Voila!

Performance Considerations

There does not appear to be much of a performance hit as a result of this change. On a 4.5 million record table, there was no difference. In fact many of our comparison queries (with same number of results) on the case sensitive table were a bit faster than on the case insensitive table.

Have fun! I hope to post more frequently in the future as I continue to find out interesting tidbits along the way.

Until next time,
David

Things Not Replicating Correctly? Part 2

My last post on June 10 was about when MySQL received queries from php scripts that looked like this:

mysql_select_db = ('database_I_DO_want_to_replicate', $link);
$sql = "INSERT INTO db_I_DO_NOT_want_to_replicate.repl_table (a,b,c) VALUES (1,2,3);";


-- and thus could not pass this over to the slave unless I set Replicate_Wild_Ignore_Table value IN ADDITION to Replicate_Ignore_DB as such in my configuration file:
replicate_ignore_DB = db_I_DO_NOT_want_to_replicate
replicate_wild_ignore_table = db_I_DO_NOT_want_to_replicate.%


News Flash

Three things:
  1. As it happens, we do not need use replicate_ignore_DB for this to work properly;
  2. Replicate_do_DB and Replicate_ignore_DB do not need to be used at all;
  3. In fact, Replicate_do_DB is somewhat evil.

Firstly, replicate_wild_ignore_table is all you need to ignore databases, and it will properly replicate all kinds of php riff-raff, where the scripts do not necessarily select the proper database and they are always written in db.table style.

Secondly, replicate_wild_do_table is all you need to include databases, and it will properly replicate all kinds of php fudgerama, blah blah blah.

Thirdly, if you use replicate_do_DB in addition to replicate_wild_do_table to try to cover your bases, it will indeed NOT cover your bases. We have strict replication here, and it WILL break if the php code does this:
mysql_select_db = ('database_I_DO_NOT_want_to_replicate,$link);
$sql = "INSERT INTO database_I_DO_want_to_replicate.repl_table (a,b,c) VALUES (1,2,3);";


So, my advise, once again is:

DO NOT USE replicate_ignore_db and replicate_do_db!

Instead, use replicate_wild_ignore_db and replicate_wild_do_db.

Have fun! Try it out and see for yourself how this can work for and against you! (as well as MySQL loading my.cnf twice, causing your list of tables to appear twice -- proposed to be fixed for MySQL 5.1).

Until next time,
David

An Interesting Replication Tip

We recently moved some databases to different servers, and I altered our mySQL slave configuration files to take into account some of the databases we wanted to replicate (Replicate-do-db) and others that we wanted to ignore (Replicate-ignore-db) -- each of these server cases were mutually exclusive.

All went well, until I found a peculiar error:

Error 'Table 'db101.table101' doesn't exist' on query. Default database: 'db201'. Query: 'UPDATE `db101`.`table101` SET `value101`=`value101`+1 WHERE `gender`='F' AND `site`='177' AND `date`=CURDATE()'
I did not want to replicate anything from database db101, but I did want to replicate every table from db201.

How to Proceed

These are my choices:
  1. Get our team of php programmers to properly specify the database that their script is using, even if it requires 1,000 updates. And it needs to be done 10 minutes ago because our slave is stopped;
  2. Provide a bug report to mySQL and wait (months or years)
  3. Use a proxy to sort out the proper database (but there are better purposes for my lua scripts);
  4. Use Replicate-Wild-Ignore-Table option in the configuration file (very fast).
The Solution
Replicate-Do-Db=db201 # This line was already here
Replicate-Wild-Ignore-Table=db101.*

In Conclusion

There exists many solutions to a problem, but we try to aim for the simplest, most elegant solution to create the smallest footprint in terms of overhead and maintenance.

To further this last point, we must try our best to keep the overhead of our systems to a minimum, otherwise things can get out of control when we least expect it, as described in this post by John Mark Walker.

Until next time,
David

How to Group Customers by Number of Purchases Quickly!

By David H. on 2:48 PM

comments (0)

Filed Under:

This is a great little sql statement that packs a nice punch for the number crunchers out there...

A disclaimer first: make sure you don't do this on a live OLTP table with millions of rows unless you want your customers to wait. Thus, run it on your slave :).

Onto business...

The Request

"I need to know the number of people who made one purchase, two purchases, three purchases, and so on. Right now."

The Answer

SELECT total, COUNT(1) FROM (SELECT customer_id, COUNT(1) AS total FROM purchases_table GROUP BY customer_id) derived_table GROUP BY total ORDER BY total;

And would you know it, the result was there in seconds....

Until next time, (I do hope to put up some juicy warehousing how-to's soon)
David