19 February 2010

How to Save a Lot of Time with LOAD DATA INFILE Involving Large Files

Are you tired of waiting an hour or so performing a LOAD DATA INFILE statement involving a file with millions of rows?

If so, and you have enough money to pay for 3 monthly installments of $5.99... (Belgian chocolate will substitute nicely) -- you TOO can be a hero in your office and go home early for the weekend. :)

How?

Disable keys. Load data infile. Enable keys.

ALTER TABLE table_i_am_going_to_load_data_into DISABLE KEYS;
LOAD DATA INFILE '/folder/five_million_row_file.csv' INTO TABLE table_i_am_going_to_load_data_into FIELDS TERMINATED BY ',' ENCLOSED BY '"';
ALTER TABLE table_i_am_going_to_load_data_into ENABLE KEYS;

What are the savings?

In our situation -- we have hundreds of millions of rows, 12 columns, various data types, a compound primary key (bleck), a unique index, and 2 standard indexes, and a dash of pepper -- the original LOAD DATA INFILE statement took between 50 and 75 minutes to run.

Now, this is how long it takes:
1) Disabling keys: 0.01 seconds
2) Loading data infile: 2.45 to 2.8 minutes
3) Enabling keys: 10 to 12 minutes
4) Going home early: priceless

Thanks to Ronald Bradford for the idea... trying to find the link to his post on this, but I have lost it...

Have fun! Feel free to let me know of any other nifty LOAD DATA INFILE savings...

Cheers,
David

08 January 2010

How to SHOW PROCESSLIST Every .5 Seconds

Yes, it has been a while since I last posted, so we'll see how this year goes...

Just the other evening I came across something that I should have done much sooner. Run SHOW PROCESSLIST every second without hitting "ENTER" repetitively.

The Setup

1) Create a temporary user with a non-sensitive password on localhost with SUPER privs. Yes, this crazy, but I did say *temporary*, right?
mysql> GRANT SUPER ON *.* TO tmp_user@localhost IDENTIFIED BY 'tmp_user_passwd_2102394567';

Now In Action

2) Go back to the shell and type this:

$ watch -n1 'mysql -utmp_user -ptmp_user_passwd_2102394567 --exec="SHOW PROCESSLIST"'

Simple, eh?

This is a good way to get a quick rundown of mysql's process list in a dev or test environment, such as simple web page load tests, or even an environment where you have a long running query and you are wondering when it will be completed.

Of course, there are a lot of fantastic tools out there that will do much more, and are better, but this is a good "quick & dirty" processlist check that works as a good substitute for typing SHOW PROCESSLIST in mysql 20 times over...

DON'T FORGET to DROP this user when you're done!

Until next time,
David