19 February 2010

How to Save a Lot of Time with LOAD DATA INFILE Involving Large Files

Are you tired of waiting an hour or so performing a LOAD DATA INFILE statement involving a file with millions of rows?

If so, and you have enough money to pay for 3 monthly installments of $5.99... (Belgian chocolate will substitute nicely) -- you TOO can be a hero in your office and go home early for the weekend. :)

How?

Disable keys. Load data infile. Enable keys.

ALTER TABLE table_i_am_going_to_load_data_into DISABLE KEYS;
LOAD DATA INFILE '/folder/five_million_row_file.csv' INTO TABLE table_i_am_going_to_load_data_into FIELDS TERMINATED BY ',' ENCLOSED BY '"';
ALTER TABLE table_i_am_going_to_load_data_into ENABLE KEYS;

What are the savings?

In our situation -- we have hundreds of millions of rows, 12 columns, various data types, a compound primary key (bleck), a unique index, and 2 standard indexes, and a dash of pepper -- the original LOAD DATA INFILE statement took between 50 and 75 minutes to run.

Now, this is how long it takes:
1) Disabling keys: 0.01 seconds
2) Loading data infile: 2.45 to 2.8 minutes
3) Enabling keys: 10 to 12 minutes
4) Going home early: priceless

Thanks to Ronald Bradford for the idea... trying to find the link to his post on this, but I have lost it...

Have fun! Feel free to let me know of any other nifty LOAD DATA INFILE savings...

Cheers,
David