Skip to content

LOAD DATA INFILE (and mysqldump)

A colleague of mine has been benchmarking mysqldump data load vs. various versions of LOAD DATA INFILE. He created a sample data as a text file with either 100k or 20M rows of five integers each, the first column of which is the pk.
perl -MList::Util=shuffle -e '@k=shuffle(1..20e6);
  for (@k) {
    print $_, "    ", join("    ", map int(rand(1e9)), 0..3), "\n";
}' > loadme_nonpkorder.txt

perl -e 'print ++$i, "    ", join("    ", map int(rand(1e9)), 0..3), "\n" 
  for 1..20e6' > loadme_pkorder.txt

All insertion has been done on empty and new tables. The text files we read at least once before to warm up the OS disk cache. The tables have two non-unique single-column indexes. All happens on a idle-ish DB master with some substantial memory and a NetApp hosting the datadir (via XFS and LVM).

He benchmarked four cases:
  1. Insertion in PK order.
  2. Insertion in PK order, dropping indexes before insertion and re-adding them later.
  3. Insertion in random order.
  4. Insertion in random order, dropping indexes before insertion and re-adding them later.

Summary: The result is not surprising: Both using PK order and dropping/re-adding indexes improves performance considerably. The PK order insertion becomes more and more crucial with a larger dataset (which is not at all surprising if you think about what happens when adding a record to the innodb PK tree).

Continue reading "LOAD DATA INFILE (and mysqldump)"


Being a consultant for MySQL means that I need a lot of instances of MySQL running on my desktop, all different versions and different configurations for different customers. It also means that I need to create instances on the fly to show off a configuration option to a customer or to try something out for a customer.

My system is a SuSE Linux 10.0, and I keep the SuSE MySQL binaries on the system for reference. I also need to have the different 4.0, 4.1 and 5.0 versions ready on my machine. In order to not to disturb SuSE's rpm setup, I installed the MySQL binaries as tarballs into /usr/local. It looks just like this:

Continue reading "Instances"