MySQL tips for developers

Having more than a decade of experience with MySQL, there are a few tricks I picked up over the years that I believe should be in the arsenal of every developer who works with this database more than just superficially. I’m going to list a quick break-down of them from the top of my head.

Processing large amounts of data

There are several tricks that make MySQL faster if you’re doing a lot of data processing. Depending on what your goals are, any of these might be useful for you.

Bulk inserts of data

As it happens, MySQL regenerates the Indexes for your tables as soon as you issue an insert. There are two ways to increase your transaction rate for these type of workloads.

Extended inserts

MySQL enables you to insert multiple rows at the same time with the extended insert statement syntax.

INSERT statements that use VALUES syntax can insert multiple rows. To do this, include multiple lists of column values, each enclosed within parentheses and separated by commas.

INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);

This means you can insert tens of thousands of rows into your database instance, often increasing your throughput dramatically.

I have used this technique with data mining, processing large amounts of data, and collecting data in bulk via an intermediate store. For example: we collect statistics with a NGINX + LUA entry point, which writes data to Redis. Each call here is in the 0-1ms range which is at least ten times faster as PHP for example. We process a Redis list with a cron job, that issues bulk inserts into the database. Since reads and writes are separated fully, we can delay writing data to MySQL, or do basic data processing without interrupting the collection service.

Transactions

Similar to the extended-insert principle, wrapping your inserts into a transaction will also produce a higher insert speed. The reason for this is that a log flush to disk is performed on every transaction commit. Due to MVCC (multi-version concurrency control) all your data is kept in memory until you decide to commit your transaction, at which point everything is written to disk and indexes are regenerated.

Avoid indexes

The fastest way to insert data remotely to a table is to have the table be completely without indexes. If you wanted to do an import to a table with many indexes, it may speed up your inserts significantly if you drop the indexes, insert all the data using the above techniques, and then re-add the indexes after you inserted all the data.

Skipping the binary log

If you’re inserting into a table which shouldn’t be replicated to other nodes, you can turn off your replication log for these tables. You need to have SUPER privileges on the user which is doing the bulk-inserts:

set sql_log_bin = 0;
-- produce all inserts that you don't want replicated here

The statement turns off writing to the binary log only for the active connection/session. The benefit of this is to write large amounts of data to instances that have replication set up, without interfering with data that should be replicated. Mainly, this approach brings down I/O load on the system, and doesn’t put stress on the replication channel.

Multiple connections

If you’re inserting data over multiple connections, your insert speed will grow. If you’re using a single connection, MySQL uses only one thread which limits your insert speed and uses only one CPU core. If you’re using multiple connections MySQL uses more threads, which in turn use more CPU cores, your insert rate will be appropriately higher.

And who doesn’t have several cores in a database server? Use them.

Truncate table with foreign keys

If you’re doing a lot of batched imports (lets say you’re writing a custom log importer), you will inevitably have to do the import many times. If you set up your tables with foreign keys, you might want to truncate them in development to reset any kind of auto_increment values as well.

ERROR 1701 (42000): Cannot truncate a table referenced in a foreign key constraint

Ah. Well, it turns out that the work-around for this is very simple:

SET FOREIGN_KEY_CHECKS = 0; 
TRUNCATE table1; 
SET FOREIGN_KEY_CHECKS = 1;

By turning off foreign key checks we enable a full TRUNCATE to empty all the data from the table. If you don’t have auto_increment values, you can use DELETE without the need to turn off FK checks, other tables will be emptied as well if you’re using ON DELETE CASCADE functionality.

Index is used for sort

If you have a table like this:

Field	Type
author_id	int(11)
news_id	int(11)
weight	tinyint(4)

You might want to select all authors based on news_id, ordered by weight.

SELECT author_id FROM authors WHERE news_id=? ORDER BY weight

The natural instinct is to add the index to the news_id field to resolve the WHERE clause; However, if you put the index over (news_id, weight) MySQL will use the index to resolve the ORDER BY as well.

More things about MySQL

I have written a few MySQL related posts in the past, some which still apply if you want to deepen your relationship with this database:

While I have you here...

It would be great if you buy one of my books:

Buying a copy supports me writing more about similar topics.

For business inqueries, send me an email. I'm available for consultany/freelance work. See my page for more detail..