MySQL tips for developers
Having more than a decade of experience with MySQL, there are a few tricks I picked up over the years that I believe should be in the arsenal of every developer who works with this database more than just superficially. I’m going to list a quick break-down of them from the top of my head.
Processing large amounts of data
There are several tricks that make MySQL faster if you’re doing a lot of data processing. Depending on what your goals are, any of these might be useful for you.
Bulk inserts of data
As it happens, MySQL regenerates the Indexes for your tables as soon as you issue an insert. There are two ways to increase your transaction rate for these type of workloads.
Extended inserts
MySQL enables you to insert multiple rows at the same time with the extended insert statement syntax.
INSERT statements that use VALUES syntax can insert multiple rows. To do this, include multiple lists of column values, each enclosed within parentheses and separated by commas.
INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);
This means you can insert tens of thousands of rows into your database instance, often increasing your throughput dramatically.
I have used this technique with data mining, processing large amounts of data, and collecting data in bulk via an intermediate store. For example: we collect statistics with a NGINX + LUA entry point, which writes data to Redis. Each call here is in the 0-1ms range which is at least ten times faster as PHP for example. We process a Redis list with a cron job, that issues bulk inserts into the database. Since reads and writes are separated fully, we can delay writing data to MySQL, or do basic data processing without interrupting the collection service.
Transactions
Similar to the extended-insert principle, wrapping your inserts into a transaction will also produce a higher insert speed. The reason for this is that a log flush to disk is performed on every transaction commit. Due to MVCC (multi-version concurrency control) all your data is kept in memory until you decide to commit your transaction, at which point everything is written to disk and indexes are regenerated.
Avoid indexes
The fastest way to insert data remotely to a table is to have the table be completely without indexes. If you wanted to do an import to a table with many indexes, it may speed up your inserts significantly if you drop the indexes, insert all the data using the above techniques, and then re-add the indexes after you inserted all the data.
Skipping the binary log
If you’re inserting into a table which shouldn’t be replicated to other nodes, you can
turn off your replication log for these tables. You need to have SUPER
privileges on
the user which is doing the bulk-inserts:
set sql_log_bin = 0;
-- produce all inserts that you don't want replicated here
The statement turns off writing to the binary log only for the active connection/session. The benefit of this is to write large amounts of data to instances that have replication set up, without interfering with data that should be replicated. Mainly, this approach brings down I/O load on the system, and doesn’t put stress on the replication channel.
Multiple connections
If you’re inserting data over multiple connections, your insert speed will grow. If you’re using a single connection, MySQL uses only one thread which limits your insert speed and uses only one CPU core. If you’re using multiple connections MySQL uses more threads, which in turn use more CPU cores, your insert rate will be appropriately higher.
And who doesn’t have several cores in a database server? Use them.
Truncate table with foreign keys
If you’re doing a lot of batched imports (lets say you’re writing a custom log importer), you will inevitably have to do the import many times. If you set up your tables with foreign keys, you might want to truncate them in development to reset any kind of auto_increment values as well.
ERROR 1701 (42000): Cannot truncate a table referenced in a foreign key constraint
Ah. Well, it turns out that the work-around for this is very simple:
SET FOREIGN_KEY_CHECKS = 0;
TRUNCATE table1;
SET FOREIGN_KEY_CHECKS = 1;
By turning off foreign key checks we enable a full TRUNCATE
to empty all the data
from the table. If you don’t have auto_increment
values, you can use DELETE without
the need to turn off FK checks, other tables will be emptied as well if you’re using
ON DELETE CASCADE
functionality.
Index is used for sort
If you have a table like this:
Field | Type |
---|---|
author_id | int(11) |
news_id | int(11) |
weight | tinyint(4) |
You might want to select all authors based on news_id
, ordered by weight.
SELECT author_id FROM authors WHERE news_id=? ORDER BY weight
The natural instinct is to add the index to the news_id
field to resolve the WHERE clause;
However, if you put the index over (news_id, weight)
MySQL will use the index to resolve
the ORDER BY as well.
More things about MySQL
I have written a few MySQL related posts in the past, some which still apply if you want to deepen your relationship with this database:
- MySQL & replication “oh right” moments,
- MySQL optimization and diagnostics tactics,
- Outperforming MySQL by hand.
While I have you here...
It would be great if you buy one of my books:
- Go with Databases
- Advent of Go Microservices
- API Foundations in Go
- 12 Factor Apps with Docker and Go
Feel free to send me an email if you want to book my time for consultancy/freelance services. I'm great at APIs, Go, Docker, VueJS and scaling services, among many other things.
Want to stay up to date with new posts?
Stay up to date with new posts about Docker, Go, JavaScript and my thoughts on Technology. I post about twice per month, and notify you when I post. You can also follow me on my Twitter if you prefer.