Table of Contents
MySQL - Performance using case-insensitive columns
When it comes to matching in queries, if you don't need to do case insensitive comparison (and not search) you should definitely consider using binary columns. They're much faster.
In this example, I use utf8mb4, which is what you probably should use if you wish to have better Unicode support (learn more here). Keep in mind though that the character set can be essentially anything, and these results are always in favour of comparing against case sensitive (or binary) columns.
Case Sensitive
SELECT BENCHMARK(500000000, _utf8mb4'foo' < _utf8mb4'blah' COLLATE utf8mb4_bin);
Result:
1 row in set (7.82 sec)
Case Insensitive
SELECT BENCHMARK(500000000, _utf8mb4'foo' < _utf8mb4'blah' COLLATE utf8mb4_unicode_ci);
Result:
1 row in set (21.25 sec)
As you can see, though, case sensitive was nearly three times faster than case insensitive.
Therefore it is recommended to use a case-sensitive character set wherever possible.
That is to use utf8mb4_bin instead of utf8mb4_generic_ci.
It would be good to alter the column to be VARCHAR BINARY:
ALTER TABLE `mytable1` CHANGE `mycolumn` `mycolumn` VARCHAR(254) BINARY NOT NULL;
However. if you do need to use a case-insensitive character set, then consider that CHAR and VARCHAR also have case insensitive versions.
Here's a little chart for which to change to:
CHAR | BINARY |
VARCHAR | VARBINARY |
TEXT | BLOB |
MEDIUMTEXT | MEDIUMBLOB |
LONGTEXT | LONGBLOB |
If you're not using fulltext search on MyISAM you might just want to consider switching your TEXT columns to BLOB anyway to avoid character set issues with Unicode.