MySQL 8.0 中的函数索引(Function-Based Index)是一种高级索引特性,它允许数据库管理员或开发者对表中的列执行表达式计算后的结果进行索引,而不是直接对列值或列的前缀值进行索引。这种索引技术可以显著提高查询性能,特别是在需要对列值进行复杂计算或转换的查询场景中。
MySQL8.0.13引入函数索引(Oracle,postgresql都有该项功能)。
一、函数索引的定义与作用
函数索引是对表中的列执行表达式计算后的结构进行索引,而不是对列或列前缀值。它允许对未直接存储在表中的数据进行索引,为应用程序提供了极大的便利和性能提升。例如,你可能需要经常查询某个日期字段的年份部分,通过函数索引,你可以直接对年份部分进行索引,而无需在每次查询时都执行日期转换操作。
二、函数索引的创建
在MySQL 8.0中,创建函数索引的语法与普通索引类似,但需要注意以下几点:
- 索引表达式:函数索引对应的表达式需要用括号()括起来。
- 表达式复杂性:索引表达式可以包括函数、算术运算、比较操作等,但应尽可能保持简单,以避免索引维护的开销过大。
- 索引类型:函数索引可以是普通索引、唯一索引等,具体取决于你的需求。
1、创建测试表及数据
create table t_function(id int primary key auto_increment,name varchar(100),create_time datetime);
insert into t_function(name,creatE_time) values('function-index','2024-07-01 12:00:00');
insert into t_function(name,creatE_time) values('mysql5.7.28-version','2024-07-01 15:00:00');
insert into t_function(name,creatE_time) values('version-mysql8.0.28','2024-07-01 21:30:00');
insert into t_function(name,creatE_time) values('create-index-indexname','2024-07-02 01:30:00');
insert into t_function(name,creatE_time) values('on-tablename-column','2024-07-02 03:30:00');
insert into t_function(name,creatE_time) values('column-function','2024-07-02 07:32:00');
insert into t_function(name,creatE_time) values('index-indexname-column','2024-07-02 10:32:00');
insert into t_function(name,creatE_time) values('alter-table-tablename','2024-07-02 15:32:00');
insert into t_function(name,creatE_time) values('add-index-indexname','2024-07-02 20:32:00');
insert into t_function(name,creatE_time) values('use-superdb','2024-07-03 02:56:00');
insert into t_function(name,creatE_time) values('show-tables','2024-07-03 08:41:00');
insert into t_function(name,creatE_time) values('show-index-from-t_function','2024-07-03 16:20:00');
commit;
2、创建函数索引
在将上述的表及数据在MySQL8.0的实例上创建,然后创建create_time的函数索引,SQL如下
(root@localhost)[superdb]> alter table t_function add key idx_t_function_create_time(create_time);
Query OK, 0 rows affected (0.23 sec)
Records: 0 Duplicates: 0 Warnings: 0
(root@localhost)[superdb]> alter table t_function add key idx_t_function_create_time_2 ((date(create_time)));
Query OK, 0 rows affected (0.08 sec)
Records: 0 Duplicates: 0 Warnings: 0
3、按时间查询
(root@localhost)[superdb]> select * from t_function where date(create_time)='2024-07-01';
+----+---------------------+---------------------+
| id | name | create_time |
+----+---------------------+---------------------+
| 1 | function-index | 2024-07-01 12:00:00 |
| 2 | mysql5.7.28-version | 2024-07-01 15:00:00 |
| 3 | version-mysql8.0.28 | 2024-07-01 21:30:00 |
+----+---------------------+---------------------+
3 rows in set (0.02 sec)
(root@localhost)[superdb]> explain select * from t_function where date(create_time)='2024-07-01';
+----+-------------+------------+------------+------+------------------------------+------------------------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+------------------------------+------------------------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | t_function | NULL | ref | idx_t_function_create_time_2 | idx_t_function_create_time_2 | 4 | const | 3 | 100.00 | NULL |
+----+-------------+------------+------------+------+------------------------------+------------------------------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.02 sec)
(root@localhost)[superdb]> explain select * from t_function where create_time>='2024-07-01' and create_time<date_add('2024-07-01',interval '1' day);
+----+-------------+------------+------------+-------+----------------------------+----------------------------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+-------+----------------------------+----------------------------+---------+------+------+----------+-----------------------+
| 1 | SIMPLE | t_function | NULL | range | idx_t_function_create_time | idx_t_function_create_time | 6 | NULL | 3 | 100.00 | Using index condition |
+----+-------------+------------+------------+-------+----------------------------+----------------------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.01 sec)
4、查看表t_function 索引
show columns and show index statements now support an optional EXTENDED keyword that causs statement output to include information about hidden columns and indexes that MySQL uses internally and are not accessible by users.
(root@localhost)[superdb]> show index from t_function;
+------------+------------+------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+-----------------------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+------------+------------+------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+-----------------------------+
| t_function | 0 | PRIMARY | 1 | id | A | 12 | NULL | NULL | | BTREE | | | YES | NULL |
| t_function | 1 | idx_t_function_create_time | 1 | create_time | A | 12 | NULL | NULL | YES | BTREE | | | YES | NULL |
| t_function | 1 | idx_t_function_create_time_2 | 1 | NULL | A | 3 | NULL | NULL | YES | BTREE | | | YES | cast(`create_time` as date) |
+------------+------------+------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+-----------------------------+
3 rows in set (0.01 sec)
(root@localhost)[superdb]> show extended index from t_function;
+------------+------------+------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+-----------------------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+------------+------------+------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+-----------------------------+
| t_function | 0 | PRIMARY | 1 | id | A | 12 | NULL | NULL | | BTREE | | | YES | NULL |
| t_function | 0 | PRIMARY | 2 | DB_TRX_ID | A | NULL | NULL | NULL | | BTREE | | | YES | NULL |
| t_function | 0 | PRIMARY | 3 | DB_ROLL_PTR | A | NULL | NULL | NULL | | BTREE | | | YES | NULL |
| t_function | 0 | PRIMARY | 4 | name | A | NULL | NULL | NULL | YES | BTREE | | | YES | NULL |
| t_function | 0 | PRIMARY | 5 | create_time | A | NULL | NULL | NULL | YES | BTREE | | | YES | NULL |
| t_function | 1 | idx_t_function_create_time | 1 | create_time | A | 12 | NULL | NULL | YES | BTREE | | | YES | NULL |
| t_function | 1 | idx_t_function_create_time | 2 | id | A | NULL | NULL | NULL | | BTREE | | | YES | NULL |
| t_function | 1 | idx_t_function_create_time_2 | 1 | NULL | A | 3 | NULL | NULL | YES | BTREE | | | YES | cast(`create_time` as date) |
| t_function | 1 | idx_t_function_create_time_2 | 2 | id | A | NULL | NULL | NULL | | BTREE | | | YES | NULL |
+------------+------------+------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+-----------------------------+
9 rows in set (0.00 sec)
可见,在MySQL8.0 创建对应的函数索引后,不改变SQL写法的前提下,查询的列上进行对应的函数计算后也可以走索引
5、MySQL5.7.X版本
因MySQL5.7不支持函数索引,所以需要修改SQL写法来实现走索引(或者使用虚拟列的方式)
- 创建表及插入数据同第1
- 在create_time字段上创建索引
- 查询创建时间是2024-07-01那天的所有记录
mysql> select * from t_function where date(create_time)='2024-07-01';
+----+---------------------+---------------------+
| id | name | create_time |
+----+---------------------+---------------------+
| 1 | function-index | 2024-07-01 12:00:00 |
| 2 | mysql5.7.28-version | 2024-07-01 15:00:00 |
| 3 | version-mysql8.0.28 | 2024-07-01 21:30:00 |
+----+---------------------+---------------------+
3 rows in set (0.00 sec)
mysql> explain select * from t_function where date(create_time)='2024-07-01';
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | t_function | NULL | ALL | NULL | NULL | NULL | NULL | 12 | 100.00 | Using where |
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.05 sec)
-- 执行计划中可以看出是进行了全表扫描
修改SQL写法来实现走索引(或者使用虚拟列的方式)
mysql> select * from t_function where create_time>='2024-07-01' and create_time<date_add('2024-07-01',interval '1' day);
+----+---------------------+---------------------+
| id | name | create_time |
+----+---------------------+---------------------+
| 1 | function-index | 2022-04-01 12:00:00 |
| 2 | mysql5.7.28-version | 2022-04-01 15:00:00 |
| 3 | version-mysql8.0.28 | 2022-04-01 21:30:00 |
+----+---------------------+---------------------+
3 rows in set (0.00 sec)
-- 等价修改SQL写法来第一种实现走索引
mysql> select * from t_function where create_time between '2024-07-01' and date_add('2022-07-01',interval '1' day);
+----+---------------------+---------------------+
| id | name | create_time |
+----+---------------------+---------------------+
| 1 | function-index | 2024-07-01 12:00:00 |
| 2 | mysql5.7.28-version | 2024-07-01 15:00:00 |
| 3 | version-mysql8.0.28 | 2024-07-01 21:30:00 |
+----+---------------------+---------------------+
3 rows in set (0.05 sec)
-- 等价修改SQL写法第二种来实现走索引
mysql> select * from t_function where create_time>='2024-07-01' and create_time<date_add('2022-04-01',interval '1' day);
+----+---------------------+---------------------+
| id | name | create_time |
+----+---------------------+---------------------+
| 1 | function-index | 2024-07-01 12:00:00 |
| 2 | mysql5.7.28-version | 2024-07-01 15:00:00 |
| 3 | version-mysql8.0.28 | 2024-07-01 21:30:00 |
+----+---------------------+---------------------+
3 rows in set (0.05 sec)
-- 查看执行计划
mysql> explain select * from t_function where create_time>='2024-07-01' and create_time<date_add('2022-04-01',interval '1' day);
+----+-------------+------------+------------+-------+----------------------------+----------------------------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+-------+----------------------------+----------------------------+---------+------+------+----------+-----------------------+
| 1 | SIMPLE | t_function | NULL | range | idx_t_function_create_time | idx_t_function_create_time | 6 | NULL | 3 | 100.00 | Using index condition |
+----+-------------+------------+------------+-------+----------------------------+----------------------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)
三、函数索引的使用场景
函数索引特别适用于以下场景:
- 日期和时间的计算:如查询特定年份、月份或日期的记录。
- 字符串处理:如查询字符串的某个子串或进行大小写不敏感的匹配。
- 数值计算:如查询某个数值范围经过计算后落在特定区间的记录。
四、注意事项
- 索引维护:函数索引的维护成本可能比普通索引更高,因为每次插入、更新或删除记录时,都需要重新计算索引表达式。
- 索引选择性:函数索引的选择性(即索引列中不同值的比例)对于查询性能至关重要。选择性较低的索引可能无法提供显著的性能提升。
- 兼容性:不是所有的MySQL存储引擎都支持函数索引。例如,InnoDB存储引擎支持函数索引,但MyISAM则不支持。
五、总结
MySQL 8.0 的函数索引为数据库管理员和开发者提供了更灵活的索引选择,可以显著提高查询性能。然而,在使用函数索引时,需要仔细考虑索引的维护成本、选择性和兼容性等因素。通过合理使用函数索引,可以优化数据库查询性能,提升应用程序的整体性能。