MySQL 8.0 新特性的一些典型应用场景

2023-7-31

测试机器：本地虚拟机测试表 sys_test，数据行 780w
测试实例1：8.0.30
测试实例2：5.7.20

一：Instant add column

Adding a column 测试：

实例1：8.0.30
  mysql> alter table sys_test add column task_ename varchar(20);
  Query OK, 0 rows affected (1.24 sec)
 
实例2：5.7.20
  mysql> alter table sys_test add column task_ename varchar(20);
  Query OK, 0 rows affected (6 min 15.12 sec)
 
加字段是业务中非常频繁的需求，以上测试表明 8.0 Instant add column 的特性明显快于 5.7 版本，基本实现了秒加字段，原因是 8.0 版本 instant 算法只是修改了元数据。
 
# 关于 8.0 Instant add column 的原理请参考: http://mysql.taobao.org/monthly/2020/03/01/
# 8.0.29 之后的版本支持 Instant drop column

二：InnoDB Parallel Read

从 8.0.14 新增了一个参数 innodb_parallel_read_threads，支持在聚集索引上并行 check table，提升 count( ) 函数的计算速度

实例1：8.0.30
  mysql> set innodb_parallel_read_threads = 8;
  mysql> select count(*) from sys_test;
  +----------+
  | count(*) |
  +----------+
  |  7860946 |
  +----------+
  1 row in set (7.82 sec)
 
实例2：5.7.20
  mysql> select count(*) from sys_test;
  +----------+
  | count(*) |
  +----------+
  | 7860946 |
  +----------+
  1 row in set (33.94 sec)
 
mysql> desc select count(id) from sys_test;
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table    | partitions | type  | possible_keys | key     | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | sys_test | NULL       | index | NULL          | PRIMARY | 162     | NULL | 7764944 |   100.00 | Using index |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+---------+----------+-------------+  
 
count() 聚合函数是使用比较频繁的操作，测试表明当 8.0 开启 parallel read 时，在 count() 走聚集索引的情况下，速度较 5.7 版本有 3-4 倍的提升。
 
# innodb_parallel_read_threads 只对聚集索引有效，当 count() 函数走二级索引时，测试发现 8.0 和 5.7 速度没有变化。
# parallel read 的原理请参考：https://zhuanlan.zhihu.com/p/408987542

三：InnoDB Parallel DDL

从 8.0.27 新增了一个参数 innodb_ddl_threads , 用于创建二级索引时，设置排序和生成 BTree 的并发线程数

实例1：8.0.30 
  mysql> set innodb_ddl_threads = 4 
  mysql> alter table sys_test add index TASK_ID_STATE_INDEX(task_id,state);
 Query OK, 0 rows affected (2 min 33.72 sec)
 
实例2：5.7.20 
  mysql> alter table sys_test add index TASK_ID_STATE_INDEX(task_id,state);
 Query OK, 0 rows affected (6 min 27.79 sec)
  
添加 联合索引，对比 8.0 parallel ddl 和 5.7 单线层 ddl，速度提升 3倍左右。实现环境为虚拟机 4核服务器，ssd + 高配服务器，相信差距会更明显。
 
# 并行 DDL 不是新技术了，只是在 mysql 中最近一年才支持。
# 并行 DDL 的原理见：https://docs.pingcap.com/zh/tidb/stable/ddl-introduction

四：Window Function (窗口函数)

oracle ，SQLServer 很早就支持了窗口函数，mysql 直到 8.0 版本才提供支持。熟悉数据库的同学对窗口函数应该不陌生。以下举例如何使用窗口函数：（以 rank() 专用窗口函数为例）

按 sale 销售额排名（跳跃式排名）:
mysql> select *,rank() over(order by sale) from sales;
+----------------+-------------+--------+----------------------------+
| sales_employee | fiscal_year | sale   | rank() over(order by sale) |
+----------------+-------------+--------+----------------------------+
| Alice          |        2017 | 100.00 |                          1 |
| Bob            |        2016 | 100.00 |                          1 |
| Alice          |        2016 | 150.00 |                          3 |
| Bob            |        2017 | 150.00 |                          3 |
| John           |        2017 | 150.00 |                          3 |
| Alice          |        2018 | 200.00 |                          6 |
| Bob            |        2018 | 200.00 |                          6 |
| John           |        2016 | 200.00 |                          6 |
| John           |        2018 | 250.00 |                          9 |
+----------------+-------------+--------+----------------------------+
 
按 sale 销售额排名（连续排名）:
mysql> select *,dense_rank() over(order by sale) from sales;
+----------------+-------------+--------+----------------------------------+
| sales_employee | fiscal_year | sale   | dense_rank() over(order by sale) |
+----------------+-------------+--------+----------------------------------+
| Alice          |        2017 | 100.00 |                                1 |
| Bob            |        2016 | 100.00 |                                1 |
| Alice          |        2016 | 150.00 |                                2 |
| Bob            |        2017 | 150.00 |                                2 |
| John           |        2017 | 150.00 |                                2 |
| Alice          |        2018 | 200.00 |                                3 |
| Bob            |        2018 | 200.00 |                                3 |
| John           |        2016 | 200.00 |                                3 |
| John           |        2018 | 250.00 |                                4 |
+----------------+-------------+--------+----------------------------------+

mysql 支持了很多专用窗口函数，和一些普通的聚合函数开窗，这里不一一举例，窗口函数在涉及到组内排序、排名的场景提供了方便，降低了原生 SQL 在这些场景中的复杂性。

mysql 窗口函数使用手册请参考：window-functions-usage

五：Functional index （函数索引）

函数索引也不是新技术，oracle 早有支持，只是 mysql 8.0 才提供支持。我们知道在 5.7 以前的版本中在索引列上使用表达式或者函数进行查询时，会导致索引失效，这个问题 8.0 版本可以通过创建函数索引来解决：

表达式索引测试

# 5.7 版本中：
mysql> show create table t3 \G;
*************************** 1. row ***************************
       Table: t3
Create Table: CREATE TABLE `t3` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(10) DEFAULT NULL,
  `age` int(11) DEFAULT NULL,
  `create_time` datetime DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
  PRIMARY KEY (`id`),
  KEY `age_idx` (`age`)
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)
 
 
mysql> desc select * from t3 where age + 1 = 11;   # 虽然 age 列上有索引，但是当我们使用表达式  age + 1 查询时，却无法使用到  age 索引
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | t3    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   10 |   100.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
 
 
# 8.0 版本中： 可以创建 create index idx_age__1 on t3 ((age + 1)); 表达式索引解决这个问题
mysql> create index idx_age__1 on t3 ((age + 1));
Query OK, 0 rows affected (0.07 sec)
Records: 0  Duplicates: 0  Warnings: 0
 
mysql> show create table t3 \G;
*************************** 1. row ***************************
       Table: t3
Create Table: CREATE TABLE `t3` (
  `id` int NOT NULL AUTO_INCREMENT,
  `name` varchar(10) DEFAULT NULL,
  `age` int DEFAULT NULL,
  `create_time` datetime DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
  PRIMARY KEY (`id`),
  KEY `age_idx` (`age`),
  KEY `idx_age__1` (((`age` + 1)))   # 表达式索引
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)
 
mysql> desc select * from t3 where age + 1 = 11;  # 此时可以利用到索引进行  age + 1 查询数据
+----+-------------+-------+------------+------+---------------+------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key        | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | t3    | NULL       | ref  | idx_age__1    | idx_age__1 | 9       | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+------------+---------+-------+------+----------+-------+

函数索引测试

# 5.7 版本
mysql> show create table t3 \G;
*************************** 1. row ***************************
       Table: t3
Create Table: CREATE TABLE `t3` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(10) DEFAULT NULL,
  `age` int(11) DEFAULT NULL,
  `create_time` datetime DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
  PRIMARY KEY (`id`),
  KEY `age_idx` (`age`),
  KEY `idx_ctime` (`create_time`)
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)
 
ERROR:
No query specified
 
mysql> desc select * from t3 where DATE_FORMAT(create_time,'%Y-%m-%d')='2023-07-07';  # 众所周知，在索引字段上使用函数会导致索引失效
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | t3    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   10 |   100.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
 
 
# 8.0 版本：可以创建函数索引
mysql> create index idx_func_create_time on t3 ((DATE_FORMAT(create_time,'%Y-%m-%d'))); # 创建函数索引
Query OK, 0 rows affected (0.05 sec)
Records: 0  Duplicates: 0  Warnings: 0
 
mysql> show create table t3 \G;
*************************** 1. row ***************************
       Table: t3
Create Table: CREATE TABLE `t3` (
  `id` int NOT NULL AUTO_INCREMENT,
  `name` varchar(10) DEFAULT NULL,
  `age` int DEFAULT NULL,
  `create_time` datetime DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
  PRIMARY KEY (`id`),
  KEY `age_idx` (`age`),
  KEY `idx_age__1` (((`age` + 1))),
  KEY `idx_func_create_time` ((date_format(`create_time`,_utf8mb4'%Y-%m-%d')))  # 函数索引
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)
 
ERROR:
No query specified
 
mysql> desc select * from t3 where DATE_FORMAT(create_time,'%Y-%m-%d')='2023-07-07';  # 此时在索引字段上使用函数，不会导致索引失效
+----+-------------+-------+------------+------+----------------------+----------------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys        | key                  | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+----------------------+----------------------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | t3    | NULL       | ref  | idx_func_create_time | idx_func_create_time | 43      | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+----------------------+----------------------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

函数索引与普通索引的数据库结构没有任何区别，只是根据表达式成了新的排序树。

六：Descending Indexes （倒排索引）

倒排索引在 mongodb 、oracle 中早就支持了，只是 mysql 直到 8.0 才支持。8.0 之前的版本创建索引皆为正序，当表数据量较大并进行 order by desc 查询时，即使在索引上排序，有时候也会出现严重的性能问题。

对于业务中的大表在索引上倒序查询时，可以创建倒叙索引提升 SQL 排序的性能

# 创建倒排索引
mysql> create index idx_desc_age on t3 (age desc); 
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0
 
mysql> show create table t3 \G;
*************************** 1. row ***************************
       Table: t3
Create Table: CREATE TABLE `t3` (
  `id` int NOT NULL AUTO_INCREMENT,
  `name` varchar(10) DEFAULT NULL,
  `age` int DEFAULT NULL,
  `create_time` datetime DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
  PRIMARY KEY (`id`),
  KEY `age_idx` (`age`),
  KEY `idx_desc_age` (`age` DESC), # 倒排索引
  KEY `idx_age__1` (((`age` + 1))), # 表达式索引
  KEY `idx_func_create_time` ((date_format(`create_time`,_utf8mb4'%Y-%m-%d'))) # 函数索引
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

实际上 mysql 8.0 支持更丰富的索引类型，但仅在使用 json 类型时可用，比如多值索引，允许在数组类型上创建索引来加快块查询速度。

七：Persisted System Variables （配置参数持久化）

我们知道，5.7 版本之前通过 set global max_connections=10000 修改全局参数时，是没有持久化到配置文件的，当数据库发生重启那么修改的参数讲还原回去。8.0 版本中允许该操持久化到配置文件

mysql> show variables like 'max_connections';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 512   |
+-----------------+-------+
1 row in set (0.00 sec)
 
mysql> set persist max_connections=1024;  # set persist 修改全局变量，将持久化到配置文件
Query OK, 0 rows affected (0.00 sec)
 
set persist 修改配置以后，会在 datadir 目录下生成 mysqld-auto.cnf 文件，该文件用来配置参数持久化
 
[root@db-test01 data]# cat mysqld-auto.cnf
{"Version": 2, "mysql_dynamic_parse_early_variables": {"max_connections": {"Value": "1024", "Metadata": {"Host": "localhost", "User": "root", "Timestamp": 1689154725638555}}}}
 
当数据库再次启动时，会先加载 my.cnf 再加载 mysqld-auto.cnf 文件

set persist 配置参数持久化特性可以很好的解决，修改了参数又忘记修改配置文件，数据库重启导致配置还原的情况。

总结：

8.0 除了以上典型的特性，还支持 hash join ，ddl 原子性，index skip scan 等里程碑式的特性（仅限特性的使用场景）。但这些都不属于新的技术，在 oracle 中很早就提供了支持，mysql 被 oracle 收购之后，从8.0开始吸收了 orcale 的一些经验，性能上有质的提升。

以上测试的 7 个典型特性，在实际业务中具有较多的实用性。新业务中可以考虑使用 mysql 8.0.30 版本。

THE END

微信公众号后台如何添加视频号到图文中？

<<上一篇

MySQL高级教程：执行计划explain详解

下一篇>>