MySQL advanced index - index usage principle (verify index efficiency, leftmost prefix rule, range query)

1. Principles of index use

1.1 verify index efficiency

tb_sku a record of 1000w is prepared in this form.

In this table, the id is the primary key and there is a primary key index, while other fields are not indexed. Let's first query one of the records to see the fields inside. Execute the following SQL:

select * from tb_sku where id = 1\G;

You can see that even if there is 1000w data, the performance of data query based on id is still very fast, because the primary key id is indexed. Next, we will query according to the sn field and execute the following SQL:

SELECT * FROM tb_sku WHERE sn = '100000003145001';


We can see that when we query according to the sn field, the query returns a piece of data, and the result takes 20.78sec. It is because sn has no index, resulting in low query efficiency.

Then we can create an index for the sn field. After the index is established, we can query according to sn again and take a look at the query time.

create index idx_sku_sn on tb_sku(sn) ; 


Then execute the same SQL statement again and check the time consumption of SQL again.

SELECT * FROM tb_sku WHERE sn = '100000003145001';


We can obviously see that the query performance is greatly improved after the sn field is indexed. Before and after indexing, the query time is not an order of magnitude.

1.2 leftmost prefix rule

If multiple columns are indexed (joint index), the leftmost prefix rule should be observed. The leftmost prefix rule means that the query starts from the leftmost column of the index and does not skip the columns in the index. If a column is skipped, the index will be partially invalidated (the index of the subsequent field will be invalidated).

Note: the leftmost column in the leftmost prefix rule means that the leftmost field of the joint index (i.e. the first field) must exist during query, which has nothing to do with the order of condition writing when we write SQL.

In TB_ Take the user table as an example. Let's take a look at the previous TB_ The index created by the user table.

tb_ Table creation statement of user table

CREATE TABLE `tb_user` (
  `id` int NOT NULL AUTO_INCREMENT COMMENT 'Primary key',
  `name` varchar(50) NOT NULL COMMENT 'user name',
  `phone` varchar(11) NOT NULL COMMENT 'cell-phone number',
  `email` varchar(100) DEFAULT NULL COMMENT 'mailbox',
  `profession` varchar(11) DEFAULT NULL COMMENT 'major',
  `age` tinyint unsigned DEFAULT NULL COMMENT 'Age',
  `gender` char(1) DEFAULT NULL COMMENT 'Gender , 1: male, 2: female',
  `status` char(1) DEFAULT NULL COMMENT 'state',
  `createtime` datetime DEFAULT NULL COMMENT 'Creation time',
  PRIMARY KEY (`id`),
  UNIQUE KEY `idx_user_phone` (`phone`),
  KEY `idx_user_name` (`name`),
  KEY `idx_user_pro_age_sta` (`profession`,`age`,`status`),
  KEY `idx_email` (`email`)
) ENGINE=InnoDB AUTO_INCREMENT=25 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci COMMENT='System user table';

INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Lv Bu', '17799990000', 'lvbu666@163.com', 'software engineering', 23, '1', '6', '2001-02-02 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Cao Cao', '17799990001', 'caocao666@qq.com', 'Communication Engineering', 33, '1', '0', '2001-03-05 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Zhao Yun', '17799990002', '17799990@139.com', 'English', 34, '1', '2', '2002-03-02 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Sun WuKong', '17799990003', '17799990@sina.com', 'engineering cost', 54, '1', '0', '2001-07-02 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Magnolia', '17799990004', '19980729@sina.com', 'software engineering', 23, '2', '1', '2001-04-22 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Big Joe', '17799990005', 'daqiao666@sina.com', 'dance', 22, '2', '0', '2001-02-07 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Luna', '17799990006', 'luna_love@sina.com', 'applied mathematics', 24, '2', '0', '2001-02-08 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Cheng Yaojin', '17799990007', 'chengyaojin@163.com', 'chemical industry', 38, '1', '5', '2001-05-23 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Xiang Yu', '17799990008', 'xiaoyu666@qq.com', 'Metallic materials', 43, '1', '0', '2001-09-18 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Bai Qi', '17799990009', 'baiqi666@sina.com', 'Mechanical engineering and automation', 27, '1', '2', '2001-08-16 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Han Xin', '17799990010', 'hanxin520@163.com', 'Inorganic nonmetallic material engineering', 27, '1', '0', '2001-06-12 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Jing Ke', '17799990011', 'jingke123@163.com', 'accounting', 29, '1', '0', '2001-05-11 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('King Lanling', '17799990012', 'lanlinwang666@126.com', 'engineering cost', 44, '1', '1', '2001-04-09 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Crazy iron', '17799990013', 'kuangtie@sina.com', 'applied mathematics', 43, '1', '2', '2001-04-10 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('army officer's hat ornaments', '17799990014', '84958948374@qq.com', 'software engineering', 40, '2', '3', '2001-02-12 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Daji', '17799990015', '2783238293@qq.com', 'software engineering', 31, '2', '0', '2001-01-30 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Mi Yue', '17799990016', 'xiaomin2001@sina.com', 'industrial economy', 35, '2', '0', '2000-05-03 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Ying Zheng', '17799990017', '8839434342@qq.com', 'chemical industry', 38, '1', '1', '2001-08-08 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Di Renjie', '17799990018', 'jujiamlm8166@163.com', 'international trade', 30, '1', '0', '2007-03-12 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Angela', '17799990019', 'jdodm1h@126.com', 'city planning', 51, '2', '0', '2001-08-15 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Dianwei', '17799990020', 'ycaunanjian@163.com', 'city planning', 52, '1', '2', '2000-04-12 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Lian Po', '17799990021', 'lianpo321@126.com', 'civil engineering', 19, '1', '3', '2002-07-18 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Hou Yi', '17799990022', 'altycj2000@139.com', 'Urban gardens', 20, '1', '0', '2002-03-10 00:00:00');
INSERT INTO tb_user (name, phone, email, profession, age, gender, status, createtime)
VALUES ('Jiang Ziya', '17799990023', '37483844@qq.com', 'engineering cost', 29, '1', '4', '2003-05-26 00:00:00');


In TB_ In the user table, there is a joint index. The joint index involves three fields in the order of profession al, age and status.

The leftmost prefix rule means that when querying, the leftmost column, that is, profession al, must exist, otherwise all indexes will be invalid. Moreover, a column cannot be skipped in the middle, otherwise the field index behind the column will become invalid. Next, let's demonstrate several groups of cases to see the specific implementation plan:

explain
select *
from tb_user
where profession = 'software engineering'
  and age = 31
  and status = '0';

explain
select *
from tb_user
where profession = 'software engineering'
  and age = 31;

explain
select *
from tb_user
where profession = 'software engineering';

In the above three groups of tests, we found that the index will take effect as long as the leftmost field of professional in the joint index exists, but the length of the index is different. Moreover, from the above three groups of tests, we can also infer that the index length of the professional field is 47, the index length of the age field is 2, and the index length of the status field is 5.

explain
select *
from tb_user
where age = 31
  and status = '0';

explain
select *
from tb_user
where status = '0';


Through the above two sets of tests, we can also see that the index does not take effect because the leftmost prefix rule is not satisfied and the leftmost column profession al of the joint index does not exist.

explain
select *
from tb_user
where profession = 'software engineering'
  and status = '0';


In the above SQL query, there is a profession al field, the leftmost column exists, and the index meets the basic conditions of the leftmost prefix rule. However, when querying, the column age is skipped, so the subsequent column index will not be used, that is, the index part takes effect, so the length of the index is 47.

Thinking questions:

When executing SQL statements (the order of query conditions has changed):

explain
select *
from tb_user
where age = 31
  and status = '0'
  and profession = 'software engineering';

Whether the leftmost prefix rule is satisfied, whether the above joint index and index length can be used?

It can be seen that the leftmost prefix rule is fully satisfied, the index length is 54, and the joint index is effective.

1.3 range query

In the joint index, range query (>, <) appears, and the column index on the right side of the range query is invalid.

explain
select *
from tb_user
where profession = 'software engineering'
  and age > 30
  and status = '0';

When > or < is used in the range query, the joint index is taken, but the len gt h of the index is 49, which means that the status field on the right of the range query does not take the index.

explain
select *
from tb_user
where profession = 'software engineering'
  and age >= 30
  and status = '0';

When the range query uses > = or < =, the joint index is used, but the len gt h of the index is 54, which means that all fields are indexed.

Therefore, if the business allows, use range queries like > = or < = as much as possible, and avoid using > or <.

Tags: MySQL index

Posted by barrygar on Sat, 16 Apr 2022 15:30:35 +0930