[b]1、问题描述[/b]
朋友遇到一个怪事,一个用子查询的DELETE,执行效率非常低。把DELETE改成SELECT后执行起来却很快,百思不得其解。
下面就是这个用了子查询的DELETE了:
[yejr@imysql.com]mydb > EXPLAIN delete from trade_info where id in (
select id from (
select a.id from trade_info a, order_info b, user c where
b.buyer = c.id and c.itv_account='90000248′ and a.order_id = b.id) temp)\G
[img]http://imysql.com/wp-content/uploads/2016/06/delete1.jpg[/img]
几个表的DDL是这样的:
[img]http://imysql.com/wp-content/uploads/2016/06/delete2.jpg[/img]
上面这个SQL的执行耗时是:31.74秒
Query OK, 5 rows affected (31.74 sec)
如果我们把DELETE改写成SELECT的话,执行耗时仅是:0秒,来对比看下执行计划:
[yejr@imysql.com]mydb >EXPLAIN select id from trade_info where
id in (
select id from (
select a.id from trade_info a, order_info b, user c where
b.buyer = c.id and c.itv_account='90000248′ and a.order_id = b.id) temp)\G
[img]http://imysql.com/wp-content/uploads/2016/06/delete3.jpg[/img]
可以看到,trade_info 表从的全表扫描(type=ALL)变成了基于主键的等值查询(type=eq_ref),计划扫描数据量也从571万变成了1条,而且还可以避免回表,这2个SQL对比代价相差巨大。
[b]2、优化思路[/b]
既然这个SQL把DELETE改成SELECT后执行效率就可以获得很大提升,除此外没特别区别,可能是查询优化器方面有些不足,导致无法直接优化,就得另想办法了。
我们的思路是把基于子查询的DELETE简化改写成多表JOIN后DELETE(一般来说,子查询效率比较低的话,可以考虑改写成JOIN),多表DELETE的语法课参考:https://dev.mysql.com/doc/refman/5.7/en/delete.html#idm140469624466800,例如这样的:
DELETE t1 FROM t1 LEFT JOIN t2 ON t1.id=t2.id WHERE t2.id IS NULL;
参照上面的形式,改写之后的SQL变成了下面这样:
DELETE trade_info
FROM
trade_info,
(
SELECT
a.id
FROM
trade_info a
JOIN order_info b ON a.order_id = b.id
JOIN user c ON b.buyer = c.id
WHERE
c.itv_account = ‘90000248'
) t2 where trade_info.id = t2.id;
[img]http://imysql.com/wp-content/uploads/2016/06/delete4.jpg[/img]
[img]data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJbWFnZVJlYWR5ccllPAAAAyBpVFh0WE1MOmNvbS5hZG9iZS54bXAAAAAAADw/eHBhY2tldCBiZWdpbj0i77u/IiBpZD0iVzVNME1wQ2VoaUh6cmVTek5UY3prYzlkIj8+IDx4OnhtcG1ldGEgeG1sbnM6eD0iYWRvYmU6bnM6bWV0YS8iIHg6eG1wdGs9IkFkb2JlIFhNUCBDb3JlIDUuMC1jMDYwIDYxLjEzNDc3NywgMjAxMC8wMi8xMi0xNzozMjowMCAgICAgICAgIj4gPHJkZjpSREYgeG1sbnM6cmRmPSJodHRwOi8vd3d3LnczLm9yZy8xOTk5LzAyLzIyLXJkZi1zeW50YXgtbnMjIj4gPHJkZjpEZXNjcmlwdGlvbiByZGY6YWJvdXQ9IiIgeG1sbnM6eG1wPSJodHRwOi8vbnMuYWRvYmUuY29tL3hhcC8xLjAvIiB4bWxuczp4bXBNTT0iaHR0cDovL25zLmFkb2JlLmNvbS94YXAvMS4wL21tLyIgeG1sbnM6c3RSZWY9Imh0dHA6Ly9ucy5hZG9iZS5jb20veGFwLzEuMC9zVHlwZS9SZXNvdXJjZVJlZiMiIHhtcDpDcmVhdG9yVG9vbD0iQWRvYmUgUGhvdG9zaG9wIENTNSBXaW5kb3dzIiB4bXBNTTpJbnN0YW5jZUlEPSJ4bXAuaWlkOkJDQzA1MTVGNkE2MjExRTRBRjEzODVCM0Q0NEVFMjFBIiB4bXBNTTpEb2N1bWVudElEPSJ4bXAuZGlkOkJDQzA1MTYwNkE2MjExRTRBRjEzODVCM0Q0NEVFMjFBIj4gPHhtcE1NOkRlcml2ZWRGcm9tIHN0UmVmOmluc3RhbmNlSUQ9InhtcC5paWQ6QkNDMDUxNUQ2QTYyMTFFNEFGMTM4NUIzRDQ0RUUyMUEiIHN0UmVmOmRvY3VtZW50SUQ9InhtcC5kaWQ6QkNDMDUxNUU2QTYyMTFFNEFGMTM4NUIzRDQ0RUUyMUEiLz4gPC9yZGY6RGVzY3JpcHRpb24+IDwvcmRmOlJERj4gPC94OnhtcG1ldGE+IDw/eHBhY2tldCBlbmQ9InIiPz6p+a6fAAAAD0lEQVR42mJ89/Y1QIABAAWXAsgVS/hWAAAAAElFTkSuQmCC[/img]
可以看到新的SQL执行效率相对就高很多了,不需要再扫描571万条记录,执行耗时只需:0.01秒。
Query OK, 5 rows affected (0.01 sec)
3、其他建议
虽然MySQL 5.6及以上的版本对子查询做了优化,但从本案例的结果来看,在一些情况下还是不如意。
因此,如果发现有些子查询SQL效率比较差的话,可以尝试改写成JOIN形式,看看是否有所提升。此外,也要勇于怀疑查询优化器个别情况下存在不足,想办法绕过这些坑。