优化DBLE独立子查询教程

电子说

1.3w人已加入

描述

01问题

前期开发反馈在使用独立子查询时,不论子查询中结果集有几个,语句都会卡死迟迟得不到返回结果。但是如果去掉子查询,直接赋值查询很快得到返回结果。听到这个情况第一反应很可能DBLE在独立子查询上没有做相关优化,真的是这样么?下面我们将问题复现以及优化方式进行展示。

02 演示及优化

环境检查

DBLE版本:2.19.11.5

 MySQL版本:5.7.28

 涉及分片表:src_biz_filelist     

 分片键:batch_no

 涉及垂直表:src_image_txn_jnl

 分片拆分规则:stringhash

 节点数量:4

2.1 原始语句及执行计划

独立子查询语句如下:

select caption,batch_no,image_no from src_biz_filelist where batch_no in (SELECT imageid FROM src_image_txn_jnl WHERE  front_seq_num = '001780568097' and txn_dt = '2029-08-20')

执行计划如下:

mysql> explain select caption,batch_no,image_no from src_biz_filelist where batch_no in (SELECT imageid FROM src_image_txn_jnl WHERE  front_seq_num = '001780568097' and txn_dt = '2029-08-20');
+-------------------+-----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| DATA_NODE         | TYPE            | SQL/REF                                                                                                                                                                                                                                                                                                                                                                                              |
+-------------------+-----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| dn1_0             | BASE SQL        | select `src_biz_filelist`.`ysyid`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no`,`src_biz_filelist`.`caption`,`src_biz_filelist`.`create_time`,`src_biz_filelist`.`update_time`,`src_biz_filelist`.`img_seql`,`src_biz_filelist`.`data_date` from  `src_biz_filelist` ORDER BY `src_biz_filelist`.`batch_no` ASC                                                                        |
| dn2_0             | BASE SQL        | select `src_biz_filelist`.`ysyid`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no`,`src_biz_filelist`.`caption`,`src_biz_filelist`.`create_time`,`src_biz_filelist`.`update_time`,`src_biz_filelist`.`img_seql`,`src_biz_filelist`.`data_date` from  `src_biz_filelist` ORDER BY `src_biz_filelist`.`batch_no` ASC                                                                        |
| dn3_0             | BASE SQL        | select `src_biz_filelist`.`ysyid`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no`,`src_biz_filelist`.`caption`,`src_biz_filelist`.`create_time`,`src_biz_filelist`.`update_time`,`src_biz_filelist`.`img_seql`,`src_biz_filelist`.`data_date` from  `src_biz_filelist` ORDER BY `src_biz_filelist`.`batch_no` ASC                                                                        |
| dn4_0             | BASE SQL        | select `src_biz_filelist`.`ysyid`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no`,`src_biz_filelist`.`caption`,`src_biz_filelist`.`create_time`,`src_biz_filelist`.`update_time`,`src_biz_filelist`.`img_seql`,`src_biz_filelist`.`data_date` from  `src_biz_filelist` ORDER BY `src_biz_filelist`.`batch_no` ASC                                                                        |
| merge_and_order_1 | MERGE_AND_ORDER | dn1_0; dn2_0; dn3_0; dn4_0                                                                                                                                                                                                                                                                                                                                                                           |
| shuffle_field_1   | SHUFFLE_FIELD   | merge_and_order_1                                                                                                                                                                                                                                                                                                                                                                                    |
| dn1_1             | BASE SQL        | select `autoalias_src_image_txn_jnl`.`autoalias_scalar` from (select  distinct `src_image_txn_jnl`.`imageid` as `autoalias_scalar` from  `src_image_txn_jnl` where (`src_image_txn_jnl`.`front_seq_num` = '001780568097') AND (`src_image_txn_jnl`.`txn_dt` = '2029-08-20') order by autoalias_scalar ASC) autoalias_src_image_txn_jnl order by `autoalias_src_image_txn_jnl`.`autoalias_scalar` ASC |
| merge_1           | MERGE           | dn1_1                                                                                                                                                                                                                                                                                                                                                                                                |
| join_1            | JOIN            | shuffle_field_1; merge_1                                                                                                                                                                                                                                                                                                                                                                             |
| shuffle_field_2   | SHUFFLE_FIELD   | join_1                                                                                                                                                                                                                                                                                                                                                                                               |
+-------------------+-----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
10 rows in set (0.00 sec)

**执行计划可见,DBLE对语句进行了拆分。首先分别在4个数据节点全表扫描分片表src_biz_filelist,将各自返回的结果集在DBLE层合并排序。然后将子查询结果去重排序后返回。最后在DBLE层进行匹配操作。当分片表数据量较大时,全部提取数据动作,即使不与子查询中结果集匹配,效率也不会很好。

**

按照开发描述改写子查询语句赋予具体值,进行查看:

mysql> explain select caption,batch_no,image_no from src_biz_filelist where batch_no in ('0250002403989000IMMEDA200630095447WV','1650002432532000IMMEDA200630095233WS');
+-----------+----------+------------------------------------------------------------------------------------------------------------------------------------------------------+
| DATA_NODE | TYPE     | SQL/REF                                                                                                                                              |
+-----------+----------+------------------------------------------------------------------------------------------------------------------------------------------------------+
| dn4       | BASE SQL | select caption,batch_no,image_no from src_biz_filelist where batch_no in ('0250002403989000IMMEDA200630095447WV','1650002432532000IMMEDA200630095233WS') |
+-----------+----------+------------------------------------------------------------------------------------------------------------------------------------------------------+
1 rows in set (0.00 sec)

因为分片表src_biz_filelist的分片键为batch_no,语句直接下压到了数据所在的dn4节点。验证了开发描述的在使用独立子查询时,不论子查询中结果集有几个,语句都会卡死迟迟得不到返回结果。如果去掉子查询赋予具体值,很快得到返回结果。程序进行语句拆分传入具体值查询,也是我们的优化方法之一。那是不是DBLE在独立子查询上真的没有做相关优化呢?我们继续操作。

**2.2 **独立标量子查询语句及执行计划

如果明确子查询结果单个值情况下,可以改写使用独立标量子查询,执行计划如下:

mysql> explain select caption,batch_no,image_no from src_biz_filelist where batch_no = (SELECT imageid FROM src_image_txn_jnl WHERE  front_seq_num = '001780568097' and txn_dt = '2029-08-20');
+--------------------+-----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| DATA_NODE          | TYPE                  | SQL/REF                                                                                                                                                                                                       |
+--------------------+-----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| dn1_0              | BASE SQL              | select `src_image_txn_jnl`.`imageid` as `autoalias_scalar` from  `src_image_txn_jnl` where (`src_image_txn_jnl`.`front_seq_num` = '001780568097') AND (`src_image_txn_jnl`.`txn_dt` = '2029-08-20') limit 0,2 |
| merge_1            | MERGE                 | dn1_0                                                                                                                                                                                                         |
| scalar_sub_query_1 | SCALAR_SUB_QUERY      | merge_1                                                                                                                                                                                                       |
| dn1_1              | BASE SQL(May No Need) | scalar_sub_query_1; select `src_biz_filelist`.`caption`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no` from  `src_biz_filelist` where `src_biz_filelist`.`batch_no` = '{NEED_TO_REPLACE}'        |
| dn2_0              | BASE SQL(May No Need) | scalar_sub_query_1; select `src_biz_filelist`.`caption`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no` from  `src_biz_filelist` where `src_biz_filelist`.`batch_no` = '{NEED_TO_REPLACE}'        |
| dn3_0              | BASE SQL(May No Need) | scalar_sub_query_1; select `src_biz_filelist`.`caption`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no` from  `src_biz_filelist` where `src_biz_filelist`.`batch_no` = '{NEED_TO_REPLACE}'        |
| dn4_0              | BASE SQL(May No Need) | scalar_sub_query_1; select `src_biz_filelist`.`caption`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no` from  `src_biz_filelist` where `src_biz_filelist`.`batch_no` = '{NEED_TO_REPLACE}'        |
| merge_2            | MERGE                 | dn1_1; dn2_0; dn3_0; dn4_0                                                                                                                                                                                    |
| shuffle_field_1    | SHUFFLE_FIELD         | merge_2                                                                                                                                                                                                       |
+--------------------+-----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
9 rows in set (0.00 sec)

执行计划可见:dn1_0部分先将子查询结果返回,出现了SCALAR_SUB_QUERY标量子查询的标识,被引用到外部查询中,语句直接下压到了全部数据节点。这种情况下依据分片键进行等值查询,效率是OK的。

****03 解决

通过查看原始语句在DBLE的执行计划,下压到MySQL时,由于DBLE对语句进行了重写,并没有使用到MySQL优化器改写语句为semi join方式。在MySQL中需要使用子查询完成的任务,很多时候可以使用连接方式来实现,那么我们在上层就重写为join语句实验一下。

查看DBLE文档server.xml 系统参数配置部分有一个useJoinStrategy参数,该参数作用是开启之后会尝试判断join两边的where来重新调整查询SQL下发的顺序,默认关闭。添加true至server.xml的标签中,并重启DBLE。

查看改写之后语句执行计划:

mysql> explain select t1.caption,t1.batch_no,t1.image_no from src_biz_filelist t1 inner join src_image_txn_jnl t2 on t1.batch_no=t2.imageid  WHERE  t2.front_seq_num = '001780568097' and t2.txn_dt = '2029-08-20';
+-------------------+-----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| DATA_NODE         | TYPE                  | SQL/REF                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
+-------------------+-----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| dn1_0             | BASE SQL              | select `t2`.`id`,`t2`.`evt_id`,`t2`.`front_seq_num`,`t2`.`org_id`,`t2`.`teller_id`,`t2`.`txn_dt`,`t2`.`txn_tm`,`t2`.`txn_type_cd`,`t2`.`txn_cd`,`t2`.`sub_seq_num`,`t2`.`bcs_seq_num`,`t2`.`user_ccy_cd`,`t2`.`acct_num`,`t2`.`acct_ccy_cd`,`t2`.`acct_amt`,`t2`.`txn_ccy_cd`,`t2`.`txn_amt`,`t2`.`exch_rate`,`t2`.`resp_num`,`t2`.`proc_info`,`t2`.`auth_teller_id`,`t2`.`terminal_num`,`t2`.`cntrpty_acct_num`,`t2`.`cntrpty_acct_name`,`t2`.`vchr_type_encode`,`t2`.`vchr_num`,`t2`.`vchr_dt`,`t2`.`chk_teller_id`,`t2`.`vchr_main_file_desc`,`t2`.`vchr_accessory_desc`,`t2`.`biz_status_cd`,`t2`.`fin_txn_ind`,`t2`.`txn_complt_dt`,`t2`.`ident_chk_chker_id`,`t2`.`ident_chk_result_cd`,`t2`.`veri_seal_chker_id`,`t2`.`veri_seal_result_cd`,`t2`.`tel_che_chker_id`,`t2`.`tel_che_result_cd`,`t2`.`tel_che_comment`,`t2`.`pay_acct_doubt_type_cd`,`t2`.`rcv_acct_doubt_type_cd`,`t2`.`job_center`,`t2`.`imageid`,`t2`.`pay_name`,`t2`.`txn_desc`,`t2`.`origin_txn_seq_num`,`t2`.`txn_start_tm`,`t2`.`txn_memo`,`t2`.`bcs_or_back_txn_num`,`t2`.`bea_intrfc_num`,`t2`.`txn_auth_teller`,`t2`.`send_back_or_back_err_info`,`t2`.`data_date` from  `src_image_txn_jnl` `t2` where (`t2`.`front_seq_num` = '001780568097') AND (`t2`.`txn_dt` = '2029-08-20') order by `t2`.`imageid` ASC |
| merge_1           | MERGE                 | dn1_0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| nest_loop_1       | NEST_LOOP             | merge_1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| dn1_1             | BASE SQL(May No Need) | nest_loop_1's RESULTS; select `t1`.`ysyid`,`t1`.`batch_no`,`t1`.`image_no`,`t1`.`caption`,`t1`.`create_time`,`t1`.`update_time`,`t1`.`img_seql`,`t1`.`data_date` from  `src_biz_filelist` `t1` where t1.batch_no IN ('{NEED_TO_REPLACE}') ORDER BY `t1`.`batch_no` ASC                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| dn2_0             | BASE SQL(May No Need) | nest_loop_1's RESULTS; select `t1`.`ysyid`,`t1`.`batch_no`,`t1`.`image_no`,`t1`.`caption`,`t1`.`create_time`,`t1`.`update_time`,`t1`.`img_seql`,`t1`.`data_date` from  `src_biz_filelist` `t1` where t1.batch_no IN ('{NEED_TO_REPLACE}') ORDER BY `t1`.`batch_no` ASC                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| dn3_0             | BASE SQL(May No Need) | nest_loop_1's RESULTS; select `t1`.`ysyid`,`t1`.`batch_no`,`t1`.`image_no`,`t1`.`caption`,`t1`.`create_time`,`t1`.`update_time`,`t1`.`img_seql`,`t1`.`data_date` from  `src_biz_filelist` `t1` where t1.batch_no IN ('{NEED_TO_REPLACE}') ORDER BY `t1`.`batch_no` ASC                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| dn4_0             | BASE SQL(May No Need) | nest_loop_1's RESULTS; select `t1`.`ysyid`,`t1`.`batch_no`,`t1`.`image_no`,`t1`.`caption`,`t1`.`create_time`,`t1`.`update_time`,`t1`.`img_seql`,`t1`.`data_date` from  `src_biz_filelist` `t1` where t1.batch_no IN ('{NEED_TO_REPLACE}') ORDER BY `t1`.`batch_no` ASC                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| merge_and_order_1 | MERGE_AND_ORDER       | dn1_1; dn2_0; dn3_0; dn4_0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| shuffle_field_1   | SHUFFLE_FIELD         | merge_and_order_1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| join_1            | JOIN                  | nest_loop_1; shuffle_field_1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| shuffle_field_2   | SHUFFLE_FIELD         | join_1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
+-------------------+-----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
11 rows in set (0.00 sec)

执行计划可见:dn1_0部分先将子查询结果返回,出现了NEST_LOOP嵌套循环标识,嵌套循环的结果集被引用到外部查询中,语句直接下压到了全部数据节点。

开启MySQL general_log 日志查看语句实际在MySQL层运行情况如下:

2021-05-12T08:58:09.324017+08:00         1080 Query     select `t2`.`id`,`t2`.`evt_id`,`t2`.`front_seq_num`,`t2`.`org_id`,`t2`.`teller_id`,`t2`.`txn_dt`,`t2`.`txn_tm`,`t2`.`txn_type_cd`,`t2`.`txn_cd`,`t2`.`sub_seq_num`,`t2`.`bcs_seq_num`,`t2`.`user_ccy_cd`,`t2`.`acct_num`,`t2`.`acct_ccy_cd`,`t2`.`acct_amt`,`t2`.`txn_ccy_cd`,`t2`.`txn_amt`,`t2`.`exch_rate`,`t2`.`resp_num`,`t2`.`proc_info`,`t2`.`auth_teller_id`,`t2`.`terminal_num`,`t2`.`cntrpty_acct_num`,`t2`.`cntrpty_acct_name`,`t2`.`vchr_type_encode`,`t2`.`vchr_num`,`t2`.`vchr_dt`,`t2`.`chk_teller_id`,`t2`.`vchr_main_file_desc`,`t2`.`vchr_accessory_desc`,`t2`.`biz_status_cd`,`t2`.`fin_txn_ind`,`t2`.`txn_complt_dt`,`t2`.`ident_chk_chker_id`,`t2`.`ident_chk_result_cd`,`t2`.`veri_seal_chker_id`,`t2`.`veri_seal_result_cd`,`t2`.`tel_che_chker_id`,`t2`.`tel_che_result_cd`,`t2`.`tel_che_comment`,`t2`.`pay_acct_doubt_type_cd`,`t2`.`rcv_acct_doubt_type_cd`,`t2`.`job_center`,`t2`.`imageid`,`t2`.`pay_name`,`t2`.`txn_desc`,`t2`.`origin_txn_seq_num`,`t2`.`txn_start_tm`,`t2`.`txn_memo`,`t2`.`bcs_or_back_txn_num`,`t2`.`bea_intrfc_num`,`t2`.`txn_auth_teller`,`t2`.`send_back_or_back_err_info`,`t2`.`data_date` from  `src_image_txn_jnl` `t2` where (`t2`.`front_seq_num` = '001780568097') AND (`t2`.`txn_dt` = '2029-08-20') order by `t2`.`imageid` ASC 
2021-05-12T08:58:09.326400+08:00         1091 Query     select `t1`.`ysyid`,`t1`.`batch_no`,`t1`.`image_no`,`t1`.`caption`,`t1`.`create_time`,`t1`.`update_time`,`t1`.`img_seql`,`t1`.`data_date` from  `src_biz_filelist` `t1` where t1.batch_no IN ('0250002403989000IMMEDA200630095447WV', '0250002401013000IMMEDA200914092919JX') ORDER BY `t1`.`batch_no` ASC

语句执行顺序变化,独立子查询被赋予了具体值,并按分片键进行数据路由的特性,下发到指定分片节点进行查询。

需要注意的是关于nestLoop还有两个参数nestLoopConnSize /nestLoopRowsSize 若临时表行数大于这两个值乘积,DBLE则会报告一个后端连接错误。

综上优化方式有以下几种:

1、程序进行语句拆分传入具体值进行查询。

2、如果明确子查询结果单个值情况下,可以改写使用独立标量子查询。

3、添加 true 参数,并改写为连接查询。

打开APP阅读更多精彩内容
声明:本文内容及配图由入驻作者撰写或者入驻合作网站授权转载。文章观点仅代表作者本人,不代表电子发烧友网立场。文章及其配图仅供工程师学习之用,如有内容侵权或者其他违规问题,请联系本站处理。 举报投诉

全部0条评论

快来发表一下你的评论吧 !

×
20
完善资料,
赚取积分