SQL优化【基础06】 - 统计信息(直方图)

奋斗吧
奋斗吧
擅长邻域:未填写

标签: SQL优化【基础06】 - 统计信息(直方图) 博客 51CTO博客

2023-05-24 18:24:13 198浏览

SQL优化【基础06】 - 统计信息(直方图),前言:从10开始默认优化器就是CBO模型,RBO已经靠边站了,CBO


前言:从10开始默认优化器就是CBO模式,RBO已经靠边站了,CBO是根据什么来计算成本的,当然就是信息了,这个信息就是所谓统计信息,可见统计信息的重要性;

版本:11.2

先来看张表T1,未做统计前可以看到统计信息都是空的(空时CBO要获取信息采用动态采样),信息是通过脚本SOSI.SQL(群里上载文件中有)显示;

SQL优化【基础06】 -  统计信息(直方图)_SQL


exec dbms_stats.gather_table_stats('AIKI','T1',cascade=>true);

统计完后的信息显示

SQL优化【基础06】 -  统计信息(直方图)_ide_02


--执行一条语句看下计划,是范围扫描,与实际相符;
SQL> @allstat

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  ccm5vanq8y709, child number 0
-------------------------------------
select status from t1 where object_id=88

Plan hash value: 190799060

---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |      1 |00:00:00.01 |       3 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |      1 |      1 |      1 |00:00:00.01 |       3 |
|*  2 |   INDEX RANGE SCAN          | IDX_T1_ID |      1 |      1 |      1 |00:00:00.01 |       2 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID"=88)


19 rows selected.

--更改数据造成数据倾斜;100条,88的99条,99的值1条;
SQL> update t1 set object_id=88 where rownum<100;

99 rows updated.

SQL> commit;

Commit complete.

SQL> update t1 set object_id=99 where object_id<>88;

1 row updated.

SQL> commit;

Commit complete.

--再次收集必变后的表信息:exec dbms_stats.gather_table_stats('AIKI','T1',cascade=>true);
--这个计划是对的,因为88的值的行数占了99%;
select /*1*/status from t1 where object_id=88;
SQL> @allstat

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  4natz4s7h1uda, child number 0
-------------------------------------
select /*1*/status from t1 where object_id=88

Plan hash value: 3617692013

------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |      1 |        |     99 |00:00:00.01 |      10 |
|*  1 |  TABLE ACCESS FULL| T1   |      1 |     99 |     99 |00:00:00.01 |      10 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_ID"=88)


18 rows selected.

--重新做次收集,目的关闭直方图(for..size 1);
exec dbms_stats.gather_table_stats('AIKI','T1',cascade=>true,method_opt=>'for all columns size 1');

--看下执行计划,可以看到计划并非最优,它选择了范围扫描,而本来是全表扫描会略好;
select /*3*/status from t1 where object_id=88;

SQL> @allstat

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  cn417ch2unkp7, child number 0
-------------------------------------
select /*3*/status from t1 where object_id=88

Plan hash value: 190799060

---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |      1 |        |     99 |00:00:00.01 |      17 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |      1 |     50 |     99 |00:00:00.01 |      17 |
|*  2 |   INDEX RANGE SCAN          | IDX_T1_ID |      1 |     50 |     99 |00:00:00.01 |       8 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID"=88)


19 rows selected.




--看下此时列的直方图情况,可以看到直方图确实都关闭了;

SQL优化【基础06】 -  统计信息(直方图)_SQL_03


--重新收集列OBJECT_ID上的直方图信息
exec dbms_stats.gather_table_stats('AIKI','T1',cascade=>true,method_opt=>'for  columns object_id size auto',estimate_percent=>100);

SQL优化【基础06】 -  统计信息(直方图)_直方图_04


select /*5*/status from t1 where object_id=88;

--计划已经回归到最优了;
SQL> @allstat

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  gts3wuc1sy8dn, child number 0
-------------------------------------
select /*5*/status from t1 where object_id=88

Plan hash value: 3617692013

------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |      1 |        |     99 |00:00:00.01 |      10 |
|*  1 |  TABLE ACCESS FULL| T1   |      1 |     99 |     99 |00:00:00.01 |      10 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_ID"=88)


18 rows selected.



--直方图是什么 ?直方图就是列的统计信息,看下面的图你就明白了,88的值有99个,99的值有100-99个,100是累加值;


select * from dba_tab_histograms where owner='AIKI' and table_name='T1'

SQL优化【基础06】 -  统计信息(直方图)_ide_05



--直方图分为:等频和等高;怎么区别,很简单,列的唯一值>桶数:等高(HEIGHT BALANCED);等频:唯一值<=桶数,如下图所示,(其中11G之前的最大桶数不能超过254,软件局限),12C可超过;

select owner,table_name,column_name, histogram from dba_tab_col_statistics
where owner='AIKI'
and table_name='T1'


SQL优化【基础06】 -  统计信息(直方图)_ide_06

分区表的统计收集命令,操练下就明白了:

exec dbms_stats.gather_table_stats('AIKI','PT',partname=>'T2',CASCADE=>TRUE,METHOD_OPT=>'FOR ALL COLUMNS SIZE 1',granularity=>'ALL');  --全局表+分区表的信息都做统计
exec dbms_stats.gather_table_stats('AIKI','PT',partname=>'T2',CASCADE=>TRUE,METHOD_OPT=>'FOR ALL COLUMNS SIZE 1',granularity=>'PARTITION');  --只分区表做统计


统计信息与应用关系:

1.通过上面测试可以看到,当数据出现倾斜时,则直方图就有存在的意义,否则就无意义了;并且此列在做收集前必须存在于运行SQL的WHERE条件后,这个很好理解(都没用,做直方图也无意义了);

2.比如这条语句:select status from t1 where object_id=66 and status='VALID';当出现执行慢时你会怎么思考?

很简单 :先SOSI下看下列object_id,status的列的唯一值数(选择率),表T1的统计信息,这样心中便可大致确定跟此列相关的索引情况,看下统计时间,是否分区表,索引分布,数据采样比例等;





好博客就要一起分享哦!分享海报

此处可发布评论

评论(0展开评论

暂无评论,快来写一下吧

展开评论

您可能感兴趣的博客

客服QQ 1913284695