Oracle Choosing a Bad Execution Plan Due to Wrong Cardinality Estimation, and Using Dynamic Sampling or a SQL Profile to Get a Better Plan

Sometimes the optimizer chooses a bad execution plan for a query because of a wrong cardinality estimate. Say you have loaded data into a table that has no statistics; without statistics, the optimizer may pick a poor plan. Another scenario: you gather table statistics at night, but heavy DML runs against the tables during the day, so the optimizer works from stale statistics and again may pick a poor plan because of a wrong cardinality estimate. Right cardinality equals right plan; wrong cardinality equals wrong plan. Dynamic sampling or a SQL profile can be used to avoid such plans. Below I show how the optimizer chooses a bad execution plan because of wrong cardinality estimation and how this can be overcome with dynamic sampling. I will discuss:

1. What is Dynamic sampling

2. Scenarios where Oracle chooses a bad execution plan, and how dynamic sampling improves it

a. Load data in a table

b. Delete data from a table

c. When the predicate gets just a little more complex, the optimizer misestimates the cardinality

3. When to use dynamic sampling

1. What is Dynamic Sampling:

Optimizer dynamic sampling refers to the ability of the SQL optimizer to take a sample of rows from a table in order to compute missing statistics and generate better execution plans. Dynamic sampling takes place only at hard parse time, and the amount sampled depends on the dynamic sampling level. Dynamically gathered statistics can be useful in the following circumstances:

· Statistics are not present

· Statistics are out of date

· Current statistics cannot be used or lead to estimation errors
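Before relying on dynamic sampling, it can help to confirm whether a table's statistics are actually missing or stale. A quick dictionary check might look like this (a sketch; TEST1 is the demo table used below, and the STALE_STATS column assumes Oracle 11g or later):

```sql
-- NUM_ROWS is NULL when the table has never been analyzed;
-- an old LAST_ANALYZED date or STALE_STATS = 'YES' suggests stale statistics.
SELECT num_rows, blocks, last_analyzed, stale_stats
FROM   user_tab_statistics
WHERE  table_name = 'TEST1';
```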

Default value of the OPTIMIZER_DYNAMIC_SAMPLING parameter:

If OPTIMIZER_FEATURES_ENABLE is set to 10.0.0 or higher, then 2

If OPTIMIZER_FEATURES_ENABLE is set to 9.2.0, then 1

If OPTIMIZER_FEATURES_ENABLE is set to 9.0.1 or lower, then 0

Level of Dynamic Sampling:

· Level 0: Do not use dynamic sampling.

· Level 1: Sample all tables that have not been analyzed if the following criteria are met: (1) there is at least 1 unanalyzed table in the query; (2) this unanalyzed table is joined to another table or appears in a subquery or non-mergeable view; (3) this unanalyzed table has no indexes; (4) this unanalyzed table has more blocks than the number of blocks that would be used for dynamic sampling of this table. The number of blocks sampled is the default number of dynamic sampling blocks (32).

· Level 2: Apply dynamic sampling to all unanalyzed tables. The number of blocks sampled is two times the default number of dynamic sampling blocks.

· Level 3: Apply dynamic sampling to all tables that meet Level 2 criteria, plus all tables for which standard selectivity estimation used a guess for some predicate that is a potential dynamic sampling predicate. The number of blocks sampled is the default number of dynamic sampling blocks. For unanalyzed tables, the number of blocks sampled is two times the default number of dynamic sampling blocks.

· Level 4: Apply dynamic sampling to all tables that meet Level 3 criteria, plus all tables that have single-table predicates that reference 2 or more columns. The number of blocks sampled is the default number of dynamic sampling blocks. For unanalyzed tables, the number of blocks sampled is two times the default number of dynamic sampling blocks.

· Levels 5, 6, 7, 8, and 9: Apply dynamic sampling to all tables that meet the previous level criteria using 2, 4, 8, 32, or 128 times the default number of dynamic sampling blocks respectively.

· Level 10: Apply dynamic sampling to all tables that meet the Level 9 criteria using all blocks in the table.

There are two ways to use dynamic sampling:

· The OPTIMIZER_DYNAMIC_SAMPLING parameter can be set at the database instance level and can also be overridden at the session level with the ALTER SESSION command.

· The DYNAMIC_SAMPLING query hint can be added to specific queries.
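For example, the two methods might look like this (a sketch; the session-level setting affects every query parsed in that session, while the hint affects only the one statement it appears in):

```sql
-- Session level: raise the sampling level for all queries in this session
ALTER SESSION SET optimizer_dynamic_sampling = 4;

-- Statement level: sample only the named table, at level 3
SELECT /*+ dynamic_sampling(t 3) */ *
FROM   test1 t
WHERE  owner = 'SYS';

-- Check the current setting
SHOW PARAMETER optimizer_dynamic_sampling
```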

2. Scenarios where Oracle chooses a bad execution plan, and how dynamic sampling improves it

a. Load data in a table

In this scenario we load data into a table that has no statistics and see how the optimizer produces a wrong cardinality estimate; with dynamic sampling it arrives at a much better one.

SQL> create table test1

as

select owner,object_type

from all_objects;

Table created.

SQL> select count(*) from test1;

COUNT(*)

----------

92071

SQL> set autotrace traceonly explain

SQL> select /*+ dynamic_sampling(test1 0) */ * from test1;

Execution Plan

----------------------------------------------------------

Plan hash value: 3852271815

---------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

---------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 20829 | 1606K| 71 (0)| 00:00:01 |

| 1 | TABLE ACCESS FULL| TEST1 | 20829 | 1606K| 71 (0)| 00:00:01 |

---------------------------------------------------------------------------

Note

-----

- automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold

With dynamic sampling disabled (level 0), the optimizer estimates 20,829 rows against an actual count of 92,071. Now let's use dynamic sampling level 3.

SQL> select /*+ dynamic_sampling(test1 3) */ * from test1;

Execution Plan

----------------------------------------------------------

Plan hash value: 3852271815

---------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

---------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 98002 | 7560K| 72 (2)| 00:00:01 |

| 1 | TABLE ACCESS FULL| TEST1 | 98002 | 7560K| 72 (2)| 00:00:01 |

---------------------------------------------------------------------------

Note

-----

- dynamic statistics used: dynamic sampling (level=0)

- automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold

With dynamic sampling, the optimizer's estimate (98,002 rows) is much closer to the actual row count.
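Dynamic sampling helps at parse time, but for a table that is simply missing statistics the durable fix is to gather them. A sketch using DBMS_STATS:

```sql
-- Gather statistics so every future hard parse sees accurate row counts
BEGIN
  dbms_stats.gather_table_stats(
    ownname => user,
    tabname => 'TEST1');
END;
/
```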

b. Delete data from a table

In this scenario we delete all rows from the table and see that the optimizer still estimates the old cardinality; with dynamic sampling it produces a much better estimate.

SQL> delete from test1;

92071 rows deleted.

SQL> commit;

Commit complete.

SQL> set autotrace traceonly explain

SQL> select /*+ dynamic_sampling(test1 0) */ * from test1;

Execution Plan

----------------------------------------------------------

Plan hash value: 3852271815

---------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

---------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 92071 | 1348K| 72 (2)| 00:00:01 |

| 1 | TABLE ACCESS FULL| TEST1 | 92071 | 1348K| 72 (2)| 00:00:01 |

---------------------------------------------------------------------------

Note

-----

- automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold

The optimizer still estimates 92,071 rows even though every row in the table has been deleted. Now let's use dynamic sampling level 3.

SQL> select /*+ dynamic_sampling(test1 3) */ * from test1;

Execution Plan

----------------------------------------------------------

Plan hash value: 3852271815

---------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

---------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | 79 | 71 (0)| 00:00:01 |

| 1 | TABLE ACCESS FULL| TEST1 | 1 | 79 | 71 (0)| 00:00:01 |

---------------------------------------------------------------------------

Note

-----

- dynamic statistics used: dynamic sampling (level=0)

- automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold

After using dynamic sampling, the optimizer's estimate (1 row) reflects the now-empty table.

c. When the predicate gets just a little more complex, the optimizer misestimates the cardinality

Here we have a table with freshly gathered statistics. Even with good statistics, the optimizer misestimates the cardinality and chooses a bad plan once the predicate gets just a little more complex. The estimate is accurate when we filter on either val1 or val2 alone, but when both columns appear in the WHERE clause (val1 and val2) the optimizer gets the cardinality wrong, because it does not know the relationship between val1 and val2. With dynamic sampling, the optimizer chooses a better execution plan.

SQL> create table test2

as select decode( mod(rownum,2), 0, 'N', 'Y' ) val1,

decode( mod(rownum,2), 0, 'Y', 'N' ) val2, a.*

from all_objects a

/

Table created.

SQL> create index test2_idx on test2(val1,val2);

Index created.

SQL> begin

dbms_stats.gather_table_stats

( user, 'test2',

method_opt=>'for all indexed columns size 254' );

end;

/

PL/SQL procedure successfully completed.

SQL> select count(*) total_rows,count(*)/2 half_rows,count(*)/2/2 quarter_rows from test2;

TOTAL_ROWS HALF_ROWS QUARTER_ROWS

---------- ---------- ------------

92062 46031 23015.5

Let's check the optimizer's cardinality estimates when filtering on val1='N' and on val2='N' individually.

SQL> select * from test2 where val1='N';

Execution Plan

----------------------------------------------------------

Plan hash value: 3778028574

---------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

---------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 46031 | 5349K| 438 (1)| 00:00:01 |

|* 1 | TABLE ACCESS FULL| TEST2 | 46031 | 5349K| 438 (1)| 00:00:01 |

---------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter("VAL1"='N')

Note

-----

- automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold

SQL> select * from test2 where val2='N';

Execution Plan

----------------------------------------------------------

Plan hash value: 3778028574

---------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

---------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 46031 | 5349K| 438 (1)| 00:00:01 |

|* 1 | TABLE ACCESS FULL| TEST2 | 46031 | 5349K| 438 (1)| 00:00:01 |

---------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter("VAL2"='N')

Note

-----

- automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold

Looks good: the optimizer's estimates are accurate. Now let's try a slightly more complex predicate.

SQL> select * from test2 where val1 = 'N' and val2 = 'N';

Execution Plan

----------------------------------------------------------

Plan hash value: 3778028574

---------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

---------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 23016 | 2674K| 437 (1)| 00:00:01 |

|* 1 | TABLE ACCESS FULL| TEST2 | 23016 | 2674K| 437 (1)| 00:00:01 |

---------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter("VAL1"='N' AND "VAL2"='N')

Note

-----

- automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold

The optimizer estimates 23,016 rows, but because of how the data was generated, val1 and val2 always differ, so no row has both val1='N' and val2='N' and the actual count is 0. The optimizer does not know the relationship between val1 and val2, so it multiplies the individual selectivities and chooses a bad execution plan (a full table scan). Let's use dynamic sampling level 3.

SQL> select /*+ dynamic_sampling(test2 3) */ * from test2 where val1 = 'N' and val2 = 'N';

Execution Plan

----------------------------------------------------------

Plan hash value: 1613614129

-------------------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

-------------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 10 | 1190 | 2 (0)| 00:00:01 |

| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| TEST2 | 10 | 1190 | 2 (0)| 00:00:01 |

|* 2 | INDEX RANGE SCAN | TEST2_IDX | 10 | | 1 (0)| 00:00:01 |

-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("VAL1"='N' AND "VAL2"='N')

Note

-----

- dynamic statistics used: dynamic sampling (level=0)

- automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold

Now the optimizer's estimate is far more accurate, and it uses the index instead of a full table scan.

3. When to use dynamic sampling

It's a tricky question. In an OLTP system, thousands of small, fast queries execute per second, so increasing parse time could mean spending more time parsing than executing SQL. A high dynamic sampling level is therefore not advisable in OLTP. In a data warehousing environment, query execution time is long and parse time is a small fraction of the overall run time, so a higher dynamic sampling level is fine there.

A SQL profile is an alternative to dynamic sampling. A SQL profile is a database object that contains auxiliary statistics specific to a SQL statement; conceptually, a SQL profile is to a SQL statement what object-level statistics are to a table or index. SQL profiles are created when a DBA invokes SQL Tuning Advisor. The benefit of a SQL profile is that it saves the cost of dynamic sampling by sampling once and persisting the values.

You might wonder why not always use SQL profiles instead of dynamic sampling. In a data warehouse, users typically generate SQL in an ad hoc fashion: different SQL runs from day to day, bind variables are often not used, and the same statement may never execute again, so there is no statement from which to build a profile. Dynamic sampling is therefore a good fit for a data warehouse. In an OLTP system, most queries use bind variables and the same SQL runs repeatedly, so a SQL profile is the better choice: dynamic sampling spends time and resources at every hard parse, whereas a SQL profile gathers the statistics once and persists them for all subsequent optimizations of that query.
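As a sketch of the SQL profile route described above (the task name here is illustrative), a DBA would run SQL Tuning Advisor through the DBMS_SQLTUNE package and accept the recommended profile:

```sql
-- Create and run a tuning task for the problem statement
DECLARE
  l_task VARCHAR2(64);
BEGIN
  l_task := dbms_sqltune.create_tuning_task(
              sql_text  => q'[select * from test2 where val1 = 'N' and val2 = 'N']',
              task_name => 'test2_tuning_task');
  dbms_sqltune.execute_tuning_task(task_name => 'test2_tuning_task');
END;
/

-- Review the advisor's findings and recommendations
SELECT dbms_sqltune.report_tuning_task('test2_tuning_task') FROM dual;

-- If a SQL profile is recommended, accept it so it persists for future parses
BEGIN
  dbms_sqltune.accept_sql_profile(
    task_name => 'test2_tuning_task',
    replace   => TRUE);
END;
/
```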