As Jonathan Lewis commented on my article Clustering Factor: Row Migrtion’s Victim, there is even more to say about the difference between the good, old, and deprecated ANALYZE statement and the DBMS_STATS package. Jonathan mentioned that the CBO is using the CHAIN_CNT value in the statistics, if present, and suggested to try my “trapQL” after analyzing the base table in the old fashion.
For easy C&P I the complete new script to try:
CREATE TABLE row_mig1 (
a CHAR(2000),
b CHAR(2000),
c CHAR(2000),
x NUMBER NOT NULL,
filter NUMBER NOT NULL,
CONSTRAINT row_mig1_pk PRIMARY KEY (x)
) ENABLE ROW MOVEMENT;
BEGIN
FOR i IN 1..100000 LOOP
INSERT INTO row_mig1
(x, filter)
VALUES (i, trunc(dbms_random.value(0, 10)));
UPDATE row_mig1
SET a ='a', b = 'b', c = 'c'
WHERE x=i;
END LOOP;
END;
/
COMMIT;
CREATE INDEX row_mig1_idx ON row_mig1(filter);
ANALYZE TABLE row_mig1 COMPUTE STATISTICS;
ANALYZE INDEX row_mig1_pk COMPUTE STATISTICS;
ANALYZE INDEX row_mig1_idx COMPUTE STATISTICS;
SELECT i.index_name idx_name
, t.blocks table_blocks
, i.clustering_factor idx_clust_factor
, i.num_rows idx_rows
, t.chain_cnt table_chain_cnt
FROM user_indexes i
JOIN user_tables t USING (table_name)
WHERE table_name = 'ROW_MIG1';
This is the very same table as in my previous article. I just replaced the DBMS_STATS with ANALYZE. As expected, the CHAIN_CNT is set correctly:
IDX_NAME TABLE_BLOCKS IDX_CLUST_FACTOR IDX_ROWS TABLE_CHAIN_CNT -------------- ------------ ---------------- -------- --------------- ROW_MIG1_PK 100103 919 100000 99999 ROW_MIG1_IDX 100103 9187 100000 99999
So far, so good. Now let’s create the second table—without row migration:
CREATE TABLE row_mig2 (
a CHAR(2000),
b CHAR(2000),
c CHAR(2000),
x NUMBER NOT NULL,
filter NUMBER NOT NULL,
CONSTRAINT row_mig2_pk PRIMARY KEY (x)
) ENABLE ROW MOVEMENT;
INSERT INTO row_mig2 (x, filter, a, b, c)
SELECT level, trunc(dbms_random.value(0,17)), 'a', 'b', 'c'
FROM dual CONNECT BY level <= 100000;
COMMIT;
CREATE INDEX row_mig2_idx ON row_mig2(filter);
ANALYZE TABLE row_mig2 COMPUTE STATISTICS;
ANALYZE INDEX row_mig2_pk COMPUTE STATISTICS;
ANALYZE INDEX row_mig2_idx COMPUTE STATISTICS;
SELECT i.index_name idx_name
, t.blocks table_blocks
, i.clustering_factor idx_clust_factor
, i.num_rows idx_rows
, t.chain_cnt table_chain_cnt
FROM user_indexes i
JOIN user_tables t USING (table_name)
WHERE table_name = 'ROW_MIG2';
Just to verify that there is no chaining when a plain insert approach is followed:
IDX_NAME TABLE_BLOCKS IDX_CLUST_FACTOR IDX_ROWS TABLE_CHAIN_CNT -------------- ------------ ---------------- -------- --------------- ROW_MIG2_IDX 100877 100000 100000 0 ROW_MIG2_PK 100877 100000 100000 0
So, let’s see how my “trapQL” works:
SET AUTOTRACE TRACEONLY; SET TIMING ON; SELECT /* analyze that! */ * FROM row_mig1 d1 JOIN row_mig2 d2 ON (d1.x = d2.x) WHERE d1.filter = 0 AND d2.filter = 0; SET TIMING OFF SET AUTOTRACE OFF
And the result is:
602 rows selected.
Elapsed: 00:01:00.73
Execution Plan
----------------------------------------------------------
Plan hash value: 3163423506
----------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost |
----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5882 | 17663 |
| 1 | NESTED LOOPS | | | |
| 2 | NESTED LOOPS | | 5882 | 17663 |
| 3 | TABLE ACCESS BY INDEX ROWID| ROW_MIG2 | 5882 | 5896 |
|* 4 | INDEX RANGE SCAN | ROW_MIG2_IDX | 5882 | 12 |
|* 5 | INDEX UNIQUE SCAN | ROW_MIG1_PK | 1 | 0 |
|* 6 | TABLE ACCESS BY INDEX ROWID | ROW_MIG1 | 1 | 2 |
----------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("D2"."FILTER"=0)
5 - access("D1"."X"="D2"."X")
6 - filter("D1"."FILTER"=0)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
20929 consistent gets
12994 physical reads
0 redo size
39658 bytes sent via SQL*Net to client
859 bytes received via SQL*Net from client
42 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
602 rows processed
Surprising, at least to me. The optimizer considers the CHAIN_CNT and produces the best possible execution plan. The cost for the TABLE ACCESS BY INDEX ROW ID has grown to two. That’s a very concise reflection of the additional block that needs to be fetched.
To verify that the difference is the CHAIN_CNT let’s verify the plan after DBMS_STATS collection:
SQL> BEGIN
DBMS_STATS.GATHER_TABLE_STATS(null, 'ROW_MIG1',
CASCADE => TRUE);
DBMS_STATS.GATHER_TABLE_STATS(null, 'ROW_MIG12',
CASCADE => TRUE);
END;
/
PL/SQL procedure successfully completed.
SQL> SELECT i.index_name idx_name
, t.blocks table_blocks
, i.clustering_factor idx_clust_factor
, i.num_rows idx_rows
, t.chain_cnt table_chain_cnt
FROM user_indexes i
JOIN user_tables t USING (table_name)
WHERE table_name IN ('ROW_MIG1', 'ROW_MIG2')
ORDER BY i.index_name;
IDX_NAME TABLE_BLOCKS IDX_CLUST_FACTOR IDX_ROWS TABLE_CHAIN_CNT
-------------- ------------ ---------------- -------- ---------------
ROW_MIG1_IDX 100622 9180 100000 99999
ROW_MIG1_PK 100622 918 100000 99999
ROW_MIG2_IDX 100239 100000 100000 0
ROW_MIG2_PK 100239 100000 100000 0
Well, as a matter of fact, DBMS_STATS just doesn’t care about the CHAIN_CNT. So, the old values remain there. On the other hand, there are some minor differences in the other stats like the table blocks and the index clustering factor of ROW_MIG1_IDX. However, the execution plan doesn’t change:
602 rows selected.
Elapsed: 00:01:03.71
Execution Plan
----------------------------------------------------------
Plan hash value: 3163423506
----------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost |
----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 6084 | 18270 |
| 1 | NESTED LOOPS | | | |
| 2 | NESTED LOOPS | | 6084 | 18270 |
| 3 | TABLE ACCESS BY INDEX ROWID| ROW_MIG2 | 6084 | 6098 |
|* 4 | INDEX RANGE SCAN | ROW_MIG2_IDX | 6084 | 12 |
|* 5 | INDEX UNIQUE SCAN | ROW_MIG1_PK | 1 | 0 |
|* 6 | TABLE ACCESS BY INDEX ROWID | ROW_MIG1 | 1 | 2 |
----------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("D2"."FILTER"=0)
5 - access("D1"."X"="D2"."X")
6 - filter("D1"."FILTER"=0)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
21131 consistent gets
12911 physical reads
0 redo size
39658 bytes sent via SQL*Net to client
859 bytes received via SQL*Net from client
42 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
602 rows processed
Well, it seems that the CHAIN_CNT is really making all the difference since it is the only value that was not calculated by DMBS_STATS…or is it? The ultimate proof would be to update the CHAIN_CNT manually to zero in the same way as I manually update the CLUSTERING_FACTOR using DBMS_STATS in my previous post. However, I already mentioned DBMS_STATS doesn’t care about the CHAIN_CNT, consequently there is no parameter in SET_TABLE_STATS to manually update it.
For the sake of science, I have done what nobody should ever do. And I will definitely not post what I’ve done. However, I did it, and I can tell you; the CHAIN_CNT is making all the difference.
Summary
The story from the previous posts is a little bit depressing. Although ANALYZE is deprecated, it provides more information to the CBO than DBMS_STATS. Using the recommended DBMS_STATS alone opens the problem that I have demonstrated in my previous post. The depressing factor is that this is actually a regression. A useful functionality of the CBO was “removed” by the recommendation to use DBMS_STATS.
Once more I must emphasize that all of that trouble was caused by the excessive use of the “insert empty, update everything” anti-pattern. Obviously everybody should try very hard not to follow that pattern. That solves all the problems.
Thanks
Thanks to Jonathan Lewis for pointing me onto the question how my “trapQL” works with ANALYZE.
Special thanks to Gerhard Kircher.

