There is a temporary table

 CREATE GLOBAL TEMPORARY TABLE "TMP_CONTROL_POINT_DETECT"
 (
   "POINT_ID" NUMBER NOT NULL ENABLE,
   CONSTRAINT "TMP_CONTROL_POINT_DETECT_PK" PRIMARY KEY ("POINT_ID") ENABLE
 )
 ON COMMIT PRESERVE ROWS;

And there is a persistent table

 CREATE TABLE "CONTROL_POINTS_"
 (
   "ID"     NUMBER,
   "STATUS" NUMBER(1,0),
   CONSTRAINT "PK_CONTROL_POINTS_" PRIMARY KEY ("ID")
     USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
     TABLESPACE "USERS" ENABLE
 );

There are 200 rows in the temporary table and 29,000 in the persistent one.

I run the query

 SELECT cp.status
   FROM TMP_CONTROL_POINT_DETECT det
   JOIN CONTROL_POINTS_ cp ON ( det.POINT_ID = cp.ID )

and am horrified by the plan

 -----------------------------------------------------------------------------------------------
 | Id  | Operation          | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
 -----------------------------------------------------------------------------------------------
 |   0 | SELECT STATEMENT   |                          |   200 |  3800 |    18   (6)| 00:00:01 |
 |*  1 |  HASH JOIN         |                          |   200 |  3800 |    18   (6)| 00:00:01 |
 |   2 |   TABLE ACCESS FULL| TMP_CONTROL_POINT_DETECT |   200 |  2600 |     2   (0)| 00:00:01 |
 |   3 |   TABLE ACCESS FULL| CONTROL_POINTS_          | 29303 |   171K|    16   (7)| 00:00:01 |
 -----------------------------------------------------------------------------------------------

 Predicate Information (identified by operation id):
 ---------------------------------------------------

    1 - access("DET"."POINT_ID"="CP"."ID")

 Note
 -----
    - dynamic statistics used: dynamic sampling (level=2)
    - this is an adaptive plan

Then I change the selected column in the query.

 SELECT cp.id
   FROM TMP_CONTROL_POINT_DETECT det
   JOIN CONTROL_POINTS_ cp ON ( det.POINT_ID = cp.ID )

And get the expected plan

 -----------------------------------------------------------------------------------------------
 | Id  | Operation          | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
 -----------------------------------------------------------------------------------------------
 |   0 | SELECT STATEMENT   |                          |   200 |  3600 |     2   (0)| 00:00:01 |
 |   1 |  NESTED LOOPS      |                          |   200 |  3600 |     2   (0)| 00:00:01 |
 |   2 |   TABLE ACCESS FULL| TMP_CONTROL_POINT_DETECT |   200 |  2600 |     2   (0)| 00:00:01 |
 |*  3 |   INDEX UNIQUE SCAN| PK_CONTROL_POINTS_       |     1 |     5 |     0   (0)| 00:00:01 |
 -----------------------------------------------------------------------------------------------

 Predicate Information (identified by operation id):
 ---------------------------------------------------

    3 - access("DET"."POINT_ID"="CP"."ID")

 Note
 -----
    - dynamic statistics used: dynamic sampling (level=2)

Can anyone explain what is happening? Where does the FULL SCAN come from? The query itself returns an honest 200 rows.
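For reference, the plans above look like DBMS_XPLAN output; a typical way to reproduce such a plan (sketch, assuming standard Oracle tooling) is:

```sql
-- Sketch: one common way to obtain plans like the ones shown above.
EXPLAIN PLAN FOR
SELECT cp.status
  FROM TMP_CONTROL_POINT_DETECT det
  JOIN CONTROL_POINTS_ cp ON ( det.POINT_ID = cp.ID );

-- Render the contents of PLAN_TABLE in the familiar tabular form.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```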

A manipulation of the form

 SELECT cp.STATUS
   FROM CONTROL_POINTS_ cp
  WHERE cp.ID IN (SELECT ID FROM TMP_CONTROL_POINT_DETECT det)

leads to even more depressing consequences

 -------------------------------------------------------------------------------------------------
 | Id  | Operation            | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
 -------------------------------------------------------------------------------------------------
 |   0 | SELECT STATEMENT     |                          |  5860K|    33M|  2820   (4)| 00:00:01 |
 |   1 |  MERGE JOIN CARTESIAN|                          |  5860K|    33M|  2820   (4)| 00:00:01 |
 |   2 |   TABLE ACCESS FULL  | TMP_CONTROL_POINT_DETECT |   200 |       |     2   (0)| 00:00:01 |
 |   3 |   BUFFER SORT        |                          | 29303 |   171K|  2818   (4)| 00:00:01 |
 |   4 |    TABLE ACCESS FULL | CONTROL_POINTS_          | 29303 |   171K|    14   (0)| 00:00:01 |
 -------------------------------------------------------------------------------------------------
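One detail worth noting here: per the DDL above, the temporary table has no ID column, only POINT_ID. In the subquery `SELECT ID FROM TMP_CONTROL_POINT_DETECT det`, the identifier ID therefore resolves to the outer query's CP.ID, so the IN condition compares cp.ID with itself and is true for every row, which is exactly why the plan shows a cartesian product (200 × 29,303 ≈ 5860K rows, matching the estimate). A version with the column properly qualified would read:

```sql
-- The subquery must reference the temp table's actual column, POINT_ID.
-- The unqualified "SELECT ID FROM TMP_CONTROL_POINT_DETECT" silently
-- resolved ID to the outer CP.ID, turning the IN into a tautology.
SELECT cp.STATUS
  FROM CONTROL_POINTS_ cp
 WHERE cp.ID IN (SELECT det.POINT_ID
                   FROM TMP_CONTROL_POINT_DETECT det);
```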

Update

The temporary table has nothing to do with it: with the same structure and the same data in an ordinary persistent table, I get the same picture.

  • Why is the plan terrible? The optimizer believes it is faster to read all the rows and then combine them with a hash join than to probe the index and fetch from the table row by row. 30K rows is not that much; everything fits in memory. - 0xdb
  • @0xdb Yes, but in the real query a couple more tables are joined on top, and then the numbers get wild: up to 200K rows come back. - Anton Shchyrov
  • Actually, there is nothing criminal here. The optimizer thinks this way is faster; exactly why is hard to say. But it does hint that your 29 thousand rows amount to only 171K. Better to start from whether the query is fast enough for your purposes; if it is not, then you can try to optimize it with hints. - Viktorov
  • From myself I can add: even in the second case the optimizer could have dropped the index with its slower nested loops and chosen a full scan with a fast hash join. It all depends on how many rows it predicts in both row sets and, consequently, roughly how many I/O operations are required. In your case, with only an equi inner join as a filter and a relatively small number of rows, the choice of a hash join or a sort-merge join is expected and even desirable. - 0xdb

2 Answers

Both plans are absolutely predictable. In the second case you only need a value that is already stored in the index, so only the index is read.

In the first case you need to fetch the value from the table blocks. You have 29,000 rows with an average row length of about 6 bytes: on disk the entire table is 171 KB, as seen in the plan. With a 4 KB block size that is about 43 blocks. Fetching those 43 blocks through 200 individual index rowid lookups is, to put it mildly, expensive: you are practically guaranteed to read the whole table anyway, and each block would be visited about 4-5 times on average. A full scan, and with a hash join at that, is more than justified.

Using an index is effective when it lets you touch no more than about 10% of all the table's blocks.

  • The original problem is that in the real query two more tables are joined. And because of the FULL SCAN on CONTROL_POINTS_, full scans end up running on the remaining tables as well. As a result I get a result set of 228K rows, 2896K bytes in size. Is that also considered normal? - Anton Shchyrov
  • @AntonShchyrov What are you joining this table to, and more importantly on which columns, if it only has STATUS and ID... And 2 MB is not very much at all. What is the execution time? I am only explaining what guides the optimizer. If you are not satisfied with the plan, use hints. - Mike
  • Runtime is ~0.3-0.5 seconds, including the fetch. Joins are only on indexed columns. Okay, I understand the optimizer's position. Thank you. - Anton Shchyrov
  • @AntonShchyrov It's enough to slip a use_nl(control_points) hint into the query and see if it gets better. - 0xdb
  • @0xdb I tried use_nl. But it was index that brought the query back to life. - Anton Shchyrov

Using the index hint brought the query back to life:

 SELECT /*+ index(cp PK_CONTROL_POINTS) */ cp.STATUS
   FROM TMP_CONTROL_POINT_DETECT det
   JOIN CONTROL_POINTS PARTITION (lic) cp ON ( det.POINT_ID = cp.ID )
 --------------------------------------------------------------------------------------------------------------------------------
 | Id  | Operation                           | Name                     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
 --------------------------------------------------------------------------------------------------------------------------------
 |   0 | SELECT STATEMENT                    |                          |   200 |  4000 |   202   (0)| 00:00:01 |       |       |
 |   1 |  NESTED LOOPS                       |                          |   200 |  4000 |   202   (0)| 00:00:01 |       |       |
 |   2 |   NESTED LOOPS                      |                          |   200 |  4000 |   202   (0)| 00:00:01 |       |       |
 |   3 |    TABLE ACCESS FULL                | TMP_CONTROL_POINT_DETECT |   200 |  2600 |     2   (0)| 00:00:01 |       |       |
 |*  4 |    INDEX UNIQUE SCAN                | PK_CONTROL_POINTS        |     1 |       |     0   (0)| 00:00:01 |       |       |
 |   5 |   TABLE ACCESS BY GLOBAL INDEX ROWID| CONTROL_POINTS           |     1 |     7 |     1   (0)| 00:00:01 |     2 |     2 |
 --------------------------------------------------------------------------------------------------------------------------------

 Predicate Information (identified by operation id):
 ---------------------------------------------------

    4 - access("DET"."POINT_ID"="CP"."ID")
        filter(TBL$OR$IDX$PART$NUM("CONTROL_POINTS",0,1,0,ROWID)=2)
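A final note: the plans in the question both say "dynamic statistics used: dynamic sampling (level=2)", meaning the optimizer had no gathered statistics and was estimating cardinalities on the fly. Before resorting to hints, it may be worth gathering statistics on the persistent table so the optimizer has accurate row counts and column data to cost the join with (a sketch using the standard DBMS_STATS package; adjust the owner as needed):

```sql
-- Sketch: gather optimizer statistics on the persistent table so the
-- optimizer does not have to rely on dynamic sampling.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,               -- current schema; replace if the table lives elsewhere
    tabname => 'CONTROL_POINTS_',
    cascade => TRUE);              -- also gather statistics on the PK index
END;
/
```

Note that statistics cannot help with the global temporary table's contents, which is exactly what the dynamic sampling in the plans is compensating for.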