There is a temporary table

 CREATE GLOBAL TEMPORARY TABLE "TMP_CONTROL_POINT_DETECT"
 (
   "POINT_ID" NUMBER NOT NULL ENABLE,
   CONSTRAINT "TMP_CONTROL_POINT_DETECT_PK" PRIMARY KEY ("POINT_ID") ENABLE
 )
 ON COMMIT PRESERVE ROWS;

And there is a persistent table

 CREATE TABLE "CONTROL_POINTS_"
 (
   "ID"     NUMBER,
   "STATUS" NUMBER(1,0),
   CONSTRAINT "PK_CONTROL_POINTS_" PRIMARY KEY ("ID")
     USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
     TABLESPACE "USERS" ENABLE
 );

There are 200 rows in the temporary table and 29,000 in the persistent one.

I run the query

 SELECT cp.status
   FROM TMP_CONTROL_POINT_DETECT det
   JOIN CONTROL_POINTS_ cp ON ( det.POINT_ID = cp.ID )

and am horrified by the plan

 -----------------------------------------------------------------------------------------------
 | Id  | Operation          | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
 -----------------------------------------------------------------------------------------------
 |   0 | SELECT STATEMENT   |                          |   200 |  3800 |    18   (6)| 00:00:01 |
 |*  1 |  HASH JOIN         |                          |   200 |  3800 |    18   (6)| 00:00:01 |
 |   2 |   TABLE ACCESS FULL| TMP_CONTROL_POINT_DETECT |   200 |  2600 |     2   (0)| 00:00:01 |
 |   3 |   TABLE ACCESS FULL| CONTROL_POINTS_          | 29303 |   171K|    16   (7)| 00:00:01 |
 -----------------------------------------------------------------------------------------------

 Predicate Information (identified by operation id):
 ---------------------------------------------------

    1 - access("DET"."POINT_ID"="CP"."ID")

 Note
 -----
    - dynamic statistics used: dynamic sampling (level=2)
    - this is an adaptive plan

Then I change the selected column in the query.

 SELECT cp.id
   FROM TMP_CONTROL_POINT_DETECT det
   JOIN CONTROL_POINTS_ cp ON ( det.POINT_ID = cp.ID )

And get the expected plan

 -----------------------------------------------------------------------------------------------
 | Id  | Operation          | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
 -----------------------------------------------------------------------------------------------
 |   0 | SELECT STATEMENT   |                          |   200 |  3600 |     2   (0)| 00:00:01 |
 |   1 |  NESTED LOOPS      |                          |   200 |  3600 |     2   (0)| 00:00:01 |
 |   2 |   TABLE ACCESS FULL| TMP_CONTROL_POINT_DETECT |   200 |  2600 |     2   (0)| 00:00:01 |
 |*  3 |   INDEX UNIQUE SCAN| PK_CONTROL_POINTS_       |     1 |     5 |     0   (0)| 00:00:01 |
 -----------------------------------------------------------------------------------------------

 Predicate Information (identified by operation id):
 ---------------------------------------------------

    3 - access("DET"."POINT_ID"="CP"."ID")

 Note
 -----
    - dynamic statistics used: dynamic sampling (level=2)

Can anyone explain what is happening? Where does the FULL SCAN come from? The query itself returns an honest 200 rows.
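For reference, the plans above look like DBMS_XPLAN output; a typical way to reproduce such a plan (sketch, assuming standard Oracle tooling) is:

```sql
-- Sketch: one common way to obtain plans like the ones shown above.
EXPLAIN PLAN FOR
SELECT cp.status
  FROM TMP_CONTROL_POINT_DETECT det
  JOIN CONTROL_POINTS_ cp ON ( det.POINT_ID = cp.ID );

-- Render the contents of PLAN_TABLE in the familiar tabular form.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```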

A manipulation of the form

 SELECT cp.STATUS
   FROM CONTROL_POINTS_ cp
  WHERE cp.ID IN (SELECT ID FROM TMP_CONTROL_POINT_DETECT det)

leads to even more depressing consequences

 -------------------------------------------------------------------------------------------------
 | Id  | Operation            | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
 -------------------------------------------------------------------------------------------------
 |   0 | SELECT STATEMENT     |                          |  5860K|    33M|  2820   (4)| 00:00:01 |
 |   1 |  MERGE JOIN CARTESIAN|                          |  5860K|    33M|  2820   (4)| 00:00:01 |
 |   2 |   TABLE ACCESS FULL  | TMP_CONTROL_POINT_DETECT |   200 |       |     2   (0)| 00:00:01 |
 |   3 |   BUFFER SORT        |                          | 29303 |   171K|  2818   (4)| 00:00:01 |
 |   4 |    TABLE ACCESS FULL | CONTROL_POINTS_          | 29303 |   171K|    14   (0)| 00:00:01 |
 -------------------------------------------------------------------------------------------------
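One detail worth noting here: per the DDL above, the temporary table has no ID column, only POINT_ID. In the subquery `SELECT ID FROM TMP_CONTROL_POINT_DETECT det`, the identifier ID therefore resolves to the outer query's CP.ID, so the IN condition compares cp.ID with itself and is true for every row, which is exactly why the plan shows a cartesian product (200 × 29,303 ≈ 5860K rows, matching the estimate). A version with the column properly qualified would read:

```sql
-- The subquery must reference the temp table's actual column, POINT_ID.
-- The unqualified "SELECT ID FROM TMP_CONTROL_POINT_DETECT" silently
-- resolved ID to the outer CP.ID, turning the IN into a tautology.
SELECT cp.STATUS
  FROM CONTROL_POINTS_ cp
 WHERE cp.ID IN (SELECT det.POINT_ID
                   FROM TMP_CONTROL_POINT_DETECT det);
```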

Update

The temporary table has nothing to do with it: with the same structure and the same data in an ordinary persistent table, I get the same picture.

  • Why is the plan terrible? The optimizer believes it is faster to read all the rows and then combine them with a hash join than to probe the index and fetch from the table row by row. 30K rows is not that much; everything fits in memory. - 0xdb
  • @0xdb Yes, but in the real query a couple more tables are joined on top, and then the numbers get wild: up to 200K rows come back. - Anton Shchyrov
  • Actually, there is nothing criminal here. The optimizer thinks this way is faster; exactly why is hard to say. But it does hint that your 29 thousand rows amount to only 171K. Better to start from whether the query is fast enough for your purposes; if it is not, then you can try to optimize it with hints. - Viktorov
  • From myself I can add: even in the second case the optimizer could have dropped the index with its slower nested loops and chosen a full scan with a fast hash join. It all depends on how many rows it predicts in both row sets and, consequently, roughly how many I/O operations are required. In your case, with only an equi inner join as a filter and a relatively small number of rows, the choice of a hash join or a sort-merge join is expected and even desirable. - 0xdb

2 Answers

Both plans are absolutely predictable. In the second case you only need a value that is already stored in the index, so only the index is read.

In the first case you need to fetch the value from the table blocks. You have 29,000 rows with an average row length of about 6 bytes: on disk the entire table is 171 KB, as seen in the plan. With a 4 KB block size that is about 43 blocks. Fetching those 43 blocks through 200 individual index rowid lookups is, to put it mildly, expensive: you are practically guaranteed to read the whole table anyway, and each block would be visited about 4-5 times on average. A full scan, and with a hash join at that, is more than justified.

Using an index is effective when it lets you touch no more than about 10% of all the table's blocks.

  • The original problem is that in the real query two more tables are joined. And because of the FULL SCAN on CONTROL_POINTS_, full scans end up running on the remaining tables as well. As a result I get a result set of 228K rows, 2896K bytes in size. Is that also considered normal? - Anton Shchyrov
  • @AntonShchyrov What are you joining this table to, and more importantly on which columns, if it only has STATUS and ID... And 2 MB is not very much at all. What is the execution time? I am only explaining what guides the optimizer. If you are not satisfied with the plan, use hints. - Mike
  • Runtime is ~0.3-0.5 seconds, including the fetch. Joins are only on indexed columns. Okay, I understand the optimizer's position. Thank you. - Anton Shchyrov
  • @AntonShchyrov It's enough to slip a use_nl(control_points) hint into the query and see if it gets better. - 0xdb
  • @0xdb I tried use_nl. But it was index that brought the query back to life. - Anton Shchyrov

Using the index hint brought the query back to life:

 SELECT /*+ index(cp PK_CONTROL_POINTS) */ cp.STATUS
   FROM TMP_CONTROL_POINT_DETECT det
   JOIN CONTROL_POINTS PARTITION (lic) cp ON ( det.POINT_ID = cp.ID )
 --------------------------------------------------------------------------------------------------------------------------------
 | Id  | Operation                           | Name                     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
 --------------------------------------------------------------------------------------------------------------------------------
 |   0 | SELECT STATEMENT                    |                          |   200 |  4000 |   202   (0)| 00:00:01 |       |       |
 |   1 |  NESTED LOOPS                       |                          |   200 |  4000 |   202   (0)| 00:00:01 |       |       |
 |   2 |   NESTED LOOPS                      |                          |   200 |  4000 |   202   (0)| 00:00:01 |       |       |
 |   3 |    TABLE ACCESS FULL                | TMP_CONTROL_POINT_DETECT |   200 |  2600 |     2   (0)| 00:00:01 |       |       |
 |*  4 |    INDEX UNIQUE SCAN                | PK_CONTROL_POINTS        |     1 |       |     0   (0)| 00:00:01 |       |       |
 |   5 |   TABLE ACCESS BY GLOBAL INDEX ROWID| CONTROL_POINTS           |     1 |     7 |     1   (0)| 00:00:01 |     2 |     2 |
 --------------------------------------------------------------------------------------------------------------------------------

 Predicate Information (identified by operation id):
 ---------------------------------------------------

    4 - access("DET"."POINT_ID"="CP"."ID")
        filter(TBL$OR$IDX$PART$NUM("CONTROL_POINTS",0,1,0,ROWID)=2)
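A final note: the plans in the question both say "dynamic statistics used: dynamic sampling (level=2)", meaning the optimizer had no gathered statistics and was estimating cardinalities on the fly. Before resorting to hints, it may be worth gathering statistics on the persistent table so the optimizer has accurate row counts and column data to cost the join with (a sketch using the standard DBMS_STATS package; adjust the owner as needed):

```sql
-- Sketch: gather optimizer statistics on the persistent table so the
-- optimizer does not have to rely on dynamic sampling.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,               -- current schema; replace if the table lives elsewhere
    tabname => 'CONTROL_POINTS_',
    cascade => TRUE);              -- also gather statistics on the PK index
END;
/
```

Note that statistics cannot help with the global temporary table's contents, which is exactly what the dynamic sampling in the plans is compensating for.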