Why is there such a big difference in execution time between these two join conditions?

Question

Suppose you need to join on a partial match from the beginning of the string (a prefix match).

The joined columns are indexed.

I can join like this:

 t.field1 like t2.field1+'%'

or I can join as follows (since the data allows the value for an exact match to be computed):

 t.field1 =LEFT(t2.field1,LEN(t2.field1)-CHARINDEX('/',reverse(t2.field1)))

The second method is many times faster than the first, although in theory they should take about the same time, since in both cases the search starts from the beginning of the field.

Why is that?

You answered it yourself: the data allows you to compute the value for an exact match.
@vikttur, is the cost of the computation really lower than that of a plain LIKE?
Seeing an inexact match, the optimizer may well expect too many rows to satisfy the condition and choose a completely different execution plan (for example, deciding that using the index for such a large number of rows is not efficient).
With an exact match, it estimates the expected number of rows differently.
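
One way to check this guess about the optimizer's estimates is to run both joins with I/O and timing statistics switched on and compare the actual execution plans. The following is only a sketch: it assumes SQL Server and tables t and t2 with an indexed field1 column, as described in the question; the real table definitions and indexes are not shown in the thread.

```sql
-- Sketch: compare the two join conditions on SQL Server.
-- Assumption: tables t and t2 exist, each with an indexed field1 column.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Condition 1: prefix match. For each t2 row this is a range predicate on t.field1.
SELECT t2.field1 AS t2_field1, t.field1 AS t_field1
FROM t2
JOIN t ON t.field1 LIKE t2.field1 + '%';

-- Condition 2: equality against a value computed from t2.field1.
SELECT t2.field1 AS t2_field1, t.field1 AS t_field1
FROM t2
JOIN t ON t.field1 = LEFT(t2.field1, LEN(t2.field1) - CHARINDEX('/', REVERSE(t2.field1)));

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The logical reads and elapsed times reported for each statement, together with the estimated and actual row counts in the plans, should show whether the two conditions really lead to different plans.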

i-one · Answer 1 · 2016-09-14T07:25:56

These two conditions have different semantics, so queries using one or the other will generally return different results. Not surprisingly, the execution time differs as well.

For example, given this data:

 create table t (field1 varchar(20));
 create table t2 (field1 varchar(20));
 insert into t values ('A'), ('A/B'), ('A/B/C'), ('A/B/C/D'), ('A/B/C/D/E');
 insert into t2 values ('A/B/C');

The query with the first condition

 select t2.field1 as t2_field1, t.field1 as t_field1
 from t2
 join t on t.field1 like t2.field1 + '%';

gives the result:

 t2_field1  t_field1
 ---------- ----------
 A/B/C      A/B/C
 A/B/C      A/B/C/D
 A/B/C      A/B/C/D/E

And the query with the second condition

 select t2.field1 as t2_field1, t.field1 as t_field1
 from t2
 join t on t.field1 = LEFT(t2.field1, LEN(t2.field1) - CHARINDEX('/', reverse(t2.field1)));

gives a different result:

 t2_field1            t_field1
 -------------------- --------------------
 A/B/C                A/B

That is, for one row from t2 the first query selects many rows from t (in this example, the element and its "subtree"), while the second selects only one (the "parent").
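
To see why the second condition picks the "parent", it may help to evaluate its expression step by step for the single t2 value from the example. This is just an illustrative sketch; the variable @s and the column aliases are made up for the demonstration.

```sql
-- Sketch: evaluate the second condition's expression for 'A/B/C'.
-- @s and the aliases are illustrative names, not part of the original queries.
DECLARE @s varchar(20) = 'A/B/C';

SELECT
    REVERSE(@s)                                     AS reversed,   -- 'C/B/A'
    CHARINDEX('/', REVERSE(@s))                     AS slash_pos,  -- 2
    LEFT(@s, LEN(@s) - CHARINDEX('/', REVERSE(@s))) AS parent;     -- 'A/B'
```

The expression cuts off everything from the last '/' onward, so the join compares t.field1 with 'A/B' for equality, whereas the LIKE condition compares it with the pattern 'A/B/C%'.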
