Hello.

I have a question about how exactly a JOIN works in SQL, in particular, in what order the rows are processed.

For example, there are two tables, A and B:

A:       B:
1  1     1  5
2  3     2  8

The join is on the first field.

SELECT A.number, B.number FROM A JOIN B ON A.number = B.number

In what order will the rows be compared (I mean during processing, not in the result set)? Like this, or not:

 1 1
 1 2
 2 1
 2 2

Where can I read about it?

  • Well, the result has two rows, and the order in which the discarded combinations are examined is hardly specified anywhere. They get thrown out, and that's that. - alexlz
  • "(meaning not in the result set, but during processing)" - progmb
  • Yes, I have already fixed my comment. I can give an exact answer: I do not know. In turn, I am curious: why do you need to know this? - alexlz
  • I just got curious about how it works. By the way, it may be possible to look in the source code of some open-source DBMS, although that would most likely be a special case, since different DBMSs can, and most likely do, implement the processing differently. - progmb
  • It should differ even within a single DBMS, depending on the ratio of row counts in the first and second tables, the presence or absence of indexes on these fields, and so on. In MS SQL there was a kind of analyzer that helped with tuning queries and, among other things, showed the query plan. - alexlz

2 answers

Most likely, the order in which the result is produced will depend on the order in which the data physically lies on disk. On the other hand, some of the data may already be cached, which can also affect the order of the result. To summarize: the order is not guaranteed and should be treated as random.

If you need the records in the result in a specific order, it is better to specify the sorting explicitly. This is one more step towards writing good software with predictable behavior.
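As a minimal sketch of this advice, here is the query from the question with an explicit ORDER BY, run through Python's built-in sqlite3 module. SQLite is used only because it is convenient to try out; the second column name (value) and the sample rows are assumptions made for illustration.

```python
import sqlite3

# In-memory database. Table names and the "number" column follow the question;
# the "value" column and the sample rows are assumptions for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (number INTEGER, value INTEGER);
    CREATE TABLE B (number INTEGER, value INTEGER);
    INSERT INTO A VALUES (2, 3), (1, 1);
    INSERT INTO B VALUES (1, 5), (2, 8);
""")

# Without ORDER BY the engine may return the rows in any order.
# With ORDER BY the order becomes part of what the query promises.
rows = conn.execute("""
    SELECT A.number, B.number
    FROM A JOIN B ON A.number = B.number
    ORDER BY A.number
""").fetchall()

print(rows)  # [(1, 1), (2, 2)]
conn.close()
```

The rows are inserted into A out of order on purpose: the sorted result does not depend on how they were stored.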

    This operation is called a join; a union is UNION. There are several join execution algorithms from which the optimizer chooses. For example:

     nested loops
     hash join
     merge join

    The order of processing depends on the algorithm selected. A description of each algorithm can be found in the documentation for the DBMS you are using.
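To make the difference concrete, here is a simplified Python sketch of the three algorithms that records which pairs of keys each one examines. It is a toy model, not the code of any real DBMS: the function names are made up for illustration, only the join keys are kept, and the merge join variant assumes the keys are unique on both sides.

```python
def nested_loops(a, b):
    """Compare every key of a with every key of b."""
    trace, result = [], []
    for x in a:
        for y in b:
            trace.append((x, y))          # every pair is examined
            if x == y:
                result.append((x, y))
    return trace, result


def hash_join(a, b):
    """Build a hash table on b, then probe it once per key of a."""
    table = {}
    for y in b:
        table.setdefault(y, []).append(y)
    trace, result = [], []
    for x in a:
        for y in table.get(x, []):        # only keys in the same bucket are seen
            trace.append((x, y))
            result.append((x, y))
    return trace, result


def merge_join(a, b):
    """Sort both inputs and advance two cursors. Assumes unique keys."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    trace, result = [], []
    while i < len(a) and j < len(b):
        trace.append((a[i], b[j]))
        if a[i] == b[j]:
            result.append((a[i], b[j]))
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return trace, result


# Join keys (first field) of tables A and B from the question.
a_keys = [1, 2]
b_keys = [1, 2]

nl_trace, nl_result = nested_loops(a_keys, b_keys)
hj_trace, hj_result = hash_join(a_keys, b_keys)
mj_trace, mj_result = merge_join(a_keys, b_keys)

print(nl_trace)  # [(1, 1), (1, 2), (2, 1), (2, 2)]
print(hj_trace)  # [(1, 1), (2, 2)]
print(mj_trace)  # [(1, 1), (2, 2)]
```

In this model only nested loops walks through the pairs in the order guessed in the question; the other two never look at (1, 2) or (2, 1), although all three produce the same result.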