Hey.

Question by mysql. What are the subtleties of using UNION in mysql?

Faced such a thing - I have two selects, the first of which contains ORDER BY . between UNION stands. sorting in the first for some reason does not happen. That is, it outputs without sorting, and no error is returned.

Created a database

 mysql> CREATE DATABASE asd; Query OK, 1 row affected (0.48 sec) mysql> SELECT table_name,table_comment FROM INFORMATION_SCHEMA.tables WHERE table_schema='asd'; Empty set (0.05 sec) mysql> USE asd; Database changed mysql> CREATE TABLE ukraine ( -> name CHAR(30) NOT NULL); Query OK, 0 rows affected (0.84 sec) 

Created ukraine table

 mysql> INSERT INTO ukraine VALUES ('Kiev'),('Lvov'),('Kharkov'),('Poltava'); Query OK, 4 rows affected (0.15 sec) Records: 4 Duplicates: 0 Warnings: 0 mysql> SELECT * FROM ukraine; +---------+ | name | +---------+ | Kiev | | Lvov | | Kharkov | | Poltava | +---------+ 4 rows in set (0.00 sec) 

Created a russia table

 mysql> CREATE TABLE russia ( -> name CHAR(30) NOT NULL); Query OK, 0 rows affected (0.25 sec) mysql> INSERT INTO russia VALUES ('Moskva'),('Belgorod'),('Krasnodar'),('Sochi'); Query OK, 4 rows affected (0.11 sec) Records: 4 Duplicates: 0 Warnings: 0 mysql> SELECT * FROM russia; +-----------+ | name | +-----------+ | Moskva | | Belgorod | | Krasnodar | | Sochi | +-----------+ 4 rows in set (0.00 sec) Объединяю с помощью UNION mysql> (SELECT * FROM ukraine) UNION (SELECT * FROM russia); +-----------+ | name | +-----------+ | Kiev | | Lvov | | Kharkov | | Poltava | | Moskva | | Belgorod | | Krasnodar | | Sochi | +-----------+ 8 rows in set (0.00 sec) 

Combining with UNION with ORDER BY, does not work

 mysql> (SELECT * FROM ukraine ORDER BY name) UNION (SELECT * FROM russia); +-----------+ | name | +-----------+ | Kiev | | Lvov | | Kharkov | | Poltava | | Moskva | | Belgorod | | Krasnodar | | Sochi | +-----------+ 8 rows in set (0.00 sec) 

Combining with UNION with ORDER BY, does not work

 mysql> (SELECT * FROM ukraine) UNION (SELECT * FROM russia ORDER BY name); +-----------+ | name | +-----------+ | Kiev | | Lvov | | Kharkov | | Poltava | | Moskva | | Belgorod | | Krasnodar | | Sochi | +-----------+ 8 rows in set (0.00 sec) 

I unite with the help of UNION with ORDER BY, which I put in the END, worked

 mysql> (SELECT * FROM ukraine) UNION (SELECT * FROM russia) ORDER BY name; +-----------+ | name | +-----------+ | Belgorod | | Kharkov | | Kiev | | Krasnodar | | Lvov | | Moskva | | Poltava | | Sochi | +-----------+ 8 rows in set (0.00 sec) 

The question is why does ORDER BY only work at the end? He, in theory, it should work within each request.

  • No, it should not. Rather, inside the request, it works, but if at the very end of the request there is no own order by - then the order of issue is not defined. Decide what the output should look like and you can come up with the correct order by for the entire request - Mike

5 answers 5

Imagine that each subquery inserts data into some temporary table, and then the general select pulls this data. Then it is obvious that there is no difference in what order were inserted. Only the final sample order is taken into account.

In other words

 SELECT .... UNION SELECT .... UNION SELECT .... UNION SELECT .... 

Transformed into some semblance of such

 INSERT INTO `temp` SELECT .....; INSERT INTO `temp` SELECT .....; INSERT INTO `temp` SELECT .....; INSERT INTO `temp` SELECT .....; SELECT * FROM `temp`; 

Then it is clear that whatever ORDER BY you specify in the intermediate queries the last select will use its sorting order.

  • Can you recommend a good source, where the question about subqueries is chewed? - Dimon
  • I do not really understand the answer. if not difficult, please write in more detail - Dimon
  • @Dimon What is not satisfied with the official documentation? Everything is clearly written there - Anton Shchyrov
  • I do not understand anything there. write in English. if I understood what was happening, I would not have asked questions here - Dimon
  • @Dimon I have replied to the answer - Anton Shchyrov

You are doing exactly the union , no more simple and stupid union all . union without specifying exactly which union performs union distinct . Those. will try to remove duplicate rows from the merged sets.

Usually, this in itself is not the kind of behavior that you expected, and almost always when they talk about union they actually want to see union all . If all the lines are different, this is just an extra operation, if there are matching lines, then you will lose them from the sample and this may not be what you wanted.

And yes, searching for duplicate rows changes the sorting of the rowset because it is done this way.

In fact, union all simply sticks two samples together, one by one, and does not affect the sorting. I do not remember exceptions for the top-level query, but yes, this is not guaranteed by the standard. If order is critical, it makes sense to re-sort after merging:

 (SELECT 1 as sortf, name FROM ukraine) UNION ALL (SELECT 2 as sortf, name FROM russia) ORDER BY sortf, name; 

This query is guaranteed to give first the records from the ukraine table, sorted by name, then from russia, also sorted by name.

  • thanks for the answer. I have no duplicates. If you change the union to union all, then nothing will change. I tried - Dimon
  • one
    No, the SQL standard does not guarantee that as a result of executing a union all at the beginning will be issued by the first part and strictly followed by that issued by the second part. The standard gives a guarantee only on the final order by . - Mike
  • @Mike I wrote about this, which does not guarantee and how to set this behavior at the request level. Should I reformulate the idea? - Shallow

Why does ORDER BY work only at the end? He, in theory, it should work within each request

And why did you decide that does not work? it works ... just between this triggering and getting the result to you, a lot of things are still happening - including ones that affect the final order of the record returns.

======

Much has already been said, the majority is correct, in full compliance with the documentation . Just to clearly formulate the main, I think, a consequence of the above facts:

If there is no LIMIT in a separate query included in a union by UNION, then the use of ORDER BY in it is meaningless, the result of sorting will still be ignored.

Yes, do not throw at me with stubs, there are other cases where ORDER BY in a separate query makes sense. For example, if this query uses variables, for example, to emulate window functions that are not yet implemented. But these are extremely rare cases, and for the authors of such requests, the sorting behavior in UNION is hardly a secret.

    Union works in conjunction with the set. Union cannot work with sorted data. https://msdn.microsoft.com/ru-ru/library/ms180026.aspx This site states that you cannot merge sorted data. SQL is basically sharpened to work in a relational style, that is, with a lot of it. Attempting to work in an iterative style leads to a slowdown in work, so the Cursor runs much slower than if you only manage with relational tools.

    • In fact, the TS at the very beginning clearly indicated the DBMS. And it was never MS SQL. - Akina

    Union works in all DBMS in the same way, since it is compatible with the standard, in mdsn it is shown in examples and is available.

    • Only the external effect of the work of UNION is the same in all DBMSs. And that only in that part in which the result given by it is determined. And in all other respects - and especially in the internal mechanisms of work - each DBMS has its own implementation, with its own features and problems. - Akina
    • Please provide examples of how ms sql differs from mysql in terms of the union implementation - Oleg Babichev
    • They are absolutely the same for me - I don’t know how exactly UNION is implemented in each of the servers. And if you are interested in the difference in the results of their work - try to make UNION ALL two samples from one table that does not have any unique index, with different selection conditions, and look at the sorting of the result on both servers. Then try to make UNION DISTINCT and look again at the sorting result. - Akina