While optimizing one of the subqueries, I arrived at this:

SELECT foo.id FROM foo INNER JOIN bar ON foo.id = bar.foo_id GROUP BY foo.id HAVING MAX(foo.baz) != MAX(bar.baz)

The following is important here:

  • MariaDB is used
  • tables foo and bar are related one-to-many
  • subqueries are not allowed
  • the query returns a single field, foo.id

I don’t like the hack with an aggregate function in HAVING. For the bar.baz field it makes sense (select only those records whose foo.baz is not equal to the maximum bar.baz), but for the foo.baz field the aggregate is used solely to keep the field out of the SELECT list. On top of that, there is a risk that MAX() does not return the expected result.

How can the query be improved?

  • Why can't you use subqueries? Does the restriction apply to temporary tables too? I would also like to know how big the tables are. - T.Zagidullin
  • 1. The restriction is imposed by the business, along with the desire to reduce the load on the database where possible. 2. The tables hold several million records each. - Denis Khvorostin
  • It would help to describe what the query is expected to return and the structure of the two tables. - hardworm
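One concrete way the result of the query in the question may differ from what is expected is NULL handling: MAX() ignores NULLs, and a comparison with NULL is neither true nor false, so such groups are silently dropped by HAVING. The sketch below illustrates this. It is only an illustration: SQLite (through Python's sqlite3 module) stands in for MariaDB so the snippet is self-contained, and the column types and sample rows are made up, since the thread does not give the table structure.

```python
import sqlite3

# SQLite is a stand-in for MariaDB; schema details and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (id INTEGER PRIMARY KEY, baz INTEGER);
    CREATE TABLE bar (id INTEGER PRIMARY KEY, foo_id INTEGER, baz INTEGER);
    INSERT INTO foo (id, baz) VALUES (1, 10), (2, 10), (3, NULL), (4, 10), (5, 10);
    INSERT INTO bar (foo_id, baz) VALUES
        (1, 5), (1, 10),   -- max equals foo.baz: not returned
        (2, 5), (2, 7),    -- max differs from foo.baz: returned
        (3, 5),            -- foo.baz is NULL: comparison is NULL, not returned
        (4, NULL);         -- all bar.baz are NULL: MAX is NULL, not returned
    -- foo 5 has no rows in bar, so the INNER JOIN drops it
""")

# The query from the question, unchanged apart from layout.
QUERY = """
    SELECT foo.id
    FROM foo
    INNER JOIN bar ON foo.id = bar.foo_id
    GROUP BY foo.id
    HAVING MAX(foo.baz) != MAX(bar.baz)
"""

ids = [row[0] for row in conn.execute(QUERY)]
print(ids)
```

If rows like foo 3, 4 or 5 are supposed to be reported, the condition would need explicit NULL handling or an outer join; whether that is the case depends on requirements the thread does not state.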

1 answer

I would have thought that what matters most to the business is that the result is correct and fast. You can try this variant with a temporary table; it should be faster.

 select bar.foo_id, MAX(bar.baz) baz into #bar from bar group by bar.foo_id;
 create index ix on #bar (foo_id, baz);
 select foo.id from foo join #bar bar on foo.id = bar.foo_id AND foo.baz <> bar.baz;
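Note that `select ... into #bar` is SQL Server syntax. A rough sketch of the same idea with `CREATE TEMPORARY TABLE ... AS SELECT`, which MariaDB also accepts, could look like the following. SQLite (via Python's sqlite3) is used here only so the example runs on its own; the name `tmp_bar`, the index name, the column types and the sample rows are assumptions made for illustration.

```python
import sqlite3

# SQLite is a stand-in for MariaDB; tmp_bar and the sample data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (id INTEGER PRIMARY KEY, baz INTEGER);
    CREATE TABLE bar (id INTEGER PRIMARY KEY, foo_id INTEGER, baz INTEGER);
    INSERT INTO foo (id, baz) VALUES (1, 10), (2, 10), (3, 10);
    INSERT INTO bar (foo_id, baz) VALUES (1, 5), (1, 10), (2, 5), (2, 7);

    -- Step 1: materialise the per-foo maximum of bar.baz once.
    CREATE TEMPORARY TABLE tmp_bar AS
        SELECT foo_id, MAX(baz) AS baz FROM bar GROUP BY foo_id;
    -- Step 2: index the temporary table for the join.
    CREATE INDEX ix_tmp_bar ON tmp_bar (foo_id, baz);
""")

# Step 3: plain join, no aggregate and no subquery in the final statement.
rows = conn.execute("""
    SELECT foo.id
    FROM foo
    JOIN tmp_bar ON foo.id = tmp_bar.foo_id AND foo.baz <> tmp_bar.baz
    ORDER BY foo.id
""").fetchall()

temp_ids = [row[0] for row in rows]
print(temp_ids)
```

This is still three statements rather than one, so it only helps if the temporary table can be filled before the larger query runs.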
  • That won't work. The query itself is simple and is used as an operand in a larger query, so it has to be a) a simple query and b) a single query. - Denis Khvorostin
  • Is the number of unique ids the same in both tables? Could one table contain ids that are missing from the other? - T.Zagidullin
  • I repeat: it is a one-to-many relationship. There may be a record in foo with no matching record in bar, but there cannot be records in bar without a matching record in foo. - Denis Khvorostin
  • I racked my brain and did not find a solution that meets all the conditions described. The version with an inner query is very neat; consider using it, even given that it will be used inside another large query. I remember a query of 2 thousand lines with subqueries nested 15 levels deep, and everything ran fairly quickly. By the way, nothing prevents you from inserting the data into a temporary table before executing the “large” query. - T.Zagidullin
  • Actually, the question is whether the query can be improved any further. Of course, there are options with subqueries; the point is that, given the existing load on the database, using them is not recommended. - Denis Khvorostin
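One single-statement variant that avoids wrapping foo.baz in an aggregate is to add foo.baz to GROUP BY and compare it directly in HAVING. Assuming foo.id is the primary key of foo (the thread implies this but does not say so), each foo.id has exactly one foo.baz, so the groups stay the same. The sketch below compares both forms on made-up rows; SQLite (via Python's sqlite3) stands in for MariaDB, and the column types are assumptions.

```python
import sqlite3

# SQLite is a stand-in for MariaDB; schema details and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (id INTEGER PRIMARY KEY, baz INTEGER);
    CREATE TABLE bar (id INTEGER PRIMARY KEY, foo_id INTEGER, baz INTEGER);
    INSERT INTO foo (id, baz) VALUES (1, 10), (2, 10), (3, NULL), (4, 3);
    INSERT INTO bar (foo_id, baz) VALUES
        (1, 5), (1, 10), (2, 5), (2, 7), (3, 5), (4, 9), (4, 1);
""")

# Form from the question: aggregate on both sides of the comparison.
ORIGINAL = """
    SELECT foo.id
    FROM foo
    INNER JOIN bar ON foo.id = bar.foo_id
    GROUP BY foo.id
    HAVING MAX(foo.baz) != MAX(bar.baz)
"""

# Variant: foo.baz is a grouping column, so it needs no aggregate.
VARIANT = """
    SELECT foo.id
    FROM foo
    INNER JOIN bar ON foo.id = bar.foo_id
    GROUP BY foo.id, foo.baz
    HAVING foo.baz <> MAX(bar.baz)
"""

original_ids = sorted(row[0] for row in conn.execute(ORIGINAL))
variant_ids = sorted(row[0] for row in conn.execute(VARIANT))
print(original_ids, variant_ids)
```

Both forms return the same ids on this sample. Whether the variant is any faster on tables with several million rows is not something this sketch shows; that would have to be checked with EXPLAIN on the real MariaDB instance.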