I have the following query:

    SELECT COUNT(`count`) AS 'visits', `code`
    FROM `om_log`
    WHERE `code` <> '0'
      AND `date` >= '1464728400'
      AND `date` <= '1467320399'
    GROUP BY `code`
    ORDER BY `code`;

The table has a lot of records; it stores site-visit statistics, and the query usually covers a one-month range. Any ideas how to optimize this query without resorting to refactoring or to intermediate tables with counters?

EXPLAIN:

 id | select_type | type  | possible_keys | key  | key_len | ref  | rows | Extra
 1  | SIMPLE      | range | code,date     | date | 5       | NULL | 1420 | Using where; Using temporary; Using filesort
  • And is there an index on the date column? When it comes to optimization, the first thing to do is to get the query execution plan (via EXPLAIN) and post it here - Mike
  • There are indexes; only the count field is not indexed, and when I tried to add an index to it, my local MySQL crashed. There are now 9,049,853 records; I'll add the EXPLAIN output in a moment. - Makarenko_I_V
  • Why do you need the count field at all? COUNT(1) will give the same result, as long as the count field is nowhere NULL - Mike
  • @Mike, and where did you get the idea that count will always contain ones? - Makarenko_I_V
  • But you are using the COUNT() function, which counts the number of non-NULL values and doesn't care about the exact value in the count column. If the different values in this column mattered, you would probably be using SUM() instead (see the sketch just below) - Mike
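A minimal sketch of the rewrite the comments suggest, assuming the `count` column is indeed never NULL (COUNT(*) counts rows, so the column no longer has to be read at all):

    -- Same result as COUNT(`count`) when `count` is never NULL:
    SELECT COUNT(*) AS 'visits', `code`
    FROM `om_log`
    WHERE `code` <> '0'
      AND `date` >= '1464728400'
      AND `date` <= '1467320399'
    GROUP BY `code`
    ORDER BY `code`;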

2 answers

I can let MySQL do this optimization for me using its partitioning mechanism: the table file is physically split into several files according to a chosen attribute, so queries start to run many times faster. The main thing is not to overdo the number of partitions.

    ALTER TABLE `om_log` PARTITION BY RANGE (`date`) PARTITIONS 6 (
        PARTITION less2015 VALUES LESS THAN (UNIX_TIMESTAMP('2015-01-01')),
        PARTITION less2016 VALUES LESS THAN (UNIX_TIMESTAMP('2016-01-01')),
        PARTITION less2017 VALUES LESS THAN (UNIX_TIMESTAMP('2017-01-01')),
        PARTITION less2018 VALUES LESS THAN (UNIX_TIMESTAMP('2018-01-01')),
        PARTITION less2019 VALUES LESS THAN (UNIX_TIMESTAMP('2019-01-01')),
        PARTITION other VALUES LESS THAN (MAXVALUE)
    );

For now I have created the partitions with a margin, but ideally a cron job should add new partitions when needed, for example at the end of each year, and drop the partition from, say, 5 years ago.
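A hedged sketch of what such a cron job could execute, using the partition names from the ALTER above (the 2020 boundary is just an example). New ranges cannot be appended after a MAXVALUE partition with ADD PARTITION, so the catch-all partition is reorganized instead:

    -- Split the catch-all partition to make room for the next year:
    ALTER TABLE `om_log` REORGANIZE PARTITION other INTO (
        PARTITION less2020 VALUES LESS THAN (UNIX_TIMESTAMP('2020-01-01')),
        PARTITION other VALUES LESS THAN (MAXVALUE)
    );

    -- Drop the oldest partition: all its rows disappear almost
    -- instantly, without a slow DELETE over millions of records.
    ALTER TABLE `om_log` DROP PARTITION less2015;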

You cannot partition an existing table that is actively in use: the ALTER takes a table lock, and the site goes down. You have to create a new, already-partitioned table, switch the site over to it, and then move all the old records across in small portions.
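A sketch of that migration under stated assumptions: the table has no unique key that omits `date` (MySQL requires every unique key to include the partitioning column), and the batch boundaries below are arbitrary one-month examples:

    -- An empty copy can be partitioned quickly, with nothing to lock.
    CREATE TABLE `om_log_new` LIKE `om_log`;
    ALTER TABLE `om_log_new` PARTITION BY RANGE (`date`) (
        PARTITION less2016 VALUES LESS THAN (UNIX_TIMESTAMP('2016-01-01')),
        PARTITION less2017 VALUES LESS THAN (UNIX_TIMESTAMP('2017-01-01')),
        PARTITION other VALUES LESS THAN (MAXVALUE)
    );

    -- After switching the site's writes to om_log_new, backfill the
    -- history in small portions, e.g. one month per run:
    INSERT INTO `om_log_new`
    SELECT * FROM `om_log`
    WHERE `date` >= UNIX_TIMESTAMP('2016-05-01')
      AND `date` <  UNIX_TIMESTAMP('2016-06-01');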

I have since also solved the problem by adding a composite index! The query now runs in 2-5 seconds, a more or less acceptable time compared with the former 90 seconds.
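The answer doesn't say which columns went into that index; a plausible guess for this particular query is a covering index (the name and column order here are assumptions, not the author's actual DDL):

    -- Equality/grouping column first, range column second, and `count`
    -- included so the query can be answered from the index alone.
    ALTER TABLE `om_log` ADD INDEX `idx_code_date_count` (`code`, `date`, `count`);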

    As I understand it, the keys are fine. There are many options:

    1. Building a cluster - my picks are Percona XtraDB Cluster and MariaDB Galera Cluster. Percona is more stable, Maria is smarter, and perhaps the stability troubles have been fixed by now.
    2. Sharding - splitting one database into several, running on different servers. Sharding has to be managed from application code, which is not a simple thing.
    3. Partitioning - also possible, but it too comes with a limit (on the number of partitions per table).
    4. Dropping SQL in favor of NoSQL solutions: much simpler scaling, and faster on large numbers of records.

    In general, database optimization and scaling is a whole big, interesting area of IT.

    • 1, 2, 4 - these look to me like shooting sparrows with a cannon. 1 - build whole clusters for the sake of one table? Is 10 million records, once the table is cleaned up, really a problem for MySQL? 2 - you said yourself it's not a simple thing, and again, for 10 million records... 4 - rewrite the entire database and all the queries? That's a lot of time; would it even pay off? Make the case if you have concrete opinions on applying them. - Makarenko_I_V
    • "Rewrite the entire database and all the queries?" - Well, how to put it: if you use an ORM, for example Doctrine, the database adapter is swapped in one line. I listed solutions for serious projects - ones with big, heavy databases and many users, where as soon as one load problem is solved the next one pops up. These solutions give you scalability, that is, a straightforward gain in throughput at linear cost. "Is 10 million records a problem for MySQL?" A big one, especially if you want complex selections (filters) or text search. - Goncharov Alexander
    • Unfortunately, both the database and the project itself are in a state of neglect (20 years without a complete refactoring, only minor ones). So for now we have to plug holes; when we're given time for a refactoring (maybe only once everything breaks outright), we'll take this into account and think the architecture through. - Makarenko_I_V