For example:
First, for each user, select the earliest year in which that user received a new message; then group the result by year and count the users:
select Year, count(user) as Count from ( select min(extract(YEAR from Date)) as Year, user as User from table where Type like 'Новое%' group by user ) as FirstMessageYears group by Year
Uniqueness is guaranteed because each user has exactly one minimum date, so distinct is not needed.
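To check this on concrete data, here is a minimal sketch that runs the same query in an in-memory SQLite database from Python. The table name messages, the sample rows, and the use of strftime('%Y', ...) in place of extract(YEAR from ...) are assumptions made for the illustration (SQLite has no extract); the structure of the query is unchanged.

```python
import sqlite3

# Hypothetical sample data: table name and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("create table messages (user text, Date text, Type text)")
conn.executemany(
    "insert into messages values (?, ?, ?)",
    [
        ("alice", "2019-03-01", "Новое сообщение"),
        ("alice", "2020-05-01", "Новое сообщение"),
        ("bob", "2019-07-10", "Новое сообщение"),
        ("bob", "2020-02-02", "Ответ"),
        ("carol", "2021-01-15", "Новое сообщение"),
    ],
)

# Inner query: one row per user with the year of that user's first new message.
# Outer query: number of such users per year.
first_year_counts = conn.execute(
    """
    select Year, count(User) as Count
    from (
        select min(cast(strftime('%Y', Date) as integer)) as Year, user as User
        from messages
        where Type like 'Новое%'
        group by user
    ) as FirstMessageYears
    group by Year
    order by Year
    """
).fetchall()

print(first_year_counts)  # [(2019, 2), (2021, 1)]
```

Note that 2020 is missing from the output: it has messages, but nobody's first message falls in that year.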
To also include years that have messages but no first messages, you can use the fact that count(null) returns 0.
The inner query becomes more complex: it now also returns the years in which no user had a first message, with null in the user column:
select AllYears.Year, count(FirstMessageYears.User) as Count from ( select min(extract(YEAR from Date)) as Year, user as User from table where Type like 'Новое%' group by user ) as FirstMessageYears right join ( select distinct extract(YEAR from Date) as Year from table ) as AllYears on FirstMessageYears.Year = AllYears.Year group by AllYears.Year
Let's work through it. We "execute" the SQL query from the inside out, i.e. starting with the parts at the deepest level of nesting. In our case, the nested part is a compound subquery:
( select min(extract(YEAR from Date)) as Year, user as User from table where Type like 'Новое%' group by user ) as FirstMessageYears right join ( select distinct extract(YEAR from Date) as Year from table ) as AllYears on FirstMessageYears.Year = AllYears.Year
To understand what this subquery returns, let's break it down into parts:
( select min(extract(YEAR from Date)) as Year, user as User from table where Type like 'Новое%' group by user ) as FirstMessageYears
right join
( select distinct extract(YEAR from Date) as Year from table ) as AllYears
That is already a little easier. We have two queries, each returning a table, and these tables are then combined by an operation with the obscure name right join.
First, let's look at what each part returns before the join.
The first subquery, with the eloquent name FirstMessageYears, returns two columns: the user and the year of that user's first message.
The second subquery, AllYears, returns a single column listing every year that occurs in the source table.
What happens when you apply right join to these two sets? You get a table with two columns: the year and the user. Logically, this set is filled in two stages:
First, the rows from AllYears that have no match in FirstMessageYears go into the join result. The user field for these rows stays null.
Then, the rows from FirstMessageYears that have a match in AllYears go into the join result. The user field for these rows is taken from FirstMessageYears.
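The two stages above can be observed directly by looking at the joined rows before any grouping. The sketch below is an illustration under assumptions: the table name messages and the sample rows are hypothetical, strftime stands in for extract, and the join is written as AllYears left join FirstMessageYears, which is the same join as FirstMessageYears right join AllYears with the operands swapped (older SQLite versions do not support right join).

```python
import sqlite3

# Hypothetical sample data: table name and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("create table messages (user text, Date text, Type text)")
conn.executemany(
    "insert into messages values (?, ?, ?)",
    [
        ("alice", "2019-03-01", "Новое сообщение"),
        ("alice", "2020-05-01", "Новое сообщение"),
        ("bob", "2019-07-10", "Новое сообщение"),
        ("bob", "2020-02-02", "Ответ"),
        ("carol", "2021-01-15", "Новое сообщение"),
    ],
)

# AllYears left join FirstMessageYears keeps every row of AllYears,
# exactly like FirstMessageYears right join AllYears.
joined_rows = conn.execute(
    """
    select AllYears.Year, FirstMessageYears.User
    from (
        select distinct cast(strftime('%Y', Date) as integer) as Year
        from messages
    ) as AllYears
    left join (
        select min(cast(strftime('%Y', Date) as integer)) as Year, user as User
        from messages
        where Type like 'Новое%'
        group by user
    ) as FirstMessageYears
    on FirstMessageYears.Year = AllYears.Year
    order by AllYears.Year, FirstMessageYears.User
    """
).fetchall()

for row in joined_rows:
    print(row)
# (2019, 'alice')
# (2019, 'bob')
# (2020, None)
# (2021, 'carol')
```

The row (2020, None) is the unmatched year from AllYears: its user field is null, which Python shows as None.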
All that remains is to group the result of this witchcraft:
select AllYears.Year, count(FirstMessageYears.User) as Count from [result of the subquery] group by AllYears.Year
Recall that the FirstMessageYears.User column turns out to be null in the rows for years in which no user had a first message. count(null) returns 0. Voilà.
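As a final check, here is a sketch that shows both the count behaviour and the complete result on sample data. As before, the table name messages and its rows are hypothetical, strftime replaces extract for SQLite, and the right join is written as the equivalent left join with the operands swapped.

```python
import sqlite3

# Hypothetical sample data: table name and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("create table messages (user text, Date text, Type text)")
conn.executemany(
    "insert into messages values (?, ?, ?)",
    [
        ("alice", "2019-03-01", "Новое сообщение"),
        ("alice", "2020-05-01", "Новое сообщение"),
        ("bob", "2019-07-10", "Новое сообщение"),
        ("bob", "2020-02-02", "Ответ"),
        ("carol", "2021-01-15", "Новое сообщение"),
    ],
)

# count(column) skips nulls, while count(*) counts rows.
null_counts = conn.execute(
    """
    select count(User), count(*)
    from (select null as User union all select null)
    """
).fetchone()
print(null_counts)  # (0, 2)

# The full query: every year present in the table, with the number of
# users whose first new message falls in that year (0 if there are none).
year_counts = conn.execute(
    """
    select AllYears.Year, count(FirstMessageYears.User) as Count
    from (
        select distinct cast(strftime('%Y', Date) as integer) as Year
        from messages
    ) as AllYears
    left join (
        select min(cast(strftime('%Y', Date) as integer)) as Year, user as User
        from messages
        where Type like 'Новое%'
        group by user
    ) as FirstMessageYears
    on FirstMessageYears.Year = AllYears.Year
    group by AllYears.Year
    order by AllYears.Year
    """
).fetchall()
print(year_counts)  # [(2019, 2), (2020, 0), (2021, 1)]
```

Using count(*) instead of count(FirstMessageYears.User) in the outer query would report 1 for 2020, because the null row itself would be counted.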