Hello. Could you please explain how to write a SQL query that finds all sessions during which the user performed the following actions, not necessarily consecutively, but in the specified order:
- the user went to rooms.homework-showcase;
- the user went to rooms.view.step.content;
- the user went to rooms.lesson.rev.step.content.
A session is a period of user activity in which less than one hour passes between successive actions. The session begins at the time of the first of these actions and ends one hour after the last of them.
The result should be an export of sessions in the form: user_id, <start date of the session>, <end date of the session>.
The data has the following form:
user_id - happened_at - page,
where happened_at is the date and time of the action, and page is the type of the action (only rooms.homework-showcase and the other two pages listed above are needed).
My current query is:

    SELECT *,
           (CASE page
                WHEN 'rooms.homework-showcase' THEN 1
                WHEN 'rooms.view.step.content' THEN 2
                WHEN 'rooms.lesson.rev.step.content' THEN 3
                ELSE 4
            END) AS activity_number,
           extract('epoch' from happened_at)
             - extract('epoch' from lag(happened_at) OVER (PARTITION BY user_id ORDER BY happened_at)) AS time_lag
    FROM test.vimbox_pages
    WHERE page IN ('rooms.homework-showcase', 'rooms.view.step.content', 'rooms.lesson.rev.step.content')
    ORDER BY happened_at

In it, I select only the relevant actions and use the lag window function to find the time difference (in seconds) between each action and the previous action of the same user.
For now, I use Python to further process the data returned by this query.
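Roughly, that Python post-processing looks like the sketch below. It is a simplified illustration rather than my exact code: the function names (find_sessions, has_ordered_pages) and the row layout, plain (user_id, happened_at, page) tuples with happened_at as a datetime, are assumptions made for the example. It also assumes the rows come from the query above, so only the three pages of interest are present.

```python
from datetime import datetime, timedelta
from itertools import groupby

# Pages that must occur in this order (not necessarily consecutively).
ORDER = [
    "rooms.homework-showcase",
    "rooms.view.step.content",
    "rooms.lesson.rev.step.content",
]
GAP = timedelta(hours=1)  # a gap of one hour or more starts a new session


def has_ordered_pages(pages):
    """Return True if ORDER is a subsequence of pages."""
    it = iter(pages)
    # Each any() consumes the shared iterator up to the next wanted page.
    return all(any(p == wanted for p in it) for wanted in ORDER)


def _emit(user_id, session, result):
    """Append the session to result if it contains the pages in order."""
    if session and has_ordered_pages(p for _, p in session):
        # The session ends one hour after its last action.
        result.append((user_id, session[0][0], session[-1][0] + GAP))


def find_sessions(rows):
    """rows: iterable of (user_id, happened_at, page) tuples (assumed layout).

    Returns a list of (user_id, session_start, session_end) tuples.
    """
    result = []
    rows = sorted(rows, key=lambda r: (r[0], r[1]))
    for user_id, events in groupby(rows, key=lambda r: r[0]):
        session = []
        for _, ts, page in events:
            if session and ts - session[-1][0] >= GAP:
                _emit(user_id, session, result)
                session = []
            session.append((ts, page))
        _emit(user_id, session, result)
    return result
```

What I would like is to get the same result (splitting by one-hour gaps, then checking the order of the pages inside each session) directly in SQL.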
I searched a lot on the Internet, but I only found solutions for the case where each record corresponds to one day (or hour), which is then solved with window functions or self-joins. That approach will not work here, because there can be records at the 1st and the 119th minute: the difference in hours does not exceed 1, but this is clearly not acceptable, because the difference in minutes is more than 60.
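To make the problem with comparing hours concrete, here is a tiny Python sketch. The two timestamps are made up purely for illustration: one falls in the 1st minute and the other in the 119th minute, counted from the same starting hour.

```python
from datetime import datetime

# Hypothetical timestamps, chosen only to show the effect.
a = datetime(2020, 1, 1, 10, 1)    # 1st minute
b = datetime(2020, 1, 1, 11, 59)   # 119th minute, counted from 10:00

hour_diff = b.hour - a.hour                   # compares hour buckets only
minute_diff = (b - a).total_seconds() / 60    # real elapsed time in minutes

print(hour_diff)     # 1
print(minute_diff)   # 118.0
```

By hour buckets the two actions look adjacent, but the real gap is 118 minutes, so they must not end up in the same session.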
If it is not too much trouble, please post a snippet right away or tell me where to read about this.
P.S. I use Redshift SQL, which is almost the same as PostgreSQL.