Suppose there is a table with some processes that can receive various statuses during execution:
process process_status ββββββ¦βββββββββ ββββββ¦βββββββββββββ¦βββββββββββββ¦ββββββββββ β id β name β β id β process_id β date β status β β βββββ¬βββββββββ£ β βββββ¬βββββββββββββ¬βββββββββββββ¬ββββββββββ£ β 1 β One β β 1 β 1 β 2018-10-10 β Start β β 2 β Two β β 2 β 1 β 2018-10-15 β Running β β 3 β Three β β 3 β 1 β 2018-10-30 β Running β ββββββ©βββββββββ β 4 β 1 β 2018-11-01 β Running β β 5 β 1 β 2018-11-02 β Error β β 6 β 1 β 2018-11-03 β Error β ββββββ©βββββββββββββ©βββββββββββββ©ββββββββββ
As you can see, statuses can be different, as well as each status can be obtained several times.
I am only interested in the Running
and Error
statuses. Now I get the entries like this:
select p.id, p.name from process p join process_status ps on p.id = ps.process_id where ps.status = 'Running' or ps.status = 'Error';
As a result, 5 rows are returned. Three with the status of Running
and two with the status of Error
.
Is it possible to somehow return only 2 rows? Is the first received Running
(If there is one at all) and / or the very first Error
received (If there is one too)?
Ideally, return all processes for which the very first Running
received no later than a certain date, for example, 2018-10-16
OR the very first Error
received no later than some other date, for example, 2018-10-17
?
Is it possible?
As a result, ideally, I need only processes that satisfy this condition:
ββββββββ¦βββββββββ β p.id β p.name β β βββββββ¬βββββββββ£ β 1 β One β ββββββββ©βββββββββ
UPD
I will explain my task a little:
Now all the processes and all their Running
and Error
statuses are recalled from the database (approximately by the query I gave), and already from the code side, those processes that received the first Running
or Error
status no later than a certain date are searched.
The fact. that about 8-10 thousand (sometimes more) processes are launched daily, their number is already in the tens of millions. Each process also receives about 10 thousand statuses.
Right now there are no performance issues, but I am performing a task that affects this code. In my opinion, too much unnecessary data travels from the database to the application. Therefore, since I still refactor this section of code, I want to make it more optimal, namely, not to subtract millions of processes and select several dozens of necessary ones in the code, and return only those 10 necessary ones at once, and work only with them.