There is a table of site URLs called urls.
The URLs are processed as follows: at a certain interval, a task with a desired execution time ( desired_time ) is created for each URL, and the tasks are executed using a ScheduledThreadPool . One thread (let's call it A ) checks the active URLs (those with url.active == 1 in the database) in an infinite loop and, when necessary, either creates a new task to be processed by another thread (let's call such threads B ), or closes an open task because no B thread managed to pick it up before its desired_time expired.
Assigned tasks wait in a BlockingQueue until one of the worker threads B picks them up.
The following situation is possible: B finishes processing a task just as the task's desired_time is about to expire. While B prepares to hand the result over for saving to the database (or does some post-processing), the desired_time finally expires. The task has in fact been completed and should not be considered overdue. However, at this moment thread A can check the task in the database and see that its desired_time has expired. Since the result has not yet been written to the database, the status of the completed task has not yet been changed to DONE, so A can close the task with the status ADDLED . After that, B finishes post-processing and saves the result. The database will then show a completed task with the correct result, but with the status ADDLED .
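The race between A and B described above can be modeled in memory as a compare-and-set on the task status. The sketch below is only an illustration under assumptions: the class TaskState, the intermediate status FINISHING and the method names are hypothetical, and a real system would need the equivalent transition to happen in the database rather than in one JVM.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical statuses; FINISHING is an assumed intermediate state, not part of the original schema.
enum Status { OPEN, FINISHING, DONE, ADDLED }

// Hypothetical in-memory stand-in for one task row.
final class TaskState {
    private final AtomicReference<Status> status = new AtomicReference<>(Status.OPEN);

    // Called by B as soon as it has a result, before post-processing starts.
    boolean claimForFinishing() {
        return status.compareAndSet(Status.OPEN, Status.FINISHING);
    }

    // Called by B once the result has been saved.
    boolean markDone() {
        return status.compareAndSet(Status.FINISHING, Status.DONE);
    }

    // Called by A when desired_time has passed; succeeds only if the task is still OPEN.
    boolean expire() {
        return status.compareAndSet(Status.OPEN, Status.ADDLED);
    }

    Status current() {
        return status.get();
    }
}
```

With this shape, whichever thread changes the status away from OPEN first wins: if B has already called claimForFinishing(), a later expire() from A returns false and the task cannot end up as ADDLED.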
There is also a thread C , which acts as a kind of callback for B . If an error occurs while B is processing a task and the processing cannot continue, B passes the task to C so that it is closed with the status ERROR . So in total we have three unsynchronized threads.
It is clear that a task must be locked against other threads when:
- B has obtained the result and started post-processing, until the result is saved in the database
- C is saving the task with the ERROR status
There should be exactly as many monitors as there are URLs being processed, i.e. one monitor for all tasks of one URL.
How can such locks be implemented?
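One possible shape for "one monitor per URL", as a minimal sketch: a map from URL id to a lock, created lazily. The class name UrlLocks, the use of a long URL id as the key and the helper withLock are assumptions made for illustration. The sketch only coordinates threads inside a single JVM and never removes locks from the map.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical registry holding exactly one lock per URL id.
final class UrlLocks {
    private final ConcurrentMap<Long, ReentrantLock> locks = new ConcurrentHashMap<>();

    // Returns the lock for this URL, creating it atomically on first use.
    ReentrantLock lockFor(long urlId) {
        return locks.computeIfAbsent(urlId, id -> new ReentrantLock());
    }

    // Runs the action while holding the lock of the given URL.
    <T> T withLock(long urlId, Supplier<T> action) {
        ReentrantLock lock = lockFor(urlId);
        lock.lock();
        try {
            return action.get();
        } finally {
            lock.unlock();
        }
    }
}
```

Threads A, B and C would each wrap their status check and status change for a task in withLock(urlId, ...), so that tasks of the same URL are serialized while tasks of different URLs do not block each other.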
Run UPDATE task SET task.locked=1 WHERE task.id=:id AND task.locked=0 and then check how many rows the query updated. If it is 0, someone has already taken the task and we give up; if it is 1, we succeeded and no one will touch the task while we are processing it. You just need to make sure that no "hung" tasks are left behind. - Sanya_Zol
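The conditional UPDATE from the comment above could look roughly like this in plain JDBC. This is a sketch, not a tested integration: the class TaskClaimer and the method tryClaim are hypothetical names, the column names are taken from the comment (written unqualified here), and how the Connection is obtained is left open.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper wrapping the conditional UPDATE from the comment.
final class TaskClaimer {
    // Only a row that is still unlocked matches, so at most one caller can flip the flag.
    private static final String CLAIM_SQL =
            "UPDATE task SET locked = 1 WHERE id = ? AND locked = 0";

    // Returns true if this caller locked the task, false if someone else already holds it.
    static boolean tryClaim(Connection connection, long taskId) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(CLAIM_SQL)) {
            statement.setLong(1, taskId);
            return statement.executeUpdate() == 1;
        }
    }
}
```

A caller that gets true is responsible for resetting locked to 0 (or writing a final status) when it finishes, including on failure; otherwise the task stays "hung", which is the caveat the comment mentions.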