I have the following class:

    class model:
        IS_EMPTY = 0
        RUNNING = 1
        SUCCESS = 2
        Train = IS_EMPTY

        def doing(self, SOME_TODO):
            # If there is nothing to do
            if not SOME_TODO:
                self.Train = self.IS_EMPTY
                return
            self.Train = self.RUNNING
            # SOME_TODO is some object that requires lengthy work
            ...
            # When the work has finished
            self.Train = self.SUCCESS

        def is_doing(self):
            return self.Train

The doing function is understood to perform some lengthy operation, and its progress can be judged by the state of the Train attribute. I want to implement doing so that it runs asynchronously (I'm not strong on terminology), and a user checking the status receives the current Train state. For obvious reasons, with the following code the user has to wait until doing finishes its work:

    mod = model()

    def work():
        if mod.is_doing() == mod.RUNNING:
            return mod.RUNNING
        if mod.is_doing() == mod.SUCCESS:
            return mod.SUCCESS
        if mod.is_doing() == mod.IS_EMPTY:
            # And here doing is called, which stalls everything
            # until it completes
            mod.doing(OBJECT_TODO)

Can asyncio help in solving such problems? Or are there more practical methods?

  • Long-running operations differ a lot. CPU-bound tasks are one thing; IO (for example, many database operations) is quite another. - m9_psy
  • That's exactly why I'm asking: I don't have enough knowledge. Computational operations. - GlassedMichail

1 answer

If the function you do not want to wait for is doing computational work, then asyncio won't help here. Asynchrony can be achieved in several ways: with multiple threads, processes, coroutines, etc. A coroutine can save its state and yield execution to another coroutine (or to the "main" code). A simple example (Python 3.5+):

    import asyncio

    GLOBAL_COUNTER = 10
    GLOBAL_LOOP = asyncio.get_event_loop()

    async def coroutine_1():
        global GLOBAL_COUNTER
        print("CORO 1")
        await asyncio.sleep(3)
        GLOBAL_COUNTER -= 1

    async def coroutine_2():
        global GLOBAL_COUNTER
        print("CORO 2")
        await asyncio.sleep(6)
        GLOBAL_COUNTER -= 1

    while GLOBAL_COUNTER > 0:
        GLOBAL_LOOP.run_until_complete(
            asyncio.gather(coroutine_1(), coroutine_2()))
        print("GLOBAL!")

coroutine_2 does not wait for the first one to finish: when await asyncio.sleep is reached in the first coroutine, the second can do its work without waiting, and likewise a third would not wait for the first two, and so on. The nuance is that asyncio.sleep imitates waiting, not work: waiting for a response from a database, from another server, from a device, from a disk. While such waiting goes on, the CPU would otherwise sit idle waiting for a response or an interrupt, and async/await prevents that by scheduling other tasks onto it in the meantime. Now replace the first coroutine:

    import time

    async def coroutine_1():
        print("CORO 1")
        time.sleep(3)

Here time.sleep() imitates a computational task that occupies the CPU completely. Now nobody is waiting on anybody: the CPU is busy and cannot switch to something else, because there is more than enough work already. Note that all of this happens in a single thread (on one CPU core); coroutines do not spawn new threads or processes. An analogy: imagine you are waiting for a bus. Instead of staring at the sky, you can read a book. But you can hardly read while you are writing code (although some individuals manage).
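One common way to keep the event loop responsive despite a blocking, CPU-bound call is to hand that call off to an executor. A minimal sketch, assuming Python 3.7+ (the names long_computation and main are made up for illustration):

```python
import asyncio
import concurrent.futures
import time


def long_computation(n):
    # Stands in for CPU-bound work; sleeping merely simulates the load
    time.sleep(n)
    return n * n


async def main():
    loop = asyncio.get_running_loop()
    # The blocking function runs in a separate process, so the event
    # loop stays free to run other coroutines in the meantime
    with concurrent.futures.ProcessPoolExecutor() as pool:
        result = await loop.run_in_executor(pool, long_computation, 2)
    print(result)


if __name__ == "__main__":
    asyncio.run(main())
```

run_in_executor returns an awaitable, so the computation can be awaited alongside other coroutines instead of stalling the loop the way a bare time.sleep(3) would.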

To keep the main program from waiting for some long function, you can use processes (not threads). In the CPython implementation, Python code effectively executes on a single CPU core at a time, with some exceptions; see GIL. So even though the OS sees several threads, only one of them runs Python bytecode at any given moment.

However, some libraries (or self-written extensions) can release the GIL and thereby use all available CPU cores. The best known is numpy. Try to use numpy specifically for number crunching: the speed-up is huge, especially if your calculations can be vectorized.
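As a small illustration of what vectorizing means in practice (numpy is a third-party package and must be installed; the function names here are made up):

```python
import numpy as np


def sum_of_squares_loop(values):
    # Pure-Python loop: every iteration goes through the interpreter
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_numpy(values):
    # The same computation vectorized: a single numpy call, so the loop
    # runs in optimized C code, which can also release the GIL
    arr = np.asarray(values, dtype=np.float64)
    return float(np.dot(arr, arr))


data = list(range(1000))
assert sum_of_squares_loop(data) == sum_of_squares_numpy(data)
```

The two functions compute the same value; the difference is where the loop runs, and on large arrays the vectorized version is typically orders of magnitude faster.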

Returning to processes: you can organize parallel computation with the multiprocessing module; its processes are completely independent of each other.
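Applied to the question, a minimal sketch: run doing in a child process and share the status through multiprocessing.Value, so the parent can poll it without blocking. The state codes mirror the asker's IS_EMPTY/RUNNING/SUCCESS; the rest of the names are illustrative:

```python
import multiprocessing
import time

IS_EMPTY, RUNNING, SUCCESS = 0, 1, 2


def doing(state, seconds):
    # Runs in a child process; `state` lives in shared memory,
    # so the parent sees every update immediately
    state.value = RUNNING
    time.sleep(seconds)      # stands in for the heavy computation
    state.value = SUCCESS


if __name__ == "__main__":
    state = multiprocessing.Value("i", IS_EMPTY)
    worker = multiprocessing.Process(target=doing, args=(state, 2))
    worker.start()           # returns immediately; the parent is not blocked
    while state.value != SUCCESS:
        print("status:", state.value)   # the caller polls whenever it likes
        time.sleep(0.5)
    worker.join()
    print("status:", state.value)
```

Here is_doing from the question reduces to reading state.value, and doing can take as long as it wants without ever stalling the caller.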

If you have a whole ocean of tasks and one machine is not enough, it is time for Celery: with this library you can spread tasks across a cluster of several computers.

Other ways out of the situation are possible too; it does not end with these libraries, and you can find libraries for narrower problems (for example, CUDA/OpenCL for computing on the GPU, or TensorFlow for machine learning).

  • 1. Never block inside an async function without awaiting (otherwise you stall the whole event loop): for example, instead of time.sleep(3) you should use await asyncio.sleep(3). To show an example of long calculations, you could substitute await loop.run_in_executor(executor, long_computations, *args) (the executor can run code in a thread or process pool, as needed). 2. Instead of asyncio, you could show an example with concurrent.futures: how to start a function without blocking the current thread (.submit()), and later check the execution status or attach a callback for the result. - jfs
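The second suggestion can be sketched as follows (long_computation is a made-up stand-in for real work):

```python
import concurrent.futures
import time


def long_computation(n):
    time.sleep(n)            # simulate heavy work
    return n * n


if __name__ == "__main__":
    # ProcessPoolExecutor gives true parallelism for CPU-bound work;
    # a ThreadPoolExecutor would suffice for IO-bound tasks
    with concurrent.futures.ProcessPoolExecutor() as executor:
        future = executor.submit(long_computation, 2)   # returns at once
        future.add_done_callback(lambda f: print("done:", f.result()))
        print("done yet?", future.done())   # status can be polled any time
        print("result:", future.result())   # this call does block
```

future.done() plays the same role as the asker's is_doing check: the caller submits the work, goes about its business, and asks for the result only when it actually needs it.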