Hello. I have the following code:

import time
from multiprocessing import Process

def actions(host, i):
    # Placeholder work loop: each process just sleeps forever
    while True:
        time.sleep(1)

def main():
    hosts = []
    with open('host.txt') as hosts_file:
        for line in hosts_file:
            hosts.append(line.strip())
    # Start one process per host
    for i, host in enumerate(hosts):
        thread = Process(target=actions, args=(host, i))
        thread.start()

if __name__ == '__main__':
    try:
        main()
    except KeyboardInterrupt:
        pass

The code reads a list of hosts from the host.txt file and hands each host off to its own process (à la multithreading), in which some actions with time-outs are performed endlessly for that host. The problem is that even with the actions function stripped of the real code (as in the example above), some time after start the program freezes, and so does the whole OS (Ubuntu 16.04); it looks like all the RAM gets eaten, and there are 32 GB of it. Please tell me how to optimize this code.

  • limit the number of simultaneously processed hosts, obviously - strangeqargo
  • In any case, they should all be processed indefinitely! - LorDo
  • yes, but if you run a separate process for each host, and you probably have a lot of them (please specify), then you will hang. - strangeqargo
  • I have 3000 hosts, for each host I need 1 backend process - LorDo
  • then you have to add a mountain of memory, or process the hosts in smaller batches. Look at how much memory one Python process eats and multiply by 3k. I'm sure you can switch between batches of hosts, and if you really need 3000 hosts in real time, then multiprocessing in Python will definitely not work for you (see the batching sketch right after this thread). - strangeqargo
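For illustration, a minimal sketch of processing the hosts in smaller batches with a fixed-size multiprocessing.Pool; the pool size of 100 and the do_work function are assumptions, since the real per-host work is not shown in the question:

import time
from multiprocessing import Pool

def do_work(host):
    # Hypothetical per-host action; replace with the real work
    time.sleep(1)
    return host

def main():
    with open('host.txt') as hosts_file:
        hosts = [line.strip() for line in hosts_file]
    # A pool of a fixed size works through all 3000 hosts in batches,
    # so only a limited number of worker processes exist at any moment
    with Pool(processes=100) as pool:
        while True:              # repeat the pass over all hosts endlessly
            pool.map(do_work, hosts)

if __name__ == '__main__':
    try:
        main()
    except KeyboardInterrupt:
        pass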

1 answer

Optimization number zero: clarify the problem and come up with a different approach to the solution. Most likely, you can do without 3000 tasks running in parallel.

The first optimization is to move from processes to threads. As I understand it, the task is I/O-bound; in that case threads will use noticeably less memory than processes.
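A minimal sketch of the question's structure with threads instead of processes (threading.Thread is used as a drop-in replacement for Process here; the per-host action is still just a placeholder sleep):

import time
from threading import Thread

def actions(host, i):
    while True:
        time.sleep(1)   # the real I/O-bound work would go here

def main():
    with open('host.txt') as hosts_file:
        hosts = [line.strip() for line in hosts_file]
    for i, host in enumerate(hosts):
        # Threads share one interpreter, so per-task memory overhead
        # is much smaller than with one process per host
        thread = Thread(target=actions, args=(host, i), daemon=True)
        thread.start()
    # Keep the main thread alive so the daemon threads keep running
    while True:
        time.sleep(60)

if __name__ == '__main__':
    try:
        main()
    except KeyboardInterrupt:
        pass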

The second optimization: even so, 3000 threads is not a small number, and if memory is still tight, you can try making the application asynchronous; look in the direction of asyncio.
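A rough asyncio sketch of the same loop; one coroutine per host is much lighter than a thread or a process (the per-host action is again only a placeholder sleep):

import asyncio

async def actions(host, i):
    while True:
        await asyncio.sleep(1)   # real network I/O would be awaited here

async def main():
    with open('host.txt') as hosts_file:
        hosts = [line.strip() for line in hosts_file]
    # One lightweight task per host instead of one OS process or thread
    tasks = [asyncio.ensure_future(actions(host, i))
             for i, host in enumerate(hosts)]
    await asyncio.gather(*tasks)

loop = asyncio.get_event_loop()
try:
    loop.run_until_complete(main())
except KeyboardInterrupt:
    pass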

  • Yes, OS-level processes and threads are most likely too heavy and not needed here. You can try gevent or twisted to perform the actions on all nodes simultaneously. - jfs
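For completeness, a minimal gevent sketch of the same idea; greenlets are cooperative and very cheap, so one per host is fine (the per-host action is again a placeholder sleep):

import gevent
from gevent import monkey
monkey.patch_all()   # make standard blocking I/O cooperative

def actions(host, i):
    while True:
        gevent.sleep(1)   # real network I/O would yield here

def main():
    with open('host.txt') as hosts_file:
        hosts = [line.strip() for line in hosts_file]
    # One greenlet per host; all of them run inside a single OS thread
    jobs = [gevent.spawn(actions, host, i) for i, host in enumerate(hosts)]
    gevent.joinall(jobs)

if __name__ == '__main__':
    try:
        main()
    except KeyboardInterrupt:
        pass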