There are about 80,000 product links stored in the database. I loop over each link, read the data, and pick out the characteristics I need for the product. This takes a lot of time.

How can I implement a simple multithreading scheme to speed up this extraction? I searched the Internet and found little.

    1 answer

    Try Pool:

        from multiprocessing import Pool

        def get_page(link):
            ...

        links = ("link1", "link2", ...)

        pool = Pool()
        pool.map(get_page, links)
        pool.close()
        pool.join()
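    A runnable version of that sketch might look like the following. It is a minimal example, assuming the pages are fetched over HTTP with the standard library's urllib.request; the example URLs and the body of get_page are placeholders, not part of the original answer. Note that multiprocessing needs the worker function defined at module level and an if __name__ == "__main__": guard to work on every platform:

        from multiprocessing import Pool
        from urllib.request import urlopen

        def get_page(link):
            # Fetch the page; the real extraction of product
            # characteristics would go here.
            with urlopen(link, timeout=30) as resp:
                return link, resp.read()

        # Placeholder URLs; in the question these come from the database.
        links = ["https://example.com/product/1", "https://example.com/product/2"]

        if __name__ == "__main__":
            with Pool() as pool:  # defaults to os.cpu_count() worker processes
                for link, body in pool.map(get_page, links):
                    print(link, len(body))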
    • You can use threads instead of processes for I/O-bound tasks. Instead of calling pool.close() yourself, it is better to write with ThreadPool() as pool: so the pool is cleaned up even if an exception is thrown. pool.join() is not needed here. For 80_000 links it is worth calling imap_unordered() and logging the errors. It may be more convenient to use concurrent.futures if failed links are re-submitted to the pool (submit(), as_completed()); see the sketches after these comments. - jfs
    • I.e., instead of multiprocessing you need multiprocessing.dummy. - Nikolay Dyomin
    • Right. Or ThreadPool explicitly (it's the same thing). - jfs
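    Putting the comments together: since fetching pages is I/O-bound, a thread pool fits better than a process pool, and imap_unordered() yields results as they finish instead of collecting all 80_000 at once. A minimal sketch, again assuming urllib for fetching and a guessed pool size of 20 threads (tune for your site and bandwidth):

        from multiprocessing.pool import ThreadPool  # thread-based Pool, same API
        from urllib.request import urlopen

        def get_page(link):
            # Catch per-link errors so one bad link does not kill the whole run.
            try:
                with urlopen(link, timeout=30) as resp:
                    return link, resp.read(), None
            except Exception as exc:
                return link, None, exc

        # Placeholder URLs; in the question these come from the database.
        links = ["https://example.com/product/1", "https://example.com/product/2"]

        # `with` closes the pool even if an exception escapes the loop;
        # imap_unordered() hands back each result as soon as it is ready.
        with ThreadPool(20) as pool:
            for link, body, exc in pool.imap_unordered(get_page, links):
                if exc is not None:
                    print("failed:", link, exc)  # log it; retry later if needed
                else:
                    ...  # parse the product characteristics out of `body`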
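    And the concurrent.futures variant jfs mentions, where failed links are re-submitted to the pool with submit() and collected with as_completed(). A sketch under the same assumptions, with a made-up retry limit of three attempts:

        from concurrent.futures import ThreadPoolExecutor, as_completed
        from urllib.request import urlopen

        MAX_ATTEMPTS = 3  # assumed retry budget, not from the original answer

        def get_page(link):
            with urlopen(link, timeout=30) as resp:
                return resp.read()

        # Placeholder URLs; in the question these come from the database.
        links = ["https://example.com/product/1", "https://example.com/product/2"]

        with ThreadPoolExecutor(max_workers=20) as executor:
            pending = {executor.submit(get_page, link): (link, 1) for link in links}
            while pending:
                # Iterate over a snapshot: retries submitted below are
                # picked up by the next pass of the while loop.
                for future in as_completed(list(pending)):
                    link, attempt = pending.pop(future)
                    try:
                        body = future.result()
                        ...  # parse the product characteristics out of `body`
                    except Exception as exc:
                        if attempt < MAX_ATTEMPTS:
                            retry = executor.submit(get_page, link)
                            pending[retry] = (link, attempt + 1)
                        else:
                            print("giving up on", link, exc)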