Python: Process Pools and Executors

Process pools are a way to manage many worker processes in Python without having to create and control each one yourself. Instead of starting processes one by one, you create a pool that handles them for you. This makes running many tasks in parallel easier and cleaner.

Executors are a higher-level tool built on top of process pools. They provide a simple and modern interface to submit tasks and get results, helping you run code in parallel without worrying about the details of process management. In this article, we’ll learn how to use both process pools and executors to run tasks in parallel.

Importing Required Modules

To get started, you need to import the modules that provide process pools and executors. Python’s multiprocessing module offers the Pool class for managing worker processes, while the concurrent.futures module includes ProcessPoolExecutor, a higher-level interface for running tasks in parallel.

from multiprocessing import Pool
from concurrent.futures import ProcessPoolExecutor

This lets you use both approaches to run multiple tasks in parallel in your programs.

Using multiprocessing.Pool

The Pool class helps you create a group of worker processes that can run tasks at the same time. With map(), you can easily apply the same function to many inputs and get all the results back once they finish.

from multiprocessing import Pool

def square(n):
    return n * n


if __name__ == "__main__":

    with Pool(4) as pool:
        results = pool.map(square, [1, 2, 3, 4, 5])

    print(results)  # Output: [1, 4, 9, 16, 25]

In this example, we create a pool of 4 worker processes. pool.map() distributes the calls to square across those workers, so several numbers from [1, 2, 3, 4, 5] are processed at once. When every call has finished, the results are collected in input order and printed.
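
pool.map() passes exactly one argument to the function. When the function takes several arguments, Pool also provides starmap(), which unpacks each input tuple into the call; here is a minimal sketch (the power function is just for illustration):

from multiprocessing import Pool

def power(base, exponent):
    return base ** exponent

if __name__ == "__main__":

    with Pool(2) as pool:
        # Each tuple is unpacked into power(base, exponent)
        results = pool.starmap(power, [(2, 3), (3, 2), (4, 2)])

    print(results)  # [8, 9, 16]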

Using apply() and apply_async() in Pool

The apply() method runs a single task synchronously, meaning your program waits for it to finish before moving on. On the other hand, apply_async() runs the task asynchronously, letting your program continue while the task runs in the background. You can get the result later when it’s ready.

from multiprocessing import Pool

def greet(name):
    return f"Hello, {name}!"

if __name__ == "__main__":

    with Pool(2) as pool:

        # Run task synchronously
        result = pool.apply(greet, args=("Edward",))
        print(result)

        # Run task asynchronously
        async_result = pool.apply_async(greet, args=("Lucia",))
        print(async_result.get())

In this example, apply() blocks until greet("Edward") finishes, so you see the output right away. With apply_async(), the greeting for Lucia runs in the background, and you use .get() to wait for and print the result when it’s ready.
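
apply_async() also accepts an optional callback argument: a function that the pool calls with the result as soon as the task completes, so you don’t have to block on .get() at all. A small sketch of that pattern (the on_result function is just for illustration):

from multiprocessing import Pool

def greet(name):
    return f"Hello, {name}!"

def on_result(result):
    # Called in the parent process once the task finishes
    print("Callback received:", result)

if __name__ == "__main__":

    with Pool(2) as pool:
        pool.apply_async(greet, args=("Lucia",), callback=on_result)

        pool.close()
        pool.join()  # Make sure the task (and its callback) completes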

Using concurrent.futures.ProcessPoolExecutor

The ProcessPoolExecutor offers a modern, simple way to manage a pool of processes. It works well for running many tasks in parallel, using familiar methods like map(). This lets you apply a function to a list of inputs, and it handles creating and managing the worker processes for you.

from concurrent.futures import ProcessPoolExecutor

def cube(n):
    return n ** 3

if __name__ == "__main__":

    with ProcessPoolExecutor(max_workers=3) as executor:
        results = list(executor.map(cube, [1, 2, 3, 4]))

    print(results)  # [1, 8, 27, 64]

In this example, executor.map() applies the cube function to each number in the list, running the calls in parallel across the workers. It returns an iterator that yields results in the same order as the inputs; wrapping it in list() collects them all.

The max_workers=3 argument tells Python to use at most three worker processes at a time. Each worker handles one task at a time, so this caps how many tasks run in parallel. If you don’t set max_workers, Python defaults to the number of processors it detects on your machine.
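
If you want to see that number on your machine, os.cpu_count() reports how many CPUs Python detects. The sketch below passes it explicitly, which mirrors the usual default (the identity function is just for illustration):

import os
from concurrent.futures import ProcessPoolExecutor

def identity(n):
    return n

if __name__ == "__main__":

    workers = os.cpu_count()
    print("Detected CPUs:", workers)

    # Equivalent in spirit to ProcessPoolExecutor() with no arguments
    with ProcessPoolExecutor(max_workers=workers) as executor:
        print(list(executor.map(identity, range(4))))  # [0, 1, 2, 3]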

Submitting Tasks with submit() and Using Futures

With ProcessPoolExecutor, you can run tasks asynchronously using the submit() method. This immediately starts the task in the background and returns a Future object—a placeholder for the result that will be ready later.

To get the result from a Future, you call .result(). If the task isn’t finished yet, .result() waits until it completes and then returns the output.

Below is a clear example showing how to submit tasks and get their results:

from concurrent.futures import ProcessPoolExecutor

def shout(text):
    return text.upper()


if __name__ == "__main__":

    with ProcessPoolExecutor(max_workers=2) as executor:
        future1 = executor.submit(shout, "hello")
        future2 = executor.submit(shout, "world")

        print(future1.result())
        print(future2.result())

In this example, shout() is called twice in parallel using two worker processes (because max_workers=2). Each call runs independently; future1.result() blocks until the first task finishes, and future2.result() then returns the second result, which may already be waiting.
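
One detail worth knowing: if the submitted function raises an exception, it doesn’t vanish inside the worker. The exception is stored on the Future and re-raised when you call .result(), so you can handle it in the parent process. A short sketch:

from concurrent.futures import ProcessPoolExecutor

def divide(x, y):
    return x / y

if __name__ == "__main__":

    with ProcessPoolExecutor(max_workers=1) as executor:
        future = executor.submit(divide, 10, 0)

        try:
            print(future.result())
        except ZeroDivisionError as exc:
            # The worker's exception is re-raised here, in the parent
            print("Task failed:", exc)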

Waiting for Multiple Futures

Sometimes you run several tasks at once and want to wait until all of them are finished before moving on. The concurrent.futures.wait function helps with that: given a collection of futures, it blocks until they are all done (by default) and returns two sets, done and not_done.

Here’s a simple example showing how to submit multiple tasks and wait for them to complete:

from concurrent.futures import ProcessPoolExecutor, wait

def add(x, y):
    return x + y


if __name__ == "__main__":

    with ProcessPoolExecutor() as executor:

        futures = [executor.submit(add, i, i) for i in range(5)]
        done, not_done = wait(futures)

        for future in done:
            print(future.result())

In this example, we start five tasks that add numbers. The program waits until every task has finished, then prints the results; note that done is a set, so the results may appear in any order. This pattern lets you be sure a batch of tasks is complete before continuing.
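
If you would rather handle each result as soon as its task finishes, rather than waiting for the whole batch, concurrent.futures.as_completed yields futures one by one in completion order:

from concurrent.futures import ProcessPoolExecutor, as_completed

def add(x, y):
    return x + y

if __name__ == "__main__":

    with ProcessPoolExecutor() as executor:
        futures = [executor.submit(add, i, i) for i in range(5)]

        # Yields each future as it finishes, not in submission order
        for future in as_completed(futures):
            print(future.result())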

Closing and Joining Pools

When you use a Pool to run multiple processes, it’s important to properly close the pool and wait for all processes to finish. Calling pool.close() tells the pool that no more tasks will be added. Then, pool.join() waits for all the worker processes to complete their work before the program continues.

Here’s an example showing how to do this:

from multiprocessing import Pool

def double(n):
    return n * 2


if __name__ == "__main__":

    pool = Pool(3)

    results = pool.map(double, [1, 2, 3])

    pool.close()  # Stop accepting new tasks
    pool.join()   # Wait for all workers to finish

    print(results)

In this code, pool.map() runs the double function on each number and blocks until all results are ready. close() then tells the pool to stop accepting new work, and join() waits for the worker processes to exit before the program prints the results. This keeps shutdown orderly when you manage a pool by hand.

Using with for Automatic Cleanup

You can simplify this by creating the pool in a with statement, so the pool is cleaned up automatically when the block ends. Note that Pool’s context manager calls terminate() on exit rather than close() and join(), which is safe here because pool.map() has already collected every result before the block finishes.

from multiprocessing import Pool

def double(n):
    return n * 2

if __name__ == "__main__":

    with Pool(3) as pool:
        results = pool.map(double, [1, 2, 3])

    print(results)

Using with is a clean and safe way to manage process pools, making your code shorter and ensuring proper cleanup without extra steps.

Choosing Between Pool and ProcessPoolExecutor

Both Pool and ProcessPoolExecutor let you run tasks in parallel using multiple processes. They work in similar ways, but ProcessPoolExecutor offers a simpler, more modern interface with built-in support for futures and companions like wait() and as_completed(). If you want straightforward task submission and result handling, ProcessPoolExecutor is the better default. Pool still works well and offers a few extras of its own, such as starmap() and apply_async() callbacks.

Daemon Processes and Pools/Executors

You cannot directly create daemon processes with multiprocessing.Pool or concurrent.futures.ProcessPoolExecutor. Both tools create and manage their worker processes internally and do not expose the workers’ daemon attribute for you to set. (In fact, in CPython, Pool marks its workers as daemonic, which is why a Pool worker cannot spawn child processes of its own.)

This design keeps worker lifetimes under the pool’s control, so tasks can finish and workers can shut down cleanly. A daemon process, by contrast, stops abruptly when the main program ends, which can interrupt running tasks and prevent results from being returned properly.

If you need background tasks that behave like daemon processes, you should create and manage individual multiprocessing.Process instances, where you can set the daemon attribute to True yourself.
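
For completeness, here is a minimal sketch of that approach, using multiprocessing.Process with daemon=True (the background_job function is just for illustration):

import time
from multiprocessing import Process

def background_job():
    while True:
        time.sleep(1)  # Pretend to do periodic background work

if __name__ == "__main__":

    worker = Process(target=background_job, daemon=True)
    worker.start()

    time.sleep(2)  # The main program does its own work here
    # No join() needed: the daemon worker is killed when the program exits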

In summary, pools and executors are meant for managing worker processes for parallel tasks whose results you need, while daemon processes suit lightweight background jobs that you create and control directly.

Conclusion

Process pools and executors provide a simple way to run many tasks in parallel across multiple CPU cores. By managing worker processes for you, they let you focus on writing your functions without worrying about creating and handling each process manually. Whether you use multiprocessing.Pool or concurrent.futures.ProcessPoolExecutor, both tools help you divide work efficiently and collect results easily.

Exploring these tools in your own projects will give you hands-on experience with parallel processing in Python. Running tasks concurrently can speed up programs that perform many independent jobs, and process pools make this much more manageable. Give it a try, and see how your code can benefit from running tasks side by side.