In Python, both processes and threads help you run multiple tasks at the same time. A process is like a separate program running on your computer, with its own memory. A thread is a smaller task running inside a program, sharing the same memory with other threads.
Using processes and threads makes your programs faster and more efficient by doing many things at once. This article will show you when to use processes or threads, and how to use them with simple, clear examples. By the end, you’ll understand how to pick the right tool for your task and get started with your own multi-tasking Python programs.
What is a Thread?
A thread is a lightweight task that runs inside a program. Multiple threads can run concurrently within the same program, sharing the same memory space. This makes communication between threads easy because they can access the same data directly.
In Python, you can use the threading module to create and manage threads. You create a thread by passing a function that the thread will run. Then, start the thread to run it concurrently with the main program or other threads.
import threading
import time

def print_numbers():
    # Print numbers 0 to 4, pausing 1 second between each
    for i in range(5):
        time.sleep(1)  # Simulate work with a delay
        print(f"Thread prints: {i}")

# Create a thread to run the print_numbers function
thread = threading.Thread(target=print_numbers)

# Start the thread; it runs alongside the main program
thread.start()

# Wait for the thread to finish before continuing
thread.join()

print('Main Ends...')
This example creates a thread that prints numbers from 0 to 4 while the main program waits for it to finish using join(). Threads are great for tasks like waiting for input or handling multiple connections, where sharing memory helps.
What is a Process?
A process is an independent program that runs separately from others, with its own memory space. Because each process runs its own Python interpreter with its own memory, processes can run truly in parallel on multiple CPU cores without being blocked by Python’s Global Interpreter Lock (GIL). This makes processes perfect for CPU-heavy tasks like calculations or image processing.
In Python, you use the multiprocessing module to create and manage processes. You create a process by passing a function that the process will run. Then, start the process to run it independently of the main program.
from multiprocessing import Process
import time

def print_letters():
    # Print letters A to E with a pause to simulate work
    for letter in 'ABCDE':
        time.sleep(1)
        print(f"Process prints: {letter}")

if __name__ == '__main__':
    # Create a new process that runs the print_letters function
    process = Process(target=print_letters)

    # Start the process (runs separately from the main program)
    process.start()

    # Wait for the process to finish before continuing
    process.join()

    print('Main process ends...')
This example creates a process that prints letters A to E with a 1-second delay between each. The main program waits for the process to finish with join(). Processes are useful when tasks need to run truly in parallel, like heavy calculations, since they don’t share memory and avoid GIL limits.
Note: When working with the multiprocessing module, it’s important to place your process creation and start code inside the if __name__ == '__main__': block. This ensures your program runs safely and correctly, especially on Windows, by preventing the code from running unintentionally when new processes start.
When to Use Threads
Threads work best for I/O-bound tasks — these are jobs that spend time waiting, like reading files, talking to a network, or getting user input. Since threads share the same memory space, it’s easy to share data between them without extra work.
Using threads for I/O lets your program stay busy while waiting, making things feel faster and smoother. For example, you can start several threads to download files or print messages at the same time without waiting for one to finish before starting another.
import threading
import time

def wait_and_print(name):
    # Simulate waiting (like an I/O task)
    time.sleep(1)
    print(f"{name} finished waiting")

# Start multiple threads to run the wait_and_print function
for name in ['Thread-A', 'Thread-B', 'Thread-C']:
    threading.Thread(target=wait_and_print, args=(name,)).start()
In this example, three threads run together, each waiting for 1 second before printing a message. Because they wait concurrently, the total wait time is about 1 second instead of the 3 seconds it would take one by one. Threads make such waiting tasks simple and efficient.
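If you find yourself starting many short-lived threads like this, the standard library’s concurrent.futures.ThreadPoolExecutor can create and join the threads for you. Here’s a minimal sketch of the same waiting task using a thread pool (the pool size of 3 is just an example):

from concurrent.futures import ThreadPoolExecutor
import time

def wait_and_print(name):
    # Simulate waiting (like an I/O task)
    time.sleep(1)
    print(f"{name} finished waiting")

# The pool starts the threads and waits for them when the block ends
with ThreadPoolExecutor(max_workers=3) as executor:
    executor.map(wait_and_print, ['Thread-A', 'Thread-B', 'Thread-C'])

print("All threads done")

The with block handles cleanup, so there is no need to call start() or join() yourself.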
When to Use Processes
Processes are best for CPU-bound tasks — jobs that need a lot of number crunching, image editing, or heavy math calculations. Unlike threads, each process runs independently and can use different CPU cores, so they really work in parallel without being slowed down by Python’s Global Interpreter Lock (GIL).
Using processes lets you spread out tough calculations across your computer’s CPUs, speeding up tasks that would otherwise take a long time. For example, running several math-heavy functions in separate processes lets your program do more work at once.
from multiprocessing import Process

def heavy_math(number):
    # Calculate sum of squares up to the given number
    result = sum(i*i for i in range(number))
    print(f"Sum of squares up to {number} is {result}")

if __name__ == '__main__':
    processes = []

    # Create and start processes for different calculations
    for num in [10000, 20000, 30000]:
        p = Process(target=heavy_math, args=(num,))
        p.start()
        processes.append(p)

    # Wait for all processes to finish
    for p in processes:
        p.join()
This example runs three heavy math tasks in separate processes at the same time. Each process calculates a sum of squares for a different number, working fully in parallel and printing results independently. This is how you use multiprocessing to handle CPU-heavy work easily.
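You also don’t have to start every process by hand. A common alternative, sketched below, is multiprocessing.Pool, which spreads the calls across a fixed number of worker processes and collects the return values for you (the pool size of 3 is just an example):

from multiprocessing import Pool

def sum_of_squares(number):
    # Calculate sum of squares up to the given number
    return sum(i*i for i in range(number))

if __name__ == '__main__':
    # Three worker processes share the list of inputs
    with Pool(processes=3) as pool:
        results = pool.map(sum_of_squares, [10000, 20000, 30000])
    print(results)

pool.map returns the results in the same order as the inputs, which makes this pattern handy when you care about the answers, not just the side effects.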
Sharing Data: Threads vs Processes
Threads run inside the same memory space, so sharing data between them is simple — they can directly read and write the same variables. This makes communication fast and easy but requires care to avoid conflicts.
Processes, however, run in separate memory spaces. They cannot directly share normal variables. To share data between processes, you need special tools like multiprocessing.Queue or a Manager object that helps manage shared data safely.
Example: Thread Sharing a Variable
import threading

shared_data = []

def add_items():
    for i in range(5):
        shared_data.append(i)
        print(f"Thread added {i}")

threads = [threading.Thread(target=add_items) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("Shared data:", shared_data)
Here, two threads add items to the same list. Since they share memory, the list is updated directly.
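Because both threads can touch the same data at the same time, it’s a good habit to guard shared updates with a threading.Lock. Here’s a small sketch of the same idea with a lock protecting the list (the lock is the only new piece):

import threading

shared_data = []
lock = threading.Lock()

def add_items():
    for i in range(5):
        # Only one thread at a time may modify the shared list
        with lock:
            shared_data.append(i)

threads = [threading.Thread(target=add_items) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("Shared data:", shared_data)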
Example: Process Sharing Data with Queue
from multiprocessing import Process, Queue

# Producer function: adds items to the queue
def producer(q):
    for i in range(5):
        q.put(i)  # Put item into the queue
        print(f"Produced {i}")

# Consumer function: takes items from the queue
def consumer(q):
    while not q.empty():  # Check if queue still has items
        item = q.get()  # Get item from the queue
        print(f"Consumed {item}")

# Required to protect code when using multiprocessing
if __name__ == '__main__':
    q = Queue()  # Create a queue to share between processes

    # Create producer and consumer processes
    p1 = Process(target=producer, args=(q,))
    p2 = Process(target=consumer, args=(q,))

    p1.start()  # Start producer
    p1.join()   # Wait until producer is done

    p2.start()  # Start consumer
    p2.join()   # Wait until consumer is done
In this example, a Queue is used to send data from the producer process to the consumer process safely, because processes do not share memory directly.
If you tried to use a regular variable (like a list or dictionary) to share data between the two processes, it wouldn’t work. Each process runs in its own memory space, so changes made in one process wouldn’t be visible to the other.
Regular List
Here’s an example using a regular list to show why it doesn’t work across processes:
from multiprocessing import Process
import time

def producer(data):
    for i in range(5):
        data.append(i)
        print(f"Producer added {i}")
        time.sleep(0.5)

def consumer(data):
    time.sleep(3)  # Wait to make sure producer is done
    print("Consumer sees:", data)

if __name__ == '__main__':
    shared_list = []

    p1 = Process(target=producer, args=(shared_list,))
    p2 = Process(target=consumer, args=(shared_list,))

    p1.start()
    p2.start()

    p1.join()
    p2.join()
Even though the producer adds items to shared_list, the consumer sees it as empty. That’s because each process has its own copy of the list — they don’t share memory. So changes in one process don’t affect the other.
That’s why we use multiprocessing.Queue or Manager().list() — these are built-in tools designed for safe communication between processes. Since each process has its own memory space, regular variables won’t work for sharing data. But with a queue or manager, Python handles all the behind-the-scenes work, allowing you to pass values like numbers, strings, or even objects smoothly between processes. It’s an easy and reliable way to connect separate parts of your program.
Manager().list()
Here’s an example using multiprocessing.Manager().list() to safely share a list between two processes:
from multiprocessing import Process, Manager

def add_numbers(shared_list):
    for i in range(5):
        shared_list.append(i)
        print(f"Added {i}")

def read_numbers(shared_list):
    for num in shared_list:
        print(f"Read {num}")

if __name__ == '__main__':
    with Manager() as manager:
        shared = manager.list()  # Create a shared list

        p1 = Process(target=add_numbers, args=(shared,))
        p2 = Process(target=read_numbers, args=(shared,))

        p1.start()
        p1.join()  # Wait for the writer to finish

        p2.start()
        p2.join()  # Then the reader starts
This example shows how two processes can work with the same list safely. The first process fills the list with numbers, and the second reads them out. The Manager().list() handles all the sharing behind the scenes, so each process sees the same data even though they have separate memory. This makes it easy to pass information between processes without using files or sockets.
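If you only need to share a single number rather than a whole list, multiprocessing also provides Value, which keeps one typed value in shared memory. Here’s a minimal sketch (the counter and the four worker processes are just for illustration):

from multiprocessing import Process, Value

def add_one(counter):
    # Take the value's built-in lock so updates don't overlap
    with counter.get_lock():
        counter.value += 1

if __name__ == '__main__':
    counter = Value('i', 0)  # 'i' means a C integer, starting at 0
    workers = [Process(target=add_one, args=(counter,)) for _ in range(4)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
    print("Counter:", counter.value)  # Prints 4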
So, threads make sharing data quick and easy inside one program, but processes require special tools to pass data safely between separate memory spaces.
Stopping and Joining Threads and Processes
When working with threads or processes, it’s important to wait for them to finish their tasks before moving on. This is done using the join() method. It tells the main program to pause and wait until the thread or process is done.
This is especially useful when you want to make sure everything is complete before exiting the program or moving to the next step.
Here’s a simple example that shows how to use join() with both a thread and a process:
import threading
from multiprocessing import Process
import time

def print_numbers():
    for i in range(3):
        time.sleep(1)
        print(f"Thread prints: {i}")

def print_letters():
    for letter in 'XYZ':
        time.sleep(1)
        print(f"Process prints: {letter}")

if __name__ == '__main__':
    # Start and join thread
    thread = threading.Thread(target=print_numbers)
    thread.start()
    thread.join()  # Wait for thread to finish

    # Start and join process
    process = Process(target=print_letters)
    process.start()
    process.join()  # Wait for process to finish

    print("Main program done.")
In this example, the main program waits for the thread to finish printing numbers, then waits for the process to finish printing letters. Using join() ensures that everything runs in the right order and nothing is cut off early.
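join() also accepts an optional timeout in seconds, so the main program doesn’t wait forever if a task hangs. Here’s a small sketch (the five-second task and two-second limit are just made-up values):

import threading
import time

def slow_task():
    time.sleep(5)  # Pretend this takes a long time

thread = threading.Thread(target=slow_task)
thread.start()
thread.join(timeout=2)  # Wait at most 2 seconds

if thread.is_alive():
    print("Still running after 2 seconds, moving on...")
else:
    print("Thread finished in time")

Process.join() takes the same timeout argument, so the pattern works for processes too.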
Combining Threads and Processes
Sometimes your program needs the best of both worlds. You might want to use processes to run heavy CPU tasks in parallel, and inside those processes, use threads to handle lightweight I/O tasks like printing, reading files, or network calls.
This works well because each process runs independently on a separate CPU core, while the threads within each process can do tasks without blocking each other on I/O.
Here’s a simple example: one process runs a function that starts multiple threads.
from multiprocessing import Process
import threading
import time

def handle_io_task(name):
    time.sleep(1)
    print(f"Thread {name} inside process finished I/O")

def process_with_threads(process_name):
    print(f"Process {process_name} starting threads")
    threads = []
    for tname in ['A', 'B', 'C']:
        thread = threading.Thread(target=handle_io_task, args=(f"{process_name}-{tname}",))
        thread.start()
        threads.append(thread)
    for thread in threads:
        thread.join()
    print(f"Process {process_name} done with threads")

if __name__ == '__main__':
    # Start two processes, each with its own threads
    p1 = Process(target=process_with_threads, args=('One',))
    p2 = Process(target=process_with_threads, args=('Two',))

    p1.start()
    p2.start()

    p1.join()
    p2.join()

    print("Main program finished.")
In this setup, each process (One and Two) starts three threads to simulate I/O tasks. The main program waits until all processes and their internal threads are done. This pattern is useful in apps like web servers, data pipelines, or anything with both I/O and CPU work.
Quick Reference Table
This table gives a simple comparison to help you decide when to use threads or processes in your Python projects:
| Concept | Threads | Processes |
|---|---|---|
| Memory | Shared among threads | Each process has its own memory space |
| Use Case | Great for I/O-bound tasks | Best for CPU-bound tasks |
| Start With | threading.Thread | multiprocessing.Process |
| Data Sharing | Simple (shared variables) | Requires tools like Queue or Manager |
| Parallelism | Limited by GIL in CPU-heavy tasks | True parallelism using multiple cores |
Use this as a cheat sheet to choose the right tool for your job. Threads are easy and fast for waiting tasks; processes are strong for serious number crunching.
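As a final tip, the standard library’s concurrent.futures module wraps both approaches behind the same interface, so switching from threads to processes can be as simple as swapping a class name. Here’s a quick sketch, reusing the sum-of-squares idea from earlier:

from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def sum_of_squares(number):
    return sum(i*i for i in range(number))

if __name__ == '__main__':
    # Swap ProcessPoolExecutor for ThreadPoolExecutor to switch models
    with ProcessPoolExecutor() as executor:
        results = list(executor.map(sum_of_squares, [10000, 20000, 30000]))
    print(results)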
Fun Real-Life Analogy
Imagine you’re running a company.
Threads are like workers sitting around the same big desk. They can pass papers (data) back and forth easily because they’re all in the same space. If one picks up a file, the others can read or edit it too. It’s quick to share, but they might accidentally get in each other’s way if they grab the same paper at the same time. Still, great for teamwork where tasks involve lots of waiting—like answering calls or waiting on emails.
Processes are like workers in their own private offices. Each person has their own desk, computer, and door. They can work at full speed without bumping into anyone else. But if they need to share info, they have to use a phone or send a message (like using a Queue or Manager). It’s safer and more powerful for big, heavy tasks like calculations or design work—but takes more effort to communicate.
So, for light chatting and teamwork at the same desk—use threads. For focused, independent work in separate rooms—go with processes.
Conclusion
Threads and processes help your Python programs do many things at the same time. Threads are perfect for light tasks like waiting for files or network data, because they share memory and are easy to use. Processes work better for heavy jobs like math or image work since they run truly in parallel without sharing memory.
By trying out simple examples, you can see how each method works and when to use them. Sometimes, combining processes and threads gives you the best of both worlds — handling heavy work and smooth communication together. Start small, experiment, and build powerful, efficient programs that fit your needs.