In Python, both processes and threads help you run multiple tasks at the same time. A process is like a separate program running on your computer, with its own memory. A thread is a smaller task running inside a program, sharing the same memory with other threads.
Using processes and threads makes your programs faster and more efficient by doing many things at once. This article will show you when to use processes or threads, and how to use them with simple, clear examples. By the end, you’ll understand how to pick the right tool for your task and get started with your own multi-tasking Python programs.
What is a Thread?
A thread is a lightweight task that runs inside a program. Multiple threads can run concurrently within the same program, sharing the same memory space. This makes communication between threads easy because they can access the same data directly.
In Python, you can use the threading module to create and manage threads. You create a thread by passing a function that the thread will run. Then, start the thread to run it concurrently with the main program or other threads.
import threading
import time

def print_numbers():
    # Print numbers 0 to 4, pausing 1 second between each
    for i in range(5):
        time.sleep(1)  # Simulate work with a delay
        print(f"Thread prints: {i}")

# Create a thread to run the print_numbers function
thread = threading.Thread(target=print_numbers)

# Start the thread; it runs alongside the main program
thread.start()

# Wait for the thread to finish before continuing
thread.join()

print('Main Ends...')
This example creates a thread that prints numbers from 0 to 4 while the main program waits for it to finish using join(). Threads are great for tasks like waiting for input or handling multiple connections, where sharing memory helps.
What is a Process?
A process is an independent program that runs separately from others, with its own memory space. Because each process runs its own Python interpreter with its own memory, processes can run truly in parallel on multiple CPU cores without being blocked by Python’s Global Interpreter Lock (GIL). This makes processes perfect for CPU-heavy tasks like calculations or image processing.
In Python, you use the multiprocessing module to create and manage processes. You create a process by passing a function that the process will run. Then, start the process to run it independently of the main program.
from multiprocessing import Process
import time

def print_letters():
    # Print letters A to E with a pause to simulate work
    for letter in 'ABCDE':
        time.sleep(1)
        print(f"Process prints: {letter}")

if __name__ == '__main__':
    # Create a new process that runs the print_letters function
    process = Process(target=print_letters)

    # Start the process (runs separately from the main program)
    process.start()

    # Wait for the process to finish before continuing
    process.join()

    print('Main process ends...')
This example creates a process that prints letters A to E with a 1-second delay between each. The main program waits for the process to finish with join(). Processes are useful when tasks need to run truly in parallel, like heavy calculations, since they don’t share memory and avoid GIL limits.
Note: When working with the multiprocessing module, it’s important to place your process creation and start code inside the if __name__ == '__main__': block. This ensures your program runs safely and correctly, especially on Windows, by preventing the code from running unintentionally when new processes start.
When to Use Threads
Threads work best for I/O-bound tasks — these are jobs that spend time waiting, like reading files, talking to a network, or getting user input. Since threads share the same memory space, it’s easy to share data between them without extra work.
Using threads for I/O lets your program stay busy while waiting, making things feel faster and smoother. For example, you can start several threads to download files or print messages at the same time without waiting for one to finish before starting another.
import threading
import time

def wait_and_print(name):
    # Simulate waiting (like an I/O task)
    time.sleep(1)
    print(f"{name} finished waiting")

# Start multiple threads to run the wait_and_print function
for name in ['Thread-A', 'Thread-B', 'Thread-C']:
    threading.Thread(target=wait_and_print, args=(name,)).start()
In this example, three threads run together, each waiting for 1 second before printing a message. Because they wait concurrently, the total wait time is about 1 second instead of the 3 seconds it would take one by one. Threads make such waiting tasks simple and efficient.
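If you find yourself starting many short-lived threads like this, the standard library’s concurrent.futures.ThreadPoolExecutor can create and join the threads for you. Here’s a minimal sketch of the same waiting task using a thread pool (the pool size of 3 is just an example):

from concurrent.futures import ThreadPoolExecutor
import time

def wait_and_print(name):
    # Simulate waiting (like an I/O task)
    time.sleep(1)
    print(f"{name} finished waiting")

# The pool starts the threads and waits for them when the block ends
with ThreadPoolExecutor(max_workers=3) as executor:
    executor.map(wait_and_print, ['Thread-A', 'Thread-B', 'Thread-C'])

print("All threads done")

The with block handles cleanup, so there is no need to call start() or join() yourself.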
When to Use Processes
Processes are best for CPU-bound tasks — jobs that need a lot of number crunching, image editing, or heavy math calculations. Unlike threads, each process runs independently and can use different CPU cores, so they really work in parallel without being slowed down by Python’s Global Interpreter Lock (GIL).
Using processes lets you spread out tough calculations across your computer’s CPUs, speeding up tasks that would otherwise take a long time. For example, running several math-heavy functions in separate processes lets your program do more work at once.
from multiprocessing import Process

def heavy_math(number):
    # Calculate sum of squares up to the given number
    result = sum(i*i for i in range(number))
    print(f"Sum of squares up to {number} is {result}")

if __name__ == '__main__':
    processes = []

    # Create and start processes for different calculations
    for num in [10000, 20000, 30000]:
        p = Process(target=heavy_math, args=(num,))
        p.start()
        processes.append(p)

    # Wait for all processes to finish
    for p in processes:
        p.join()
This example runs three heavy math tasks in separate processes at the same time. Each process calculates a sum of squares for a different number, working fully in parallel and printing results independently. This is how you use multiprocessing to handle CPU-heavy work easily.
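You also don’t have to start every process by hand. A common alternative, sketched below, is multiprocessing.Pool, which spreads the calls across a fixed number of worker processes and collects the return values for you (the pool size of 3 is just an example):

from multiprocessing import Pool

def sum_of_squares(number):
    # Calculate sum of squares up to the given number
    return sum(i*i for i in range(number))

if __name__ == '__main__':
    # Three worker processes share the list of inputs
    with Pool(processes=3) as pool:
        results = pool.map(sum_of_squares, [10000, 20000, 30000])
    print(results)

pool.map returns the results in the same order as the inputs, which makes this pattern handy when you care about the answers, not just the side effects.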
Sharing Data: Threads vs Processes
Threads run inside the same memory space, so sharing data between them is simple — they can directly read and write the same variables. This makes communication fast and easy but requires care to avoid conflicts.
Processes, however, run in separate memory spaces. They cannot directly share normal variables. To share data between processes, you need special tools like multiprocessing.Queue or a Manager object that helps manage shared data safely.
Example: Thread Sharing a Variable
import threading

shared_data = []

def add_items():
    for i in range(5):
        shared_data.append(i)
        print(f"Thread added {i}")

threads = [threading.Thread(target=add_items) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("Shared data:", shared_data)
Here, two threads add items to the same list. Since they share memory, the list is updated directly.
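Because both threads can touch the same data at the same time, it’s a good habit to guard shared updates with a threading.Lock. Here’s a small sketch of the same idea with a lock protecting the list (the lock is the only new piece):

import threading

shared_data = []
lock = threading.Lock()

def add_items():
    for i in range(5):
        # Only one thread at a time may modify the shared list
        with lock:
            shared_data.append(i)

threads = [threading.Thread(target=add_items) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("Shared data:", shared_data)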
Example: Process Sharing Data with Queue
from multiprocessing import Process, Queue

# Producer function: adds items to the queue
def producer(q):
    for i in range(5):
        q.put(i)  # Put item into the queue
        print(f"Produced {i}")

# Consumer function: takes items from the queue
def consumer(q):
    while not q.empty():  # Check if queue still has items
        item = q.get()  # Get item from the queue
        print(f"Consumed {item}")

# Required to protect code when using multiprocessing
if __name__ == '__main__':
    q = Queue()  # Create a queue to share between processes

    # Create producer and consumer processes
    p1 = Process(target=producer, args=(q,))
    p2 = Process(target=consumer, args=(q,))

    p1.start()  # Start producer
    p1.join()   # Wait until producer is done

    p2.start()  # Start consumer
    p2.join()   # Wait until consumer is done
In this example, a Queue is used to send data from the producer process to the consumer process safely, because processes do not share memory directly.
If you tried to use a regular variable (like a list or dictionary) to share data between the two processes, it wouldn’t work. Each process runs in its own memory space, so changes made in one process wouldn’t be visible to the other.
Regular List
Here’s an example using a regular list to show why it doesn’t work across processes:
from multiprocessing import Process
import time

def producer(data):
    for i in range(5):
        data.append(i)
        print(f"Producer added {i}")
        time.sleep(0.5)

def consumer(data):
    time.sleep(3)  # Wait to make sure producer is done
    print("Consumer sees:", data)

if __name__ == '__main__':
    shared_list = []

    p1 = Process(target=producer, args=(shared_list,))
    p2 = Process(target=consumer, args=(shared_list,))

    p1.start()
    p2.start()

    p1.join()
    p2.join()
Even though the producer adds items to shared_list, the consumer sees it as empty. That’s because each process has its own copy of the list — they don’t share memory. So changes in one process don’t affect the other.
That’s why we use multiprocessing.Queue or Manager().list() — these are built-in tools designed for safe communication between processes. Since each process has its own memory space, regular variables won’t work for sharing data. But with a queue or manager, Python handles all the behind-the-scenes work, allowing you to pass values like numbers, strings, or even objects smoothly between processes. It’s an easy and reliable way to connect separate parts of your program.
Manager().list()
Here’s an example using multiprocessing.Manager().list() to safely share a list between two processes:
from multiprocessing import Process, Manager

def add_numbers(shared_list):
    for i in range(5):
        shared_list.append(i)
        print(f"Added {i}")

def read_numbers(shared_list):
    for num in shared_list:
        print(f"Read {num}")

if __name__ == '__main__':
    with Manager() as manager:
        shared = manager.list()  # Create a shared list

        p1 = Process(target=add_numbers, args=(shared,))
        p2 = Process(target=read_numbers, args=(shared,))

        p1.start()
        p1.join()  # Wait for the writer to finish

        p2.start()
        p2.join()  # Then the reader starts
This example shows how two processes can work with the same list safely. The first process fills the list with numbers, and the second reads them out. The Manager().list() handles all the sharing behind the scenes, so each process sees the same data even though they have separate memory. This makes it easy to pass information between processes without using files or sockets.
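If you only need to share a single number rather than a whole list, multiprocessing also provides Value, which keeps one typed value in shared memory. Here’s a minimal sketch (the counter and the four worker processes are just for illustration):

from multiprocessing import Process, Value

def add_one(counter):
    # Take the value's built-in lock so updates don't overlap
    with counter.get_lock():
        counter.value += 1

if __name__ == '__main__':
    counter = Value('i', 0)  # 'i' means a C integer, starting at 0
    workers = [Process(target=add_one, args=(counter,)) for _ in range(4)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
    print("Counter:", counter.value)  # Prints 4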
So, threads make sharing data quick and easy inside one program, but processes require special tools to pass data safely between separate memory spaces.
Stopping and Joining Threads and Processes
When working with threads or processes, it’s important to wait for them to finish their tasks before moving on. This is done using the join() method. It tells the main program to pause and wait until the thread or process is done.
This is especially useful when you want to make sure everything is complete before exiting the program or moving to the next step.
Here’s a simple example that shows how to use join() with both a thread and a process:
import threading
from multiprocessing import Process
import time

def print_numbers():
    for i in range(3):
        time.sleep(1)
        print(f"Thread prints: {i}")

def print_letters():
    for letter in 'XYZ':
        time.sleep(1)
        print(f"Process prints: {letter}")

if __name__ == '__main__':
    # Start and join thread
    thread = threading.Thread(target=print_numbers)
    thread.start()
    thread.join()  # Wait for thread to finish

    # Start and join process
    process = Process(target=print_letters)
    process.start()
    process.join()  # Wait for process to finish

    print("Main program done.")
In this example, the main program waits for the thread to finish printing numbers, then waits for the process to finish printing letters. Using join() ensures that everything runs in the right order and nothing is cut off early.
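join() also accepts an optional timeout in seconds, so the main program doesn’t wait forever if a task hangs. Here’s a small sketch (the five-second task and two-second limit are just made-up values):

import threading
import time

def slow_task():
    time.sleep(5)  # Pretend this takes a long time

thread = threading.Thread(target=slow_task)
thread.start()
thread.join(timeout=2)  # Wait at most 2 seconds

if thread.is_alive():
    print("Still running after 2 seconds, moving on...")
else:
    print("Thread finished in time")

Process.join() takes the same timeout argument, so the pattern works for processes too.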
Combining Threads and Processes
Sometimes your program needs the best of both worlds. You might want to use processes to run heavy CPU tasks in parallel, and inside those processes, use threads to handle lightweight I/O tasks like printing, reading files, or network calls.
This works well because each process runs independently on a separate CPU core, while the threads within each process can do tasks without blocking each other on I/O.
Here’s a simple example: one process runs a function that starts multiple threads.
from multiprocessing import Process
import threading
import time

def handle_io_task(name):
    time.sleep(1)
    print(f"Thread {name} inside process finished I/O")

def process_with_threads(process_name):
    print(f"Process {process_name} starting threads")
    threads = []
    for tname in ['A', 'B', 'C']:
        thread = threading.Thread(target=handle_io_task, args=(f"{process_name}-{tname}",))
        thread.start()
        threads.append(thread)
    for thread in threads:
        thread.join()
    print(f"Process {process_name} done with threads")

if __name__ == '__main__':
    # Start two processes, each with its own threads
    p1 = Process(target=process_with_threads, args=('One',))
    p2 = Process(target=process_with_threads, args=('Two',))

    p1.start()
    p2.start()

    p1.join()
    p2.join()

    print("Main program finished.")
In this setup, each process (One and Two) starts three threads to simulate I/O tasks. The main program waits until all processes and their internal threads are done. This pattern is useful in apps like web servers, data pipelines, or anything with both I/O and CPU work.
Quick Reference Table
This table gives a simple comparison to help you decide when to use threads or processes in your Python projects:
| Concept | Threads | Processes |
|---|---|---|
| Memory | Shared among threads | Each process has its own memory space |
| Use Case | Great for I/O-bound tasks | Best for CPU-bound tasks |
| Start With | threading.Thread | multiprocessing.Process |
| Data Sharing | Simple (shared variables) | Requires tools like Queue or Manager |
| Parallelism | Limited by GIL in CPU-heavy tasks | True parallelism using multiple cores |
Use this as a cheat sheet to choose the right tool for your job. Threads are easy and fast for waiting tasks; processes are strong for serious number crunching.
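As a final tip, the standard library’s concurrent.futures module wraps both approaches behind the same interface, so switching from threads to processes can be as simple as swapping a class name. Here’s a quick sketch, reusing the sum-of-squares idea from earlier:

from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def sum_of_squares(number):
    return sum(i*i for i in range(number))

if __name__ == '__main__':
    # Swap ProcessPoolExecutor for ThreadPoolExecutor to switch models
    with ProcessPoolExecutor() as executor:
        results = list(executor.map(sum_of_squares, [10000, 20000, 30000]))
    print(results)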
Fun Real-Life Analogy
Imagine you’re running a company.
Threads are like workers sitting around the same big desk. They can pass papers (data) back and forth easily because they’re all in the same space. If one picks up a file, the others can read or edit it too. It’s quick to share, but they might accidentally get in each other’s way if they grab the same paper at the same time. Still, great for teamwork where tasks involve lots of waiting—like answering calls or waiting on emails.
Processes are like workers in their own private offices. Each person has their own desk, computer, and door. They can work at full speed without bumping into anyone else. But if they need to share info, they have to use a phone or send a message (like using a Queue or Manager). It’s safer and more powerful for big, heavy tasks like calculations or design work—but takes more effort to communicate.
So, for light chatting and teamwork at the same desk—use threads. For focused, independent work in separate rooms—go with processes.
Conclusion
Threads and processes help your Python programs do many things at the same time. Threads are perfect for light tasks like waiting for files or network data, because they share memory and are easy to use. Processes work better for heavy jobs like math or image work since they run truly in parallel without sharing memory.
By trying out simple examples, you can see how each method works and when to use them. Sometimes, combining processes and threads gives you the best of both worlds — handling heavy work and smooth communication together. Start small, experiment, and build powerful, efficient programs that fit your needs.