blog
blog copied to clipboard
Breadth-First Search (Python)
References:
- Book: Grokking Algorithms
- https://github.com/egonschiele/grokking_algorithms
- https://pythontutor.com
How many steps are there to find the shortest route to a place? This type of problem is called a shortest-path problem.
The algorithm to solve a shortest-path problem is called breadth-irst search.
You can use breadth-irst search to:
- Write a checkers AI that calculates the fewest moves to victory
- Fine the smallest number of moves to checkmate in a game of chess
- Write a spell checker (fewest edits from your misspelling to a real word—for example, READED -> READER is one edit)
- Find the doctor closest to you in your network
1. Shortest route between two places
To figure out how to get from a place (Ex: Twin Peaks) to another place (Ex: the Golden Gate Bridge), there are two steps:
-
Model the problem as a graph.
Graphs are a way to model how diferent things are connected to one another (they don’t involve an X or Y axis).
-
Solve the problem using breadth-irst search.
Breadth-first search is a kind of search algorithm: one that runs on graphs. Breadth-first search can help answer two types of questions:
- Question type 1: Is there a path from node A to node B?
- Question type 2: What is the shortest path from node A to node B?

The shortest route from Twin Peaks to the Golden Gate Bridge:

2. Graph
2.1 What is a graph?
Graphs are a way to model how diferent things are connected to one another (they don’t involve an X or Y axis).
A graph models a set of connections. For example, suppose you and your friends are playing poker, and you want to model who owes whom money. Here’s how you could say, “Alex owes Rama money.”

Full graph of people who owe other people poker money:

Graphs are made up of nodes and edges.

A node can be directly connected to many other nodes. Those nodes are called its neighbors. In this graph, Rama is Alex’s neighbor. Adit isn’t Alex’s neighbor, because they aren’t directly connected. But Adit is Rama’s and Tom’s neighbor.
2.2 Implementing the graph
Suppose you’re the proud owner of a mango farm. You’re looking for a mango seller who can sell your mangoes. You not only search your friends, but you search their friends, too. Remember, the goal is to ind one mango seller in your network. So if one of your friend isn’t a mango seller, you add his/her friends to the list, too. That means you’ll eventually search his/her friends—and then their friends, and so on. With this algorithm, you’ll search your entire network until you come across a mango seller. This algorithm is breadth-irst search.

graph = {}
graph["You"] = ["Alice", "Bob", "Claire"]
graph["Bob"] = ["Anuj", "Peggy"]
graph["Alice"] = ["Peggy"]
graph["Claire"] = ["Thom", "Jonny"]
graph["Anuj"] = []
graph["Peggy"] = []
graph["Thom"] = []
graph["Jonny"] = []
2.3 Directed Graph and Undirected Graph
Anuj, Peggy, Thom, and Jonny don’t have any neighbors. They have arrows pointing to them, but no arrows from them to someone else. This is called a directed graph—the relationship is only one way. So Anuj is Bob’s neighbor, but Bob isn’t Anuj’s neighbor. An undirected graph doesn’t have any arrows, and both nodes are each other’s neighbors. For example, both of these graphs are equal.

- A directed graph has arrows, and the relationship follows the direction of the arrow (Rama -> Adit means “Rama owes Adit money”).
- Undirected graphs don’t have arrows, and the relationship goes both ways (Ross - Rachel means “Ross dated Rachel and Rachel dated Ross”).
2.4 Degree
Another way to see this is, irst-degree connections are added to the search list before second-degree connections:

What happens if you search Anuj before Claire, and they’re both mango sellers? Well, Anuj is a second-degree contact, and Claire is a first-degree contact. You end up with a mango seller who isn’t the closest to you in your network. So you need to search people in the order that they’re added. There’s a data structure for this: it’s called a queue.
3. Queues
A queue works exactly like it does in real life. Suppose you and your friend are queueing up at the bus stop. If you’re before him in the queue, you get on the bus first. A queue works the same way. Queues are similar to stacks. You can’t access random elements in the queue. Instead, there are two only operations, enqueue and dequeue.
- Enqueue: add an item to the queue
- Dequeue: tak an item off the queue

The queue is called a FIFO data structure: First In, First Out. In contrast, a stack is a LIFO data structure: Last In, First Out.

4. Implementing the algorithm
4.1 How will the implementation work?

push
is almost always the same thing as enqueue
, and pop
is almost always the same thing as dequeue
.
Make a queue to start:
# Use the Python double-ended queue (deque) function
from collections import deque
# Creates a new queue
search_queue = deque()
# Adds all of your neighbors to the search queue
# graph[“You”] will give you a list of all your neighbors, like [“Alice”, “Bob”, “Claire”]
search_queue += graph["You"]
Loop to find the mango seller:
# While the queue isn’t empty
while search_queue:
# Grabs the first person off the queue
person = search_queue.popleft()
# Checks whether the person is a mango seller
if person_is_seller(person):
# Yes, they’re a mango seller.
print(person + " is a mango seller!")
return True
else:
# No, they aren’t. Add all of this
search_queue += graph[person]
# If you reached here, no one in the queue was a mango seller.
return False
Create a function to tell you when someone is a mango seller:
def person_is_seller(name):
""" Tell you when someone is a mango seller.
This function checks whether the person’s name ends with
the letter m. If it does, they’re a mango seller.
Kind of a silly way to do it, but it’ll do for this example. """
return name[-1] == 'm'
The breadth-first search in action:
And so on. The algorithm will keep going until either
- A mango seller is found, or
- The queue becomes empty, in which case there is no mango seller.
4.2 Avoid ininite loop
Alice and Bob share a friend: Peggy. So Peggy will be added to the queue twice.
If you check her twice, you’re doing unnecessary, extra work. So once you search a person, you should mark that person as searched and not search them again.
If you don’t do this, you could also end up in an ininite loop. Suppose the mango seller graph looked like this:

Before checking a person, it’s important to make sure they haven’t been checked already. To do that, you’ll keep a list of people you’ve already checked.
def search(name):
search_queue = deque()
search_queue += graph[name]
# This array (list) is how you keep track of which people you’ve searched before.
# searched = []
# Sets are significantly faster than lists if you want to check if an item is contained within it.
searched = set()
while search_queue:
person = search_queue.popleft()
# Only search this person if you haven’t already searched them.
if person not in searched:
if person_is_seller(person):
print(person + " is a mango seller!")
return True
else:
search_queue += graph[person]
# Marks this person as searched
# searched.append(person) # list
searched.add(person) # set
return False
rearch("you")
4.3 All the code
# Use the Python double-ended queue (deque) function
from collections import deque
graph = {}
graph["You"] = ["Alice", "Bob", "Claire"]
graph["Bob"] = ["Anuj", "Peggy"]
graph["Alice"] = ["Peggy"]
graph["Claire"] = ["Thom", "Jonny"]
graph["Anuj"] = []
graph["Peggy"] = []
graph["Thom"] = []
graph["Jonny"] = []
def person_is_seller(name):
""" Tell you when someone is a mango seller.
This function checks whether the person’s name ends with
the letter m. If it does, they’re a mango seller.
Kind of a silly way to do it, but it’ll do for this example. """
return name[-1] == 'm'
def search(name):
# Creates a new queue
search_queue = deque()
# Adds all of a person's neighbors to the search queue
# Ex: graph[“You”] will give you a list of all your neighbors, like [“Alice”, “Bob”, “Claire”]
search_queue += graph[name]
# This set is how you keep track of which people you’ve searched before.
searched = set()
# While the queue isn’t empty
while search_queue:
# Grabs the first person off the queue
person = search_queue.popleft()
# Only search this person if you haven’t already searched them.
if person not in searched:
# Checks whether the person is a mango seller
if person_is_seller(person):
# Yes, they’re a mango seller.
print(person + " is a mango seller!")
return True
else:
# No, they aren’t. Add all of this
search_queue += graph[person]
# Marks this person as searched
searched.add(person)
# If you reached here, no one in the queue was a mango seller.
return False
search("You")
5. Running time
If you search your entire network for a mango seller, that means you’ll follow each edge (remember, an edge is the arrow or connection from one person to another). So the running time is at least O
(number of edges).
You also keep a queue of every person to search. Adding one person to the queue takes constant time: O(1)
. Doing this for every person will take O
(number of people) total. Breadth-irst search takes O
(number of people + number of edges), and it’s more commonly written as O(V+E)
(V for number of vertices, E for number of edges).