Introduction to computer algorithms: Breadth First Search, Depth First Search and Dijkstra’s algorithm

Alexandra McCarroll
4 min readJun 7, 2018

As a History major, who has just completed a four month web development bootcamp — albeit intensive, but also short — my theoretical computer science knowledge is still in its infancy. So, when tasked with a problem that from the outset appeared to be more data science-y than anything I had encountered before, I knew I needed to spend a few days down the rabbit hole to figure out exactly what theory was required to come up with a solution…

What I found was both incredibly interesting and also made glaringly obvious the gaps in my theoretical knowledge.

Lets start by explaining the problem. I was given a dataset of over 70,000 lines — probably very undaunting to the experienced data scientist but to me this felt like when I opened page one of War and Peace (and to this day I have never finished that book). The task was to manipulate this information, which consisted of nodes on a map, and bidirectional edges with distances between them, to tell the user the shortest distance between point A and point B.

My initial thought was that I would just need to search the dataset to find matching nodes on the same line. So a simple line by line search method. There, I’d cracked it. Job done. Time to submit! I was so wrong on so many levels.

Smugly, I asked a data scientist what he thought of my solution. “Yeah, no, you need to use an algorithm. Try Breadth First Search.” And, like that, I’d jumped right into the rabbit hole

What I didn’t realise before is that the bidirectional edges do not list every relationship between each node, instead only the relationships between a node and maybe two or three other nodes. (n.b. If it did list every relationship between each node, there would be over 14 million bidirectional edges listed). Below is a visual representation of 5 nodes and 5 edges. In this example, if every node were to have a relationship with every other node, there would be 20 edges… I was dealing with over 37,000 nodes.

Visual representation of relationships between nodes

Enter the world of computer algorithms. First I explored Breadth First Search (BFS) — along with Depth First Search (DFS), which is commonly used as a comparison — and found out algorithms are actually pretty cool!

In short, BFS is an algorithm used to search ‘tree’ or ‘graph’ data structures. It begins at the root or a given start node, and explores all of the neighbour nodes at the level below the root of the tree before moving on to the nodes at the next depth level. This happens over and over again until it reaches the end of the graph, or the given end node.

BFS: Courtesy of Wikipedia

On the other hand, depth first search uses the opposite strategy which instead explores the highest-depth nodes first before being forced to backtrack and search shallower nodes.

DFS: Courtesy of Wikipedia

However, BFS didn’t account for varied distances between nodes, which my dataset had, so the search continued to find the perfect algorithm.

Enter Dijkstra’s Algorithm… Dijkstra’s Algorithm allows you to use a weighed graph thus, you can assign values other than 1 for each bidirectional edge. Therefore, the distances (in metres) between nodes could be calculated. This algorithm then gives you the shortest path from your source to every node in the traversed graph. Fantastic! I felt like I found the cure to all the world’s ailments (not really, but I did feel like Archimedes — Eureka!)

This was great, I could finally manipulate 10 lines of data…

Stay tuned for part 2: attempting to use this algorithm to manipulate over 70,000 lines of data…

Further Reading (because they explain it so much better!):

--

--