Graphs
Graphs are everywhere around us: computer networks, road networks, social networks, electronic circuits, molecular structure etc.
Terminology:
Edges can be directed or undirected. Say we have two vertices and . Possible Edges could be or . In case of undirected edges, . We will use the set notation where order does not matter.
In case of undirected graph, number of edges:
If we have directed graph, there will be factor of 2, still .
Graph representation
How can we store a graph in the computer? Few of the options are:
- Edge list
- Adjacency list
- Adjacency matrix
Adjacency matrix: say we have a graph with vertices and edges. We can represent using and matrix , where element if is an edge of , and if it is not an edge. This representation allows faster look up, and updates for edge insertion and deletion. However, it may use excessive space for sparse graphs.
Adjacency list: We can more efficiently store sparse graphs by using linked lists to store neighbors of each vertex.
Sparse vs. dense: usually sparse graphs number of edges , while for dense graphs .
In real world applications, most graphs we encounter are sparse, and therefore adjacency list is the primary data structure to store such graphs.
Path
where .
Graph traversal
In a graph search, we are provided with the graph and a starting/ source vertex. Our goal is to find everything findable, i.e., all the vertex by walking along various edges. Our algorithm should be efficient, meaning we should not find every edges or vertices too many times. The complexity of search should be , where is the number of vertices and is the number of edges in the graph.
Breadth-first search (BFS)
Depending on the order of traversal, there are two class of traversal algorithms: breadth first search and depth first search. During search all vertices goes from undiscovered to discovered to explored. Once we discover a new vertex, we can store them either in a queue or stack for exploration (where we walk along all the edges and further discover more vertices). If we store the vertices in a queue (i.e., first-in first-out), we explore the oldest unexplored vertex first. In this way, our exploration slowly radiate out from the starting vertex, defining breadth-first search. Here we explore vertices in layers. Source vertex is layer 0. All the neighbors of source vertex is layer 1, and next neighbors are layer 2 and so on. Easy to compute shortest path.
Depth-first search (DFS)
On the other hand, if we store just discovered vertices using a stack (i.e., last-in first-out), we explore the new neighboring vertices first if one is available. In this case, exploration quickly wander away from the starting vertex, defining depth-first search. Once of nice advantages here is that we do not need stack container explicitly, we can use recursion instead.
Resources
- MIT OCW Lecture
- The Algorithm Design Manual by Steven S. Skiena
- Graph Theory videos by William Fiset
- Coursera: Graph Search, Shortest Paths, and Data Structures by Tim Roughgarden
- Boost Graph Library
- Dijkstra Algorithm
- A* Algorithm
- Route Planning in Large Transportation Networks
- Graph analysis using NetworkX in Python (Example Jupyter Notebook)