Graphs

Graphs are everywhere around us: computer networks, road networks, social networks, electronic circuits, molecular structure etc.

Terminology:

$G = (V, E)$

$V \Rightarrow \text{Vertices}$

$E \Rightarrow \text{Edges} \subseteq V \times V$

Edges can be directed or undirected. Say we have two vertices $v$ and $w$ . Possible Edges could be $e = (w, v)$ or $e' = (v, w)$ . In case of undirected edges, $e = \{v, w\}$ . We will use the set notation where order does not matter.

In case of undirected graph, number of edges:

|E| \le \binom{|V|}{2} = \mathcal{O}(|V|^2)

If we have directed graph, there will be factor of 2, still $\mathcal{O}(|V|^2)$ .

Graph representation

How can we store a graph in the computer? Few of the options are:

Edge list
Adjacency list
Adjacency matrix

Adjacency matrix: say we have a graph $G = (V, E)$ with $n$ vertices and $m$ edges. We can represent $G$ using and $n \times n$ matrix $M$ , where element $M[i, j] = 1$ if $(i, j)$ is an edge of $G$ , and $0$ if it is not an edge. This representation allows faster look up, and updates for edge insertion and deletion. However, it may use excessive space for sparse graphs.

Adjacency list: We can more efficiently store sparse graphs by using linked lists to store neighbors of each vertex.

Sparse vs. dense: usually sparse graphs number of edges $|E| \approx \mathcal{O}(|V|)$ , while for dense graphs $|E| \approx \mathcal{O}(|V|^2)$ .

In real world applications, most graphs we encounter are sparse, and therefore adjacency list is the primary data structure to store such graphs.

Path

$p = (v_1, v_2, \dots, v_k)$ where $(v_i, v_{i+1}) \in E~\forall~i \in \{1, 2, \dots, k-1\}$ .

$l(p) \Rightarrow \text{length}$

$\delta(p) \Rightarrow \text {length of shortest path}$

Graph traversal

In a graph search, we are provided with the graph and a starting/ source vertex. Our goal is to find everything findable, i.e., all the vertex by walking along various edges. Our algorithm should be efficient, meaning we should not find every edges or vertices too many times. The complexity of search should be $\mathcal{O}(m + n)$ , where $n$ is the number of vertices and $m$ is the number of edges in the graph.

Breadth-first search (BFS)

Depending on the order of traversal, there are two class of traversal algorithms: breadth first search and depth first search. During search all vertices goes from undiscovered to discovered to explored. Once we discover a new vertex, we can store them either in a queue or stack for exploration (where we walk along all the edges and further discover more vertices). If we store the vertices in a queue (i.e., first-in first-out), we explore the oldest unexplored vertex first. In this way, our exploration slowly radiate out from the starting vertex, defining breadth-first search. Here we explore vertices in layers. Source vertex is layer 0. All the neighbors of source vertex is layer 1, and next neighbors are layer 2 and so on. Easy to compute shortest path.

Depth-first search (DFS)

On the other hand, if we store just discovered vertices using a stack (i.e., last-in first-out), we explore the new neighboring vertices first if one is available. In this case, exploration quickly wander away from the starting vertex, defining depth-first search. Once of nice advantages here is that we do not need stack container explicitly, we can use recursion instead.

Graph representation​

Path​

Graph traversal​

Breadth-first search (BFS)​

Depth-first search (DFS)​

Resources​

Graph representation

Path

Graph traversal

Breadth-first search (BFS)

Depth-first search (DFS)

Resources