Skip to main content

Graphs

Graphs are everywhere around us: computer networks, road networks, social networks, electronic circuits, molecular structure etc.

Terminology:

G=(V,E)G = (V, E)

VVerticesV \Rightarrow \text{Vertices}

EEdgesV×VE \Rightarrow \text{Edges} \subseteq V \times V

Edges can be directed or undirected. Say we have two vertices vv and ww. Possible Edges could be e=(w,v)e = (w, v) or e=(v,w)e' = (v, w). In case of undirected edges, e={v,w}e = \{v, w\}. We will use the set notation where order does not matter.

In case of undirected graph, number of edges:

E(V2)=O(V2)|E| \le \binom{|V|}{2} = \mathcal{O}(|V|^2)

If we have directed graph, there will be factor of 2, still O(V2)\mathcal{O}(|V|^2).

Graph representation

How can we store a graph in the computer? Few of the options are:

  • Edge list
  • Adjacency list
  • Adjacency matrix

Adjacency matrix: say we have a graph G=(V,E)G = (V, E) with nn vertices and mm edges. We can represent GG using and n×nn \times n matrix MM, where element M[i,j]=1M[i, j] = 1 if (i,j)(i, j) is an edge of GG, and 00 if it is not an edge. This representation allows faster look up, and updates for edge insertion and deletion. However, it may use excessive space for sparse graphs.

Adjacency list: We can more efficiently store sparse graphs by using linked lists to store neighbors of each vertex.

Sparse vs. dense: usually sparse graphs number of edges EO(V)|E| \approx \mathcal{O}(|V|), while for dense graphs EO(V2)|E| \approx \mathcal{O}(|V|^2).

In real world applications, most graphs we encounter are sparse, and therefore adjacency list is the primary data structure to store such graphs.

Path

p=(v1,v2,,vk)p = (v_1, v_2, \dots, v_k) where (vi,vi+1)E  i{1,2,,k1}(v_i, v_{i+1}) \in E~\forall~i \in \{1, 2, \dots, k-1\}.

l(p)lengthl(p) \Rightarrow \text{length}

δ(p)length of shortest path\delta(p) \Rightarrow \text {length of shortest path}

Graph traversal

In a graph search, we are provided with the graph and a starting/ source vertex. Our goal is to find everything findable, i.e., all the vertex by walking along various edges. Our algorithm should be efficient, meaning we should not find every edges or vertices too many times. The complexity of search should be O(m+n)\mathcal{O}(m + n), where nn is the number of vertices and mm is the number of edges in the graph.

Breadth-first search (BFS)

Depending on the order of traversal, there are two class of traversal algorithms: breadth first search and depth first search. During search all vertices goes from undiscovered to discovered to explored. Once we discover a new vertex, we can store them either in a queue or stack for exploration (where we walk along all the edges and further discover more vertices). If we store the vertices in a queue (i.e., first-in first-out), we explore the oldest unexplored vertex first. In this way, our exploration slowly radiate out from the starting vertex, defining breadth-first search. Here we explore vertices in layers. Source vertex is layer 0. All the neighbors of source vertex is layer 1, and next neighbors are layer 2 and so on. Easy to compute shortest path.

Depth-first search (DFS)

On the other hand, if we store just discovered vertices using a stack (i.e., last-in first-out), we explore the new neighboring vertices first if one is available. In this case, exploration quickly wander away from the starting vertex, defining depth-first search. Once of nice advantages here is that we do not need stack container explicitly, we can use recursion instead.

Resources