Graph Traversal

←Back to Tech Tree

inventorycoverage

Graph Traversal #

Graph TheoryDifficulty: ★★☆☆☆Depth: 3Unlocks: 8

BFS and DFS. Exploring all reachable nodes systematically.

Interactive Visualization #

⏮◀◀▶▶STEP0.25x1xZOOM

t=0s

Core Concepts #

Key Symbols & Notation #

visited[v] (boolean marker indicating node v has been discovered)

Essential Relationships #

Prerequisites (2) #

Graph Representations6 atomsLinked Lists6 atoms

Unlocks (5) #

Topological Sortlvl 3Shortest Pathslvl 3Minimum Spanning Treeslvl 3Strongly Connected Componentslvl 3Graph Coloringlvl 4

Advanced Learning Details

Graph Position #

33

Depth Cost

8

Fan-Out (ROI)

6

Bottleneck Score

3

Chain Length

Cognitive Load #

5

Atomic Elements

41

Total Elements

L3

Percentile Level

L3

Atomic Level

All Concepts (17) #

Teaching Strategy #

Self-serve tutorial - low prerequisites, straightforward concepts.

Graphs show up whenever “things connect to other things”: friends in a social network, pages linked on the web, roads between cities, function calls in code. Graph traversal is the basic skill that lets you explore what’s reachable from a starting point—systematically, without getting stuck in cycles, and with predictable order.

TL;DR:

Graph traversal means visiting every vertex reachable from a start vertex using a frontier (data structure of “discovered but not fully processed” vertices) plus a visited[v] marker to avoid revisits. BFS uses a FIFO queue (good for unweighted shortest paths in edges); DFS uses a LIFO stack/recursion (good for exploring deep structure and enabling algorithms like topological sort and SCCs).

Prerequisites (stated up front) + important edge cases #

Before you learn BFS/DFS, make sure these are solid:

  1. Graph representations (you already know this)
  1. Queue/Stack basics (quick refresh)
  1. What “unweighted shortest path” means
  1. Edge cases you must handle (important for correctness)

One key theme across everything you’re about to do: traversal is not “visit each vertex once by magic.” It’s a careful combination of:

What Is Graph Traversal? #

A graph traversal is a procedure that explores all vertices reachable from a given start vertex s, in a systematic order.

Why do we need a traversal at all? #

In a tree, you can “just” walk children pointers and never worry about coming back around. In a general graph, you can have:

Without a rule to avoid revisiting, you can loop forever.

The three atomic concepts #

1) Reachability #

Given a graph G = (V, E) and start vertex s, traversal explores:

If there is no path from s to v, traversal from s will never visit v.

2) Frontier / extraction policy #

Traversal maintains a frontier: vertices that have been discovered but not fully processed.

This single design choice drastically changes the order of exploration and the properties you can prove.

3) Visited marking #

We maintain a boolean marker:

This prevents:

A subtle but crucial detail is when you set visited[v]. The standard and safest approach is:

If you delay marking until extraction, then parallel edges (or converging paths) can insert the same vertex many times, bloating runtime.

Traversal skeleton (shared idea) #

Here’s the shared structure both BFS and DFS follow:

  1. Initialize all visited[v] = false

  2. Create an empty frontier

  3. Add s to frontier, set visited[s] = true

  4. While frontier not empty:

The only difference is the frontier policy (queue vs stack).

Core Mechanic 1: BFS (Breadth-First Search) with a FIFO Queue #

Why BFS exists (motivation) #

Suppose you’re in a social network graph and want to find people closest to you:

You want to explore in “layers” of increasing hop count. That is exactly what BFS does.

Key idea: FIFO creates layers #

BFS uses a queue, so vertices discovered earlier are processed earlier.

That enforces a level-by-level expansion from the start vertex s.

We often track a distance array (optional but common):

Initialize:

BFS pseudocode (adjacency list) #

Assume the graph is stored as adjacency lists Adj[u].

BFS(G, s):
  for each vertex v in V:
    visited[v] = false
    dist[v] = +∞
    parent[v] = NIL

  visited[s] = true
  dist[s] = 0
  Q = empty queue
  enqueue(Q, s)

  while Q not empty:
    u = dequeue(Q)
    for each v in Adj[u]:
      if visited[v] == false:
        visited[v] = true
        dist[v] = dist[u] + 1
        parent[v] = u
        enqueue(Q, v)

What BFS guarantees (and why) #

Claim (unweighted shortest paths): When BFS first discovers a vertex v, dist[v] is the length (in edges) of the shortest path from s to v.

Intuition:

A useful view is “BFS tree”:

Complexity #

Let n = |V| and m = |E|.

Edge cases revisited under BFS #

Core Mechanic 2: DFS (Depth-First Search) with a LIFO Stack (or Recursion) #

Why DFS exists (motivation) #

Sometimes you don’t care about shortest hop count. You care about structure:

DFS explores “as deep as possible” before backtracking.

Two equivalent implementations #

  1. Recursive DFS (uses the call stack implicitly)

  2. Iterative DFS (uses an explicit stack)

They behave similarly, but iterative DFS avoids recursion depth limits.

Recursive DFS pseudocode #

DFS(G, s):
  for each vertex v in V:
    visited[v] = false
    parent[v] = NIL

  dfsVisit(s)

  dfsVisit(u):
    visited[u] = true
    for each v in Adj[u]:
      if visited[v] == false:
        parent[v] = u
        dfsVisit(v)

Iterative DFS (explicit stack) pseudocode #

To match the “mark when discovered” rule:

DFS_iter(G, s):
  for each vertex v in V:
    visited[v] = false
    parent[v] = NIL

  S = empty stack
  visited[s] = true
  push(S, s)

  while S not empty:
    u = pop(S)
    for each v in Adj[u]:
      if visited[v] == false:
        visited[v] = true
        parent[v] = u
        push(S, v)

Note: This iterative version’s exact visitation order depends on neighbor ordering in Adj[u]. That’s normal.

What DFS guarantees (and what it doesn’t) #

DFS guarantees:

DFS does not guarantee shortest paths.

Example intuition:

Complexity #

Same as BFS:

Edge cases under DFS #

Identical handling as BFS with visited marking:

A small but important note: “discovered” vs “finished” #

Some DFS-based algorithms (topological sort, SCC) track timestamps:

You don’t need timestamps to traverse, but it’s helpful to know DFS has a natural notion of “finishing” a node after exploring its descendants.

Application / Connection: Choosing BFS vs DFS, and what traversal enables #

BFS vs DFS: how to choose #

They solve different kinds of problems. The frontier policy is the whole story.

GoalPreferWhy
Explore in increasing number of edges from sBFSFIFO creates distance layers
Find shortest paths in an unweighted graphBFSFirst discovery gives shortest dist[v]
Detect cycles / reason about backtracking structureDFSNatural recursion stack / finish times
Topological sort (DAG ordering)DFSUses finish times
Strongly connected componentsDFSKey building block (Kosaraju/Tarjan)
Just check reachability s → tEitherBoth are O(n+m); DFS may find a path quickly

Traversal as a “subroutine” #

Many bigger algorithms start with “run BFS/DFS from …”

Traversing the entire graph (graph may be disconnected) #

So far we assumed a start node s. If you want to visit every vertex in the graph:

for each vertex v in V:
  if visited[v] == false:
    BFS(G, v)   // or DFS(G, v)

This produces a forest of BFS/DFS trees—one per connected component (undirected) or per reachable region (directed).

One more practical detail: adjacency list + duplicates #

Because adjacency lists can contain repeated neighbors (parallel edges), you should always structure the neighbor scan as:

This keeps runtime tight: each vertex enters the frontier once.

Worked Examples (3) #

Worked Example 1: BFS layers + shortest path reconstruction #

Graph (undirected) with vertices {A,B,C,D,E,F}. Edges: A-B, A-C, B-D, C-D, C-E, D-F, E-F. Start s = A. Use BFS to compute dist[·] and parent[·], then reconstruct a shortest path from A to F.

  1. Initialize:

    visited[]=false, dist[]=∞, parent[*]=NIL

    visited[A]=true, dist[A]=0

    Q=[A]

  2. Dequeue A. Neighbors: B, C.

    B unvisited ⇒ visited[B]=true, dist[B]=1, parent[B]=A, enqueue B

    C unvisited ⇒ visited[C]=true, dist[C]=1, parent[C]=A, enqueue C

    Now Q=[B,C]

  3. Dequeue B. Neighbors: A, D.

    A already visited ⇒ ignore

    D unvisited ⇒ visited[D]=true, dist[D]=2, parent[D]=B, enqueue D

    Now Q=[C,D]

  4. Dequeue C. Neighbors: A, D, E.

    A visited ⇒ ignore

    D is already visited (discovered via B) ⇒ ignore (important!)

    E unvisited ⇒ visited[E]=true, dist[E]=2, parent[E]=C, enqueue E

    Now Q=[D,E]

  5. Dequeue D. Neighbors: B, C, F.

    B visited ⇒ ignore

    C visited ⇒ ignore

    F unvisited ⇒ visited[F]=true, dist[F]=3, parent[F]=D, enqueue F

    Now Q=[E,F]

  6. Dequeue E. Neighbors: C, F.

    C visited ⇒ ignore

    F already visited ⇒ ignore

    Q=[F]

  7. Dequeue F. Neighbors: D, E (both visited). Done. Q=[]

  8. Distances found:

    dist[A]=0

    dist[B]=1, dist[C]=1

    dist[D]=2, dist[E]=2

    dist[F]=3

  9. Reconstruct shortest path A→F using parent pointers:

    F has parent D

    D has parent B

    B has parent A

    So path (reverse) is F←D←B←A ⇒ A→B→D→F

    This path has 3 edges, matching dist[F]=3.

Insight: The moment F is first discovered, BFS has already guaranteed it used the fewest edges possible. Notice how the edge C-D did not change dist[D] after D was discovered; visited marking at discovery time prevents duplicates and preserves the first (shortest) discovery.

Worked Example 2: DFS exploration order and why it’s not shortest #

Directed graph with vertices {S, A, B, C, T}. Edges: S→A, S→T, A→B, B→C, C→T. Start at S. Run DFS (recursive) scanning neighbors in alphabetical order. Compare the path found to the shortest path in edges.

  1. Adjacency lists (alphabetical):

    Adj[S]=[A,T]

    Adj[A]=[B]

    Adj[B]=[C]

    Adj[C]=[T]

    Adj[T]=[]

  2. Start dfsVisit(S):

    visited[S]=true

  3. From S, first neighbor is A (unvisited):

    parent[A]=S

    call dfsVisit(A)

  4. At A:

    visited[A]=true

    neighbor B unvisited ⇒ parent[B]=A, dfsVisit(B)

  5. At B:

    visited[B]=true

    neighbor C unvisited ⇒ parent[C]=B, dfsVisit(C)

  6. At C:

    visited[C]=true

    neighbor T unvisited ⇒ parent[T]=C, dfsVisit(T)

  7. At T:

    visited[T]=true

    Adj[T] empty ⇒ return

  8. Unwind returns back to C, B, A, S.

    Back at S, next neighbor is T, but visited[T]=true so it is skipped.

Insight: DFS found a path S→A→B→C→T (4 edges), even though there is a direct edge S→T (1 edge). DFS is about deep exploration, not shortest paths. Neighbor order can strongly influence what DFS finds first.

Worked Example 3: Handling self-loops and parallel edges with visited[v] #

Directed graph with vertices {1,2,3}. Edges: 1→1 (self-loop), 1→2 (two parallel edges), 2→3. Start BFS at 1. Show the queue evolution and explain why it terminates cleanly.

  1. Adj[1]=[1,2,2]

    Adj[2]=[3]

    Adj[3]=[]

    Initialize visited[*]=false

    visited[1]=true, Q=[1]

  2. Dequeue 1. Scan neighbors:

    • •v=1 (self-loop): visited[1] already true ⇒ do nothing
    • •v=2: unvisited ⇒ visited[2]=true, enqueue 2
    • •v=2 again (parallel edge): visited[2] already true ⇒ do nothing

    Now Q=[2]

  3. Dequeue 2. Neighbor v=3 unvisited ⇒ visited[3]=true, enqueue 3

    Q=[3]

  4. Dequeue 3. No neighbors. Q=[] stop.

Insight: Marking visited at discovery time ensures each vertex enters the frontier at most once, even if adjacency lists contain duplicates (parallel edges) or cycles (including self-loops).

Key Takeaways #

Common Mistakes #

Practice #

easy

Run BFS on the undirected graph with vertices {0,1,2,3,4} and edges: 0-1, 0-2, 1-3, 2-3, 3-4. Start at 0. Compute dist[·] and one shortest path from 0 to 4 using parent pointers.

Hint: Initialize dist[0]=0. When you discover a neighbor v from u, set dist[v]=dist[u]+1 and parent[v]=u.

Show solution

BFS layers:

Start: dist[0]=0

Discover 1 and 2 ⇒ dist[1]=1 parent[1]=0; dist[2]=1 parent[2]=0

Process 1 ⇒ discover 3 ⇒ dist[3]=2 parent[3]=1

Process 2 ⇒ 3 already visited

Process 3 ⇒ discover 4 ⇒ dist[4]=3 parent[4]=3

Shortest path 0→4: follow parents 4←3←1←0 ⇒ 0→1→3→4 (length 3).

medium

Give a small directed graph where DFS from S visits T, but the DFS parent path from S to T is longer (more edges) than the shortest path. Explain briefly why BFS would avoid this.

Hint: Include both a direct edge S→T and a longer chain S→A→B→…→T, and order neighbors so DFS explores the chain first.

Show solution

Example: vertices {S,A,B,T}. Edges: S→A, A→B, B→T, and S→T. If Adj[S]=[A,T], DFS explores S→A→B→T (3 edges) and marks T visited before returning to consider S→T. BFS would discover T from S immediately with dist[T]=1, which is the unweighted shortest path.

medium

You want to traverse an entire graph that may be disconnected. Describe (in pseudocode-level English) how to use BFS or DFS to ensure every vertex is visited exactly once, even with self-loops and parallel edges.

Hint: Use an outer loop over all vertices; start a new traversal whenever you find an unvisited vertex; mark visited on discovery.

Show solution

Set visited[v]=false for all v. For each vertex v in V: if visited[v]==false, start BFS/DFS from v. In the traversal, when scanning neighbors u of a popped vertex, only enqueue/push u if visited[u]==false, and immediately set visited[u]=true. This prevents repeats from cycles, self-loops, or parallel edges and guarantees each vertex is added to the frontier at most once.

Connections #

Related refreshers:

Quality: A (4.5/5)

← back to treebrowse all →