Tries

←Back to Tech Tree

inventorycoverage

Tries #

Data StructuresDifficulty: ★★★☆☆Depth: 3Unlocks: 0

Prefix trees for string storage. O(k) operations for length k strings.

Interactive Visualization #

⏮◀◀▶▶STEP0.25x1xZOOM

t=0s

Core Concepts #

Key Symbols & Notation #

child_map (character -> child node)

Essential Relationships #

Prerequisites (2) #

Trees5 atomsHash Tables6 atoms

Advanced Learning Details

Graph Position #

38

Depth Cost

0

Fan-Out (ROI)

0

Bottleneck Score

3

Chain Length

Cognitive Load #

5

Atomic Elements

31

Total Elements

L1

Percentile Level

L3

Atomic Level

All Concepts (12) #

Teaching Strategy #

Self-serve tutorial - low prerequisites, straightforward concepts.

When you type “aut…” and your editor suggests “auto”, “autocomplete”, and “automatic”, it’s doing something very trie-like: it’s exploiting the fact that many strings share prefixes, and it wants to find “everything that starts with this prefix” quickly.

TL;DR:

A trie (prefix tree) stores strings by sharing common prefixes along root→node paths. Each edge is labeled by one character, and nodes can be marked as “terminal” to mean “a complete key ends here”. Insert/search/delete take O(k) time for a key of length k (independent of how many total keys you stored), plus some constant factors from how you represent children (array vs map).

What Is a Trie? #

Why tries exist (motivation) #

Hash tables are great when you want to answer: “Is this exact key present?” in about O(1) average time. But they’re not naturally built for questions like:

You can do these with hashing, but you typically need extra indexing, sorting, or scanning.

A trie is designed around prefix structure. It stores many strings such that shared prefixes are represented once, then branch when the next character differs.

Definition (prefix tree) #

A trie is a rooted tree where:

  1. 1)Edges are labeled by characters.
  2. 2)Each node corresponds to the prefix formed by concatenating edge labels along the path from the root.
  3. 3)A node can be marked as terminal (and/or store a value) to indicate that the prefix at that node is a complete stored key.

If the root represents the empty prefix "", then following edges labeled 'c' → 'a' → 't' leads you to a node representing the prefix "cat".

The atomic concepts in one picture (mentally) #

This last point is crucial: if you store "car" and "cart", then the node for "car" is terminal, while it also has a child edge 't'.

Complexity intuition: why O(k)? #

Operations on tries are usually measured in terms of the string length.

Let k be the length of a key. Search/insert/delete walk one character at a time. That’s k steps.

So:

This is not “always faster” than hash tables—constant factors matter—but it’s predictably proportional to key length, and it enables efficient prefix queries.

Data model: child_map #

Each node needs a way to get from a character to the child node.

We’ll refer to this as:

Typical representations:

RepresentationLookup for next charMemoryNotes
Fixed array (e.g., 26 lowercase)O(1)HighFast, but wastes space if sparse
Hash map / dictionaryO(1) averageMediumFlexible alphabets, typical choice
Balanced BST mapO(log Σ)MediumDeterministic, Σ = alphabet size

The trie’s headline O(k) assumes child lookup is O(1) (array or hash map). If you use a tree-map, you get O(k log Σ).

What tries store (set vs map) #

A trie can store:

Both variants are common; the mechanics are the same.

Core Mechanic 1: Path = Prefix (and Why Terminal Markers Matter) #

Why this mechanic is the heart of a trie #

Tries are powerful because they turn string comparison into shared traversal.

In a sorted list, to compare two strings you scan until they differ. In a trie, you physically share that scanned prefix: once a prefix exists as a path, every key with that prefix reuses it.

Building the idea slowly #

Suppose we insert these keys:

Start at the root (""), then:

Notice how "to" and "tea" share the node("t").

Terminal marker: distinguishing key vs prefix #

A trie node represents a prefix, but not every prefix is a stored key.

Example: insert "tea" only.

Now search for "te": you can walk 't' then 'e' and arrive at node("te"). If you didn’t have a terminal marker, you might incorrectly conclude "te" is present. Terminal markers make the answer precise.

Formally:

How this enables prefix queries #

Let p be a prefix. If you can traverse p (character by character) to reach node(p), then:

This is the key property that hash tables don’t naturally provide.

Counting and aggregation (a common extension) #

Often, nodes store extra metadata to answer queries efficiently:

Then:

This is not required for basic tries, but it’s a frequent real-world upgrade.

Time complexity: stating it carefully #

Let k = |s| be the number of characters.

Traversal visits exactly k edges, doing a child_map lookup at each step.

So the work is:

∑ (per-character child lookup)

If child lookup is O(1) average:

O(k)

If child lookup is O(log Σ):

O(k log Σ)

Where Σ is alphabet size (e.g., 26 lowercase, 128 ASCII, or full Unicode codepoints).

Memory: the trade you’re making #

Tries often use more memory than a hash table for the same set of strings, because you store nodes and child maps.

But they compress shared prefixes. If your dataset has lots of common prefixes (URLs, file paths, dictionary words, identifiers), a trie can be surprisingly memory-competitive and faster for prefix operations.

Core Mechanic 2: Insert, Search, and Delete via child_map #

Why focus on mechanics #

A trie is simple conceptually, but correctness depends on a few details:

We’ll describe operations for a trie that stores a set of strings with isTerminal.


Search(s) #

Intuition #

To search s, follow the unique path dictated by its characters. If at any character you can’t follow the edge, the string isn’t stored.

Algorithm #

  1. 1)Start at root.
  2. 2)For each character c in s:
  1. 3)After consuming all characters: return node.isTerminal

Complexity #


Insert(s) #

Intuition #

Insertion is just “create missing edges along the path, then mark the endpoint terminal.” If the path already exists, you only update the terminal marker.

Algorithm #

  1. 1)Start at root.
  2. 2)For each character c in s:
  1. 3)Set node.isTerminal = true

Complexity #


Delete(s) #

Why deletion is trickier #

If you delete "inn" but keep "in", you must not remove the shared prefix nodes. You only remove nodes that become unreachable from any terminal key.

A standard approach:

  1. 1)Traverse to the end node.
  2. 2)If it’s not terminal, nothing to delete.
  3. 3)Unmark terminal.
  4. 4)Walk back up the path and remove nodes that now have:

This is basically garbage-collecting dead suffix nodes.

A careful delete strategy (stack of nodes) #

During traversal, store the path:

Then after unmarking terminal, pop upward:

Complexity #

So delete is O(k) time.


Subtle but important: representing characters #

If your alphabet is ASCII letters, an array of size 26 or 52 might be great.

If your keys are arbitrary Unicode strings, a sparse map is safer.

This choice affects constants and memory but not the core idea.


Comparing with hash tables (when would you choose a trie?) #

TaskHash tableTrie
Exact membershipO(1) avgO(k)
Prefix queryNot naturalNatural (traverse prefix then DFS)
Lexicographic iterationRequires sortingNatural (DFS in character order)
MemoryOften lowerOften higher, but shares prefixes
Worst-case guaranteesDepends on hashDeterministic O(k) given child lookup

A useful mindset: a trie is an indexing structure for strings where structure matters (prefixes), not just equality.

Application/Connection: Autocomplete, Prefix Counts, and Variants #

Autocomplete (classic application) #

Autocomplete is the canonical trie use-case:

  1. 1)Traverse the user-typed prefix p to node(p).
  2. 2)Enumerate keys in the subtree.

To enumerate keys:

If you need “top-k suggestions,” you often store extra info at nodes (e.g., most frequent completions, or a heap of top terms).

Longest prefix match (routing, tokenization) #

Given a query string q, you may want the longest prefix of q that exists as a key.

Traverse characters until you can’t continue, tracking the last terminal node you saw.

This shows up in:

Prefix counts and analytics #

If each node tracks prefixCount (how many inserted strings pass through), then you can answer:

Update rule during insert:

Delete similarly decrements.

Variants you should know exist (even if you don’t implement them yet) #

VariantIdeaWhy it matters
Radix tree / compact trieCompress chains of single-child nodes into multi-character edgesSaves memory, can be faster
Ternary search treeEach node stores a character and 3 pointers (less/equal/greater)Memory-friendly alternative
Bitwise trieAlphabet is {0,1} for integers / IP prefixesVery fast longest-prefix operations

Tradeoffs and practical guidance #

A note on “O(k) independent of n” #

A key selling point is that lookup doesn’t scale with the number of stored strings n the same way many other structures do.

That said, enumeration from a prefix node can output many results; the time necessarily scales with how many keys you return.

Worked Examples (3) #

Example 1: Insert and Search with Terminal Markers #

Build a trie for the set {"to", "tea", "ten"}. Then determine whether "te" and "ten" are contained.

  1. Start with an empty root node representing prefix "".

    Initialize root.child_map = {} and root.isTerminal = false.

  2. Insert "to":

    • •At root, see 't' not in child_map → create node("t").
    • •Move to node("t"), see 'o' not in child_map → create node("to").
    • •Mark node("to").isTerminal = true.
  3. Insert "tea":

    • •At root, 't' exists → move to node("t").
    • •At node("t"), 'e' missing → create node("te").
    • •Move to node("te"), 'a' missing → create node("tea").
    • •Mark node("tea").isTerminal = true.
  4. Insert "ten":

    • •root 't' exists → node("t").
    • •node("t") 'e' exists → node("te").
    • •node("te") 'n' missing → create node("ten").
    • •Mark node("ten").isTerminal = true.
  5. Search "te":

    • •Traverse 't' then 'e' successfully to node("te").
    • •Check node("te").isTerminal.
    • •It is false, because only "tea" and "ten" were inserted.

    Therefore "te" ∉ set.

  6. Search "ten":

    • •Traverse 't' → 'e' → 'n' to node("ten").
    • •node("ten").isTerminal = true.

    Therefore "ten" ∈ set.

Insight: Reaching a node is not enough; the terminal marker is what distinguishes a stored key from a prefix that merely exists because longer keys share it.

Example 2: Delete Without Breaking Shared Prefixes #

Store {"in", "inn", "into"}. Delete "inn" and show what remains.

  1. Insert "in": create path root —'i'→ node("i") —'n'→ node("in"). Mark node("in") terminal.

  2. Insert "inn": traverse to node("in"), then add 'n' to create node("inn"). Mark node("inn") terminal.

  3. Insert "into": traverse root-'i'-'n' to node("in"); then add 't'→node("int"), 'o'→node("into"). Mark node("into") terminal.

    Now node("in") is terminal and has children 'n' and 't'.

  4. Delete "inn":

    Traversal path: (root,'i'), (node("i"),'n'), (node("in"),'n') leads to node("inn").

    node("inn").isTerminal is true, so proceed.

  5. Unmark node("inn").isTerminal = false.

    Now decide cleanup:

    • •node("inn") has 0 children and is not terminal → remove it from node("in").child_map['n'].
  6. Stop cleanup at node("in") because node("in") is terminal (represents "in") and also still has child 't' for "into".

    So we must keep node("in") and its ancestors.

  7. Resulting stored keys are {"in", "into"}.

    The shared prefix nodes remain intact.

Insight: Deletion is “unmark terminal, then prune dead leaves upward until you reach a node that is still needed (terminal or has other children).”

Example 3: Prefix Query by Subtree Enumeration #

Given keys {"car", "card", "care", "cat", "dog"}, list all keys with prefix "car".

  1. Traverse prefix "car":

    root —'c'→ node("c") —'a'→ node("ca") —'r'→ node("car").

    If any edge were missing, the answer set would be empty.

  2. At node("car"), check terminal:

    node("car").isTerminal = true, so include "car" in results.

  3. Enumerate subtree from node("car"):

    • •Follow edge 'd' to node("card"); node("card") is terminal → include "card".
    • •Follow edge 'e' to node("care"); node("care") is terminal → include "care".
  4. Stop when DFS finishes.

    Collected results: ["car", "card", "care"].

Insight: Prefix search is a two-phase operation: O(|p|) to reach node(p), then output-sensitive traversal of its subtree.

Key Takeaways #

Common Mistakes #

Practice #

easy

You insert the keys {"a", "ab", "abc", "b"} into a trie. Which nodes must be terminal? List the terminal prefixes.

Hint: A node is terminal exactly when some inserted key ends at that node, even if it has children.

Show solution

Terminal prefixes are: "a", "ab", "abc", and "b". Note that "a" and "ab" are terminal even though they have descendants.

medium

Design a trie node structure for lowercase English words. Compare using (1) an array of length 26 and (2) a hash map for child_map. Give one scenario where each choice is better.

Hint: Think about sparsity: how many children does a typical node have, and how big is the alphabet?

Show solution

Array(26): child lookup is true O(1) with very small constant; best when alphabet is fixed and small and performance is critical (e.g., many lookups, dense branching). Hash map: saves memory when most nodes have few children; best when the trie is sparse or alphabet can vary (e.g., mixed-case, punctuation, Unicode).

hard

Implement (in pseudocode) delete(s) for a trie storing a set of strings with isTerminal and child_map. Your delete should not remove nodes needed by other strings.

Hint: Record the path while descending (stack of (parent, char)), then prune upward while nodes are non-terminal leaves.

Show solution

One correct pseudocode approach:

function delete(s):

node = root

stack = [] // pairs (parentNode, char)

for c in s:

if c not in node.child_map: return false

stack.push((node, c))

node = node.child_map[c]

if node.isTerminal == false: return false

node.isTerminal = false

// prune upward

while stack not empty:

(parent, c) = stack.pop()

child = parent.child_map[c]

if child.isTerminal == false and child.child_map is empty:

remove parent.child_map[c]

else:

break

return true

Connections #

Quality: A (4.5/5)

← back to treebrowse all →