Introduction to Trie Data Structure5 min read

Trie is a data structure that is not so popular but is particularly useful when you have to work a lot with strings. It also has different names like a radix tree or a prefix tree, this data structure is a k-ary search tree and is used for locating specific keys within a set. It comes in handy in many string-related scenarios like autocomplete, longest prefix match, spell checking, and more.

There are different ways to implement a trie, but generally, for each node, we store a key – which is mostly a string character, the references to its child nodes, and also a boolean value indicating whether this current node marks a word.

Depiction of a trie. Single empty circle, representing the root node, points to three children. The arrow to each child is marked by a different letter. The children themselves have similar set of arrows and child nodes, with nodes that correspond to full words bearing blue integer values.
Image courtesy: Wikipedia

As observed in the image above, each child node with the same parent will share a common prefix, and the root node is associated with an empty value.

Now we have a basic idea of the trie data structure and its applications, let’s create our first simple trie implementation and add some operations such as insert, exist and delete.

public class Trie {
    private final char rootCharacter = '\0';
    private Node root = new Node(rootCharacter);

    public static final class Node {
        char ch;
        boolean isWordEnding;
        Map<Character, Node> children;
        
        public Node(char ch) {
             this.ch = ch;
             this.children = new HashMap<>();
        }
        
        public void addChild(char ch, Node child) {
            this.children.put(ch, child);
        }
    }   
}

Here we have the Node class which represents each node in the trie, as we can see, we don’t store the string itself to each node but simply a character, we have a map for each node to contain its children, and also a boolean value indicating the current node is actually the end of the word. Notice that we have created our root node associating with a null value.

public boolean insert(String key) {
        Node node = root;
        boolean createdNewNode = false;
        boolean isPrefix = false;
        for(final var c : key.toCharArray()) {
            Node nextNode = node.children.get(c);
            if (nextNode == null) {
                nextNode = new Node(c);
                createdNewNode = true;
                node.addChild(c, nextNode);
            } else {
                if (nextNode.isWord) {
                    isPrefix = true;
                }
            }
            node = nextNode;
        }
        if (node != root) {
            node.isWordEnding = true;
        }
        return isPrefix || !createdNewNode; 
}

We have the insert method which returns the boolean value signifies whether the key already exists in the trie or not. Inside the method, for each iteration, we move through each character of the key and check them with the current node children, if the parent node doesn’t have any node associated with the current key character, we create a new child for it. In contrast, if the node already has a child associated with the current key, it simply checks whether the nextNode is actually the end of the word. In the end, the node now has a reference containing the last inserted character, finally, we return true if the key we want to insert is a prefix that already exists in the trie or a new node hasn’t been created.

Once having the insert method, we can now check the existence of a key by creating the exist method:

public boolean exist(String key) {
        if (key == null) throw new IllegalArgumentException("Key cannot be null");
        Node curNode = root;
        for(final var c : key.toCharArray()) {
            Node child = curNode.children.get(c);
            if (child == null) return false;
            curNode = child;
        }
        return true;
 }

Next, we have the exist method which helps us know whether we have inserted a key into the trie or not. The detail of it is pretty simple. The method only returns true if, for each key character, they must be an associated node with it in the right order. If along the way, there is a mismatch, we don’t waste our effort to dig deeper since 2 strings cannot be identical if their prefixes are different.

To delete a key, we first check whether the key is present in our trie:

public boolean delete(String key) {
        if (key == null) throw new IllegalArgumentException("Key cannot be null");
        if (!exist(key)) return false;
        Node node = root;
        for(final var c : key.toCharArray()) {
            Node curNode = node.children.remove(c);
            node = curNode;
        }
        if (!node.children.isEmpty()) {
            node.children = null;
            node = null;
        }
        return true;
}

If the key isn’t there in the trie, we return false immediately before the loop. Otherwise, for each key character, we remove its existence from the current parent. Be aware that the remove method will return the previously associated value for a key. In this case, for each iteration, the curNode will hold the reference to the removed child node, notice that every node value with the same prefix is also get removed. Finally, after the loop, if the node still has any children, meaning there is still some suffix of the deleted key, we remove all these dangling children since their parents already get removed.

To finish the implementation for our trie, we now add some 2 clear methods, one clears the node, and another clears the entire trie:

private void clear(Node node) {
    if (node == null) return;
    for(final var entry : node.children.entrySet()) {
        Node nextNode = entry.getValue();
        clear(nextNode);
        nextNode = null;
    }
    node.children.clear();
    node.children = null;
}

public void clear() {
    root = null;
    root = new Node('\0');
}

In the former method, we recursively clear all the children of the current node, in the end, once all child references have been set to null, we erase all entries from the children’s map and assign it to null to finish our attempt to clear all children for the input node. The latter simply assigns the root value to null to clear the entire trie.

We are almost done with this, let’s fiddle with our newly created trie a little bit:

Trie trie = new Trie();
trie.insert("hello world");
trie.insert("the lazy dog");
trie.exist("the"); // true
trie.delete("the lazy");
trie.exist("the lazy"); // false
trie.exist("hello"); // true
trie.exist("world"); // false
trie.exist("swamp"); // false
trie.insert("blue sky");
trie.exist("blue"); // true;
trie.clear();
trie.exist("blue"); // false

Previous Article
Next Article
Every support is much appreciated ❤️

Buy Me a Coffee