Finding orders using Topological Sort - Learn To Code Together

Topological sort is a fundamental algorithm to order elements in a directed graph; in this context, elements are often denoted as nodes, and the edges between them represent dependencies. In simple terms, when creating a sequence of linear ordering of tasks, if task B has task A as a dependency, then A will come before B in the sequence; this ensures that all dependencies will be satisfied before the current task is executed.

This algorithm comes in handy on many occasions; some use cases are scheduling, task management, etc…For example, when enrolling in a course (let’s presume its name is DSA) at a university, if this course has prerequisites, then in order to study this course, you first have to take all of its prerequisites. In this case, if we use topological sort to find the topological ordering of courses being taken, then, as expected, the DSA course will come after its prerequisites.

The topological ordering is guaranteed to be found if the graph is a directed acyclic graph (DAG), meaning the graph contains no cycle (there is no closed loop in the graph). With that in mind, to find the topological ordering of any DAG, there are some algorithms that we can use, such as Kahn’s algorithm or depth-first search traversal; in this article, we’re going to learn about the former.

Let’s say we want to generate a list of DDL commands to create a list of tables in a database, and some of the tables that we want to create contain some foreign keys (dependencies) to other tables. The relationships of these tables are simple enough that there are no circular dependencies, and of course, there are some tables that have no dependency at all. We will find out the topological order for all the tables and create them in the correct order; first, we will create an interface that labels the node as well as its dependencies:

public interface Ordering {
  String getNode();
  List<String> getDependencies();
}

Next, we will create the TableOrdering implement this interface containing the table name and a list of tables it depends on:

class TableOrdering implements Ordering {
    private final String tableName;
    private final List<String> foreignKeyTables;

    public TableOrdering(String tableName, List<String> foreignKeyTables) {
        this.tableName = tableName;
        this.foreignKeyTables = foreignKeyTables;
    }

    @Override
    public String getNode() {
        return this.tableName;
    }

    @Override
    public List<String> getDependencies() {
        return this.foreignKeyTables;
    }
}

Now, we are ready to represent a directed graph containing two properties, inDegrees which labels the number of dependencies of a given node, and adjList which represents a node with its neighbors:

class Graph {
    private final Map<String, Integer> inDegrees = new HashMap<>();
    private final Map<String, Set<String>> adjList = new HashMap<>();

    public void addEdge(String from, String to) {
        adjList.computeIfAbsent(from, k -> new HashSet<>()).add(to);
        inDegrees.merge(to, 1, Integer::sum);
    }
    

    public Integer getInDegree(String node) {
        return inDegrees.get(node);
    }

    public Set<String> getNeighbors(String node) {
        return adjList.get(node);
    }
}

For the addEdge method, we label the directed connection between the node from and to, meaning to depends on from. Afterward, we will create a helper function that transforms a list of TableOrdering into a graph:

public static Graph createGraph(List<TableOrdering> tableOrderings) {
    Graph graph = new Graph();
    for(TableOrdering tableOrdering : tableOrderings) {
       graph.setIndegree(tableOrdering.getNode(), 0); 
       graph.setAdj(tableOrdering.getNode(), new HashSet<>()); 
    }
    for(TableOrdering tableOrdering : tableOrderings) {
       String node = tableOrdering.getNode();
       List<String> dependencies = tableOrdering.getDependencies();
       for(String dependency : dependencies) {
          graph.addEdge(dependency, node);
       }
    }
    return graph;
}

Now, we’re ready to use Kahn’s algorithm to find the topological order for the tables that we want to create; here is the pseudo-code for it, which is simple to code up:

L ← Empty list that will contain the sorted elements
S ← Set of all nodes with no incoming edge

while S is not empty do
    remove a node n from S
    add n to L
    for each node m with an edge e from n to m do
        remove edge e from the graph
        if m has no other incoming edges then
            insert m into S

if graph has edges then
    return error   (graph has at least one cycle)
else 
    return L   (a topologically sorted order)

First, we create a list that contains the nodes sorted in the topological order and a set of nodes with no incoming edges (here, in our case, it will be the table has no references to other tables), in the while loop, we keep removing a node from the S set and add it to the result list L in each pass and remove an edge between that current node with its neighbors and if after this operation any neighbor has no incoming edge will be added to the set S. Once the while loop stops, if there are still edges between any two nodes, it will raise an error since the graph contains at least one cycle, otherwise it succeeds and return a list of nodes sorted in the topological order, and keep in mind there can be multiple topological orders of a given graph:

public static List<TableOrdering> findLinearOrdering(List<TableOrdering> tableOrderings) {
    Graph graph = createGraph(tableOrderings);
    List<TableOrdering> results = new ArrayList<>();
    Map<String, TableOrdering> tableNameToTable = mapTableByName(tableOrderings);
    Queue<TableOrdering> nodesWithNoIncomingEdges = getNodesWithNoIncomingEdges(graph, tableNameToTable);
    while (!nodesWithNoIncomingEdges.isEmpty()) {
        TableOrdering node = nodesWithNoIncomingEdges.poll();
        results.add(node);
        Set<String> neighbors = graph.getNeighbors(node.getNode());
        for (String neighbor : neighbors) {
            int indegree = graph.getInDegree(neighbor);
            int newIndegree = indegree - 1;
            graph.setIndegree(neighbor, newIndegree);
            if (newIndegree == 0) {
                TableOrdering table = tableNameToTable.get(neighbor);
                nodesWithNoIncomingEdges.add(table);
            }
        }
    }
    if (graph.containsAnyEdge()) {
        throw new IllegalArgumentException("Graph contains at least one cycle");
    } else {
        return results;
    }
}

To verify it is correct, we create some tests, and first, we test it with a graph with no cycle:

@Test
void testGraphWithNoCycle() {
    TableOrdering tableA = TableOrdering.of("a", List.of()); // table names and its dependencies
    TableOrdering tableB = TableOrdering.of("b", List.of("a"));
    TableOrdering tableC = TableOrdering.of("c", List.of());
    TableOrdering tableD = TableOrdering.of("d", List.of());
    TableOrdering tableE = TableOrdering.of("e", List.of("c", "d"));
    TableOrdering tableF = TableOrdering.of("f", List.of());
    TableOrdering tableG = TableOrdering.of("g", List.of("f"));
    List<TableOrdering> tableOrderings = List.of(tableA, tableB, tableC, tableD, tableE, tableF, tableG);
    List<TableOrdering> newOrder = ToposortUtils.findLinearOrdering(tableOrderings);
    Assertions.assertTrue(indexOf("a", newOrder) < indexOf("b", newOrder));
    Assertions.assertTrue(indexOf("c", newOrder) < indexOf("e", newOrder));
}

Finally, the algorithm will throw an exception when there is a cycle in the graph:

@Test
void testGraphWithCycle() {
    TableOrdering tableA = TableOrdering.of("a", List.of("b"));
    TableOrdering tableB = TableOrdering.of("b", List.of("c"));
    TableOrdering tableC = TableOrdering.of("c", List.of("d"));
    TableOrdering tableD = TableOrdering.of("d", List.of("a"));
    TableOrdering tableE = TableOrdering.of("e", List.of());
    List<TableOrdering> tableOrderings = List.of(tableA, tableB, tableC, tableD, tableE);
    Assertions.assertThrows(IllegalArgumentException.class,
            () -> ToposortUtils.findLinearOrdering(tableOrderings));
}

Facebook0 Tweet0

Finding orders using Topological Sort6 min read