LeetCode Daily Challenge: LeetCode 1948 Delete Duplicate Folders in System - Two Solutions (Hard, Java)

Anni Huang
6 min read

Problem Understanding

The goal is to delete all duplicate folder subtrees. Two folders (not necessarily at the same level) are duplicates if they contain the same non-empty set of identically named subfolders with the same underlying structure; the folder's own name is irrelevant. All duplicates are marked first and then deleted, together with their subfolders, in one pass, so a folder that only becomes unique after a deeper duplicate disappears is still removed. For example, if /a and /b each contain a single empty subfolder x, then /a, /a/x, /b, and /b/x are all deleted. The input lists every folder, including all ancestors, which Solution 2 below relies on when it walks to a parent node.

Solution 1: Standard Trie + DFS Approach

import java.util.*;

class Solution {
    // TC: roughly O(M + N) trie work; building the signature strings adds cost
    //     proportional to the total length of all signatures, which can exceed
    //     M + N on deep trees
    // SC: O(M + N) for the trie plus the stored signatures
    // M = total length of all folder names
    // N = number of nodes in the trie
    // Build trie - O(M)
    // Generate signatures - one visit per node, plus string concatenation
    // Collect valid paths - O(N)
    static class Node {
        String name;
        TreeMap<String, Node> children;
        String signature;

        public Node(String name) {
            this.name = name;
            this.children = new TreeMap<>();
        }
    }

    public List<List<String>> deleteDuplicateFolder(List<List<String>> paths) {
        Node root = new Node("");
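        // Insert every path into the trie, creating missing nodes on demand.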
        for (List<String> path : paths) {
            Node curr = root;
            for (String folder : path) {
                curr.children.putIfAbsent(folder, new Node(folder));
                curr = curr.children.get(folder);
            }
        }

        Map<String, Integer> countMap = new HashMap<>();
        dfs(root, countMap);

        List<List<String>> result = new ArrayList<>();
        List<String> currentPath = new ArrayList<>();
        for (Node child : root.children.values()) {
            dfs2(child, currentPath, result, countMap);
        }
        return result;
    }

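    // Post-order DFS: build each node's signature from its children's signatures
    // (children are visited in sorted order thanks to TreeMap, so identical
    // subtrees always serialize identically) and count each non-leaf signature.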
    private void dfs(Node node, Map<String, Integer> countMap) {
        if (node.children.isEmpty()) {
            node.signature = "";
            return;
        }
        StringBuilder sb = new StringBuilder();
        for (Node child : node.children.values()) {
            dfs(child, countMap);
            sb.append(child.name).append('(').append(child.signature).append(')');
        }
        node.signature = sb.toString();
        countMap.put(node.signature, countMap.getOrDefault(node.signature, 0) + 1);
    }

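    // Second DFS: rebuild the surviving paths, skipping any non-leaf node whose
    // signature occurs at least twice (that whole subtree is deleted).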
    private void dfs2(Node node, List<String> currentPath, List<List<String>> result, Map<String, Integer> countMap) {
        if (!node.children.isEmpty() && countMap.getOrDefault(node.signature, 0) >= 2) {
            return;
        }
        currentPath.add(node.name);
        result.add(new ArrayList<>(currentPath));
        for (Node child : node.children.values()) {
            dfs2(child, currentPath, result, countMap);
        }
        currentPath.remove(currentPath.size() - 1);
    }
}
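
A minimal driver to sanity-check the class above. The input is my own illustrative example, not an official sample: /a and /b each contain an empty folder x, so both subtrees are deleted and only /c survives. It assumes the class above is compiled as Solution in the same package.

import java.util.*;

public class Solution1Demo {
    public static void main(String[] args) {
        // Outer list is mutable so the same driver also works for Solution 2,
        // which sorts the input in place.
        List<List<String>> paths = new ArrayList<>(Arrays.asList(
                Arrays.asList("a"),
                Arrays.asList("a", "x"),
                Arrays.asList("b"),
                Arrays.asList("b", "x"),
                Arrays.asList("c")
        ));
        // Expected output (order does not matter): [[c]]
        System.out.println(new Solution().deleteDuplicateFolder(paths));
    }
}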

Approach:

  1. Build a complete trie from all paths
  2. Generate signatures for all nodes via DFS (a standalone sketch of this serialization follows this list)
  3. Collect valid paths by traversing and skipping duplicates
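
To make the encoding in step 2 concrete, here is a small standalone sketch of the same serialization, using its own toy Folder type rather than the solution class: children are emitted in sorted order as name(childSignature), and an empty folder serializes to the empty string.

import java.util.*;

public class SignatureDemo {
    // Toy folder node: just a sorted map of child name -> child folder.
    static class Folder {
        TreeMap<String, Folder> children = new TreeMap<>();
    }

    // Same encoding as dfs() above: concatenate name(childSignature) for each
    // child in sorted order; an empty folder serializes to "".
    static String signature(Folder f) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Folder> e : f.children.entrySet()) {
            sb.append(e.getKey()).append('(').append(signature(e.getValue())).append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // parent contains an empty folder b and a folder c that contains d.
        Folder parent = new Folder(), b = new Folder(), c = new Folder(), d = new Folder();
        c.children.put("d", d);
        parent.children.put("b", b);
        parent.children.put("c", c);
        System.out.println(signature(parent)); // prints b()c(d())
    }
}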

Solution 2: Optimized Bottom-Up Approach

import java.util.*;

class Solution {
    class Node {
        Map<String, Node> subNodes = new TreeMap<>();
        String content = "";
        boolean remove = false;

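        // Mark this folder and everything beneath it for deletion (idempotent,
        // so overlapping duplicate subtrees are only walked once).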
        void markRemove() {
            if (remove) {
                return;
            }
            remove = true;
            if (subNodes != null) {
                for (Node value : subNodes.values()) {
                    value.markRemove();
                }
            }
        }
    }

    public List<List<String>> deleteDuplicateFolder(List<List<String>> paths) {
        paths.sort(Comparator.comparingInt(List::size));
        List<Node> nodes = new ArrayList<>(paths.size());
        Node rootNode = new Node();

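        // Build phase: paths are sorted by length, so every ancestor folder
        // already has a node; walk to the parent and attach one node per path.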
        for (List<String> pathList : paths) {
            Node current = rootNode;
            int last = pathList.size() - 1;
            for (int i = 0; i < last; i++) {
                String s = pathList.get(i);
                current = current.subNodes.get(s);
            }
            String name = pathList.get(last);
            Node node = new Node();
            current.subNodes.put(name, node);
            nodes.add(node);
        }

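        // Signature phase: iterate in reverse (deepest folders first), so each
        // child's content string is already computed before its parent needs it.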
        StringBuilder content = new StringBuilder();
        Map<String, Node> nodeByContent = new HashMap<>();
        for (int i = nodes.size() - 1; i >= 0; i--) {
            Node node = nodes.get(i);
            if (node.subNodes.isEmpty()) {
                continue;
            }
            for (Map.Entry<String, Node> entry : node.subNodes.entrySet()) {
                content.append(entry.getKey()).append('{').append(entry.getValue().content).append('}');
            }
            node.content = content.toString();
            content.delete(0, content.length());

            Node similar = nodeByContent.putIfAbsent(node.content, node);
            if (similar != null) {
                node.markRemove();
                similar.markRemove();
            }
        }

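        // Answer phase: nodes.get(i) is the folder at the end of paths.get(i)
        // (both follow the sorted order), so just filter out the marked ones.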
        List<List<String>> ans = new ArrayList<>();
        for (int i = 0; i < paths.size(); i++) {
            if (!nodes.get(i).remove) {
                ans.add(paths.get(i));
            }
        }
        return ans;
    }
}

Approach:

  1. Sort paths by length (required so parent nodes exist before their children are inserted)
  2. Build exactly one node per path - each node starts as a leaf and gains children as longer paths arrive
  3. Process signatures bottom-up by iterating the nodes in reverse order
  4. Mark duplicates (and their entire subtrees) immediately when found

Detailed Comparison

Aspect              | Solution 1 (Standard)                 | Solution 2 (Optimized)
Trie Building       | Complete trie upfront                 | Incremental, one node per path
Processing Order    | Top-down DFS                          | Bottom-up (reverse path order)
Memory Usage        | Higher (named nodes, path rebuilding) | Lower (leaner nodes, reuses input paths)
Duplicate Detection | Count-based after a full scan         | Immediate marking
Path Sorting        | Not required                          | Required - sorts by length
Code Complexity     | More intuitive                        | More optimized but harder to follow

Why Solution 2's Optimizations Work

1. Path Length Sorting

paths.sort(Comparator.comparingInt(List::size));

Why this matters: The sort does two jobs. During the build, it guarantees that a folder's parent node already exists before the folder itself is inserted (otherwise the walk to the parent would fail). During the reverse iteration, longer (deeper) paths sit at the end of the list and are processed first, so parents are handled after their children, which is exactly what bottom-up signature generation needs.

2. Incremental Node Building

// One new node per path; it starts as a leaf and gains children as longer paths arrive
Node node = new Node();
current.subNodes.put(name, node);
nodes.add(node);

Benefit: Every input path maps to exactly one node in the parallel nodes list, so the answer can later be produced by filtering the original paths. No second DFS is needed to rebuild paths, and nodes never store their own names.

3. Reverse Processing Order

for (int i = nodes.size() - 1; i >= 0; i--) {

Effect: Processes deepest paths first, building signatures bottom-up naturally.

4. Immediate Duplicate Marking

Node similar = nodeByContent.putIfAbsent(node.content, node);
if (similar != null) {
    node.markRemove();
    similar.markRemove();
}

Advantage: Marks duplicates immediately upon detection, propagating removal down the subtree.
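
The check hinges on Map.putIfAbsent returning the value already stored for the key (or null if there was none). A tiny standalone illustration, separate from the solution:

import java.util.*;

public class PutIfAbsentDemo {
    public static void main(String[] args) {
        Map<String, String> seen = new HashMap<>();
        // First occurrence of a signature: nothing stored yet, so null comes back.
        System.out.println(seen.putIfAbsent("x{y{}}", "first"));   // null
        // Second occurrence: the stored value is returned - a duplicate is detected.
        System.out.println(seen.putIfAbsent("x{y{}}", "second"));  // first
    }
}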


Key Algorithmic Differences

Solution 1: Two-Phase Processing

  1. Phase 1: Build complete trie structure
  2. Phase 2: Generate all signatures via DFS
  3. Phase 3: Collect results while skipping duplicates

Solution 2: Single-Pass Optimization

  1. Pre-processing: Sort paths by length
  2. Building: Create only necessary nodes
  3. Processing: Bottom-up signature generation with immediate duplicate detection

Complexity Comparison

      | Solution 1 | Solution 2
Time  | O(M + N)   | O(M + N + P log P)*
Space | O(M + N)   | O(M + N)

*P = number of paths; the extra P log P term comes from sorting. Both bounds treat signature strings as cheap to build; strictly, the work is proportional to the combined length of all signatures, which can exceed M + N on deep trees.


Trade-offs

Solution 1 Advantages:

  • More intuitive - classic trie + DFS approach
  • Easier to understand and debug
  • Standard pattern - follows common trie algorithms
  • No sorting required - works with any input order

Solution 2 Advantages:

  • More memory efficient - builds minimal intermediate structure
  • Immediate duplicate detection - subtrees are marked as soon as a signature repeats, with no separate counting pass
  • Potentially faster in practice due to less overhead
  • Clever optimizations - demonstrates advanced algorithmic thinking

Solution 2 Disadvantages:

  • More complex logic - harder to understand bottom-up processing
  • Requires sorting - additional O(P log P) complexity
  • Less intuitive - the reverse processing is not obvious

Example Walkthrough

For input: [["a"], ["a","x"], ["a","x","y"], ["b"], ["b","x"], ["b","x","y"]] (the ancestor folders ["a"] and ["b"] are included so the input satisfies the problem's guarantee)

Solution 1 Process:

  1. Builds complete trie with all nodes
  2. Generates signatures: x nodes get "y()", a/b nodes get "x(y())"
  3. Skips nodes with count ≥ 2 during collection
  4. Result: [] (all removed)

Solution 2 Process:

  1. Sorts by length: [["a"], ["b"], ["a","x"], ["b","x"], ["a","x","y"], ["b","x","y"]]
  2. Creates one node per path, walking to its already-built parent
  3. Processes the nodes in reverse: the two x subtrees collide first and are marked, then a and b collide and are marked
  4. Result: [] (all removed; see the driver after this list)
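
A minimal driver for this walkthrough, assuming either class above is compiled as Solution (the outer list is mutable because Solution 2 sorts it in place):

import java.util.*;

public class WalkthroughDemo {
    public static void main(String[] args) {
        List<List<String>> paths = new ArrayList<>(Arrays.asList(
                Arrays.asList("a"),
                Arrays.asList("a", "x"),
                Arrays.asList("a", "x", "y"),
                Arrays.asList("b"),
                Arrays.asList("b", "x"),
                Arrays.asList("b", "x", "y")
        ));
        // Both solutions print [] - every folder belongs to a duplicate subtree.
        System.out.println(new Solution().deleteDuplicateFolder(paths));
    }
}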

Conclusion

Solution 1 is the cleaner, more maintainable approach that follows standard algorithmic patterns. It's what most people would write in an interview.

Solution 2 is a highly optimized version that shows deep understanding of the problem structure. It's more suitable for production where memory efficiency matters.

For interviews: I'd recommend Solution 1 for clarity and explanation. For competitive programming: Solution 2 might have slight performance advantages.

Both solutions are correct and have essentially the same asymptotic complexity (Solution 2 only adds the P log P sorting term), but Solution 2 demonstrates more advanced optimization techniques!
