LeetCode Daily Challenge: LeetCode 1948 Delete Duplicate Folders in System - Two Solutions (Hard, Java)


Problem Understanding
The goal is to delete every duplicate folder subtree. Two folders (not necessarily at the same level) are duplicates if they contain the same non-empty set of identically named subfolders with identical underlying structure; all marked folders and their subfolders are removed in a single pass, and the remaining paths are returned in any order.
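A quick illustration (my own example, using a hypothetical Example class; Solution refers to either implementation below): folders /a and /c are duplicates because each contains exactly one empty subfolder named b, so /a, /a/b, /c, and /c/b are all deleted, while /d and /d/a survive.

```java
import java.util.*;

public class Example {
    public static void main(String[] args) {
        // /a and /c both contain a single empty subfolder "b", so they are duplicates.
        List<List<String>> paths = new ArrayList<>(Arrays.asList(
                Arrays.asList("a"),
                Arrays.asList("c"),
                Arrays.asList("d"),
                Arrays.asList("a", "b"),
                Arrays.asList("c", "b"),
                Arrays.asList("d", "a")
        ));
        // Expected output (any order is accepted): [[d], [d, a]]
        System.out.println(new Solution().deleteDuplicateFolder(paths));
    }
}
```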
Solution 1: Standard Trie + DFS Approach
import java.util.*;

class Solution {
    // TC: O(M + N)
    // SC: O(M + N)
    // M = total length of all folder names
    // N = number of nodes in the trie
    //   Build trie          - O(M)
    //   Generate signatures - O(N)
    //   Collect valid paths - O(N)
    static class Node {
        String name;
        TreeMap<String, Node> children;   // TreeMap keeps children sorted, so signatures are canonical
        String signature;

        public Node(String name) {
            this.name = name;
            this.children = new TreeMap<>();
        }
    }

    public List<List<String>> deleteDuplicateFolder(List<List<String>> paths) {
        // Phase 1: build the complete trie from all paths.
        Node root = new Node("");
        for (List<String> path : paths) {
            Node curr = root;
            for (String folder : path) {
                curr.children.putIfAbsent(folder, new Node(folder));
                curr = curr.children.get(folder);
            }
        }

        // Phase 2: compute subtree signatures and count how often each one occurs.
        Map<String, Integer> countMap = new HashMap<>();
        dfs(root, countMap);

        // Phase 3: collect every path whose ancestors are all non-duplicate.
        List<List<String>> result = new ArrayList<>();
        List<String> currentPath = new ArrayList<>();
        for (Node child : root.children.values()) {
            dfs2(child, currentPath, result, countMap);
        }
        return result;
    }

    // Post-order DFS: a node's signature is the concatenation of childName(childSignature).
    private void dfs(Node node, Map<String, Integer> countMap) {
        if (node.children.isEmpty()) {
            node.signature = "";   // empty folders are never counted as duplicates
            return;
        }
        StringBuilder sb = new StringBuilder();
        for (Node child : node.children.values()) {
            dfs(child, countMap);
            sb.append(child.name).append('(').append(child.signature).append(')');
        }
        node.signature = sb.toString();
        countMap.put(node.signature, countMap.getOrDefault(node.signature, 0) + 1);
    }

    // Pre-order DFS that skips any subtree whose signature occurs two or more times.
    private void dfs2(Node node, List<String> currentPath, List<List<String>> result, Map<String, Integer> countMap) {
        if (!node.children.isEmpty() && countMap.getOrDefault(node.signature, 0) >= 2) {
            return;
        }
        currentPath.add(node.name);
        result.add(new ArrayList<>(currentPath));
        for (Node child : node.children.values()) {
            dfs2(child, currentPath, result, countMap);
        }
        currentPath.remove(currentPath.size() - 1);
    }
}
Approach:
- Build a complete trie from all paths
- Generate signatures for all nodes via DFS (see the signature-format sketch after this list)
- Collect valid paths by traversing and skipping duplicates
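To make the signature format concrete, here is a minimal sketch (a hypothetical SignatureDemo class, not part of the solution) of the rule dfs() applies: sorted children are serialized as childName(childSignature), so identical subtrees always produce identical strings.

```java
import java.util.*;

// Hypothetical demo class illustrating Solution 1's signature format.
class SignatureDemo {
    static class Folder {
        TreeMap<String, Folder> children = new TreeMap<>();
    }

    // Same rule as dfs() above: serialize sorted children as name(childSignature).
    static String sig(Folder f) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Folder> e : f.children.entrySet()) {
            sb.append(e.getKey()).append('(').append(sig(e.getValue())).append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Folder root = new Folder();
        Folder b = new Folder();
        b.children.put("c", new Folder());
        root.children.put("a", new Folder());
        root.children.put("b", b);
        // Prints "a()b(c())": the canonical signature of {a: {}, b: {c: {}}}
        System.out.println(sig(root));
    }
}
```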
Solution 2: Optimized Bottom-Up Approach
import java.util.*;

class Solution {
    class Node {
        Map<String, Node> subNodes = new TreeMap<>();  // TreeMap keeps children sorted, so content strings are canonical
        String content = "";
        boolean remove = false;

        void markRemove() {
            if (remove) {
                return;
            }
            remove = true;
            // Propagate removal to the entire subtree.
            if (subNodes != null) {
                for (Node value : subNodes.values()) {
                    value.markRemove();
                }
            }
        }
    }

    public List<List<String>> deleteDuplicateFolder(List<List<String>> paths) {
        // Sort by path length so a folder's parent node always exists before the folder is attached.
        paths.sort(Comparator.comparingInt(List::size));
        List<Node> nodes = new ArrayList<>(paths.size());
        Node rootNode = new Node();
        for (List<String> pathList : paths) {
            Node current = rootNode;
            int last = pathList.size() - 1;
            // Walk down to the parent folder (guaranteed by the problem constraints to be in the input).
            for (int i = 0; i < last; i++) {
                String s = pathList.get(i);
                current = current.subNodes.get(s);
            }
            // Attach exactly one new node per path; nodes stays index-aligned with the sorted paths.
            String name = pathList.get(last);
            Node node = new Node();
            current.subNodes.put(name, node);
            nodes.add(node);
        }

        // Reverse order = deepest folders first, so every child's content is ready before its parent's.
        StringBuilder content = new StringBuilder();
        Map<String, Node> nodeByContent = new HashMap<>();
        for (int i = nodes.size() - 1; i >= 0; i--) {
            Node node = nodes.get(i);
            if (node.subNodes.isEmpty()) {
                continue;   // empty folders are never duplicates
            }
            for (Map.Entry<String, Node> entry : node.subNodes.entrySet()) {
                content.append(entry.getKey()).append('{').append(entry.getValue().content).append('}');
            }
            node.content = content.toString();
            content.delete(0, content.length());
            // If another folder already produced the same content, both subtrees are duplicates.
            Node similar = nodeByContent.putIfAbsent(node.content, node);
            if (similar != null) {
                node.markRemove();
                similar.markRemove();
            }
        }

        // Keep the paths whose final folder was not marked (nodes.get(i) matches paths.get(i)).
        List<List<String>> ans = new ArrayList<>();
        for (int i = 0; i < paths.size(); i++) {
            if (!nodes.get(i).remove) {
                ans.add(paths.get(i));
            }
        }
        return ans;
    }
}
Approach:
- Sort paths by length so every parent folder's node exists before its children are attached (crucial for correctness, not just speed)
- Build exactly one node per input path, each attached directly to its already-built parent
- Process signatures bottom-up by sweeping the nodes in reverse order (see the content-format sketch after this list)
- Mark duplicates, together with their whole subtrees, the moment a signature repeats
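Solution 2's content strings follow the same idea as Solution 1's signatures, just with braces. A minimal sketch (hypothetical ContentDemo class, not part of the solution):

```java
import java.util.*;

// Hypothetical demo of the "name{childContent}" format used by Solution 2.
class ContentDemo {
    static class Folder {
        TreeMap<String, Folder> subNodes = new TreeMap<>();
    }

    // Mirrors the loop in deleteDuplicateFolder: children are serialized in sorted order.
    static String content(Folder f) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Folder> e : f.subNodes.entrySet()) {
            sb.append(e.getKey()).append('{').append(content(e.getValue())).append('}');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Folder x = new Folder();
        x.subNodes.put("y", new Folder());
        Folder a = new Folder();
        a.subNodes.put("x", x);
        // Prints "x{y{}}": the content string the a and b folders share in the walkthrough below.
        System.out.println(content(a));
    }
}
```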
Detailed Comparison
| Aspect | Solution 1 (Standard) | Solution 2 (Optimized) |
| --- | --- | --- |
| Trie Building | Complete trie upfront | Incremental, one node per path |
| Processing Order | Top-down DFS | Bottom-up (reverse path order) |
| Memory Usage | Higher (full trie) | Lower (minimal intermediate bookkeeping) |
| Duplicate Detection | Count-based after full scan | Immediate marking |
| Path Sorting | Not required | Critical: sorts by length |
| Code Complexity | More intuitive | More optimized but complex |
Why Solution 2's Optimizations Work
1. Path Length Sorting
paths.sort(Comparator.comparingInt(List::size));
Why this matters: sorting by length guarantees that when each path is inserted, its parent folder's node already exists, and it makes the later reverse iteration visit deeper folders before shallower ones, which is what enables bottom-up signature generation.
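A small sketch (hypothetical SortDemo class) of what the sort does to the walkthrough input used later in this post; note that List.sort is stable, so equal-length paths keep their original relative order:

```java
import java.util.*;

public class SortDemo {
    public static void main(String[] args) {
        List<List<String>> paths = new ArrayList<>(Arrays.asList(
                Arrays.asList("a"),
                Arrays.asList("a", "x"),
                Arrays.asList("a", "x", "y"),
                Arrays.asList("b"),
                Arrays.asList("b", "x"),
                Arrays.asList("b", "x", "y")
        ));
        paths.sort(Comparator.comparingInt(List::size));
        // Prints [[a], [b], [a, x], [b, x], [a, x, y], [b, x, y]]
        System.out.println(paths);
    }
}
```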
2. Incremental Node Building
// Attach exactly one new node per path (its parent already exists thanks to the sort)
Node node = new Node();
current.subNodes.put(name, node);
nodes.add(node);
Benefit: Every folder node is created exactly once, and the nodes list comes out index-aligned with the sorted paths and ordered by depth, which is exactly what the reverse sweep and the final collection loop rely on. (This works because the problem guarantees that every non-root folder's parent also appears in the input.)
3. Reverse Processing Order
for (int i = nodes.size() - 1; i >= 0; i--) {
Effect: Processes the deepest folders first, so every child's content string is already built by the time its parent concatenates it.
4. Immediate Duplicate Marking
Node similar = nodeByContent.putIfAbsent(node.content, node);
if (similar != null) {
node.markRemove();
similar.markRemove();
}
Advantage: putIfAbsent returns the previously stored node when the same content has been seen before, so both occurrences are marked immediately and markRemove() propagates the flag down each subtree.
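A tiny sketch (hypothetical PutIfAbsentDemo class) of the Map.putIfAbsent contract this relies on: it returns null on the first insertion and the previously stored value afterwards, which is how the second occurrence finds the first.

```java
import java.util.*;

public class PutIfAbsentDemo {
    public static void main(String[] args) {
        Map<String, String> nodeByContent = new HashMap<>();
        System.out.println(nodeByContent.putIfAbsent("x{y{}}", "first"));   // null: first time this content is seen
        System.out.println(nodeByContent.putIfAbsent("x{y{}}", "second"));  // "first": a duplicate, so both get marked
    }
}
```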
Key Algorithmic Differences
Solution 1: Three-Phase Processing
- Phase 1: Build complete trie structure
- Phase 2: Generate all signatures via DFS
- Phase 3: Collect results while skipping duplicates
Solution 2: Single-Pass Optimization
- Pre-processing: Sort paths by length
- Building: Create only necessary nodes
- Processing: Bottom-up signature generation with immediate duplicate detection
Complexity Comparison
| | Solution 1 | Solution 2 |
| --- | --- | --- |
| Time | O(M + N) | O(M + N + P log P)* |
| Space | O(M + N) | O(M + N) |

*P = number of paths; the extra P log P term comes from sorting
Trade-offs
Solution 1 Advantages:
- ✅ More intuitive - classic trie + DFS approach
- ✅ Easier to understand and debug
- ✅ Standard pattern - follows common trie algorithms
- ✅ No sorting required - works with any input order
Solution 2 Advantages:
- ✅ More memory efficient - builds minimal intermediate structure
- ✅ Immediate duplicate detection - no separate collection pass over stored signature counts
- ✅ Potentially faster in practice due to less overhead
- ✅ Clever optimizations - demonstrates advanced algorithmic thinking
Solution 2 Disadvantages:
- ❌ More complex logic - harder to understand bottom-up processing
- ❌ Requires sorting - additional O(P log P) complexity
- ❌ Less intuitive - the reverse processing is not obvious
Example Walkthrough
For input: [["a"], ["a","x"], ["a","x","y"], ["b"], ["b","x"], ["b","x","y"]]
Solution 1 Process:
- Builds the complete trie with all nodes
- Generates signatures: the two x nodes get "y()", and the a and b nodes get "x(y())"
- Skips any node whose signature count is ≥ 2 during collection
- Result: [] (everything is removed, since a and b are duplicates of each other and so are the x folders beneath them)
Solution 2 Process:
- Sorts: [["a"], ["b"], ["a","x"], ["b","x"], ["a","x","y"], ["b","x","y"]]
- Creates one node per path, each attached to its already-built parent
- Processes in reverse: both x nodes produce content "y{}" and are marked, then a and b both produce "x{y{}}" and are marked
- Result: [] (all removed); a runnable check follows below
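A minimal runnable check of the walkthrough (hypothetical Walkthrough class; compile it against either Solution class above):

```java
import java.util.*;

public class Walkthrough {
    public static void main(String[] args) {
        // Walkthrough input; wrapped in a new ArrayList because Solution 2 sorts in place.
        List<List<String>> paths = new ArrayList<>(Arrays.asList(
                Arrays.asList("a"),
                Arrays.asList("a", "x"),
                Arrays.asList("a", "x", "y"),
                Arrays.asList("b"),
                Arrays.asList("b", "x"),
                Arrays.asList("b", "x", "y")
        ));
        // Prints [] with either Solution class above.
        System.out.println(new Solution().deleteDuplicateFolder(paths));
    }
}
```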
Conclusion
Solution 1 is the cleaner, more maintainable approach that follows standard algorithmic patterns. It's what most people would write in an interview.
Solution 2 is a highly optimized version that shows deep understanding of the problem structure. It's more suitable for production where memory efficiency matters.
For interviews: I'd recommend Solution 1 for clarity and explanation. For competitive programming: Solution 2 might have slight performance advantages.
Both solutions are correct and have essentially the same asymptotic complexity (Solution 2 only adds the sorting term), but Solution 2 demonstrates more advanced optimization techniques!