LeetCode Daily Challenge-1233. Remove Sub-Folders from the Filesystem(Med, Java)

Anni HuangAnni Huang
3 min read

Problem Statement

Given an array of folder paths, remove all sub-folders from the filesystem and return the folders after removing all sub-folders in any order.

A folder x is a sub-folder of folder y if x starts with y followed by a /. For example, /a/b is a sub-folder of /a, but /ab is not a sub-folder of /a.

Example:

  • Input: ["/a","/a/b","/c/d","/c/d/e","/c/f"]
  • Output: ["/a","/c/d","/c/f"]

Algorithm Walkthrough

Solution 1: Sorting + String Prefix Check

Core Idea: Sort folders lexicographically so parent folders appear before their subfolders, then use string prefix matching.

Steps:

  1. Sort the array lexicographically - This ensures /a comes before /a/b, /a/b/c, etc.
  2. Initialize result with first folder - The lexicographically smallest folder cannot be a subfolder of anything
  3. For each remaining folder:
    • Get the last added folder and append / to it
    • Check if current folder starts with this prefix
    • If NOT a prefix match, add current folder to result (it's a parent folder)
    • If IS a prefix match, skip it (it's a subfolder)

Example trace:

Input: ["/a/b", "/a", "/c/d", "/c/d/e", "/c/f"]
After sorting: ["/a", "/a/b", "/c/d", "/c/d/e", "/c/f"]

i=0: ans = ["/a"]
i=1: lastFolder = "/a/", "/a/b".startsWith("/a/") = true → skip
i=2: lastFolder = "/a/", "/c/d".startsWith("/a/") = false → add, ans = ["/a", "/c/d"]
i=3: lastFolder = "/c/d/", "/c/d/e".startsWith("/c/d/") = true → skip
i=4: lastFolder = "/c/d/", "/c/f".startsWith("/c/d/") = false → add, ans = ["/a", "/c/d", "/c/f"]

Solution 2: Trie (Prefix Tree)

Core Idea: Build a trie to efficiently check if a folder is a subfolder of any previously processed folder.

Trie Structure:

  • Each node has 27 children (a-z + '/' character)
  • isEndOfWord marks the end of a valid folder path
  • Special handling: append '/' to each inserted path for proper subfolder detection

Steps:

  1. Sort by length - Process shorter paths first (potential parents)
  2. For each folder:
    • Search if current folder is a subfolder using trie
    • If NOT a subfolder: insert into trie and add to result
    • If IS a subfolder: skip it

Key Trie Operations:

  • insert(path): Adds path + '/' to trie, marking endpoint
  • search(path): Returns true if path is a subfolder of any inserted path
  • During search, if we encounter isEndOfWord before completing the path, it means current path is a subfolder

Example trace:

Input: ["/a", "/a/b", "/c/d", "/c/d/e", "/c/f"]
After sorting by length: ["/a", "/c/d", "/c/f", "/a/b", "/c/d/e"]

Process "/a": search() = false → insert, ans = ["/a"]
Process "/c/d": search() = false → insert, ans = ["/a", "/c/d"]  
Process "/c/f": search() = false → insert, ans = ["/a", "/c/d", "/c/f"]
Process "/a/b": search() = true (finds "/a/" prefix) → skip
Process "/c/d/e": search() = true (finds "/c/d/" prefix) → skip

Complexity Analysis

Solution 1: Sorting + String Check

  • Time Complexity: O(n log n + n·m)
    • O(n log n) for sorting the array
    • O(n·m) for string comparisons, where m is average path length
  • Space Complexity: O(1) excluding output array
    • Only uses constant extra space

Solution 2: Trie

  • Time Complexity: O(n log n + n·m)
    • O(n log n) for sorting by length
    • O(n·m) for trie operations (insert/search), where m is average path length
  • Space Complexity: O(n·m)
    • Trie storage can be significant, worst case stores all characters of all paths

Trade-offs:

  • Solution 1 is simpler and more space-efficient
  • Solution 2 offers better theoretical performance for very long paths or when string operations are expensive, but uses more memory
  • For most practical cases, Solution 1 is preferred due to its simplicity and better space complexity
0
Subscribe to my newsletter

Read articles from Anni Huang directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Anni Huang
Anni Huang

I’m Anni Huang, an AI researcher-in-training currently at ByteDance, specializing in LLM training operations with a coding focus. I bridge the gap between engineering execution and model performance, ensuring the quality, reliability, and timely delivery of large-scale training projects.