Fylex: A Smarter, Faster Way to Clean and Copy Files with Python

Let me guess. You’ve got files named things like:

report_final(1).docx
vacation-final-final.jpg
copy_of_copy_of_final_draft.pdf

Sound familiar?

I’ve been there. My Downloads folder was a digital landfill—overflowing with duplicates, mismatched names, and backups of backups. The problem? I didn’t want to manually clean hundreds of files or rely on clunky GUI tools that didn’t give me control.

So I built a Python tool that thinks before it copies, and cleans like it’s caffeinated. Meet fylex.

What Is fylex?

fylex is a Python-powered file management utility focused on speed, clarity, and control.

It helps you:

Detect and remove duplicates using fast hashing
Copy and move files with rules and safety checks
Categorize files by extension, size, or custom patterns
Handle conflicts intelligently (rename, skip, replace, etc.)
Use dry-run and interactive modes to stay safe

And it does all of this without bloated GUIs. It’s designed for devs, power users, and automation enthusiasts.

Why I Built It

I was tired of writing one-off scripts every time I needed to clean up a directory. The standard library's shutil copies and moves files well enough, but it didn't help when I wanted to:

Avoid copying duplicates
Detect renamed but identical files
Organize large dumps of media or documents

So I turned this frustration into fylex, a CLI-optional library you can call directly from Python.

Detecting Duplicates — The Smart Way

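Many quick dedup scripts hash every file with SHA-1 or MD5. fylex uses xxhash instead, a non-cryptographic hash that is much faster on large files. That trade-off fits the job: the goal is spotting identical content, not resisting deliberate collisions.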

Here’s how to use it:

import os

import fylex as fx

fx.refine(
    target=os.path.expanduser("~/Downloads"),  # Expand "~" explicitly
    match_glob="*.mp3",
    recursive_check=True,
    dry_run=False,  # Set to True to preview
    verbose=True
)

This will:

Recursively scan for .mp3 files
Group by file size
Hash candidates using xxhash
Move duplicates to a backup folder instead of deleting them outright

Safety and speed, all in one.
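
If you're curious what those steps look like in plain Python, here's a minimal sketch. It is not fylex's source code. It uses hashlib.blake2b from the standard library as a stand-in for xxhash so it runs without extra installs, and the names file_digest, find_duplicates, and quarantine_duplicates are hypothetical helpers made up for this illustration.

```python
import hashlib
import shutil
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files never load fully into memory."""
    hasher = hashlib.blake2b(digest_size=16)  # assumption: stand-in for xxhash
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(root: Path, pattern: str = "*") -> list[list[Path]]:
    """Return groups of files under root whose contents are identical."""
    # Step 1: group by size, which is cheap and rules out most files.
    by_size: dict[int, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob(pattern)):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    # Step 2: hash only the files that share a size with another file.
    groups: list[list[Path]] = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        by_hash: dict[str, list[Path]] = defaultdict(list)
        for path in same_size:
            by_hash[file_digest(path)].append(path)
        groups.extend(group for group in by_hash.values() if len(group) > 1)
    return groups


def quarantine_duplicates(
    root: Path, backup: Path, pattern: str = "*", dry_run: bool = True
) -> list[Path]:
    """Keep the first file of each group and move the rest into backup."""
    moved: list[Path] = []
    for group in find_duplicates(root, pattern):
        for extra in group[1:]:
            moved.append(extra)
            if dry_run:
                continue  # preview only, touch nothing
            backup.mkdir(parents=True, exist_ok=True)
            target = backup / extra.name
            counter = 1
            while target.exists():  # avoid overwriting an earlier move
                target = backup / f"{extra.stem}_{counter}{extra.suffix}"
                counter += 1
            shutil.move(str(extra), str(target))
    return moved
```

Calling quarantine_duplicates(root, backup, "*.mp3") with the default dry_run=True only returns the list of files that would be moved. Keeping the backup folder outside root stops later runs from rescanning it.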

Organize Files by Extension or Size

Need to organize files in a directory dump?

fx.categorize(target="./Documents", categorize_by="ext")
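
For a feel of what grouping by extension boils down to, here's a rough standard-library sketch. It is an illustration only, not how fylex does it internally: plan_by_extension is a hypothetical helper, and it just builds a plan without moving anything.

```python
from collections import defaultdict
from pathlib import Path


def plan_by_extension(root: Path) -> dict[str, list[Path]]:
    """Map a folder name (the lowercased extension) to the files that belong in it."""
    plan: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.iterdir()):
        if not path.is_file():
            continue  # leave subdirectories alone
        # "a.PDF" and "b.pdf" should land in the same "pdf" folder.
        folder = path.suffix.lower().lstrip(".") or "no_extension"
        plan[folder].append(path)
    return dict(plan)
```

Printing the returned plan before acting on it is a cheap dry run of its own.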

Or, group by file size:

fx.categorize_by_size(
    target=".",
    grouping={
        (0, 1_000_000): "tiny/",
        (1_000_000, 100_000_000): "medium/",
        (100_000_000, "max"): "huge/"
    }
)

Clean organization in seconds — no manual sorting needed.
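
The size grouping above is easy to reason about once you see the lookup written out. The sketch below reuses the same grouping dictionary, but bucket_for is a hypothetical helper, and two details are my assumptions rather than documented fylex behavior: each range is treated as half-open (lower bound included, upper bound excluded), and "max" means no upper limit.

```python
import math
from typing import Optional

# Same shape as the grouping passed to fylex above; sizes are in bytes.
GROUPING = {
    (0, 1_000_000): "tiny/",
    (1_000_000, 100_000_000): "medium/",
    (100_000_000, "max"): "huge/",
}


def bucket_for(size: int, grouping: dict = GROUPING) -> Optional[str]:
    """Return the folder for a file size, or None if no range matches."""
    for (low, high), folder in grouping.items():
        upper = math.inf if high == "max" else high  # assumption: "max" is unbounded
        if low <= size < upper:  # assumption: half-open range [low, high)
            return folder
    return None
```

With half-open ranges a file of exactly 1,000,000 bytes lands in medium/ rather than matching two buckets at once.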

What Makes fylex Different?

Fast hashing: Uses xxhash instead of slow SHA1 or MD5
Multi-threaded operations: Spreads work across threads to speed up large batches
Conflict strategies: Choose from rename, skip, replace, larger, newer, and more
Dry-run mode: See what’ll happen before it does
Regex + glob support: Advanced filtering made simple
Safe deletion: Moves duplicates to a fallback folder, never deletes blindly

It’s not just a tool — it’s a defensive shield against digital clutter.
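
To make the conflict strategies concrete, here's one way they could be expressed with only the standard library. This is a sketch under my own assumptions, not fylex's implementation: resolve_conflict, unique_name, and copy_with_strategy are hypothetical names, the "name(1).ext" rename pattern is just one possible convention, and "newer" and "larger" are read here as "overwrite only if the incoming file is newer or larger".

```python
import shutil
from pathlib import Path
from typing import Optional


def unique_name(dest: Path) -> Path:
    """Return dest, or the first free 'name(N).ext' variant if dest exists."""
    candidate = dest
    counter = 1
    while candidate.exists():
        candidate = dest.with_name(f"{dest.stem}({counter}){dest.suffix}")
        counter += 1
    return candidate


def resolve_conflict(src: Path, dest: Path, strategy: str = "rename") -> Optional[Path]:
    """Return the path to write to, or None if the copy should be skipped."""
    if not dest.exists():
        return dest  # no conflict at all
    if strategy == "skip":
        return None
    if strategy == "replace":
        return dest
    if strategy == "rename":
        return unique_name(dest)
    if strategy == "newer":
        return dest if src.stat().st_mtime > dest.stat().st_mtime else None
    if strategy == "larger":
        return dest if src.stat().st_size > dest.stat().st_size else None
    raise ValueError(f"unknown strategy: {strategy!r}")


def copy_with_strategy(
    src: Path, dest: Path, strategy: str = "rename", dry_run: bool = True
) -> Optional[Path]:
    """Copy src to dest according to strategy; with dry_run, only report the target."""
    target = resolve_conflict(src, dest, strategy)
    if target is not None and not dry_run:
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target)  # copy2 also preserves timestamps
    return target
```

Separating the decision (resolve_conflict) from the action (copy_with_strategy) is what makes a dry run cheap: the same decision code runs either way, and only the final copy is skipped.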

Real-World Use Cases

Cleaning up duplicate downloads
Tidying up music or photo libraries
Deduplicating backup drives
Automating file categorization in CI pipelines
Building data preprocessing scripts that avoid copying garbage

Installation

Get started in seconds:

pip install fylex

Full documentation and examples are available on the fylex PyPI page.

Who Should Use fylex?

Developers who want control over file ops
Researchers working with large datasets
Media professionals managing bulky files
Anyone tired of cleaning files manually

If you’ve ever written a Python script to move, rename, or delete files, you’ll appreciate what fylex can do out of the box.

Final Thoughts

I built fylex because I was tired of fighting my filesystem.

It turned into a fast, flexible utility I now use in almost every personal or work project involving file handling. It’s lightweight, safe by default, and battle-tested on real-world chaos like 20-year-old USB backups and messy project dumps.

If that sounds like your situation, fylex might be the utility you never knew you needed.

Give it a try. And if it cleans up even one layer of your file chaos, it’s done its job.
