My Homelab Journey: Paperless-ngx - The Document Management Game-Changer - Part 5


Hello again!
Welcome to Part 5 of my homelab journey series! After setting up my Lenovo M920 Tiny with Proxmox (as you saw in Part 4), I've been exploring various self-hosted applications. Today, I want to share what has become my absolute favorite service: Paperless-ngx! 😄
Why Paperless-ngx Changed My Life
If you're anything like me, you've experienced that moment of panic when you need an important document (an insurance certificate, a warranty, a medical record) and have absolutely no idea where you put it. Before Paperless-ngx, my "filing system" consisted of assorted folders and drawers scattered around the house.
Paperless-ngx solved this problem so elegantly. As their website puts it: "Paperless-ngx is a community-supported open-source document management system that transforms your physical documents into a searchable online archive so you can keep, well, less paper."
What Makes Paperless-ngx So Powerful?
1. Comprehensive Document Organization
Paperless-ngx doesn't just store your documents—it makes them truly searchable and organized:
Full-text search: Find any document by searching its contents, not just the filename
Tagging system: Create custom tags for easy categorization
Custom fields: My favorite feature! I created a "Physical Location" field so I can note exactly where the original document is stored in my home
Automatic document parsing: It extracts dates, correspondents, and other metadata from each document
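The same full-text search is also reachable outside the web interface through the Paperless-ngx REST API. Here is a minimal Node.js sketch (Node 18+ for the built-in fetch). The helper names, the base URL, and the token are placeholders invented for illustration, and it assumes you have created an API token for your user.

```javascript
// Hypothetical helper: builds the full-text search URL for the documents endpoint.
// Note: the absolute path replaces any sub-path present in baseUrl.
function buildSearchUrl(baseUrl, query) {
  const url = new URL("/api/documents/", baseUrl);
  url.searchParams.set("query", query);
  return url.toString();
}

// Hypothetical helper: runs the search and returns the current page of matches.
async function searchDocuments(baseUrl, token, query) {
  const response = await fetch(buildSearchUrl(baseUrl, query), {
    headers: {
      Authorization: `Token ${token}`, // token value is an assumption, use your own
      Accept: "application/json",
    },
  });
  if (!response.ok) {
    throw new Error(`Search failed with HTTP ${response.status}`);
  }
  const body = await response.json();
  return body.results; // the API paginates; this is only the first page
}

// Example usage (placeholder URL and token):
// searchDocuments("https://your.domain.or.ip", "your_api_token", "insurance certificate")
//   .then((docs) => docs.forEach((doc) => console.log(doc.id, doc.title)));
```

For everyday use the web interface and mobile app are more convenient; a call like this is mostly handy for scripting.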
2. Smart Document Processing
The OCR (Optical Character Recognition) capabilities are impressive:
Converts scanned images to searchable text
Supports multiple languages (I use French and English)
Can automatically categorize documents based on content
Creates thumbnails for easy visual identification
3. Flexible Input Methods
Getting documents into the system is remarkably flexible:
Direct upload through the web interface
Mobile app for scanning on the go
Watch folder for automatic importing
My Paperless Workflow
After some experimentation, I've settled on a workflow that works perfectly for me:
Document scanning: I use my phone to scan documents via the Google Drive app (it has excellent document scanning capabilities)
Automatic import: A Node.js script I wrote monitors my Google Drive folder hourly and copies new documents to Paperless-ngx's consumption folder
Processing: Paperless-ngx automatically OCRs the documents and makes an initial attempt at classification
Organization: Once a week, I log into the dashboard to verify and adjust tags and categories, and to fill in any custom fields
Search and retrieve: Whenever I need a document, a quick search brings it up instantly!
The mobile app has been a game-changer for quick document retrieval. No more digging through filing cabinets when I need to reference something while on the phone with my insurance company!
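To give an idea of what the automatic import step can look like, here is a minimal Node.js sketch. It is an illustration under assumptions, not the exact script described above: it assumes the Google Drive folder is already synced to a local directory, and the function name, the extension list, and the state file are all made up for the example. The state file is there because consumed files normally disappear from the consume folder, so the folder itself can't tell you what was already imported.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Assumption: only these scan formats should be handed to Paperless-ngx.
const ALLOWED_EXTENSIONS = new Set([".pdf", ".png", ".jpg", ".jpeg"]);

// Copies files from sourceDir that were not imported before into consumeDir.
// stateFile holds a JSON array with the names of files already handed over.
// Returns the names of the files copied during this run.
function copyNewDocuments(sourceDir, consumeDir, stateFile) {
  const seen = new Set(
    fs.existsSync(stateFile) ? JSON.parse(fs.readFileSync(stateFile, "utf8")) : []
  );
  const copied = [];

  for (const entry of fs.readdirSync(sourceDir, { withFileTypes: true })) {
    const extension = path.extname(entry.name).toLowerCase();
    if (!entry.isFile() || !ALLOWED_EXTENSIONS.has(extension) || seen.has(entry.name)) {
      continue;
    }
    fs.copyFileSync(path.join(sourceDir, entry.name), path.join(consumeDir, entry.name));
    seen.add(entry.name);
    copied.push(entry.name);
  }

  // Persist the updated list so the next run skips these files.
  fs.writeFileSync(stateFile, JSON.stringify([...seen], null, 2));
  return copied;
}

// Example usage (hypothetical paths): run once, then again every hour.
// const run = () =>
//   copyNewDocuments("/mnt/gdrive/Scans", "/data/paperless/consume", "/data/paperless/imported.json");
// run();
// setInterval(run, 60 * 60 * 1000);
```

Tracking by file name keeps the sketch short; if you rescan under the same name, hashing the file contents would be a sturdier choice.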
Setting Up Paperless-ngx
I won't sugar-coat it—setting up Paperless-ngx wasn't as straightforward as some of my other services. But the effort was absolutely worth it! Here's the Docker Compose configuration that's been working flawlessly for me:
services:
  # Redis: message broker for background tasks
  broker:
    image: docker.io/library/redis:7
    restart: unless-stopped
    volumes:
      - /data/paperless/redis:/data
  # PostgreSQL: stores document metadata
  db:
    image: docker.io/library/postgres:16.0-alpine
    restart: unless-stopped
    volumes:
      - /data/paperless/db:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: paperless
      POSTGRES_USER: paperless
      # If you change this, also set PAPERLESS_DBPASS on the webserver
      POSTGRES_PASSWORD: paperless
  webserver:
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    restart: unless-stopped
    depends_on:
      - db
      - broker
    ports:
      # Host port 8033 -> container port 8000
      - "8033:8000"
    volumes:
      - /data/paperless/data:/usr/src/paperless/data
      - /data/paperless/media:/usr/src/paperless/media
      - /data/paperless/export:/usr/src/paperless/export
      - /data/paperless/consume:/usr/src/paperless/consume
    environment:
      PAPERLESS_ADMIN_USER: "your_username"
      PAPERLESS_ADMIN_PASSWORD: "your_secure_password"
      PAPERLESS_REDIS: redis://broker:6379
      PAPERLESS_DBHOST: db
      # UID/GID of the host user that owns /data/paperless
      USERMAP_UID: 1000
      USERMAP_GID: 1000
      PAPERLESS_SECRET_KEY: "generate_a_long_random_string_here"
      PAPERLESS_TIME_ZONE: Europe/Paris
      # Tesseract language codes: French + English
      PAPERLESS_OCR_LANGUAGE: fra+eng
      PAPERLESS_URL: "https://your.domain.or.ip"
      PAPERLESS_CSRF_TRUSTED_ORIGINS: "https://your.domain.or.ip"
      # CORS origins need the scheme as well
      PAPERLESS_CORS_ALLOWED_HOSTS: "https://your.domain.or.ip"
A few important configuration notes:
Make sure to replace the placeholder values with your actual username, password, and domain/IP
The PAPERLESS_OCR_LANGUAGE setting is crucial: set it to the languages your documents use
The consume folder is where new documents will be picked up automatically
You'll need to adjust USERMAP_UID and USERMAP_GID to match your system's user/group IDs
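Two of those placeholders can be filled in with a few lines of Node.js. This is just one way to do it: the function name and the 48-byte length are arbitrary choices for the example, and the UID/GID it prints belong to whoever runs the script, so run it as the user that owns the /data/paperless folders.

```javascript
const { randomBytes } = require("node:crypto");

// Hypothetical helper: returns a random hex string to use as PAPERLESS_SECRET_KEY.
// 48 random bytes give 96 hex characters; the length is an arbitrary choice.
function generateSecretKey(bytes = 48) {
  return randomBytes(bytes).toString("hex");
}

console.log(`PAPERLESS_SECRET_KEY: "${generateSecretKey()}"`);

// process.getuid() and process.getgid() only exist on POSIX systems (not Windows).
if (typeof process.getuid === "function") {
  console.log(`USERMAP_UID: ${process.getuid()}`);
  console.log(`USERMAP_GID: ${process.getgid()}`);
}
```

Paste the printed values into the compose file in place of the placeholders, and keep the secret key out of any public repository.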
Tips From My Experience
After using Paperless-ngx for several months, I've gathered some helpful insights:
Invest time in creating a tagging system that makes sense for your documents before importing too many files
Set up document types early to help with automatic classification
Use custom fields creatively—I track physical location, importance level, and action required
Configure correspondents for common senders such as your bank or insurance company
Of all the services in my homelab, Paperless-ngx has had the most tangible impact on my day-to-day life. The time savings from not hunting for documents, the peace of mind from having everything securely backed up, and the satisfaction of a well-organized system make it the crown jewel of my setup.
If you're on the fence about setting up a document management system, I can't recommend Paperless-ngx enough. Yes, there's an initial time investment to scan and organize your backlog of documents, but the long-term benefits are enormous.
Written by

Ayoub Touba
With over a decade of hands-on experience, I specialize in building robust web applications and scalable software solutions. My expertise spans across cutting-edge frameworks and technologies, including Node.js, React, Angular, Vue.js, and Laravel. I also delve into hardware integration with ESP32 and Arduino, creating IoT solutions.