My Homelab Journey: Paperless-ngx - The Document Management Game-Changer - Part 5

Ayoub Touba

Hello again!

Welcome to Part 5 of my homelab journey series! After setting up my Lenovo M920 Tiny with Proxmox (as you saw in Part 4), I've been exploring various self-hosted applications. Today, I want to share what has become my absolute favorite service: Paperless-ngx! 😄

Why Paperless-ngx Changed My Life

If you're anything like me, you've experienced that moment of panic when you need an important document (an insurance certificate, a warranty, a medical record) and have absolutely no idea where you put it. Before Paperless-ngx, my "filing system" consisted of assorted folders and drawers.

Paperless-ngx solved this problem so elegantly. As their website puts it: "Paperless-ngx is a community-supported open-source document management system that transforms your physical documents into a searchable online archive so you can keep, well, less paper."

What Makes Paperless-ngx So Powerful?

1. Comprehensive Document Organization

Paperless-ngx doesn't just store your documents—it makes them truly searchable and organized:

  • Full-text search: Find any document by searching its contents, not just the filename

  • Tagging system: Create custom tags for easy categorization

  • Custom fields: My favorite feature! I created a "Physical Location" field so I can note exactly where the original document is stored in my home

  • Automatic document parsing: It can extract dates, correspondents, and other information automatically
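
The full-text search is also reachable outside the web interface, through the Paperless-ngx REST API. Here is a minimal Python sketch, assuming the documented `/api/documents/?query=` endpoint with token authentication. The helper names, the host, and the token are placeholders for illustration, not anything from my setup.

```python
import json
import urllib.parse
import urllib.request


def build_search_url(base_url: str, query: str) -> str:
    """Return the documents endpoint URL for a full-text query."""
    params = urllib.parse.urlencode({"query": query})
    return f"{base_url.rstrip('/')}/api/documents/?{params}"


def search_documents(base_url: str, token: str, query: str) -> list:
    """Run the query and return the 'results' list from the JSON response."""
    request = urllib.request.Request(
        build_search_url(base_url, query),
        headers={"Authorization": f"Token {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response).get("results", [])


# Example call (hypothetical host and token, replace with your own):
# for doc in search_documents("http://localhost:8033", "YOUR_API_TOKEN", "warranty"):
#     print(doc.get("id"), doc.get("title"))
```

Only the first page of results is returned here; the response is paginated, so a longer script would follow the `next` link.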

2. Smart Document Processing

The OCR (Optical Character Recognition) capabilities are impressive:

  • Converts scanned images to searchable text

  • Supports multiple languages (I use French and English)

  • Can automatically categorize documents based on content

  • Creates thumbnails for easy visual identification

3. Flexible Input Methods

Getting documents into the system is remarkably flexible:

  • Direct upload through the web interface

  • Mobile app for scanning on the go

  • Watch folder for automatic importing

My Paperless Workflow

After some experimentation, I've settled on a workflow that works perfectly for me:

  1. Document scanning: I use my phone to scan documents via the Google Drive app (it has excellent document scanning capabilities)

  2. Automatic import: A Node.js script I wrote checks my Google Drive folder every hour and copies new documents to Paperless-ngx's consumption folder

  3. Processing: Paperless-ngx automatically OCRs the documents and makes an initial attempt at classification

  4. Organization: Once a week, I log into the dashboard to verify and adjust tags and categories and to fill in any custom fields

  5. Search and retrieve: Whenever I need a document, a quick search brings it up instantly!
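
The automatic import step can be sketched in a few lines. This is not the Node.js script mentioned above, just a Python illustration of the same idea. It assumes the Google Drive folder is already available as a local directory (synced or mounted by some other tool), and the function name, file types, and state file are hypothetical choices.

```python
import json
import shutil
from pathlib import Path

# File types to hand over; an assumption, adjust to what you scan.
ALLOWED_SUFFIXES = {".pdf", ".png", ".jpg", ".jpeg"}


def import_new_documents(source: Path, consume: Path, state_file: Path) -> list[str]:
    """Copy files not seen on earlier runs from source into consume.

    Paperless-ngx removes files from the consume folder once processed, so
    already-copied names are remembered in a small JSON state file.
    """
    seen = set(json.loads(state_file.read_text())) if state_file.exists() else set()
    copied = []
    for path in sorted(source.iterdir()):
        # Skip sub-folders and anything that is not a scan format we expect.
        if not path.is_file() or path.suffix.lower() not in ALLOWED_SUFFIXES:
            continue
        if path.name in seen:
            continue
        shutil.copy2(path, consume / path.name)
        seen.add(path.name)
        copied.append(path.name)
    state_file.write_text(json.dumps(sorted(seen)))
    return copied
```

Run hourly (from cron, for example) with the synced folder as `source` and the host side of the consume volume, `/data/paperless/consume` in the configuration below, as `consume`. Tracking by file name is the simplest option; a renamed file would be imported again.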

The mobile app has been a game-changer for quick document retrieval. No more digging through filing cabinets when I need to reference something while on the phone with my insurance company!

Setting Up Paperless-ngx

I won't sugar-coat it—setting up Paperless-ngx wasn't as straightforward as some of my other services. But the effort was absolutely worth it! Here's the Docker Compose configuration that's been working flawlessly for me:

services:
  broker:
    image: docker.io/library/redis:7
    restart: unless-stopped
    volumes:
      - /data/paperless/redis:/data

  db:
    image: docker.io/library/postgres:16.0-alpine
    restart: unless-stopped
    volumes:
      - /data/paperless/db:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: paperless
      POSTGRES_USER: paperless
      POSTGRES_PASSWORD: paperless

  webserver:
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    restart: unless-stopped
    depends_on:
      - db
      - broker
    ports:
      - "8033:8000"
    volumes:
      - /data/paperless/data:/usr/src/paperless/data
      - /data/paperless/media:/usr/src/paperless/media
      - /data/paperless/export:/usr/src/paperless/export
      - /data/paperless/consume:/usr/src/paperless/consume
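    environment:
      # Initial superuser, created automatically at start; replace both values
      PAPERLESS_ADMIN_USER: "your_username"
      PAPERLESS_ADMIN_PASSWORD: "your_secure_password"
      PAPERLESS_REDIS: redis://broker:6379
      PAPERLESS_DBHOST: db
      # UID/GID that should own the files in the mounted volumes
      USERMAP_UID: 1000
      USERMAP_GID: 1000
      PAPERLESS_SECRET_KEY: "generate_a_long_random_string_here"
      PAPERLESS_TIME_ZONE: Europe/Paris
      # Tesseract language codes joined with "+": French and English here
      PAPERLESS_OCR_LANGUAGE: fra+eng
      PAPERLESS_URL: "https://your.domain.or.ip"
      PAPERLESS_CSRF_TRUSTED_ORIGINS: "https://your.domain.or.ip"
      # CORS origins need the scheme, just like the two settings above
      PAPERLESS_CORS_ALLOWED_HOSTS: "https://your.domain.or.ip"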

A few important configuration notes:

  • Make sure to replace the placeholder values with your actual username, password, and domain/IP

  • The PAPERLESS_OCR_LANGUAGE setting is crucial—set it to the languages your documents use

  • The consume folder is where new documents will be picked up automatically

  • You'll need to adjust the USERMAP_UID and USERMAP_GID to match your system's user/group IDs
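
Two of these values can be produced with a short script. The sketch below uses Python's standard `secrets` module to generate a random string for PAPERLESS_SECRET_KEY and reads the UID and GID of whoever runs it. The function names and the 48-byte length are my own choices, and it assumes a Unix-like host.

```python
import os
import secrets


def generate_secret_key(num_bytes: int = 48) -> str:
    """Return a URL-safe random string suitable as a long secret value."""
    return secrets.token_urlsafe(num_bytes)


def current_ids() -> tuple[int, int]:
    """Return the UID and GID of the user running this script (Unix only)."""
    return os.getuid(), os.getgid()


# Print lines that can be pasted into the environment section.
uid, gid = current_ids()
print(f'PAPERLESS_SECRET_KEY: "{generate_secret_key()}"')
print(f"USERMAP_UID: {uid}")
print(f"USERMAP_GID: {gid}")
```

Run it as the user who owns the `/data/paperless` directories so the printed IDs match the files on disk.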

Tips From My Experience

After using Paperless-ngx for several months, I've gathered some helpful insights:

  1. Invest time in creating a tagging system that makes sense for your documents before importing too many files

  2. Set up document types early to help with automatic classification

  3. Use custom fields creatively—I track physical location, importance level, and action required

  4. Configure correspondents for common senders like your bank, insurance company, etc.

Of all the services in my homelab, Paperless-ngx has had the most tangible impact on my day-to-day life. The time savings from not hunting for documents, the peace of mind from having everything securely backed up, and the satisfaction of a well-organized system make it the crown jewel of my setup.

If you're on the fence about setting up a document management system, I can't recommend Paperless-ngx enough. Yes, there's an initial time investment to scan and organize your backlog of documents, but the long-term benefits are enormous.

Written by

Ayoub Touba

With over a decade of hands-on experience, I specialize in building robust web applications and scalable software solutions. My expertise spans across cutting-edge frameworks and technologies, including Node.js, React, Angular, Vue.js, and Laravel. I also delve into hardware integration with ESP32 and Arduino, creating IoT solutions.