Want to build a tool that can read text from images — like scanned documents, screenshots, IDs, or invoices?

With Python and Tesseract OCR, you can extract text from most images that contain reasonably clear printed text. This post will guide you through:

Installing Tesseract and pytesseract
Preprocessing images for better accuracy
Extracting and printing clean text
Bonus: extract text from a batch of images

🧠 What is OCR?

OCR stands for Optical Character Recognition — the process of converting images of typed or handwritten text into machine-readable text.

We’ll use Tesseract, an open-source OCR engine originally developed at Hewlett-Packard and later sponsored by Google, along with the Python wrapper pytesseract.

⚙️ Step 1: Install Requirements

1. Install Tesseract engine:

Windows: https://github.com/UB-Mannheim/tesseract/wiki
macOS: brew install tesseract
Linux: sudo apt install tesseract-ocr

2. Install Python packages:

pip install pytesseract pillow
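
Before moving on, it can help to confirm that the Tesseract binary is actually reachable, since pytesseract only wraps the command-line program. The sketch below uses only the standard library. The helper name tesseract_version is made up for this post, and it assumes the executable is called tesseract and is on your PATH.

```python
import shutil
import subprocess


def tesseract_version(binary="tesseract"):
    """Return the first line of `<binary> --version`, or None if not found."""
    path = shutil.which(binary)  # look the executable up on PATH
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some builds print the banner to stderr rather than stdout, so check both.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


if __name__ == "__main__":
    version = tesseract_version()
    print(version or "Tesseract not found on PATH")
```

If this reports that Tesseract is missing on Windows even after installing it, pytesseract lets you point at the executable directly by setting pytesseract.pytesseract.tesseract_cmd to its full path.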

📄 Step 2: Basic Text Extraction

from PIL import Image
import pytesseract

img = Image.open("example.png")
text = pytesseract.image_to_string(img)

print("Extracted text:\n", text)
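
image_to_string also accepts a lang argument and a config string that is passed through to the Tesseract command line. Here is a minimal sketch, assuming the English language data (eng) is installed. The helper names build_config and extract_text are hypothetical, and the page segmentation mode is only a starting value to experiment with.

```python
def build_config(psm=6, oem=None):
    """Build a Tesseract command-line config string.

    psm: page segmentation mode (6 assumes a single uniform block of text).
    oem: OCR engine mode; left out of the string when None.
    """
    parts = [f"--psm {psm}"]
    if oem is not None:
        parts.append(f"--oem {oem}")
    return " ".join(parts)


def extract_text(path, lang="eng", psm=6):
    # Imported here so build_config can be used without the OCR stack installed.
    from PIL import Image
    import pytesseract

    with Image.open(path) as img:
        return pytesseract.image_to_string(img, lang=lang, config=build_config(psm))


print(build_config(psm=6))           # --psm 6
print(build_config(psm=11, oem=1))   # --psm 11 --oem 1
```

Trying a couple of segmentation modes on the same image is often the quickest way to see which one suits your layout.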

🔍 Step 3: Preprocess for Better Accuracy

OCR works best with clean, high-contrast images. Let’s convert to grayscale and increase contrast:

from PIL import Image, ImageOps
import pytesseract

def preprocess_image(path):
    img = Image.open(path)
    img = img.convert("L")  # Grayscale
    img = ImageOps.autocontrast(img)
    return img

img = preprocess_image("example.png")
text = pytesseract.image_to_string(img)
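
If autocontrast is not enough, a common next step is to binarize the image, turning every pixel pure black or pure white. The sketch below picks the cutoff with Otsu's method, which is one option chosen here for illustration rather than something the steps above require. otsu_threshold and binarize are hypothetical helper names, and binarize assumes it receives a grayscale ("L" mode) Pillow image such as the one returned by preprocess_image.

```python
def otsu_threshold(histogram):
    """Pick the gray level that best separates dark and light pixels.

    `histogram` is a list of 256 pixel counts, e.g. from img.histogram()
    on a grayscale ("L" mode) Pillow image.
    """
    total = sum(histogram)
    sum_all = sum(level * count for level, count in enumerate(histogram))
    weight_dark = 0   # number of pixels at or below the candidate level
    sum_dark = 0      # sum of the gray values of those pixels
    best_level, best_variance = 0, -1.0

    for level, count in enumerate(histogram):
        weight_dark += count
        sum_dark += level * count
        weight_light = total - weight_dark
        if weight_dark == 0:
            continue          # no dark pixels yet
        if weight_light == 0:
            break             # every pixel is already in the dark class
        mean_dark = sum_dark / weight_dark
        mean_light = (sum_all - sum_dark) / weight_light
        # Between-class variance: larger means a cleaner split.
        variance = weight_dark * weight_light * (mean_dark - mean_light) ** 2
        if variance > best_variance:
            best_level, best_variance = level, variance
    return best_level


def binarize(img):
    """Return a black-and-white copy of a grayscale Pillow image."""
    threshold = otsu_threshold(img.histogram())
    return img.point(lambda p: 255 if p > threshold else 0)
```

You could then pass binarize(preprocess_image("example.png")) to pytesseract.image_to_string and compare the output with the unbinarized version. Thresholding helps on some images and hurts on others, so it is worth checking both.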

🗂 Bonus: Batch Process a Folder of Images

import os

def batch_ocr(folder):
    for file in sorted(os.listdir(folder)):
        # Compare in lowercase so files like SCAN.PNG are not skipped
        if file.lower().endswith((".png", ".jpg", ".jpeg")):
            img = preprocess_image(os.path.join(folder, file))
            text = pytesseract.image_to_string(img)
            print(f"\n--- {file} ---\n{text}")
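
Printing is fine for a quick look, but for a whole folder you will usually want the results on disk. Here is a sketch, assuming you have collected (filename, text) pairs yourself, for example by appending to a list inside the loop above instead of printing. save_results_csv and the column names are hypothetical.

```python
import csv


def save_results_csv(results, csv_path):
    """Write (filename, text) pairs to a CSV file with a header row."""
    # newline="" lets the csv module control line endings itself,
    # which matters because OCR text often contains line breaks.
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["file", "text"])
        for filename, text in results:
            writer.writerow([filename, text.strip()])


if __name__ == "__main__":
    # Made-up OCR output, only to show the call.
    save_results_csv(
        [("invoice1.png", "Total: 42.00\n"), ("note.jpg", "Line one\nLine two\n")],
        "ocr_results.csv",
    )
```

Multi-line text ends up quoted inside a single CSV cell, so spreadsheet tools and csv.reader can read it back intact.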

💡 Extra Features You Can Add

Save results to a .txt or .csv file
Use OpenCV to denoise, threshold, or rotate skewed images
Extract text within bounding boxes (pytesseract.image_to_data)
Build a GUI with Tkinter to select and scan images interactively
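
As a starting point for the bounding-box idea, the sketch below filters the dictionary that pytesseract.image_to_data returns when called with output_type=pytesseract.Output.DICT. The helper name confident_words and the cutoff of 60 are assumptions for illustration, and the sample dictionary is made up to show the shape of the data, not real OCR output.

```python
def confident_words(data, min_conf=60.0):
    """Return (text, left, top, width, height) for words above min_conf.

    `data` is the dict from
    pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT).
    """
    words = []
    for i, text in enumerate(data["text"]):
        conf = float(data["conf"][i])  # may arrive as str, int, or float
        if text.strip() and conf >= min_conf:
            words.append((text, data["left"][i], data["top"][i],
                          data["width"][i], data["height"][i]))
    return words


# Made-up sample in the same shape; non-word rows have conf -1 and empty text.
sample = {
    "text": ["", "Invoice", "#1O23"],
    "conf": ["-1", "96", "41"],
    "left": [0, 10, 120],
    "top": [0, 12, 12],
    "width": [300, 100, 60],
    "height": [50, 20, 20],
}
print(confident_words(sample))  # [('Invoice', 10, 12, 100, 20)]
```

Dropping low-confidence words like this is a simple way to keep obvious misreads out of downstream data, at the cost of occasionally losing a correct word.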

🧵 Wrapping Up

OCR in Python is simpler than you might expect — with just a few lines of code, you can turn image files into usable text data.

This is great for:

Invoice scanning
Document digitization
Screenshot logging
Passport/ID processing
Building AI datasets

Extract Text from Images in Python Using Tesseract OCR