🧠 Build a No-Code OCR App Using OCI Generative AI and Streamlit

Mukund MuraliMukund Murali
3 min read

👋 Introduction If you're a developer, cloud architect, or AI enthusiast who liked llama OCR, this post is for you. In this tutorial, you'll build a simple, llama OCR (Optical Character Recognition)** web app that:

  • Uses OCI Generative AI’s Vision LLMs for meta

  • Extracts structured text from images (like receipts, scanned forms)

  • Runs locally on your machine with Streamlit

  • Doesn’t require any front-end coding

🎯 Who Is This For?

This app is ideal for:

  • Oracle Cloud (OCI) users exploring GenAI Vision models

  • Developers who want to prototype document AI tools

  • Data teams that frequently handle scanned invoices, receipts, or forms

  • Anyone looking to integrate LLM + OCR in a no-code UI

⚙️ What You'll Build

A web UI that allows you to:

  • Upload an image (receipt, invoice, screenshot) in the application.

  • Get the extracted Markdown output from the image using LLM

  • View and copy the structured text

🏛️ Why OCI Gen AI service?

While there are many cloud providers offering Generative AI services, here’s why Oracle Cloud Infrastructure (OCI) stands out for this project:

🔒 Enterprise-Grade Security

Oracle offers built-in data residency, encryption, and compartment isolation, making it ideal for industries handling sensitive scanned documents like finance, healthcare, or government.

💰 Cost-Effective GenAI

OCI’s flexible pricing and pay-as-you-go options for Generative AI inference make it more affordable than comparable solutions on other hyperscalers.

⚙️ Native LLM Support

OCI hosts open-source LLMs like Meta LLaMA 3, Mistral, and Command R directly in your region. This means lower latency and no external API keys or SaaS contracts needed.

🧱 Full Stack Integration

From Object Storage to Data Science, Monitoring, and Vault, everything plugs in natively. You can integrate OCR output with downstream analytics, automation, or data pipelines with ease.

🔧 Developer Friendly

  • One config file (~/.oci/config)

  • No API gateway hoops

  • Full Python SDK and CLI support

  • IAM-based access control If you’re already on OCI, this setup takes you from raw image to structured text in minutes.

🛠 Prerequisites

  • ✅ ~OCI CLI configured~ (~/.oci/config)

  • ✅ Access to OCI Generative AI Service in of the regions mentioned below

  • ✅ A Vision-capable model deployed (like meta.llama-3.2-90b-vision-instruct)

  • ✅ Python 3.8+

  • ✅ Install required Python packages

Regions with Generative AI

Region NameLocationRegion IdentifierRegion Key
Brazil East (Sao Paulo)Sao Paulosa-saopaulo-1GRU
Germany Central (Frankfurt)Frankfurteu-frankfurt-1FRA
Japan Central (Osaka)Osakaap-osaka-1KIX
UAE East (Dubai)Dubaime-dubai-1DXB
UK South (London)Londonuk-london-1LHR
US Midwest (Chicago)Chicagous-chicago-1ORD

💻 Setting Up a Virtual Environment (Windows & macOS/Linux)

Creating a virtual environment helps isolate dependencies and ensures your Streamlit OCR app doesn’t interfere with other Python projects on your system.

For Windows:

Open Command Prompt (cmd) or PowerShell:

#Navigate to your project folder
cd path\\to\\your\\project

#Create a virtual environment
python -m venv venv

#Activate the virtual environment
venv\\Scripts\\activate

#Install dependencies
pip install streamlit oci

For macOS/Linux:

Open Terminal:

#Navigate to your project directory
cd ~/path/to/your/project

#Create a virtual environment
python3 -m venv venv

# Activate the virtual environment
source venv/bin/activate

# Install dependencies
pip install streamlit oci

Once you're done, launch the app with:

streamlit run ocr_vision_app.py

Find the full code in the github : mukundmurali-mm/llama-ocr-oci: Use llama OCR for extracting texts using OCI Gen AI services

0
Subscribe to my newsletter

Read articles from Mukund Murali directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Mukund Murali
Mukund Murali