Enhance Your Summarization: Multi-Input App Built with Hugging Face API and Gradio

Introduction
Reading long documents, blogs, or articles can be time-consuming. What if you could get the gist of any content, from a textbook paragraph to a web page, in just a few seconds?
Thanks to Hugging Face's Inference API, combined with Gradio, summarizing large text inputs is easier than ever. Here, we will build a clean, simple web app where you can:
Type or paste raw text
Upload a .txt file
Enter a webpage URL
All with a single goal: generate a concise summary instantly.
What Is Text Summarization?
Text summarization is the process of automatically generating a shorter version of a longer text while retaining its essential meaning.
There are two main types:
Extractive: Picks and rearranges key sentences.
Abstractive: Generates new, shorter phrasing, much as a human would when summarizing.
This app uses abstractive summarization through the Hugging Face model facebook/bart-large-cnn.
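To make the contrast concrete, here is a toy extractive summarizer. It is only a sketch for illustration, not what this app or BART does: it scores each sentence by the average frequency of its words and returns the top ones unchanged. The function name extractive_summary and the scoring rule are assumptions made for this example.

```python
import re
from collections import Counter


def extractive_summary(text, num_sentences=1):
    """Return the highest-scoring sentences, kept in their original order."""
    # Naive sentence split on ., ! or ? followed by whitespace
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Word frequencies over the whole text
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])  # restore document order
    return " ".join(sentences[i] for i in keep)


if __name__ == "__main__":
    sample = "Cats sleep a lot. Cats purr when cats are happy. Dogs bark."
    print(extractive_summary(sample))
```

Every sentence in the output is copied verbatim from the input. An abstractive model such as BART is free to write wording that never appears in the source.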
Why Hugging Face + Gradio?
Hugging Face Inference API gives access to powerful pre-trained models, with no training or hosting needed.
Gradio lets you build interfaces with just a few lines of Python code.
You can deploy the entire app in Hugging Face Spaces and get a public URL to share.
Useful for:
Summarizing notes or articles
Extracting key points from papers
Integrating NLP into projects
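The working code later in this post loads the model locally through the transformers pipeline. If you would rather call a hosted model, a minimal sketch could look like the one below. It assumes the huggingface_hub package is installed, that an access token is configured, and that facebook/bart-large-cnn is currently served by the hosted service, which may not always be the case. The helper name summarize_remote is hypothetical.

```python
def summarize_remote(text, client=None, model="facebook/bart-large-cnn"):
    """Summarize text with a hosted model instead of a local pipeline."""
    if client is None:
        # Imported lazily so this module loads even without huggingface_hub
        from huggingface_hub import InferenceClient

        client = InferenceClient()  # uses a locally configured token, if any
    result = client.summarization(text, model=model)
    # Depending on the library version, the result is either a plain string
    # or an object that carries the text in .summary_text
    return getattr(result, "summary_text", result)
```

Passing the client in as a parameter keeps the function easy to test with a stand-in object, and lets you reuse one client across many calls.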
What is BART?
BART (Bidirectional and Auto-Regressive Transformers) is a sequence-to-sequence model introduced by Facebook AI. It combines ideas from two popular architectures:
BERT: a bidirectional encoder, good at understanding text by reading it in both directions
GPT: an autoregressive decoder, good at generating text by predicting the next word in a sequence
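How Does BART Work?
BART is pre-trained as a denoising autoencoder: the input text is deliberately corrupted (for example, by masking spans of words or shuffling sentences) and the model learns to reconstruct the original, similar to solving a jumbled puzzle.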
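The "damage" step can be pictured with two toy functions. This is only an illustration of the idea, not BART's actual preprocessing code: the helper names mask_span and shuffle_sentences are made up for this example, and the real pre-training works on subword tokens with randomly sampled span lengths.

```python
import random


def mask_span(tokens, start, length, mask_token="<mask>"):
    """Replace tokens[start:start + length] with ONE mask token (text infilling)."""
    return tokens[:start] + [mask_token] + tokens[start + length:]


def shuffle_sentences(sentences, seed=0):
    """Return the sentences in a random order (sentence permutation)."""
    rng = random.Random(seed)
    shuffled = sentences[:]  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    return shuffled


if __name__ == "__main__":
    tokens = "the cat sat on the mat".split()
    print(mask_span(tokens, start=1, length=2))
    print(shuffle_sentences(["First.", "Second.", "Third."]))
```

During pre-training the model sees the corrupted version as input and is trained to output the original text. Because a whole span collapses into a single mask, the model also has to work out how many words are missing.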
Why BART?
Because BART is trained to reconstruct meaningful text from noisy input, it excels at summarization tasks. It can:
Extract the core idea of a paragraph
Rewrite it fluently
Keep the original meaning intact
Summarize from Any Input
Our app lets users choose from three input types:
Typed Text: Paste any paragraph
.txt File Upload: Summarize the content of an uploaded .txt file
Webpage URL: Enter any article/blog URL
Working Code
import gradio as gr
import requests
from bs4 import BeautifulSoup
from transformers import pipeline

# Load the summarization pipeline (the model is downloaded on first run)
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")


# Extract the text of a webpage; returns (text, error_message)
def fetch_url_text(url):
    try:
        headers_req = {"User-Agent": "Mozilla/5.0"}
        response = requests.get(url, headers=headers_req, timeout=10)
        response.raise_for_status()
        soup = BeautifulSoup(response.text, "html.parser")
        text = soup.get_text(separator=" ", strip=True)
        text = " ".join(text.split())  # collapse runs of whitespace
        if len(text) < 100:
            return None, "Extracted text from the webpage is too short to summarize."
        return text, None
    except Exception as e:
        return None, f"URL error: {e}"


# Summarization function: the file takes priority, then the URL, then the text box
def summarize_text(text_input, file_upload, url_input):
    text = ""
    if file_upload:
        try:
            # Depending on the Gradio version, the upload arrives either as a
            # file path (str) or as a temp-file object with a .name attribute
            path = file_upload if isinstance(file_upload, str) else file_upload.name
            with open(path, "r", encoding="utf-8") as f:
                text = f.read()
        except Exception as e:
            return f"File read error: {e}"
    elif url_input:
        text, error_msg = fetch_url_text(url_input)
        if error_msg:
            return error_msg
    elif text_input:
        text = text_input
    else:
        return "Please provide some input."
    try:
        # Only the first 1024 characters are summarized
        summary = summarizer(text[:1024], max_length=150, min_length=30, do_sample=False)
        return summary[0]["summary_text"]
    except Exception as e:
        return f"Summarization error: {e}"


# Gradio Interface
demo = gr.Interface(
    fn=summarize_text,
    inputs=[
        gr.Textbox(label="Enter Text", lines=4, placeholder="Paste or type text here..."),
        gr.File(label="Upload a .txt File", file_types=[".txt"]),
        gr.Textbox(label="Enter Webpage URL", placeholder="https://example.com/article"),
    ],
    outputs="text",
    title="Multi-Input Text Summarizer",
    description="Summarize content from text, uploaded files, or web URLs using the BART model.",
)
demo.launch()
Code Explanation
Imports
gradio: For building the UI
BeautifulSoup & requests: For extracting text from webpages
pipeline from transformers: To load the summarization model
Summarization Pipeline
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
Loads the pre-trained BART model fine-tuned on CNN/DailyMail news articles, which makes it well suited to summarizing news-like text.
Webpage Text Extraction
fetch_url_text(url)
Sends an HTTP request to the webpage
Uses BeautifulSoup to extract all visible text
Cleans up extra whitespace
Returns an error if the extracted text is too short (under 100 characters)
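Before sending the HTTP request, it can help to reject input that is clearly not a web address, so the user gets a friendly message instead of a raw exception. A minimal sketch using only the standard library follows. The helper name is_fetchable_url is hypothetical and not part of the app above.

```python
from urllib.parse import urlparse


def is_fetchable_url(url):
    """Return True if url looks like an absolute http(s) URL."""
    parsed = urlparse(url.strip())
    # Require an explicit http/https scheme and a host part
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


if __name__ == "__main__":
    for candidate in ["https://example.com/article", "example.com/article", "ftp://example.com"]:
        print(candidate, is_fetchable_url(candidate))
```

One way to use it would be at the top of fetch_url_text, returning an error message early when the check fails. Restricting the scheme also keeps things like file:// paths out of the fetcher.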
Summarization Function
summarize_text(text_input, file_upload, url_input)
Determines the input type, checking the file upload first, then the URL, then the text box
Extracts text accordingly
Passes the first 1024 characters of the text to the summarizer
Returns the generated summary
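Slicing with text[:1024] means everything after the first 1024 characters is ignored. One possible way around this, sketched below under my own assumptions, is to split the text into sentence-aligned chunks, summarize each chunk, and join the partial summaries. The names chunk_text and summarize_long are hypothetical, and summarize_fn stands for any function that maps a string to its summary.

```python
import re


def chunk_text(text, max_chars=1024):
    """Split text into chunks of at most max_chars, breaking at sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-split
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def summarize_long(text, summarize_fn, max_chars=1024):
    """Summarize each chunk with summarize_fn and join the partial summaries."""
    return " ".join(summarize_fn(chunk) for chunk in chunk_text(text, max_chars))
```

With the pipeline from this post, summarize_fn could be something like lambda c: summarizer(c, max_length=150, min_length=30, do_sample=False)[0]["summary_text"]. Keep in mind that the model's real limit is counted in tokens rather than characters, so the 1024-character figure is a conservative stand-in.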
Gradio Interface
gr.Interface(...)
Creates an app with three inputs and a single text output
Launches a web app where users can try the summarizer easily
Supported Inputs
Input Type | Details / Limitations |
Text Box | Up to 1024 characters summarized per input |
File Upload | Only .txt files supported, UTF-8 encoded |
Web URL | Must be a clean, HTML-readable webpage with enough content |
Limitations
If no Torch backend is available, the pipeline won't run (use Spaces with PyTorch or Colab)
URLs with dynamic content (like JavaScript-based pages) may fail
The summarizer was fine-tuned on English news articles, so it performs poorly on other languages
How to Deploy on Hugging Face Spaces
Create a new Gradio Space
Upload app.py with the above code
Add a requirements.txt listing gradio, transformers, torch, requests, and beautifulsoup4 (one package per line)
Set the hardware to GPU if available for faster results
Click Commit and Deploy
Try out the sample app at:
https://huggingface.co/spaces/divivetri/text_summarization
Sample Code for JavaScript-rendered Webpages
Many websites today render their content with JavaScript after the page loads. Standard Python tools like requests + BeautifulSoup only see the raw HTML and often miss key content. To fix this, we use Selenium, a browser automation tool, to drive a headless Chrome browser that fully loads JavaScript-powered pages so we can extract the visible text, making our summarization app much more powerful and real-world ready.
If you want to try summarization on JS-rendered webpages, try the code below. It is intended for a GPU environment, so the free Spaces tier will not be of much help. Run it in Colab and choose a GPU runtime.
TIP: Run the code as separate cells for easier execution.
!pip install gradio transformers torch selenium beautifulsoup4
!apt-get update
!apt install -y chromium-chromedriver
!cp /usr/lib/chromium-browser/chromedriver /usr/bin

import sys
sys.path.insert(0, '/usr/lib/chromium-browser/chromedriver')

import time

import gradio as gr
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from transformers import pipeline

# Load BART summarization pipeline
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")


# Extract text from JS-enabled webpages; returns (text, error_message)
def fetch_url_text(url):
    driver = None
    try:
        chrome_options = Options()
        chrome_options.add_argument("--headless")
        chrome_options.add_argument("--disable-dev-shm-usage")
        chrome_options.add_argument("--no-sandbox")
        driver = webdriver.Chrome(options=chrome_options)
        driver.get(url)
        time.sleep(5)  # Wait for JS to load
        html = driver.page_source
        soup = BeautifulSoup(html, "html.parser")
        text = soup.get_text(separator=" ", strip=True)
        text = " ".join(text.split())  # collapse runs of whitespace
        if len(text) < 100:
            return None, "Extracted text is too short to summarize."
        return text, None
    except Exception as e:
        return None, f"URL fetch error: {e}"
    finally:
        # Always close the browser, even if loading the page failed
        if driver is not None:
            driver.quit()


# Main function to handle all inputs: file first, then URL, then text box
def summarize_text(text_input, file_upload, url_input):
    text = ""
    if file_upload:
        try:
            # Depending on the Gradio version, the upload arrives either as a
            # file path (str) or as a temp-file object with a .name attribute
            path = file_upload if isinstance(file_upload, str) else file_upload.name
            with open(path, "r", encoding="utf-8") as f:
                text = f.read()
        except Exception as e:
            return f"File read error: {e}"
    elif url_input:
        text, error_msg = fetch_url_text(url_input)
        if error_msg:
            return error_msg
    elif text_input:
        text = text_input
    else:
        return "Please provide some input."
    try:
        # Only the first 1024 characters are summarized
        summary = summarizer(text[:1024], max_length=150, min_length=30, do_sample=False)
        return summary[0]["summary_text"]
    except Exception as e:
        return f"Summarization error: {e}"


demo = gr.Interface(
    fn=summarize_text,
    inputs=[
        gr.Textbox(label="Enter Text", lines=4, placeholder="Paste or type text here..."),
        gr.File(label="Upload a .txt File", file_types=[".txt"]),
        gr.Textbox(label="Enter Webpage URL", placeholder="https://example.com/article"),
    ],
    outputs="text",
    title="Smart Text Summarizer with JS Page Support",
    description="Summarize content from text, files, or JavaScript-rendered webpages using Hugging Face's BART model.",
)
demo.launch(share=True)
Code Explanation
Selenium for Full Webpage Rendering
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
Configures a headless Chrome browser
Loads the page like a real browser would, executing JavaScript
Dynamic Content Extraction
driver.get(url)
html = driver.page_source
soup = BeautifulSoup(html, "html.parser")
Loads the full page source after JavaScript execution
BeautifulSoup extracts readable text from the fully rendered HTML
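The fixed time.sleep(5) in the code above wastes time on fast pages and may be too short for slow ones. One alternative, sketched here as an assumption rather than as part of the original app, is to poll the page source until it stops changing. The helper name wait_until_stable is hypothetical. Selenium also ships its own WebDriverWait class for waiting on specific elements, which is usually the better choice when you know what to look for.

```python
import time


def wait_until_stable(get_html, timeout=15.0, poll=0.5):
    """Poll get_html() until two consecutive reads match or timeout expires.

    Returns the last HTML that was read in either case.
    """
    deadline = time.monotonic() + timeout
    previous = get_html()
    while time.monotonic() < deadline:
        time.sleep(poll)
        current = get_html()
        if current == previous:
            return current  # nothing changed between two polls
        previous = current
    return previous  # timed out; return the most recent read
```

In fetch_url_text, this could replace the sleep and the page_source read with html = wait_until_stable(lambda: driver.page_source). Pages that update continuously (tickers, ads) never settle, so the timeout acts as the upper bound.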
Summarization Remains the Same
summary = summarizer(text[:1024], max_length=150, min_length=30)
Uses the Hugging Face BART model for summarizing the text
Gradio Interface
Allows the user to paste a URL, enter text manually, or upload a file
Results are shown instantly in the browser
References
BART Model for Summarization
Lewis, M., Liu, Y., Goyal, N., et al. (2019). BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. arXiv preprint arXiv:1910.13461.
Hugging Face Model Page: https://huggingface.co/facebook/bart-large-cnn
Transformers Library (Hugging Face)
Documentation: https://huggingface.co/docs/transformers/index
Selenium for Python
Official Documentation: https://www.selenium.dev/documentation/webdriver/
PyPI Package: https://pypi.org/project/selenium/
ChromeDriver Setup: https://chromedriver.chromium.org/
BeautifulSoup (bs4)
Documentation: https://www.crummy.com/software/BeautifulSoup/bs4/doc/
Gradio: Build ML Apps Quickly
Website: https://gradio.app
GitHub Repo: https://github.com/gradio-app/gradio
Wrap-Up
You've just built a smart summarizer that works with text, files, and even web pages, all without training a model! Thanks to BART and Gradio, turning long content into clear, concise summaries is now just a click away.
Try it out, explore its limits, and let AI do the reading for you! Build it. Run it. Summarize it!
Written by

Divya Vetriveeran
I am currently serving as an Assistant Professor at CHRIST (Deemed to be University), Bangalore. With a Ph.D. in Information and Communication Engineering from Anna University and ongoing post-doctoral research at the Singapore Institute of Technology, my expertise lies in Ethical AI, Edge Computing, and innovative teaching methodologies. I have published extensively in reputed international journals and conferences, hold multiple patents, and actively contribute as a reviewer for leading journals, including IEEE and Springer. A UGC-NET qualified educator with a computer science background, I am committed to fostering impactful research and technological innovation for societal good.