Building Nadian: AI-Powered Newsletter for Automated News Summarization

Life Journey

🔗 Nadian Newsletter


Planning Intent

1. Why did I create this service?

I enjoy reading news articles, but I found that domestic (South Korean) news sources often cover a limited range of topics and filter information, so important details go missing.

Because of this, I started reading multiple foreign news sources, but I encountered the following issues:

Problems I faced:

  1. Reading articles in English takes too much time
  2. Even after reading, there's too much additional information, making it difficult to grasp the key points
  3. Sometimes I only realized after reading that the topic wasn't relevant to my interests

→ It takes too long to extract the essential information.


2. The Solution

To solve these issues, I asked myself the following questions:

  • Would it be possible to automatically translate foreign news into Korean?
  • Can I summarize the core points while still keeping important background information?
  • Could I automatically filter and collect only the news in my areas of interest?

3. Final Goal

Based on these thoughts, I designed the following service:

  1. Collecting articles from foreign news sources in my field of interest (Technology)
  2. Summarizing key points and translating them into Korean
  3. Adding important background information to help understand the core content
  4. Delivering the summarized news as a newsletter every day at a scheduled time

I believed that other people might also face similar challenges, so I decided to develop this as a service.


Development Process

1. Choosing the Execution Environment

First, I considered where to run the logic:

  • Should I run it in a local environment?
  • Should I run it on a cloud server?

Conclusion: I chose a cloud server because the system needs to execute at a fixed time every day, regardless of whether my local environment is active.


2. Choosing a Cloud Server

After deciding on cloud deployment, I had to consider:

  • Which cloud server should I use?
  • What type of CPU and storage is needed?

Choice: Amazon EC2

Why EC2?

  1. The logic is not complex → High-performance CPUs are unnecessary (mainly text processing tasks)
  2. Minimal storage required → Only subscriber information and article data need to be stored
  3. Free Tier availability → Low-cost instances like t2.micro and t3.micro are available

What is the Free Tier?

  • New AWS accounts get Free Tier benefits for 12 months
  • EC2 includes 750 hours per month of t2.micro usage for free
    • That is enough to run one instance 24/7 for an entire month
  • EBS storage is also free up to 30 GB
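
As a quick check of the 750-hour figure, the arithmetic below shows why one always-on instance fits inside the allowance even in a 31-day month, while two do not. The constant and function names are my own, chosen only for illustration.

```python
# Free Tier allowance quoted above: 750 instance-hours per month.
FREE_TIER_HOURS_PER_MONTH = 750


def hours_in_month(days: int) -> int:
    """Hours a single instance accrues when it runs 24/7 for `days` days."""
    return days * 24


def fits_free_tier(instances: int, days: int = 31) -> bool:
    """True if `instances` always-on instances stay within the allowance."""
    return instances * hours_in_month(days) <= FREE_TIER_HOURS_PER_MONTH


print(hours_in_month(31))  # 744
print(fits_free_tier(1))   # True
print(fits_free_tier(2))   # False
```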

3. Retrieving Article Data

There were two possible methods:

  1. Using an API
  2. Web Scraping

💡 Web scraping is often prohibited by websites, so I decided to look for news sources that provide APIs.

Checking which news sources provide APIs

News Outlet | API Support
The New York Times | ❌ Only provides headlines & summaries (no full text)
The Guardian | ✅ Full article text available via API

Conclusion: I decided to use The Guardian's API to fetch Technology section articles.

  • To ensure up-to-date news, I retrieve 10 articles daily
  • Since some users may not read the newsletter every day, I collect extra articles
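
A minimal sketch of that fetch step, assuming the Guardian Content API search endpoint with the `section`, `order-by`, `page-size`, `show-fields`, and `api-key` query parameters. The function names and the choice of `bodyText` as the requested field are my assumptions, not necessarily what Nadian uses.

```python
import json
import urllib.parse
import urllib.request

GUARDIAN_SEARCH_URL = "https://content.guardianapis.com/search"


def build_search_url(api_key: str, page_size: int = 10) -> str:
    """Build a Content API query for the newest Technology articles."""
    params = {
        "section": "technology",
        "order-by": "newest",
        "page-size": page_size,
        "show-fields": "bodyText",  # ask for the plain-text article body
        "api-key": api_key,
    }
    return GUARDIAN_SEARCH_URL + "?" + urllib.parse.urlencode(params)


def parse_articles(payload: dict) -> list:
    """Keep only the fields the newsletter needs from an API response."""
    articles = []
    for item in payload.get("response", {}).get("results", []):
        articles.append({
            "title": item.get("webTitle", ""),
            "url": item.get("webUrl", ""),
            "published": item.get("webPublicationDate", ""),
            "body": item.get("fields", {}).get("bodyText", ""),
        })
    return articles


def fetch_articles(api_key: str, page_size: int = 10) -> list:
    """Perform the HTTP request (needs network access and a valid key)."""
    url = build_search_url(api_key, page_size)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return parse_articles(json.load(resp))
```

Keeping URL building and response parsing separate from the network call makes both easy to test without spending API quota.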

4. Article Summarization & Translation

Target Output

  1. Explain key concepts
  2. Provide a well-structured summary
  3. Include both the original article & Korean translation

Tech Stack Options Tested

  1. Microsoft Azure

    • Summarization: โŒ Extractive summarization โ†’ poor readability
    • Translation: โŒ Not bad, but not satisfying enough
  2. Hugging Face AI Models

    • Summarization: ✅ facebook/bart-large-cnn performed best
    • Translation: ✅ High-quality results, but RAM requirements were an issue
    • Problem: EC2 t2.micro only has 1 GB of RAM, but the model requires at least 4 GB
  3. Gemini API (Final Choice)

    • ✅ Excellent summarization performance
    • ✅ Good translation quality
    • ✅ Free API usage quota is sufficient
    • → Final decision: Google Gemini 1.5 Flash Model
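
The summarize-and-translate call could look roughly like the sketch below. It assumes the Gemini REST `generateContent` endpoint and uses only the standard library. The prompt wording and function names are hypothetical, and the model name is the one named above, which may need replacing if it is no longer served.

```python
import json
import urllib.request

GEMINI_URL = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "gemini-1.5-flash:generateContent"
)


def build_prompt(title: str, body: str) -> str:
    """Hypothetical prompt covering the target outputs listed above."""
    return (
        "You are writing for a Korean technology newsletter.\n"
        "For the article below:\n"
        "1. Briefly explain the key concepts a reader needs.\n"
        "2. Write a well-structured summary of the main points.\n"
        "3. Give the summary in English and in Korean.\n\n"
        f"Title: {title}\n\nArticle:\n{body}"
    )


def build_request_body(prompt: str) -> dict:
    """Wrap the prompt in the JSON shape that generateContent expects."""
    return {"contents": [{"parts": [{"text": prompt}]}]}


def extract_text(response: dict) -> str:
    """Join the text parts of the first candidate in the response."""
    parts = response["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts)


def summarize(api_key: str, title: str, body: str) -> str:
    """Send one article to the API (needs network access and a valid key)."""
    data = json.dumps(build_request_body(build_prompt(title, body)))
    request = urllib.request.Request(
        GEMINI_URL,
        data=data.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,
        },
    )
    with urllib.request.urlopen(request, timeout=60) as resp:
        return extract_text(json.load(resp))
```

Because the model runs on Google's side, this approach avoids the memory problem the Hugging Face models hit on a 1 GB instance.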

5. Deploying on EC2

  • Runs in a Python 3.9 virtual environment
  • Fetches articles from The Guardian API
  • Summarizes & translates using Gemini API
  • Scheduled execution on EC2 via cron jobs
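
One detail worth checking with cron on EC2 is the time zone. Assuming the instance clock is set to UTC (a common default) and a hypothetical 07:00 KST send time, the helper below derives the cron time fields. The script and virtual-environment paths are placeholders, not the real Nadian layout.

```python
from datetime import datetime, timedelta, timezone

KST = timezone(timedelta(hours=9))  # Korea Standard Time, no daylight saving


def cron_schedule_utc(hour_kst: int, minute: int = 0) -> str:
    """Return the five cron time fields, in UTC, for a daily KST send time."""
    # The calendar date is arbitrary; only the hour and minute matter.
    local = datetime(2024, 1, 1, hour_kst, minute, tzinfo=KST)
    utc = local.astimezone(timezone.utc)
    return f"{utc.minute} {utc.hour} * * *"


def crontab_line(hour_kst: int, minute: int, command: str) -> str:
    """Combine the schedule with the command cron should run."""
    return f"{cron_schedule_utc(hour_kst, minute)} {command}"


# Hypothetical paths: adjust to wherever the virtual environment lives.
COMMAND = "/home/ec2-user/nadian/venv/bin/python /home/ec2-user/nadian/main.py"

print(crontab_line(7, 0, COMMAND))
# 0 22 * * * /home/ec2-user/nadian/venv/bin/python /home/ec2-user/nadian/main.py
```

Calling the interpreter inside the virtual environment directly means the cron job does not need to activate the environment first.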

6. Purchasing a Domain

  • To send emails that pass authentication checks, I needed to set up SPF, DKIM, and DMARC, which requires publishing DNS records on a domain I control
  • After comparing domain providers, I purchased lifejourney.dev from Dynadot
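
For orientation, the sketch below shows where each of the three records is published and how to recognize it by its version tag. The domain, selector, and record values are placeholders. The real values come from the email provider, and some providers have you publish DKIM as a CNAME rather than a TXT record.

```python
def expected_record_name(kind: str, domain: str, selector: str = "mail") -> str:
    """DNS name under which each authentication record is published."""
    if kind == "spf":
        return domain
    if kind == "dkim":
        return f"{selector}._domainkey.{domain}"
    if kind == "dmarc":
        return f"_dmarc.{domain}"
    raise ValueError(f"unknown record kind: {kind}")


def classify_txt(value: str) -> str:
    """Identify a TXT record by the version tag it starts with."""
    cleaned = value.strip().strip('"').lower()
    if cleaned.startswith("v=spf1"):
        return "spf"
    if cleaned.startswith("v=dkim1"):
        return "dkim"
    if cleaned.startswith("v=dmarc1"):
        return "dmarc"
    return "other"


# Placeholder values for illustration only.
EXAMPLE_RECORDS = [
    "v=spf1 include:spf.example-provider.test ~all",
    "v=DKIM1; k=rsa; p=PUBLIC_KEY_FROM_PROVIDER",
    "v=DMARC1; p=none; rua=mailto:dmarc@example.com",
]

for record in EXAMPLE_RECORDS:
    kind = classify_txt(record)
    print(kind, "->", expected_record_name(kind, "example.com"))
```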

7. Choosing an Email Service

  • Options: Amazon SES, Brevo, Mailchimp
  • Amazon SES is cost-effective, but new accounts need approval from the AWS Trust & Safety team before they can send production email
  • Chose Brevo, which offers 300 free emails per day
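
Sending one newsletter per subscriber means the free quota caps a daily run at 300 recipients. A minimal sketch of that constraint, assuming Brevo's transactional email endpoint (`POST /v3/smtp/email` with an `api-key` header); the helper names and the idea of deferring overflow recipients are my own assumptions.

```python
import json
import urllib.request

BREVO_SEND_URL = "https://api.brevo.com/v3/smtp/email"
FREE_DAILY_LIMIT = 300  # free-plan quota mentioned above


def build_email(sender_email: str, sender_name: str, recipient: str,
                subject: str, html: str) -> dict:
    """Assemble the JSON payload for a single transactional email."""
    return {
        "sender": {"name": sender_name, "email": sender_email},
        "to": [{"email": recipient}],
        "subject": subject,
        "htmlContent": html,
    }


def plan_sends(subscribers: list, limit: int = FREE_DAILY_LIMIT) -> tuple:
    """Split subscribers into today's batch and those over the quota."""
    return subscribers[:limit], subscribers[limit:]


def send_email(api_key: str, payload: dict) -> int:
    """POST one email (needs network access and a valid key)."""
    request = urllib.request.Request(
        BREVO_SEND_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "api-key": api_key,
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return resp.status
```

With this split, a subscriber list longer than 300 would need either a paid plan or a second sending day, which is worth knowing before the list grows.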

📩 Want to try it? Subscribe for free:
🔗 https://nadian-newsletter.lifejourney.dev 🚀
