Getting Started with Sentiment Analysis in Python
Hey everyone! I wanted to share this project I recently worked on—it’s all about sentiment analysis using Python. If you've ever been curious about how people feel when they post on social media or wanted to dig into public opinion on a topic, then sentiment analysis is the perfect way to start. For my project, I decided to pull real-time data from Twitter, clean it up, and then analyze the sentiment to see if the tweets were positive, negative, or somewhere in between.
Let me walk you through my journey!
What Inspired This Project?
I’ve always found it fascinating how much you can learn from the way people express themselves online. So, I thought, why not build a tool that helps me uncover the hidden emotions behind those posts? Plus, it was a great way to implement my skills in Python and play around with some cool libraries like Tweepy and TextBlob.
The Game Plan
The project was broken down into a few manageable steps:
Fetch tweets using the Twitter API.
Clean the tweets to remove the extra noise (like URLs, mentions, and retweets).
Analyze the sentiment using TextBlob to understand how positive or negative the tweets were.
Display the results in a way that made sense.
Now, let’s dive into the details!
Step 1: Setting Up the Environment
First things first, I made sure my Python environment was ready. I used Python 3.x and the following libraries:
Tweepy: For fetching tweets from Twitter.
TextBlob: For the actual sentiment analysis.
dotenv: To handle environment variables securely (because, hey, we don’t want to expose our API keys!).
To get everything up and running, I ran:
```bash
pip install tweepy textblob python-dotenv
```
Step 2: Getting Access to the Twitter API
This part was a little bit of an adventure! I signed up for a Twitter Developer account, created an app, and grabbed my API credentials (API key, secret key, access token, etc.).
I didn’t want my credentials floating around in my code, so I put them in a .env file. It looked something like this:
```
TWITTER_API_KEY=your_api_key
TWITTER_API_SECRET_KEY=your_api_secret_key
TWITTER_ACCESS_TOKEN=your_access_token
TWITTER_ACCESS_TOKEN_SECRET=your_access_token_secret
```
Then, I used the dotenv library to load these variables securely in my script (all four credentials, since the access token and its secret are needed later for authentication):

```python
from dotenv import load_dotenv
import os

load_dotenv()

api_key = os.getenv("TWITTER_API_KEY")
api_secret_key = os.getenv("TWITTER_API_SECRET_KEY")
access_token = os.getenv("TWITTER_ACCESS_TOKEN")
access_token_secret = os.getenv("TWITTER_ACCESS_TOKEN_SECRET")
```
Step 3: Fetching Tweets with Tweepy
With the credentials in place, I set up Tweepy to fetch tweets based on a keyword. Here’s a quick look at the code:
```python
import tweepy

# Authentication
auth = tweepy.OAuthHandler(api_key, api_secret_key)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

# Fetching tweets
def fetch_tweets(query, count=10):
    tweets = api.search_tweets(q=query, lang="en", count=count)
    return [tweet.text for tweet in tweets]

# Example call
tweets = fetch_tweets("Python")
```
Fetching the tweets felt like opening a treasure chest—you never know what gems (or wild takes) you'll find!
Step 4: Cleaning the Data
I knew that raw tweets can be a mess with all the hashtags, mentions, and links, so I wrote a function to tidy them up. This function removed:

- URLs
- User mentions (e.g., @username)
- Retweet markers (RT)
- Excess whitespace
```python
import re

def clean_tweet(tweet):
    tweet = re.sub(r"http\S+", "", tweet)       # Remove URLs
    tweet = re.sub(r"@\w+", "", tweet)          # Remove mentions
    tweet = re.sub(r"^RT\s+", "", tweet)        # Remove a leading retweet marker
    tweet = re.sub(r"\s+", " ", tweet).strip()  # Collapse extra whitespace
    return tweet
```

(I anchored the retweet pattern to the start of the tweet so it doesn't accidentally chew up "RT" inside other words.)
This made the tweets cleaner and ready for analysis.
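Here's a quick, self-contained sanity check of those cleaning rules on a made-up tweet (the sample text is mine, not real Twitter data; the regexes mirror the function above, with the RT pattern anchored to the start):

```python
import re

def clean_tweet(tweet):
    tweet = re.sub(r"http\S+", "", tweet)       # Remove URLs
    tweet = re.sub(r"@\w+", "", tweet)          # Remove mentions
    tweet = re.sub(r"^RT\s+", "", tweet)        # Remove a leading retweet marker
    tweet = re.sub(r"\s+", " ", tweet).strip()  # Collapse extra whitespace
    return tweet

raw = "RT @pydev Loving #Python for data work! https://example.com"
print(clean_tweet(raw))  # → Loving #Python for data work!
```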
Step 5: Analyzing Sentiment with TextBlob
TextBlob is like having a mini text-analysis expert in your pocket. It breaks down the sentiment into:
Polarity: A score between -1 (very negative) and 1 (very positive).
Subjectivity: A score between 0 (very factual) and 1 (very subjective).
```python
from textblob import TextBlob

def analyze_sentiment(tweet):
    analysis = TextBlob(tweet)
    return {
        "polarity": analysis.polarity,
        "subjectivity": analysis.subjectivity,
    }

# Run the analysis on the cleaned tweets
for tweet in tweets:
    cleaned = clean_tweet(tweet)
    sentiment = analyze_sentiment(cleaned)
    print(f"Tweet: {cleaned}\nSentiment: {sentiment}\n")
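TextBlob only hands back raw scores, so to bucket tweets as "positive, negative, or somewhere in between" I can threshold the polarity. A minimal sketch — the `label_sentiment` helper and the 0.05 cutoff are my own choices, not something TextBlob provides:

```python
def label_sentiment(polarity, threshold=0.05):
    """Map a TextBlob polarity score (-1.0 to 1.0) to a coarse label."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"

# Example polarity scores like those TextBlob returns
for score in (0.8, -0.4, 0.0):
    print(score, "->", label_sentiment(score))
```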
The Results
I tested it out by searching for tweets about Python (surprise, surprise!). It was so interesting to see how many positive, neutral, and sometimes even hilariously negative opinions people shared about coding!
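For step 4 of the game plan — displaying the results in a way that makes sense — one simple option is tallying the labels with `collections.Counter`. The labels list below is illustrative, not my actual Twitter results:

```python
from collections import Counter

# Hypothetical labels produced for a batch of analyzed tweets
labels = ["positive", "positive", "neutral", "negative", "positive"]

counts = Counter(labels)
total = sum(counts.values())
for label in ("positive", "neutral", "negative"):
    share = 100 * counts[label] / total
    print(f"{label:>8}: {counts[label]} ({share:.0f}%)")
```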
Final Thoughts
This project was more than just a technical exercise; it was a real-world application that showed me the power of combining Python libraries with real-time data. Working with Tweepy, TextBlob, and cleaning up data taught me so much about handling unstructured text. Plus, I realized just how valuable sentiment analysis can be in fields like marketing, customer service, and beyond.
If you’re curious about text analysis or want to understand what people are saying in real time, give this project a try! You’ll learn a lot, and who knows? You might even discover some surprising sentiments along the way.
Thanks for joining me on this little exploration of sentiment analysis. Let me know what you think or if you’ve tried something similar—I'd love to hear your stories!
I'll be back soon with sentiment analysis using advanced methods like VADER, transformers, and deep learning models. Stay tuned!
Hope you enjoyed this walkthrough! Feel free to check out the full code for this project on my GitHub here.
Written by Divya Goyal.