My First Data Science Project: Analyzing Titanic Survival Data with Python


Hi!
Welcome to my very first mini data science project blog!
In this post, I’ll walk you through how I used Python, Pandas, and Seaborn to explore the famous Titanic dataset — a classic beginner-friendly dataset used in data science learning.
🚢 What is the Titanic Dataset?
The Titanic dataset contains data about the passengers on board the RMS Titanic — a ship that tragically sank in 1912.
The dataset includes details like:
Passenger age, gender, class
Ticket fare
Whether they survived or not
Our goal is to explore the data and uncover patterns — such as:
"Who had the best chance of survival?"
📥 Step 1: Import Libraries and Load Data
We’ll use Pandas and Seaborn for data analysis and visualization, plus Matplotlib to title and display the plots.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Load Titanic dataset (seaborn downloads it on first use, so this needs an internet connection)
df = sns.load_dataset('titanic')
df.head()
🧾 Step 2: Explore the Data
Let’s check the structure and look for missing values.
df.info()
df.isnull().sum()
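Raw null counts are easier to judge as a share of all rows. Here is a minimal sketch of that idea. It uses a tiny made-up frame (`toy` is a hypothetical stand-in, not real Titanic records) so it runs without downloading anything:

```python
import numpy as np
import pandas as pd

# Hypothetical toy rows, NOT real Titanic data; only the column names match.
toy = pd.DataFrame({
    "age": [22.0, np.nan, 35.0, np.nan],
    "deck": [np.nan, "C", np.nan, np.nan],
    "survived": [0, 1, 1, 0],
})

# isnull() returns booleans, and the mean of booleans is the fraction that are True,
# so this gives the share of missing values per column, largest first.
missing_share = toy.isnull().mean().sort_values(ascending=False)
print(missing_share)
```

On the real data, the same one-liner would be `df.isnull().mean()`. A column that is mostly empty is usually a candidate for dropping, while one with a few gaps can be filled.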
Some initial observations:
Columns like age, embarked, and deck have missing values.
The survived column is our target: 1 means survived, 0 means not.
🧹 Step 3: Data Cleaning
Let’s handle some missing data and keep only useful columns.
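# Drop 'deck' (mostly missing) and two redundant columns:
# 'embark_town' repeats 'embarked', and 'alive' repeats 'survived'
df.drop(['deck', 'embark_town', 'alive'], axis=1, inplace=True)
# Fill missing age with the median (assign the result back; calling
# fillna(..., inplace=True) on a single column is unreliable in recent pandas)
df['age'] = df['age'].fillna(df['age'].median())
# Drop rows with any remaining nulls
df.dropna(inplace=True)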
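One way to sanity-check a cleaning step like this is to run it on a few rows where the answer is easy to work out by hand. The sketch below does that with a made-up frame (`toy` and `cleaned` are hypothetical names, and the rows are not real passengers). It assigns the filled column back to the frame, which behaves the same across pandas versions:

```python
import numpy as np
import pandas as pd

# Hypothetical toy rows for illustration only.
toy = pd.DataFrame({
    "age": [20.0, np.nan, 40.0, 30.0],
    "embarked": ["S", "C", None, "Q"],
    "survived": [0, 1, 1, 0],
})

# The median of the known ages (20, 30, 40) is 30, so the missing age becomes 30.
toy["age"] = toy["age"].fillna(toy["age"].median())

# Drop any row that still has a missing value (here, the row with no 'embarked').
cleaned = toy.dropna()

print(cleaned)
print("Nulls left:", cleaned.isnull().sum().sum())
```

Keep in mind that median filling puts every imputed passenger at the same age, so any later analysis by age will show a bump at that value.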
📊 Step 4: Data Visualization
Let’s find some interesting insights.
1. Survival Count
sns.countplot(x='survived', data=df)
plt.title('Survival Count')
plt.show()
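The bar chart shows counts. The same split can be read as proportions with `value_counts(normalize=True)`. Here is a small sketch on made-up values (`toy` is hypothetical, not the real dataset):

```python
import pandas as pd

# Hypothetical toy column: three non-survivors and two survivors.
toy = pd.DataFrame({"survived": [0, 0, 0, 1, 1]})

# normalize=True returns fractions of the total instead of raw counts.
survival_share = toy["survived"].value_counts(normalize=True)
print(survival_share)
```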
2. Survival by Gender
sns.countplot(x='sex', hue='survived', data=df)
plt.title('Survival by Gender')
plt.show()
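Counts can mislead when the groups differ in size, so it helps to look at the survival rate within each group as well. Because `survived` is coded 0/1, its mean per group is the survival rate. Here is a minimal sketch on made-up rows (`toy` is hypothetical, not real passenger data):

```python
import pandas as pd

# Hypothetical toy rows for illustration only.
toy = pd.DataFrame({
    "sex": ["female", "female", "female", "male", "male", "male", "male"],
    "survived": [1, 1, 0, 1, 0, 0, 0],
})

# Mean of a 0/1 column within each group = share of that group that survived.
rate_by_sex = toy.groupby("sex")["survived"].mean()
print(rate_by_sex)
```

On the real data, the equivalent line would be `df.groupby('sex')['survived'].mean()`.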
3. Survival by Class
sns.countplot(x='pclass', hue='survived', data=df)
plt.title('Survival by Passenger Class')
plt.show()
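To put numbers next to a chart like this, `pd.crosstab` with `normalize='index'` turns each class's counts into row-wise proportions. The sketch below uses made-up rows (`toy` and `class_table` are hypothetical names, and the values are not from the Titanic data):

```python
import pandas as pd

# Hypothetical toy rows for illustration only.
toy = pd.DataFrame({
    "pclass": [1, 1, 2, 2, 3, 3, 3, 3],
    "survived": [1, 1, 1, 0, 1, 0, 0, 0],
})

# normalize="index" divides each row by its own total,
# so every passenger class sums to 1 across the survived/not-survived columns.
class_table = pd.crosstab(toy["pclass"], toy["survived"], normalize="index")
print(class_table)
```

The column labelled `1` is then the survival rate for each class, which is what a claim about "higher survival rate" really refers to.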
4. Age Distribution
sns.histplot(data=df, x='age', bins=30, kde=True)
plt.title('Age Distribution of Passengers')
plt.show()
📈 Step 5: What Did I Learn?
🔍 Insights:
Far more women survived than men, even though men made up most of the passengers.
First-class passengers had a higher survival rate.
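Children appeared to have better chances, although the age histogram alone does not show this; it takes a breakdown of survival by age to check.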
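The age insight is the one the plots above do not test directly, since the histogram is not split by survival. One way to check it is to bin ages with `pd.cut` and compare survival rates per bin. Here is a minimal sketch on made-up rows. The bin edges and labels are arbitrary assumptions chosen for illustration, and `toy` is not real passenger data:

```python
import pandas as pd

# Hypothetical toy rows for illustration only.
toy = pd.DataFrame({
    "age": [4, 9, 25, 31, 47, 62],
    "survived": [1, 1, 0, 1, 0, 0],
})

# Assumed age bands: (0, 12], (12, 40], (40, 100]. Adjust to taste.
bins = [0, 12, 40, 100]
labels = ["child", "adult", "older adult"]
toy["age_group"] = pd.cut(toy["age"], bins=bins, labels=labels)

# Survival rate within each age band.
rate_by_age = toy.groupby("age_group", observed=True)["survived"].mean()
print(rate_by_age)
```

If missing ages were filled with the median beforehand, those passengers all land in one band, so the result for that band deserves some caution.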
This small project helped me understand:
How to load and clean data
How to find patterns visually
How real-world data can tell powerful stories!
🧠 What’s Next?
In my next blog, I plan to build a basic machine learning model using this Titanic dataset — to actually predict survival!
Step by step, I’ll keep growing my data science skills, and I hope you’ll follow along.
Thanks for reading 💛
Feel free to try this project yourself and share your results!
— Farsana | Data Science Intern | Python + Pandas + Curiosity 🚀
Written by

Farsana Thasnem PA
Aspiring Data Scientist | Physics Graduate | Passionate about Machine Learning, Python, and Data Storytelling. Sharing my journey, projects, and learnings in the world of data science.