🐼 Pandas for Everyone: From Basic to Advanced

1️⃣ What is Pandas & Why Use It?
Pandas is a powerful Python library used for data manipulation and analysis. It introduces two main structures:
Series: 1-dimensional labeled array (like [1,2,3] with optional labels)
DataFrame: 2-dimensional table (like a spreadsheet)
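To make the two structures concrete, here is a minimal sketch. The labels and column names (`a`, `b`, `c`, `name`, `score`) are made up for illustration only.

```python
import pandas as pd

# A Series is a 1-D array whose values carry labels (the index).
labeled = pd.Series([1, 2, 3], index=["a", "b", "c"])

# A DataFrame is a 2-D table; each column is itself a Series.
table = pd.DataFrame({"name": ["Ann", "Bob"], "score": [90, 85]})

print(labeled["b"])          # look up a value by its label
print(type(table["score"]))  # a single column comes back as a Series
```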
Why use Pandas?
Handles large, tabular data effortlessly
Performs statistics (mean, sum), grouping, merging
Built-in support for dates, missing values, CSV/Excel
Fast and intuitive – perfect for class projects
2️⃣ Installation & Import
Install with pip:
pip install pandas
Import in Python:
import pandas as pd
import numpy as np # often used together
3️⃣ Creating Pandas Objects
Series:
s = pd.Series([1, 3, 5, np.nan, 6, 8])
print(s)
Output:
0 1.0
1 3.0
2 5.0
3 NaN
4 6.0
5 8.0
dtype: float64
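Here, NaN stands for missing data. Because NaN is a floating-point value, the whole Series is stored as float64, which is why the integers print as 1.0, 3.0, and so on.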
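A small sketch of what a single NaN does to the data type. The variable names `ints` and `with_gap` are just for this example.

```python
import numpy as np
import pandas as pd

ints = pd.Series([1, 3, 5])              # no gaps: stays an integer dtype
with_gap = pd.Series([1, 3, 5, np.nan])  # one gap: upcast to float64

print(ints.dtype)
print(with_gap.dtype)
print(with_gap.isna().sum())  # how many values are missing
```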
DataFrame:
dates = pd.date_range("2023-01-01", periods=6)
df = pd.DataFrame(np.random.randn(6,4), index=dates, columns=list("ABCD"))
print(df)
Creates a 6×4 table of random numbers with the dates as row labels.
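Because the numbers are random, the table differs on every run. If you want classmates to get the same values, one option is a seeded NumPy generator. This is a sketch, and the seed value 42 and the name `df_seeded` are arbitrary choices.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)  # arbitrary seed; any fixed integer works
dates = pd.date_range("2023-01-01", periods=6)

# Same shape and labels as the example above, but repeatable across runs.
df_seeded = pd.DataFrame(
    rng.standard_normal((6, 4)), index=dates, columns=list("ABCD")
)
print(df_seeded)
```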
4️⃣ Key Attributes of DataFrame
Let's inspect a DataFrame’s attributes:
print(df.shape) # (6, 4)
print(df.ndim) # 2
print(df.size) # 24
print(df.index)
print(df.columns)
print(df.dtypes)
print(df.values[:2])
shape: (rows, columns)
ndim: number of dimensions (2 for DataFrame)
size: total elements
index: row labels
columns: column names
dtypes: type of each column
values: raw data as NumPy array
5️⃣ Viewing Data: .head() & .tail()
print(df.head(3))
print(df.tail(2))
head(n): first n rows
tail(n): last n rows
6️⃣ Selection & Indexing
print(df['A']) # column A (as Series)
print(df[['A','B']]) # DataFrame of A & B
print(df.loc[dates[0]]) # by label
print(df.iloc[2]) # by integer position
print(df.at[dates[1], 'B']) # single value label-based
print(df.iat[2,1]) # single value position-based
.loc: label-based
.iloc: integer-based
.at, .iat: fast access to single entries
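One detail that often trips people up is slicing: `.loc` includes the end label, while `.iloc` excludes the end position. The sketch below shows both, plus boolean filtering, a closely related technique not covered above. The `scores` table and its labels are invented for the example.

```python
import pandas as pd

scores = pd.DataFrame(
    {"math": [90, 72, 85], "art": [70, 88, 95]},
    index=["ann", "bob", "cy"],
)

# .loc slices by label and INCLUDES the end label.
by_label = scores.loc["ann":"bob"]      # rows ann and bob

# .iloc slices by position and EXCLUDES the end position.
by_position = scores.iloc[0:1]          # row ann only

# A boolean mask keeps only the rows where the condition is True.
good_math = scores[scores["math"] > 80]  # rows ann and cy

print(by_label)
print(by_position)
print(good_math)
```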
7️⃣ Basic Computations
print(df.mean()) # column-wise
print(df.mean(axis=1))# row-wise
print(df['A'] + df['B'])
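Supports quick math and statistics. Reductions such as mean() skip NaN values by default, while element-wise arithmetic such as df['A'] + df['B'] gives NaN wherever either operand is missing.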
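Here is a minimal sketch of how NaN behaves in calculations, using two small made-up Series (`a` and `b`).

```python
import numpy as np
import pandas as pd

a = pd.Series([1.0, np.nan, 3.0])
b = pd.Series([10.0, 20.0, 30.0])

print(a.mean())              # 2.0 -> the NaN is skipped by default
print(a.mean(skipna=False))  # nan -> the NaN is kept, so the result is NaN
print((a + b).tolist())      # [11.0, nan, 33.0] -> NaN propagates element-wise
```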
8️⃣ Handling Missing Data
print(df.isna())
df2 = df.dropna() # drop any row with NaN
df3 = df.fillna(0) # fill NaN with 0
Methods like .isna(), .dropna(), and .fillna() help clean data.
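Filling with 0 or dropping every incomplete row is not always what you want. As a sketch of two gentler options, the example below fills gaps with each column's mean and drops rows only when one chosen column is missing. The `raw` table and its column names are invented.

```python
import numpy as np
import pandas as pd

raw = pd.DataFrame({
    "height": [150.0, np.nan, 170.0],
    "weight": [50.0, 60.0, np.nan],
})

# Fill each column's gaps with that column's own mean.
filled = raw.fillna(raw.mean())

# Drop a row only when the 'height' value is missing.
has_height = raw.dropna(subset=["height"])

print(filled)
print(has_height)
```

Both calls return new DataFrames, so `raw` itself is left unchanged.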
9️⃣ Adding, Renaming & Replacing Columns
df['E'] = df['A'] + df['B'] # new column
df.rename(columns=str.lower, inplace=True)
df.columns = [c.upper() for c in df.columns]
Easy to manipulate column names and add new data.
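To rename only some columns, you can pass a dictionary instead of a function. The sketch below also uses assign(), which adds a column and returns a new DataFrame rather than changing the original. The names `table`, `alpha`, and `total` are made up.

```python
import pandas as pd

table = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# Rename just column A; B is left alone.
renamed = table.rename(columns={"A": "alpha"})

# assign() returns a copy with the extra column added.
with_total = renamed.assign(total=renamed["alpha"] + renamed["B"])

print(with_total.columns.tolist())  # ['alpha', 'B', 'total']
print(table.columns.tolist())       # ['A', 'B'] -> original untouched
```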
🔟 Merging, Concatenating & Reshaping
Concatenating:
merged = pd.concat([df, df], axis=0) # stack vertically
Reshaping (wide to long with melt):
df_long = df.reset_index().melt(id_vars='index', var_name='col', value_name='val')
print(df_long.head())
Learn more about merge, concat, pivot_table, etc.
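Since merge and pivot_table are mentioned but not shown, here is a minimal sketch of both. The `students` and `marks` tables are invented for the example.

```python
import pandas as pd

students = pd.DataFrame({"id": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]})
marks = pd.DataFrame({
    "id": [1, 1, 2, 2],
    "subject": ["math", "art", "math", "art"],
    "mark": [90, 70, 72, 88],
})

# merge: SQL-style join on a shared key column (inner join by default),
# so Cy, who has no marks, does not appear in the result.
joined = students.merge(marks, on="id")

# pivot_table: long -> wide, one row per name and one column per subject.
wide = joined.pivot_table(
    index="name", columns="subject", values="mark", aggfunc="mean"
)
print(wide)
```

This is the reverse direction of melt: melt goes from wide to long, pivot_table from long to wide.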
1️⃣1️⃣ Grouping & Aggregation
df['Group'] = ['X','X','Y','Y','X','Y']
grp = df.groupby('Group').agg({'A':'mean', 'B':'sum'})
print(grp)
groupby splits the rows by the values in Group, then agg computes one statistic per group: here the mean of A and the sum of B.
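If you want to pick the names of the result columns yourself, pandas also supports "named aggregation". A small sketch with a made-up `sales` table:

```python
import pandas as pd

sales = pd.DataFrame({"group": ["X", "X", "Y"], "amount": [10, 20, 5]})

# Each keyword becomes an output column: name=(input column, function).
summary = sales.groupby("group").agg(
    total=("amount", "sum"),
    average=("amount", "mean"),
)
print(summary)
```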
1️⃣2️⃣ Working with Time Series
df_ts = df.copy()
df_ts.index = pd.date_range(start='2023-01-01', periods=len(df))
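print(df_ts.loc['2023-01-01':'2023-01-03']) # rows in the date range, both ends included
print(df_ts.resample('D').mean(numeric_only=True)) # numeric_only skips the text column 'Group'
Filter by date range and resample to daily ('D') or weekly ('W') summaries.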
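To see resampling actually change the granularity, here is a minimal sketch that rolls two weeks of daily values up to weekly totals. The `steps` data is invented; `"W"` means weeks ending on Sunday.

```python
import pandas as pd

days = pd.date_range("2023-01-02", periods=14)  # Monday 2 Jan to Sunday 15 Jan
steps = pd.Series(range(14), index=days)        # values 0, 1, ..., 13

# Date strings slice by label, so both end dates are included.
first_three = steps.loc["2023-01-02":"2023-01-04"]

# One total per week; each bin is labeled with the Sunday that ends it.
weekly = steps.resample("W").sum()

print(first_three)
print(weekly)
```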
1️⃣3️⃣ IO: Reading & Writing Data
df.to_csv('data.csv')
df2 = pd.read_csv('data.csv', index_col=0, parse_dates=True)
Supports CSV, Excel, JSON, SQL, and more.
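Why pass parse_dates=True when reading back? CSV is plain text, so dates come back as strings unless you ask pandas to parse them. The sketch below shows the difference using an in-memory buffer instead of a real file (an assumption made so the example leaves nothing on disk).

```python
import io
import pandas as pd

original = pd.DataFrame(
    {"x": [1, 2]}, index=pd.date_range("2023-01-01", periods=2)
)

buffer = io.StringIO()  # stands in for 'data.csv'
original.to_csv(buffer)

buffer.seek(0)
plain = pd.read_csv(buffer, index_col=0)                     # index holds text

buffer.seek(0)
parsed = pd.read_csv(buffer, index_col=0, parse_dates=True)  # index holds dates

print(type(plain.index))
print(type(parsed.index))
```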
🎓 Tips for Students
Use .head() & .tail() often to peek at data
Always check for and clean missing values before analysis
Explore .describe() for summary stats
Combine methods (e.g., df.groupby('Group').sum()) for powerful pipelines
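The tips above can be combined into one short workflow. This is only a sketch on an invented `marks` table: peek, summarize, then clean and aggregate in a single chain.

```python
import numpy as np
import pandas as pd

marks = pd.DataFrame({
    "cls": ["A", "A", "B", "B"],
    "mark": [80.0, np.nan, 60.0, 70.0],
})

print(marks.head(2))          # peek at the first rows
stats = marks["mark"].describe()
print(stats)                  # 'count' leaves out the missing value

# Chain the steps: drop missing marks, group by class, add them up.
result = (
    marks
    .dropna(subset=["mark"])
    .groupby("cls")["mark"]
    .sum()
)
print(result)
```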
✅ Wrap-Up
Pandas combines the speed of NumPy with rich data handling, which makes it a good fit for ML, science, and school projects. With this guide, students can confidently explore data like pros 🏆.
Written by Nitin Kumar
