Introduction

When I was exploring data with Python, I got to know about Pandas. At first, it seemed a bit boring and confusing, but once I started using it I realized how powerful and important Pandas really is for working with data. It makes tasks like cleaning, analyzing, and transforming data much easier.

In this blog I want to share my learnings about Pandas. If you’re just getting into data with Python, or curious about how Pandas works in real life, you’ll hopefully find this helpful!

Excited?

Let’s begin!

Why Pandas Matters🐼 :

Simplifies data handling: Easily read, write, and manipulate data from formats like CSV, Excel, SQL, etc.
Powerful data structures: Series and DataFrame make it intuitive to work with 1D and 2D data.
Efficient data cleaning: Handle missing values, duplicates, and formatting issues quickly.
Easy data analysis: Perform grouping, filtering, aggregation, and descriptive statistics in just a few lines.

How to Use Pandas – Jupyter Notebook vs VS Code :

I have used Pandas in Jupyter Notebook because it's great for testing and seeing outputs instantly.

But don’t worry… you can use Pandas in both Jupyter Notebook and VS Code. It works the same!

🧪 Using Jupyter Notebook:

Great for Learning and Data Exploration
Jupyter is perfect for testing ideas, analyzing data, and seeing results step-by-step.
Run Code in Small Cells
You can break your code into small chunks (called cells) and run them one at a time to see instant output.
Easy to Open and Use
Launch it through Anaconda Navigator or by running jupyter notebook in your terminal.

Personally, I’ve used Jupyter Notebook inside PyCharm, which works well too! But if you want a quick and easy setup, using it directly from Anaconda is super beginner-friendly.

💻 Using Pandas in VS Code

Perfect for Real Projects
Great when you're building something bigger than quick experiments, like data analysis scripts or full apps.
All-in-One Workspace
You write code, run it, see output, and install packages—all inside the same window.
Easy to Use
Just create a .py file, write your Pandas code, and hit run. That’s it!

Open the terminal and install Pandas with pip:

pip install pandas

That’s it! Once installed, you’re ready to start writing Pandas code in your .py files.
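A quick way to confirm the install worked is to import Pandas and print its version. This is just a small check; the `pd` alias is a convention, not something Pandas requires.

```python
import pandas as pd

# If the install worked, this import succeeds and a version string is printed.
print(pd.__version__)
```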

🧩 Key Data Structures in Pandas:

Series (1D)

A one-dimensional labeled array…think of it like a single column in Excel.
DataFrame (2D)

A two-dimensional table made up of rows and columns…just like an Excel sheet or a SQL table.
This is the most commonly used structure in Pandas.
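Here is a minimal sketch of both structures. The student names, marks, and cities are made up for illustration.

```python
import pandas as pd

# A Series: one labeled column of values (hypothetical marks for three students).
marks = pd.Series([85, 92, 78], index=["Asha", "Ben", "Chen"], name="marks")

# A DataFrame: several columns that share the same row labels.
students = pd.DataFrame(
    {"marks": [85, 92, 78], "city": ["Pune", "Delhi", "Mumbai"]},
    index=["Asha", "Ben", "Chen"],
)

print(marks["Ben"])              # look up one value by its label
print(students.shape)            # (rows, columns)
print(type(students["marks"]))   # a single column of a DataFrame is a Series
```

Picking one column out of a DataFrame gives you a Series, which is why the two structures feel so connected in practice.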

🛠️ Essential Pandas Functions to Explore Your Data:

read_*( ) Function

Pandas lets you read different file types depending on your data:
- CSV: pd.read_csv('file.csv')
- Excel: pd.read_excel('file.xlsx')
- JSON: pd.read_json('file.json')

Choose the one that fits your file type.
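To try read_csv without needing a file on disk, you can feed it text through io.StringIO. This is only a sketch: the CSV content below is invented, and in a real project you would pass a file path instead.

```python
import io
import pandas as pd

# Stand-in for a real file: read_csv accepts a path or any file-like object.
csv_text = "name,age,city\nAsha,28,Pune\nBen,35,Delhi\n"
df = pd.read_csv(io.StringIO(csv_text))

print(df)
print(df.columns.tolist())   # column names come from the first line of the CSV
```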

head( )

Shows the first 5 rows of the DataFrame (or a different number if you pass one, e.g. head(10)).
tail( )

Shows the last 5 rows of the DataFrame.
describe( )

Gives summary statistics (count, mean, min, max, and so on) for numeric columns.
info( )

Displays a quick overview: number of rows, columns, data types, and memory usage.

These tools are our best friends when starting to explore any dataset.
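The four functions above can be tried together on a small made-up dataset. The column names and values here are assumptions for the example only.

```python
import pandas as pd

# Hypothetical dataset with 10 rows.
df = pd.DataFrame(
    {"id": range(1, 11), "score": [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]}
)

first_rows = df.head()     # first 5 rows by default
last_three = df.tail(3)    # pass a number to change how many rows you get
summary = df.describe()    # count, mean, std, min, quartiles, max per numeric column

print(first_rows)
print(last_three)
print(summary.loc["mean", "score"])
df.info()                  # prints row count, columns, dtypes, and memory usage
```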

Data Selection:

Some common Pandas methods we should know to select, clean, and handle our data.

type( )

Check the type of an object and confirm whether you’re working with a DataFrame or Series.

iloc[ ]

Select data by position (index numbers)—useful for row/column slicing.

 df.iloc[0]        # First row  
 df.iloc[0:3]      # First three rows  
 df.iloc[:, 1]     # All rows, second column
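To make the snippet above runnable, here is the same idea on a tiny invented DataFrame, with type( ) used to check what each selection returns.

```python
import pandas as pd

# Hypothetical data, just to have something to slice.
df = pd.DataFrame(
    {"name": ["Asha", "Ben", "Chen", "Dia"], "age": [28, 35, 41, 23]}
)

first_row = df.iloc[0]       # a single row comes back as a Series
first_three = df.iloc[0:3]   # a slice of rows stays a DataFrame (end is excluded)
second_col = df.iloc[:, 1]   # all rows of the second column, as a Series

print(type(first_row).__name__)
print(type(first_three).__name__)
print(second_col.tolist())
```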

dropna( )

Removes rows or columns with missing values.
fillna( )

Fills missing values with a specific value.
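A small sketch of dropna( ) and fillna( ) side by side. The data and the fill values ("unknown" and 0) are my own choices for the example; pick whatever makes sense for your dataset.

```python
import pandas as pd

# Hypothetical data with one missing name and one missing age.
df = pd.DataFrame({"name": ["Asha", "Ben", None], "age": [28, None, 41]})

dropped_rows = df.dropna()          # drops every row that has a missing value
dropped_cols = df.dropna(axis=1)    # drops every column that has a missing value
filled = df.fillna({"name": "unknown", "age": 0})   # a different fill per column

print(dropped_rows)
print(dropped_cols.shape)
print(filled)
```

Both methods return a new DataFrame here, so the original df still has its missing values.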
rename( )

Renames column or index labels.
rename(…, inplace=True)

Updates the DataFrame directly without needing to assign it again.
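The difference between the two forms of rename( ) is easier to see in code. The short column names below are invented for the example.

```python
import pandas as pd

df = pd.DataFrame({"nm": ["Asha", "Ben"], "ag": [28, 35]})

# Without inplace: a new DataFrame is returned and df is left unchanged.
renamed = df.rename(columns={"nm": "name"})

# With inplace=True: df itself is modified and the call returns None.
df.rename(columns={"ag": "age"}, inplace=True)

print(renamed.columns.tolist())
print(df.columns.tolist())
```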
astype( )

Change the data type of a column.
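A common case is numbers that were read in as text. This sketch assumes a hypothetical price column stored as strings.

```python
import pandas as pd

# Prices stored as text, as often happens after reading a messy file.
df = pd.DataFrame({"price": ["10", "25", "40"]})

# Convert the column to integers so arithmetic works on it.
df["price"] = df["price"].astype(int)

print(df["price"].sum())
```

astype( ) returns a new Series, so the result is assigned back to the column.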
len( )

Returns the number of rows in the DataFrame.
apply( )

Lets you apply a function to every value in a column.
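Here is a short sketch of len( ) and apply( ) together. The data and the age_next_year column are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"name": ["asha", "ben"], "age": [28, 35]})

n_rows = len(df)   # number of rows in the DataFrame

# apply() calls the function once for each value in the column.
df["name"] = df["name"].apply(str.title)
df["age_next_year"] = df["age"].apply(lambda a: a + 1)

print(n_rows)
print(df)
```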
to_*( )

There are different functions depending on the file format you want to save to:
```
df.to_csv('filename.csv', index=False)
df.to_excel('filename.xlsx', index=False)
df.to_json('filename.json', orient='records')  # to_json rejects index=False with its default orient
```
I used index=False when I didn’t want Pandas to add that extra index column into the saved file.
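To see what that extra index column looks like, you can call to_csv( ) without a filename, which returns the CSV as a string instead of writing a file. The data below is invented for the example.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Asha", "Ben"], "age": [28, 35]})

with_index = df.to_csv()                # no path given, so a string is returned
without_index = df.to_csv(index=False)

# Compare just the header line of each version.
print(with_index.splitlines()[0])
print(without_index.splitlines()[0])
```

With the index included, the header starts with an empty column name, and that is the extra column that shows up when the file is opened later.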
concat( )

Combines multiple DataFrames vertically or horizontally.
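A minimal sketch of both directions, using two hypothetical monthly sales tables with the same columns.

```python
import pandas as pd

jan = pd.DataFrame({"item": ["pen", "book"], "sold": [10, 4]})
feb = pd.DataFrame({"item": ["pen", "book"], "sold": [7, 9]})

# Vertically: rows of feb go under rows of jan; ignore_index renumbers the rows.
stacked = pd.concat([jan, feb], ignore_index=True)

# Horizontally: columns of feb are placed next to columns of jan.
side_by_side = pd.concat([jan, feb], axis=1)

print(stacked)
print(side_by_side.shape)
```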
merge( )

Combines DataFrames based on a common column, much like a join in SQL.
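Here is a small sketch of merge( ) on a shared column. The orders and customers tables and the customer_id column are assumptions made up for this example.

```python
import pandas as pd

orders = pd.DataFrame(
    {"customer_id": [1, 2, 2, 3], "amount": [250, 100, 75, 300]}
)
customers = pd.DataFrame({"customer_id": [1, 2], "name": ["Asha", "Ben"]})

# Default is an inner merge: only customer_ids found in both tables are kept.
inner = orders.merge(customers, on="customer_id")

# A left merge keeps every order; unmatched customers get a missing name.
left = orders.merge(customers, on="customer_id", how="left")

print(inner)
print(left)
```

The how parameter decides which rows survive, so it is worth checking the row count after a merge.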

Conclusion:

I am still exploring Pandas, but these are some of the tools that really helped me get started. From selecting data to saving the final output…every function that I have shared here is something that I used and sometimes struggled with 😅

Thank you for reading😄

My Learnings from Pandas – A Python Library
