My Learnings from Pandas – A Python Library

Amardeep Kumar

Introduction

When I was exploring data with Python, I got to know about Pandas. At first it seemed a bit boring and confusing, but once I started using it I realized how powerful and important Pandas really is for working with data. It makes tasks like cleaning, analyzing, and transforming data much easier.

In this blog I want to share my learnings about Pandas. If you’re just getting into data with Python, or curious about how Pandas works in real life, you’ll hopefully find this helpful!

Excited?

Let’s begin!

Why Pandas Matters🐼 :

  • Simplifies data handling: Easily read, write, and manipulate data from formats like CSV, Excel, SQL, etc.

  • Powerful data structures: Series and DataFrame make it intuitive to work with 1D and 2D data.

  • Efficient data cleaning: Handle missing values, duplicates, and formatting issues quickly.

  • Easy data analysis: Perform grouping, filtering, aggregation, and descriptive statistics in just a few lines.

How to Use Pandas – Jupyter Notebook vs VS Code :

I have used Pandas in Jupyter Notebook because it's great for testing and seeing outputs instantly.

But don’t worry: you can use Pandas in both Jupyter Notebook and VS Code. It works the same!

🧪 Using Jupyter Notebook:

  • Great for Learning and Data Exploration
    Jupyter is perfect for testing ideas, analyzing data, and seeing results step-by-step.

  • Run Code in Small Cells
    You can break your code into small chunks (called cells) and run them one at a time to see instant output.

  • Easy to Open and Use
    Launch it through Anaconda Navigator or by running jupyter notebook in your terminal.

Personally, I’ve used Jupyter Notebook inside PyCharm, which works well too! But if you want a quick and easy setup, using it directly from Anaconda is super beginner-friendly.

💻 Using Pandas in VS Code

  • Perfect for Real Projects
    Great when you're building something bigger than quick experiments, like data analysis scripts or full apps.

  • All-in-One Workspace
    You write code, run it, see output, and install packages—all inside the same window.

  • Easy to Use
    Just create a .py file, write your Pandas code, and hit run. That’s it!

To install Pandas, open the terminal and run:

pip install pandas

That’s it! Once installed, you’re ready to start writing Pandas code in your .py files.
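A quick way to check that the install worked is to import Pandas and print its version. This is just a small sanity check I would suggest, using the usual pd alias:

```python
import pandas as pd

# If this runs without an ImportError, Pandas is installed.
print(pd.__version__)
```

The version number you see will depend on when you installed it.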

🧩 Key Data Structures in Pandas:

  • Series (1D)

    A one-dimensional labeled array. Think of it like a single column in Excel.

  • DataFrame (2D)

    A two-dimensional table made up of rows and columns, just like an Excel sheet or a SQL table.
    This is the most commonly used structure in Pandas.
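Here is a minimal sketch of the two structures side by side. The names and ages are made-up sample data, not from a real dataset:

```python
import pandas as pd

# A Series: one labeled column of values.
ages = pd.Series([25, 32, 47], index=["Asha", "Ben", "Chen"], name="age")

# A DataFrame: a table whose columns are each a Series.
people = pd.DataFrame(
    {"name": ["Asha", "Ben", "Chen"], "age": [25, 32, 47]}
)

print(ages["Ben"])          # look up a value by its label
print(people.shape)         # (rows, columns)
print(type(people["age"]))  # selecting one column gives back a Series
```

Picking a single column out of a DataFrame gives you a Series, which is a nice way to see how the two are related.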

🛠️ Essential Pandas Functions to Explore Your Data:

  1. read_*( ) Function

    Pandas lets you read different file types depending on your data:

    • CSV: pd.read_csv('file.csv')

    • Excel: pd.read_excel('file.xlsx')

    • JSON: pd.read_json('file.json')

    Choose the one that fits your file type.
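To try a reader without having a file on disk, you can feed it text through io.StringIO. This is only a sketch: the CSV content below is made up and stands in for a real file such as 'file.csv'.

```python
import io
import pandas as pd

# Made-up CSV content standing in for a real file.
csv_text = "name,age\nAsha,25\nBen,32\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df)
print(df.dtypes)  # Pandas infers a type for each column
```

With a real file you would pass the path instead, as in the list above.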

  2. head( )

    Shows the first 5 rows of the DataFrame (or a different number if you specify one).

  3. tail( )

    Shows the last 5 rows of the DataFrame (or a different number if you specify one).

  4. describe( )

    Gives summary statistics for numeric columns.

  5. info( )

    Displays a quick overview: number of rows, columns, data types, and memory usage.

These tools are our best friends when starting to explore any dataset.
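Here is a small sketch of these exploration functions on a tiny DataFrame. The student scores are hypothetical and only there to give the functions something to show:

```python
import pandas as pd

# Hypothetical data: eight students and their test scores.
df = pd.DataFrame(
    {
        "student": ["A", "B", "C", "D", "E", "F", "G", "H"],
        "score": [72, 85, 90, 64, 78, 88, 95, 70],
    }
)

first_rows = df.head()   # first 5 rows by default
last_two = df.tail(2)    # pass a number to change how many rows you get
summary = df.describe()  # count, mean, std, min, quartiles, max for numeric columns

print(first_rows)
print(last_two)
print(summary)
df.info()                # prints its report directly and returns None
```

Note that info( ) prints its report itself, so there is no need to wrap it in print( ).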

Data Selection:

Some common methods we should know to select, clean, and handle our data.

  1. type( )

    Python's built-in function for checking the type of an object. Use it to confirm whether you’re working with a DataFrame or a Series.

  2. iloc[ ]

    Selects data by integer position, which is useful for row/column slicing.

     df.iloc[0]        # First row  
     df.iloc[0:3]      # First three rows  
     df.iloc[:, 1]     # All rows, second column
    

  3. dropna( )

    Removes rows or columns with missing values.

  4. fillna( )

    Fills missing values with a specific value.

  5. rename( )

    Renames column or index labels.

  6. rename(……, inplace=True)

    Updates the DataFrame directly without needing to assign it again.

  7. astype( )

    Changes the data type of a column.

  8. len( )

    Python's built-in len( ) returns the number of rows when you pass it a DataFrame.

  9. apply( )

    Lets you apply a function to every value in a column.

  10. to_*( )

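    There are different functions depending on the file format you want:

    df.to_csv('filename.csv', index=False)
    df.to_excel('filename.xlsx', index=False)
    df.to_json('filename.json', orient='records')


    I used index=False when I didn’t want Pandas to add that extra index column to the saved file. For JSON, orient='records' writes one object per row without the index; with the default layout, to_json raises an error if you pass index=False.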

  11. concat( )

    Combines multiple DataFrames vertically or horizontally.

  12. merge( )

    Combines DataFrames based on a common column.
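To see several of the methods above working together, here is a minimal sketch of a small clean-and-combine flow. Everything in it (the orders table, the column names, the cities) is hypothetical sample data I made up for illustration:

```python
import pandas as pd

# Hypothetical order data with one missing name and one missing amount.
orders = pd.DataFrame(
    {
        "cust_id": [1, 2, 3, 4],
        "Name": ["Asha", None, "Chen", "Dev"],
        "amount": ["10.5", "20.0", None, "7.25"],
    }
)

# rename returns a new DataFrame unless you pass inplace=True.
orders = orders.rename(columns={"Name": "name"})

# fillna: replace missing amounts, then astype: convert the text to numbers.
orders["amount"] = orders["amount"].fillna("0").astype(float)

# dropna: remove rows where 'name' is missing.
orders = orders.dropna(subset=["name"])

# apply: run a function on every value in a column.
orders = orders.assign(name_upper=orders["name"].apply(str.upper))

print(len(orders))  # number of rows left after cleaning

# concat: stack a second batch of rows underneath the first.
more_orders = pd.DataFrame(
    {"cust_id": [5], "name": ["Eli"], "amount": [3.0], "name_upper": ["ELI"]}
)
all_orders = pd.concat([orders, more_orders], ignore_index=True)

# merge: bring in columns from another table using the shared 'cust_id' column.
cities = pd.DataFrame({"cust_id": [1, 3, 5], "city": ["Pune", "Taipei", "Oslo"]})
with_city = all_orders.merge(cities, on="cust_id", how="left")

print(with_city)
```

I used how="left" in the merge so that every order is kept even when there is no matching city; that row simply gets a missing value in the city column.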

Conclusion:

I am still exploring Pandas, but these are some of the tools that really helped me get started. From selecting data to saving the final output, every function I have shared here is something I have used and sometimes struggled with 😅

Thank you for reading😄
