Pandas DataFrame: Key Attributes & Methods

If you're diving into data analysis with Python, pandas is an essential library, and mastering its DataFrame is a must. This blog post serves as a quick-reference guide covering the most important DataFrame attributes and methods you'll need for data inspection, cleaning, transformation, and analysis.

🔍 DataFrame Attributes

Attributes give you quick insights into the structure and metadata of your DataFrame.

Attribute	Description
`.shape`	Returns a tuple of the DataFrame dimensions (rows, columns)
`.dtypes`	Data types of each column
`.values`	NumPy array representation of the DataFrame
`.columns`	Column labels
`.index`	Row index labels
`.index.name`	Name of the index (can be set manually)
`.columns.name`	Name of the columns axis
`.size`	Total number of elements (rows × columns)
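
To make the attributes above concrete, here is a minimal sketch on a small made-up DataFrame. The column names and values are assumptions chosen purely for illustration, not data from any real source.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame(
    {"name": ["Ana", "Ben", "Cy"], "age": [31, 25, 40], "score": [88.5, 92.0, 79.5]}
)

print(df.shape)          # (rows, columns) -> (3, 3)
print(df.size)           # rows * columns -> 9
print(df.dtypes)         # one dtype per column
print(list(df.columns))  # ['name', 'age', 'score']
print(df.index)          # RangeIndex(start=0, stop=3, step=1)

# The row index and the columns axis can each carry a name.
df.index.name = "row_id"
df.columns.name = "field"

# .to_numpy() is the method form that pandas recommends over .values in new code.
arr = df.to_numpy()
print(arr.shape)         # (3, 3)
```

Note that these are attributes, so they are read without parentheses; only `.to_numpy()` here is a method call.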

⚙️ Data Inspection Methods

Use these methods to understand the contents and structure of your data.

Method	Description
`.info()`	Summary of DataFrame: columns, data types, non-null values
`.head(n)`	First `n` rows (default: 5)
`.tail(n)`	Last `n` rows (default: 5)
`.sample(n)`	Random `n` rows from the DataFrame
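
A quick sketch of the inspection methods in action. The `city` and `temp_c` columns are invented for this example, and `random_state=0` is only there so the random sample is repeatable.

```python
import pandas as pd

# Hypothetical sample data with one missing temperature.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Lima", "Cairo", "Pune", "Kyiv", "Doha", "Bern", "Hilo"],
        "temp_c": [4.0, 19.5, 28.0, 31.0, None, 35.5, 9.0, 24.0],
    }
)

# Prints column names, dtypes, and non-null counts to stdout.
df.info()

first_three = df.head(3)   # rows 0, 1, 2
last_two = df.tail(2)      # rows 6, 7

# Three rows drawn at random; a fixed random_state makes the draw reproducible.
picked = df.sample(n=3, random_state=0)

print(first_three)
print(last_two)
print(picked)
```

Calling `.head()` or `.tail()` with no argument returns five rows.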

📊 Aggregation & Summary Statistics

Quickly summarize numerical data in your DataFrame.

Method	Description
`.count()`	Count of non-null values (column-wise by default)
`.min()`	Minimum values (all columns unless `numeric_only=True`)
`.max()`	Maximum values
`.sum()`	Sum of values
`.mean()`	Mean of numeric values (pass `numeric_only=True` if the DataFrame has non-numeric columns)
`.describe()`	Summary stats like count, mean, std, min, and quartiles
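
Here is a small sketch of the summary methods on a mixed-type DataFrame. The product data is made up, and `numeric_only=True` is passed so the text column is skipped rather than aggregated.

```python
import pandas as pd

# Hypothetical sample data: one text column, one float column with a gap, one int column.
df = pd.DataFrame(
    {
        "product": ["pen", "ink", "pad", "cap"],
        "price": [2.0, 8.0, 5.0, None],
        "qty": [10, 3, 7, 4],
    }
)

print(df.count())                   # non-null counts: product 4, price 3, qty 4
print(df.min(numeric_only=True))    # price 2.0, qty 3
print(df.max(numeric_only=True))    # price 8.0, qty 10
print(df.sum(numeric_only=True))    # price 15.0, qty 24
print(df.mean(numeric_only=True))   # price 5.0, qty 6.0

# By default, describe() on a mixed DataFrame reports only the numeric columns.
summary = df.describe()
print(summary)
```

Missing values are skipped by these aggregations by default, which is why the mean of `price` is computed over three values rather than four.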

🛠️ Data Cleaning & Manipulation

Essential methods to tidy and transform your data.

Method	Description
`.rename(columns={}, index={})`	Rename column or index labels (use `inplace=True` to apply directly)
`.value_counts()`	Frequency count of unique rows (DataFrame) or values (Series)
`.sort_values(by='', ascending=True)`	Sort by column(s)
`.sort_index()`	Sort rows by index
`.isnull()` / `.notnull()`	Detect missing values
`.dropna()`	Drop rows or columns with missing data (customizable with `how` and `subset`)
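`.fillna()` / `.ffill()` / `.bfill()`	Fill missing values with a given value (`.fillna()`), or propagate the previous (`.ffill()`) or next (`.bfill()`) valid value
`.duplicated()`	Detect duplicate rows (optionally based on a subset of columns)
`.drop_duplicates()`	Drop duplicate rows (optionally based on a subset of columns); keeps the first occurrence by default (`keep='first'`)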
`.drop(index=[], columns=[])`	Drop rows or columns explicitly
`.rank()`	Rank data within each column
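
The cleaning methods are easiest to follow as a short pipeline. The sketch below uses an invented staff table (the names, departments, and pay figures are assumptions), and returns new objects at each step instead of using `inplace=True`.

```python
import pandas as pd

# Hypothetical sample data with one duplicated row and one row with gaps.
df = pd.DataFrame(
    {
        "Name": ["Ana", "Ben", "Ben", "Cy", "Dee"],
        "dept": ["ops", "dev", "dev", None, "ops"],
        "pay": [50.0, 65.0, 65.0, None, 58.0],
    }
)

df = df.rename(columns={"Name": "name"})   # consistent lower-case labels

print(df.isnull().sum())         # missing values per column: name 0, dept 1, pay 1
dupes = df.duplicated()          # True only for row 2, which repeats row 1
deduped = df.drop_duplicates()   # keep='first' is the default

# Three ways to handle the gaps that remain in row 3.
no_missing_pay = deduped.dropna(subset=["pay"])              # drop the row
filled = deduped.fillna({"dept": "unknown", "pay": 0.0})    # fill per column
carried = deduped["pay"].ffill()                             # reuse previous valid value

by_pay = filled.sort_values(by="pay", ascending=False)       # highest pay first
back_in_order = by_pay.sort_index()                          # original row order again
pay_rank = filled["pay"].rank(ascending=False)               # 1.0 = highest pay
counts = filled["dept"].value_counts()                       # ops 2, dev 1, unknown 1
trimmed = filled.drop(columns=["dept"])                      # remove a column
```

Which gap-handling strategy fits depends on the data: dropping is safest when few rows are affected, while filling keeps row counts stable at the cost of introducing placeholder values.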

🔄 Index Management

Useful for turning a column into row labels and back again, including when working with hierarchical (MultiIndex) data.

Method	Description
`.set_index(col)`	Make a column the index
`.reset_index()`	Reset index to default (turn index into a column)
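
A minimal sketch of moving a column into the index and back. The `emp_id` and `name` columns are hypothetical.

```python
import pandas as pd

# Hypothetical sample data, deliberately out of order.
df = pd.DataFrame({"emp_id": [103, 101, 102], "name": ["Cy", "Ana", "Ben"]})

indexed = df.set_index("emp_id")    # emp_id becomes the row labels
print(indexed.loc[101, "name"])     # label-based lookup -> Ana

ordered = indexed.sort_index()      # rows ordered by emp_id: 101, 102, 103

restored = ordered.reset_index()            # emp_id returns as a regular column
dropped = ordered.reset_index(drop=True)    # old index is discarded instead

print(restored)
print(dropped)
```

Both `set_index()` and `reset_index()` return a new DataFrame by default, so `df` itself is unchanged here.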

🧠 Custom Functions & Advanced Selection

Go beyond built-ins by applying your own logic.

Method	Description
`.apply(func)`	Apply function column-wise by default; use `axis=1` for row-wise operations
`.select_dtypes(include='number')`	Filter columns by data type
`.nunique(dropna=True)`	Number of unique values in each column (excluding `NaN` by default)
`.isin([])`	Check whether each element is in a given list
`.copy()`	Create a deep copy of the DataFrame
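
The sketch below shows these methods together, with a made-up table of team results. The lambdas are just illustrative stand-ins for whatever custom logic you need.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame(
    {
        "team": ["red", "blue", "red", "green"],
        "wins": [3, 5, 2, 4],
        "losses": [1, 0, 4, 2],
    }
)

numeric = df.select_dtypes(include="number")   # keeps wins and losses only

# Column-wise (default): the function receives each column as a Series.
spread = numeric.apply(lambda col: col.max() - col.min())   # wins 3, losses 4

# Row-wise (axis=1): the function receives each row as a Series.
games = numeric.apply(lambda row: row["wins"] + row["losses"], axis=1)   # 4, 5, 6, 6

print(df.nunique())   # team 3, wins 4, losses 4

# isin() builds a boolean mask that can be used to filter rows.
mask = df["team"].isin(["red", "green"])

# copy() gives an independent DataFrame, so edits do not reach df.
subset = df[mask].copy()
subset["wins"] = 0
print(df["wins"].tolist())   # [3, 5, 2, 4]
```

For simple arithmetic like the row-wise sum, a vectorized expression such as `df["wins"] + df["losses"]` gives the same result and is usually faster than `.apply(..., axis=1)`.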

✅ Final Tips

These methods and attributes are the foundation of pandas workflows.
Combine them to filter, clean, and understand your data efficiently.
Use `.apply()` for custom logic and `.describe()` for quick numeric overviews.

Bookmark this guide and revisit it as you work on real-world datasets. Mastering these will level up your data analysis game significantly!

🎁 Bonus Resources: Dive Deeper with My GitHub Repo

If you're serious about mastering data analysis with Python, don’t miss out on this curated GitHub repository:

🔗 Python-Data-Analysis by ShehrazSarwar

📌 What’s Inside?

✅ Step-by-step Jupyter notebooks for every major Pandas and NumPy concept
📊 Real-world datasets with hands-on case studies
🧹 In-depth data cleaning and preprocessing workflows
📈 Data exploration and visualization techniques using Matplotlib and Seaborn
🔍 Practical insights generated using EDA (Exploratory Data Analysis)

💡 Perfect for:
Beginners, students, and aspiring data analysts who want to build a strong foundation with Python and real datasets.

Bookmark it, fork it, and use it as your personal data analysis workbook!

Happy analyzing with pandas 🚀
