10 Basic Operations in Pandas and Python

codeviz

Pandas is a powerful data manipulation and analysis library in Python. It provides easy-to-use data structures and data analysis tools, making it a popular choice for data scientists and analysts. In this article, we will explore 10 basic operations in Pandas and Python that will help you get started with data manipulation and analysis.

1. Importing Pandas

Before we can start using Pandas, we need to import it into our Python environment. This can be done using the following line of code:

import pandas as pd

2. Reading Data

Pandas provides various methods to read data from different sources such as CSV files, Excel files, SQL databases, and more. One of the most commonly used methods is read_csv(), which allows us to read data from a CSV file. Here's an example:

data = pd.read_csv('data.csv')
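
To try this without a file on disk, here is a minimal sketch that feeds read_csv() an in-memory buffer instead of a path. The CSV text and its column names (name, city, score) are made up for illustration and are not from any real dataset.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for data.csv
csv_text = "name,city,score\nAda,London,12\nBo,Paris,7\nCy,London,15\n"

# read_csv() accepts a file path or any file-like object
data = pd.read_csv(io.StringIO(csv_text))

print(data.shape)          # (number of rows, number of columns)
print(list(data.columns))  # column names taken from the header row
```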

3. Viewing Data

Once we have loaded the data, we can use the head() method to view the first rows of the DataFrame. It returns five rows by default, and passing a number, such as head(10), changes that. This is useful to get a quick overview of the data. For example:

data.head()
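
As a small sketch of how the row count behaves, the example below builds a hypothetical ten-row DataFrame (the id and score columns are invented for illustration). It also uses tail(), the counterpart of head() for the last rows.

```python
import pandas as pd

# Hypothetical ten-row DataFrame for illustration
data = pd.DataFrame({
    "id": range(1, 11),
    "score": [3, 8, 15, 4, 22, 9, 11, 1, 17, 6],
})

first_five = data.head()    # no argument: first 5 rows
first_three = data.head(3)  # first 3 rows
last_two = data.tail(2)     # last 2 rows

print(first_three)
```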

4. Selecting Columns

To select specific columns from a DataFrame, we can use the indexing operator [] or the loc[] (label-based) and iloc[] (position-based) indexers. Here's an example:

# Using indexing operator
selected_columns = data['column_name']

# Using loc[]
selected_columns = data.loc[:, 'column_name']

# Using iloc[] (position-based; 0 is the first column)
selected_columns = data.iloc[:, 0]
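
The three approaches can be compared side by side on a small DataFrame. This is a sketch with invented column names. It also shows that a single column name returns a Series, while a list of names returns a DataFrame.

```python
import pandas as pd

# Hypothetical data for illustration
data = pd.DataFrame({
    "name": ["Ada", "Bo", "Cy"],
    "city": ["London", "Paris", "London"],
    "score": [12, 7, 15],
})

one_column = data["score"]             # single label: returns a Series
two_columns = data[["name", "score"]]  # list of labels: returns a DataFrame
by_label = data.loc[:, "city"]         # all rows, column selected by label
by_position = data.iloc[:, 0]          # all rows, first column by position

print(type(one_column).__name__, type(two_columns).__name__)
```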

5. Filtering Data

Pandas allows us to filter rows based on conditions. Comparison operators such as ==, >, <, >=, <=, and != produce a boolean mask, and indexing the DataFrame with that mask keeps only the rows where it is True. Here's an example:

filtered_data = data[data['column_name'] > 10]
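
Conditions can also be combined. The sketch below, on invented data, uses & (and) between two comparisons. Each comparison needs its own parentheses because & binds more tightly than == or >. The | operator works the same way for "or".

```python
import pandas as pd

# Hypothetical data for illustration
data = pd.DataFrame({
    "name": ["Ada", "Bo", "Cy", "Di"],
    "city": ["London", "Paris", "London", "Paris"],
    "score": [12, 7, 15, 11],
})

# Single condition: rows where score is greater than 10
high_scores = data[data["score"] > 10]

# Two conditions joined with &; note the parentheses around each one
london_high = data[(data["city"] == "London") & (data["score"] > 10)]

print(london_high)
```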

6. Sorting Data

Sorting data is a common operation in data analysis. Pandas provides the sort_values() method to sort a DataFrame based on one or more columns. Here's an example:

sorted_data = data.sort_values(by='column_name', ascending=False)
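
Sorting by more than one column works by passing lists to by and ascending. A minimal sketch on invented data:

```python
import pandas as pd

# Hypothetical data for illustration
data = pd.DataFrame({
    "name": ["Ada", "Bo", "Cy", "Di"],
    "city": ["London", "Paris", "London", "Paris"],
    "score": [12, 20, 15, 11],
})

# One column, highest score first
by_score = data.sort_values(by="score", ascending=False)

# City A-Z, then highest score first within each city
by_city_then_score = data.sort_values(
    by=["city", "score"], ascending=[True, False]
)

print(by_city_then_score)
```

sort_values() returns a new DataFrame and leaves the original unchanged, so the result has to be assigned to a variable to be kept.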

7. Grouping Data

Grouping data allows us to perform calculations on subsets of data. Pandas provides the groupby() method to group data based on one or more columns. Here's an example:

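# numeric_only=True skips text columns, which mean() cannot average (an error in pandas 2.0+)
grouped_data = data.groupby('column_name').mean(numeric_only=True)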
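
As a concrete sketch, the example below groups invented data by a city column and averages a single score column. Selecting the column before calling mean() keeps the calculation limited to the values of interest.

```python
import pandas as pd

# Hypothetical data for illustration
data = pd.DataFrame({
    "city": ["London", "Paris", "London", "Paris"],
    "score": [12, 7, 15, 11],
})

# One mean per distinct city; the result is a Series indexed by city
mean_by_city = data.groupby("city")["score"].mean()

print(mean_by_city)
```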

8. Aggregating Data

Aggregating data involves performing calculations on groups of data. Pandas provides various aggregation functions such as sum(), mean(), min(), max(), and count(). Here's an example:

aggregated_data = data.groupby('column_name').sum()
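
Several of these functions can be applied in one pass with the agg() method, which is not covered above but is part of the same groupby API. A minimal sketch on invented data:

```python
import pandas as pd

# Hypothetical data for illustration
data = pd.DataFrame({
    "city": ["London", "Paris", "London", "Paris"],
    "score": [12, 7, 15, 11],
})

# One row per city, one column per aggregation function
summary = data.groupby("city")["score"].agg(["sum", "mean", "min", "max", "count"])

print(summary)
```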

9. Handling Missing Data

Missing data is a common issue in real-world datasets. Pandas provides methods such as isnull(), notnull(), dropna(), and fillna() to handle missing data. Here's an example:

# Dropping rows with missing values
clean_data = data.dropna()

# Filling missing values with a specific value (0 here)
filled_data = data.fillna(0)
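
The sketch below puts the four methods together on a small invented DataFrame with one missing score. Passing a dictionary to fillna() is one way to choose a different fill value per column.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing score
data = pd.DataFrame({
    "name": ["Ada", "Bo", "Cy"],
    "score": [12.0, np.nan, 15.0],
})

# Count missing values per column
missing_per_column = data.isnull().sum()

# Drop every row that contains at least one missing value
clean_data = data.dropna()

# Fill missing values column by column
filled_data = data.fillna({"score": 0})

print(missing_per_column)
```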

10. Writing Data

Once we have performed the necessary operations on our data, we may want to save the modified DataFrame. Pandas provides methods such as to_csv() and to_excel() to write data to files, and to_sql() to write it to a database table. Here's an example:

data.to_csv('modified_data.csv', index=False)
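
To see what index=False does without creating a file, here is a small sketch that calls to_csv() with no path, which makes it return the CSV text as a string. The data is invented for illustration.

```python
import io

import pandas as pd

# Hypothetical data for illustration
data = pd.DataFrame({"name": ["Ada", "Bo"], "score": [12, 7]})

# With no path, to_csv() returns the CSV content as a string
with_index = data.to_csv()
without_index = data.to_csv(index=False)

# Reading the text back recovers the same columns and values
restored = pd.read_csv(io.StringIO(without_index))

print(without_index)
```

With the default index=True, the row labels are written as an extra unnamed first column, which then shows up as an additional column when the file is read back.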

These are just a few of the basic operations that Pandas and Python offer for data manipulation and analysis. With these operations, you can start exploring and analyzing your data effectively. Pandas provides a vast array of functionalities, so it's worth exploring the official documentation to learn more about its capabilities.

Remember, practice is key to mastering these operations. So, start experimenting with your own datasets and see how you can leverage the power of Pandas and Python for your data analysis needs.

Happy Learning! Please follow for more articles.
