Pandas DataFrame Views: Complete Guide

Pandas provides the DataFrame, a tabular structure for managing and analyzing data in Python. Getting the most out of Pandas starts with knowing how to inspect a DataFrame's contents. This guide walks through the methods and properties used to view and inspect data in a Pandas DataFrame.

Let's create a sample dataset containing information about cars and use it to demonstrate the output of various Pandas DataFrame view methods. Here's the dataset and how each function would be applied to it:

Sample Car Dataset

import pandas as pd

# Creating a sample dataset
data = {
    'Car': ['Toyota', 'Honda', 'BMW', 'Audi', 'Ford', 'Chevrolet', 'Tesla'],
    'Model': ['Camry', 'Civic', 'X5', 'A4', 'Mustang', 'Impala', 'Model S'],
    'Year': [2020, 2019, 2021, 2018, 2020, 2017, 2021],
    'Price': [24000, 22000, 60000, 35000, 26000, 29000, 80000],
    'Mileage': [30000, 40000, 20000, 50000, 15000, 45000, 10000],
    'Electric': [False, False, False, False, False, False, True]
}

# Creating DataFrame
df = pd.DataFrame(data)
print(df)

Applying Various Pandas DataFrame View Methods

Display the First Few Rows:

 print(df.head())

Output:

       Car    Model  Year  Price  Mileage  Electric
 0  Toyota    Camry  2020  24000    30000     False
 1   Honda    Civic  2019  22000    40000     False
 2     BMW       X5  2021  60000    20000     False
 3    Audi       A4  2018  35000    50000     False
 4    Ford  Mustang  2020  26000    15000     False

Display the Last Few Rows:

 print(df.tail())

Output:

           Car    Model  Year  Price  Mileage  Electric
 2         BMW       X5  2021  60000    20000     False
 3        Audi       A4  2018  35000    50000     False
 4        Ford  Mustang  2020  26000    15000     False
 5   Chevrolet   Impala  2017  29000    45000     False
 6       Tesla  Model S  2021  80000    10000      True
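
Both `head()` and `tail()` default to five rows, but each accepts a row count. A minimal sketch, rebuilding two columns of the car data so it runs on its own (the names `first_two` and `last_three` are only illustrative):

```python
import pandas as pd

# Two columns of the car dataset, rebuilt so the snippet is self-contained
df = pd.DataFrame({
    'Car': ['Toyota', 'Honda', 'BMW', 'Audi', 'Ford', 'Chevrolet', 'Tesla'],
    'Price': [24000, 22000, 60000, 35000, 26000, 29000, 80000],
})

first_two = df.head(2)   # first 2 rows instead of the default 5
last_three = df.tail(3)  # last 3 rows; the original index labels are kept
print(first_two)
print(last_three)
```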

View Column Names:

 print(df.columns)

Output:

 Index(['Car', 'Model', 'Year', 'Price', 'Mileage', 'Electric'], dtype='object')

View Data Types of Each Column:

 print(df.dtypes)

Output:

 Car         object
 Model       object
 Year         int64
 Price        int64
 Mileage      int64
 Electric      bool
 dtype: object
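
Once the dtypes are known, `select_dtypes` can filter columns by type. The sketch below uses a shortened version of the car data so it runs standalone; `numeric_df` is an illustrative name:

```python
import pandas as pd

# Shortened car data with text, integer, and boolean columns
df = pd.DataFrame({
    'Car': ['Toyota', 'Honda', 'Tesla'],
    'Year': [2020, 2019, 2021],
    'Price': [24000, 22000, 80000],
    'Electric': [False, False, True],
})

# Keep only numeric columns; the text and bool columns are left out
numeric_df = df.select_dtypes(include='number')
print(numeric_df.columns.tolist())
```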

Calculate Summary Statistics:

 print(df.describe())

Output:

               Year         Price       Mileage
 count     7.000000      7.000000      7.000000
 mean   2019.428571  39428.571429  30000.000000
 std       1.511858  22059.443502  15545.631755
 min    2017.000000  22000.000000  10000.000000
 25%    2018.500000  25000.000000  17500.000000
 50%    2020.000000  29000.000000  30000.000000
 75%    2020.500000  47500.000000  42500.000000
 max    2021.000000  80000.000000  50000.000000
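
By default `describe()` covers only numeric columns, which is why `Car`, `Model`, and `Electric` do not appear above. As a sketch, passing `include='all'` adds `count`, `unique`, `top`, and `freq` rows for the remaining columns, with NaN in cells where a statistic does not apply. The variable names here are illustrative:

```python
import pandas as pd

# Three columns of the car dataset: text, integer, and boolean
df = pd.DataFrame({
    'Car': ['Toyota', 'Honda', 'BMW', 'Audi', 'Ford', 'Chevrolet', 'Tesla'],
    'Mileage': [30000, 40000, 20000, 50000, 15000, 45000, 10000],
    'Electric': [False, False, False, False, False, False, True],
})

full_summary = df.describe(include='all')
print(full_summary)

# Individual statistics can be looked up by row label and column name
car_unique = full_summary.loc['unique', 'Car']
mileage_mean = full_summary.loc['mean', 'Mileage']
```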

Get Detailed Information:

 df.info()  # prints the summary itself and returns None, so print() is not needed

Output:

 <class 'pandas.core.frame.DataFrame'>
 RangeIndex: 7 entries, 0 to 6
 Data columns (total 6 columns):
  #   Column    Non-Null Count  Dtype
 ---  ------    --------------  -----
  0   Car       7 non-null      object
  1   Model     7 non-null      object
  2   Year      7 non-null      int64
  3   Price     7 non-null      int64
  4   Mileage   7 non-null      int64
  5   Electric  7 non-null      bool
 dtypes: bool(1), int64(3), object(2)
 memory usage: 455.0+ bytes

View Unique Values in a Column:

 print(df['Car'].unique())

Output:

 ['Toyota' 'Honda' 'BMW' 'Audi' 'Ford' 'Chevrolet' 'Tesla']
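
When the question is how many distinct values there are, or how often each occurs, `nunique()` and `value_counts()` are handy companions to `unique()`. A small sketch on the `Year` column, which does contain repeats:

```python
import pandas as pd

# The Year column from the car dataset
df = pd.DataFrame({'Year': [2020, 2019, 2021, 2018, 2020, 2017, 2021]})

distinct_years = df['Year'].nunique()    # number of distinct values
year_counts = df['Year'].value_counts()  # occurrences of each value
print(distinct_years)
print(year_counts)
```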

Check for Null Values:

 print(df.isnull())

Output:

      Car  Model   Year  Price  Mileage  Electric
 0  False  False  False  False    False     False
 1  False  False  False  False    False     False
 2  False  False  False  False    False     False
 3  False  False  False  False    False     False
 4  False  False  False  False    False     False
 5  False  False  False  False    False     False
 6  False  False  False  False    False     False

Check for Non-Null Values:

 print(df.notnull())

Output:

      Car  Model   Year  Price  Mileage  Electric
 0   True   True   True   True     True      True
 1   True   True   True   True     True      True
 2   True   True   True   True     True      True
 3   True   True   True   True     True      True
 4   True   True   True   True     True      True
 5   True   True   True   True     True      True
 6   True   True   True   True     True      True
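
On larger tables a full True/False grid is hard to scan, so the mask is usually aggregated. The sketch below counts missing values per column; the gap in `Price` is hypothetical and added only so there is something to count (the car dataset above has no missing values):

```python
import pandas as pd

df = pd.DataFrame({
    'Car': ['Toyota', 'Honda', 'BMW'],
    'Price': [24000, None, 60000],  # hypothetical missing value for illustration
})

missing_per_column = df.isnull().sum()     # count of nulls in each column
has_any_missing = df.isnull().any().any()  # True if any cell is null
print(missing_per_column)
print(has_any_missing)
```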

View Values in 2D Array:

print(df.values)

Output:

[['Toyota' 'Camry' 2020 24000 30000 False]
 ['Honda' 'Civic' 2019 22000 40000 False]
 ['BMW' 'X5' 2021 60000 20000 False]
 ['Audi' 'A4' 2018 35000 50000 False]
 ['Ford' 'Mustang' 2020 26000 15000 False]
 ['Chevrolet' 'Impala' 2017 29000 45000 False]
 ['Tesla' 'Model S' 2021 80000 10000 True]]
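
The pandas documentation recommends `DataFrame.to_numpy()` over `.values` for new code. Because the full DataFrame mixes strings, integers, and booleans, the resulting array has `object` dtype; selecting only the numeric columns first keeps a numeric dtype. A minimal sketch with illustrative variable names:

```python
import pandas as pd

df = pd.DataFrame({
    'Car': ['Toyota', 'Honda', 'Tesla'],
    'Price': [24000, 22000, 80000],
    'Mileage': [30000, 40000, 10000],
})

mixed = df.to_numpy()                           # mixed column types -> object array
numeric = df[['Price', 'Mileage']].to_numpy()   # integer columns -> integer array
print(mixed.dtype, numeric.dtype, numeric.shape)
```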

View Column and Row Names:

print(df.columns)
print(df.index)

Output:

Index(['Car', 'Model', 'Year', 'Price', 'Mileage', 'Electric'], dtype='object')
RangeIndex(start=0, stop=7, step=1)

Get Shape of the DataFrame:
```
print(df.shape)
```
Output:
```
(7, 6)
```

Print Specific Columns:

print(df[['Car', 'Model']])

Output:

         Car    Model
0      Toyota    Camry
1       Honda    Civic
2         BMW       X5
3        Audi       A4
4        Ford  Mustang
5   Chevrolet   Impala
6       Tesla  Model S

Print Specific Rows:

print(df.iloc[0:3])

Output:

      Car  Model  Year  Price  Mileage  Electric
0  Toyota  Camry  2020  24000    30000     False
1   Honda  Civic  2019  22000    40000     False
2     BMW     X5  2021  60000    20000     False

Print a Column as a Series:

print(df['Car'])

Output:

0       Toyota
1        Honda
2          BMW
3         Audi
4         Ford
5    Chevrolet
6        Tesla
Name: Car, dtype: object

Print a Column as a DataFrame:

print(df[['Car']])

Output:

         Car
0      Toyota
1       Honda
2         BMW
3        Audi
4        Ford
5   Chevrolet
6       Tesla

Select Specific Row and Column:

print(df.loc[0:2, ['Car', 'Model']])

Output:

      Car  Model
0  Toyota  Camry
1   Honda  Civic
2     BMW     X5
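
One detail worth noticing: `iloc[0:3]` and `loc[0:2]` both return three rows. Position-based slices with `iloc` exclude the stop value, while label-based slices with `loc` include it. A short sketch of the difference:

```python
import pandas as pd

df = pd.DataFrame({
    'Car': ['Toyota', 'Honda', 'BMW', 'Audi'],
    'Model': ['Camry', 'Civic', 'X5', 'A4'],
})

by_position = df.iloc[0:2]  # positions 0 and 1; the stop position is excluded
by_label = df.loc[0:2]      # labels 0, 1 and 2; the stop label is included
print(len(by_position), len(by_label))
```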

Shift Rows:

print(df.shift(1))

Output:

         Car    Model    Year    Price  Mileage  Electric
0        NaN      NaN     NaN      NaN      NaN       NaN
1     Toyota    Camry  2020.0  24000.0  30000.0     False
2      Honda    Civic  2019.0  22000.0  40000.0     False
3        BMW       X5  2021.0  60000.0  20000.0     False
4       Audi       A4  2018.0  35000.0  50000.0     False
5       Ford  Mustang  2020.0  26000.0  15000.0     False
6  Chevrolet   Impala  2017.0  29000.0  45000.0     False
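
Shifting is most useful on a single column, where it lines each row up with the previous one. As a sketch, the snippet below computes the price change from one row to the next; `previous_price` and `price_change` are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    'Car': ['Toyota', 'Honda', 'BMW'],
    'Price': [24000, 22000, 60000],
})

previous_price = df['Price'].shift(1)        # first row has no predecessor -> NaN
price_change = df['Price'] - previous_price  # difference from the row above
print(price_change)
```

`df['Price'].diff()` gives the same result in one call.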

Sort by Column Values:

print(df.sort_values(by=['Price']))

Output:

         Car    Model  Year  Price  Mileage  Electric
1       Honda    Civic  2019  22000    40000     False
0      Toyota    Camry  2020  24000    30000     False
4        Ford  Mustang  2020  26000    15000     False
5   Chevrolet   Impala  2017  29000    45000     False
3        Audi       A4  2018  35000    50000     False
2         BMW       X5  2021  60000    20000     False
6       Tesla  Model S  2021  80000    10000      True
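
`sort_values` also takes several columns, with a sort direction for each. A minimal sketch that puts the newest cars first and breaks ties by the lowest price (the variable name is illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'Car': ['Toyota', 'Honda', 'BMW', 'Audi', 'Ford', 'Chevrolet', 'Tesla'],
    'Year': [2020, 2019, 2021, 2018, 2020, 2017, 2021],
    'Price': [24000, 22000, 60000, 35000, 26000, 29000, 80000],
})

# Year descending, then Price ascending within the same year
by_year_then_price = df.sort_values(by=['Year', 'Price'], ascending=[False, True])
print(by_year_then_price)
```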

Sort by Index:

print(df.sort_index(ascending=False))

Output:

         Car    Model  Year  Price  Mileage  Electric
6       Tesla  Model S  2021  80000    10000      True
5   Chevrolet   Impala  2017  29000    45000     False
4        Ford  Mustang  2020  26000    15000     False
3        Audi       A4  2018  35000    50000     False
2         BMW       X5  2021  60000    20000     False
1       Honda    Civic  2019  22000    40000     False
0      Toyota    Camry  2020  24000    30000     False

Adjust Display Options:

pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
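
`pd.set_option` changes the setting for the rest of the session. When a setting is only needed for one printout, `pd.option_context` applies it inside a `with` block and restores the previous value afterwards. A small sketch:

```python
import pandas as pd

df = pd.DataFrame({
    'Car': ['Toyota', 'Honda', 'BMW', 'Audi', 'Ford', 'Chevrolet', 'Tesla'],
    'Price': [24000, 22000, 60000, 35000, 26000, 29000, 80000],
})

rows_before = pd.get_option('display.max_rows')

with pd.option_context('display.max_rows', 4):
    rows_inside = pd.get_option('display.max_rows')
    print(df)  # 7 rows exceed the limit of 4, so the printout is truncated

rows_after = pd.get_option('display.max_rows')  # back to the earlier value
```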

One-Hot Encode Columns:

# dtype=int gives the 0/1 indicators shown below; pandas 2.0+ defaults to True/False
print(pd.get_dummies(df, columns=['Car'], prefix=['Car'], drop_first=True, dtype=int))

Output:

     Model  Year  Price  Mileage  Electric  Car_BMW  Car_Chevrolet  Car_Ford  Car_Honda  Car_Tesla  Car_Toyota
0    Camry  2020  24000    30000     False        0              0         0          0          0           1
1    Civic  2019  22000    40000     False        0              0         0          1          0           0
2       X5  2021  60000    20000     False        1              0         0          0          0           0
3       A4  2018  35000    50000     False        0              0         0          0          0           0
4  Mustang  2020  26000    15000     False        0              0         1          0          0           0
5   Impala  2017  29000    45000     False        0              1         0          0          0           0
6  Model S  2021  80000    10000      True        0              0         0          0          1           0

By mastering these Pandas DataFrame view methods, you can effectively inspect, manipulate, and analyze your data, leading to better insights and decisions.

Emeron Marcelle