Exploring Pandas DataFrame Views: A Comprehensive Guide

Emeron Marcelle

When working with data in Python, the Pandas library provides the DataFrame, a tabular structure for managing and analyzing data. Knowing how to view and inspect a DataFrame's contents is the first step toward using it well. This guide walks through the methods and properties Pandas offers for that purpose.

Let's create a sample dataset containing information about cars and use it to show the output of each method:

Sample Car Dataset

import pandas as pd

# Creating a sample dataset
data = {
    'Car': ['Toyota', 'Honda', 'BMW', 'Audi', 'Ford', 'Chevrolet', 'Tesla'],
    'Model': ['Camry', 'Civic', 'X5', 'A4', 'Mustang', 'Impala', 'Model S'],
    'Year': [2020, 2019, 2021, 2018, 2020, 2017, 2021],
    'Price': [24000, 22000, 60000, 35000, 26000, 29000, 80000],
    'Mileage': [30000, 40000, 20000, 50000, 15000, 45000, 10000],
    'Electric': [False, False, False, False, False, False, True]
}

# Creating DataFrame
df = pd.DataFrame(data)
print(df)

Applying Various Pandas DataFrame View Methods

  1. Display the First Few Rows:

     print(df.head())
    

    Output:

           Car    Model  Year  Price  Mileage  Electric
     0  Toyota    Camry  2020  24000    30000     False
     1   Honda    Civic  2019  22000    40000     False
     2     BMW       X5  2021  60000    20000     False
     3    Audi       A4  2018  35000    50000     False
     4    Ford  Mustang  2020  26000    15000     False
    
  2. Display the Last Few Rows:

     print(df.tail())
    

    Output:

              Car    Model  Year  Price  Mileage  Electric
     2        BMW       X5  2021  60000    20000     False
     3       Audi       A4  2018  35000    50000     False
     4       Ford  Mustang  2020  26000    15000     False
     5  Chevrolet   Impala  2017  29000    45000     False
     6      Tesla  Model S  2021  80000    10000      True
    
  3. View Column Names:

     print(df.columns)
    

    Output:

     Index(['Car', 'Model', 'Year', 'Price', 'Mileage', 'Electric'], dtype='object')
    
  4. View Data Types of Each Column:

     print(df.dtypes)
    

    Output:

     Car         object
     Model       object
     Year         int64
     Price        int64
     Mileage      int64
     Electric      bool
     dtype: object
    
  5. Calculate Summary Statistics:

     print(df.describe())
    

    Output:

                   Year         Price       Mileage
     count     7.000000      7.000000      7.000000
     mean   2019.428571  39428.571429  30000.000000
     std       1.511858  22059.443502  15545.631755
     min    2017.000000  22000.000000  10000.000000
     25%    2018.500000  25000.000000  17500.000000
     50%    2020.000000  29000.000000  30000.000000
     75%    2020.500000  47500.000000  42500.000000
     max    2021.000000  80000.000000  50000.000000
    
  6. Get Detailed Information:

     # info() prints its report directly and returns None, so no print() is needed
     df.info()
    

    Output:

     <class 'pandas.core.frame.DataFrame'>
     RangeIndex: 7 entries, 0 to 6
     Data columns (total 6 columns):
      #   Column    Non-Null Count  Dtype 
     ---  ------    --------------  ----- 
      0   Car       7 non-null      object
      1   Model     7 non-null      object
      2   Year      7 non-null      int64 
      3   Price     7 non-null      int64 
      4   Mileage   7 non-null      int64 
      5   Electric  7 non-null      bool  
     dtypes: bool(1), int64(3), object(2)
     memory usage: 455.0+ bytes
    
  7. View Unique Values in a Column:

     print(df['Car'].unique())
    

    Output:

     ['Toyota' 'Honda' 'BMW' 'Audi' 'Ford' 'Chevrolet' 'Tesla']
    
  8. Check for Null Values:

     print(df.isnull())
    

    Output:

          Car  Model   Year  Price  Mileage  Electric
     0  False  False  False  False    False     False
     1  False  False  False  False    False     False
     2  False  False  False  False    False     False
     3  False  False  False  False    False     False
     4  False  False  False  False    False     False
     5  False  False  False  False    False     False
     6  False  False  False  False    False     False
    
  9. Check for Non-Null Values:

     print(df.notnull())
    

    Output:

          Car  Model   Year  Price  Mileage  Electric
     0   True   True   True   True     True      True
     1   True   True   True   True     True      True
     2   True   True   True   True     True      True
     3   True   True   True   True     True      True
     4   True   True   True   True     True      True
     5   True   True   True   True     True      True
     6   True   True   True   True     True      True
    
  10. View Values in 2D Array:

    print(df.values)
    

    Output:

    [['Toyota' 'Camry' 2020 24000 30000 False]
     ['Honda' 'Civic' 2019 22000 40000 False]
     ['BMW' 'X5' 2021 60000 20000 False]
     ['Audi' 'A4' 2018 35000 50000 False]
     ['Ford' 'Mustang' 2020 26000 15000 False]
     ['Chevrolet' 'Impala' 2017 29000 45000 False]
     ['Tesla' 'Model S' 2021 80000 10000 True]]
    
  11. View Column and Row Names:

    print(df.columns)
    print(df.index)
    

    Output:

    Index(['Car', 'Model', 'Year', 'Price', 'Mileage', 'Electric'], dtype='object')
    RangeIndex(start=0, stop=7, step=1)
    
  12. Get Shape of the DataFrame:

    print(df.shape)
    

    Output:

    (7, 6)
    
  13. Print Specific Columns:

    print(df[['Car', 'Model']])
    

    Output:

             Car    Model
    0     Toyota    Camry
    1      Honda    Civic
    2        BMW       X5
    3       Audi       A4
    4       Ford  Mustang
    5  Chevrolet   Impala
    6      Tesla  Model S
    
  14. Print Specific Rows:

    print(df.iloc[0:3])
    

    Output:

          Car  Model  Year  Price  Mileage  Electric
    0  Toyota  Camry  2020  24000    30000     False
    1   Honda  Civic  2019  22000    40000     False
    2     BMW     X5  2021  60000    20000     False
    
  15. Print a Column as a Series:

    print(df['Car'])
    

    Output:

    0       Toyota
    1        Honda
    2          BMW
    3         Audi
    4         Ford
    5    Chevrolet
    6        Tesla
    Name: Car, dtype: object
    
  16. Print a Column as a DataFrame:

    print(df[['Car']])
    

    Output:

             Car
    0     Toyota
    1      Honda
    2        BMW
    3       Audi
    4       Ford
    5  Chevrolet
    6      Tesla
    
  17. Select Specific Row and Column:

    print(df.loc[0:2, ['Car', 'Model']])
    

    Output:

          Car  Model
    0  Toyota  Camry
    1   Honda  Civic
    2     BMW     X5
    
  18. Shift Rows:

    print(df.shift(1))
    

    Output:

             Car    Model    Year    Price  Mileage  Electric
    0        NaN      NaN     NaN      NaN      NaN       NaN
    1     Toyota    Camry  2020.0  24000.0  30000.0     False
    2      Honda    Civic  2019.0  22000.0  40000.0     False
    3        BMW       X5  2021.0  60000.0  20000.0     False
    4       Audi       A4  2018.0  35000.0  50000.0     False
    5       Ford  Mustang  2020.0  26000.0  15000.0     False
    6  Chevrolet   Impala  2017.0  29000.0  45000.0     False
    
  19. Sort by Column Values:

    print(df.sort_values(by=['Price']))
    

    Output:

             Car    Model  Year  Price  Mileage  Electric
    1      Honda    Civic  2019  22000    40000     False
    0     Toyota    Camry  2020  24000    30000     False
    4       Ford  Mustang  2020  26000    15000     False
    5  Chevrolet   Impala  2017  29000    45000     False
    3       Audi       A4  2018  35000    50000     False
    2        BMW       X5  2021  60000    20000     False
    6      Tesla  Model S  2021  80000    10000      True
    
  20. Sort by Index:

    print(df.sort_index(ascending=False))
    

    Output:

             Car    Model  Year  Price  Mileage  Electric
    6      Tesla  Model S  2021  80000    10000      True
    5  Chevrolet   Impala  2017  29000    45000     False
    4       Ford  Mustang  2020  26000    15000     False
    3       Audi       A4  2018  35000    50000     False
    2        BMW       X5  2021  60000    20000     False
    1      Honda    Civic  2019  22000    40000     False
    0     Toyota    Camry  2020  24000    30000     False
    
  21. Adjust Display Options:

    pd.set_option('display.max_colwidth', None)
    pd.set_option('display.max_rows', None)
    pd.set_option('display.max_columns', None)
    
  22. One-Hot Encode Columns:

    # dtype=int keeps the indicator columns as 0/1; recent pandas versions default to True/False
    print(pd.get_dummies(df, columns=['Car'], prefix=['Car'], drop_first=True, dtype=int))
    

    Output:

         Model  Year  Price  Mileage  Electric  Car_BMW  Car_Chevrolet  Car_Ford  Car_Honda  Car_Tesla  Car_Toyota
    0    Camry  2020  24000    30000     False        0              0         0          0          0           1
    1    Civic  2019  22000    40000     False        0              0         0          1          0           0
    2       X5  2021  60000    20000     False        1              0         0          0          0           0
    3       A4  2018  35000    50000     False        0              0         0          0          0           0
    4  Mustang  2020  26000    15000     False        0              0         1          0          0           0
    5   Impala  2017  29000    45000     False        0              1         0          0          0           0
    6  Model S  2021  80000    10000      True        0              0         0          0          1           0
    
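The display options in item 21 are set globally and stay in effect for the rest of the session. If that is not what you want, one possible approach is to scope them with pd.option_context, or to undo a global change with pd.reset_option. The following is a minimal sketch, assuming a fresh session with default settings; the variable names (rows_before, rows_inside, and so on) are only illustrative and the frame is a three-column subset of the sample data.

```python
import pandas as pd

# Three columns of the article's sample data, enough to show the effect.
df = pd.DataFrame({
    'Car': ['Toyota', 'Honda', 'BMW', 'Audi', 'Ford', 'Chevrolet', 'Tesla'],
    'Model': ['Camry', 'Civic', 'X5', 'A4', 'Mustang', 'Impala', 'Model S'],
    'Price': [24000, 22000, 60000, 35000, 26000, 29000, 80000],
})

# Remember the current settings so the effect of each call is visible.
rows_before = pd.get_option('display.max_rows')
colwidth_before = pd.get_option('display.max_colwidth')

# option_context applies the options only inside the with-block.
with pd.option_context('display.max_rows', None, 'display.max_columns', None):
    rows_inside = pd.get_option('display.max_rows')
    print(df)

# Leaving the block restores the previous value automatically.
rows_after = pd.get_option('display.max_rows')

# A global change made with set_option can be returned to the
# pandas default with reset_option.
pd.set_option('display.max_colwidth', None)
pd.reset_option('display.max_colwidth')
colwidth_after = pd.get_option('display.max_colwidth')
```

Inside the with-block the frame prints without row or column truncation; once the block ends, later prints use the earlier settings again.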

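The isnull() and notnull() calls in items 8 and 9 return a full True/False table, which is hard to read once a DataFrame has more than a few rows. A common way to summarize it is to count the missing values per column. The sketch below assumes a small frame with one missing price; that missing value is hypothetical and was added only for illustration, since the sample car dataset has none.

```python
import pandas as pd

# Hypothetical data: the missing price is NOT part of the article's dataset.
df = pd.DataFrame({
    'Car': ['Toyota', 'Honda', 'BMW'],
    'Price': [24000, None, 60000],
})

# Number of missing values in each column.
null_counts = df.isnull().sum()

# Number of missing values in the whole DataFrame.
total_nulls = int(null_counts.sum())

# For each column, whether it contains at least one missing value.
has_nulls = df.isnull().any()

print(null_counts)
print(total_nulls)
print(has_nulls)
```

Here the Car column reports no missing values and the Price column reports one.
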
With these Pandas DataFrame view methods, you can check a dataset's structure, data types, missing values, and contents before moving on to deeper analysis.

