Date Manipulation with Pandas

Monojit Sarkar

Here's the task.

You need to generate every date between a start date and an end date, then extract the year, the month and the quarter number from each one.

import pandas as pd

# ISO format (YYYY-MM-DD) avoids any day-first vs. month-first ambiguity
start = '2018-01-01'
end = '2020-12-31'

df = pd.DataFrame(columns=['Year', 'Month', 'Quarter'])

Let's generate all the dates between the start date and the end date, both inclusive

dates = pd.date_range(start, end)
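As a quick sanity check, here is a minimal sketch of what `pd.date_range` returns for this range. It assumes the two endpoints are written as ISO strings and relies on the default daily frequency.

```python
import pandas as pd

# Endpoints written in ISO format; both are included in the result.
dates = pd.date_range("2018-01-01", "2020-12-31")

print(len(dates))     # 1096 (365 + 365 + 366, since 2020 is a leap year)
print(dates[0])       # 2018-01-01 00:00:00
print(dates[-1])      # 2020-12-31 00:00:00
print(dates.freqstr)  # D (one entry per calendar day)
```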

Now extract the year as

df['Year'] = dates.to_period('Y')

Now extract the month as

df['Month'] = dates.to_period('M').astype(str).str.replace(r'^\d*-', '', regex=True)

What's the reason for using regex?

Without the regex, pandas would return the month as a string such as 2018-01. The regex removes the leading digits along with the hyphen, leaving only the month number (01).
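To see the effect of the pattern in isolation, here is a small sketch on a three-month range. The shorter range and the variable names `month_periods` and `months` are chosen only for illustration.

```python
import pandas as pd

# A short range keeps the output easy to read.
dates = pd.date_range("2018-01-01", "2018-03-31")

# Monthly periods rendered as strings look like 'YYYY-MM'.
month_periods = dates.to_period("M").astype(str)
print(month_periods[0])  # 2018-01

# Strip the leading year digits and the hyphen, keeping only 'MM'.
months = month_periods.str.replace(r"^\d*-", "", regex=True)
print(months.unique().tolist())  # ['01', '02', '03']
```

Note that the result is still a string ('01'), not an integer.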

Now extract the Quarter number as

df['Quarter'] = dates.to_period('Q').astype(str).str.replace(r'^\d*', '', regex=True)

What's the reason for using regex?

Without the regex, pandas would return the quarter as a string such as 2018Q1. The regex removes the leading digits, leaving only the quarter label (Q1).
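If plain integers are acceptable instead of period strings, the same three columns can also be read directly from the `DatetimeIndex` attributes `year`, `month` and `quarter`, with no regex involved. This is an alternative sketch, not the approach used above, and it yields 1 rather than Q1 for the quarter.

```python
import pandas as pd

dates = pd.date_range("2018-01-01", "2020-12-31")

# Each attribute returns one integer per date.
df = pd.DataFrame({
    "Year": dates.year,
    "Month": dates.month,
    "Quarter": dates.quarter,
})

print(df.iloc[0].tolist())   # [2018, 1, 1]
print(df.iloc[-1].tolist())  # [2020, 12, 4]
```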

