Exploratory Risk Assessment Using Correlation Matrix and Standard Deviation


Introduction
In this project, I used correlation matrices and changes in standard deviation to explore how key Brazilian macroeconomic indicators behaved during and after the peak of the COVID-19 pandemic.
Correlation matrices are powerful tools for identifying linear relationships between economic variables such as the Selic rate, unemployment, and GDP (PIB). They help reveal which variables tend to move together, a valuable insight for forecasting trends and spotting redundancy in predictive models.
Standard deviation, meanwhile, measures how much a variable fluctuates around its average. By comparing the standard deviation across the pandemic and post-pandemic periods, I was able to highlight which indicators became more volatile or more stable, uncovering structural shifts in Brazil’s economic landscape.
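To make the two measures concrete, here is a minimal sketch on made-up numbers (not the project's dataset). The column names `x`, `y`, and `z` are purely illustrative. It shows that pandas computes Pearson correlation by default and the sample standard deviation (`ddof=1`).

```python
import pandas as pd

# Made-up numbers for illustration only; not the article's data.
toy = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [2.0, 4.0, 6.0, 8.0, 10.0],  # moves exactly with x
    "z": [5.0, 4.0, 3.0, 2.0, 1.0],   # moves exactly against x
})

corr = toy.corr()  # Pearson correlation matrix (pandas default)
std = toy.std()    # sample standard deviation per column (ddof=1)

print(corr)
print(std)
```

Here `x` and `y` are perfectly positively correlated, `x` and `z` perfectly negatively correlated, and `y` fluctuates twice as much as `x` around its mean.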
Methodology
The dataset was indexed using datetime for temporal organization. Then:
Missing values were interpolated;
Numerical variables were standardized using StandardScaler;
Three correlation matrices were generated: overall, pandemic, and post-pandemic;
Finally, the standard deviation was calculated for each period, along with its percentage variation.
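The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's exact pipeline: the data is synthetic, the column names `indicator_a` and `indicator_b` are hypothetical, the post-pandemic window is shortened to 2022-2023, and standardization is done by hand with `ddof=0` (the population standard deviation that scikit-learn's StandardScaler uses) to keep the example free of extra dependencies.

```python
import numpy as np
import pandas as pd

# Synthetic monthly data, 2020-01 through 2023-12 (hypothetical values).
rng = np.random.default_rng(0)
idx = pd.date_range("2020-01-01", periods=48, freq="MS")
df = pd.DataFrame(
    {
        "indicator_a": rng.normal(10, 2, 48),
        "indicator_b": rng.normal(5, 1, 48),
    },
    index=idx,
)
df.iloc[3, 0] = np.nan  # simulate a missing observation

# 1. Fill gaps by linear interpolation.
df = df.interpolate()

# 2. Standardize to zero mean and unit variance (ddof=0 mirrors StandardScaler).
z = (df - df.mean()) / df.std(ddof=0)

# 3. Correlation matrices: overall, pandemic, and post-pandemic.
corr_all = z.corr()
corr_pandemic = z.loc["2020":"2021"].corr()
corr_post = z.loc["2022":"2023"].corr()

# 4. Standard deviation per period (raw values) and its percentage variation.
std_pandemic = df.loc["2020":"2021"].std()
std_post = df.loc["2022":"2023"].std()
pct_change = (std_post - std_pandemic) / std_pandemic * 100

print(pct_change.round(1))
```

Pearson correlation is unaffected by shifting and rescaling each column, so `z.corr()` and `df.corr()` give the same matrix. Standardization matters for comparing magnitudes across variables, not for the correlations themselves.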
Results and Discussion
The overall correlation matrix (Figure 1) highlights strong and expected relationships, such as:
IPCA and INPC showing a very high positive correlation (0.98);
Selic and PIB with a strong positive correlation (0.81), while Selic and Unemployment show a strong negative correlation (-0.91);
PIB and Unemployment with a strong negative correlation (-0.93), reflecting the cyclical nature of economic activity.
Figure 1. Correlation Matrix of Brazilian Macroeconomic Variables
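With many variables, reading the strongest relationships off a heatmap gets tedious. One possible way to rank them, sketched below, is to keep the upper triangle of the matrix and sort the pairs by absolute value. The 3x3 matrix uses only the Selic, PIB, and Unemployment coefficients quoted above; the variable names `pairs` and `ranked` are my own.

```python
import numpy as np
import pandas as pd

# Coefficients quoted in the text for Selic, PIB, and Unemployment.
labels = ["Selic", "PIB", "Unemployment"]
corr = pd.DataFrame(
    [
        [1.00, 0.81, -0.91],
        [0.81, 1.00, -0.93],
        [-0.91, -0.93, 1.00],
    ],
    index=labels,
    columns=labels,
)

# Keep the upper triangle only (k=1 drops the diagonal) so each pair appears once.
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack().dropna()

# Rank pairs by absolute strength, strongest first.
ranked = pairs.sort_values(key=lambda s: s.abs(), ascending=False)
print(ranked)
```

The same ranking could be applied to the difference between two period matrices to see which relationships shifted the most.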
The drop in the volatility of the Selic rate, income, and unemployment indicators (Figure 2) suggests a more stable economic environment in the post-pandemic period, despite persistently high inflation. On the other hand, the increased volatility of the Consumer Confidence Index (ICC) points to growing uncertainty in how Brazilians perceive the economy, even as objective indicators show signs of stabilization.
Figure 2. Percentage Variation of Standard Deviation
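For reference, the quantity plotted in Figure 2 is the relative change in standard deviation between the two periods, as computed in the code at the end of this article. The symbols below are my own notation; each standard deviation is the sample standard deviation that pandas returns by default.

```latex
\Delta_{\%} = \frac{\sigma_{\text{post}} - \sigma_{\text{pandemic}}}{\sigma_{\text{pandemic}}} \times 100
```

As a purely hypothetical example, an indicator whose standard deviation falls from 2.0 in the pandemic period to 1.5 afterwards has a variation of (1.5 - 2.0) / 2.0 × 100 = -25%, meaning it became less volatile.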
Conclusion
This study highlights how statistical tools such as the correlation matrix and the analysis of standard deviation variation can uncover meaningful patterns and shifts in Brazil’s macroeconomic dynamics. The reduction in volatility for indicators like the Selic rate, income, and unemployment contrasts sharply with the marked increase in the volatility of consumer confidence. These findings underscore the importance of combining statistical analysis with contextual interpretation to produce more accurate and insightful economic assessments.
Code (Python | Google Colab)
from google.colab import files
import pandas as pd
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt
import seaborn as sns
uploaded = files.upload()
for file_name in uploaded.keys():
    print(f"File loaded: {file_name}")
    uploaded_file_name = file_name
df = pd.read_excel(uploaded_file_name)
print(df.columns.tolist())
df['Period'] = pd.to_datetime(df['Period'])
df.set_index('Period', inplace=True)
print(df.isna().sum())
df = df.interpolate().dropna()
numeric_cols = df.select_dtypes(include=['float64', 'int64']).columns
df_normalized = df.copy()
scaler = StandardScaler()
df_normalized[numeric_cols] = scaler.fit_transform(df[numeric_cols])
corr_matrix = df_normalized[numeric_cols].corr()
plt.figure(figsize=(16, 12))
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', center=0, fmt='.2f',
            annot_kws={'size': 8}, linewidths=0.5)
plt.title('Correlation Matrix of Brazilian Macroeconomic Variables')
plt.xticks(rotation=45)
plt.yticks(rotation=0)
plt.tight_layout()
plt.show()
# Split the sample into pandemic and post-pandemic periods (partial-string date slicing)
df_pandemia = df.loc['2020':'2021']
df_pospandemia = df.loc['2022':'2025']
# Correlation matrix for each period and the change between them
corr_pandemia = df_pandemia[numeric_cols].corr()
corr_pospandemia = df_pospandemia[numeric_cols].corr()
diff_corr = corr_pospandemia - corr_pandemia
# Standard deviation for each period, computed on the raw (unstandardized) values
std_pandemia = df_pandemia[numeric_cols].std()
std_pos_pandemia = df_pospandemia[numeric_cols].std()
std_comparison = pd.DataFrame({
    'Std Dev Pandemic (2020-2021)': std_pandemia,
    'Std Dev Post-Pandemic (2022-2025)': std_pos_pandemia,
    'Absolute Variation': std_pos_pandemia - std_pandemia,
    'Percentage Variation (%)': ((std_pos_pandemia - std_pandemia) / std_pandemia * 100).round(1)
})
std_comparison.sort_values('Percentage Variation (%)', ascending=False, inplace=True)
print(std_comparison)
# Green bars mark indicators whose volatility increased; red bars mark decreases
colors = ['green' if x > 0 else 'red' for x in std_comparison['Percentage Variation (%)']]
plt.figure(figsize=(10, 6))
plt.barh(std_comparison.index, std_comparison['Percentage Variation (%)'], color=colors)
plt.axvline(0, color='gray', linestyle='--')
plt.title('Percentage Variation of Standard Deviation (Post-Pandemic vs. Pandemic)')
plt.xlabel('Percentage Variation (%)')
plt.grid(axis='x', linestyle=':', alpha=0.5)
plt.tight_layout()
plt.show()
Written by

Bernardo Ribeiro de Moura
Senior Data Analyst at Unimed Rio Preto, working with predictive models, cost optimization, and data-driven decision-making. Bachelor’s in Chemistry (UNESP), transitioning to Data Science (UNIVESP), combining science and technology to solve real-world problems. Specialized in Google Data Analytics. I write about predictive analysis, data visualization, and statistical modeling. Let’s exchange ideas on Python, SQL, and the impact of data in our daily lives!