Margin Reversal: How to Spot Hidden Risks Before They Turn into Losses


Introduction
Projecting performance trends for companies, groups, products, or beneficiaries is often a relatively straightforward task. But what happens when that performance shifts over time?
The question that inspired this algorithm was: Are there groups that start 2025 with positive projections but, at some point, see their margins turn negative, posing a threat to operational sustainability? And more importantly: How can we identify these groups within massive datasets?
Additionally, among all groups with signs of margin inversion, which ones are the most concerning?
Simply sorting by absolute margin value isn’t enough, it could lead us to overlook dangerous cases with modest values but worrying trends.
To address this, I developed a projection algorithm combined with a composite risk index that assigns different weights to key signs of deterioration. This score allows for more accurate and fair prioritization of high-risk cases. A fictional dataset was created to demonstrate the impact of the method.
Brief description of the code logic
The dataset used contains the projected monthly margin for 2025 (revenue minus cost), by company.
The goal was to detect margin inversions over time, even when the total projected margin for the year still appears positive or only moderately negative (**Note: We have actual historical data from 2024 through April 2025. The remainder of 2025 will be projected).
The analysis was designed with the following considerations:
Linear regression applied to each time series:
The X-axis represents the accounting period, while the Y-axis represents the monthly margin.Exclusion of low-reliability series:
For statistical robustness, the script filters out time series with fewer than 6 months of data, ensuring that indicators are based on sufficient historical context.Risk score (only calculated when inversion is detected) is composed of four components:
Worst month of the year (min_magin) → 40% weight
(The more negative the lowest projected margin, the higher the risk)Number of negative months (num_negatives) → 20% weight
(The more frequent the negative months, the greater the risk)Timing of the first negative month (first_negative_idx) → 20% weight
(The earlier the inversion occurs, the more alarming it is)Drop between the first and worst month (delta) → 20% weight
(This captures the intensity of the deterioration)
Finally, all time series and summarized risk scores were plotted and presented in tabular format to facilitate visualization and interpretation.
Results and Discussion
The analysis revealed groups with initially positive margins that shifted to negative values over the course of 2025. This demonstrates that relying solely on cumulative totals can be misleading, as illustrated by Company C: despite ending with a net positive result, it experienced critical mid-period monthly losses.
Figure 1. Time Series and Summary Table
Groups such as A, E, and B exhibit recurring negative months, signaling structural deterioration. Others, like Group C, face isolated but severe shocks, which equally pose risks. All high-score groups display a downward trend, underscoring the critical need to analyze behavioral trajectories, not just aggregate results.
Conclusion
Early identification of margin reversals is critical to preventing unexpected losses. By combining projections with multiple risk factors, we can intelligently and proactively prioritize the most critical groups. This approach is simple yet powerful, and can be continuously applied to support strategic decision-making.
Code (Python | Google Colab)
from google.colab import files
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
from tabulate import tabulate
import textwrap
uploaded = files.upload()
for file_name in uploaded.keys():
print(f"Arquivo carregado: {file_name}")
uploaded_file_name = file_name
df = pd.read_excel(uploaded_file_name)
print("Coluna:")
print(df.columns.tolist())
df['Period'] = pd.to_datetime(df['Period'])
df = df.sort_values('Period')
min_date = df['Period'].min()
df['Month_Num'] = ((df['Period'].dt.year - min_date.year) * 12 +
(df['Period'].dt.month - min_date.month))
months_2025 = pd.date_range(start='2025-01-01', end='2025-12-01', freq='MS')
months_2025_num = ((months_2025.year - min_date.year) * 12 +
(months_2025.month - min_date.month))
projections_2025 = []
for group_name, group in df.groupby('Company'):
if group['Month_Num'].nunique() < 6:
continue
X = group[['Month_Num']]
y = group['Magin']
model = LinearRegression().fit(X, y)
existing_months = group['Month_Num'].unique()
months_to_project = [m for m in months_2025_num if m not in existing_months]
if not months_to_project:
continue
X_proj = pd.DataFrame({'Month_Num': months_to_project})
magin_proj = model.predict(X_proj)
has_inversion = False
if magin_proj[0] > 0 and np.any(magin_proj < 0):
has_inversion = True
total_2025 = np.sum(magin_proj)
if has_inversion:
min_magin = magin_proj.min()
num_negatives = sum(m < 0 for m in magin_proj)
first_negative_idx = np.argmax(magin_proj < 0)
delta = magin_proj[0] - min_magin
score = abs(min_magin) * 0.4 + num_negatives * 0.2 + (12 - first_negative_idx) * 0.2 + delta * 0.2
else:
score = 0
projections_2025.append({
'Company Name': group_name,
'Monthly Magin 2025': magin_proj,
'Projected Months': months_to_project,
'Historical Group': group,
'HasInversion': has_inversion,
'Risk_Score': score,
'Total_Projected_2025': total_2025
})
df_proj = pd.DataFrame(projections_2025)
df_criticals = df_proj[df_proj['HasInversion'] == True]
top_criticals = df_criticals.sort_values('Risk_Score', ascending=False).head(20)
def plot_projection(group_df, title):
for i, row in group_df.iterrows():
group = row['Historical Group']
proj_months = row['Projected Months']
magin_proj = row['Monthly Magin 2025']
name = row['Company Name']
real_months = group['Month_Num'].values
real_magin = group['Magin'].values
total_2025 = row['Total_Projected_2025']
score = row['Risk_Score']
plt.figure(figsize=(11, 4))
plt.plot(real_months, real_magin / 1_000, label='Historical', marker='o', color='blue')
proj_colors = ['green' if val > 0 else 'red' for val in magin_proj]
for x, y, color in zip(proj_months, magin_proj, proj_colors):
plt.scatter(x, y / 1_000, color=color, marker='x', s=50)
plt.plot(proj_months, magin_proj / 1_000, linestyle='--', color='gray', alpha=0.6)
for x, y in zip(real_months, real_magin):
plt.annotate(f"{y/1_000:.0f}k", (x, y/1_000),
textcoords="offset points", xytext=(0,5), ha='center', fontsize=8, color='blue')
for x, y, color in zip(proj_months, magin_proj, proj_colors):
plt.annotate(f"{y/1_000:.0f}k", (x, y/1_000),
textcoords="offset points", xytext=(0,-10), ha='center', fontsize=8, color=color)
plt.title(f'{title} – {name}\nScore: {score:.1f} | Total Projected 2025: R${total_2025/1_000:.0f}k')
plt.xlabel('Months since start of dataset')
plt.ylabel('Magin (R$ thousands)')
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
print("\n📉 Cases with projected inversion and significant financial risk:")
plot_projection(top_criticals, 'Magin Inversion – Projection 2025')
def format_thousands(value):
"""Format values in thousands with color"""
value_k = value / 1_000
if value_k < 0:
return f"\033[91m{value_k:,.0f}k\033[0m" # red
else:
return f"\033[92m{value_k:,.0f}k\033[0m" # green
def create_summary_table(df_proj, top_n=20):
table_data = []
for _, row in df_proj.sort_values('Risk_Score', ascending=False).head(top_n).iterrows():
name = textwrap.shorten(row['Company Name'], width=30, placeholder="...")
total_proj = format_thousands(row['Total_Projected_2025'])
score = f"{row['Risk_Score']:.1f}"
neg_months = sum(m < 0 for m in row['Monthly Magin 2025'])
neg_months_str = f"\033[91m{neg_months}\033[0m" if neg_months > 0 else "0"
worst_month = format_thousands(min(row['Monthly Magin 2025']))
first = row['Monthly Magin 2025'][0]
last = row['Monthly Magin 2025'][-1]
trend = "↓" if last < first else "↑" if last > first else "→"
trend_colored = f"\033[91m{trend}\033[0m" if last < first else f"\033[92m{trend}\033[0m"
table_data.append([
name, total_proj, score, neg_months_str, worst_month, trend_colored
])
headers = [
"\033[1mGroup\033[0m",
"\033[1mTotal 2025\033[0m",
"\033[1mScore\033[0m",
"\033[1mNeg. Months\033[0m",
"\033[1mWorst Month\033[0m",
"\033[1mTrend\033[0m"
]
config = {
'headers': headers,
'tablefmt': 'pretty',
'stralign': 'left',
'numalign': 'right'
}
print(f"\n\033[1m📊 SUMMARY – TOP {top_n} GROUPS WITH HIGHEST INVERSION RISK\033[0m")
print(tabulate(table_data, **config))
print("\n\033[3mLegend: ↓=Deterioration | ↑=Improvement | →=Stable\033[0m")
create_summary_table(df_criticals, top_n=5)
Subscribe to my newsletter
Read articles from Bernardo Ribeiro de Moura directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by

Bernardo Ribeiro de Moura
Bernardo Ribeiro de Moura
Senior Data Analyst at Unimed Rio Preto, working with predictive models, cost optimization, and data-driven decision-making. Bachelor’s in Chemistry (UNESP), transitioning to Data Science (UNIVESP), combining science and technology to solve real-world problems. Specialized in Google Data Analytics. I write about predictive analysis, data visualization, and statistical modeling. Let’s exchange ideas on Python, SQL, and the impact of data in our daily lives!