I developed a simple yet effective algorithm for identifying outliers in a database that is updated daily and holds thousands of historical item-level records. It uses no machine learning or deep learning: the method iterates over each time series, compares costs against the series' median, and sorts the results by financial variation in descending order so that the most critical outliers come first. I then plot the trends of the most relevant transactions. This solution has become one of the most useful and easily implemented tools in my weekly routine. For this article, I manually created a fictional dataset to simulate the results.

Code Description

I used Google Colab to load an Excel file. The code lets me upload the dataset directly through the notebook interface and then displays the name of the selected file.
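As a rough illustration of this step, the sketch below handles the dictionary that Colab's `files.upload()` returns (file name mapped to raw bytes). The helper name `pick_uploaded_file` is hypothetical, and the Colab-specific calls are left as comments because they only run inside a Colab notebook. The code in Figure 1 may differ.

```python
import io


def pick_uploaded_file(uploaded):
    """Return (filename, in-memory buffer) for the first uploaded file.

    `uploaded` is expected to look like the dict returned by
    google.colab's files.upload(): {filename: file contents as bytes}.
    """
    if not uploaded:
        raise ValueError("no file was uploaded")
    filename, content = next(iter(uploaded.items()))
    print(f"Selected file: {filename}")
    return filename, io.BytesIO(content)


# Inside a Colab notebook (assumption: pandas with an Excel engine is available):
#   from google.colab import files
#   import pandas as pd
#   filename, buffer = pick_uploaded_file(files.upload())
#   df = pd.read_excel(buffer)
```

Wrapping the bytes in a buffer lets `pd.read_excel` read the upload without depending on where Colab stores the file.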

Figure 1. File import process.

Next, I sorted the accounting periods in chronological order. For each month, I calculated the median cost per item using only historical data, then compared these medians with the costs of the month under review. I kept only the items whose cost was above their historical median, computed the absolute variation (cost minus median), and finally sorted the results in descending order to highlight the top 30 deviations, creating a prioritized list of financial outliers.
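The logic of this step can be sketched with the standard library alone. This is a minimal illustration and not the code shown in Figure 2. It assumes one cost per item per period, periods written as `YYYY-MM` strings so they sort chronologically as text, and "historical" meaning strictly earlier periods. The function name `rank_outliers` and the sample records are made up for the example.

```python
from statistics import median


def rank_outliers(records, top_n=30):
    """Rank (item, period) pairs by how far the cost exceeds the historical median.

    records: iterable of (period, item, cost) tuples.
    """
    history = {}   # item -> costs seen in earlier periods
    findings = []
    # Walk the periods in chronological order.
    for period, item, cost in sorted(records, key=lambda r: r[0]):
        past = history.setdefault(item, [])
        if past:  # the first period of an item has no history to compare against
            baseline = median(past)
            if cost > baseline:  # keep only costs above the historical median
                findings.append({
                    "item": item,
                    "period": period,
                    "cost": cost,
                    "median": baseline,
                    "variation": cost - baseline,
                })
        past.append(cost)  # the current cost becomes history for later periods
    # Largest financial variation first.
    findings.sort(key=lambda f: f["variation"], reverse=True)
    return findings[:top_n]


# Fictional records, for illustration only.
records = [
    ("2024-01", "A", 100.0), ("2024-02", "A", 110.0),
    ("2024-03", "A", 105.0), ("2024-04", "A", 400.0),
    ("2024-01", "B", 50.0), ("2024-02", "B", 52.0),
    ("2024-03", "B", 49.0), ("2024-04", "B", 51.0),
]
for row in rank_outliers(records, top_n=3):
    print(row)
```

With these sample records, item A in 2024-04 ranks first, since its cost of 400 sits 295 above the median of its three earlier months.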

Figure 2. Iterative Median-Based Outlier Detection Algorithm.

Finally, for each of the top 10 outliers, I plotted a time-series chart comparing actual monthly costs against the rolling historical median (progressively calculated with each new period). I used solid lines for observed costs and dashed lines for median values. This visualization clearly highlights when and how costs exceeded historical patterns, revealing either seasonal trends or abrupt spikes.
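A chart of this kind could be produced along the following lines. This is a hedged sketch, not the code behind Figure 3. It assumes matplotlib is installed and that the median at each period is taken over the strictly earlier periods, so the first point has no median. Both function names are hypothetical.

```python
import math
from statistics import median


def historical_median_series(costs):
    """Median of all earlier costs at each position (NaN where no history exists)."""
    return [median(costs[:i]) if i else math.nan for i in range(len(costs))]


def plot_item(item, periods, costs):
    """Draw observed costs (solid line) against the historical median (dashed line)."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing

    fig, ax = plt.subplots(figsize=(8, 3))
    ax.plot(periods, costs, "-", marker="o", label="Observed cost")
    ax.plot(periods, historical_median_series(costs), "--", label="Historical median")
    ax.set_title(f"Item {item}")
    ax.set_xlabel("Accounting period")
    ax.set_ylabel("Cost")
    ax.legend()
    fig.autofmt_xdate()  # tilt the period labels so they do not overlap
    return fig
```

Calling `plot_item` once per item in the ranked list, for example for the first 10 entries, yields one chart per outlier.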

Figure 3. Time series visualization of cost outliers.

The four main time series demonstrate the algorithm's effectiveness: each plot shows outliers as sharp deviations from the historical median, highlighting both isolated spikes and anomalous trends. Seeing the problem areas at a glance supports the approach and focuses the analysis on the most critical transactions.

Figure 4. Example time series with critical outliers.

Conclusion

This project demonstrates how a simple solution based on medians and progressive iterations can effectively identify financial outliers. The algorithm shows that, with clear logic and minimal processing, historical data can be transformed into actionable insights, highlighting critical anomalies without requiring sophisticated tools or extensive computational resources.

Bernardo Ribeiro de Moura
