Geospatial Analysis and Outlier Detection Report: A Case Study of the Ebonyi State 2023 Presidential Election Dataset

UWAH SALOME

1. Introduction

This report analyses the 2023 presidential election data for Ebonyi State to identify polling units with potentially unusual voting patterns. I used geocoding to obtain the latitude and longitude of each polling unit or ward, geospatial techniques to find neighbouring units, and outlier scores based on each unit's deviation from the average vote of its neighbours. The goal is to pinpoint polling units whose results deviate significantly from those of their neighbours, which may indicate irregularities or outside influence. All analysis was done in Python.

2. Methodology

Data Acquisition

  • The Ebonyi State voting data, containing polling unit IDs, party names, vote counts and other fields, was obtained from a drive link shared with us.

  • Latitude and longitude were missing from the dataset, so a geocoding service, specifically the Google Maps Geocoding API, was used to obtain them.
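The geocoding step could be sketched roughly as follows. This is a minimal illustration and not the code used for this report: it assumes the public JSON endpoint of the Google Maps Geocoding API, a valid API key, and address strings built from the polling unit names. The helper names (build_geocode_url, parse_geocode_response, geocode) are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Public JSON endpoint of the Google Maps Geocoding API.
GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(address, api_key):
    """Build the request URL for a single address lookup."""
    query = urllib.parse.urlencode({"address": address, "key": api_key})
    return f"{GEOCODE_URL}?{query}"


def parse_geocode_response(payload):
    """Return (lat, lng) from a decoded JSON response, or None if no match."""
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    location = payload["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


def geocode(address, api_key):
    """Look up one address; requires network access and a valid API key."""
    url = build_geocode_url(address, api_key)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return parse_geocode_response(json.load(resp))
```

Only the first match is kept here, so ambiguous place names may resolve to the wrong location; checking a sample of the returned coordinates by hand is a sensible precaution.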

Neighbour Identification

  • A radius of 1 kilometre was defined to determine nearby polling units.

  • The Find_neighbour function identified neighbouring units based on their geospatial coordinates.
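One way such a radius search could work is sketched below. It is an assumption-laden illustration, not the report's Find_neighbour function: it uses the haversine great-circle distance and a simple pairwise comparison, and the names haversine_km and find_neighbours_within_radius are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # min() guards against rounding pushing the value slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def find_neighbours_within_radius(units, radius_km=1.0):
    """Map each unit id to the ids of all other units within radius_km.

    units: dict mapping unit id -> (latitude, longitude) in degrees.
    """
    neighbours = {uid: [] for uid in units}
    ids = list(units)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if haversine_km(*units[a], *units[b]) <= radius_km:
                neighbours[a].append(b)
                neighbours[b].append(a)
    return neighbours
```

The pairwise loop is quadratic in the number of units, which is usually acceptable for a single state; a spatial index would be the natural replacement for much larger datasets.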

Outlier Score Calculation

  • The calculate_outlier_scores function calculated outlier scores for each party in each polling unit.

  • Scores were derived from the absolute difference between a polling unit's vote count for a party and the average vote count for that party in its neighbouring units.
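The scoring rule described above can be written down compactly. The sketch below is not the report's calculate_outlier_scores function; it assumes votes are held in a plain dictionary per unit, that a neighbour with no recorded votes for a party counts as zero, and that units without any neighbours are left unscored. The name neighbour_deviation_scores is hypothetical.

```python
from statistics import mean


def neighbour_deviation_scores(votes, neighbours):
    """Score = |unit's votes for a party - mean of its neighbours' votes|.

    votes: dict mapping unit id -> {party: vote count}.
    neighbours: dict mapping unit id -> list of neighbouring unit ids.
    """
    scores = {}
    for unit, party_votes in votes.items():
        # Keep only neighbours that actually have vote records.
        nearby = [n for n in neighbours.get(unit, []) if n in votes]
        if not nearby:
            continue  # no neighbours in range, so the score is undefined
        scores[unit] = {
            party: abs(count - mean(votes[n].get(party, 0) for n in nearby))
            for party, count in party_votes.items()
        }
    return scores
```

Because the score is an absolute difference in raw votes, large polling units tend to score higher than small ones; dividing by the neighbour average or by registered voters would be one way to make scores comparable across units.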

3. Findings

  1. Sorted Outlier Scores

The dataset was sorted by outlier score for each party, highlighting the polling units with the largest deviations: 11-01-01-002, 11-01-01-003 and 11-01-01-004.

  2. Top 3 Outliers

For each party, the top 3 outliers (the polling units with the highest outlier scores) are the polling units with codes 11-01-01-002, 11-01-01-003 and 11-01-01-004.
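Sorting and picking the top 3 per party might look like the sketch below. It assumes the scores are stored as a dictionary of per-unit, per-party values; the function name top_outliers_by_party is hypothetical and this is not the code used for the report.

```python
def top_outliers_by_party(scores, n=3):
    """Return, for each party, the n units with the highest outlier score.

    scores: dict mapping unit id -> {party: outlier score}.
    Result: dict mapping party -> list of (unit id, score), highest first.
    """
    by_party = {}
    for unit, party_scores in scores.items():
        for party, score in party_scores.items():
            by_party.setdefault(party, []).append((unit, score))
    return {
        party: sorted(rows, key=lambda row: row[1], reverse=True)[:n]
        for party, rows in by_party.items()
    }
```

Units with equal scores keep their original relative order, since Python's sort is stable.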

These outliers were then examined by:

  • Comparing the outlier unit's vote count with the average and individual vote counts in neighbouring units.

  • Examining the voting patterns of the surrounding polling units for any potential abnormalities.

  • Investigating whether known external factors, such as demographics or accessibility, explain the outlier score.

4. Visualization

5. Conclusion

This analysis identified polling units whose outlier scores indicate potential deviations from their surrounding areas. The top 3 outliers for each party were investigated, providing insight into potential irregularities. Limitations of the data and methodology include data accuracy and the choice of radius. Future analyses could use different spatial analysis techniques or incorporate additional datasets for a more comprehensive understanding.

Code Snippet
