Building a Mobile Face Recognition SDK from Scratch


The Twins Finder Story

The Problem with Existing Solutions

Cloud Services: AWS Rekognition and Google Vision API work well, but requiring internet for every photo analysis killed the user experience, and they bring privacy concerns and per-call API costs.

Open Source Libraries: Found plenty on GitHub, but they were either research projects with inconsistent accuracy or designed for server deployments with dependencies too heavy for mobile.

Commercial SDKs: Available but priced for enterprise customers ($10k+ licenses) and most still required network connectivity.

Building PerchEye SDK

Instead of compromising, I decided to build exactly what I needed.

Technical Requirements

  • Offline-first: No network dependency

  • Cross-platform: Single codebase for Android, iOS, Flutter, React Native

  • Mobile-optimized: Reasonable memory footprint and processing speed

  • Privacy-focused: No biometric data storage or transmission

Implementation Approach

I used TensorFlow Lite as the foundation but had to optimize the models extensively for mobile constraints. The biggest challenge was balancing accuracy against resource usage.
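
For context, here is a minimal sketch of an inference entry point using the stock TensorFlow Lite Interpreter API on Android. The model file name and the 128-float embedding size are placeholders, not PerchEye's actual values.

```kotlin
import android.content.Context
import org.tensorflow.lite.Interpreter
import java.io.FileInputStream
import java.nio.MappedByteBuffer
import java.nio.channels.FileChannel

// Load a bundled .tflite model and run a single inference.
// "face_embedder.tflite" and the 128-float output are placeholders.
class EmbeddingModel(context: Context) {
    private val interpreter: Interpreter

    init {
        val options = Interpreter.Options().apply { setNumThreads(4) }
        interpreter = Interpreter(loadModel(context, "face_embedder.tflite"), options)
    }

    private fun loadModel(context: Context, name: String): MappedByteBuffer {
        val fd = context.assets.openFd(name)
        FileInputStream(fd.fileDescriptor).channel.use { channel ->
            return channel.map(FileChannel.MapMode.READ_ONLY, fd.startOffset, fd.declaredLength)
        }
    }

    // input: a preprocessed face crop as a normalized [1][H][W][3] float tensor
    fun embed(input: Array<Array<Array<FloatArray>>>): FloatArray {
        val output = Array(1) { FloatArray(128) }
        interpreter.run(input, output)
        return output[0]
    }
}
```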

Final specs:

  • Model size: ~15MB

  • Face detection: 50-100ms

  • Hash generation: ~150ms

  • Similarity comparison: <20ms

  • Memory usage: ~75MB peak

  • Accuracy: >95% (clear frontal faces)

Architecture Decisions

Transaction-based API: Allows feeding multiple images for better accuracy, which aligns with how people naturally take photos.
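
A sketch of what a transaction-style enrollment flow could look like; the names and the mean-fusion step are my illustration, not the actual PerchEye API. The idea is that accumulating several frames lets one bad frame be averaged out.

```kotlin
import android.graphics.Bitmap

// Hypothetical transaction: accumulate frames of the same person,
// then fuse their embeddings (here, a simple per-dimension mean).
class FaceTransaction(private val embed: (Bitmap) -> FloatArray) {
    private val vectors = mutableListOf<FloatArray>()

    fun addImage(frame: Bitmap) {
        vectors += embed(frame)
    }

    fun commit(): FloatArray {
        require(vectors.isNotEmpty()) { "no frames added" }
        val mean = FloatArray(vectors[0].size)
        for (v in vectors) for (i in mean.indices) mean[i] += v[i]
        for (i in mean.indices) mean[i] /= vectors.size
        return mean
    }
}
```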

Mathematical hashes: Store only feature vectors, never actual face data. Hashes are 2-5KB and can't be reverse-engineered into images.
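
For illustration, one standard way to compare two feature vectors is cosine similarity; the actual PerchEye scoring function isn't documented here. The size range also lines up with plain float vectors: 512 float32 values take 2KB, and about 1,280 take 5KB.

```kotlin
import kotlin.math.sqrt

// Cosine similarity between two feature vectors:
// 1.0 = same direction, 0.0 = orthogonal (dissimilar).
fun cosineSimilarity(a: FloatArray, b: FloatArray): Float {
    require(a.size == b.size)
    var dot = 0f
    var normA = 0f
    var normB = 0f
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    return dot / (sqrt(normA) * sqrt(normB))
}
```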

Native performance: C++ core with platform-specific bindings to avoid cross-platform performance penalties.
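
On Android, such a binding layer could look like this hypothetical JNI surface; the library and function names are illustrative. Kotlin declares the functions, the native .so implements them.

```kotlin
// Hypothetical JNI surface over a C++ core; names are illustrative.
object PerchEyeNative {
    init {
        System.loadLibrary("percheye") // loads libpercheye.so
    }

    external fun detectFaces(pixels: ByteArray, width: Int, height: Int): LongArray
    external fun computeHash(faceHandle: Long): FloatArray
}
```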

Building Twins Finder

With PerchEye as the foundation, the app development focused on user experience rather than computer vision complexity.

Core Flow

  1. Capture or load group photo

  2. PerchEye detects all faces

  3. Generate feature hashes for each person

  4. Calculate similarity scores

  5. Highlight the closest facial match (see the sketch below)
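
Once every detected face has a hash, steps 4-5 boil down to a pairwise search. A self-contained sketch, reusing the cosine similarity function above:

```kotlin
// Steps 4-5: score every pair of per-face hashes, keep the best.
fun closestPair(hashes: List<FloatArray>): Pair<Int, Int>? {
    var best: Pair<Int, Int>? = null
    var bestScore = -1f
    for (i in hashes.indices) {
        for (j in i + 1 until hashes.size) {
            val score = cosineSimilarity(hashes[i], hashes[j])
            if (score > bestScore) {
                bestScore = score
                best = i to j
            }
        }
    }
    return best // the pair to highlight in the UI
}
```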

Privacy Implementation

  • All processing happens on-device

  • Only mathematical representations stored temporarily

  • Data automatically cleared on app close (see the sketch after this list)

  • GDPR compliant by design
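
A minimal sketch of the clear-on-close behavior using AndroidX lifecycle callbacks; the HashStore name and internals are my assumptions, not the app's actual code.

```kotlin
import androidx.lifecycle.DefaultLifecycleObserver
import androidx.lifecycle.LifecycleOwner

// Illustrative cleanup hook: wipe in-memory hashes when the app leaves
// the foreground so nothing biometric outlives the session.
class HashStore : DefaultLifecycleObserver {
    private val hashes = mutableListOf<FloatArray>()

    fun put(hash: FloatArray) {
        hashes += hash
    }

    override fun onStop(owner: LifecycleOwner) {
        hashes.forEach { it.fill(0f) } // overwrite before dropping references
        hashes.clear()
    }
}
```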

Technical Challenges Solved

Real-time processing: Optimized the entire pipeline to maintain smooth UI while running face analysis in background threads.
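
In Kotlin coroutine terms the pattern looks like this; analyzeFaces is a stand-in for the real pipeline call, not an actual SDK function.

```kotlin
import android.graphics.Bitmap
import android.graphics.RectF
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.launch
import kotlinx.coroutines.withContext

// CPU-bound vision work runs on a background dispatcher;
// results are delivered back on the main thread.
fun analyzeAsync(scope: CoroutineScope, photo: Bitmap, onResult: (List<RectF>) -> Unit) {
    scope.launch(Dispatchers.Main) {
        val boxes = withContext(Dispatchers.Default) { analyzeFaces(photo) }
        onResult(boxes) // back on the main thread for UI updates
    }
}

// Stub so the sketch compiles; the real call goes into the SDK pipeline.
fun analyzeFaces(photo: Bitmap): List<RectF> = emptyList()
```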

Memory management: Careful buffer management to prevent OOM crashes during image processing.
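
One common guard, shown here with stock Android APIs rather than the SDK's internal buffer strategy: downsample large photos at decode time via inSampleSize instead of materializing full-size bitmaps.

```kotlin
import android.graphics.Bitmap
import android.graphics.BitmapFactory

// Decode large photos at reduced resolution; a standard guard
// against OOM in image-heavy pipelines.
fun decodeScaled(path: String, maxDim: Int): Bitmap {
    val bounds = BitmapFactory.Options().apply { inJustDecodeBounds = true }
    BitmapFactory.decodeFile(path, bounds) // reads dimensions only, no pixels

    var sample = 1
    while (bounds.outWidth / sample > maxDim || bounds.outHeight / sample > maxDim) {
        sample *= 2 // inSampleSize works best as a power of two
    }
    val opts = BitmapFactory.Options().apply { inSampleSize = sample }
    return BitmapFactory.decodeFile(path, opts)!!
}
```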

Cross-platform consistency: Ensured identical behavior across platforms while maintaining native performance characteristics.

Lessons Learned

  1. Build your dependencies: Sometimes the right tool doesn't exist and you need to create it

  2. Mobile constraints are real: Desktop assumptions about processing power and memory don't apply

  3. Privacy by design: Much easier to build privacy in from the start than retrofit it later

  4. Performance perception: Even 200ms feels instant to users when there's proper feedback

Open Source Release

Made PerchEye freely available because computer vision capabilities shouldn't require enterprise budgets. The SDK includes:

  • Complete documentation

  • Working examples for all platforms

  • Demo applications

  • No usage restrictions

What's Next

Currently working on:

  • Liveness detection to prevent photo spoofing

  • Better handling of challenging lighting conditions

  • Performance optimizations for older devices

I'm interested in hearing from other developers working on mobile computer vision: what approaches have you taken?


Tech Stack: TensorFlow Lite, C++, Kotlin, Swift, Dart, JavaScript
Platforms: Android, iOS, Flutter, React Native
License: Apache 2.0
