Building a Mobile Face Recognition SDK from Scratch


The Twins Finder Story
The Problem with Existing Solutions
Cloud Services: AWS Rekognition and Google Vision API work well, but requiring an internet connection for every photo analysis killed the user experience, and they add privacy concerns and per-request API costs.
Open Source Libraries: I found plenty on GitHub, but they were either research projects with inconsistent accuracy or designed for server deployment, with dependencies too heavy for mobile.
Commercial SDKs: Available but priced for enterprise customers ($10k+ licenses) and most still required network connectivity.
Building PerchEye SDK
Instead of compromising, I decided to build exactly what I needed.
Technical Requirements
Offline-first: No network dependency
Cross-platform: Single codebase for Android, iOS, Flutter, React Native
Mobile-optimized: Reasonable memory footprint and processing speed
Privacy-focused: No biometric data storage or transmission
Implementation Approach
I used TensorFlow Lite as the foundation but had to optimize the models extensively for mobile constraints. The biggest challenge was balancing accuracy with resource usage.
Final specs:
Model size: ~15MB
Face detection: 50-100ms
Hash generation: ~150ms
Similarity comparison: <20ms
Memory usage: ~75MB peak
Accuracy: >95% (clear frontal faces)
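To give a sense of the mechanics, here's a minimal sketch of loading and running a quantized TFLite model on Android. The file name, tensor shapes, and thread count are illustrative assumptions, not the SDK's actual values:

```kotlin
import android.content.Context
import org.tensorflow.lite.Interpreter
import java.io.FileInputStream
import java.nio.MappedByteBuffer
import java.nio.channels.FileChannel

class FaceEmbedder(context: Context) {
    private val interpreter: Interpreter

    init {
        val options = Interpreter.Options().apply {
            setNumThreads(4) // tune per device; more threads isn't always faster
        }
        interpreter = Interpreter(loadModel(context, "face_embedding.tflite"), options)
    }

    private fun loadModel(context: Context, assetName: String): MappedByteBuffer {
        // Memory-map the model so it isn't copied onto the Java heap.
        context.assets.openFd(assetName).use { fd ->
            FileInputStream(fd.fileDescriptor).channel.use { channel ->
                return channel.map(FileChannel.MapMode.READ_ONLY, fd.startOffset, fd.declaredLength)
            }
        }
    }

    fun embed(input: Array<Array<Array<FloatArray>>>): FloatArray {
        // Assumes a 1x112x112x3 input and a 512-float embedding output.
        val output = Array(1) { FloatArray(512) }
        interpreter.run(input, output)
        return output[0]
    }
}
```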
Architecture Decisions
Transaction-based API: Allows feeding multiple images for better accuracy, which aligns with how people naturally take photos.
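To make that concrete: averaging the embeddings from several shots produces a more stable template than any single photo. A minimal Kotlin sketch of the idea (the class and method names here are hypothetical, not the real API surface):

```kotlin
// Hypothetical transaction object illustrating multi-image enrolment.
class FaceTransaction {
    private val embeddings = mutableListOf<FloatArray>()

    // Each call adds the embedding computed from one captured photo.
    fun addImage(embedding: FloatArray) {
        embeddings.add(embedding)
    }

    // Averaging embeddings from several shots yields a more stable template
    // than any single photo.
    fun commit(): FloatArray {
        require(embeddings.isNotEmpty()) { "feed at least one image first" }
        val dim = embeddings.first().size
        val template = FloatArray(dim)
        for (e in embeddings) for (i in 0 until dim) template[i] += e[i]
        for (i in 0 until dim) template[i] /= embeddings.size
        return template
    }
}
```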
Mathematical hashes: Store only feature vectors, never actual face data. Hashes are 2-5KB and can't practically be reverse-engineered back into face images.
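For illustration, here's one standard way to compare such feature vectors, cosine similarity (the SDK's actual metric may differ). As a sanity check on the numbers: a 512-dimensional float32 vector is exactly 2KB, right at the bottom of that range.

```kotlin
import kotlin.math.sqrt

// Cosine similarity between two feature vectors; assumes non-zero inputs.
fun cosineSimilarity(a: FloatArray, b: FloatArray): Float {
    require(a.size == b.size) { "embeddings must have the same dimension" }
    var dot = 0f
    var normA = 0f
    var normB = 0f
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    return dot / (sqrt(normA) * sqrt(normB))
}
```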
Native performance: C++ core with platform-specific bindings to avoid cross-platform performance penalties.
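On the Kotlin side, a binding boils down to a few external declarations (the library and function names here are illustrative, not the actual symbols):

```kotlin
// Kotlin side of a JNI binding to the shared C++ core.
object PerchEyeNative {
    init {
        // Loads libpercheye.so, compiled from the C++ core.
        System.loadLibrary("percheye")
    }

    // Declared here, implemented in C++; all heavy lifting stays native.
    external fun detectFaces(pixels: ByteArray, width: Int, height: Int): Array<FloatArray>
}
```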
Building Twins Finder
With PerchEye as the foundation, the app development focused on user experience rather than computer vision complexity.
Core Flow
Capture or load group photo
PerchEye detects all faces
Generate feature hashes for each person
Calculate similarity scores
Highlight the closest facial match
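Gluing the last three steps together is essentially a pairwise search; a simplified sketch, reusing the cosineSimilarity function from the earlier example:

```kotlin
// Find the two most similar faces among all detected ones.
fun closestPair(embeddings: List<FloatArray>): Pair<Int, Int>? {
    var best: Pair<Int, Int>? = null
    var bestScore = -1f
    for (i in embeddings.indices) {
        for (j in i + 1 until embeddings.size) {
            val score = cosineSimilarity(embeddings[i], embeddings[j])
            if (score > bestScore) {
                bestScore = score
                best = i to j
            }
        }
    }
    return best // null when fewer than two faces were detected
}
```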
Privacy Implementation
All processing happens on-device
Only mathematical representations stored temporarily
Data automatically cleared on app close
GDPR compliant by design
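On Android, the auto-clear can hang off the activity lifecycle; a simplified sketch (the real SDK may wire this up differently):

```kotlin
import androidx.appcompat.app.AppCompatActivity

class MainActivity : AppCompatActivity() {
    // Hashes live only in memory; nothing is written to disk, so even a
    // force-kill leaves nothing behind.
    private val hashCache = mutableListOf<FloatArray>()

    override fun onDestroy() {
        hashCache.forEach { it.fill(0f) } // zero the vectors before releasing them
        hashCache.clear()
        super.onDestroy()
    }
}
```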
Technical Challenges Solved
Real-time processing: Optimized the entire pipeline to maintain smooth UI while running face analysis in background threads.
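On Android that pattern looks roughly like this with coroutines (runFaceAnalysis is a placeholder for the real pipeline):

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.launch
import kotlinx.coroutines.withContext

// Placeholder for the real pipeline (detection + hash generation).
fun runFaceAnalysis(frame: ByteArray): List<FloatArray> = emptyList()

// Run the heavy analysis off the main thread, then hop back to publish results.
fun analyzeInBackground(
    scope: CoroutineScope,
    frame: ByteArray,
    onResult: (List<FloatArray>) -> Unit
) {
    scope.launch(Dispatchers.Default) {
        val embeddings = runFaceAnalysis(frame) // CPU-heavy work stays off the UI thread
        withContext(Dispatchers.Main) { onResult(embeddings) } // UI updates on main
    }
}
```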
Memory management: Careful buffer management to prevent OOM crashes during image processing.
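The usual fix, sketched below: allocate buffers once and reuse them across frames (the sizes are illustrative):

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Preallocate one direct buffer and reuse it for every frame instead of
// allocating per frame; sizing assumes an RGB float input.
class FrameBufferPool(width: Int, height: Int) {
    private val inputBuffer: ByteBuffer =
        ByteBuffer.allocateDirect(width * height * 3 * 4).order(ByteOrder.nativeOrder())

    fun prepare(pixels: FloatArray): ByteBuffer {
        inputBuffer.rewind() // reuse, never reallocate
        for (v in pixels) inputBuffer.putFloat(v)
        inputBuffer.rewind()
        return inputBuffer
    }
}
```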
Cross-platform consistency: Ensured identical behavior across platforms while maintaining native performance characteristics.
Lessons Learned
Build your dependencies: Sometimes the right tool doesn't exist and you need to create it
Mobile constraints are real: Desktop assumptions about processing power and memory don't apply
Privacy by design: Much easier to build privacy in from the start than retrofit it later
Performance perception: Even 200ms feels instant to users when there's proper feedback
Open Source Release
I made PerchEye freely available because computer vision capabilities shouldn't require enterprise budgets. The SDK includes:
Complete documentation
Working examples for all platforms
Demo applications
No usage restrictions
What's Next
Currently working on:
Liveness detection to prevent photo spoofing
Better handling of challenging lighting conditions
Performance optimizations for older devices
I'm interested in hearing from other developers working on mobile computer vision: what approaches have you taken?
Tech Stack: TensorFlow Lite, C++, Kotlin, Swift, Dart, JavaScript
Platforms: Android, iOS, Flutter, React Native
License: Apache 2.0