System Design: Building a Lightning-Fast Bank Identification Number (BIN) Lookup Service


Introduction
Every time you swipe, tap, or click to pay, a tiny, crucial piece of data works behind the scenes: the Bank Identification Number (BIN). It's the unsung hero that tells payment systems exactly where your card comes from. In the fast-paced world of financial transactions, speed and reliability are non-negotiable: a BIN lookup service isn't just nice to have, it's essential for everything from accurately routing your payment and detecting fraud in real time to clearing transactions and settling funds so merchants get paid.
This post dives into the system design for just such a service, built to be scalable, high-performance, and ready for the demands of the modern payment card industry. As you'll see in the architectural diagram accompanying this design (imagine it right here!), our design incorporates several layers to ensure speed, reliability, and data freshness, targeting an impressive 300 Transactions Per Second (TPS) for lookups.
System Requirements: The Foundation of Our Design
To deliver a truly effective BIN Lookup service, we've identified key functional and non-functional requirements:
Functional Requirements
BIN Lookup: Users must be able to input a BIN (the first 6-19 digits of a payment card) and retrieve all associated details.
Data Management: The system needs to support efficient management of BIN data, ingesting updates from external data sources and card networks through both scheduled and manual processes.
Non-Functional Requirements
Low Latency: We need responses in real-time or near real-time (targeting under 100 milliseconds) to avoid delays in transaction processing and fraud detection.
High Availability & Scalability: The system must be continuously available with minimal downtime and capable of scaling effortlessly to handle growing transaction volumes, leveraging strategies like database read replicas and asynchronous consumer services for distributed processing.
Caching Mechanism: To boost performance and lighten database load, a multi-layered caching solution, leveraging Redis for frequently accessed BINs, is crucial. This cache also plays a role in invalidation during data updates.
Protocol Support & API Gateway: The system flexibly supports various input formats, including direct API calls (managed by an API Gateway with load balancing and rate limiting) and asynchronous processing via message brokers. Data serialization and deserialization will adhere to standard formats like JSON and XML.
Capacity & Performance Targets: Built for Speed
Understanding the expected load is fundamental to designing a robust system. Here's a breakdown of our capacity estimations, particularly focusing on our 300 TPS target for lookups:
BIN Data Updates (Write Operations)
BIN data changes regularly and is typically applied as batch updates, which our Data Management layer ingests from External Data Sources/Card Networks. We're looking at:
Daily Updates: Approximately 100,000 records. Spread evenly over 24 hours, that's a minimal write throughput of about 1.16 records per second (one record every 0.864 seconds).
Quarterly Updates: Around 2,000,000 records, resulting in roughly 0.26 TPS if spread over 90 days.
These figures show that the write load is relatively low and infrequent, which is typical for static or semi-static reference data. The architecture illustrates how these updates flow through an Update Queue and trigger Cache Invalidation.
BIN Lookups (Read Operations): Targeting 300 TPS
This is where the service truly shines and where performance is paramount. Our core target for read operations (queries per second, or QPS) is:
Peak QPS: An ambitious 300 queries per second (300 TPS) during peak demand.
Latency Target: Each lookup should respond within 100 milliseconds (ms).
To meet these demanding read performance targets at 300 TPS, the system must efficiently handle approximately 30 concurrent requests (an application of Little's Law).
The calculation is:
$$\text{Concurrent Requests} = \text{Queries Per Second (QPS)} \times \text{Latency Target (in seconds)}$$
So, for our scenario:
$$300 \frac{\text{queries}}{\text{second}} \times 0.1 \text{ seconds} = 30 \text{ concurrent requests}$$
This emphasizes the critical role of efficient caching and a highly optimized data layer.
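For anyone who wants to sanity-check these numbers, here's a quick back-of-the-envelope sketch in Python; the figures are the estimates above, not measurements:
# Back-of-the-envelope capacity math for the estimates above
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Write load: batch updates spread evenly over their window
daily_write_tps = 100_000 / SECONDS_PER_DAY               # ~1.16 records/sec
quarterly_write_tps = 2_000_000 / (90 * SECONDS_PER_DAY)  # ~0.26 records/sec

# Read load: concurrency = QPS x latency (Little's Law)
concurrent_requests = 300 * 0.1  # 30 concurrent requests

print(daily_write_tps, quarterly_write_tps, concurrent_requests)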
API Design: The Gateway to BIN Data
The core of the BIN Lookup service will be a straightforward and efficient API endpoint, primarily serving our Client Layer (Clearing, Payment Processing, Settlement, etc.) as depicted in the diagram. All API requests first pass through an API Gateway, ensuring robust routing, load balancing, and rate limiting.
Our primary interface for retrieving BIN information is a single endpoint:
GET /v1/lookup/{bin}
This endpoint looks up BIN information and accepts one path parameter:
Field Name | Description | Data Type |
bin | The first 6–19 digits of a payment card | string |
A successful response returns the BIN information in the following format:
Field Name | Description | Data Type |
country | The country where the issuing bank is located. | Object |
- code | The ISO numeric country code. | Integer |
- alpha3 | The ISO alpha-3 country code. | String |
- name | The full name of the country. | String |
range | An array containing the low and high ranges of the BIN. | Array of integers |
low_range | The lower bound of the BIN range. | Integer |
high_range | The upper bound of the BIN range. | Integer |
card_scheme | The card scheme (e.g., Visa, Mastercard, American Express). | String |
acceptance_brand | The acceptance brand or card program identifier (e.g., Visa Classic, Mastercard Gold). | String |
brand_product | A specific product or service associated with the card brand (e.g., Visa Checkout). | String |
issuer_name | The name of the issuing bank. | String |
cardholder_currency_indicator | A code indicating the cardholder's preferred currency. | String |
card_program_priority | A priority value assigned to the card program. | Integer |
card_type | The type of card (e.g., credit, debit). | String |
card_level | The level of the card (e.g., standard, premium). | String |
cardholder_currency_code | The ISO 4217 code for the cardholder's billing currency. | String |
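To make this concrete, here's a hypothetical example response; every value below is illustrative only and does not reflect any real issuer's BIN assignment:
GET /v1/lookup/679131

{
  "country": { "code": 840, "alpha3": "USA", "name": "United States of America" },
  "range": [6791310000000000000, 6791319999999999999],
  "low_range": 6791310000000000000,
  "high_range": 6791319999999999999,
  "card_scheme": "Visa",
  "acceptance_brand": "Visa Classic",
  "brand_product": "Visa Checkout",
  "issuer_name": "Example Bank N.A.",
  "cardholder_currency_indicator": "D",
  "card_program_priority": 1,
  "card_type": "credit",
  "card_level": "standard",
  "cardholder_currency_code": "USD"
}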
Architectural Layers: Deconstructing the Design
Our system is composed of several interconnected layers, each playing a vital role in achieving our performance and reliability goals:
Data Ingestion Layer: Keeping BINs Fresh and Accurate
This critical layer is responsible for retrieving, processing, and integrating the latest BIN data from various sources.
Scheme Connectors: Dedicated connectors, such as a Visa API Connector, Mastercard API Connector, and Amex API Connector, are used to interface with official card network data sources.
Data Pipeline (Message Broker + Stream Processing): For real-time updates and efficient processing, we implement a robust data pipeline using enterprise message brokers and stream processing frameworks. This pipeline handles:
Real-time data ingestion: Capturing updates as they become available.
Data transformation & validation: Ensuring data quality and consistency.
Duplicate detection: Preventing redundant entries (see the sketch at the end of this section).
Batch Processor (Scheduled jobs): A batch processor provides a resilient mechanism for:
Daily full sync fallback: Ensuring data integrity even if real-time streams face issues.
Data quality checks: Performing comprehensive audits on ingested data.
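To make the transformation, validation, and duplicate-detection steps more tangible, here's a minimal Python sketch; the field names follow the bins schema shown later, and keying duplicates on the range bounds is an assumption of this example, not a fixed design decision:
def validate_record(record: dict) -> bool:
    # Basic data-quality checks before a record enters the update queue:
    # range bounds must be integers, correctly ordered, and a scheme present.
    try:
        low, high = int(record["low_range"]), int(record["high_range"])
    except (KeyError, ValueError):
        return False
    return low <= high and bool(record.get("card_scheme"))

seen_ranges = set()

def is_duplicate(record: dict) -> bool:
    # Duplicate detection keyed on the (low_range, high_range) pair
    key = (record["low_range"], record["high_range"])
    if key in seen_ranges:
        return True
    seen_ranges.add(key)
    return False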
Client Layer: Integrating with Financial Services
This layer represents the various financial services that consume our BIN lookup capabilities, such as Clearing Service, Payment Processing Service, and Settlement Service. They interact with our system either directly via the API Gateway or asynchronously via the Message Broker, depending on their specific needs.
API Gateway: The Robust Front Door
All direct API requests from client services first pass through our API Gateway. This component is critical for load balancing requests across our lookup services and implementing rate limiting to protect the system from overload, ensuring stable performance even under peak loads up to our 300 TPS target.
Message Broker Layer: Powering Asynchronous Communication
Central to our asynchronous communication and high throughput is the Message Broker Layer, powered by technologies like Apache Kafka. This layer facilitates seamless, decoupled communication between various services by utilizing dedicated topics (or queues):
bin-lookup-request Queue: Client Layer services (e.g., Clearing, Payment Processing) publish their BIN lookup requests to this queue. Our Application Layer's Async Consumer Services subscribe to and process messages from this queue, ensuring requests are handled efficiently and asynchronously.
bin-lookup-response Queue: Once a BIN lookup request is processed by the Application Layer, the results are published back to this queue. Client services can then consume their respective responses from this queue, completing the asynchronous request-response cycle.
bin-data-update Queue: This critical queue is dedicated to events signalling changes or updates in the BIN data. When the Data Ingestion Layer processes new or updated BIN records, it publishes these updates to this queue. The Application Layer's consumers subscribe to this queue specifically to trigger cache invalidation for the affected BIN ranges, ensuring the Caching Layer always serves fresh data.
This strategic use of distinct queues allows for clear separation of concerns, guarantees message delivery, and significantly enhances the scalability and reliability of our real-time BIN lookup service. This asynchronous pattern is key to achieving our low-latency and high-availability goals, especially vital for handling our 300 TPS target.
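As a minimal sketch of the request side of this pattern with the kafka-python client (the broker address and the correlation-id convention are assumptions of this example; the topic names come from the list above):
import json
import uuid
from kafka import KafkaProducer

# Publish a BIN lookup request with a correlation id so the client can
# match the eventual message on the bin-lookup-response topic
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",  # assumed broker address
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

request = {
    "correlation_id": str(uuid.uuid4()),
    "bin": "679131",
    "reply_topic": "bin-lookup-response",
}
producer.send("bin-lookup-request", request)
producer.flush()
On the other side, the Async Consumer Services consume from bin-lookup-request, run the same business logic as the synchronous path, and publish the result, tagged with the same correlation_id, to bin-lookup-response.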
Application Layer: The Brains of the Operation
The Application Layer houses the core logic of our BIN Lookup service. It includes the BIN Lookup API (GET /v1/lookup/{bin}) for direct requests and Async Consumer Services that process messages from the message broker. Both pathways feed into the Business Logic Processing unit, which orchestrates the BIN lookup, interacts with the caching layer, and fetches data from the database when necessary. This layer is designed to auto-scale to handle fluctuating demand, crucial for maintaining 300 TPS.
Caching Layer: Accelerating Lookups with Redis
To meet our stringent latency targets, a Multi-Layered Caching strategy is implemented. The Caching Layer primarily utilizes Redis Cache for frequently accessed BINs, significantly reducing the load on our database.
When a BIN is requested, the system first checks the cache:
Cache Hit: If found, the data is returned instantly, contributing to our low latency and high TPS.
Cache Miss: If not in cache, the request proceeds to the Data Layer via a direct query. Importantly, this layer also handles Cache Invalidation triggered by data updates, ensuring data consistency.
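Here's a minimal cache-aside sketch of that flow, assuming redis-py, a 24-hour TTL, and a query_postgres() helper standing in for the Data Layer lookup described later (all three are assumptions of this example):
import json
import redis

redis_client = redis.Redis(decode_responses=True)
CACHE_TTL_SECONDS = 24 * 60 * 60  # assumed TTL; tune to the data update cadence

def lookup_bin(bin_prefix: str) -> dict:
    cache_key = f"bin:{bin_prefix}"
    cached = redis_client.get(cache_key)
    if cached:  # cache hit: served straight from Redis
        return json.loads(cached)
    record = query_postgres(bin_prefix)  # cache miss: hypothetical Data Layer query
    redis_client.setex(cache_key, CACHE_TTL_SECONDS, json.dumps(record))
    return record

def invalidate_bins(bin_prefixes: list) -> None:
    # Called by the bin-data-update consumer to evict stale entries
    redis_client.delete(*[f"bin:{p}" for p in bin_prefixes])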
Exploring Redis for BIN Range Lookups: The Precision Challenge
While Redis is ideal for exact key-value lookups, we rigorously evaluated its ZSET (Sorted Set) feature for caching BIN ranges. The concept involves using the high_range of a BIN as the score in a sorted set, allowing for range-based queries (ZRANGEBYSCORE).
Let's consider a practical example with a sample PAN: 6791310599999999999 and a BIN range like 6791310000000000000 to 6791319999999999999.
The implementation would look something like this:
import redis

redis_client = redis.Redis(decode_responses=True)

# Sample BIN record (illustrative values only)
key = "6791310000000000000"
value = {"low_range": "6791310000000000000", "high_range": "6791319999999999999", "card_scheme": "Visa"}

# Storing BIN details in a Hash
redis_client.hset(f"bin_table:{key}", mapping=value)

# Adding to a Sorted Set for range indexing, using a truncated
# high_range as the score so it fits within the safe integer range
high_range_score = int(value["high_range"][:10])
redis_client.zadd("bin_table:bin_range_index", {f"bin_table:{key}": high_range_score})

# Example lookup in Redis for PAN: 6791310599999999999
# Search for ranges whose score (truncated high_range) is >= the truncated PAN
PAN = "6791310599999999999"
lookup_score = int(PAN[:10])
potential_keys = redis_client.zrangebyscore("bin_table:bin_range_index", lookup_score, "+inf")

# Application logic must then iterate through potential_keys
# and perform an exact check: low_range <= PAN <= high_range
The fundamental challenge with this approach for BIN lookups stems from how Redis ZSET scores are stored: as IEEE 754 64-bit double-precision floating-point numbers. These can represent a wide range of values, but they can only precisely represent integers between −2^53 and +2^53 (approximately ±9×10^15).
A typical 16 to 19-digit PAN, such as 6791310599999999999, far exceeds this safe integer range. If we attempt to use the full PAN or high_range as a score in Redis, it will be stored as an approximated floating-point value. This approximation can lead to:
ZRANGEBYSCORE failures: Queries might not find the correct range.
Incorrect or missing matches: The system could erroneously classify a BIN or fail to find a valid match.
Hard-to-debug behaviour: Inconsistent results due to underlying precision issues.
Redis Workarounds (and their trade-offs):
To mitigate this, common workarounds involve:
Truncating to Safe Digits: Using only a leading prefix of the range (e.g., the first 10 digits, as in the sketch above) as the score. This requires post-processing in the application code to retrieve the potential ranges and then perform an exact low_range <= PAN <= high_range check.
Lexicographical Range (for string PANs): Storing BINs as zero-padded strings and using ZRANGEBYLEX (see the sketch below). While this avoids floating-point issues, it relies on string comparison and still requires careful management of ranges.
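For completeness, here's what the lexicographical workaround might look like with redis-py, reusing the redis_client from the sketch above; it assumes every member is zero-padded to 19 digits so that lexicographic order matches numeric order:
# All members carry score 0, so ZRANGEBYLEX ordering is purely lexicographic
redis_client.zadd("bin_lex_index", {"6791319999999999999": 0})

pan = "6791310599999999999".zfill(19)
# The first member lexicographically >= the padded PAN is the candidate high_range
candidates = redis_client.zrangebylex("bin_lex_index", f"[{pan}", "+", start=0, num=1)
# The application must still confirm low_range <= PAN <= high_range for the candidate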
While Redis offers unparalleled speed for simple key-value lookups, its architecture introduces complexity and potential inaccuracies for the precise range-based queries required by full BINs.
Data Layer: Persistent Storage with PostgreSQL
The Data Layer is powered by a robust Primary Database BIN Data Store, specifically a PostgreSQL cluster. This cluster is configured for master-slave replication, ensuring data durability and high availability.
For read scalability, essential for handling our 300 TPS lookup target, we extensively utilize Read Replicas with load distribution. This architecture ensures that the majority of lookup queries are served by replicas, offloading the primary database. The database is also intelligently partitioned by BIN ranges to optimize query performance for specific lookups.
Our core bins table schema is designed for highly efficient and precise range lookups, leveraging PostgreSQL's powerful native data types and indexing:
CREATE TABLE bintable (
id SERIAL PRIMARY KEY,
-- BIN range as a range type
bin_range INT8RANGE NOT NULL,
-- Store both ends separately for direct access
low_range BIGINT NOT NULL,
high_range BIGINT NOT NULL,
-- Card and issuer metadata
card_scheme TEXT,
acceptance_brand TEXT,
brand_product TEXT,
issuer_name TEXT,
cardholder_currency_indicator TEXT,
card_program_priority INT,
card_type TEXT,
card_level TEXT,
cardholder_currency_code TEXT,
-- Country as structured JSON (code, alpha3, name)
country JSONB,
CONSTRAINT valid_range CHECK (low_range <= high_range)
);
CREATE INDEX idx_bintable_range ON bintable USING GIST (bin_range);
PostgreSQL's Native Superiority for BIN Range Lookups:
In stark contrast to the challenges faced with Redis ZSETs, PostgreSQL offers a far more natural, robust, and accurate way to handle BIN range queries, especially for large 16-19 digit PANs.
Consider the same PAN: 6791310599999999999. A lookup query is elegantly simple and highly performant:
SELECT *
FROM bintable
WHERE
bin_range @> 6791310599999999999::BIGINT;
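From application code, the same lookup might look like this with psycopg2 (the connection parameters are placeholders, not part of this design):
import psycopg2

# Placeholder connection settings; in production these come from config/secrets
conn = psycopg2.connect(host="localhost", dbname="bins", user="app", password="...")

def lookup_pan(pan: str):
    # Exact containment lookup: which BIN range covers this PAN?
    with conn.cursor() as cur:
        cur.execute(
            "SELECT * FROM bintable WHERE bin_range @> %s::BIGINT",
            (int(pan),),
        )
        return cur.fetchone()

print(lookup_pan("6791310599999999999"))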
Why PostgreSQL Excels for BIN Range Data:
Feature | PostgreSQL | Redis (with workarounds) | Impact for BIN Lookups |
Native 64-bit Integer Support | ✅ Yes (BIGINT handles full 19-digit PANs precisely). | ❌ No (IEEE 754 double for ZSET scores, exact only up to 2^53). | Accuracy & Correctness: Crucial for financial data. No precision loss. |
Range Indexing | ✅ Optimized GiST/SP-GiST on int8range columns. | ⚠️ ZSET index only works on a single score (1D). | Performance: Highly efficient lookups even with millions of ranges. |
Native Range Support | ✅ Direct int8range data type with @> containment operator. | ❌ Must simulate range logic in application code. | Simplicity & Safety: One-liner SQL, atomic, consistent. Reduced application complexity. |
Indexing on Both Start/End | ✅ GiST indexes on the range itself. | ❌ ZSET only indexes on the score (e.g., high_range). | Query Flexibility: Can efficiently query against both ends of a range. |
Accuracy | ✅ Full precision. | ⚠️ May lose precision without careful workarounds. | Data Integrity: Ensures correct BIN matching every time. |
Query Logic | ✅ One-liner SQL. | ❌ Multi-step, app-driven. | Maintainability: Simpler code, easier to debug. |
Performance at Scale | ✅ Scales well with large datasets and indexes. | ⚠️ Fast for small/truncated sets; error-prone at scale for full PANs. | Reliability: Built for large, production-grade financial datasets. |
Recommendation
For the authoritative source of truth and for accurate, maintainable, and correct BIN range lookups involving full 16-19 digit PANs, PostgreSQL is unequivocally the superior choice. Its native BIGINT support and specialized int8range indexing precisely address the core requirements.
We primarily use Redis as a lightning-fast cache for exact BIN hits or perhaps for initial 6-9 digit BIN prefix lookups where precision issues are less critical and can be managed by application logic. This hybrid approach allows us to leverage Redis's speed for the vast majority of requests while relying on PostgreSQL's robustness and accuracy for all definitive lookups and cache misses.
What's Next?
This post lays the groundwork for a robust and high-performing BIN Lookup service. As illustrated in our detailed architecture, by leveraging a dedicated Data Ingestion Layer, a message broker for asynchronous communication, a multi-layered caching strategy with Redis, scalable API gateways, and a highly optimized PostgreSQL cluster with read replicas, we are building a solution designed for speed, reliability, and ease of data management. This approach ensures low latency for crucial transaction lookups while efficiently handling data updates.
Written by

Peter Makafan
Hi, I'm Peter, a Senior Software Engineer and independent technology consultant specializing in building scalable, resilient, and distributed software systems. With over a decade of experience in fintech, ecommerce, and digital marketing, I bring a unique perspective to technology solutions. My academic background in Artificial Intelligence and Computer Science fuels my passion for cutting-edge technologies like financial technology, robotics, and cloud-native development.