# Scaling Node.js for High-Frequency Fintech Requests
How to keep a real-time exchange platform responsive under load — the specific caching strategy, queue design, and database access patterns that actually move the needle.
A real-time fiat-crypto exchange is one of the more demanding things you can build on Node.js. Every request involves state transitions, concurrent actor updates (user, operator, system), real-time delivery via WebSocket, and async side effects (SMS, audit logs, notifications).
Get the architecture wrong and you spend 80% of your time debugging race conditions. Here’s what I’ve learned building these systems.
## The Core Problem: Shared Mutable State Under Concurrency
An exchange request has many states: created → pending → assigned → processing → completed. Multiple actors touch it. The naive approach — update in Postgres, emit a WebSocket event — fails the moment two operators try to claim the same request simultaneously.
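That lifecycle can be made explicit as a transition map, so an invalid jump is rejected before any database write is attempted. A minimal sketch (the type and map names are illustrative, not from the original system):

```typescript
// Allowed transitions for the request lifecycle (names illustrative)
type Status = 'created' | 'pending' | 'assigned' | 'processing' | 'completed';

const TRANSITIONS: Record<Status, Status[]> = {
  created: ['pending'],
  pending: ['assigned'],
  assigned: ['processing'],
  processing: ['completed'],
  completed: [], // terminal state
};

// Reject invalid jumps before touching the database
function canTransition(from: Status, to: Status): boolean {
  return TRANSITIONS[from].includes(to);
}
```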
You need three things:
- Atomic state transitions — no two actors can advance the same state simultaneously
- Real-time delivery — every interested party sees the update within 100ms
- Audit log — every transition is permanently recorded
My implementation:
```typescript
// Atomic claim with a row-level (pessimistic) lock
async function claimRequest(requestId: string, operatorId: string): Promise<boolean> {
  const result = await db.transaction(async (tx) => {
    const request = await tx
      .select()
      .from(requests)
      .where(
        and(
          eq(requests.id, requestId),
          eq(requests.status, 'pending'),    // precondition
          isNull(requests.assignedOperator), // unclaimed
        ),
      )
      .for('update') // row-level lock
      .limit(1);

    if (!request[0]) return false;

    await tx
      .update(requests)
      .set({ status: 'assigned', assignedOperator: operatorId, assignedAt: new Date() })
      .where(eq(requests.id, requestId));

    return true;
  });

  if (result) {
    await emitToRoom(`request:${requestId}`, 'status:assigned', { operatorId });
  }
  return result;
}
```
The `SELECT ... FOR UPDATE` inside a transaction is the key. No optimistic retry loops, no distributed locks. Postgres handles the serialization.
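The third requirement, the audit log, is then a matter of appending one immutable row per transition inside the same transaction as the state change. A minimal shape for that row, as a hypothetical helper (the `RequestEvent` type and `auditEntry` function are mine, not from the original code):

```typescript
// Hypothetical audit row: one immutable record per state transition
interface RequestEvent {
  requestId: string;
  from: string;
  to: string;
  actor: string; // user, operator, or system id
  at: string;    // ISO timestamp
}

function auditEntry(requestId: string, from: string, to: string, actor: string): RequestEvent {
  return { requestId, from, to, actor, at: new Date().toISOString() };
}
```

Inserting this row in the same transaction as the `UPDATE` means the audit trail can never disagree with the state it describes.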
## Redis as the Real-Time Layer
Postgres for truth; Redis for speed. I use Redis for:
- Rate limiting — `INCR` + `EXPIRE` per user/operator per window
- Session caching — operator sessions, user context (5-min TTL)
- Pub/sub — cross-instance WebSocket event propagation
```typescript
// Rate limit: max 10 requests per user per minute
async function checkRateLimit(userId: string): Promise<boolean> {
  const key = `rl:exchange:${userId}:${Math.floor(Date.now() / 60000)}`;
  const count = await redis.incr(key);
  if (count === 1) await redis.expire(key, 120); // 2-min TTL so stale buckets clean up
  return count <= 10;
}
```
## BullMQ for Everything Async
Every side effect that doesn’t need to complete before the API response returns should be a job. In a fintech system this is almost everything: SMS, emails, webhook notifications, audit events, LLM calls.
The architecture I use:
```typescript
// Queues by priority and failure tolerance
// (`connection` is a shared ioredis instance — BullMQ's option name; Bull v3 used `redis`)
const queues = {
  critical: new Queue('critical', { connection }),  // SMS OTP — must not lose
  standard: new Queue('standard', { connection }),  // notifications — best effort
  analytics: new Queue('analytics', { connection }), // events — can lag
};

// SMS goes on the critical queue with aggressive retry
await queues.critical.add('send-otp', { phone, code }, {
  attempts: 5,
  backoff: { type: 'exponential', delay: 1000 },
  removeOnComplete: false, // keep for audit trail
  removeOnFail: false,
});
```
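With `type: 'exponential'` and `delay: 1000`, BullMQ's built-in backoff spaces retries as `delay * 2^(attemptsMade - 1)`. A quick sanity check of what those five attempts actually wait:

```typescript
// BullMQ's built-in exponential backoff: delay * 2^(attemptsMade - 1)
function backoffDelay(baseMs: number, attemptsMade: number): number {
  return baseMs * 2 ** (attemptsMade - 1);
}

const delays = [1, 2, 3, 4, 5].map((n) => backoffDelay(1000, n));
// 1s, 2s, 4s, 8s, 16s: roughly 31 seconds of retry budget before the job finally fails
```

Worth checking before shipping: five attempts at a 1s base means an OTP could arrive half a minute late in the worst case, which may or may not fit the product's expectations.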
The 30% backend latency reduction I achieved on a SaaS platform came almost entirely from this shift — moving synchronous operations into async job queues. API responses dropped from avg 800ms to avg 320ms.
## Database Access Patterns That Actually Matter
Three patterns that move the needle in high-read systems:
1. Strategic index design — a missing index on a frequently-queried foreign key is often the root cause of slow pages. Profile with EXPLAIN ANALYZE before optimizing elsewhere.
2. Connection pool sizing — the default NestJS / TypeORM pool size is 10. In a system with 50 concurrent WebSocket connections each making DB calls, you’ll queue. Set the pool relative to your actual concurrency, not the default.
3. Read replicas for analytics queries — operator dashboards run aggregate queries. Running those on your primary instance competes with transactional writes. Route them to a read replica, even if it’s 100ms behind.
```typescript
// Route heavy reads to replica
const stats = await replicaDb
  .select({
    total: count(),
    volume: sum(requests.amount),
    avgTime: avg(requests.processingMs),
  })
  .from(requests)
  .where(between(requests.createdAt, from, to));
```
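Pool sizing (point 2 above) can be sketched with `pg` directly. The numbers here are illustrative, not a recommendation: size `max` from your measured concurrency, and keep the sum across all app instances under Postgres's `max_connections`:

```typescript
import { Pool } from 'pg';

// Size the pool for measured concurrency, not the default of 10.
const pool = new Pool({
  max: 50,                        // ~50 concurrent WebSocket-driven DB calls (illustrative)
  idleTimeoutMillis: 30_000,      // release clients that sit idle
  connectionTimeoutMillis: 5_000, // fail fast instead of queueing forever
});
```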
## On WebSocket Architecture
The hardest part of real-time systems isn’t the WebSocket connection — it’s state reconciliation when a client reconnects after a disconnect.
Every client should be able to call a sync endpoint on reconnect and receive the full current state. The WebSocket channel then delivers deltas. This makes your real-time layer resilient to network flakiness.
```typescript
// On reconnect: client calls /api/requests/:id/sync
// Returns full current state + any missed events since lastSeenAt
async function syncRequest(requestId: string, lastSeenAt: Date) {
  const [request, missedEvents] = await Promise.all([
    getRequestById(requestId),
    getEventsSince(requestId, lastSeenAt),
  ]);
  return { request, missedEvents };
}
```
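On the client side, reconciliation is then a pure fold of the missed events over the snapshot. The event shape below is a hypothetical simplification (each delta carries the fields it changes), not the original wire format:

```typescript
// Hypothetical delta event: each one carries the fields it changes
interface StatusEvent {
  at: number;                     // ms timestamp, used for ordering
  patch: Record<string, unknown>; // e.g. { status: 'assigned' }
}

// Replay missed events, oldest first, on top of the synced snapshot
function applyMissedEvents(
  snapshot: Record<string, unknown>,
  events: StatusEvent[],
): Record<string, unknown> {
  return events
    .slice()                     // don't mutate the input array
    .sort((a, b) => a.at - b.at) // oldest first
    .reduce((state, e) => ({ ...state, ...e.patch }), snapshot);
}
```

Because the fold is deterministic and the snapshot is authoritative, a client that replays the same events twice still converges on the same state.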
## Summary
Scaling Node.js for fintech isn’t about clever tricks — it’s about applying the right tool to each layer:
| Layer | Tool | Why |
|---|---|---|
| State transitions | Postgres transactions + `FOR UPDATE` | ACID guarantees |
| Rate limiting | Redis `INCR` | Sub-ms latency |
| Async side effects | BullMQ | Durability + retries |
| Real-time delivery | WebSocket + Redis pub/sub | Cross-instance fan-out |
| Heavy reads | Read replicas | Isolation from write path |
The gains compound. Fix the right bottleneck at each layer and a 30–50% latency reduction is achievable without touching business logic.