Node.js Fintech Performance Architecture

Scaling Node.js for High-Frequency Fintech Requests

How to keep a real-time exchange platform responsive under load — the specific caching strategy, queue design, and database access patterns that actually move the needle.

Egor Dultsev
Senior Engineer
8 min read

A real-time fiat-crypto exchange is one of the more demanding things you can build on Node.js. Every request involves state transitions, concurrent actor updates (user, operator, system), real-time delivery via WebSocket, and async side effects (SMS, audit logs, notifications).

Get the architecture wrong and you spend 80% of your time debugging race conditions. Here’s what I’ve learned building these systems.

The Core Problem: Shared Mutable State Under Concurrency

An exchange request has many states: created → pending → assigned → processing → completed. Multiple actors touch it. The naive approach — update in Postgres, emit a WebSocket event — fails the moment two operators try to claim the same request simultaneously.

You need three things:

  1. Atomic state transitions — no two actors can advance the same state simultaneously
  2. Real-time delivery — every interested party sees the update within 100ms
  3. Audit log — every transition is permanently recorded

My implementation:

// Atomic claim with optimistic locking
async function claimRequest(requestId: string, operatorId: string): Promise<boolean> {
  const result = await db.transaction(async (tx) => {
    const request = await tx
      .select()
      .from(requests)
      .where(
        and(
          eq(requests.id, requestId),
          eq(requests.status, 'pending'),      // precondition
          isNull(requests.assignedOperator),   // unclaimed
        )
      )
      .for('update')  // row-level lock
      .limit(1);

    if (!request[0]) return false;

    await tx
      .update(requests)
      .set({ status: 'assigned', assignedOperator: operatorId, assignedAt: new Date() })
      .where(eq(requests.id, requestId));

    return true;
  });

  if (result) {
    await emitToRoom(`request:${requestId}`, 'status:assigned', { operatorId });
  }

  return result;
}

The SELECT ... FOR UPDATE inside a transaction is the key. No optimistic retry loops, no distributed locks. Postgres handles the serialization.

Redis as the Real-Time Layer

Postgres for truth; Redis for speed. I use Redis for:

  • Rate limiting — INCR + EXPIRE per user/operator per window
  • Session caching — operator sessions, user context (5-min TTL)
  • Pub/sub — cross-instance WebSocket event propagation

// Rate limit: max 10 requests per user per minute
async function checkRateLimit(userId: string): Promise<boolean> {
  const key = `rl:exchange:${userId}:${Math.floor(Date.now() / 60000)}`;
  const count = await redis.incr(key);
  if (count === 1) await redis.expire(key, 120); // 2-min window for safety
  return count <= 10;
}
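
The same fixed-window semantics, sketched against an in-memory Map with an injectable clock — illustration only, since in production the counter must live in Redis to be shared across instances:

```typescript
// Fixed-window rate limiter: same logic as the Redis INCR version,
// backed by a Map so the behavior is easy to test in isolation
class FixedWindowLimiter {
  private counts = new Map<string, number>();

  constructor(
    private limit: number,
    private windowMs: number,
    private now: () => number = Date.now, // injectable for testing
  ) {}

  check(userId: string): boolean {
    // Same key scheme as the Redis version: one bucket per user per window
    const key = `${userId}:${Math.floor(this.now() / this.windowMs)}`;
    const count = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, count);
    return count <= this.limit;
  }
}
```

Note the trade-off of fixed windows: a user can burst up to 2× the limit across a window boundary. For exchange-request throttling that is usually acceptable; if not, a sliding-window counter is the next step up.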

BullMQ for Everything Async

Every side effect that doesn’t need to complete before the API response returns should be a job. In a fintech system this is almost everything: SMS, emails, webhook notifications, audit events, LLM calls.

The architecture I use:

// Queues by priority and failure tolerance
const queues = {
  critical: new Queue('critical', { connection: redis }),    // SMS OTP — must not lose
  standard: new Queue('standard', { connection: redis }),    // notifications — best effort
  analytics: new Queue('analytics', { connection: redis }),  // events — can lag
};

// SMS goes on the critical queue with aggressive retry
await queues.critical.add('send-otp', { phone, code }, {
  attempts: 5,
  backoff: { type: 'exponential', delay: 1000 },
  removeOnComplete: false,  // audit trail
  removeOnFail: false,
});
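
With those options, BullMQ's built-in exponential strategy roughly doubles the delay on each retry. A small helper to see the resulting schedule — this mirrors my reading of the built-in strategy (delay × 2^(attempt − 1)); verify against your BullMQ version:

```typescript
// Delay before retry N (1-based) under exponential backoff with a base delay.
// Assumes BullMQ's built-in exponential strategy: delay * 2^(attempt - 1)
function retryDelayMs(attempt: number, baseDelayMs: number): number {
  return Math.round(Math.pow(2, attempt - 1) * baseDelayMs);
}

// For the OTP job above (5 attempts, 1000ms base), the waits between
// tries come out to roughly 1s, 2s, 4s, 8s
const schedule = [1, 2, 3, 4].map((n) => retryDelayMs(n, 1000));
```

The whole retry budget is ~15s, which matters for OTP: by the fifth attempt the code may be near expiry, so cap attempts accordingly rather than retrying forever.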

The backend latency win I achieved on a SaaS platform came almost entirely from this shift — moving synchronous operations into async job queues. API responses dropped from avg 800ms to avg 320ms, a 60% reduction.

Database Access Patterns That Actually Matter

Three patterns that move the needle in high-read systems:

1. Strategic index design — a missing index on a frequently-queried foreign key is often the root cause of slow pages. Profile with EXPLAIN ANALYZE before optimizing elsewhere.

2. Connection pool sizing — the default NestJS / TypeORM pool size is 10. In a system with 50 concurrent WebSocket connections each making DB calls, you’ll queue. Set the pool relative to your actual concurrency, not the default.

3. Read replicas for analytics queries — operator dashboards run aggregate queries. Running those on your primary instance competes with transactional writes. Route them to a read replica, even if it’s 100ms behind.

// Route heavy reads to replica
const stats = await replicaDb
  .select({
    total: count(),
    volume: sum(requests.amount),
    avgTime: avg(requests.processingMs),
  })
  .from(requests)
  .where(between(requests.createdAt, from, to));
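
Back to pool sizing (pattern 2 above): a rough starting point is Little's law — connections needed ≈ query throughput × average query duration. A sketch with some headroom, not a substitute for load testing (the headroom factor is my own default):

```typescript
// Rough pool-size estimate via Little's law:
// concurrent connections ≈ queries/sec × avg query duration (sec),
// scaled by a headroom factor for bursts
function estimatePoolSize(
  queriesPerSec: number,
  avgQueryMs: number,
  headroom = 1.5, // assumption: 50% slack over steady state
): number {
  return Math.ceil(queriesPerSec * (avgQueryMs / 1000) * headroom);
}

// e.g. 500 qps at 20ms avg → ~15 connections with 1.5× headroom,
// comfortably above the default pool size of 10
```

The point is to derive the number from measured throughput and query latency, then confirm under load — not to trust the driver default.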

On WebSocket Architecture

The hardest part of real-time systems isn’t the WebSocket connection — it’s state reconciliation when a client reconnects after a disconnect.

Every client should be able to call a sync endpoint on reconnect and receive the full current state. The WebSocket channel then delivers deltas. This makes your real-time layer resilient to network flakiness.

// On reconnect: client calls /api/requests/:id/sync
// Returns full current state + any missed events since lastSeenAt
async function syncRequest(requestId: string, lastSeenAt: Date) {
  const [request, missedEvents] = await Promise.all([
    getRequestById(requestId),
    getEventsSince(requestId, lastSeenAt),
  ]);
  return { request, missedEvents };
}
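
On the client side, the snapshot-plus-deltas model means reconnect handling is just a replay. A sketch with an assumed event shape — the `{ field, value, at }` structure is hypothetical, not the actual wire format:

```typescript
// Hypothetical missed-event shape: each event patches one field of the request
interface RequestEvent {
  field: string;
  value: unknown;
  at: number; // epoch ms, used to replay in order
}

// Replay missed events over the snapshot, oldest first, to rebuild current state
function applyMissedEvents(
  snapshot: Record<string, unknown>,
  events: RequestEvent[],
): Record<string, unknown> {
  return [...events]
    .sort((a, b) => a.at - b.at)
    .reduce((state, e) => ({ ...state, [e.field]: e.value }), snapshot);
}
```

Because replay is a pure function of snapshot + events, a client that misses the same event twice or receives events out of order still converges to the correct state.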

Summary

Scaling Node.js for fintech isn’t about clever tricks — it’s about applying the right tool to each layer:

Layer              | Tool                                | Why
State transitions  | Postgres transactions + FOR UPDATE  | ACID guarantees
Rate limiting      | Redis INCR                          | Sub-ms latency
Async side effects | BullMQ                              | Durability + retries
Real-time delivery | WebSocket + Redis pub/sub           | Cross-instance fan-out
Heavy reads        | Read replicas                       | Isolation from write path

The gains compound. Fix the right bottleneck at each layer and a 30–50% latency reduction is achievable without touching business logic.