3× Performance: What Actually Bottlenecks Blockchain Routing
A post-mortem on tripling the throughput of a multi-chain transaction routing layer — the actual bottlenecks, what we tried that didn't work, and the changes that did.
“We need 3× the throughput.” That was the brief. The system: a multi-chain routing layer handling high-frequency transaction flows across multiple blockchains. The existing service was the bottleneck. Everything downstream was waiting on it.
Here’s what actually moved the needle — and what didn’t.
Starting Point: Where Was the Time Going?
Before touching any code, I instrumented every stage of the routing pipeline. This took about a day and saved us weeks of guessing.
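The original doesn't show the instrumentation itself, but the idea is simple enough to sketch. This is a minimal, hypothetical version of the kind of per-stage timer involved: wrap each pipeline stage in a named timer, accumulate wall-clock time per stage, and report each stage's share of the total. The names (`timeStage`, `report`) are illustrative, not from the codebase.

```typescript
// Accumulated wall-clock milliseconds per pipeline stage name.
const stageTotals = new Map<string, number>();

// Wrap a stage: run it, record elapsed time under its name (even on failure).
async function timeStage<T>(stage: string, fn: () => Promise<T>): Promise<T> {
  const start = performance.now();
  try {
    return await fn();
  } finally {
    const elapsed = performance.now() - start;
    stageTotals.set(stage, (stageTotals.get(stage) ?? 0) + elapsed);
  }
}

// Report each stage's percentage of total recorded time.
function report(): Record<string, string> {
  const total = [...stageTotals.values()].reduce((a, b) => a + b, 0);
  return Object.fromEntries(
    [...stageTotals].map(([stage, ms]) => [stage, `${((ms / total) * 100).toFixed(1)}%`])
  );
}
```

Twenty lines of this around the real pipeline stages is what produces a breakdown like the table below.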
The timing breakdown was not what the team expected:
| Stage | Expected to be bottleneck | Actual time % |
|---|---|---|
| RPC node calls | Yes | 62% |
| Database queries | Yes | 21% |
| Signature validation | No | 11% |
| Business logic / routing | No | 6% |
The team had been optimizing the routing algorithm. The routing algorithm wasn’t the problem. The RPC calls were.
Bottleneck 1: Sequential RPC Calls
The original code made RPC calls sequentially for every transaction. Each call was 80–200ms (external network). With 4–5 calls per transaction, you’re looking at 400–1000ms per transaction just in network time.
The fix was parallelization, but not naive Promise.all — that spikes your outbound connection count and hits rate limits.
```typescript
// Before: sequential — each call waits for the previous one
const results = [];
for (const call of rpcCalls) {
  results.push(await rpcClient.call(call));
}
```

```typescript
// After: controlled concurrency
import pLimit from 'p-limit';

const limit = pLimit(8); // max 8 concurrent per node
const results = await Promise.all(
  rpcCalls.map((call) => limit(() => rpcClient.call(call)))
);
```
Result: RPC time down 70%. This single change was responsible for most of the throughput improvement.
Bottleneck 2: RPC Response Caching
Many of the RPC calls were for data that doesn’t change frequently: token decimals, contract ABIs, chain parameters. We were fetching these from the network on every transaction.
```typescript
class RpcCache {
  private cache = new Map<string, { value: unknown; expiresAt: number }>();

  // Return the cached value if fresh; otherwise fetch, store, and return it.
  async get<T>(key: string, fetcher: () => Promise<T>, ttlMs: number): Promise<T> {
    const cached = this.cache.get(key);
    if (cached && cached.expiresAt > Date.now()) {
      return cached.value as T;
    }
    const value = await fetcher();
    this.cache.set(key, { value, expiresAt: Date.now() + ttlMs });
    return value;
  }
}

// Token metadata: 10-minute TTL
const decimals = await rpcCache.get(
  `token:${address}:decimals`,
  () => contract.decimals(),
  10 * 60 * 1000
);
```
For a production system, move this to Redis with appropriate TTLs. In-memory works at single-instance scale; it becomes a problem when you have multiple instances with inconsistent caches.
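A sketch of what that multi-instance variant could look like. The `KvStore` interface is an assumption made here so the example stays self-contained: in production you'd back it with a Redis client (e.g. ioredis, where `set(key, value, 'PX', ttlMs)` sets a millisecond TTL) and JSON-serialize values so any instance can read them.

```typescript
// Assumed abstraction over a shared key-value store such as Redis.
interface KvStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlMs: number): Promise<void>;
}

class SharedRpcCache {
  constructor(private store: KvStore) {}

  // Same get-or-fetch contract as the in-memory version, but the store
  // (and its TTL handling) is shared across all service instances.
  async get<T>(key: string, fetcher: () => Promise<T>, ttlMs: number): Promise<T> {
    const hit = await this.store.get(key);
    if (hit !== null) return JSON.parse(hit) as T;
    const value = await fetcher();
    await this.store.set(key, JSON.stringify(value), ttlMs);
    return value;
  }
}
```

Injecting the store also makes the cache trivially testable with an in-memory fake.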
Result: ~40% of RPC calls eliminated entirely.
Bottleneck 3: Database Query Patterns
The profiler showed 21% of time in DB queries. Three specific patterns:
Missing indexes on foreign keys. Three join tables had foreign keys with no index. On a small dataset this is invisible. On millions of rows it’s a full-table scan per transaction.
```sql
-- Two of the missing indexes. Non-negotiable on high-volume tables.
CREATE INDEX CONCURRENTLY idx_transactions_chain_id ON transactions(chain_id);
CREATE INDEX CONCURRENTLY idx_transactions_status_created ON transactions(status, created_at DESC);
```
N+1 in the routing query. The routing logic fetched a route, then fetched each hop individually. Classic ORM trap.
```typescript
// Before: N+1 (1 route query + N hop queries)
const route = await db.route.findOne({ where: { id } });
const hops = await Promise.all(
  route.hopIds.map((hopId) => db.hop.findOne({ where: { id: hopId } }))
);
```

```typescript
// After: single query with joins
const route = await db.route.findOne({
  where: { id },
  relations: ['hops', 'hops.token', 'hops.pool'],
});
```
Write batching for audit events. Every transaction generated multiple audit log entries. Writing them individually was creating write pressure. Batch insert with a small flush interval:
```typescript
class AuditBuffer {
  private buffer: AuditEvent[] = [];
  private readonly maxSize = 100;

  constructor(flushMs = 100) {
    // Interval flush so low-traffic periods don't strand buffered events.
    setInterval(() => this.flush(), flushMs).unref();
  }

  add(event: AuditEvent) {
    this.buffer.push(event);
    if (this.buffer.length >= this.maxSize) void this.flush();
  }

  async flush() {
    if (!this.buffer.length) return;
    const batch = this.buffer.splice(0);
    await db.auditEvent.createMany({ data: batch });
  }
}
```
What Didn’t Work
Switching RPC providers. We tried multiple providers, assuming latency differences would be significant. They weren’t — the variance was smaller than our internal processing overhead.
In-memory queue for transaction ordering. I tried buffering incoming transactions and processing them in optimized batches. The ordering guarantees required made this too complex to be worth it. The simpler concurrency control worked better.
Caching route calculations. Routes are highly dependent on current chain state (gas, liquidity, mempool conditions). A cached route from 30 seconds ago is often wrong. The cache hit rate was too low to justify the complexity.
The Architecture After
```
Incoming transaction
        │
        ▼
Concurrency limiter (p-limit)
        │
        ▼
 ┌────────────┐
 │ RPC Cache  │◄─── Redis (TTL-keyed metadata)
 └────────────┘
        │
        ▼
Parallelized RPC calls (×8 concurrent)
        │
        ▼
Route resolution (indexed DB query, no N+1)
        │
        ▼
Async audit write (buffered, 100ms flush)
        │
        ▼
Result / emit
```
Numbers
| Metric | Before | After | Change |
|---|---|---|---|
| Avg transaction latency | 920ms | 290ms | −68% |
| Throughput (tx/sec) | 47 | 148 | +3.1× |
| RPC calls per transaction | 4.8 | 2.9 | −40% |
| DB query time per tx | ~190ms | ~42ms | −78% |
The 3× number was real, but it came from boring fundamentals: measure, parallelize, cache, fix indexes. Not from clever algorithms or exotic infrastructure.
Takeaway
Every “performance problem” I’ve worked on has followed the same pattern:
- The assumed bottleneck is wrong
- The actual bottleneck is obvious once you measure
- The fix is usually simple once you know where to look
The discipline is in the instrumentation. Skip it and you’re optimizing blind.