Implementing Redis caching in a production NestJS API reduced average response times from 450ms to 42ms, a roughly 10x improvement, with a cache hit ratio of 89% and database load reduced by roughly 70%. These numbers are from a real NestJS project at Commsult Indonesia, where a dashboard endpoint was hitting PostgreSQL with a complex 6-table JOIN on every request. Redis is not magic, but applied correctly, it is the single highest-leverage performance tool available to a NestJS developer.
NestJS's default @nestjs/cache-manager uses in-memory storage, which works for a single-process setup but falls short as soon as you scale to multiple instances (each process keeps its own private cache) or restart your server (the cache is wiped). Redis solves both problems: it runs outside your application process, so cached data survives app and container restarts (and, with persistence enabled, restarts of Redis itself), and it is shared across all instances of your NestJS app. A single Redis instance (even a $15/month managed Redis on DigitalOcean) can handle hundreds of thousands of operations per second with sub-millisecond latency. For most NestJS APIs, in-process caching is a local optimization; Redis is a system-level optimization.
Install @nestjs/cache-manager, cache-manager, and @keyv/redis. Configure the CacheModule in your AppModule with the Redis store. Once configured, you can apply @UseInterceptors(CacheInterceptor) to a controller or handler to cache its responses automatically; the cache key is derived from the request URL (route plus query string), and only GET endpoints are cached. The CacheModule supports TTL configuration at both the module level and the per-endpoint level using @CacheTTL(). With the Keyv-based cache-manager releases that @keyv/redis targets, TTL values are in milliseconds.
Cache-aside (also called lazy loading) is the most common and flexible pattern. The application checks the cache first; on a miss, it fetches from the database, stores the result in Redis, and returns it. On subsequent requests, the cache hit is served directly. This pattern gives you control over what gets cached and when. The trade-off is that the first request for a key that is missing or expired always hits the database, which means cache warming matters for high-traffic endpoints.
┌─────────────────────────────────────────────────────┐
│              CACHE-ASIDE PATTERN FLOW               │
└─────────────────────────────────────────────────────┘

Request
   │
   ▼
NestJS Service
   │
   ├─► Redis cache ──► HIT ──► Return cached data
   │        │
   │        └─► MISS
   │              │
   │              ▼
   └──────► PostgreSQL DB
                  │
                  ▼
         Store in Redis (TTL)
                  │
                  ▼
           Return to client

From my experience with NestJS at Commsult Indonesia, use Redis pub/sub to implement event-driven cache invalidation rather than relying on TTL expiry alone. When a user updates an ERP record, publish a cache invalidation event on a Redis pub/sub channel. A separate cache manager service subscribes to that channel and deletes the affected keys immediately. This keeps your cache fresh without waiting for TTL expiry, which can serve stale data for minutes. Redis pub/sub is fire-and-forget, so pair this with a reasonable TTL as a fallback for cases where invalidation events are missed.
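The invalidation flow can be sketched without a live Redis server. In this minimal sketch, `InMemoryBus` is a hypothetical stand-in for Redis pub/sub, a plain `Map` stands in for the cache, and the channel name `cache:invalidate` and the `erp:{recordId}:data` key format are assumptions made for illustration.

```typescript
// Hypothetical stand-in for Redis pub/sub: delivers each message to every
// subscriber of a channel, with no persistence or retry (like Redis pub/sub).
type Handler = (message: string) => void;

class InMemoryBus {
  private readonly handlers = new Map<string, Handler[]>();

  subscribe(channel: string, handler: Handler): void {
    const list = this.handlers.get(channel) ?? [];
    list.push(handler);
    this.handlers.set(channel, list);
  }

  publish(channel: string, message: string): void {
    for (const handler of this.handlers.get(channel) ?? []) handler(message);
  }
}

const INVALIDATE_CHANNEL = "cache:invalidate"; // assumed channel name

// Stand-in for the Redis cache. With Redis, entries would also carry a fallback TTL.
const recordCache = new Map<string, string>();
const bus = new InMemoryBus();

// Listener side (the cache manager service): delete every key named in the event.
bus.subscribe(INVALIDATE_CHANNEL, (message) => {
  const keys: string[] = JSON.parse(message);
  for (const key of keys) recordCache.delete(key);
});

// Writer side: after the database update commits, announce which keys are stale.
function onRecordUpdated(recordId: number): void {
  bus.publish(INVALIDATE_CHANNEL, JSON.stringify([`erp:${recordId}:data`]));
}

recordCache.set("erp:42:data", '{"status":"draft"}');
recordCache.set("erp:43:data", '{"status":"draft"}');
onRecordUpdated(42); // only erp:42:data is removed
```

With a real Redis client, the subscriber typically needs its own connection, because a connection in subscribe mode cannot issue regular commands.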
Cache invalidation is the hardest problem in caching. There are three main strategies: TTL-based expiry (simplest, but potentially serves stale data), event-driven invalidation (accurate, but requires an invalidation event system), and write-through caching (update cache and database together on writes, ensures consistency). For ERP-style data at Commsult Indonesia where records are read frequently but updated infrequently, TTL-based caching with a 5-minute window is acceptable. For financial data or any data where staleness has business consequences, use write-through or event-driven invalidation.
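Of the three strategies, write-through is the one whose shape is least obvious from a description, so here is a minimal sketch. It assumes hypothetical in-memory stand-ins (`profileTable` for PostgreSQL, `profileCache` for Redis) and is synchronous only because those stand-ins are; real calls would be awaited.

```typescript
interface UserProfile {
  id: number;
  name: string;
}

// Hypothetical in-memory stand-ins: profileTable plays PostgreSQL, profileCache plays Redis.
const profileTable = new Map<number, UserProfile>();
const profileCache = new Map<string, UserProfile>();

const profileKey = (userId: number): string => `user:${userId}:profile`;

// Write path: persist first, then refresh the cache entry as part of the same operation.
// If the database write throws, the cache is left untouched.
function saveProfile(profile: UserProfile): void {
  profileTable.set(profile.id, { ...profile });
  profileCache.set(profileKey(profile.id), { ...profile });
}

// Read path: the cache already reflects the latest write, so a hit is never stale.
function getProfile(userId: number): UserProfile | undefined {
  const cached = profileCache.get(profileKey(userId));
  if (cached !== undefined) return cached;
  const row = profileTable.get(userId);
  if (row !== undefined) profileCache.set(profileKey(userId), { ...row });
  return row;
}

saveProfile({ id: 1234, name: "Matthews" });
saveProfile({ id: 1234, name: "Matthews K." }); // the update overwrites the cached copy
```

With a real database and Redis the two writes are not atomic, so a failure between them still needs a TTL or a retry as a safety net.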
Cache key design is critical. Use namespaced keys: user:{userId}:profile, report:{reportId}:data, inventory:{productId}. This makes targeted invalidation possible: when a user updates their profile, you can remove every key matching user:{userId}:* without touching other users' data. Note that DEL does not accept glob patterns, so iterate over the matching keys with SCAN and MATCH and delete what it returns. Avoid using raw query strings as cache keys, since they can be extremely long and expose sensitive parameters in Redis MONITOR output. Hash complex query parameters with a fast hash function and use the hash as part of the key.
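One way to build such keys is sketched below. The helper name `buildCacheKey`, the `:q:` separator, and the choice of 32-bit FNV-1a as the "fast hash function" are all assumptions for illustration; a 32-bit hash can collide, so a longer digest may be preferable where many distinct queries share a namespace.

```typescript
// FNV-1a (32-bit): a small non-cryptographic hash over the string's UTF-16 code units.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep 32 bits unsigned
  }
  return hash.toString(16).padStart(8, "0");
}

type QueryParams = Record<string, string | number | boolean>;

// Builds "<namespace>:<id>" and, when query parameters are present,
// appends ":q:<hash>" so raw parameter values never appear in the key.
function buildCacheKey(namespace: string, id: string | number, params: QueryParams = {}): string {
  const base = `${namespace}:${id}`;
  const names = Object.keys(params).sort(); // sort so parameter order does not change the key
  if (names.length === 0) return base;
  const canonical = names
    .map((name) => `${encodeURIComponent(name)}=${encodeURIComponent(String(params[name]))}`)
    .join("&");
  return `${base}:q:${fnv1a(canonical)}`;
}

const plainKey = buildCacheKey("report", 5678);
const filteredKey = buildCacheKey("report", 5678, { from: "2024-01-01", to: "2024-01-31" });
```

Because the namespace and id stay in clear text, a pattern such as report:5678:* still matches every hashed variant of that report.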
# Redis configuration: /etc/redis/redis.conf
maxmemory 512mb
maxmemory-policy allkeys-lru
save 900 1
save 300 10

# Shell: check cache hit ratio
redis-cli INFO stats | grep -E "keyspace_hits|keyspace_misses"

# redis-cli: namespaced key pattern
SET user:1234:profile '{"name":"Matthews","role":"admin"}' EX 300
SET report:5678:data '{"rows":42,"total":150000}' EX 600

# Shell: invalidate a single user's cache (DEL does not expand wildcards, so SCAN for the keys first)
redis-cli --scan --pattern 'user:1234:*' | xargs redis-cli DEL

// NestJS: manual cache-aside pattern (inside a service method)
const cacheKey = `report:${reportId}:data`
const cached = await this.cacheManager.get(cacheKey)
if (cached != null) return cached // a miss is undefined or null, depending on the cache-manager version
const data = await this.db.getReport(reportId)
await this.cacheManager.set(cacheKey, data, 300_000) // TTL in milliseconds (cache-manager v5+): 5 minutes
return data

For very high-traffic endpoints, consider a two-layer cache: a short-lived in-process LRU cache (using node-lru-cache or similar) for the hottest keys, backed by Redis for the broader cache. The in-process cache eliminates Redis network round-trips for the most frequently accessed data. The Redis layer serves the rest and ensures consistency across instances. Keep the in-process TTL short (a few seconds), because each instance's local copy can lag behind Redis until it expires. This pattern can bring latency for the hottest keys into the sub-millisecond range while keeping Redis as the source of truth for cached data.
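A minimal sketch of the two-layer read path follows. `TinyLru` is a hypothetical stand-in for a library such as lru-cache, and `RemoteCache` is an assumed interface that a real implementation would back with a Redis client; neither is an existing API.

```typescript
// Minimal LRU with per-entry expiry, built on Map insertion order.
// A stand-in for a real LRU library; not tuned for production use.
class TinyLru<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly maxEntries: number;
  private readonly ttlMs: number;

  constructor(maxEntries: number, ttlMs: number) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
  }

  get(key: string, now: number = Date.now()): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= now) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    this.entries.delete(key); // re-insert to mark as most recently used
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V, now: number = Date.now()): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
    if (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value; // first key = least recently used
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}

// Assumed shape of the shared layer; a real implementation would wrap a Redis client.
interface RemoteCache {
  get(key: string): Promise<string | undefined>;
}

class TwoLayerCache {
  private readonly local: TinyLru<string>;
  private readonly remote: RemoteCache;

  constructor(local: TinyLru<string>, remote: RemoteCache) {
    this.local = local;
    this.remote = remote;
  }

  async get(key: string): Promise<string | undefined> {
    const hot = this.local.get(key);
    if (hot !== undefined) return hot; // served without a network round-trip
    const value = await this.remote.get(key);
    if (value !== undefined) this.local.set(key, value); // promote into the local layer
    return value;
  }
}
```

The local layer is populated only on reads here; writes and invalidations would still go to Redis, with the short local TTL bounding how long an instance can serve an outdated copy.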
I once deployed Redis on a 1GB VPS without configuring maxmemory, and a memory leak in our caching layer filled the instance completely. Redis ran out of memory and crashed, and our NestJS app fell through to the database for every request simultaneously, causing a cascade failure that took 20 minutes to recover from. Always set maxmemory in redis.conf (e.g., maxmemory 512mb) and choose an appropriate eviction policy: allkeys-lru works for most caching use cases and evicts the least-recently-used keys when memory is full rather than crashing.
Track these Redis metrics in production: keyspace_hits and keyspace_misses (calculate your hit ratio as hits / (hits + misses)), used_memory vs maxmemory, evicted_keys (a steadily rising count means you need more memory or a different eviction policy), and connected_clients. Expose these metrics to Prometheus via redis_exporter and build a Grafana dashboard. A healthy cache should show a hit ratio above 80%. Below that, either your TTLs are too short, your key strategy is wrong, or you're not caching the right endpoints.
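The hit ratio calculation can be sketched as a small parser over the text that INFO stats returns. The function names are hypothetical and the sample counters are made up for illustration.

```typescript
// Parses "name:value" lines from Redis INFO output, skipping section headers ("# Stats").
function parseInfo(info: string): Map<string, string> {
  const fields = new Map<string, string>();
  for (const line of info.split(/\r?\n/)) {
    if (line === "" || line.startsWith("#")) continue;
    const sep = line.indexOf(":");
    if (sep > 0) fields.set(line.slice(0, sep), line.slice(sep + 1));
  }
  return fields;
}

// Returns hits / (hits + misses), or undefined when the counters are missing or both zero.
function hitRatio(info: string): number | undefined {
  const fields = parseInfo(info);
  const hits = Number(fields.get("keyspace_hits") ?? NaN);
  const misses = Number(fields.get("keyspace_misses") ?? NaN);
  if (!Number.isFinite(hits) || !Number.isFinite(misses)) return undefined;
  const total = hits + misses;
  return total === 0 ? undefined : hits / total;
}

// Illustrative sample only; real output contains many more fields.
const sampleInfo = "# Stats\r\nkeyspace_hits:890\r\nkeyspace_misses:110\r\nevicted_keys:0\r\n";
const ratio = hitRatio(sampleInfo); // 890 / (890 + 110)
```

These counters are cumulative since the server started (or since the stats were last reset), so comparing two samples over a time window gives a more useful ratio than the lifetime figure.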
Not everything should be in Redis. Cache data that is expensive to compute or fetch (complex JOIN queries, external API responses, computed aggregates) and read much more often than it is written. Do not cache data that must be real-time accurate (financial balances, inventory counts in high-turnover environments), data that is already fast to fetch (a single-row primary key lookup is already <1ms in PostgreSQL with proper indexes), or authentication tokens (use Redis for sessions instead, with explicit expiry tied to the session lifetime).