A default Nginx installation on a 4-core VPS handles roughly 5,000-10,000 requests per second out of the box. With proper tuning — worker processes, connection limits, kernel parameters, and caching — the same hardware can handle 50,000+ requests per second for static content and 20,000+ for proxy workloads. I learned Nginx tuning the hard way when a product launch at Commsult Indonesia sent traffic spikes that brought down an untuned Nginx instance serving a NestJS API. This guide documents every tuning lever I now apply to every production Nginx deployment.
The first tuning targets are worker_processes and worker_connections in nginx.conf. Set worker_processes auto so that Nginx detects your CPU core count and spawns one worker per core. Each worker handles connections independently, fully utilizing multi-core hardware. worker_connections defines how many simultaneous connections each worker handles; 4096-8192 is appropriate for most production servers. The total connection capacity is worker_processes times worker_connections: on a 4-core server with worker_connections 8192, that is 4 × 8192 = 32,768 concurrent connections. This count includes connections to upstream servers, so when Nginx is proxying, each client request typically uses two connections and the effective client capacity is about half.
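The events block controls connection handling: use epoll on Linux (the most efficient I/O event mechanism there, and the one Nginx picks by default when it is available) and enable multi_accept on so a worker accepts all pending connections at once when notified (reduces latency for connection bursts). Outside the events block, in the main context, set worker_rlimit_nofile to 65535 so the per-worker file descriptor limit is in line with the system limit and comfortably above worker_connections. Without worker_rlimit_nofile, Nginx workers inherit the default limit, commonly 1024 file descriptors, and start failing to accept connections under moderate load. The only sign is "Too many open files" entries in the error log, which makes this an easy performance cliff to miss.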
In the http block: enable sendfile on (zero-copy file serving that bypasses userspace buffers), enable tcp_nopush on (only effective together with sendfile; sends the response headers and the beginning of the file in one packet and the rest of the file in full packets), enable tcp_nodelay on (disables Nagle's algorithm for real-time responsiveness), set keepalive_timeout 65 (keep connections alive to reduce TCP handshake overhead for repeat visitors), and set client_max_body_size to a reasonable limit for your use case (the default of 1m, i.e. 1 MB, is too low for file upload APIs).
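As a minimal sketch of the client_max_body_size advice, one option is to keep a small default for the whole site and raise the limit only where uploads are expected. The server name, the /api/upload path, the 50m value, and the backend address are illustrative assumptions, not values from the setup described here.

```nginx
# Fragment for the http context of nginx.conf.
# Hypothetical names: api.example.com, /api/upload, 127.0.0.1:3000.
server {
    listen 80;
    server_name api.example.com;

    # Conservative default for ordinary API requests.
    client_max_body_size 1m;

    location /api/upload {
        # Larger limit only for the upload endpoint.
        client_max_body_size 50m;
        proxy_pass http://127.0.0.1:3000;
    }

    location / {
        proxy_pass http://127.0.0.1:3000;
    }
}
```

Requests whose body exceeds the limit in effect for their location are rejected with 413 Request Entity Too Large, so the backend never sees them.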
NGINX PERFORMANCE TUNING STACK

Tuning steps:
  [1] worker_processes auto
  [2] worker_connections 8192
  [3] use epoll + multi_accept
  [4] sendfile + tcp_nopush
  [5] keepalive_timeout 65
  [6] gzip_comp_level 6
  [7] Kernel sysctl tuning
  [8] TLS session cache
  [9] Proxy cache for APIs

Approximate throughput on a 4-core VPS as the steps are applied:
  Default Nginx: 5,000 - 10,000 req/s
  -> ~15,000 req/s
  -> ~30,000 req/s
  -> 50,000+ req/s (static content)

From my experience tuning Nginx on DigitalOcean Droplets for Commsult Indonesia, the biggest single performance gain came from kernel sysctl tuning, not Nginx config. Adding net.core.somaxconn=65535, net.ipv4.tcp_tw_reuse=1, and fs.file-max=2097152 to /etc/sysctl.conf and running sysctl -p increased our sustained throughput under burst load by roughly 40%. Nginx config changes give diminishing returns if the kernel is still bottlenecked.
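One caveat worth checking: on Linux, Nginx asks for a listen backlog of 511 by default, so a larger net.core.somaxconn only raises the ceiling and does not by itself lengthen Nginx's accept queue. A hedged sketch of a listen directive that requests a deeper queue follows. The server name and the value 65535 (chosen to mirror the sysctl setting above) are assumptions, not part of the original setup.

```nginx
# Fragment for the http context of nginx.conf.
server {
    # backlog= sets the queue length passed to listen(); the kernel
    # caps it at net.core.somaxconn.
    # Socket parameters like this can be given only once per address:port pair.
    listen 80 backlog=65535;
    server_name example.com;   # hypothetical

    location / {
        return 200 "ok\n";
    }
}
```

Whether the deeper queue helps depends on how bursty the incoming connections are, so it is worth benchmarking with and without it.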
Enabling gzip compression reduces response sizes by 60-80% for text-based content (HTML, JSON, CSS, JS), directly reducing bandwidth costs and improving load times for clients on slow connections. Indonesia has significant mobile internet usage on 4G networks where bandwidth matters. Configure: gzip on, gzip_types text/plain text/css application/json application/javascript text/xml, gzip_comp_level 6 (level 1-9, 6 balances CPU cost and compression ratio), gzip_min_length 1000 (skip compressing tiny responses where overhead exceeds savings), and gzip_vary on (adds Vary: Accept-Encoding header for proper CDN/proxy caching).
For Nginx proxying to a backend API (NestJS, Next.js), proxy caching dramatically reduces backend load for cacheable responses. Configure a proxy_cache_path, key the cache on request method and URI, and use proxy_cache_bypass together with proxy_no_cache so that authenticated requests are neither served from the cache nor stored in it. For public read-only API endpoints (product listings, blog posts, public data), even a 30-second cache TTL absorbs most of a traffic spike: with proxy_cache_lock enabled so that concurrent misses collapse into one upstream request, a product page receiving 1,000 requests in 30 seconds results in roughly 1 database query instead of 1,000.
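The caching setup described above could look roughly like the following. This is a sketch under stated assumptions: the zone name api_cache, the cache directory, the /api/public/ prefix, the sizes, and the backend address are all hypothetical, and treating the Authorization header as the marker of an authenticated request is one possible choice among several.

```nginx
# Fragment for the http context of nginx.conf.
# Hypothetical names: api_cache, /var/cache/nginx/api, 127.0.0.1:3000.
proxy_cache_path /var/cache/nginx/api levels=1:2 keys_zone=api_cache:10m
                 max_size=1g inactive=10m use_temp_path=off;

server {
    listen 80;
    server_name api.example.com;

    location /api/public/ {
        proxy_pass http://127.0.0.1:3000;

        proxy_cache api_cache;
        # Cache key built from method, host, and URI (including query string).
        proxy_cache_key "$request_method$host$request_uri";
        # Cache successful responses for 30 seconds.
        proxy_cache_valid 200 30s;

        # Collapse concurrent misses for the same key into one upstream request.
        proxy_cache_lock on;

        # Requests with an Authorization header skip the cache
        # and their responses are not stored.
        proxy_cache_bypass $http_authorization;
        proxy_no_cache $http_authorization;

        # Expose HIT / MISS / BYPASS for debugging.
        add_header X-Cache-Status $upstream_cache_status;
    }
}
```

Cache-Control or Expires headers sent by the backend take precedence over proxy_cache_valid, so a backend that marks responses as uncacheable will see no benefit until those headers are adjusted.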
# /etc/nginx/nginx.conf: production tuning
worker_processes auto;          # one worker per CPU core
worker_rlimit_nofile 65535;     # per-worker file descriptor limit (main context)

events {
    use epoll;
    multi_accept on;
    worker_connections 8192;
}

http {
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;

    gzip on;
    gzip_types text/plain text/css application/json
               application/javascript text/xml;
    gzip_comp_level 6;
    gzip_min_length 1000;
    gzip_vary on;

    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 1d;
    ssl_protocols TLSv1.2 TLSv1.3;
}

# /etc/sysctl.conf: kernel tuning
net.core.somaxconn = 65535
net.ipv4.tcp_tw_reuse = 1
fs.file-max = 2097152

# Benchmark before and after
wrk -t4 -c400 -d30s https://yourdomain.com

TLS adds computational overhead, but a modern configuration minimizes it. Use TLSv1.2 and TLSv1.3 only (drop older versions), prefer ECDHE cipher suites (Elliptic Curve Diffie-Hellman provides forward secrecy at far lower CPU cost than classic finite-field Diffie-Hellman, whereas plain RSA key exchange provides no forward secrecy at all), enable ssl_session_cache shared:SSL:10m (caches TLS session parameters to avoid full handshakes for returning clients), and set ssl_session_timeout 1d. TLSv1.3 can be around 40% faster than TLSv1.2 for new connections because its handshake completes in one round trip instead of two. Ensure your Nginx version supports it (Nginx 1.13+ with OpenSSL 1.1.1+).
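To make the TLS settings concrete, here is a minimal sketch of an HTTPS server block restricted to ECDHE suites. The server name and certificate paths are hypothetical, and the cipher list is one reasonable selection of standard OpenSSL suite names rather than the list used in the deployment described here.

```nginx
# Fragment for the http context of nginx.conf.
# Certificate paths and server name are hypothetical.
server {
    listen 443 ssl;
    server_name example.com;

    ssl_certificate     /etc/ssl/certs/example.com.crt;
    ssl_certificate_key /etc/ssl/private/example.com.key;

    ssl_protocols TLSv1.2 TLSv1.3;

    # ECDHE-only suites for TLSv1.2. TLSv1.3 suites are selected
    # by OpenSSL separately and are not affected by this list.
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305;

    # Shared across workers so any worker can resume a session.
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 1d;
}
```

According to the Nginx documentation, one megabyte of shared session cache holds about 4,000 sessions, so the 10m zone has room for roughly 40,000.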
I made the mistake of applying tuning parameters from a blog post without benchmarking before and after. Some settings actually decreased performance for our specific workload — worker_connections set too high caused memory pressure on a 1GB Droplet because Nginx allocates buffers per connection. Always benchmark with wrk or ApacheBench before and after each tuning change, measuring latency percentiles (p50, p95, p99) not just average throughput. Averages hide the tail latency problems that cause user-visible slowness.
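Expose the stub_status directive from ngx_http_stub_status_module in a location such as /nginx_status inside a server block, and restrict it to localhost. The module is not built by default when compiling from source (it needs --with-http_stub_status_module), though packaged builds typically include it. It exposes active connections, accepted/handled connections, and total requests. Scrape this with nginx-prometheus-exporter for Grafana dashboards. Key metrics to track: active connections (approaching worker_processes times worker_connections indicates saturation), requests per second (baseline and peak), and 4xx/5xx error rates, which stub_status does not report and which have to come from the access logs (5xx proxy errors such as 502 and 504 indicate upstream problems; a 4xx spike indicates a client-side issue or potential attack).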
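A status endpoint restricted to localhost might be sketched as follows. The dedicated listener on 127.0.0.1:8080 is an assumption made here to keep the endpoint off the public interface; the same location block could instead live in an existing server block.

```nginx
# Fragment for the http context of nginx.conf.
# The 127.0.0.1:8080 listener is a hypothetical choice.
server {
    listen 127.0.0.1:8080;

    location = /nginx_status {
        stub_status;        # basic connection and request counters
        allow 127.0.0.1;    # local scrapers only
        deny all;
        access_log off;     # keep scrape traffic out of the access log
    }
}
```

With this in place, curl http://127.0.0.1:8080/nginx_status on the server returns the counters as plain text, and an exporter running on the same host can be pointed at that URL.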
A production-ready Nginx config combines all these elements: worker_processes auto with epoll and multi_accept, sendfile/tcp_nopush/tcp_nodelay enabled, keepalive tuned, gzip enabled for text content, TLS hardened with session caching, proxy caching for API endpoints, and rate limiting for public endpoints. Apply kernel sysctl tuning in /etc/sysctl.conf and load it with sysctl -p (no reboot is needed), then check the Nginx config with nginx -t and restart Nginx. Run wrk -t4 -c400 -d30s https://yourdomain.com before and after to quantify the improvement.
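Rate limiting is the one item in that list without a configuration shown elsewhere in this guide, so here is a hedged sketch of what it could look like. The zone name api_limit, the 10r/s rate, the burst of 20, the /api/ prefix, and the backend address are all illustrative assumptions to be tuned per workload.

```nginx
# Fragment for the http context of nginx.conf.
# Hypothetical names and values: api_limit, 10r/s, burst=20, 127.0.0.1:3000.

# Track clients by IP address in a 10 MB shared zone.
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;

server {
    listen 80;
    server_name api.example.com;

    location /api/ {
        # Allow short bursts of up to 20 extra requests without delaying them;
        # anything beyond that is rejected.
        limit_req zone=api_limit burst=20 nodelay;
        # Reply with 429 instead of the default 503.
        limit_req_status 429;

        proxy_pass http://127.0.0.1:3000;
    }
}
```

Using 429 rather than the default 503 keeps rejected requests distinguishable from genuine upstream failures in the 4xx/5xx error-rate metrics.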