When should I use Cloud Run instead of a Compute Engine VM?

Cloud Run is the right default for most HTTP APIs and web apps because it handles scaling and TLS certificates automatically and, by default, bills only while requests are being served. Switch to Compute Engine only when you need OS-level control, such as custom kernel modules, raw socket access, GPU workloads, or persistent background processes that don't fit the request-response model.

What are the practical constraints of running on Cloud Run?

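Requests time out after 5 minutes by default, and the limit can be raised to at most 60 minutes. WebSocket connections work without extra configuration but are subject to that same request timeout, so clients need reconnect logic. Under the default request-based billing, CPU is throttled outside of request handling, so background daemons that need persistent CPU do not fit. Additionally, Cloud Run scales to zero by default, which can add 2–4 seconds of latency on the first request after an idle period; setting --min-instances 1 keeps one instance warm for production endpoints.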

Why use Cloud SQL instead of running Postgres on a VM?

Cloud SQL handles backups, point-in-time recovery, automatic minor version updates, and high-availability failover out of the box. The cost premium over self-hosting Postgres on a VM is modest, while the operational savings are significant—especially for a small team that would otherwise need to manage OS patching, disk, and failover manually.

How does the Cloud SQL Auth Proxy work when connecting Cloud Run to Cloud SQL?

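On Cloud Run the Cloud SQL Auth Proxy is built in rather than deployed as your own sidecar: once the instance is attached with --add-cloudsql-instances, Cloud Run exposes a Unix socket under /cloudsql/ and handles IAM authorization and encryption for the connection. The connection string uses that Unix socket path instead of a hostname, so no IP allowlisting is needed. The service account's Cloud SQL Client role controls who can reach the instance, and the database's own user credentials still apply on top of that.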

When does it actually make sense to use GKE?

GKE is appropriate when you specifically need Kubernetes features such as custom operators, StatefulSets for clustered databases, complex inter-service networking with a service mesh, or multi-cloud portability. It is not the right choice simply because your team wants to use Kubernetes: GKE adds substantial operational complexity, including cluster upgrades, node pool management, and RBAC configuration.

Google Cloud Managed Services vs Raw Compute: When to Use Which

Google Cloud offers so many compute and data services that choosing between them feels like a trap. Cloud Run, GKE, App Engine, Compute Engine, Cloud Functions—and then a separate tier of managed data services on top. After running infrastructure for Commsult Indonesia's projects on GCP for a couple of years, I've developed a simple mental model for which service belongs where. The answer is almost never 'it depends'—there are concrete signals that point to the right choice.

Dimension	Managed Services	Raw Compute
Control	Google manages patching, scaling, and failover	Full control over OS, runtime, and configuration
Setup speed	Deploy Cloud Run or App Engine in minutes	Provision Compute Engine VMs, configure networking and OS yourself
Cost model	Pay per request or usage, scales to zero	Pay for reserved VM capacity whether it's used or not
Operational overhead	Google handles OS patches and load balancing	You own patching, monitoring, and capacity planning
Customization	Constrained to the platform's supported runtimes	Any OS, kernel module, or custom binary
Best fit	Stateless APIs, event-driven workloads, fast iteration	Legacy workloads, custom networking, GPU-heavy or license-bound software

The Fundamental Split: Control vs Convenience

Every GCP compute service lives on a spectrum between maximum control (raw VMs) and maximum convenience (fully managed). The trade-off is real: more control means more operational burden—you patch the OS, configure the reverse proxy, handle SSL renewal, set up monitoring agents. More convenience means accepting constraints—fixed runtimes, request timeout limits, cold starts, vendor lock-in on configuration format. Neither end of the spectrum is universally better; the right choice depends on your team size, your deployment frequency, and how much of your engineering time you want to spend on infrastructure versus product.

Cloud Run: The Default Choice for HTTP Workloads

Cloud Run is where I start every new HTTP API or web app unless there's a specific reason not to. It's a fully managed container platform: you give it a container image, it runs it, scales it to zero when idle, and scales it up on demand. By default, billing is request-based (CPU is only allocated while a request is being processed), which makes it extremely cost-effective for APIs with variable traffic. The constraints are real: requests time out after 5 minutes by default (configurable up to 60 minutes), WebSocket connections are subject to that same request timeout, and under the default CPU allocation you can't run background daemons that need persistent CPU.
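The timeout and CPU constraints above map onto service-level flags. As a minimal sketch, assuming a service called my-api in asia-southeast2 (both placeholders) and assuming always-allocated CPU is acceptable for your cost model, the hypothetical helper tuning_cmd below only prints the gcloud command so it can be reviewed before anything is changed:

```shell
#!/bin/sh
# Print (do not run) a gcloud command that raises the request timeout and
# keeps CPU allocated between requests. Service and region are placeholders.
tuning_cmd() {
  service="$1"
  region="$2"
  timeout_s="$3"
  # Cloud Run caps the request timeout at 3600 seconds (60 minutes).
  if [ "$timeout_s" -gt 3600 ]; then
    echo "timeout must be at most 3600 seconds" >&2
    return 1
  fi
  # --no-cpu-throttling switches the service to always-allocated CPU.
  echo "gcloud run services update $service --region $region --timeout $timeout_s --no-cpu-throttling"
}

tuning_cmd my-api asia-southeast2 900
```

Pipe the output to sh once it looks right. Always-allocated CPU is billed for the whole instance lifetime, so it trades the per-request cost model for the ability to do work between requests.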

When to Reach for Compute Engine

Raw Compute Engine VMs are appropriate when you need OS-level control: custom kernel modules, raw socket access, GPU workloads, or persistent background processes that don't fit the request-response model. I also use GCE when I need to run Docker Swarm across multiple VMs for internal tooling—Swarm isn't supported on Cloud Run. The operational overhead is significant: you're responsible for OS updates, disk management, firewall rules, SSH key rotation, and monitoring agent setup.

GCP Service Decision Tree
  ─────────────────────────

  Need to run code?
        │
        ▼
  ┌─────────────────────────────────────────┐
  │ Do you need OS-level control?           │
  │ (custom kernel, raw sockets, GPU, etc.) │
  └───────┬──────────────────┬──────────────┘
          │ YES              │ NO
          ▼                  ▼
    Compute Engine      Stateless HTTP?
    (raw VM, GCE)            │
                       ┌─────┴──────┐
                       │ YES        │ NO
                       ▼            ▼
                  Cloud Run     Need containers?
                  (fully        │
                  managed)   ┌──┴───┐
                             │ YES  │ NO
                             ▼      ▼
                           GKE   App Engine
                         (k8s)  (PaaS, flex/std)

  Need managed data?
  ├── Relational  → Cloud SQL (Postgres/MySQL)
  ├── Serverless  → Firestore / Spanner
  ├── Cache       → Memorystore (Redis)
  └── Object      → Cloud Storage (GCS)
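
The compute half of the tree above can be written down as a small function, which is handy for checking that the branches are exhaustive. This is only a sketch of the diagram, not an official selection tool; pick_compute and its yes/no arguments are hypothetical names introduced for illustration:

```shell
#!/bin/sh
# Encode the compute decision tree. Each argument is "yes" or "no":
#   $1 = needs OS-level control, $2 = stateless HTTP, $3 = needs containers
pick_compute() {
  needs_os_control="$1"
  stateless_http="$2"
  needs_containers="$3"
  if [ "$needs_os_control" = "yes" ]; then
    echo "Compute Engine"
  elif [ "$stateless_http" = "yes" ]; then
    echo "Cloud Run"
  elif [ "$needs_containers" = "yes" ]; then
    echo "GKE"
  else
    echo "App Engine"
  fi
}

pick_compute no yes no   # prints: Cloud Run
```

The order of the checks mirrors the diagram: OS-level control wins first, and GKE is only reached when the workload is neither VM-bound nor a stateless HTTP service.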

Use 'gcloud run services describe SERVICE --format=json | jq .status.traffic' to inspect the current traffic split across revisions. This is useful when you're doing a gradual rollout and want to confirm traffic is shifting as expected before promoting a new revision to 100%.
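
For the rollout itself, one step might look like the sketch below. It assumes a revision name such as my-api-00042-abc (a placeholder; real names come from gcloud run revisions list), and the hypothetical helper canary_cmds prints the commands rather than running them:

```shell
#!/bin/sh
# Print the commands for one step of a gradual rollout. Nothing is executed.
canary_cmds() {
  service="$1"
  region="$2"
  revision="$3"   # revision that should receive the traffic share
  percent="$4"    # whole number between 0 and 100
  case "$percent" in
    ''|*[!0-9]*) echo "percent must be a whole number" >&2; return 1 ;;
  esac
  if [ "$percent" -gt 100 ]; then
    echo "percent must be between 0 and 100" >&2
    return 1
  fi
  # Shift the requested share of traffic to the revision...
  echo "gcloud run services update-traffic $service --region $region --to-revisions $revision=$percent"
  # ...then confirm the split that is actually in effect.
  echo "gcloud run services describe $service --region $region --format=json | jq .status.traffic"
}

canary_cmds my-api asia-southeast2 my-api-00042-abc 10
```

Once the new revision looks healthy, gcloud run services update-traffic with --to-latest sends all traffic to the latest ready revision.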

Managed Data Services: Almost Always the Right Call

On the data layer, the calculus is simpler: unless you have a very specific reason to self-host your database, use Cloud SQL for Postgres or MySQL. Cloud SQL handles backups, point-in-time recovery, automatic minor version updates, and high-availability failover. The cost premium over running Postgres yourself on a VM is modest, and the operational savings are enormous—especially for a small team. For caching, Memorystore (managed Redis) eliminates the operational burden of managing Redis replication and failover.
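
To make the managed features concrete, here is a sketch of creating an instance with high availability, backups, and point-in-time recovery switched on. The instance name, machine tier, edition, and backup window are assumptions chosen for illustration, not the settings of any real project, and cloudsql_create_cmd is a hypothetical helper that only prints the command:

```shell
#!/bin/sh
# Print a gcloud command that creates a Postgres instance with:
#   --availability-type REGIONAL     standby in a second zone for failover
#   --backup-start-time 18:00        daily automated backups (time is UTC)
#   --enable-point-in-time-recovery  write-ahead log archiving for PITR
# Tier and edition are illustrative; size them for the actual workload.
cloudsql_create_cmd() {
  instance="$1"
  region="$2"
  echo "gcloud sql instances create $instance" \
    "--database-version POSTGRES_16" \
    "--edition enterprise" \
    "--tier db-custom-1-3840" \
    "--region $region" \
    "--availability-type REGIONAL" \
    "--backup-start-time 18:00" \
    "--enable-point-in-time-recovery"
}

cloudsql_create_cmd my-postgres asia-southeast2
```

A regional (high-availability) instance costs more than a zonal one, so this is the kind of flag worth leaving off for dev and staging.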

Connecting Services: Cloud SQL Auth Proxy

When Cloud Run connects to Cloud SQL, the recommended approach is the Cloud SQL Auth Proxy, which handles IAM authorization and encrypts the connection without you adding any IP ranges to the instance's authorized networks. On Cloud Run the proxy is built in: you attach the Cloud SQL instance with --add-cloudsql-instances, and Cloud Run exposes a Unix socket under /cloudsql/ instead of making you deploy a proxy container yourself. The connection string uses that Unix socket path instead of a hostname. This means you never need to allowlist Cloud Run's IP addresses. The service account's Cloud SQL Client role controls who can reach the instance, and the database's own user credentials still apply on top of that.
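
A sketch of what the socket-based connection string and the IAM grant can look like. The user, database, and service account names are placeholders, socket_dsn and grant_client_role_cmd are hypothetical helpers, and the URI follows libpq conventions; some drivers expect the socket directory in a separate host option instead, so check the one you use:

```shell
#!/bin/sh
# Build a libpq-style URI that points at the Unix socket directory Cloud Run
# mounts under /cloudsql/. The password is deliberately left out of the URI.
socket_dsn() {
  db_user="$1"
  db_name="$2"
  conn_name="$3"   # PROJECT:REGION:INSTANCE
  echo "postgresql://$db_user@/$db_name?host=/cloudsql/$conn_name"
}

# Print the IAM grant that lets the runtime service account use the connection.
grant_client_role_cmd() {
  project="$1"
  sa_email="$2"
  echo "gcloud projects add-iam-policy-binding $project --member serviceAccount:$sa_email --role roles/cloudsql.client"
}

socket_dsn app_user erp my-project:asia-southeast2:my-postgres
grant_client_role_cmd my-project my-api-sa@my-project.iam.gserviceaccount.com
```

Keeping the password out of the URI leaves room to supply it separately, for example through a secret mounted as an environment variable.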

# Deploy NestJS API to Cloud Run (managed, scale-to-zero)
# $DATABASE_URL is expanded from the local shell environment
gcloud run deploy my-api \
  --image gcr.io/my-project/my-api:latest \
  --platform managed \
  --region asia-southeast2 \
  --allow-unauthenticated \
  --set-env-vars "DATABASE_URL=$DATABASE_URL" \
  --min-instances 0 \
  --max-instances 10 \
  --memory 512Mi \
  --cpu 1

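# Connect to Cloud SQL over the built-in Unix socket (no IP allowlisting needed)
# --update-env-vars keeps existing variables; --set-env-vars would replace them all
gcloud run services update my-api \
  --region asia-southeast2 \
  --add-cloudsql-instances my-project:asia-southeast2:my-postgres \
  --update-env-vars "DB_HOST=/cloudsql/my-project:asia-southeast2:my-postgres"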

# Compare: raw Compute Engine VM for same workload
gcloud compute instances create my-api-vm \
  --machine-type e2-medium \
  --image-family debian-12 \
  --image-project debian-cloud \
  --zone asia-southeast2-a \
  --tags http-server,https-server
# Then manually: install Node, set up systemd, configure nginx reverse proxy,
# handle SSL, set up monitoring, auto-start on reboot...

GKE: When You Actually Need Kubernetes

GKE (Google Kubernetes Engine) is the right choice when you need Kubernetes-specific features: custom operators, StatefulSets for clustered databases, complex inter-service networking with service mesh, or multi-cloud portability. It is not the right choice because 'we want to use Kubernetes' or because your DevOps job posting mentions it. GKE adds substantial operational complexity—cluster upgrades, node pool management, RBAC configuration, persistent volume provisioning.

Cloud Run's scale-to-zero is a cost win for dev and staging, but for production APIs with steady traffic, a minimum of 1 instance prevents cold starts—which can add 2-4 seconds of latency on the first request after the service scales to zero. Set --min-instances 1 for production endpoints where latency matters, and factor that always-on cost into your estimates.
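
One way to keep that rule out of people's heads is to derive the flag from the deployment environment. A minimal sketch, assuming a DEPLOY_ENV variable and a service called my-api (both hypothetical), with a hypothetical helper min_instances_for:

```shell
#!/bin/sh
# Map an environment name to a --min-instances value:
# production keeps one warm instance, everything else scales to zero.
min_instances_for() {
  case "$1" in
    prod|production) echo 1 ;;
    *) echo 0 ;;
  esac
}

# DEPLOY_ENV is an assumed variable; it defaults to staging when unset.
env_name="${DEPLOY_ENV:-staging}"
echo "gcloud run services update my-api --region asia-southeast2 --min-instances $(min_instances_for "$env_name")"
```

The script prints the command instead of running it, so the chosen value can be checked in CI logs before it is applied. A warm instance only covers the first instance; scaling out under load can still hit cold starts on the additional ones.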

A Real-World Stack: Commsult ERP on GCP

The ERP system I help maintain at Commsult Indonesia runs on: Cloud Run for the NestJS API, Cloud SQL for Postgres 16 (with a read replica for reporting queries), Memorystore for Redis (session cache and BullMQ queue backend), Cloud Storage for uploaded documents and generated PDFs, and a Compute Engine VM for internal Prometheus/Grafana monitoring. This stack covers the full application with zero dedicated DevOps headcount for infrastructure.
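
For orientation, the data services in a stack like this can be provisioned with a handful of commands. The sketch below is not the actual Commsult configuration: every resource name, the Redis size, and the Redis version are placeholders, and the hypothetical helper stack_cmds prints the commands rather than running them:

```shell
#!/bin/sh
# Print provisioning commands for a read replica, a Redis cache, and a bucket.
# $1 = region, $2 = name prefix shared by the placeholder resources.
stack_cmds() {
  region="$1"
  prefix="$2"
  # Read replica of an existing primary named "<prefix>-postgres".
  echo "gcloud sql instances create ${prefix}-postgres-replica --master-instance-name ${prefix}-postgres --region $region"
  # 1 GiB Memorystore for Redis instance for sessions and queues.
  echo "gcloud redis instances create ${prefix}-cache --size 1 --region $region --redis-version redis_7_0"
  # Regional bucket for uploads and generated PDFs.
  echo "gcloud storage buckets create gs://${prefix}-documents --location $region"
}

stack_cmds asia-southeast2 my-erp
```

Bucket names are globally unique, so the placeholder bucket name would need a project-specific prefix in practice.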

Making the Decision

The decision tree is simple: start with Cloud Run for any HTTP workload. If you need OS control or GPU, use Compute Engine. If you need Kubernetes-specific features and have the team to support it, use GKE. For data: use Cloud SQL unless you need unsupported extensions, use Memorystore for Redis, use Cloud Storage for objects. The goal is to spend engineering time on your product, not on infrastructure that Google can manage better and more reliably than a small team can.
