Google Cloud Managed Services vs Raw Compute: When to Use Which

Google Cloud offers so many compute and data services that choosing between them feels like a trap. Cloud Run, GKE, App Engine, Compute Engine, Cloud Functions—and then a separate tier of managed data services on top. After running infrastructure for Commsult Indonesia's projects on GCP for a couple of years, I've developed a simple mental model for which service belongs where. The answer is almost never 'it depends'—there are concrete signals that point to the right choice.
Every GCP compute service lives on a spectrum between maximum control (raw VMs) and maximum convenience (fully managed). The trade-off is real: more control means more operational burden—you patch the OS, configure the reverse proxy, handle SSL renewal, set up monitoring agents. More convenience means accepting constraints—fixed runtimes, request timeout limits, cold starts, vendor lock-in on configuration format. Neither end of the spectrum is universally better; the right choice depends on your team size, your deployment frequency, and how much of your engineering time you want to spend on infrastructure versus product.
Cloud Run is where I start every new HTTP API or web app unless there's a specific reason not to. It's a fully managed container platform: you give it a Docker image, it runs it, scales it to zero when idle, and scales it up on demand. With the default request-based billing, CPU is only allocated while a request is being processed, which makes it extremely cost-effective for APIs with variable traffic. The constraints are real: requests must complete within the request timeout (5 minutes by default, configurable up to 60 minutes), long-lived connections like WebSockets work but are cut off at that same timeout, so clients need reconnect logic, and under request-based billing you can't run background daemons that need persistent CPU.
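Cloud Run's request timeout can be raised up to 60 minutes. As a minimal sketch of how that is adjusted, the helper below wraps `gcloud run services update` with its `--timeout` flag. The function name `set_request_timeout` and the service and region values in the usage line are illustrative assumptions, not part of the setup described here.

```shell
#!/bin/sh
# Hypothetical helper: raise the per-request timeout of a Cloud Run service.
# Usage: set_request_timeout SERVICE REGION SECONDS
set_request_timeout() {
  service="$1"; region="$2"; seconds="$3"
  # Cloud Run caps request timeouts at 60 minutes (3600 seconds),
  # so reject larger values before calling gcloud.
  if [ "$seconds" -gt 3600 ]; then
    echo "timeout must be <= 3600 seconds" >&2
    return 1
  fi
  gcloud run services update "$service" --region "$region" --timeout "$seconds"
}
```

Calling `set_request_timeout my-api asia-southeast2 900` would give each request up to 15 minutes. Anything that regularly needs more than that is probably a better fit for a queue worker than for a request handler.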
Raw Compute Engine VMs are appropriate when you need OS-level control: custom kernel modules, raw socket access, GPU workloads, or persistent background processes that don't fit the request-response model. I also use GCE when I need to run Docker Swarm across multiple VMs for internal tooling—Swarm isn't supported on Cloud Run. The operational overhead is significant: you're responsible for OS updates, disk management, firewall rules, SSH key rotation, and monitoring agent setup.
GCP Service Decision Tree
─────────────────────────
Need to run code?
        │
        ▼
┌─────────────────────────────────────────┐
│ Do you need OS-level control?           │
│ (custom kernel, raw sockets, GPU, etc.) │
└───────┬──────────────────┬──────────────┘
        │ YES              │ NO
        ▼                  ▼
  Compute Engine    Stateless HTTP?
  (raw VM, GCE)            │
                     ┌─────┴──────┐
                     │ YES        │ NO
                     ▼            ▼
                 Cloud Run  Need containers?
                 (fully            │
                 managed)       ┌──┴───┐
                                │ YES  │ NO
                                ▼      ▼
                               GKE   App Engine
                              (k8s)  (PaaS, flex/std)

Need managed data?
├── Relational → Cloud SQL (Postgres/MySQL)
├── Serverless → Firestore / Spanner
├── Cache      → Memorystore (Redis)
└── Object     → Cloud Storage (GCS)

Use 'gcloud run services describe SERVICE --format=json | jq .status.traffic' to inspect the current traffic split across revisions. This is useful when you're doing a gradual rollout and want to confirm traffic is shifting as expected before promoting a new revision to 100%.
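To complement the inspection command, here is a rough sketch of the commands that move traffic during a gradual rollout. It relies on `gcloud run services update-traffic` with the `--to-revisions` and `--to-latest` flags. The function names and the revision name in the usage note are hypothetical placeholders.

```shell
#!/bin/sh
# Hypothetical helpers for a gradual rollout; names are illustrative.

# Send PERCENT of traffic to REVISION.
# Usage: shift_traffic SERVICE REGION REVISION PERCENT
shift_traffic() {
  gcloud run services update-traffic "$1" --region "$2" \
    --to-revisions "$3=$4"
}

# Route all traffic to the latest ready revision.
# Usage: promote_latest SERVICE REGION
promote_latest() {
  gcloud run services update-traffic "$1" --region "$2" --to-latest
}
```

A rollout might run `shift_traffic my-api asia-southeast2 my-api-00042-abc 10`, check the split with the describe command, and finish with `promote_latest my-api asia-southeast2` once error rates look normal.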
On the data layer, the calculus is simpler: unless you have a very specific reason to self-host your database, use Cloud SQL for Postgres or MySQL. Cloud SQL handles backups, point-in-time recovery, automatic minor version updates, and high-availability failover. The cost premium over running Postgres yourself on a VM is modest, and the operational savings are enormous—especially for a small team. For caching, Memorystore (managed Redis) eliminates the operational burden of managing Redis replication and failover.
When Cloud Run connects to Cloud SQL, the recommended approach is the Cloud SQL Auth Proxy, which handles IAM-based authorization and encrypts the connection, so the instance needs no authorized networks. On Cloud Run, you attach the Cloud SQL instance with --add-cloudsql-instances, and the platform runs the proxy for you and exposes it as a Unix socket under /cloudsql/. The connection string uses that socket path instead of a hostname. This means you never need to whitelist Cloud Run's IP addresses: the service account's Cloud SQL Client role controls who can open a connection, and the usual database username and password still apply on top of that.
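To make the socket-path idea concrete, here is a small sketch that builds a libpq-style connection string pointing at the socket directory. The helper name `socket_dsn`, the database name, and the user are assumptions for illustration. Only the /cloudsql/ prefix and the instance connection name format come from the setup described above.

```shell
#!/bin/sh
# Hypothetical helper: build a libpq keyword/value connection string that
# targets the Unix socket directory Cloud Run mounts for an attached instance.
# Usage: socket_dsn INSTANCE_CONNECTION_NAME DB_NAME DB_USER
socket_dsn() {
  instance="$1"; db="$2"; user="$3"
  # libpq treats a host value starting with "/" as a socket directory,
  # so no port or IP address appears anywhere in the string.
  printf 'host=/cloudsql/%s dbname=%s user=%s\n' "$instance" "$db" "$user"
}
```

For example, `socket_dsn my-project:asia-southeast2:my-postgres erp app_user` produces a string that libpq-based clients accept directly. Most other Postgres drivers take the same directory as their host setting.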
# Deploy NestJS API to Cloud Run (managed, scale-to-zero)
gcloud run deploy my-api --image gcr.io/my-project/my-api:latest \
  --platform managed --region asia-southeast2 --allow-unauthenticated \
  --set-env-vars "DATABASE_URL=$DATABASE_URL" \
  --min-instances 0 --max-instances 10 --memory 512Mi --cpu 1

# Connect to Cloud SQL through the built-in Cloud SQL Auth Proxy socket.
# --update-env-vars adds DB_HOST without wiping the variables set at deploy time.
gcloud run services update my-api --region asia-southeast2 \
  --add-cloudsql-instances my-project:asia-southeast2:my-postgres \
  --update-env-vars "DB_HOST=/cloudsql/my-project:asia-southeast2:my-postgres"

# Compare: raw Compute Engine VM for same workload
gcloud compute instances create my-api-vm --machine-type e2-medium \
  --image-family debian-12 --image-project debian-cloud \
  --zone asia-southeast2-a --tags http-server,https-server
# Then manually: install Node, set up systemd, configure nginx reverse proxy,
# handle SSL, set up monitoring, auto-start on reboot...

GKE (Google Kubernetes Engine) is the right choice when you need Kubernetes-specific features: custom operators, StatefulSets for clustered databases, complex inter-service networking with service mesh, or multi-cloud portability. It is not the right choice because 'we want to use Kubernetes' or because your DevOps job posting mentions it. GKE adds substantial operational complexity: cluster upgrades, node pool management, RBAC configuration, and persistent volume provisioning.
Cloud Run's scale-to-zero is a cost win for dev and staging, but for production APIs with steady traffic, a minimum of 1 instance prevents cold starts—which can add 2-4 seconds of latency on the first request after the service scales to zero. Set --min-instances 1 for production endpoints where latency matters, and factor that always-on cost into your estimates.
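One way to keep that rule consistent across environments is to derive the `--min-instances` value from the environment name. This is a sketch under assumptions: the function names and the environment labels (`production` versus anything else) are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: choose a --min-instances value per environment.
min_instances_for() {
  case "$1" in
    production) echo 1 ;;  # keep one warm instance to avoid cold starts
    *)          echo 0 ;;  # dev/staging can scale to zero
  esac
}

# Usage: apply_min_instances SERVICE REGION ENVIRONMENT
apply_min_instances() {
  gcloud run services update "$1" --region "$2" \
    --min-instances "$(min_instances_for "$3")"
}
```

Running `apply_min_instances my-api asia-southeast2 production` keeps one instance warm, while any other environment name falls back to scale-to-zero.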
The ERP system I help maintain at Commsult Indonesia runs on: Cloud Run for the NestJS API, Cloud SQL for Postgres 16 (with a read replica for reporting queries), Memorystore for Redis (session cache and BullMQ queue backend), Cloud Storage for uploaded documents and generated PDFs, and a Compute Engine VM for internal Prometheus/Grafana monitoring. This stack covers the full application with zero dedicated DevOps headcount for infrastructure.
The decision tree is simple: start with Cloud Run for any HTTP workload. If you need OS control or GPU, use Compute Engine. If you need Kubernetes-specific features and have the team to support it, use GKE. For data: use Cloud SQL unless you need unsupported extensions, use Memorystore for Redis, use Cloud Storage for objects. The goal is to spend engineering time on your product, not on infrastructure that Google can manage better and more reliably than a small team can.
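The compute half of that summary can be written down as a tiny function. This is one possible reading of the rules above rather than a definitive encoding. The function name and the yes/no argument convention are assumptions made for the sketch.

```shell
#!/bin/sh
# Hypothetical helper encoding the compute rules from the summary.
# Each argument is "yes" or "no".
# Usage: pick_compute NEEDS_OS_OR_GPU NEEDS_K8S_FEATURES HAS_K8S_TEAM
pick_compute() {
  needs_os_or_gpu="$1"; needs_k8s_features="$2"; has_k8s_team="$3"
  if [ "$needs_os_or_gpu" = yes ]; then
    echo "Compute Engine"
  elif [ "$needs_k8s_features" = yes ] && [ "$has_k8s_team" = yes ]; then
    echo "GKE"
  else
    # Default starting point for any HTTP workload.
    echo "Cloud Run"
  fi
}
```

Note that wanting Kubernetes features without a team to support them still lands on Cloud Run, which mirrors the warning about adopting GKE for its own sake.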