GCP Deployment Deep Dive

Complete guide to deploying Bedsheet AI agents on Google Cloud Platform - architecture, authentication, templates, troubleshooting, and battle-tested lessons from production.

v0.4.x · Production Ready · Gemini 3 Flash

🏗️ Architecture Overview

High-Level System Architecture

Bedsheet's GCP deployment target creates a complete serverless infrastructure using Cloud Run, Cloud Build, and Vertex AI. The generated artifacts include Terraform for infrastructure and a Makefile for developer experience.

graph TB
    subgraph "Developer Machine"
        DEV[Developer]
        CLI[bedsheet CLI]
        TEMPLATES[Jinja2 Templates]
    end
    subgraph "Google Cloud Platform"
        subgraph "Build Pipeline"
            CB[Cloud Build]
            AR[Artifact Registry]
        end
        subgraph "Runtime"
            CR[Cloud Run Service]
            ADK[ADK Web Server]
            AGENT[Bedsheet Agent]
        end
        subgraph "AI Services"
            GEMINI[Gemini 3 Flash]
            GLOBAL[Global Endpoint]
        end
    end
    DEV -->|bedsheet generate| CLI
    CLI -->|renders| TEMPLATES
    TEMPLATES -->|creates| DEPLOY[deploy/gcp/]
    DEPLOY -->|make deploy| CB
    CB --> AR
    AR --> CR
    CR --> ADK
    ADK --> AGENT
    AGENT --> GEMINI

Component Responsibilities

| Component | Responsibility |
|---|---|
| bedsheet CLI | Generates deployment artifacts from templates |
| Jinja2 Templates | Define infrastructure-as-code and runtime configuration |
| Cloud Build | CI/CD pipeline for building and deploying containers |
| Artifact Registry | Docker image storage |
| Cloud Run | Serverless container hosting with auto-scaling |
| ADK Web Server | Google's Agent Development Kit runtime with Dev UI |
| Vertex AI | Gemini model API (global endpoint for Gemini 3) |

Multi-Agent Architecture

Bedsheet supports multi-agent systems with a Supervisor pattern. The Investment Advisor example demonstrates this with three specialized collaborators:

graph LR
    subgraph "Cloud Run Container"
        ROOT[InvestmentAdvisor<br/>Supervisor]
        MA[MarketAnalyst]
        NR[NewsResearcher]
        RA[RiskAnalyst]
    end
    USER[User] -->|HTTP| ROOT
    ROOT -->|delegate| MA
    ROOT -->|delegate| NR
    ROOT -->|delegate| RA
    MA --> TOOLS1[get_stock_data<br/>get_technical_analysis]
    NR --> TOOLS2[search_news<br/>analyze_sentiment]
    RA --> TOOLS3[analyze_volatility<br/>get_position_recommendation]

🔐 Authentication Deep Dive

⚠️ Critical Knowledge: Understanding GCP authentication is essential for debugging. Most "permission denied" errors stem from credential misconfiguration, not actual IAM issues.

The Authentication Stack

The Google Cloud SDK and Python libraries check credentials in a specific priority order. Understanding this order is crucial for debugging.

graph TB
    subgraph "Priority Order"
        P1["1️⃣ GOOGLE_APPLICATION_CREDENTIALS<br/>(Environment Variable)"]
        P2["2️⃣ Application Default Credentials<br/>(ADC from gcloud auth)"]
        P3["3️⃣ Metadata Server<br/>(GCE/Cloud Run SA)"]
    end
    P1 --> CHECK1{Set?}
    CHECK1 -->|Yes| USE1[Use Service Account from file]
    CHECK1 -->|No| P2
    P2 --> CHECK2{Exists?}
    CHECK2 -->|Yes| USE2[Use ADC credentials]
    CHECK2 -->|No| P3
    P3 --> CHECK3{On GCP?}
    CHECK3 -->|Yes| USE3[Use instance SA]
    CHECK3 -->|No| FAIL[No credentials!]
    style P1 fill:#ff6b6b,color:white
    style USE1 fill:#ff6b6b,color:white
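
The priority order above can be sketched as a small Python function. This is only an illustration of the lookup order described in this guide, not the actual logic inside the Google libraries; the function name and the `on_gcp` flag are hypothetical, and the file path is the Linux/macOS location used in the checklist later in this guide.

```python
import os
from pathlib import Path

# ADC file written by `gcloud auth application-default login` on Linux/macOS
# (same path as in the authentication checklist in this guide).
ADC_FILE = Path.home() / ".config" / "gcloud" / "application_default_credentials.json"


def resolve_credential_source(env, adc_file_exists, on_gcp):
    """Return the credential source that wins under the priority order above.

    Hypothetical helper: it mirrors the documented order only and does not
    load or validate any credentials.
    """
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path:
        return f"service account key file: {key_path}"
    if adc_file_exists:
        return "ADC user credentials (gcloud auth application-default login)"
    if on_gcp:
        return "metadata server (service account attached to the instance)"
    return "no credentials found"


if __name__ == "__main__":
    # on_gcp is hard-coded to False: probing the metadata server is out of scope here.
    print(resolve_credential_source(os.environ, ADC_FILE.exists(), on_gcp=False))
```

Running it on a developer machine shows at a glance whether a key file or the gcloud ADC file would be picked up first.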

The January 2026 Bug

This diagram explains exactly what happened during our debugging session:

sequenceDiagram
    participant Dev as Developer
    participant SDK as google-genai SDK
    participant ENV as Environment
    participant Gemini as Vertex AI
    Note over Dev,Gemini: ❌ The Bug
    Dev->>SDK: genai.Client(project='my-gcp-project')
    SDK->>ENV: Check GOOGLE_APPLICATION_CREDENTIALS
    ENV-->>SDK: /path/to/other-project-service-account.json
    SDK->>SDK: Load other-project SA credentials
    SDK->>Gemini: Request to bedsheet project
    Gemini-->>SDK: 403 PERMISSION_DENIED
    Note over Dev,Gemini: ✅ The Fix
    Dev->>ENV: unset GOOGLE_APPLICATION_CREDENTIALS
    Dev->>SDK: genai.Client(project='my-gcp-project')
    SDK->>ENV: Check GOOGLE_APPLICATION_CREDENTIALS
    ENV-->>SDK: Not set
    SDK->>SDK: Fall back to ADC
    SDK->>Gemini: Request to bedsheet project
    Gemini-->>SDK: 200 OK
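
A mismatch like the one in this sequence can be caught before any request is made. The sketch below compares the `project_id` field of the service account key file with the project you intend to call. The helper names are hypothetical. A mismatch is only a warning sign: a service account from another project can legitimately be granted roles on the target project.

```python
import json
import os


def key_file_project(path):
    """Read the project_id field from a service account key file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle).get("project_id")


def find_project_mismatch(env, target_project):
    """Return a warning string if the key file belongs to another project, else None."""
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        return None  # nothing overrides ADC
    try:
        key_project = key_file_project(key_path)
    except (OSError, ValueError) as error:
        return f"could not read {key_path}: {error}"
    if key_project != target_project:
        return (
            f"GOOGLE_APPLICATION_CREDENTIALS points to a key for project "
            f"{key_project!r}, but the target project is {target_project!r}"
        )
    return None


if __name__ == "__main__":
    # "my-gcp-project" is the placeholder project name used throughout this guide.
    print(find_project_mismatch(os.environ, "my-gcp-project") or "no mismatch detected")
```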

Authentication Checklist

Before deploying or debugging, always verify:

# 1. Check if GOOGLE_APPLICATION_CREDENTIALS is set
echo $GOOGLE_APPLICATION_CREDENTIALS
# If set to wrong project's SA, unset it:
unset GOOGLE_APPLICATION_CREDENTIALS

# 2. Check current gcloud auth
gcloud auth list

# 3. Check ADC configuration
cat ~/.config/gcloud/application_default_credentials.json | jq '.quota_project_id'

# 4. Test direct API access (bypasses Python SDK)
curl -s -X POST \
  "https://aiplatform.googleapis.com/v1/projects/YOUR_PROJECT/locations/global/publishers/google/models/gemini-3-flash-preview:generateContent" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  -d '{"contents":[{"role":"user","parts":[{"text":"Hi"}]}]}'

💡 Pro Tip: If curl works but the Python SDK fails, you likely have a GOOGLE_APPLICATION_CREDENTIALS environment variable pointing to the wrong service account.
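
Step 4 of the checklist can also be reproduced from Python using only the standard library, which keeps the google-genai credential lookup out of the picture. The sketch below builds the same request as the curl command; the function names are hypothetical, and the token comes from `gcloud auth print-access-token` exactly as in the shell version.

```python
import json
import subprocess
import urllib.request


def build_generate_content_request(project, model, token, prompt="Hi"):
    """Build the same POST request as the curl command in step 4."""
    url = (
        "https://aiplatform.googleapis.com/v1/"
        f"projects/{project}/locations/global/publishers/google/models/{model}:generateContent"
    )
    body = json.dumps(
        {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}
    ).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def gcloud_access_token():
    """Fetch a token for the active gcloud account (requires the gcloud CLI)."""
    result = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


# Usage (performs a real API call, so it is left commented out):
# request = build_generate_content_request(
#     "YOUR_PROJECT", "gemini-3-flash-preview", gcloud_access_token())
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

If this succeeds while the SDK call fails, the difference lies in which credentials the SDK picked up, not in the API or IAM setup.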

🤖 The ADK Integration

What is ADK?

Agent Development Kit (ADK) is Google's framework for building AI agents. Bedsheet generates ADK-compatible agents for GCP deployment, leveraging ADK's runtime and development tools.

ADK Server Modes

| Mode | Command | Features | Use Case |
|---|---|---|---|
| api_server | adk api_server | /run, /run_sse, /apps/* | API-only production |
| web ✅ | adk web | All API + /dev-ui/ | Development & debugging |

📌 Bedsheet uses web mode. We chose web mode for all deployments because:
  • Includes all API endpoints from api_server
  • Adds interactive Dev UI at /dev-ui/
  • Enables trace visualization for debugging
  • No performance penalty

ADK Directory Structure

deploy/gcp/
├── agent/                    # ADK agent directory
│   ├── __init__.py          # Exports root_agent
│   ├── agent.py             # Agent definition
│   └── tools.py             # Tool implementations
├── Dockerfile               # Container definition
├── pyproject.toml           # Dependencies
├── Makefile                 # Deployment commands
├── cloudbuild.yaml          # CI/CD pipeline
└── terraform/               # Infrastructure as code
    โ”œโ”€โ”€ main.tf
    โ”œโ”€โ”€ variables.tf
    โ””โ”€โ”€ outputs.tf

The root_agent Pattern

ADK discovers agents by looking for root_agent in the agent module:

# agent/__init__.py
from .agent import root_agent

# agent/agent.py
from google.adk.agents import LlmAgent

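# market_analyst, news_researcher, and risk_analyst are ADK agents
# defined earlier in this module (omitted here).
root_agent = LlmAgent(
    name="InvestmentAdvisor",
    model="gemini-3-flash-preview",
    instruction="You are an investment advisor...",
    sub_agents=[market_analyst, news_researcher, risk_analyst],
)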
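
Before building a container, it can help to confirm that the generated package really exposes `root_agent`. The sketch below is a hypothetical pre-flight check written for this guide; it is not how ADK itself loads agents. It imports the package from a given directory and reports whether the attribute exists. Importing a real agent package requires its dependencies (such as google-adk) to be installed.

```python
import importlib
import sys


def has_root_agent(package_name, search_dir):
    """Import package_name from search_dir and report whether it exposes root_agent.

    Hypothetical helper: it only checks that the attribute exists and does not
    validate the agent object itself.
    """
    search_dir = str(search_dir)
    sys.path.insert(0, search_dir)
    try:
        importlib.invalidate_caches()  # pick up packages created moments ago
        module = importlib.import_module(package_name)
        return hasattr(module, "root_agent")
    finally:
        sys.path.remove(search_dir)


# Usage, run from the project root (assumes the layout shown above):
# has_root_agent("agent", "deploy/gcp")
```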

📄 Template System

How Templates Work

graph LR
    YAML[bedsheet.yaml] --> PARSE[Parse Config]
    AGENT[agents.py] --> INTRO[Introspect]
    PARSE --> RENDER[Render Templates]
    INTRO --> RENDER
    TEMPLATES[templates/gcp/*.j2] --> RENDER
    RENDER --> OUTPUT[deploy/gcp/]

Key Template: Dockerfile.j2

# {{ config.name }} - Cloud Run Container
# Generated by: bedsheet generate --target gcp

FROM python:3.11-slim

# Install uv (fast Python package manager)
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv

WORKDIR /app

# Install dependencies
COPY pyproject.toml .
RUN uv pip install --system -r pyproject.toml

# Copy agent code
COPY agent/ ./agent/

# Cloud Run expects PORT env var
ENV PORT=8080

# ADK serves the agent with Dev UI
CMD ["python", "-m", "google.adk.cli", "web", "--host", "0.0.0.0", "--port", "8080", "."]

Template Variables

| Variable | Source | Example |
|---|---|---|
| config.name | bedsheet.yaml | investment-advisor |
| gcp.project | bedsheet.yaml targets.gcp | my-gcp-project |
| gcp.region | bedsheet.yaml targets.gcp | europe-west1 |
| gcp.model | bedsheet.yaml targets.gcp | gemini-3-flash-preview |
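
To make the mapping from variables to rendered output concrete, the sketch below substitutes the example values into the first line of Dockerfile.j2. Bedsheet renders with Jinja2; this example deliberately swaps in a tiny regex-based stand-in so that it runs with only the standard library. It handles plain `{{ dotted.name }}` placeholders and nothing else, and the context dictionary is an assumed shape built from the example values in this guide.

```python
import re

# Matches {{ name }} and {{ dotted.name }} placeholders only (no filters, no logic).
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def lookup(context, dotted_name):
    """Resolve 'gcp.region' against nested dictionaries."""
    value = context
    for part in dotted_name.split("."):
        value = value[part]
    return value


def render(template, context):
    """Replace every placeholder with its value from context."""
    return PLACEHOLDER.sub(lambda match: str(lookup(context, match.group(1))), template)


# Assumed context shape, using the example values from this guide.
context = {
    "config": {"name": "investment-advisor"},
    "gcp": {
        "project": "my-gcp-project",
        "region": "europe-west1",
        "model": "gemini-3-flash-preview",
    },
}

print(render("# {{ config.name }} - Cloud Run Container", context))
# prints: # investment-advisor - Cloud Run Container
```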

🚀 Deployment Flow

Complete Sequence

sequenceDiagram
    participant Dev as Developer
    participant CLI as bedsheet CLI
    participant Make as Makefile
    participant CB as Cloud Build
    participant CR as Cloud Run
    Note over Dev,CR: Phase 1: Generate
    Dev->>CLI: bedsheet generate --target gcp
    CLI-->>Dev: deploy/gcp/ created
    Note over Dev,CR: Phase 2: Setup (one-time)
    Dev->>Make: make setup
    Make-->>Dev: Auth configured
    Note over Dev,CR: Phase 3: Deploy
    Dev->>Make: make deploy
    Make->>CB: gcloud builds submit
    CB->>CB: Build container
    CB->>CR: Deploy to Cloud Run
    CR-->>Dev: Service URL

Make Targets

# Quick reference
make help      # Show all commands
make setup     # One-time setup (auth, project, APIs)
make deploy    # Deploy to Cloud Run via Cloud Build
make dev       # Run locally with ADK Dev UI
make logs      # Stream Cloud Run logs
make url       # Get service URL
make destroy   # Remove all resources

🔧 Troubleshooting Guide

Decision Tree

graph TD
    A[403 PERMISSION_DENIED] --> B{GOOGLE_APPLICATION_CREDENTIALS set?}
    B -->|Yes| C{Points to correct project?}
    B -->|No| D{ADC configured?}
    C -->|No| E["unset GOOGLE_APPLICATION_CREDENTIALS"]
    C -->|Yes| F{API enabled?}
    D -->|No| G["gcloud auth application-default login"]
    D -->|Yes| H{Quota project correct?}
    F -->|No| I["gcloud services enable aiplatform.googleapis.com"]
    F -->|Yes| J{User has Vertex AI role?}
    H -->|No| K["gcloud auth application-default set-quota-project PROJECT"]
    H -->|Yes| J
    J -->|No| L["Grant roles/aiplatform.user"]
    J -->|Yes| M[Check model region]
    E --> N[✅ Fixed!]
    G --> N
    I --> N
    K --> N
    L --> N
    style N fill:#51cf66,color:white
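
The decision tree can also be read as a short function. The sketch below follows the branches of the diagram one for one and returns the suggested next action; the function and parameter names are hypothetical, and the answers to each question still have to be gathered by hand (for example with the checklist commands earlier in this guide).

```python
def next_step(*, env_var_set, env_var_correct_project=False, adc_configured=False,
              quota_project_correct=False, api_enabled=False, has_vertex_role=False):
    """Return the next action for a 403 PERMISSION_DENIED, following the decision tree."""
    if env_var_set:
        # Branch B -> C -> F in the diagram
        if not env_var_correct_project:
            return "unset GOOGLE_APPLICATION_CREDENTIALS"
        if not api_enabled:
            return "gcloud services enable aiplatform.googleapis.com"
    else:
        # Branch B -> D -> H in the diagram
        if not adc_configured:
            return "gcloud auth application-default login"
        if not quota_project_correct:
            return "gcloud auth application-default set-quota-project PROJECT"
    # Both branches converge on the role check (node J)
    if not has_vertex_role:
        return "Grant roles/aiplatform.user"
    return "Check model region"


# The situation from the January 2026 bug: the variable was set to another project's key.
print(next_step(env_var_set=True, env_var_correct_project=False))
# prints: unset GOOGLE_APPLICATION_CREDENTIALS
```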

Common Issues

1. 403 PERMISSION_DENIED on Vertex AI

# Quick fix
unset GOOGLE_APPLICATION_CREDENTIALS
gcloud auth application-default login --scopes="https://www.googleapis.com/auth/cloud-platform"
gcloud auth application-default set-quota-project YOUR_PROJECT

2. Gemini 3 Model Not Found

# Gemini 3 requires global endpoint
export GOOGLE_CLOUD_LOCATION=global

# Check SDK version (need >= 1.51.0)
pip show google-genai | grep Version
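
The version check can be done from Python as well, which avoids comparing version strings by eye. A minimal sketch, assuming the 1.51.0 minimum stated above; the helper names are hypothetical, and pre-release suffixes such as `rc1` are simply ignored.

```python
import re
from importlib import metadata

MINIMUM = (1, 51, 0)  # minimum google-genai version stated in this guide


def parse_version(text):
    """Turn '1.51.0' into (1, 51, 0), padding short versions with zeros."""
    numbers = [int(n) for n in re.findall(r"\d+", text)[:3]]
    return tuple(numbers + [0] * (3 - len(numbers)))


def sdk_is_new_enough(version_text, minimum=MINIMUM):
    """Compare numerically, so that '1.9.0' is correctly older than '1.51.0'."""
    return parse_version(version_text) >= minimum


def installed_sdk_version():
    """Return the installed google-genai version, or None if it is not installed."""
    try:
        return metadata.version("google-genai")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_sdk_version()
    if version is None:
        print("google-genai is not installed")
    else:
        print(version, "OK" if sdk_is_new_enough(version) else "too old")
```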

3. ADK Dev UI Returns 404

# Wrong - api_server has no UI
CMD ["python", "-m", "google.adk.cli", "api_server", ...]

# Correct - web mode includes Dev UI
CMD ["python", "-m", "google.adk.cli", "web", ...]

🔍 The Great Debugging of January 2026

This is the story of how we spent hours debugging a 403 error that turned out to be a single environment variable. It's now documented here so no one else has to suffer.

The Mystery

| Test | Result |
|---|---|
| curl with gcloud auth print-access-token | ✅ Worked |
| Python SDK with project other-project-id | ✅ Worked |
| Python SDK with project my-gcp-project | ❌ 403 Error |
| Both projects had same APIs enabled | ✅ Verified |
| User has Owner role on both | ✅ Verified |

The Investigation Timeline

❌ Initial Symptom
    Investment Advisor returns 403 PERMISSION_DENIED when deployed to my-gcp-project

🔍 Hypothesis 1: API Not Enabled
    Checked - both projects have aiplatform.googleapis.com enabled ✓

🔍 Hypothesis 2: IAM Issue
    User has Owner role, includes all Vertex AI permissions ✓

🔍 Hypothesis 3: Model Access
    Tried curl directly - it worked! So the API is accessible ✓

💡 Key Insight
    curl works, SDK fails → Different authentication paths!

🔍 Checked Environment
    echo $GOOGLE_APPLICATION_CREDENTIALS
    Output: /path/to/other-project-service-account.json

✅ Root Cause Found!
    GOOGLE_APPLICATION_CREDENTIALS pointed to another project's service account!
    SDK used other-project SA → No access to bedsheet project → 403

✅ Fix Applied
    unset GOOGLE_APPLICATION_CREDENTIALS
    Now SDK uses ADC → Correct project → Success!

🎯 The Lesson: When debugging auth issues, check environment variables first! GOOGLE_APPLICATION_CREDENTIALS takes priority over every other source in the default credential lookup.

📚 Lessons Learned

  1. Check environment variables FIRST when debugging auth issues. GOOGLE_APPLICATION_CREDENTIALS overrides ADC.
  2. Credential priority matters: ENV > ADC > Metadata Server. Know the order.
  3. curl working but SDK failing = different auth sources. This is the key diagnostic.
  4. Trust your instincts: "It works on another project" means the API works - the issue is auth.
  5. Multiple GCP projects require careful credential management. A service account belongs to one project and has no access to another project unless it is granted roles there.

⚡ Quick Reference

Deploy a New Agent to GCP

# 1. Create agent
bedsheet init my-agent && cd my-agent

# 2. Configure bedsheet.yaml
# Set your GCP project, region, model

# 3. Generate deployment artifacts
bedsheet generate --target gcp

# 4. Deploy
cd deploy/gcp
make setup   # One-time
make deploy  # Deploy to Cloud Run

# 5. Access Dev UI (open is macOS; use xdg-open on Linux)
open "$(make -s url)/dev-ui/"

Environment Variables

| Variable | Purpose | Required |
|---|---|---|
| GOOGLE_CLOUD_PROJECT | Target GCP project | Yes |
| GOOGLE_CLOUD_LOCATION | Model location (global for Gemini 3) | Yes |
| GOOGLE_GENAI_USE_VERTEXAI | Use Vertex AI (not AI Studio) | Yes (True) |
| GOOGLE_APPLICATION_CREDENTIALS | Service account key path | No (use ADC) |
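
The table translates directly into a start-up check. The sketch below returns a list of problems for a given environment mapping; the function name is hypothetical, and treating only `true` and `1` (case-insensitive) as enabling Vertex AI is an assumption about how the SDK reads the flag.

```python
import os

REQUIRED = ("GOOGLE_CLOUD_PROJECT", "GOOGLE_CLOUD_LOCATION", "GOOGLE_GENAI_USE_VERTEXAI")


def check_environment(env):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]

    # Assumption: only "true" or "1" (any case) switches the SDK to Vertex AI.
    use_vertex = env.get("GOOGLE_GENAI_USE_VERTEXAI", "")
    if use_vertex and use_vertex.lower() not in ("true", "1"):
        problems.append(f"GOOGLE_GENAI_USE_VERTEXAI is {use_vertex!r}, expected True")

    # Not required, and a common source of 403 errors when it points at another project.
    if env.get("GOOGLE_APPLICATION_CREDENTIALS"):
        problems.append("GOOGLE_APPLICATION_CREDENTIALS is set and will override ADC")
    return problems


if __name__ == "__main__":
    for problem in check_environment(os.environ):
        print("WARNING:", problem)
```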

🔮 Future Considerations

Potential Improvements

1. Credential Validation in Makefile

_validate_credentials:
    @if [ -n "$$GOOGLE_APPLICATION_CREDENTIALS" ]; then \
        echo "⚠️  WARNING: GOOGLE_APPLICATION_CREDENTIALS is set"; \
        echo "   Value: $$GOOGLE_APPLICATION_CREDENTIALS"; \
        echo "   This may cause auth issues with different projects."; \
    fi

2. Project-Specific Service Account

resource "google_service_account" "agent_sa" {
  account_id   = "${var.service_name}-sa"
  display_name = "Service Account for ${var.service_name}"
  project      = var.project_id
}

resource "google_project_iam_member" "vertex_ai_user" {
  project = var.project_id
  role    = "roles/aiplatform.user"
  member  = "serviceAccount:${google_service_account.agent_sa.email}"
}

Roadmap

| Version | Feature | Status |
|---|---|---|
| v0.4.2 | GCP E2E Testing | ✅ Done |
| v0.4.2 | ADK Dev UI in Cloud Run | ✅ Done |
| v0.5.0 | Knowledge Bases / RAG | Planned |
| v0.7.0 | Agent Engine (managed) | Planned |