Deployment

Hawk runs on AWS. The deployment is managed by a single Pulumi project in the infra/ directory.

Infrastructure Overview

infra/
├── __main__.py          # Entrypoint — instantiates all stacks
├── lib/                 # Shared: config, naming, tagging, IAM helpers
├── core/                # VPC, EKS, ALB, ECS, RDS, Route53, S3
├── k8s/                 # Karpenter, Cilium, Datadog agent, GPU operator, RBAC
├── hawk/                # Hawk API (ECS), Lambdas, EventBridge, CloudFront
└── datadog/             # Monitors, dashboards, log archives (optional)

Deployment Phases

Stacks deploy in order:

1. CoreStack: VPC, EKS, ALB, ECS cluster, RDS, Route53, S3
2. K8sStack: cluster-level Kubernetes resources (skipped for dev envs sharing EKS)
3. HawkStack: Hawk API, Lambda functions, EventBridge, CloudFront
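
The ordering above can be pictured with a small sketch. This is not the project's actual `__main__.py`: `deployment_phases` and the `shares_eks` flag are hypothetical names, and only the stack names and the skip rule are taken from the list.

```python
def deployment_phases(shares_eks: bool = False) -> list[str]:
    """Return stack names in deploy order (hypothetical helper).

    shares_eks: True for a dev env that reuses another stack's EKS cluster.
    """
    phases = ["CoreStack"]
    if not shares_eks:
        # Cluster-level Kubernetes resources are deployed only by the
        # stack that owns the EKS cluster.
        phases.append("K8sStack")
    phases.append("HawkStack")
    return phases


print(deployment_phases())                 # full environment
print(deployment_phases(shares_eks=True))  # dev env sharing EKS
```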

Stack Configuration

Copy Pulumi.example.yaml to Pulumi.<stack-name>.yaml and fill in your values:

config:
  aws:region: us-west-2
  hawk:domain: staging.example.com
  hawk:publicDomain: example.com
  hawk:primarySubnetCidr: "10.0.0.0/16"

If the OIDC settings are omitted, Hawk automatically creates a Cognito user pool for authentication. To use your own OIDC provider (Okta, Auth0, etc.) instead, add:

# Optional: use your own OIDC provider instead of Cognito
hawk:oidcClientId: "your-client-id"
hawk:oidcAudience: "your-audience"
hawk:oidcIssuer: "https://login.example.com/oauth2/default"

See the Configuration Reference for all available options.
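
The Cognito fallback can be read as a simple decision over the three OIDC keys. The following is a minimal sketch, not Hawk's actual config code: `resolve_auth` is a hypothetical helper, and rejecting a partially filled OIDC config is an assumption rather than documented behavior.

```python
OIDC_KEYS = ("hawk:oidcClientId", "hawk:oidcAudience", "hawk:oidcIssuer")


def resolve_auth(config: dict[str, str]) -> dict[str, str]:
    """Pick the auth provider from stack config values (hypothetical helper)."""
    present = [key for key in OIDC_KEYS if config.get(key)]
    if not present:
        # No OIDC settings at all: fall back to a managed Cognito user pool.
        return {"provider": "cognito"}
    missing = [key for key in OIDC_KEYS if key not in present]
    if missing:
        # Assumption: a partial OIDC configuration is treated as a mistake.
        raise ValueError(f"incomplete OIDC config, missing: {', '.join(missing)}")
    return {
        "provider": "oidc",
        "client_id": config["hawk:oidcClientId"],
        "audience": config["hawk:oidcAudience"],
        "issuer": config["hawk:oidcIssuer"],
    }
```

Calling `resolve_auth({"aws:region": "us-west-2"})` selects Cognito, while supplying all three `hawk:oidc*` keys selects the external provider.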

IAM Permissions

pulumi up creates resources across EKS, ECS Fargate, Aurora RDS, S3, Route53, KMS, IAM, Lambda, and CloudFront. The IAM principal running Pulumi therefore needs permission to create, modify, and delete resources in all of these services.

LLM API Keys

Hawk's LLM proxy (Middleman) needs API keys to forward requests to model providers:

scripts/dev/set-api-keys.sh <env> OPENAI_API_KEY=sk-...

Set multiple keys at once:

scripts/dev/set-api-keys.sh <env> OPENAI_API_KEY=sk-... ANTHROPIC_API_KEY=sk-ant-...

Supported keys: OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY, DEEPINFRA_TOKEN, DEEPSEEK_API_KEY, FIREWORKS_API_KEY, META_API_KEY, MISTRAL_API_KEY, OPENROUTER_API_KEY, TOGETHER_API_KEY, XAI_API_KEY.
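
If you script key management, it can help to validate the arguments before calling the shell script. This sketch does not reflect how set-api-keys.sh works internally; `parse_key_args` is a hypothetical helper, and only the key names come from the list above.

```python
SUPPORTED_KEYS = {
    "OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GEMINI_API_KEY", "DEEPINFRA_TOKEN",
    "DEEPSEEK_API_KEY", "FIREWORKS_API_KEY", "META_API_KEY", "MISTRAL_API_KEY",
    "OPENROUTER_API_KEY", "TOGETHER_API_KEY", "XAI_API_KEY",
}


def parse_key_args(args: list[str]) -> dict[str, str]:
    """Parse KEY=VALUE arguments, rejecting malformed or unsupported entries."""
    parsed = {}
    for index, arg in enumerate(args):
        # Split on the first "=" only, so values may themselves contain "=".
        name, sep, value = arg.partition("=")
        if not sep or not value:
            # Report the position, not the text, to avoid echoing a secret.
            raise ValueError(f"argument {index} is not in KEY=VALUE form")
        if name not in SUPPORTED_KEYS:
            raise ValueError(f"unsupported key: {name}")
        parsed[name] = value
    return parsed
```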

Multiple Environments

You can run multiple Hawk environments (staging, production, dev) from the same repo. Each gets its own Pulumi stack and isolated AWS resources.

pulumi stack init staging --secrets-provider="awskms://alias/pulumi-secrets?region=<same as aws:region>&awssdk=v2"
# configure Pulumi.staging.yaml
pulumi up -s staging

pulumi stack init production --secrets-provider="awskms://alias/pulumi-secrets?region=<same as aws:region>&awssdk=v2"
# configure Pulumi.production.yaml
pulumi up -s production
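
Because the secrets provider region has to match aws:region, it can be convenient to build the URL from a single value. A minimal sketch follows; `secrets_provider_url` and `stack_init_command` are hypothetical helpers, and the `pulumi-secrets` alias is the one used in the commands above.

```python
def secrets_provider_url(region: str, alias: str = "pulumi-secrets") -> str:
    """Build the awskms:// secrets-provider URL for `pulumi stack init`."""
    if not region:
        raise ValueError("region must match the stack's aws:region")
    return f"awskms://alias/{alias}?region={region}&awssdk=v2"


def stack_init_command(stack: str, region: str) -> list[str]:
    """Return the argv for creating a stack (nothing is executed here)."""
    return [
        "pulumi", "stack", "init", stack,
        f"--secrets-provider={secrets_provider_url(region)}",
    ]


print(" ".join(stack_init_command("staging", "us-west-2")))
```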

Dev Environments

Lightweight dev environments share an existing stack's VPC, ALB, and EKS cluster while getting their own database and services:

./scripts/dev/new-dev-env.sh alice    # creates a dev-alice stack

Services appear at:

API: https://api-alice.hawk.<staging-domain>
Viewer: https://viewer-alice.hawk.<staging-domain>

Database migrations run automatically on deploy. Secrets are shared from staging by ARN, so no manual seeding is needed. Dev stacks resolve the shared VPC, ALB, and EKS cluster via pulumi.StackReference("stg"); only the Aurora warehouse, the ECS cluster, and the Hawk services are created per dev env. See StackConfig.from_dev_env() for how the config is resolved.

Model data is auto-synced from staging during pulumi up. To re-sync manually:

uv run --directory hawk python -m hawk.tools.sync_models \
  --source-url "$(pulumi stack output -s stg database_url_admin)" \
  --target-url "$(pulumi stack output -s dev-<name> database_url_admin)"

Tail the API logs:

aws logs tail "$(pulumi stack output api_log_group_name -s dev-<name>)" \
  --region us-west-2 --since 30m --format short | grep -v /health

Domain Naming

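Dev envs put the environment name in the service label (api-alice rather than a separate subdomain), so every hostname sits directly under hawk.<staging-domain> and the OIDC provider can match them all with a single wildcard, *.hawk.<staging-domain>: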

Hostname	Service
`api.hawk.<domain>`	Hawk API (staging)
`api-alice.hawk.<domain>`	Hawk API (alice's dev env)
`viewer.hawk.<domain>`	Eval log viewer (staging)
`viewer-alice.hawk.<domain>`	Eval log viewer (alice's dev env)
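
The naming pattern in the table can be expressed as a one-line rule. This is an illustrative sketch only; `service_hostname` is a hypothetical helper, not a function from the Hawk codebase.

```python
from typing import Optional


def service_hostname(service: str, domain: str, dev_name: Optional[str] = None) -> str:
    """Build a hostname such as api.hawk.<domain> or api-alice.hawk.<domain>."""
    # Dev envs append "-<name>" to the service label, keeping every host
    # exactly one level below hawk.<domain> so a single wildcard matches.
    label = f"{service}-{dev_name}" if dev_name else service
    return f"{label}.hawk.{domain}"


print(service_hostname("api", "staging.example.com"))
# api.hawk.staging.example.com
print(service_hostname("viewer", "staging.example.com", dev_name="alice"))
# viewer-alice.hawk.staging.example.com
```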

Tearing Down

pulumi destroy -s dev-alice
pulumi stack rm dev-alice    # only after destroy completes

Warning

Always wait for pulumi destroy to complete before running pulumi stack rm. Forcing removal of a stack that still has resources (pulumi stack rm --force) discards its state and orphans those AWS resources in your account.

Optional Integrations

Service	Config Key	Purpose
Datadog	`hawk:enableDatadog`	APM, metrics, log forwarding, monitors
Cloudflare	`hawk:cloudflareZoneId`	DNS delegation from parent Cloudflare zone
Tailscale	`tailscaleAuthKeysSecretArn`	VPN jumphost / subnet router

When disabled, services fall back to simpler alternatives (CloudWatch logs instead of Datadog, no DNS delegation).

Why Pulumi?

Pulumi is an open-source infrastructure-as-code tool that lets us define our entire AWS infrastructure using Python.

It uses the same provider ecosystem as Terraform under the hood, but lets us use real programming constructs (loops, functions, classes) and share code between infrastructure and application.

Refer to this article for more advantages of Pulumi over CDK.