Security¶
This page covers Hawk's security architecture, access control, audit logging, and optional AWS security services.
Authentication¶
Hawk uses OIDC (OpenID Connect) for all authentication. JWTs are validated at every service boundary — the API server, Middleman (LLM proxy), the web viewer, and Lambda functions.
Default: Cognito¶
When no OIDC provider is configured, Hawk creates a Cognito user pool automatically. Users are managed via the AWS Console or the helper scripts.
Managing Users¶
Managing Model Access Groups¶
Cognito automatically includes group memberships in the cognito:groups claim of access tokens. Create groups matching the model groups configured in Middleman:
# Create a model access group
scripts/dev/manage-cognito-groups.sh <stack> create model-access-openai
# Add a user to the group
scripts/dev/manage-cognito-groups.sh <stack> add-user model-access-openai user@example.com
# List all groups
scripts/dev/manage-cognito-groups.sh <stack> list
Users who aren't in any group fall back to hawk:defaultPermissions (default: model-access-public), which grants access to models in the public group.
External OIDC Provider (Okta, Auth0, etc.)¶
For production deployments, we recommend using your organization's identity provider. Use the autodiscovery script to generate Pulumi config from your issuer URL:
This prints the full set of hawk:oidc* config values to add to your Pulumi.<stack>.yaml. See Pulumi.example.yaml for the complete list of OIDC settings.
OIDC App Requirements¶
Your OIDC application must:
- Use PKCE (Proof Key for Code Exchange) — no client secret
- Support the
authorization_codegrant type - Include these redirect URIs:
http://localhost:18922/callback— Hawk CLI loginhttps://viewer.<your-domain>/oauth/complete— web viewer login
Required JWT Claims¶
Hawk extracts permissions from the permissions claim (or scp as a fallback). The claim value can be either a JSON array of strings or a space-separated string:
{
"sub": "user123",
"iss": "https://login.example.com/oauth2/default",
"aud": "your-audience",
"permissions": ["model-access-openai", "model-access-anthropic"]
}
Or equivalently:
The group names must match the groups assigned to models in Middleman (see Model Groups below).
No permissions claim?
If the JWT has no permissions or scp claim, Hawk falls back to hawk:defaultPermissions (default: model-access-public). This is how Cognito users get access without custom claims.
Setting Up Your Identity Provider¶
The exact steps vary by provider, but the general approach is:
-
Create an OIDC application in your IdP (Okta, Auth0, Entra ID, etc.) with PKCE and the redirect URIs above.
-
Create groups in your IdP for each model access level you need (e.g.,
model-access-openai,model-access-anthropic). These must match the group names you assign to models in Middleman. -
Add a custom claim named
permissionsto your access tokens that includes the user's group memberships. Most IdPs support this via:- Okta: Custom claim on an authorization server using a Groups expression
- Auth0: Post-login Action that adds groups to the access token
- Entra ID (Azure AD): App roles or group claims in the token configuration
- Keycloak: Protocol mapper for group membership
-
Configure Hawk with your IdP's client ID, audience, and issuer URL in your Pulumi stack config.
Access Control¶
Model Groups¶
Model access is controlled through model groups. Each model configured in Middleman belongs to a group (e.g., model-access-openai, model-access-anthropic). Users must have a matching group in their JWT permissions claim to:
- Use the model for evaluations (via Middleman)
- View evaluation results that used that model (via the web viewer and API)
This means evaluation results are automatically restricted to users who have access to the models used in the evaluation.
Models are assigned to groups by Middleman admins when configuring the model. For example, a model with group: "model-access-openai" requires the user to have model-access-openai in their JWT permissions claim. The model-access-public group is the default and grants access to models intended for all users.
How Group Membership Flows¶
flowchart LR
IdP["Identity Provider<br/>(Okta, Auth0, Cognito)"]
JWT["JWT with<br/>permissions claim"]
API["Hawk API"]
MM["Middleman"]
TB["Token Broker<br/>(Lambda)"]
S3["S3 (eval logs)"]
DB["PostgreSQL<br/>(RLS)"]
IdP -->|"issues"| JWT
JWT -->|"validates"| API
JWT -->|"validates"| MM
JWT -->|"exchanges for<br/>scoped credentials"| TB
TB -->|"scoped access"| S3
API -->|"RLS enforces<br/>group membership"| DB
- Identity Provider issues JWTs with model group memberships in the
permissionsclaim - Middleman validates the JWT and checks the user's groups before routing model API calls
- Token Broker validates the user's model group permissions, then exchanges the JWT for scoped AWS credentials tied to a specific job via AWS session tags
- PostgreSQL RLS (Row-Level Security) restricts database queries to evaluation results the user is authorized to see
Administrative Roles¶
Hawk has one administrative role: Middleman Admin. Admins can:
- Create, update, and delete model configurations
- View all models regardless of group membership
- Manage provider API keys
To grant admin access, add a boolean claim to your OIDC access tokens:
| Claim | Purpose |
|---|---|
https://middleman.metr.org/claims/admin |
Full admin access (production + non-production) |
https://middleman.metr.org/claims/dev-admin |
Admin access for non-production environments only |
The dev-admin claim is only accepted when the Middleman environment variable MIDDLEMAN_ACCEPT_DEV_ADMIN=true is set (which Hawk configures automatically for non-production stacks).
Example JWT with admin access:
{
"sub": "admin-user",
"permissions": ["model-access-openai"],
"https://middleman.metr.org/claims/admin": true
}
In your IdP, create a group (e.g., middleman-admins) and configure a custom claim that emits true when the user is a member of that group.
Sensitive Model Protection¶
Hawk protects sensitive model information through codenames:
- Models are given a
public_name(codename) that is used everywhere in the system - The real model identifier (
danger_name) is only known to Middleman - Evaluation results, logs, and the web viewer only show the
public_name - Observability data (Datadog, Sentry) is scrubbed to prevent
danger_nameleakage — API keys, auth headers, and model identifiers are filtered at multiple layers
Users without the appropriate model group cannot access evaluation results that used that model.
Sandbox Isolation¶
Evaluations run in isolated Kubernetes pods with:
- Separate namespaces — each evaluation gets its own namespace for the runner and sandbox pods
- Network policies — Cilium network policies block egress to VPC infrastructure (primary subnet CIDR, EC2 IMDS, EKS Pod Identity) from sandbox pods, and can restrict no-internet pods to DNS-only egress
- Resource limits — CPU and memory constraints per pod
- StatefulSets — sandbox pods auto-restart on failure
Alternative Sandbox Providers¶
While Kubernetes is the default sandbox environment, Hawk's architecture does not strictly require it. EC2-based sandboxing and other providers (e.g., Modal) can be used as alternatives. The sandbox provider is configured per evaluation.
Audit Logging¶
Application-Level Logging¶
- Hawk API — all API requests are logged to CloudWatch with user identity, action, and resource context
- Middleman — model API calls are logged with user identity, model (public name only), and token usage. Request/response bodies are not logged.
- Token Broker — credential exchanges are logged with user identity and requested scope
AWS CloudTrail¶
CloudTrail is enabled by default in AWS accounts and logs all AWS API calls. CloudTrail Insights (anomaly detection for API call rates and error rates) can be enabled separately via the infra-shared repository.
VPC Flow Logs¶
VPC flow logs are enabled for all traffic and sent to CloudWatch Logs at /aws/vpc/flowlogs/<env>. Retention follows hawk:cloudwatchLogsRetentionDays (default: 14 days).
Endpoint Protection (CrowdStrike Falcon)¶
Hawk optionally deploys the CrowdStrike Falcon sensor to protect infrastructure hosts. Enable it with hawk:enableCrowdstrike: "true" and a Secrets Manager secret containing your CrowdStrike API credentials (see Configuration).
What's Protected¶
| Target | OS / Arch | Installation Method |
|---|---|---|
| GPU nodes (Karpenter) | AL2023 / x86_64 | Sensor RPM installed via EC2NodeClass userData at boot |
| Tailscale subnet router | AL2023 / ARM64 | Sensor RPM installed via cloud-init at boot |
| Default nodes (Karpenter) | Bottlerocket / x86_64 | DaemonSet via falcon-sensor Helm chart (requires Falcon Images Download scope) |
The sensor is downloaded from the CrowdStrike API at instance boot using the Sensor Download: Read API scope. The DaemonSet approach (for Bottlerocket nodes) pulls a container image from registry.crowdstrike.com, which requires the Falcon Images Download: Read scope — part of the Falcon Cloud Security with Containers add-on.
Tailscale ZTA Integration¶
With the Falcon sensor on the subnet router, you can enable CrowdStrike ZTA with Tailscale to gate tailnet access based on device posture. This requires a separate API client with Hosts: Read and Zero Trust: Read scopes, configured in the Tailscale admin console under Device Posture Integrations. Note that ZTA scores apply to the router instance itself, not to traffic routed through it.
AWS Security Services¶
AWS security services (GuardDuty, Security Hub, AWS Config, CloudTrail Insights) are managed by the infra-shared repository, not by Hawk. See infra-shared for configuration details.
Recommended Security Configuration¶
For production deployments, consider:
- Setting
hawk:eksPublicEndpoint: "false"and using Tailscale for private cluster access - Setting
hawk:albInternal: "true"to make the ALB private (requires VPN) - Setting
hawk:protectResources: "true"to prevent accidental deletion of S3 buckets and secrets
Monitoring & Observability¶
CloudWatch¶
All services log to CloudWatch by default. Log retention is configurable via hawk:cloudwatchLogsRetentionDays (default: 14 days).
Key log groups:
- Hawk API logs
- Middleman logs
- Lambda function logs
- GuardDuty findings (when enabled):
/aws/events/guardduty/<env> - Security Hub findings (when enabled):
/aws/events/securityhub/<env>
Datadog (Optional)¶
For richer monitoring, Hawk supports Datadog integration:
This enables:
- APM — distributed tracing across API, Middleman, and Lambda functions
- Log forwarding — CloudWatch logs forwarded to Datadog
- Custom metrics — token usage, import counts, evaluation durations
- Sensitive data filtering —
danger_name, API keys, and auth headers are scrubbed from all telemetry
Budget Alerts¶
Monitor AWS spending with budget alerts:
hawk:budgetLimit: "10000"
hawk:budgetNotificationEmails:
- "team@example.com"
hawk:budgetNotificationThresholds:
- 80
- 100
External Dependencies¶
Hawk interacts with the following external services:
| Service | Purpose | Required? |
|---|---|---|
| Docker Hub | Pull base container images | Yes (login recommended to avoid rate limits) |
| LLM Providers | Model API calls via Middleman | At least one provider API key required |
| OIDC Provider | User authentication | Optional (Cognito used by default) |
| Datadog | Monitoring and observability | Optional |
| Slack | Budget alerts | Optional |
| CrowdStrike Falcon | Endpoint protection for EKS nodes and subnet router | Optional |
| Cloudflare | DNS delegation | Optional |
| GitHub | CI/CD via Pulumi Deploy | Optional |
Network Security¶
- TLS everywhere — all external traffic uses TLS via ACM certificates
- Private subnets — EKS nodes, RDS, and ECS tasks run in private subnets with no direct internet access
- NAT Gateways — outbound internet access from private subnets goes through NAT gateways
- Security groups — restrict traffic between components (ALB → ECS, ECS → RDS, etc.)
- VPC endpoints — S3 traffic stays within the VPC via a Gateway endpoint
- Cilium network policies — pod-level network isolation within Kubernetes