Security

This page covers Hawk's security architecture, access control, audit logging, and optional AWS security services.

Authentication

Hawk uses OIDC (OpenID Connect) for all authentication. JWTs are validated at every service boundary — the API server, Middleman (LLM proxy), the web viewer, and Lambda functions.

Default: Cognito

When no OIDC provider is configured, Hawk creates a Cognito user pool automatically. Users are managed via the AWS Console or the helper scripts.

Managing Users

scripts/dev/create-cognito-user.sh <stack> user@example.com

Managing Model Access Groups

Cognito automatically includes group memberships in the cognito:groups claim of access tokens. Create groups matching the model groups configured in Middleman:

# Create a model access group
scripts/dev/manage-cognito-groups.sh <stack> create model-access-openai

# Add a user to the group
scripts/dev/manage-cognito-groups.sh <stack> add-user model-access-openai user@example.com

# List all groups
scripts/dev/manage-cognito-groups.sh <stack> list

Users who aren't in any group fall back to hawk:defaultPermissions (default: model-access-public), which grants access to models in the public group.

External OIDC Provider (Okta, Auth0, etc.)

For production deployments, we recommend using your organization's identity provider. Use the autodiscovery script to generate Pulumi config from your issuer URL:

uv run python scripts/dev/discover-oidc.py <your-issuer-url> <your-client-id> <your-audience>

This prints the full set of hawk:oidc* config values to add to your Pulumi.<stack>.yaml. See Pulumi.example.yaml for the complete list of OIDC settings.

OIDC App Requirements

Your OIDC application must:

- Use PKCE (Proof Key for Code Exchange), with no client secret
- Support the authorization_code grant type
- Include these redirect URIs:
  - http://localhost:18922/callback (Hawk CLI login)
  - https://viewer.<your-domain>/oauth/complete (web viewer login)
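With PKCE, the client proves possession of a one-time secret instead of holding a client secret. The following is a minimal sketch of the verifier and challenge derivation defined in RFC 7636 (S256 method). The function names are illustrative and are not taken from the Hawk CLI.

```python
import base64
import hashlib
import secrets


def make_code_verifier(num_bytes: int = 32) -> str:
    """Return a random URL-safe code verifier (43 characters for 32 bytes)."""
    raw = secrets.token_bytes(num_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_challenge(verifier: str) -> str:
    """Return the S256 challenge: BASE64URL(SHA256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client sends the challenge with the authorization request and the verifier with the token request, so an intercepted authorization code cannot be redeemed without the verifier.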

Required JWT Claims

Hawk extracts permissions from the permissions claim (or scp as a fallback). The claim value can be either a JSON array of strings or a space-separated string:

{
  "sub": "user123",
  "iss": "https://login.example.com/oauth2/default",
  "aud": "your-audience",
  "permissions": ["model-access-openai", "model-access-anthropic"]
}

Or equivalently:

{
  "permissions": "model-access-openai model-access-anthropic"
}

The group names must match the groups assigned to models in Middleman (see Model Groups below).

No permissions claim?

If the JWT has no permissions or scp claim, Hawk falls back to hawk:defaultPermissions (default: model-access-public). This is how Cognito users who are not in any group get access without custom claims.
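The claim handling described above can be sketched as follows. This is an illustration under stated assumptions, not Hawk's implementation: the function name extract_permissions and its parameters are hypothetical, and the sketch covers only the permissions and scp claims named in this section.

```python
from typing import Any, Mapping, Sequence


def extract_permissions(
    claims: Mapping[str, Any],
    default_permissions: Sequence[str] = ("model-access-public",),
) -> list[str]:
    """Normalize the permissions claim to a list of group names."""
    for name in ("permissions", "scp"):  # scp is the documented fallback
        value = claims.get(name)
        if value is None:
            continue
        if isinstance(value, str):
            return value.split()  # space-separated string form
        return [str(item) for item in value]  # JSON array form
    # Neither claim is present: substitute the configured defaults.
    # (How an empty-but-present claim is treated is not specified here.)
    return list(default_permissions)
```

Both token shapes shown earlier normalize to the same list, so downstream checks only need to handle one representation.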

Setting Up Your Identity Provider

The exact steps vary by provider, but the general approach is:

- Create an OIDC application in your IdP (Okta, Auth0, Entra ID, etc.) with PKCE and the redirect URIs above.
- Create groups in your IdP for each model access level you need (e.g., model-access-openai, model-access-anthropic). These must match the group names you assign to models in Middleman.
- Add a custom claim named permissions to your access tokens that includes the user's group memberships. Most IdPs support this via:
  - Okta: Custom claim on an authorization server using a Groups expression
  - Auth0: Post-login Action that adds groups to the access token
  - Entra ID (Azure AD): App roles or group claims in the token configuration
  - Keycloak: Protocol mapper for group membership
- Configure Hawk with your IdP's client ID, audience, and issuer URL in your Pulumi stack config.

Access Control

Model Groups

Model access is controlled through model groups. Each model configured in Middleman belongs to a group (e.g., model-access-openai, model-access-anthropic). Users must have a matching group in their JWT permissions claim to:

Use the model for evaluations (via Middleman)
View evaluation results that used that model (via the web viewer and API)

This means evaluation results are automatically restricted to users who have access to the models used in the evaluation.

Models are assigned to groups by Middleman admins when configuring the model. For example, a model with group: "model-access-openai" requires the user to have model-access-openai in their JWT permissions claim. The model-access-public group is the default and grants access to models intended for all users.
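As a rough illustration of the rule above, the check reduces to a subset test. This sketch assumes that an evaluation using several models requires the group of every one of them, which is how we read "access to the models used"; the function name can_view_eval is hypothetical.

```python
from typing import Iterable


def can_view_eval(user_groups: Iterable[str], eval_model_groups: Iterable[str]) -> bool:
    """True when the user holds the group of every model the evaluation used."""
    return set(eval_model_groups) <= set(user_groups)


# A user with only the OpenAI group cannot see a mixed-provider evaluation.
user = ["model-access-openai", "model-access-public"]
print(can_view_eval(user, ["model-access-openai"]))
print(can_view_eval(user, ["model-access-openai", "model-access-anthropic"]))
```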

How Group Membership Flows

flowchart LR
    IdP["Identity Provider<br/>(Okta, Auth0, Cognito)"]
    JWT["JWT with<br/>permissions claim"]
    API["Hawk API"]
    MM["Middleman"]
    TB["Token Broker<br/>(Lambda)"]
    S3["S3 (eval logs)"]
    DB["PostgreSQL<br/>(RLS)"]

    IdP -->|"issues"| JWT
    JWT -->|"validated by"| API
    JWT -->|"validated by"| MM
    JWT -->|"exchanged for<br/>scoped credentials"| TB
    TB -->|"scoped access"| S3
    API -->|"RLS enforces<br/>group membership"| DB

Identity Provider issues JWTs with model group memberships in the permissions claim
Middleman validates the JWT and checks the user's groups before routing model API calls
Token Broker validates the user's model group permissions, then exchanges the JWT for scoped AWS credentials tied to a specific job via AWS session tags
PostgreSQL RLS (Row-Level Security) restricts database queries to evaluation results the user is authorized to see

Administrative Roles

Hawk has two administrative roles: Middleman Admin, covered here, and Hawk Admin, covered in the Hawk Admin section below. Middleman admins can:

Create, update, and delete model configurations
View all models regardless of group membership
Manage provider API keys

To grant Middleman admin access, add a boolean claim to your OIDC access tokens:

| Claim | Purpose |
| --- | --- |
| `https://middleman.metr.org/claims/admin` | Full admin access (production + non-production) |
| `https://middleman.metr.org/claims/dev-admin` | Admin access for non-production environments only |

The dev-admin claim is only accepted when the Middleman environment variable MIDDLEMAN_ACCEPT_DEV_ADMIN=true is set (which Hawk configures automatically for non-production stacks).

Example JWT with admin access:

{
  "sub": "admin-user",
  "permissions": ["model-access-openai"],
  "https://middleman.metr.org/claims/admin": true
}

In your IdP, create a group (e.g., middleman-admins) and configure a custom claim that emits true when the user is a member of that group.
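The interaction between the two claims and the MIDDLEMAN_ACCEPT_DEV_ADMIN switch can be sketched as below. The function name is hypothetical, and the strict comparison against boolean true is an assumption based on the claims being described as boolean; Middleman's actual check may differ.

```python
from typing import Any, Mapping

ADMIN_CLAIM = "https://middleman.metr.org/claims/admin"
DEV_ADMIN_CLAIM = "https://middleman.metr.org/claims/dev-admin"


def is_middleman_admin(claims: Mapping[str, Any], accept_dev_admin: bool) -> bool:
    """Return True if the token grants Middleman admin access."""
    # The full admin claim is honored in every environment.
    if claims.get(ADMIN_CLAIM) is True:
        return True
    # The dev-admin claim counts only when the environment opts in.
    return accept_dev_admin and claims.get(DEV_ADMIN_CLAIM) is True
```

In this sketch a caller would derive accept_dev_admin from the environment, for example `os.environ.get("MIDDLEMAN_ACCEPT_DEV_ADMIN") == "true"`.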

Hawk Admin

Hawk admins can stop and delete eval sets and scan runs they do not own. Normal users are restricted to resources they created. Admin status has no effect on model or eval data access — viewing evaluation results remains gated by model-group membership as usual.

There are two OR-ed sources of admin status; both are disabled by default:

| Config key | How it works |
| --- | --- |
| `hawk:hawkAdminClaim` | Name of a JWT claim. When the claim is present and its value is boolean `true`, the bearer is treated as a Hawk admin. |
| `hawk:hawkAdminPermissions` | List of permission/group names. If any entry matches a value in the token's `permissions`, `scp`, or `cognito:groups` claim, the bearer is treated as a Hawk admin. |

hawkAdminPermissions entries must be disjoint from defaultPermissions: tokens that carry no permission claims have the defaults substituted in, so any overlap would make every such caller an admin. The API refuses to start if it detects an overlap. The entries must also not be OAuth scopes that users can self-request from your IdP.
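To make the overlap rule concrete, here is a minimal sketch of the two OR-ed admin sources together with a startup-style disjointness check. All function names are hypothetical, and substituting defaults whenever no permission values are found is our reading of the paragraph above rather than confirmed API behavior.

```python
from typing import Any, Iterable, Mapping, Optional

PERMISSION_CLAIMS = ("permissions", "scp", "cognito:groups")


def check_disjoint(admin_permissions: Iterable[str], default_permissions: Iterable[str]) -> None:
    """Raise if any admin permission is also a default permission."""
    overlap = set(admin_permissions) & set(default_permissions)
    if overlap:
        raise ValueError(f"hawkAdminPermissions overlaps defaultPermissions: {sorted(overlap)}")


def effective_permissions(claims: Mapping[str, Any], default_permissions: Iterable[str]) -> set:
    """Collect values from all permission claims, or fall back to the defaults."""
    values = set()
    for name in PERMISSION_CLAIMS:
        value = claims.get(name)
        if isinstance(value, str):
            values.update(value.split())
        elif value is not None:
            values.update(str(item) for item in value)
    return values or set(default_permissions)


def is_hawk_admin(
    claims: Mapping[str, Any],
    admin_claim: Optional[str],
    admin_permissions: Iterable[str],
    default_permissions: Iterable[str],
) -> bool:
    """Admin if the named claim is boolean true OR any admin permission matches."""
    if admin_claim and claims.get(admin_claim) is True:
        return True
    return bool(effective_permissions(claims, default_permissions) & set(admin_permissions))
```

With overlapping lists, a token carrying no permission claims would pick up the defaults and match an admin entry, which is the situation check_disjoint is meant to rule out before the service starts.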

Every admin override is logged at warning level as a structured admin_override event that includes the action (stop/delete), the actor identity, the resource owner, and the job ID, so all non-owner mutations are auditable.
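A structured event of this kind might look like the sketch below. The logger name, the JSON encoding, and the field names (actor, resource_owner, job_id) are assumptions made for illustration; only the event name admin_override, the warning level, and the stop/delete actions come from the description above.

```python
import json
import logging

logger = logging.getLogger("hawk.audit")  # hypothetical logger name


def log_admin_override(action: str, actor: str, resource_owner: str, job_id: str) -> dict:
    """Emit a warning-level admin_override event and return the payload."""
    if action not in ("stop", "delete"):
        raise ValueError(f"unexpected admin action: {action}")
    event = {
        "event": "admin_override",
        "action": action,
        "actor": actor,
        "resource_owner": resource_owner,
        "job_id": job_id,
    }
    # One JSON object per line keeps the event easy to query in a log store.
    logger.warning(json.dumps(event, sort_keys=True))
    return event
```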

Note that runner pods receive the launching user's access token (for model API and artifact access). When an admin launches an eval or scan, that delegated token carries admin status for its lifetime, so workloads in the run could use it to stop or delete other users' jobs. Prefer separate day-to-day and admin identities, or grant admin via short-lived group membership.

Example: using hawk:hawkAdminClaim with an Okta custom claim:

{
  "sub": "admin-user",
  "permissions": ["model-access-openai"],
  "hawk-admin": true
}

pulumi config set hawk:hawkAdminClaim hawk-admin

For Cognito deployments, hawk:hawkAdminPermissions is simpler — create a Cognito group (e.g. hawk-admin) and add it to the config:

pulumi config set --path 'hawk:hawkAdminPermissions[0]' hawk-admin

Sensitive Model Protection

Hawk protects sensitive model information through codenames:

Models are given a public_name (codename) that is used everywhere in the system
The real model identifier (danger_name) is only known to Middleman
Evaluation results, logs, and the web viewer only show the public_name
Observability data (Datadog, Sentry) is scrubbed to prevent danger_name leakage — API keys, auth headers, and model identifiers are filtered at multiple layers

Users without the appropriate model group cannot access evaluation results that used that model.
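The scrubbing of observability data can be pictured as a recursive filter applied to each telemetry payload before it leaves the service. This is a simplified sketch, not Hawk's filter: the key list, the redaction marker, and the function name are all hypothetical.

```python
from typing import Any, Iterable

REDACTED = "[REDACTED]"
# Hypothetical set of keys whose values are always dropped.
SENSITIVE_KEYS = {"authorization", "x-api-key", "api_key"}


def scrub(value: Any, danger_names: Iterable[str]) -> Any:
    """Return a copy of value with sensitive keys and danger names redacted."""
    names = [name for name in danger_names if name]
    if isinstance(value, dict):
        return {
            key: REDACTED if str(key).lower() in SENSITIVE_KEYS else scrub(item, names)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [scrub(item, names) for item in value]
    if isinstance(value, str):
        for name in names:
            value = value.replace(name, REDACTED)
        return value
    return value  # numbers, booleans, None pass through unchanged
```

A plain substring replace like this only catches exact matches, which is one reason to filter at several layers rather than rely on a single pass.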

Sandbox Isolation

Evaluations run in isolated Kubernetes pods with:

Separate namespaces — each evaluation gets its own namespace for the runner and sandbox pods
Network policies — Cilium network policies block egress to VPC infrastructure (primary subnet CIDR, EC2 IMDS, EKS Pod Identity) from sandbox pods, and can restrict no-internet pods to DNS-only egress
Resource limits — CPU and memory constraints per pod
StatefulSets — sandbox pods auto-restart on failure

Alternative Sandbox Providers

While Kubernetes is the default sandbox environment, Hawk's architecture does not strictly require it. EC2-based sandboxing and other providers (e.g., Modal) can be used as alternatives. The sandbox provider is configured per evaluation.

Audit Logging

Application-Level Logging

Hawk API — all API requests are logged to CloudWatch with user identity, action, and resource context
Middleman — model API calls are logged with user identity, model (public name only), and token usage. Request/response bodies are not logged.
Token Broker — credential exchanges are logged with user identity and requested scope

AWS CloudTrail

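CloudTrail is enabled by default in AWS accounts and records management events (control-plane API calls) in a 90-day event history. Data events, such as S3 object-level operations, are recorded only when a trail or event data store is configured to capture them. CloudTrail Insights (anomaly detection for API call rates and error rates) can be enabled separately via the infra-shared repository.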

VPC Flow Logs

VPC flow logs are enabled for all traffic and sent to CloudWatch Logs at /aws/vpc/flowlogs/<env>. Retention follows hawk:cloudwatchLogsRetentionDays (default: 14 days).

Endpoint Protection (CrowdStrike Falcon)

Hawk optionally deploys the CrowdStrike Falcon sensor to protect infrastructure hosts. Enable it with hawk:enableCrowdstrike: "true" and a Secrets Manager secret containing your CrowdStrike API credentials (see Configuration).

What's Protected

| Target | OS / Arch | Installation Method |
| --- | --- | --- |
| GPU nodes (Karpenter) | AL2023 / x86_64 | Sensor RPM installed via EC2NodeClass userData at boot |
| Tailscale subnet router | AL2023 / ARM64 | Sensor RPM installed via cloud-init at boot |
| Default nodes (Karpenter) | Bottlerocket / x86_64 | DaemonSet via falcon-sensor Helm chart (requires Falcon Images Download scope) |

The sensor is downloaded from the CrowdStrike API at instance boot using the Sensor Download: Read API scope. The DaemonSet approach (for Bottlerocket nodes) pulls a container image from registry.crowdstrike.com, which requires the Falcon Images Download: Read scope — part of the Falcon Cloud Security with Containers add-on.

Tailscale ZTA Integration

With the Falcon sensor on the subnet router, you can enable CrowdStrike ZTA (Zero Trust Assessment) with Tailscale to gate tailnet access based on device posture. This requires a separate API client with Hosts: Read and Zero Trust: Read scopes, configured in the Tailscale admin console under Device Posture Integrations. Note that ZTA scores apply to the router instance itself, not to traffic routed through it.

AWS Security Services

AWS security services (GuardDuty, Security Hub, AWS Config, CloudTrail Insights) are managed by the infra-shared repository, not by Hawk. See infra-shared for configuration details.

Recommended Security Configuration

For production deployments, consider:

Setting hawk:eksPublicEndpoint: "false" and using Tailscale for private cluster access
Setting hawk:albInternal: "true" to make the ALB private (requires VPN)
Setting hawk:protectResources: "true" to prevent accidental deletion of stateful resources (S3 buckets, secrets, the Datadog log-archive bucket, and the Aurora cluster)

Monitoring & Observability

CloudWatch

All services log to CloudWatch by default. Log retention is configurable via hawk:cloudwatchLogsRetentionDays (default: 14 days).

Key log groups:

Hawk API logs
Middleman logs
Lambda function logs
GuardDuty findings (when enabled): /aws/events/guardduty/<env>
Security Hub findings (when enabled): /aws/events/securityhub/<env>

Datadog (Optional)

For richer monitoring, Hawk supports Datadog integration:

hawk:enableDatadog: "true"

This enables:

APM — distributed tracing across API, Middleman, and Lambda functions
Log forwarding — CloudWatch logs forwarded to Datadog
Custom metrics — token usage, import counts, evaluation durations
Sensitive data filtering — danger_name, API keys, and auth headers are scrubbed from all telemetry

Budget Alerts

Monitor AWS spending with budget alerts:

hawk:budgetLimit: "10000"
hawk:budgetNotificationEmails:
  - "team@example.com"
hawk:budgetNotificationThresholds:
  - 80
  - 100

External Dependencies

Hawk interacts with the following external services:

| Service | Purpose | Required? |
| --- | --- | --- |
| Docker Hub | Pull base container images | Yes (login recommended to avoid rate limits) |
| LLM Providers | Model API calls via Middleman | At least one provider API key required |
| OIDC Provider | User authentication | Optional (Cognito used by default) |
| Datadog | Monitoring and observability | Optional |
| Slack | Budget alerts | Optional |
| CrowdStrike Falcon | Endpoint protection for EKS nodes and subnet router | Optional |
| Cloudflare | DNS delegation | Optional |
| GitHub | CI/CD via Pulumi Deploy | Optional |

Network Security

TLS everywhere — all external traffic uses TLS via ACM certificates
Private subnets — EKS nodes, RDS, and ECS tasks run in private subnets with no direct internet access
NAT Gateways — outbound internet access from private subnets goes through NAT gateways
Security groups — restrict traffic between components (ALB → ECS, ECS → RDS, etc.)
VPC endpoints — S3 traffic stays within the VPC via a Gateway endpoint
Cilium network policies — pod-level network isolation within Kubernetes