Contributing¶

Developer Setup¶

There are two ways to run Hawk locally:

cp hawk/.env.example hawk/.env
docker compose up --build

The defaults in .env.example are configured for fully local development (MinIO, local PostgreSQL, Minikube). For staging, update the values to point at staging infrastructure.

Then submit evals:

hawk eval-set examples/simple.eval-set.yaml

Run k9s to monitor the Inspect pod.

Commit signing¶

All commits to this repo must be signed. Set up SSH commit signing once (global config applies to every clone and worktree):

git config --global commit.gpgsign true
git config --global gpg.format ssh
# point at your SSH public key (this is the common default path)
git config --global user.signingkey ~/.ssh/id_ed25519.pub

Then add that same public key to GitHub as a Signing Key (Settings → SSH and GPG keys → New SSH key → key type "Signing Key"). A key added only as an Authentication key still signs locally but won't show as Verified on GitHub.

Confirm a commit is signed with git cat-file -p <sha> (look for a gpgsig header) or git log --show-signature; on GitHub it shows a "Verified" badge.

Full Dev Stack (API + Viewer + Live Reload)¶

For developing with hot reload across the full stack:

Terminal 1: Library Watch Mode¶

Inspect AIInspect Scout

cd ~/inspect_ai/src/inspect_ai/_view/www
pnpm install
pnpm build:lib --watch

cd ~/inspect_scout/src/inspect_scout/_view/www
pnpm install
pnpm build:lib --watch

Terminal 2: Viewer Dev Server¶

Update www/package.json to point to your local library, then:

cd www
pnpm install
VITE_API_BASE_URL=http://localhost:8080 pnpm dev

Terminal 3: API Server¶

Run from the repo root:

# Point an env file at a deployed stack (DB, S3, OIDC, etc.):
uv run python scripts/dev/generate-env.py <stack> --api > hawk/.env
# ...or, for fully local development: cp hawk/.env.example hawk/.env

scripts/dev/api   # serves http://localhost:8080 with live reload

scripts/dev/api loads hawk/.env (override with HAWK_ENV_FILE) and runs the server in the app's project env. Extra args pass through to fastapi dev, e.g. scripts/dev/api --port 9000.

Code Quality¶

ruff check       # lint
ruff format      # format
basedpyright     # type check
pytest           # unit tests

All code must pass basedpyright with zero errors and zero warnings.

Testing Runner Changes¶

Build and push a custom runner image:

scripts/dev/build-and-push-runner-image.sh my-tag
hawk eval-set examples/simple.eval-set.yaml --image-tag my-tag

Local Minikube Setup¶

This runs Hawk entirely locally. It uses MinIO for S3, local PostgreSQL, a local Docker registry, and Minikube for Kubernetes. This is the same setup used by E2E tests in CI.

Prerequisites¶

You must be inside the devcontainer, which includes minikube, Docker-in-Docker, cilium, kubectl, helm, and gvisor.

Quick Start¶

These commands are run from the hawk/ directory:

cp .env.example .env
scripts/dev/start-minikube.sh

The script will:

Start Minikube with gvisor, containerd, and an insecure local registry
Create Kubernetes resources and install Cilium
Launch services (API server, MinIO, PostgreSQL, Docker registry)
Run a smoke test to verify the cluster works
Build and push a dummy runner image
Run a simple eval set to verify everything works

Running Evals Locally¶

HAWK_API_URL=http://localhost:8080 hawk eval-set examples/simple.eval-set.yaml --image-tag=dummy

To run real evals, build and push a real runner image:

RUNNER_IMAGE_NAME=localhost:5000/runner scripts/dev/build-and-push-runner-image.sh latest

Updating Dependencies (Inspect AI / Inspect Scout)¶

Use the prepare-release.py script:

# Update to a specific PyPI version
scripts/ops/prepare-release.py --inspect-ai 0.3.50

# Update to a specific git commit SHA
scripts/ops/prepare-release.py --inspect-ai abc123def456

# Update Scout
scripts/ops/prepare-release.py --inspect-scout 0.2.10

The script updates pyproject.toml files, runs uv lock, creates a release branch (for PyPI versions), and publishes any npm packages if needed.

Database Migrations¶

See Database for migration instructions.

Pull Requests¶

When creating PRs, use the template at .github/pull_request_template.md. The template includes:

Overview and linked issue
Approach and alternatives considered
Testing & validation checklist
Code quality checklist