Integration tests

The project ships four tiers of tests. Each tier catches a different class of regression — a change is “done” when every tier the change can plausibly break is green.

Tier 1 is the default unit-test run: go test ./... -race -count=1. It runs on every push and every PR via the go CI job, and covers pure-Go logic and the mock-backed LSP wire client. It has no external binary dependencies beyond the Go toolchain itself.

make test

Tier 2 — Integration tests (make test-integration)

Files tagged //go:build integration drive the real external binaries that the sandbox image ships:

| Package | Binary | What it catches |
| --- | --- | --- |
| internal/lsp/integration_test.go | gopls | Mock-vs-real wire drift (the original implementation returned no rename edits because gopls uses documentChanges, not changes; this tier caught it) |
| internal/lsp/rust_integration_test.go | rust-analyzer | Same wire-drift class, in the rust-analyzer dialect (initialize quirks, references shape, rename shape) |
| internal/lsp/python_integration_test.go | pyright-langserver | Same, in the pyright dialect |
| internal/lsp/node_integration_test.go | typescript-language-server (+ typescript) | Same, in the tsserver dialect |
| internal/verify/integration_test.go | golangci-lint | Lint-output format changes across linter versions |
| internal/verify/python_integration_test.go | ruff | Same class of drift on pythonDetector.ParseLint (F-series format) |
| internal/verify/rust_integration_test.go | cargo clippy (--message-format=short) | Same class of drift on rustDetector.ParseLint (severity-tagged one-liner format) |
| internal/verify/node_integration_test.go | eslint (--format=json) | Same class of drift on nodeDetector.ParseLint (eslint JSON output; the legacy --format=compact was removed from eslint v9 core) |
| internal/tools/integration_test.go | go | Structured-failure parsing from live go test -json output |
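
The documentChanges-versus-changes drift in the gopls row is easier to see in code. The following is a minimal sketch, not the project's LSP client: it assumes a rename response is an LSP WorkspaceEdit and merges both shapes into one map. The type and function names (WorkspaceEdit, editsByURI) and the sample payload are illustrative assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TextEdit keeps only the field this sketch needs; the LSP type also has a range.
type TextEdit struct {
	NewText string `json:"newText"`
}

// TextDocumentEdit is one entry of the documentChanges array.
type TextDocumentEdit struct {
	TextDocument struct {
		URI string `json:"uri"`
	} `json:"textDocument"`
	Edits []TextEdit `json:"edits"`
}

// WorkspaceEdit models both shapes a server may use for a rename result.
type WorkspaceEdit struct {
	Changes         map[string][]TextEdit `json:"changes"`
	DocumentChanges []TextDocumentEdit    `json:"documentChanges"`
}

// editsByURI merges both shapes into a single map keyed by document URI.
// A client that reads only Changes sees nothing from a server that
// populates only DocumentChanges.
func editsByURI(raw []byte) (map[string][]TextEdit, error) {
	var we WorkspaceEdit
	if err := json.Unmarshal(raw, &we); err != nil {
		return nil, err
	}
	out := make(map[string][]TextEdit)
	for uri, edits := range we.Changes {
		out[uri] = append(out[uri], edits...)
	}
	for _, dc := range we.DocumentChanges {
		if dc.TextDocument.URI == "" {
			continue // resource operations (create/rename/delete) are ignored here
		}
		out[dc.TextDocument.URI] = append(out[dc.TextDocument.URI], dc.Edits...)
	}
	return out, nil
}

func main() {
	// Hypothetical payload in the documentChanges shape.
	raw := []byte(`{"documentChanges":[{"textDocument":{"uri":"file:///a.go","version":1},"edits":[{"newText":"Bar"}]}]}`)
	edits, err := editsByURI(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(edits["file:///a.go"]), "edit(s) for file:///a.go")
}
```

A mock that only ever emits the changes shape would pass against a changes-only decoder, which is why this class of bug tends to surface only against the real server.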

Run locally:

make test-integration

Binaries the tier needs on PATH:

  • go (always present in a Go dev environment)
  • gopls: go install golang.org/x/tools/gopls@latest
  • golangci-lint v2: brew install golangci-lint / apt install golangci-lint
  • ruff: pip install ruff
  • cargo clippy: rustup component add clippy
  • eslint: npm i -g eslint
  • rust-analyzer: rustup component add rust-analyzer
  • pyright-langserver: npm i -g pyright (or pip install pyright)
  • typescript-language-server: npm i -g typescript-language-server typescript

Any missing binary skips the corresponding test with a clear message; it is not a failure. That keeps the target safe to run on a partially-provisioned machine while still being meaningful when every tool is present.

CI runs this tier in a dedicated integration job that installs each binary fresh, so every detector + LSP is fully exercised on every PR rather than depending on contributor toolchains.

Tier 3 — End-to-end MCP wire (scripts/e2e-p0.sh)

Out-of-test-tree smoke that chains every P0 tool over the real MCP HTTP+SSE wire against a real bin/sandbox binary. This is the only tier that exercises the tool surface through the full transport — MCP initialization, JSON-RPC request / SSE response round-trips, tool dispatch, the scrub + metrics + tracing middleware stack. Mirrors scripts/e2e-demo.sh in shape.

bash scripts/e2e-p0.sh

Runtime ~60s on a warm Go cache. LSP steps skip cleanly when gopls isn’t on PATH; everything else is unconditional. Binaries needed: go, curl, jq, git, ripgrep, gopls (optional).

CI runs this tier as the e2e-smoke job: installs gopls + ripgrep, builds the binary, runs the script. No binary skips in CI — every LSP step is exercised.
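
For orientation, here is a rough sketch of the kind of JSON-RPC body a tools/call round-trip sends over the MCP wire. It is written in Go for illustration only, since the tier itself is a bash script. The helper name newToolCall and the path argument are assumptions. The method name and the name/arguments parameter layout follow the MCP tools/call request shape.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string `json:"jsonrpc"`
	ID      int    `json:"id"`
	Method  string `json:"method"`
	Params  any    `json:"params,omitempty"`
}

// toolCallParams is the params object of an MCP tools/call request.
type toolCallParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments,omitempty"`
}

// newToolCall builds the request body for one tool invocation.
// (Hypothetical helper; the script builds its bodies in bash.)
func newToolCall(id int, tool string, args map[string]any) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  toolCallParams{Name: tool, Arguments: args},
	})
}

func main() {
	// The "path" argument name is an assumption made for this example.
	body, err := newToolCall(2, "Read", map[string]any{"path": "main.go"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```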

scripts/e2e-multi-workspace.sh is a focused companion to e2e-p0.sh. It boots the sandbox in -workspaces=primary=A,extension=B mode and asserts that every workspace-aware tool dispatches to the correct root, rejects the no-hint case with an actionable error, rejects unknown-name hints, and keeps the read-tracker per-absolute-path (one workspace’s Read cannot unlock another’s Write).

bash scripts/e2e-multi-workspace.sh

Runtime ~10s. Binaries needed: go, bash, curl, jq, ripgrep (Glob + Grep delegate to rg). Wired into CI as the e2e-multi-workspace job parallel to e2e-smoke.
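
The three dispatch outcomes this script asserts (correct root, no-hint rejection, unknown-name rejection) can be sketched as follows. This is a hypothetical stand-in for ResolveWorkspace: the real signature, error wording, and single-workspace behaviour are not shown in this document, so everything below is an assumption, including the /srv paths.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// resolveWorkspace maps a workspace hint to its root directory.
// Hypothetical sketch only; not the project's ResolveWorkspace.
func resolveWorkspace(roots map[string]string, hint string) (string, error) {
	names := make([]string, 0, len(roots))
	for n := range roots {
		names = append(names, n)
	}
	sort.Strings(names) // stable order keeps error messages deterministic
	known := strings.Join(names, ", ")

	if hint == "" {
		// Assumption: with exactly one workspace the hint may be omitted.
		if len(roots) == 1 {
			return roots[names[0]], nil
		}
		return "", fmt.Errorf("workspace hint required; choose one of: %s", known)
	}
	root, ok := roots[hint]
	if !ok {
		return "", fmt.Errorf("unknown workspace %q; known workspaces: %s", hint, known)
	}
	return root, nil
}

func main() {
	// Placeholder roots standing in for A and B from the -workspaces flag.
	roots := map[string]string{"primary": "/srv/A", "extension": "/srv/B"}
	for _, hint := range []string{"primary", "", "other"} {
		root, err := resolveWorkspace(roots, hint)
		fmt.Printf("hint=%q root=%q err=%v\n", hint, root, err)
	}
}
```

Listing the known names in both error paths is what makes the rejection actionable: the caller can retry with a valid hint without a second lookup.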

The fourth tier runs against Docker images in CI. One job per feature layer composes Dockerfile.tools + Dockerfile.tools-<lang> onto the layer’s operator-recommended base, boots a container, opens the SSE stream, initialises an MCP session, and exercises the layer’s characteristic tools.

| Job | Base | Feature-layer binaries | What it asserts via MCP |
| --- | --- | --- | --- |
| docker-integration | golang:1.25-alpine | gopls, golangci-lint | Every P0 tool name is present in tools/list |
| docker-integration-node | node:22-slim | pnpm, bun | run_tests runs a node --test suite to exit 0; Bash sees pnpm + bun on PATH |
| docker-integration-python | python:3.12-slim + pytest | ruff | run_lint surfaces a seeded F401 finding via MCP; run_tests runs pytest to exit 0 |
| docker-integration-rust | rust:1-slim-bookworm + clippy | rust-analyzer | run_tests passes cargo test; run_lint surfaces cargo clippy output |
| docker-integration-render | tools-render base (debian + dot + mmdc + Chromium) | | render_mermaid + render_dot each write an SVG with an <svg root to the workspace volume |

This is the only tier that verifies the published image actually boots and registers its tool surface. If any of these go red, no other test tier’s green matters — the operators can’t run the thing.

The MCP handshake + tools/call boilerplate is factored into scripts/mcp-helpers.sh so each job’s inline bash stays focused on the seed + assertions.
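
The tools/list assertion from the docker-integration row can be sketched as below. The CI jobs use inline bash with scripts/mcp-helpers.sh, so this Go version is illustrative only. The function name missingTools and the sample response are assumptions. The result.tools[].name layout follows the MCP tools/list response shape.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolsListResponse keeps only the fields needed to read tool names
// out of a JSON-RPC tools/list response.
type toolsListResponse struct {
	Result struct {
		Tools []struct {
			Name string `json:"name"`
		} `json:"tools"`
	} `json:"result"`
}

// missingTools returns every name in want that the response does not list.
// (Hypothetical helper, not part of the project.)
func missingTools(raw []byte, want []string) ([]string, error) {
	var resp toolsListResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, err
	}
	have := make(map[string]bool, len(resp.Result.Tools))
	for _, t := range resp.Result.Tools {
		have[t.Name] = true
	}
	var missing []string
	for _, name := range want {
		if !have[name] {
			missing = append(missing, name)
		}
	}
	return missing, nil
}

func main() {
	// Hypothetical response; a real image would list its full tool surface.
	raw := []byte(`{"jsonrpc":"2.0","id":1,"result":{"tools":[{"name":"run_tests"},{"name":"run_lint"}]}}`)
	missing, err := missingTools(raw, []string{"run_tests", "run_lint", "render_dot"})
	if err != nil {
		panic(err)
	}
	fmt.Println("missing:", missing)
}
```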

Which tier to run depends on what the change touches:

  • Small refactor, no external deps touched: unit tests cover it.
  • New tool, new flag, anything agent-visible: add a case to scripts/e2e-p0.sh and run it locally before pushing.
  • Touching the LSP client, the lint parser, or the Go test parser: the integration tier is your regression net; run make test-integration locally.
  • Touching the MCP transport, middleware, or any tool handler’s wire contract: e2e-smoke catches the full-stack regression.
  • Touching ResolveWorkspace, the read-tracker, or any tool’s workspace argument plumbing: e2e-multi-workspace is the regression net.
  • Dockerfile or image composition change: push the branch and let docker-integration gate the merge.