Canary Review 2026: The QA Harness Built Specifically for Claude Code

Software written by AI coding agents still has to be tested. Claude Code can write and modify code quickly, but verifying that its changes do not break existing functionality requires solid QA tooling. Canary is a testing harness designed specifically for Claude Code: it runs E2E tests and collects observability data intended to make failures faster to debug.
What Is Canary?
Canary is an open-source QA harness developed by wizenheimer on GitHub. It is built specifically for use with Claude Code and integrates with Playwright to provide end-to-end testing with rich observability: screen recordings of test runs, console log capture, network HAR files, and full Playwright traces — all bundled together for each test session.
The goal is to make it easy for Claude Code to understand what happened during a failing test and, in many cases, to fix the issue without human intervention.
Key Features
Screen Recordings
Every test run is recorded as a video. When a test fails, the recording shows what happened on screen, giving Claude Code failure context that text-only test output cannot provide.
Console Log Capture
Browser console logs are captured and attached to each test session. JavaScript errors, warnings, and custom log messages are all available for analysis.
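To illustrate what console capture involves, here is a minimal sketch in Python. It is not Canary's code: the `ConsoleCollector` class and its method names are hypothetical. It assumes the handler receives objects with `type` and `text` attributes, which is how Playwright's Python bindings expose console messages.

```python
from dataclasses import dataclass, field


@dataclass
class ConsoleCollector:
    """Hypothetical helper: stores console messages for one test session."""

    entries: list = field(default_factory=list)

    def handle(self, msg) -> None:
        # Assumes `msg` has `type` and `text` attributes, as Playwright's
        # Python ConsoleMessage does.
        self.entries.append({"type": msg.type, "text": msg.text})

    def problems(self) -> list:
        # Keep only the levels most likely to explain a failure.
        return [e for e in self.entries if e["type"] in ("error", "warning")]
```

With Playwright's Python API, a collector like this could be attached with `page.on("console", collector.handle)` before the test navigates, and `collector.problems()` could be saved alongside the other artifacts afterwards.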
Network HAR Files
HTTP Archive (HAR) files capture all network requests and responses during the test. This makes it easy to diagnose API failures, authentication issues, or unexpected response data.
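Because a HAR file is plain JSON, it is easy to inspect programmatically. The sketch below is an illustration rather than part of Canary: the function names are hypothetical, and it relies only on the standard HAR layout (`log.entries`, each with a `request` and a `response`). It lists the requests that look like failures.

```python
import json
from pathlib import Path


def load_har(path: str) -> dict:
    """Read a HAR file from disk (HAR is plain JSON)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def failed_requests(har: dict) -> list:
    """Return (method, url, status) for entries that look like failures."""
    failures = []
    for entry in har.get("log", {}).get("entries", []):
        status = entry["response"]["status"]
        # 4xx/5xx are HTTP errors; a status of 0 is treated here as
        # "no response recorded" (an assumption about the recording tool).
        if status >= 400 or status == 0:
            request = entry["request"]
            failures.append((request["method"], request["url"], status))
    return failures
```

Running `failed_requests(load_har("session.har"))` on a recorded session (the file name is a placeholder) would surface, for example, a 401 on a login call without anyone opening the file by hand.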
Playwright Traces
Full Playwright traces include a timeline of DOM events, network activity, and screenshots at each step. These can be opened in Playwright’s trace viewer for detailed post-mortem analysis.
Claude Code Integration
Together, the screen recording, logs, network data, and Playwright traces give Claude Code enough context to understand a failing test and often fix it automatically, without a human having to describe what went wrong.
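One way to picture the per-session bundle is as a directory of artifacts plus a small index an agent can read first. The sketch below is purely illustrative: the file extensions, the `manifest.json` name, and both function names are assumptions made for this example, not Canary's actual layout.

```python
import json
from pathlib import Path

# Hypothetical mapping from file extension to artifact kind.
KINDS = {".webm": "video", ".har": "network", ".zip": "trace", ".log": "console"}


def build_manifest(session_dir: Path) -> dict:
    """Group the artifact files in one session directory by kind."""
    manifest = {kind: [] for kind in KINDS.values()}
    for path in sorted(session_dir.iterdir()):
        kind = KINDS.get(path.suffix.lower())
        if kind and path.is_file():
            manifest[kind].append(path.name)
    return manifest


def write_manifest(session_dir: Path) -> Path:
    """Write manifest.json next to the artifacts and return its path."""
    out = session_dir / "manifest.json"
    out.write_text(json.dumps(build_manifest(session_dir), indent=2))
    return out
```

An index like this lets a coding agent start from a single small file and open only the artifacts relevant to the failure, rather than scanning the whole directory.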
Pros
- Free and open source
- Comprehensive observability bundle per test session
- Designed specifically for Claude Code integration
- Screen recordings make visual failures immediately obvious
- HAR files enable deep network debugging
- Playwright-based — familiar to most frontend test engineers
Cons
- Primarily designed for web application testing
- Screen recording and trace storage can require significant disk space
- Requires Playwright familiarity to configure effectively
- Claude Code integration works best with Anthropic’s hosted models
Who Is It For?
Canary is for development teams using Claude Code to build web applications who want a solid E2E testing setup that integrates well with AI-assisted debugging. It is particularly valuable for teams where Claude Code is used to both write and fix code, closing the loop between generation and verification.
Pricing
Free. Available at github.com/wizenheimer/canary.
Verdict
Canary solves a real pain point for teams using Claude Code — getting the AI enough context about test failures to fix them independently. The combination of screen recording, logs, network data, and Playwright traces is comprehensive. Recommended for any team using Claude Code for web development.
Rating: 8/10 — Excellent QA tooling for Claude Code users. Web-only and storage overhead are the main limitations.
This article is for educational purposes only. Always evaluate open-source tools against your own requirements before deploying to production.
Disclosure: some links may be affiliate links. DigitechLifestyle may earn a commission at no additional cost to you.