Visual Verification Implementation Breakdown
Goal: Add Playwright-based visual verification to rdev, enabling automated screenshot/video capture of deployed sites and AI-driven feature completeness evaluation. Integrate with SDLC as an optional QA gate and add a cookbook E2E test.
Estimated Duration: 4 weeks (assumes ~25 hours/week of focused work)
Week 1: Foundation — Domain + Capture Infrastructure
Goals:
- Playwright pod deployed and reachable via kubectl exec
- Capture script working end-to-end
- Domain models and work task type in place
- Manual verification via kubectl exec confirms capture works
Tasks:
Day 1-2: Playwright Pod Infrastructure
- Create Playwright pod manifest (`deployments/k8s/base/playwright-pod.yaml`)
  - StatefulSet with `mcr.microsoft.com/playwright:v1.50.0-noble` image
  - `sleep infinity` command (stays alive for kubectl exec)
  - Labels: `app: playwright`, `rdev.orchard9.ai/role: playwright`
  - Volumes: `/captures` (emptyDir), `/scripts` (ConfigMap)
  - Resources: 500m CPU / 1Gi request, 2 CPU / 4Gi limit
- Create capture script (`deployments/k8s/base/playwright-scripts/capture.js`)
  - ~60 lines Node.js using Playwright
  - CLI: `--url`, `--viewports` (comma-sep), `--output`, `--wait-for`, `--full-page`, `--video`, `--timeout`
  - Output: JSON manifest to stdout with screenshot paths
  - Error handling: catch navigation failures, timeout gracefully
- Create ConfigMap for script (`deployments/k8s/base/playwright-configmap.yaml`)
  - Mount `capture.js` at `/scripts/capture.js`
- Deploy to cluster and test manually

      kubectl apply -f deployments/k8s/base/playwright-configmap.yaml
      kubectl apply -f deployments/k8s/base/playwright-pod.yaml
      kubectl exec playwright-0 -- node /scripts/capture.js \
        --url=https://example.com --viewports=1920x1080 --output=/captures/test/
      kubectl exec playwright-0 -- cat /captures/test/manifest.json
Day 3: Domain Models
- Create domain types (`internal/domain/verify.go`)
  - `VerifySpec` struct with fields: URL, Viewports, WaitFor, WaitTimeout, FullPage, Video, Evaluate, Prompt, SpecPath, CallbackURL
  - `Validate()` method: URL required, callback URL validation (reuse `ValidateCallbackURL`)
  - `VerifyResult` struct: Success, Screenshots, Video, Evaluation, Score, Passed, DurationMs, Error
  - `ToWorkResult()` method (promote screenshots to artifacts map)
- Add work task type (`internal/domain/work.go`)
  - Add `WorkTaskTypeVerify WorkTaskType = "verify"` to constants
  - Update `IsValid()` to include verify
- Unit tests (`internal/domain/verify_test.go`)
  - Test Validate() with valid/invalid specs
  - Test ToWorkResult() conversion
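
The domain types above could look roughly like the following. This is a minimal sketch, not the project's code: the Go field types, the `WorkResult` shape, the `screenshot_` artifact key prefix, and the lowercase `validateCallbackURL` helper (a stand-in for the existing `ValidateCallbackURL`) are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// VerifySpec mirrors the field list in the plan; the Go types are assumptions.
type VerifySpec struct {
	URL         string
	Viewports   []string
	WaitFor     string
	WaitTimeout int // assumed: milliseconds
	FullPage    bool
	Video       bool
	Evaluate    bool
	Prompt      string
	SpecPath    string
	CallbackURL string
}

// validateCallbackURL is a hypothetical stand-in for the existing
// ValidateCallbackURL helper, whose real rules are not shown in the plan.
func validateCallbackURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("callback url: %w", err)
	}
	if (u.Scheme != "http" && u.Scheme != "https") || u.Host == "" {
		return errors.New("callback url: must be an absolute http(s) URL")
	}
	return nil
}

// Validate enforces the two rules named in the plan: URL is required, and a
// callback URL, when present, must pass callback validation.
func (s VerifySpec) Validate() error {
	if s.URL == "" {
		return errors.New("url is required")
	}
	if s.CallbackURL != "" {
		return validateCallbackURL(s.CallbackURL)
	}
	return nil
}

// WorkResult is a hypothetical minimal version of the work queue result type.
type WorkResult struct {
	Success   bool
	Artifacts map[string]string
}

// VerifyResult carries a subset of the planned fields.
type VerifyResult struct {
	Success     bool
	Screenshots map[string]string // assumed: viewport -> file path
	Error       string
}

// ToWorkResult promotes each screenshot to an entry in the artifacts map.
func (r VerifyResult) ToWorkResult() WorkResult {
	artifacts := make(map[string]string, len(r.Screenshots))
	for viewport, path := range r.Screenshots {
		artifacts["screenshot_"+viewport] = path
	}
	return WorkResult{Success: r.Success, Artifacts: artifacts}
}

func main() {
	fmt.Println(VerifySpec{}.Validate())
	res := VerifyResult{
		Success:     true,
		Screenshots: map[string]string{"1920x1080": "/captures/test/1920x1080.png"},
	}
	fmt.Println(res.ToWorkResult().Artifacts)
}
```

Keeping `Validate()` free of I/O means the unit tests in `verify_test.go` can be plain table-driven cases over valid and invalid specs.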
Day 4-5: Verify Executor (Capture Only)
- Create verify executor (`internal/worker/verify_executor.go`)
  - Follow `BuildExecutor` pattern exactly
  - `Execute(ctx, task)` method:
    - Parse VerifySpec from task.Spec map
    - Build kubectl exec command: `kubectl exec playwright-0 -- node /scripts/capture.js --url=X ...`
    - Execute via existing `CommandExecutor` port
    - Parse JSON manifest from stdout
    - Return `BuildResult` with artifacts map containing screenshot paths
  - Config struct: `VerifyExecutorConfig` with playwright pod name, namespace
  - Constructor: `NewVerifyExecutor(executor, streams, logger, cfg)`
- Wire executor to WorkExecutor (`internal/worker/work_executor.go`)
  - Add `verifyExec *VerifyExecutor` field
  - Add case in `executeTask()` switch for `WorkTaskTypeVerify`
  - Update `NewWorkExecutor()` to accept VerifyExecutor
- Unit tests (`internal/worker/verify_executor_test.go`)
  - Mock CommandExecutor to return capture manifest JSON
  - Test successful capture with multiple viewports
  - Test failure handling (command fails, invalid JSON)
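
The two pure steps of `Execute` (building the kubectl argument list and parsing the manifest) can be sketched in isolation. The manifest JSON shape (`screenshots` entries with `viewport` and `path`) is an assumption, since the plan only says the script prints a JSON manifest with screenshot paths; `buildCaptureArgs` and `parseManifest` are hypothetical helper names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// captureManifest is an assumed shape for the JSON that capture.js prints.
type captureManifest struct {
	Screenshots []struct {
		Viewport string `json:"viewport"`
		Path     string `json:"path"`
	} `json:"screenshots"`
}

// buildCaptureArgs assembles the arguments passed to kubectl. Passing them as
// a slice (rather than one shell string) avoids quoting problems in the URL.
func buildCaptureArgs(namespace, pod, targetURL string, viewports []string, output string) []string {
	return []string{
		"exec", "-n", namespace, pod, "--",
		"node", "/scripts/capture.js",
		"--url=" + targetURL,
		"--viewports=" + strings.Join(viewports, ","),
		"--output=" + output,
	}
}

// parseManifest turns the script's stdout into a viewport -> path map,
// returning an error for invalid JSON so the executor can fail the task.
func parseManifest(stdout []byte) (map[string]string, error) {
	var m captureManifest
	if err := json.Unmarshal(stdout, &m); err != nil {
		return nil, fmt.Errorf("parse capture manifest: %w", err)
	}
	artifacts := make(map[string]string, len(m.Screenshots))
	for _, s := range m.Screenshots {
		artifacts[s.Viewport] = s.Path
	}
	return artifacts, nil
}

func main() {
	args := buildCaptureArgs("default", "playwright-0", "https://example.com",
		[]string{"1920x1080", "375x667"}, "/captures/test/")
	fmt.Println("kubectl", strings.Join(args, " "))

	out := []byte(`{"screenshots":[{"viewport":"1920x1080","path":"/captures/test/1920x1080.png"}]}`)
	artifacts, err := parseManifest(out)
	fmt.Println(artifacts, err)
}
```

Splitting these out of `Execute` lets the unit tests cover argument construction and invalid-JSON handling without a mocked `CommandExecutor`.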
Deliverables:
- Playwright pod running in cluster
- Capture script takes screenshots successfully
- VerifySpec/VerifyResult domain types with tests
- VerifyExecutor can dispatch capture via kubectl exec
- Work queue can dispatch verify tasks (manual test via SQL insert)
Foundation this enables:
- Week 2 can build API layer knowing capture works
- Executor pattern established for AI evaluation later
Week 2: API Layer + Manual E2E
Goals:
- Full API surface: POST /projects/{id}/verify, GET /verify/{taskId}, GET /projects/{id}/verifications
- Auth scopes configured
- Manual E2E working: API call → queue → capture → result
- Initial release candidate deployed to staging
Tasks:
Day 1: Auth and Service Layer
- Add auth scopes (`internal/auth/scopes.go`)
  - `ScopeVerifyRead Scope = "verify:read"`
  - `ScopeVerifyWrite Scope = "verify:write"`
  - Add to `AllScopes` if needed
- Create verify service (`internal/service/verify_service.go`)
  - Follow `BuildService` pattern
  - `StartVerify(ctx, projectID, spec)` → validate, enqueue task, return task ID
  - `GetVerifyStatus(ctx, taskID)` → get task from work queue
  - `ListVerifications(ctx, projectID, limit)` → list tasks by project
  - Dependencies: WorkQueue port (existing)
- Unit tests (`internal/service/verify_service_test.go`)
  - Mock work queue
  - Test enqueue, status, list
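
The validate-then-enqueue flow of `StartVerify` can be sketched against an in-memory fake queue. The `WorkQueue` interface here is a hypothetical one-method slice of the existing port (its real signature is not shown in this plan), and the spec is passed as a plain map to keep the sketch short.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// WorkQueue is a hypothetical, minimal view of the existing work queue port.
type WorkQueue interface {
	Enqueue(ctx context.Context, projectID, taskType string, spec map[string]any) (string, error)
}

// VerifyService depends only on the queue, as the plan describes.
type VerifyService struct {
	queue WorkQueue
}

// StartVerify validates the spec, enqueues a "verify" task, and returns its ID.
func (s *VerifyService) StartVerify(ctx context.Context, projectID string, spec map[string]any) (string, error) {
	if u, _ := spec["url"].(string); u == "" {
		return "", errors.New("url is required")
	}
	return s.queue.Enqueue(ctx, projectID, "verify", spec)
}

// memQueue is an in-memory fake of the kind a unit test would inject.
type memQueue struct {
	tasks []string
}

func (q *memQueue) Enqueue(_ context.Context, projectID, taskType string, _ map[string]any) (string, error) {
	id := fmt.Sprintf("%s-%s-%d", projectID, taskType, len(q.tasks)+1)
	q.tasks = append(q.tasks, id)
	return id, nil
}

// demo runs StartVerify against a fresh fake queue.
func demo(spec map[string]any) (string, error) {
	svc := &VerifyService{queue: &memQueue{}}
	return svc.StartVerify(context.Background(), "myproject", spec)
}

func main() {
	fmt.Println(demo(map[string]any{"url": "https://example.com"}))
	fmt.Println(demo(map[string]any{}))
}
```

Because validation runs before the queue is touched, an invalid spec never creates a task.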
Day 2-3: Handler Layer
- Create verify handler (`internal/handlers/verify.go`)
  - Follow `BuildsHandler` pattern exactly
  - `Mount(r api.Router)` with scopes:
    - POST `/projects/{id}/verify` → ScopeVerifyWrite
    - GET `/projects/{id}/verifications` → ScopeVerifyRead
    - GET `/verify/{taskId}` → ScopeVerifyRead
  - Use `api.DecodeJSON()`, `validate.New()`, response helpers
  - Request struct: `VerifyRequest` matching VerifySpec
  - Response structs: match existing patterns
- Wire DI (`cmd/rdev-api/main.go`)
  - Create VerifyExecutor in worker setup
  - Create VerifyService
  - Create VerifyHandler
  - Mount routes
- Handler tests (`internal/handlers/verify_test.go`)
  - Test POST with valid/invalid specs
  - Test auth scope enforcement
  - Test GET status/list
Day 4: SSE Events
- Add verify events (`internal/worker/verify_executor.go`)
  - Publish events via StreamPublisher:
    - `verify.started` - task claimed
    - `verify.capturing` - starting capture
    - `verify.captured` - capture complete with manifest
    - `verify.completed` / `verify.failed` - final status
  - Event constants in verify_executor.go (follow BuildExecutor pattern)
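
The event sequence can be sketched as constants plus a small wrapper around the capture step. The event names come from the list above; the `StreamPublisher` interface shape, the `recorder` fake, and `runCapture` are hypothetical and only show the ordering.

```go
package main

import (
	"errors"
	"fmt"
)

// Event names as listed in the plan.
const (
	EventVerifyStarted   = "verify.started"
	EventVerifyCapturing = "verify.capturing"
	EventVerifyCaptured  = "verify.captured"
	EventVerifyCompleted = "verify.completed"
	EventVerifyFailed    = "verify.failed"
)

// StreamPublisher is an assumed shape for the existing publisher port.
type StreamPublisher interface {
	Publish(taskID string, event string, data map[string]any)
}

// recorder is a fake publisher that remembers the event names it received.
type recorder struct {
	events []string
}

func (r *recorder) Publish(_ string, event string, _ map[string]any) {
	r.events = append(r.events, event)
}

// errCaptureFailed is a sample error used by the demo below.
var errCaptureFailed = errors.New("capture failed")

// runCapture publishes started/capturing, runs the capture function, then
// publishes either captured+completed or failed.
func runCapture(p StreamPublisher, taskID string, capture func() (map[string]any, error)) error {
	p.Publish(taskID, EventVerifyStarted, nil)
	p.Publish(taskID, EventVerifyCapturing, nil)
	manifest, err := capture()
	if err != nil {
		p.Publish(taskID, EventVerifyFailed, map[string]any{"error": err.Error()})
		return err
	}
	p.Publish(taskID, EventVerifyCaptured, manifest)
	p.Publish(taskID, EventVerifyCompleted, nil)
	return nil
}

// failingCapture always fails, to exercise the failure path.
func failingCapture() (map[string]any, error) {
	return nil, errCaptureFailed
}

func main() {
	ok := &recorder{}
	_ = runCapture(ok, "task-1", func() (map[string]any, error) {
		return map[string]any{"screenshots": 1}, nil
	})
	fmt.Println(ok.events)

	bad := &recorder{}
	_ = runCapture(bad, "task-2", failingCapture)
	fmt.Println(bad.events)
}
```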
Day 5: Manual E2E + Deploy
- Manual E2E test sequence

      # 1. Start verification
      curl -X POST $RDEV_API_URL/projects/myproject/verify \
        -H "X-API-Key: $RDEV_API_KEY" \
        -H "Content-Type: application/json" \
        -d '{"url": "https://myproject.threesix.ai", "viewports": ["1920x1080"]}'
      # Response: {"task_id": "xxx"}

      # 2. Poll for completion
      curl $RDEV_API_URL/verify/xxx -H "X-API-Key: $RDEV_API_KEY"
      # Response: screenshots in artifacts

- Build and deploy

      ./scripts/release.sh v0.11.0 "feat: add visual verification (capture-only MVP)" --deploy
Deliverables:
- Auth scopes for verify:read/write
- VerifyService with enqueue/status/list
- VerifyHandler with 3 endpoints
- SSE events for verification progress
- Deployed to staging, manual E2E passing
Foundation this enables:
- Week 3 can add AI evaluation knowing API works
- Cookbook script can use standard api_call() pattern
Week 3: AI Evaluation + Cookbook Test
Goals:
- AI evaluation path working (Claude reads screenshots, returns verdict)
- Cookbook E2E test script: `visual-verify-test.sh`
- Add to common.sh utilities
- Full E2E passing in CI
Tasks:
Day 1-2: AI Evaluation Path
- Add evaluation to VerifyExecutor (`internal/worker/verify_executor.go`)
  - After successful capture, if `spec.Evaluate`:
    - Build evaluation prompt: "Compare these screenshots against the specification..."
    - Include spec.Prompt or read spec.SpecPath content
    - Call Claude Code via CodeAgentRegistry
    - Pass screenshots as attachments (file paths in pod)
    - Parse evaluation output for score (look for "Score: XX/100" pattern)
    - Set result.Evaluation, result.Score, result.Passed
- Evaluation prompt template (hardcoded in executor for now)

      Evaluate these screenshots against the following specification:

      {spec.Prompt or contents of spec.SpecPath}

      For each screenshot, assess:
      1. Does the UI match the specification?
      2. Are all required elements present?
      3. Is the layout correct at this viewport?

      End with: "Score: XX/100" and "PASSED" or "FAILED"

- Handle partial failures (`internal/worker/verify_executor.go`)
  - If capture succeeds but evaluation fails:
    - Set success=true (screenshots are still useful)
    - Leave evaluation=""
    - Log warning
- Unit tests for evaluation path
  - Mock CodeAgentRegistry
  - Test evaluation output parsing
  - Test partial failure handling
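
Score extraction from the evaluator's free-text output could be done with a small regular expression. This is a sketch under assumptions: `parseEvaluation` is a hypothetical helper name, and the rule that any "FAILED" in the output overrides "PASSED" is a conservative choice made here, not something the plan specifies.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// scoreRe matches the "Score: XX/100" line requested by the prompt template.
var scoreRe = regexp.MustCompile(`Score:\s*(\d{1,3})\s*/\s*100`)

// parseEvaluation returns the score, the verdict, and whether a valid score
// was found at all. When ok is false the caller can treat the evaluation as
// failed while still keeping the captured screenshots.
func parseEvaluation(output string) (int, bool, bool) {
	m := scoreRe.FindStringSubmatch(output)
	if m == nil {
		return 0, false, false
	}
	score, err := strconv.Atoi(m[1])
	if err != nil || score > 100 {
		return 0, false, false
	}
	// Assumption: an explicit FAILED anywhere in the output wins over PASSED.
	passed := strings.Contains(output, "PASSED") && !strings.Contains(output, "FAILED")
	return score, passed, true
}

func main() {
	fmt.Println(parseEvaluation("Hero section renders correctly.\nScore: 85/100\nPASSED"))
	fmt.Println(parseEvaluation("CTA button missing on mobile.\nScore: 40/100\nFAILED"))
	fmt.Println(parseEvaluation("The model returned no verdict."))
}
```

When the third return value is false, the partial-failure path applies: success stays true, the evaluation is left empty, and a warning is logged.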
Day 3-4: Cookbook Test Script
- Add utility to common.sh (`cookbooks/scripts/common.sh`)

      # Wait for verification to complete
      # Arguments: task_id [max_attempts] [poll_interval]
      wait_for_verify() {
          local task_id="$1"
          local max_attempts="${2:-30}"
          local poll_interval="${3:-5}"
          # Poll GET /verify/{task_id} until completed/failed
      }

- Create visual-verify-test.sh (`cookbooks/scripts/visual-verify-test.sh`)
  - Follow cookbook script SKILL.md patterns exactly
  - Commands: run, status, diagnose, teardown
  - Flow:
    - Create composable project with app-astro component
    - Wait for initial deploy (site is live)
    - Start build: "Create a hero section with a call-to-action button"
    - Wait for build to complete
    - Wait for CI pipeline
    - Wait for site to respond
    - Start verification: `POST /projects/{id}/verify {url, evaluate: true, prompt: ...}`
    - Wait for verify to complete
    - Assert: result.passed == true OR result.score >= 70
    - Teardown
- Add auto-teardown support
  - Parse `--auto-teardown` flag
  - Register cleanup trap
  - Set CLEANUP_PROJECT
Day 5: Integration + CI
- Test locally

      ./cookbooks/scripts/visual-verify-test.sh run vv-test --auto-teardown

- Add to CI (if CI runs cookbook tests)
  - Add visual-verify-test to test matrix
  - Ensure playwright-0 pod is available in test environment
- Document in cookbook skill (`.claude/skills/cookbook-scripts/SKILL.md`)
  - Add `wait_for_verify()` to utilities list
  - Add visual-verify-test.sh to examples
Deliverables:
- AI evaluation working with score extraction
- Partial failure handling (capture ok, eval fail)
- wait_for_verify() in common.sh
- visual-verify-test.sh passing end-to-end
- Documentation updated
Foundation this enables:
- Week 4 can add SDLC integration knowing full flow works
- Cookbook pattern established for future tests
Week 4: SDLC Integration + Polish
Goals:
- Visual verification as optional SDLC gate between QA and merge
- Skeleton command: `/verify-feature`
- Build chaining: auto-verify after deploy
- Release v0.12.0 with full feature
Tasks:
Day 1-2: SDLC Types and Rules
- Add artifact type (`internal/sdlc/types.go`)
  - `ArtifactVerification ArtifactType = "verification"`
  - Add to `ValidArtifactTypes` slice
  - Add case in `ArtifactFilename()` → returns `"verification.md"`
- Add action types (`internal/sdlc/types.go`)
  - `ActionVerifyFeature ActionType = "VERIFY_FEATURE"`
  - `ActionFixVerificationIssues ActionType = "FIX_VERIFICATION_ISSUES"`
- Add classifier rules (`internal/sdlc/rules_execution.go`)
  - `needsVerificationRule()`:
    - Condition: Phase=QA, qa_results=passed, verification=nil or pending
    - Action: ActionVerifyFeature
    - NextCommand: "/verify-feature {slug}"
  - `verificationFailedRule()`:
    - Condition: Phase=QA, verification=failed
    - Action: ActionFixVerificationIssues
    - NextCommand: "/fix-verification-issues {slug}"
  - `verificationPassedRule()`:
    - Condition: Phase=QA, qa_results=passed, verification=passed
    - Action: ActionTransition to PhaseMerge
- Update rule ordering (`internal/sdlc/rules.go`)
  - Insert verification rules after qaPassedRule
  - Update qaPassedRule: only transition if verification also passed OR feature doesn't require verification (config flag)
- Unit tests (`internal/sdlc/rules_execution_test.go`)
  - Test all three verification rules
  - Test interaction with existing QA rules
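
The three verification rules amount to a small decision table over the feature state, sketched below as a single function. The `featureState` struct, the string-typed statuses, and the `"TRANSITION"` value for `ActionTransition` are assumptions for illustration; the real classifier presumably expresses each rule as its own function in `rules_execution.go`.

```go
package main

import "fmt"

// featureState is a hypothetical, flattened view of what the classifier sees.
type featureState struct {
	Phase        string // e.g. "QA"
	QAResults    string // e.g. "passed"
	Verification string // "", "pending", "passed", or "failed"
}

const (
	ActionVerifyFeature         = "VERIFY_FEATURE"
	ActionFixVerificationIssues = "FIX_VERIFICATION_ISSUES"
	ActionTransition            = "TRANSITION" // assumed value; not given in the plan
)

// classify applies the three verification rules in a fixed order and reports
// the action, the next command, and whether any rule matched.
func classify(s featureState, slug string) (string, string, bool) {
	if s.Phase != "QA" {
		return "", "", false
	}
	switch {
	case s.Verification == "failed":
		// verificationFailedRule
		return ActionFixVerificationIssues, "/fix-verification-issues " + slug, true
	case s.QAResults == "passed" && (s.Verification == "" || s.Verification == "pending"):
		// needsVerificationRule
		return ActionVerifyFeature, "/verify-feature " + slug, true
	case s.QAResults == "passed" && s.Verification == "passed":
		// verificationPassedRule: transition to the merge phase
		return ActionTransition, "", true
	}
	return "", "", false
}

func main() {
	fmt.Println(classify(featureState{Phase: "QA", QAResults: "passed"}, "hero-section"))
	fmt.Println(classify(featureState{Phase: "QA", Verification: "failed"}, "hero-section"))
	fmt.Println(classify(featureState{Phase: "QA", QAResults: "passed", Verification: "passed"}, "hero-section"))
}
```

The three conditions are mutually exclusive, so the order matters only for readability; the interaction tests with the existing QA rules are where ordering bugs would show up.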
Day 3: Skeleton Command
- Create verify-feature command (embedded template: `templates/skeleton/.claude/commands/verify-feature.md`)

      ---
      description: Visually verify a deployed feature
      argument-hint: <feature-slug>
      allowed-tools: Bash, Read, Write, Edit, Glob, Grep
      ---

      Visually verify feature: $ARGUMENTS

      ## Instructions

      1. Load feature spec from `.sdlc/features/$ARGUMENTS/spec.md`
      2. Get project domain from CLAUDE.md or config
      3. Determine the deployed URL
      4. Execute verification via rdev API (if available) or Playwright directly
      5. Write results to `.sdlc/features/$ARGUMENTS/verification.md`
      6. Register artifact: `sdlc artifact create $ARGUMENTS verification`

      ## Output Format

      Write `.sdlc/features/$ARGUMENTS/verification.md`:

      ```markdown
      # Visual Verification: [Feature Title]

      ## Screenshots

      | Viewport | Status | Notes |
      |----------|--------|-------|
      | Desktop (1920x1080) | PASS | All elements visible |
      | Mobile (375x667) | PASS | Responsive layout correct |

      ## Evaluation

      [AI or manual evaluation notes]

      ## Result

      **Status:** PASSED
      **Score:** 95/100
      ```

- Update skeleton template to include the command
  - Ensure new projects get verify-feature.md
Day 4: Build Chaining (Optional)
- Add verify_after to BuildSpec (`internal/domain/build.go`)
  - `VerifyAfter bool` - auto-verify after successful deploy
  - `VerifyURL string` - URL to verify (if different from project domain)
- Chain verification in BuildExecutor (`internal/worker/build_executor.go`)
  - After successful build + push (line ~270):

        if spec.VerifyAfter && spec.VerifyURL != "" {
            // Enqueue verify task
        }

  - Or: callback webhook triggers external verification
- Update build handler to accept verify_after/verify_url
Day 5: Documentation + Release
- Update documentation
  - CLAUDE.md: Update platform status to "Done"
  - visual-verification.md: Add SDLC integration examples
  - sdlc.md: Document verification rules
- Integration test
  - Test full SDLC flow with verification gate
  - Test classifier transitions correctly
- Final release

      ./scripts/release.sh v0.12.0 "feat: visual verification with SDLC integration" --deploy
Deliverables:
- ArtifactVerification type in SDLC
- 3 classifier rules for verification gate
- verify-feature.md skeleton command
- Build chaining (verify_after flag)
- Full integration test passing
- v0.12.0 released
Summary
| Week | Theme | Key Output |
|---|---|---|
| 1 | Foundation | Playwright pod + capture script + domain types + executor |
| 2 | API Layer | Handlers + service + auth scopes + manual E2E |
| 3 | AI + Cookbook | Evaluation path + visual-verify-test.sh + common.sh utils |
| 4 | SDLC + Polish | Classifier rules + skeleton command + build chaining + release |
Risks and Mitigations
| Risk | Impact | Mitigation |
|---|---|---|
| Playwright pod OOM | Capture fails | Start with conservative limits (4Gi), tune based on usage |
| AI evaluation unreliable | Poor pass/fail decisions | Start with high threshold (70), tune; partial success mode |
| Screenshot storage fills up | Pod crashes | EmptyDir for now, add cleanup job or PVC later |
| SDLC rules conflict | Features stuck | Test extensively, make verification optional via config |
| Claude Code can't read screenshots | Evaluation broken | Test multimodal support; fallback to manual verification |
Files Created/Modified
New Files (13):
- `internal/domain/verify.go`
- `internal/domain/verify_test.go`
- `internal/service/verify_service.go`
- `internal/service/verify_service_test.go`
- `internal/handlers/verify.go`
- `internal/handlers/verify_test.go`
- `internal/worker/verify_executor.go`
- `internal/worker/verify_executor_test.go`
- `deployments/k8s/base/playwright-pod.yaml`
- `deployments/k8s/base/playwright-configmap.yaml`
- `deployments/k8s/base/playwright-scripts/capture.js`
- `cookbooks/scripts/visual-verify-test.sh`
- `templates/skeleton/.claude/commands/verify-feature.md`
Modified Files (8):
- `internal/domain/work.go` - Add WorkTaskTypeVerify
- `internal/auth/scopes.go` - Add verify scopes
- `internal/worker/work_executor.go` - Add dispatch case
- `internal/sdlc/types.go` - Add artifact/action types
- `internal/sdlc/rules.go` - Register verification rules
- `internal/sdlc/rules_execution.go` - Add verification rules
- `cookbooks/scripts/common.sh` - Add wait_for_verify()
- `cmd/rdev-api/main.go` - Wire DI