# Visual Verification Implementation Breakdown

**Goal:** Add Playwright-based visual verification to rdev, enabling automated screenshot/video capture of deployed sites and AI-driven feature completeness evaluation. Integrate with SDLC as an optional QA gate and add a cookbook E2E test.

**Estimated Duration:** 4 weeks (assumes ~25 hours/week of focused work)

---

## Week 1: Foundation — Domain + Capture Infrastructure

**Goals:**
- Playwright pod deployed and reachable via `kubectl exec`
- Capture script working end-to-end
- Domain models and work task type in place
- Manual verification via `kubectl exec` confirms capture works

**Tasks:**

### Day 1-2: Playwright Pod Infrastructure

1. **Create Playwright pod manifest** (`deployments/k8s/base/playwright-pod.yaml`)
   - StatefulSet with `mcr.microsoft.com/playwright:v1.50.0-noble` image
   - `sleep infinity` command (stays alive for kubectl exec)
   - Labels: `app: playwright`, `rdev.orchard9.ai/role: playwright`
   - Volumes: `/captures` (emptyDir), `/scripts` (ConfigMap)
   - Resources: 500m CPU / 1Gi request, 2 CPU / 4Gi limit

2. **Create capture script** (`deployments/k8s/base/playwright-scripts/capture.js`)
   - ~60 lines Node.js using Playwright
   - CLI: `--url`, `--viewports` (comma-sep), `--output`, `--wait-for`, `--full-page`, `--video`, `--timeout`
   - Output: JSON manifest to stdout with screenshot paths
   - Error handling: catch navigation failures, timeout gracefully

3. **Create ConfigMap for script** (`deployments/k8s/base/playwright-configmap.yaml`)
   - Mount `capture.js` at `/scripts/capture.js`

4. **Deploy to cluster and test manually**
   ```bash
   kubectl apply -f deployments/k8s/base/playwright-configmap.yaml
   kubectl apply -f deployments/k8s/base/playwright-pod.yaml
   kubectl exec playwright-0 -- node /scripts/capture.js \
     --url=https://example.com --viewports=1920x1080 --output=/captures/test/
   kubectl exec playwright-0 -- cat /captures/test/manifest.json
   ```

### Day 3: Domain Models

5. **Create domain types** (`internal/domain/verify.go`)
   - `VerifySpec` struct with fields: URL, Viewports, WaitFor, WaitTimeout, FullPage, Video, Evaluate, Prompt, SpecPath, CallbackURL
   - `Validate()` method: URL required, callback URL validation (reuse `ValidateCallbackURL`)
   - `VerifyResult` struct: Success, Screenshots, Video, Evaluation, Score, Passed, DurationMs, Error
   - `ToWorkResult()` method (promote screenshots to artifacts map)

6. **Add work task type** (`internal/domain/work.go`)
   - Add `WorkTaskTypeVerify WorkTaskType = "verify"` to constants
   - Update `IsValid()` to include verify

7. **Unit tests** (`internal/domain/verify_test.go`)
   - Test `Validate()` with valid/invalid specs
   - Test `ToWorkResult()` conversion

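
The domain types in task 5 could take roughly the following shape. This is a minimal sketch, not the project's code: the Go field types, the `Screenshot` and `WorkResult` shapes, the `screenshot_<viewport>` artifact key, and the `validateCallbackURL` stand-in (for the existing `ValidateCallbackURL`, whose rules are not shown here) are all assumptions, and `VerifyResult` is trimmed to the fields the conversion needs.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// VerifySpec carries the fields named in the plan; the Go types are assumed.
type VerifySpec struct {
	URL         string
	Viewports   []string
	WaitFor     string
	WaitTimeout int // assumed unit: milliseconds
	FullPage    bool
	Video       bool
	Evaluate    bool
	Prompt      string
	SpecPath    string
	CallbackURL string
}

// validateCallbackURL is a hypothetical stand-in for the existing
// ValidateCallbackURL helper; its real rules are not shown in the plan.
func validateCallbackURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil || u.Host == "" || (u.Scheme != "http" && u.Scheme != "https") {
		return fmt.Errorf("invalid callback url %q", raw)
	}
	return nil
}

// Validate applies the two rules from the plan: URL is required, and a
// callback URL, when present, must pass callback validation.
func (s VerifySpec) Validate() error {
	if s.URL == "" {
		return errors.New("url is required")
	}
	if s.CallbackURL != "" {
		return validateCallbackURL(s.CallbackURL)
	}
	return nil
}

// Screenshot and WorkResult are assumed shapes, for illustration only.
type Screenshot struct {
	Viewport string
	Path     string
}

type WorkResult struct {
	Success   bool
	Artifacts map[string]string
	Error     string
}

// VerifyResult is trimmed to the fields used by ToWorkResult.
type VerifyResult struct {
	Success     bool
	Screenshots []Screenshot
	Error       string
}

// ToWorkResult promotes each screenshot path into the artifacts map,
// keyed by viewport (the key format is an assumption).
func (r VerifyResult) ToWorkResult() WorkResult {
	artifacts := make(map[string]string, len(r.Screenshots))
	for _, s := range r.Screenshots {
		artifacts["screenshot_"+s.Viewport] = s.Path
	}
	return WorkResult{Success: r.Success, Artifacts: artifacts, Error: r.Error}
}

func main() {
	fmt.Println(VerifySpec{}.Validate())
	res := VerifyResult{Success: true, Screenshots: []Screenshot{{"1920x1080", "/captures/test/1920x1080.png"}}}
	fmt.Println(res.ToWorkResult().Artifacts)
}
```

Keeping `Validate()` free of I/O makes the valid/invalid cases in task 7 straightforward to cover with table-driven tests.
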
### Day 4-5: Verify Executor (Capture Only)

8. **Create verify executor** (`internal/worker/verify_executor.go`)
   - Follow `BuildExecutor` pattern exactly
   - `Execute(ctx, task)` method:
     - Parse VerifySpec from task.Spec map
     - Build kubectl exec command: `kubectl exec playwright-0 -- node /scripts/capture.js --url=X ...`
     - Execute via existing `CommandExecutor` port
     - Parse JSON manifest from stdout
     - Return `BuildResult` with artifacts map containing screenshot paths
   - Config struct: `VerifyExecutorConfig` with playwright pod name, namespace
   - Constructor: `NewVerifyExecutor(executor, streams, logger, cfg)`

9. **Wire executor to WorkExecutor** (`internal/worker/work_executor.go`)
   - Add `verifyExec *VerifyExecutor` field
   - Add case in `executeTask()` switch for `WorkTaskTypeVerify`
   - Update `NewWorkExecutor()` to accept VerifyExecutor

10. **Unit tests** (`internal/worker/verify_executor_test.go`)
    - Mock CommandExecutor to return capture manifest JSON
    - Test successful capture with multiple viewports
    - Test failure handling (command fails, invalid JSON)

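
The two pure steps inside `Execute` (building the `kubectl exec` arguments and parsing the manifest) can be sketched in isolation, which is also what makes them easy to test with a mocked `CommandExecutor`. The manifest shape (`screenshots` entries with `viewport` and `path`), the helper names `captureArgs` and `parseManifest`, and the `default` namespace are assumptions for illustration; the real `capture.js` output format is not specified in this plan.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// VerifyExecutorConfig holds the pod name and namespace, as in the plan.
type VerifyExecutorConfig struct {
	PodName   string
	Namespace string
}

// captureArgs builds the kubectl argument list for one capture run.
// Passing arguments as a slice (not one shell string) avoids quoting issues.
func captureArgs(cfg VerifyExecutorConfig, url string, viewports []string, output string) []string {
	return []string{
		"exec", "-n", cfg.Namespace, cfg.PodName, "--",
		"node", "/scripts/capture.js",
		"--url=" + url,
		"--viewports=" + strings.Join(viewports, ","),
		"--output=" + output,
	}
}

// manifest is an assumed shape for the JSON that capture.js prints.
type manifest struct {
	Screenshots []struct {
		Viewport string `json:"viewport"`
		Path     string `json:"path"`
	} `json:"screenshots"`
}

// parseManifest turns capture.js stdout into a viewport-to-path map.
func parseManifest(stdout []byte) (map[string]string, error) {
	var m manifest
	if err := json.Unmarshal(stdout, &m); err != nil {
		return nil, fmt.Errorf("parse capture manifest: %w", err)
	}
	artifacts := make(map[string]string, len(m.Screenshots))
	for _, s := range m.Screenshots {
		artifacts[s.Viewport] = s.Path
	}
	return artifacts, nil
}

func main() {
	cfg := VerifyExecutorConfig{PodName: "playwright-0", Namespace: "default"}
	fmt.Println(captureArgs(cfg, "https://example.com", []string{"1920x1080", "375x667"}, "/captures/test/"))

	stdout := []byte(`{"screenshots":[{"viewport":"1920x1080","path":"/captures/test/1920x1080.png"}]}`)
	artifacts, err := parseManifest(stdout)
	fmt.Println(artifacts, err)
}
```
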
**Deliverables:**
- [ ] Playwright pod running in cluster
- [ ] Capture script takes screenshots successfully
- [ ] VerifySpec/VerifyResult domain types with tests
- [ ] VerifyExecutor can dispatch capture via kubectl exec
- [ ] Work queue can dispatch verify tasks (manual test via SQL insert)

**Foundation this enables:**
- Week 2 can build API layer knowing capture works
- Executor pattern established for AI evaluation later

---

## Week 2: API Layer + Manual E2E

**Goals:**
- Full API surface: `POST /projects/{id}/verify`, `GET /projects/{id}/verifications`, `GET /verify/{taskId}`
- Auth scopes configured
- Manual E2E working: API call → queue → capture → result
- Initial release candidate deployed to staging

**Tasks:**

### Day 1: Auth and Service Layer

1. **Add auth scopes** (`internal/auth/scopes.go`)
   - `ScopeVerifyRead Scope = "verify:read"`
   - `ScopeVerifyWrite Scope = "verify:write"`
   - Add to `AllScopes` if needed

2. **Create verify service** (`internal/service/verify_service.go`)
   - Follow `BuildService` pattern
   - `StartVerify(ctx, projectID, spec)` → validate, enqueue task, return task ID
   - `GetVerifyStatus(ctx, taskID)` → get task from work queue
   - `ListVerifications(ctx, projectID, limit)` → list tasks by project
   - Dependencies: WorkQueue port (existing)

3. **Unit tests** (`internal/service/verify_service_test.go`)
   - Mock work queue
   - Test enqueue, status, list

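
As a rough illustration of the service layer in task 2, the sketch below wires the three methods to a small queue interface and an in-memory fake of the kind the unit tests in task 3 would use. The `WorkQueue` method names (`Enqueue`, `Get`, `ListByProject`), the `WorkTask` fields, the `pending` status, and the `newDemo` helper are hypothetical; the existing port may look different.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// WorkTask is an assumed, trimmed view of a queued task.
type WorkTask struct {
	ID, ProjectID, Type, Status string
	Spec                        map[string]any
}

// WorkQueue is a hypothetical reduction of the existing port.
type WorkQueue interface {
	Enqueue(ctx context.Context, task WorkTask) (string, error)
	Get(ctx context.Context, id string) (WorkTask, error)
	ListByProject(ctx context.Context, projectID string, limit int) ([]WorkTask, error)
}

type VerifyService struct{ queue WorkQueue }

// StartVerify validates the spec, enqueues a verify task, and returns its ID.
func (s *VerifyService) StartVerify(ctx context.Context, projectID string, spec map[string]any) (string, error) {
	if u, _ := spec["url"].(string); u == "" {
		return "", errors.New("url is required")
	}
	return s.queue.Enqueue(ctx, WorkTask{ProjectID: projectID, Type: "verify", Status: "pending", Spec: spec})
}

func (s *VerifyService) GetVerifyStatus(ctx context.Context, taskID string) (WorkTask, error) {
	return s.queue.Get(ctx, taskID)
}

func (s *VerifyService) ListVerifications(ctx context.Context, projectID string, limit int) ([]WorkTask, error) {
	return s.queue.ListByProject(ctx, projectID, limit)
}

// memQueue is an in-memory fake, suitable only for tests and demos.
type memQueue struct{ tasks []WorkTask }

func (q *memQueue) Enqueue(_ context.Context, t WorkTask) (string, error) {
	t.ID = fmt.Sprintf("task-%d", len(q.tasks)+1)
	q.tasks = append(q.tasks, t)
	return t.ID, nil
}

func (q *memQueue) Get(_ context.Context, id string) (WorkTask, error) {
	for _, t := range q.tasks {
		if t.ID == id {
			return t, nil
		}
	}
	return WorkTask{}, errors.New("task not found")
}

func (q *memQueue) ListByProject(_ context.Context, projectID string, limit int) ([]WorkTask, error) {
	var out []WorkTask
	for _, t := range q.tasks {
		if t.ProjectID == projectID && len(out) < limit {
			out = append(out, t)
		}
	}
	return out, nil
}

// newDemo returns a context and a service backed by the in-memory fake.
func newDemo() (context.Context, *VerifyService) {
	return context.Background(), &VerifyService{queue: &memQueue{}}
}

func main() {
	ctx, svc := newDemo()
	id, err := svc.StartVerify(ctx, "myproject", map[string]any{"url": "https://myproject.threesix.ai"})
	fmt.Println(id, err)
	task, _ := svc.GetVerifyStatus(ctx, id)
	fmt.Println(task.Type, task.Status)
}
```

Because the service depends only on the interface, the same fake covers the enqueue, status, and list tests without a database.
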
### Day 2-3: Handler Layer

4. **Create verify handler** (`internal/handlers/verify.go`)
   - Follow `BuildsHandler` pattern exactly
   - `Mount(r api.Router)` with scopes:
     - POST `/projects/{id}/verify` → ScopeVerifyWrite
     - GET `/projects/{id}/verifications` → ScopeVerifyRead
     - GET `/verify/{taskId}` → ScopeVerifyRead
   - Use `api.DecodeJSON()`, `validate.New()`, response helpers
   - Request struct: `VerifyRequest` matching VerifySpec
   - Response structs: match existing patterns

5. **Wire DI** (`cmd/rdev-api/main.go`)
   - Create VerifyExecutor in worker setup
   - Create VerifyService
   - Create VerifyHandler
   - Mount routes

6. **Handler tests** (`internal/handlers/verify_test.go`)
   - Test POST with valid/invalid specs
   - Test auth scope enforcement
   - Test GET status/list

### Day 4: SSE Events

7. **Add verify events** (`internal/worker/verify_executor.go`)
   - Publish events via StreamPublisher:
     - `verify.started` - task claimed
     - `verify.capturing` - starting capture
     - `verify.captured` - capture complete with manifest
     - `verify.completed` / `verify.failed` - final status
   - Event constants in verify_executor.go (follow BuildExecutor pattern)

### Day 5: Manual E2E + Deploy

8. **Manual E2E test sequence**
   ```bash
   # 1. Start verification
   curl -X POST $RDEV_API_URL/projects/myproject/verify \
     -H "X-API-Key: $RDEV_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{"url": "https://myproject.threesix.ai", "viewports": ["1920x1080"]}'
   # Response: {"task_id": "xxx"}

   # 2. Poll for completion
   curl $RDEV_API_URL/verify/xxx -H "X-API-Key: $RDEV_API_KEY"
   # Response: screenshots in artifacts
   ```

9. **Build and deploy**
   ```bash
   ./scripts/release.sh v0.11.0 "feat: add visual verification (capture-only MVP)" --deploy
   ```

**Deliverables:**
- [ ] Auth scopes for verify:read/write
- [ ] VerifyService with enqueue/status/list
- [ ] VerifyHandler with 3 endpoints
- [ ] SSE events for verification progress
- [ ] Deployed to staging, manual E2E passing

**Foundation this enables:**
- Week 3 can add AI evaluation knowing API works
- Cookbook script can use standard api_call() pattern

---

## Week 3: AI Evaluation + Cookbook Test

**Goals:**
- AI evaluation path working (Claude reads screenshots, returns verdict)
- Cookbook E2E test script: `visual-verify-test.sh`
- `wait_for_verify()` added to the common.sh utilities
- Full E2E passing in CI

**Tasks:**

### Day 1-2: AI Evaluation Path

1. **Add evaluation to VerifyExecutor** (`internal/worker/verify_executor.go`)
   - After successful capture, if `spec.Evaluate`:
     - Build evaluation prompt: "Compare these screenshots against the specification..."
     - Include spec.Prompt or read spec.SpecPath content
     - Call Claude Code via CodeAgentRegistry
     - Pass screenshots as attachments (file paths in pod)
   - Parse evaluation output for score (look for "Score: XX/100" pattern)
   - Set result.Evaluation, result.Score, result.Passed

2. **Evaluation prompt template** (hardcoded in executor for now)
   ```
   Evaluate these screenshots against the following specification:

   {spec.Prompt or contents of spec.SpecPath}

   For each screenshot, assess:
   1. Does the UI match the specification?
   2. Are all required elements present?
   3. Is the layout correct at this viewport?

   End with: "Score: XX/100" and "PASSED" or "FAILED"
   ```

3. **Handle partial failures** (`internal/worker/verify_executor.go`)
   - If capture succeeds but evaluation fails:
     - Set success=true (screenshots are still useful)
     - Leave evaluation=""
     - Log warning

4. **Unit tests for evaluation path**
   - Mock CodeAgentRegistry
   - Test evaluation output parsing
   - Test partial failure handling

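
The score extraction in task 1 depends on the model following the "Score: XX/100" and "PASSED"/"FAILED" convention from the prompt template. One way to parse that output is sketched below; the function name `parseEvaluation`, the rule that the last verdict word wins, and the rejection of scores above 100 are assumptions rather than decided behavior.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

var (
	// scoreRe matches "Score: 85/100", tolerating extra whitespace.
	scoreRe = regexp.MustCompile(`Score:\s*(\d{1,3})\s*/\s*100`)
	// verdictRe matches the standalone verdict words from the prompt.
	verdictRe = regexp.MustCompile(`\b(PASSED|FAILED)\b`)
)

// parseEvaluation extracts the score and verdict from the agent's reply.
// ok is false when no usable "Score: XX/100" pattern is found, which the
// executor can treat as an evaluation failure (partial success).
func parseEvaluation(out string) (score int, passed bool, ok bool) {
	m := scoreRe.FindStringSubmatch(out)
	if m == nil {
		return 0, false, false
	}
	n, err := strconv.Atoi(m[1])
	if err != nil || n > 100 {
		return 0, false, false
	}
	// Assumption: if the reply mentions several verdicts, the last one wins.
	verdicts := verdictRe.FindAllString(out, -1)
	passed = len(verdicts) > 0 && verdicts[len(verdicts)-1] == "PASSED"
	return n, passed, true
}

func main() {
	reply := "Hero section present at both viewports.\nScore: 85/100\nPASSED"
	fmt.Println(parseEvaluation(reply))
	fmt.Println(parseEvaluation("The model returned free-form text."))
}
```

Returning `ok=false` instead of an error keeps the partial-failure rule from task 3 simple: the executor leaves `Evaluation` empty, logs a warning, and still reports the capture as successful.
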
### Day 3-4: Cookbook Test Script

5. **Add utility to common.sh** (`cookbooks/scripts/common.sh`)
   ```bash
   # Wait for verification to complete
   # Arguments: task_id [max_attempts] [poll_interval]
   wait_for_verify() {
       local task_id="$1"
       local max_attempts="${2:-30}"
       local poll_interval="${3:-5}"
       # Poll GET /verify/{task_id} until completed/failed
   }
   ```

6. **Create visual-verify-test.sh** (`cookbooks/scripts/visual-verify-test.sh`)
   - Follow cookbook script SKILL.md patterns exactly
   - Commands: run, status, diagnose, teardown
   - Flow:
     1. Create composable project with app-astro component
     2. Wait for initial deploy (site is live)
     3. Start build: "Create a hero section with a call-to-action button"
     4. Wait for build to complete
     5. Wait for CI pipeline
     6. Wait for site to respond
     7. Start verification: `POST /projects/{id}/verify {url, evaluate: true, prompt: ...}`
     8. Wait for verify to complete
     9. Assert: result.passed == true OR result.score >= 70
     10. Teardown

7. **Add auto-teardown support**
   - Parse `--auto-teardown` flag
   - Register cleanup trap
   - Set CLEANUP_PROJECT

### Day 5: Integration + CI

8. **Test locally**
   ```bash
   ./cookbooks/scripts/visual-verify-test.sh run vv-test --auto-teardown
   ```

9. **Add to CI** (if CI runs cookbook tests)
   - Add visual-verify-test to test matrix
   - Ensure playwright-0 pod is available in test environment

10. **Document in cookbook skill** (`.claude/skills/cookbook-scripts/SKILL.md`)
    - Add `wait_for_verify()` to utilities list
    - Add visual-verify-test.sh to examples

**Deliverables:**
- [ ] AI evaluation working with score extraction
- [ ] Partial failure handling (capture ok, eval fail)
- [ ] wait_for_verify() in common.sh
- [ ] visual-verify-test.sh passing end-to-end
- [ ] Documentation updated

**Foundation this enables:**
- Week 4 can add SDLC integration knowing full flow works
- Cookbook pattern established for future tests

---

## Week 4: SDLC Integration + Polish

**Goals:**
- Visual verification as optional SDLC gate between QA and merge
- Skeleton command: `/verify-feature`
- Build chaining: auto-verify after deploy
- Release v0.12.0 with full feature

**Tasks:**

### Day 1-2: SDLC Types and Rules

1. **Add artifact type** (`internal/sdlc/types.go`)
   - `ArtifactVerification ArtifactType = "verification"`
   - Add to `ValidArtifactTypes` slice
   - Add case in `ArtifactFilename()` → returns `"verification.md"`

2. **Add action types** (`internal/sdlc/types.go`)
   - `ActionVerifyFeature ActionType = "VERIFY_FEATURE"`
   - `ActionFixVerificationIssues ActionType = "FIX_VERIFICATION_ISSUES"`

3. **Add classifier rules** (`internal/sdlc/rules_execution.go`)
   - `needsVerificationRule()`:
     - Condition: Phase=QA, qa_results=passed, verification=nil or pending
     - Action: ActionVerifyFeature
     - NextCommand: "/verify-feature {slug}"
   - `verificationFailedRule()`:
     - Condition: Phase=QA, verification=failed
     - Action: ActionFixVerificationIssues
     - NextCommand: "/fix-verification-issues {slug}"
   - `verificationPassedRule()`:
     - Condition: Phase=QA, qa_results=passed, verification=passed
     - Action: ActionTransition to PhaseMerge

4. **Update rule ordering** (`internal/sdlc/rules.go`)
   - Insert verification rules after qaPassedRule
   - Update qaPassedRule: only transition if verification also passed OR feature doesn't require verification (config flag)

5. **Unit tests** (`internal/sdlc/rules_execution_test.go`)
   - Test all three verification rules
   - Test interaction with existing QA rules

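
The three classifier rules in task 3 amount to a small decision table over the feature state. The sketch below collapses them into one function to show the intended outcomes; the `FeatureState` and `Decision` types, the string values for phases and statuses, and the `ActionTransition` constant value are assumptions, and the real rules are separate functions registered in `rules.go`.

```go
package main

import "fmt"

type Phase string
type ActionType string

const (
	PhaseQA    Phase = "qa"    // assumed value
	PhaseMerge Phase = "merge" // assumed value

	ActionVerifyFeature         ActionType = "VERIFY_FEATURE"
	ActionFixVerificationIssues ActionType = "FIX_VERIFICATION_ISSUES"
	ActionTransition            ActionType = "TRANSITION" // assumed value
)

// FeatureState is a hypothetical, trimmed view of what the classifier reads.
type FeatureState struct {
	Slug         string
	Phase        Phase
	QAResults    string // "passed", "failed", or "" when absent
	Verification string // "", "pending", "passed", or "failed"
}

// Decision is an assumed result shape for a matched rule.
type Decision struct {
	Action      ActionType
	NextCommand string
	Target      Phase
}

// classifyVerification applies the three verification rules in order and
// reports whether any of them matched.
func classifyVerification(s FeatureState) (Decision, bool) {
	if s.Phase != PhaseQA {
		return Decision{}, false
	}
	switch {
	case s.Verification == "failed":
		return Decision{Action: ActionFixVerificationIssues, NextCommand: "/fix-verification-issues " + s.Slug}, true
	case s.QAResults == "passed" && (s.Verification == "" || s.Verification == "pending"):
		return Decision{Action: ActionVerifyFeature, NextCommand: "/verify-feature " + s.Slug}, true
	case s.QAResults == "passed" && s.Verification == "passed":
		return Decision{Action: ActionTransition, Target: PhaseMerge}, true
	}
	return Decision{}, false
}

func main() {
	d, ok := classifyVerification(FeatureState{Slug: "hero", Phase: PhaseQA, QAResults: "passed"})
	fmt.Println(d.Action, d.NextCommand, ok)
}
```

Writing the table out this way also surfaces the case the rule-ordering task has to cover: QA passed with no verification artifact must not fall through to the existing `qaPassedRule` transition unless verification is disabled by config.
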
### Day 3: Skeleton Command

6. **Create verify-feature command** (embedded template: `templates/skeleton/.claude/commands/verify-feature.md`)
   ````markdown
   ---
   description: Visually verify a deployed feature
   argument-hint: <feature-slug>
   allowed-tools: Bash, Read, Write, Edit, Glob, Grep
   ---

   Visually verify feature: $ARGUMENTS

   ## Instructions

   1. Load feature spec from `.sdlc/features/$ARGUMENTS/spec.md`
   2. Get project domain from CLAUDE.md or config
   3. Determine the deployed URL
   4. Execute verification via rdev API (if available) or Playwright directly
   5. Write results to `.sdlc/features/$ARGUMENTS/verification.md`
   6. Register artifact: `sdlc artifact create $ARGUMENTS verification`

   ## Output Format

   Write `.sdlc/features/$ARGUMENTS/verification.md`:

   ```markdown
   # Visual Verification: [Feature Title]

   ## Screenshots

   | Viewport | Status | Notes |
   |----------|--------|-------|
   | Desktop (1920x1080) | PASS | All elements visible |
   | Mobile (375x667) | PASS | Responsive layout correct |

   ## Evaluation

   [AI or manual evaluation notes]

   ## Result

   **Status:** PASSED
   **Score:** 95/100
   ```
   ````

7. **Update skeleton template** to include the command
   - Ensure new projects get verify-feature.md

### Day 4: Build Chaining (Optional)

8. **Add verify_after to BuildSpec** (`internal/domain/build.go`)
   - `VerifyAfter bool` - auto-verify after successful deploy
   - `VerifyURL string` - URL to verify (if different from project domain)

9. **Chain verification in BuildExecutor** (`internal/worker/build_executor.go`)
   - After successful build + push (line ~270):
     ```go
     if spec.VerifyAfter && spec.VerifyURL != "" {
         // Enqueue verify task
     }
     ```
   - Or: callback webhook triggers external verification

10. **Update build handler** to accept verify_after/verify_url

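
The field description for `VerifyURL` ("if different from project domain") suggests it is optional, while the fragment in task 9 only chains when it is set. One possible reading, sketched here purely as an assumption, is to fall back to the project domain when `VerifyURL` is empty. The helper name `verifyTarget`, the `projectDomain` parameter, and the `https://` prefix are hypothetical.

```go
package main

import "fmt"

// BuildSpec is trimmed to the two fields added in this task.
type BuildSpec struct {
	VerifyAfter bool
	VerifyURL   string
}

// verifyTarget reports which URL to verify after a successful build and
// whether a verify task should be enqueued at all.
func verifyTarget(spec BuildSpec, projectDomain string) (string, bool) {
	if !spec.VerifyAfter {
		return "", false
	}
	if spec.VerifyURL != "" {
		return spec.VerifyURL, true
	}
	if projectDomain == "" {
		// Nothing to verify against; skip chaining rather than guess.
		return "", false
	}
	return "https://" + projectDomain, true
}

func main() {
	fmt.Println(verifyTarget(BuildSpec{VerifyAfter: true}, "myproject.threesix.ai"))
	fmt.Println(verifyTarget(BuildSpec{VerifyAfter: true, VerifyURL: "https://staging.example.com"}, "myproject.threesix.ai"))
	fmt.Println(verifyTarget(BuildSpec{}, "myproject.threesix.ai"))
}
```

If the stricter behavior in the fragment is what is intended, the fallback branch can simply be dropped and `VerifyURL` documented as required whenever `VerifyAfter` is true.
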
### Day 5: Documentation + Release

11. **Update documentation**
    - CLAUDE.md: Update platform status to "Done"
    - visual-verification.md: Add SDLC integration examples
    - sdlc.md: Document verification rules

12. **Integration test**
    - Test full SDLC flow with verification gate
    - Test classifier transitions correctly

13. **Final release**
    ```bash
    ./scripts/release.sh v0.12.0 "feat: visual verification with SDLC integration" --deploy
    ```

**Deliverables:**
- [ ] ArtifactVerification type in SDLC
- [ ] 3 classifier rules for verification gate
- [ ] verify-feature.md skeleton command
- [ ] Build chaining (verify_after flag)
- [ ] Full integration test passing
- [ ] v0.12.0 released

---

## Summary

| Week | Theme | Key Output |
|------|-------|------------|
| 1 | Foundation | Playwright pod + capture script + domain types + executor |
| 2 | API Layer | Handlers + service + auth scopes + manual E2E |
| 3 | AI + Cookbook | Evaluation path + visual-verify-test.sh + common.sh utils |
| 4 | SDLC + Polish | Classifier rules + skeleton command + build chaining + release |

## Risks and Mitigations

| Risk | Impact | Mitigation |
|------|--------|------------|
| Playwright pod OOM | Capture fails | Start with a 4Gi memory limit, tune based on usage |
| AI evaluation unreliable | Poor pass/fail decisions | Start with a pass threshold of 70/100 and tune; keep partial success mode (capture kept when evaluation fails) |
| Screenshot storage fills up | Pod crashes | EmptyDir for now, add cleanup job or PVC later |
| SDLC rules conflict | Features stuck | Test extensively, make verification optional via config |
| Claude Code can't read screenshots | Evaluation broken | Test multimodal support; fallback to manual verification |

## Files Created/Modified

**New Files (13):**
- `internal/domain/verify.go`
- `internal/domain/verify_test.go`
- `internal/service/verify_service.go`
- `internal/service/verify_service_test.go`
- `internal/handlers/verify.go`
- `internal/handlers/verify_test.go`
- `internal/worker/verify_executor.go`
- `internal/worker/verify_executor_test.go`
- `deployments/k8s/base/playwright-pod.yaml`
- `deployments/k8s/base/playwright-configmap.yaml`
- `deployments/k8s/base/playwright-scripts/capture.js`
- `cookbooks/scripts/visual-verify-test.sh`
- `templates/skeleton/.claude/commands/verify-feature.md`

**Modified Files (8):**
- `internal/domain/work.go` - Add WorkTaskTypeVerify
- `internal/auth/scopes.go` - Add verify scopes
- `internal/worker/work_executor.go` - Add dispatch case
- `internal/sdlc/types.go` - Add artifact/action types
- `internal/sdlc/rules.go` - Register verification rules
- `internal/sdlc/rules_execution.go` - Add verification rules
- `cookbooks/scripts/common.sh` - Add wait_for_verify()
- `cmd/rdev-api/main.go` - Wire DI