# Media Handling Specification

**Version:** 1.0
**Status:** Implementation
**Owner:** Platform Team
**Last Updated:** 2026-02-08

## Overview

This specification defines media handling for rdev, enabling generated projects to store and serve user-uploaded files (images, videos, documents) via Google Cloud Storage. The implementation follows rdev's established patterns for infrastructure provisioning, skeleton packages, and component templates.

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                     rdev Platform Layer                     │
│  ┌───────────────────────────────────────────────────────┐  │
│  │  GCS Provisioner (internal/adapter/gcs)               │  │
│  │  - Creates per-project buckets                        │  │
│  │  - Generates service account credentials              │  │
│  │  - Stores credentials in rdev credential store        │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
                            ↓ credentials
┌─────────────────────────────────────────────────────────────┐
│                   Generated Project Layer                   │
│  ┌───────────────────────────────────────────────────────┐  │
│  │  pkg/storage (skeleton package)                       │  │
│  │  - Storage interface abstraction                      │  │
│  │  - GCS implementation                                 │  │
│  │  - Memory implementation (testing)                    │  │
│  │  - Signed URL generation                              │  │
│  └───────────────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────────────┐  │
│  │  media-upload Component (optional)                    │  │
│  │  - HTTP upload/download/delete endpoints              │  │
│  │  - File validation                                    │  │
│  │  - Path management                                    │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────────┐
│               Google Cloud Storage (External)               │
│  - Per-project buckets (project-{name}-media)               │
│  - Per-project service accounts                             │
│  - IAM bindings (objectAdmin)                               │
│  - Lifecycle rules (temp/ auto-cleanup)                     │
│  - CORS configuration                                       │
└─────────────────────────────────────────────────────────────┘
```

## Storage Patterns

### Path Conventions

Projects should organize objects using consistent path patterns:

```
uploads/{user_id}/{timestamp}-{filename}     # User-uploaded files
avatars/{user_id}.jpg                        # User avatars
temp/{session_id}/{filename}                 # Temporary files (auto-delete after 24h)
public/{category}/{filename}                 # Public assets (logos, etc.)
private/{user_id}/{document_id}/{filename}   # Private documents (signed URLs only)
```

### Content Types

Supported MIME types:

- **Images:** `image/jpeg`, `image/png`, `image/gif`, `image/webp`, `image/svg+xml`
- **Videos:** `video/mp4`, `video/webm`, `video/quicktime`
- **Documents:** `application/pdf`, `application/msword`, `application/vnd.openxmlformats-officedocument.*`
- **Archives:** `application/zip`, `application/x-tar`, `application/gzip`

### Size Limits

- **Default max upload:** 100MB per file
- **Component template:** Configurable via `MAX_UPLOAD_SIZE` env var
- **GCS bucket:** No hard limit (quota-based)

### TTL and Expiry

Lifecycle rules automatically delete objects:

- `temp/*` paths: 24 hours
- Users can configure custom rules in the GCS console

## Security Model

### Authentication Flow

1. **Provisioning Time** (rdev API):
   - Create GCS bucket: `project-{name}-media`
   - Create service account: `project-{name}-storage@{gcp-project}.iam.gserviceaccount.com`
   - Grant IAM role: `roles/storage.objectAdmin` on bucket
   - Generate service account JSON key
   - Store credentials in rdev credential store (encrypted)

2. **Runtime** (generated project):
   - Read credentials from env vars: `GCS_BUCKET`, `GCS_SERVICE_ACCOUNT_JSON`
   - Initialize `pkg/storage` client with service account JSON
   - The client authenticates to GCS as the service account identified by that key

### IAM Roles

Per-project service accounts have **isolated permissions**:

- **Bucket-scoped:** `roles/storage.objectAdmin` (CRUD on objects)
- **No cross-project access:** Service account A cannot access bucket B
- **No IAM permissions:** Cannot modify IAM policies or create resources

### Signed URLs

For temporary access without service account credentials:

```go
// Error handling is omitted for brevity; check both errors in real code.
// Generate read URL (1 hour expiry)
signedURL, _ := storageClient.SignURL(ctx, "uploads/photo.jpg", time.Hour, false)

// Generate write URL (15 min expiry) for client-side uploads
uploadURL, _ := storageClient.SignURL(ctx, "uploads/photo.jpg", 15*time.Minute, true)
```

**Use cases:**

- Direct browser downloads (avoid proxying through the API)
- Client-side uploads (the browser sends the file directly to GCS, bypassing the API)
- Sharing files with external users (time-limited links)

### CORS Configuration

Buckets are created with CORS rules:

```yaml
MaxAge: 3600
Methods: [GET, POST, PUT, DELETE, OPTIONS]
Origins: ["https://*.threesix.ai"]
ResponseHeaders: ["Content-Type", "ETag"]
```

Projects should override origins for custom domains.
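
As an illustration of overriding origins for custom domains, the following is a minimal sketch using only the standard library. The `corsRule` type and the `defaultCORSRule` and `withCustomDomains` helpers are hypothetical names introduced here, not part of rdev or `pkg/storage`; the struct simply mirrors the fields of the YAML rule above. In a real project the resulting values would presumably be copied into the GCS client's CORS settings when the bucket is created or updated.

```go
package main

import (
	"fmt"
	"strings"
)

// corsRule mirrors the fields of the YAML rule in this section.
// It is a hypothetical local type used only for illustration.
type corsRule struct {
	MaxAgeSeconds   int
	Methods         []string
	Origins         []string
	ResponseHeaders []string
}

// defaultCORSRule returns the rule that buckets are created with.
func defaultCORSRule() corsRule {
	return corsRule{
		MaxAgeSeconds:   3600,
		Methods:         []string{"GET", "POST", "PUT", "DELETE", "OPTIONS"},
		Origins:         []string{"https://*.threesix.ai"},
		ResponseHeaders: []string{"Content-Type", "ETag"},
	}
}

// withCustomDomains replaces the default origins with HTTPS origins for the
// given hostnames. Blank entries are skipped; if no usable hostname remains,
// the rule is returned with its origins unchanged.
func withCustomDomains(rule corsRule, domains []string) corsRule {
	var origins []string
	for _, d := range domains {
		d = strings.TrimSpace(d)
		if d == "" {
			continue
		}
		origins = append(origins, "https://"+d)
	}
	if len(origins) > 0 {
		rule.Origins = origins
	}
	return rule
}

func main() {
	rule := withCustomDomains(defaultCORSRule(), []string{"app.example.com", " media.example.com "})
	fmt.Println(rule.Origins)
}
```

Replacing the origin list, rather than appending to it, keeps the wildcard `https://*.threesix.ai` origin from remaining allowed on a project that has moved to its own domain. Whether that is the desired behavior is a design choice this sketch assumes.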

## API Standards

### Upload Endpoint

**Request:**

```http
POST /api/media-upload/upload
Content-Type: multipart/form-data; boundary=boundary

--boundary
Content-Disposition: form-data; name="file"; filename="photo.jpg"
Content-Type: image/jpeg

(binary file contents)
--boundary--
```

**Response (201 Created):**

```json
{
  "data": {
    "url": "https://storage.googleapis.com/project-myapp-media/uploads/1706889600-photo.jpg",
    "path": "uploads/1706889600-photo.jpg",
    "filename": "photo.jpg",
    "size": 245678
  }
}
```

**Error Responses:**

- `400 Bad Request`: File too large, invalid form, missing file
- `500 Internal Server Error`: Upload failed (GCS error)

### Download Endpoint

**Request:**

```http
GET /api/media-upload/download/uploads/1706889600-photo.jpg
```

**Response (307 Temporary Redirect):**

```http
HTTP/1.1 307 Temporary Redirect
Location: https://storage.googleapis.com/project-myapp-media/uploads/1706889600-photo.jpg?X-Goog-Algorithm=...
```

**Error Responses:**

- `404 Not Found`: File does not exist

### Delete Endpoint

**Request:**

```http
DELETE /api/media-upload/delete/uploads/1706889600-photo.jpg
```

**Response (204 No Content):**

```http
HTTP/1.1 204 No Content
```

**Error Responses:**

- `404 Not Found`: File does not exist
- `500 Internal Server Error`: Delete failed

### Rate Limiting

The component template should include rate limiting:

- **Upload:** 10 requests/minute per IP
- **Download:** 100 requests/minute per IP
- **Delete:** 10 requests/minute per IP

Use the `github.com/go-chi/httprate` middleware.

## Testing Strategy

### Unit Tests

**Platform (GCS Provisioner):**

- `TestSanitizeForGCP`: Validate bucket name sanitization
- `TestBucketNameFor`: Validate bucket naming convention
- `TestServiceAccountEmailFor`: Validate service account email format

**Skeleton (pkg/storage):**

- `TestMemoryStorage`: Verify in-memory implementation
- `TestUploadOptions`: Validate option handling
- `TestErrorHandling`: Verify error types (ErrNotFound, etc.)
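
To make the provisioner unit tests above concrete, here is a minimal sketch of the kind of naming helpers they would exercise. The function names `sanitizeForGCP`, `bucketNameFor`, and `serviceAccountEmailFor` are inferred from the test names and are assumptions, as are the sanitization rules (lowercase, runs of other characters collapsed to a single dash, dashes trimmed from both ends). Only the `project-{name}-media` and `project-{name}-storage@{gcp-project}.iam.gserviceaccount.com` formats come from this specification.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeForGCP lowercases name and collapses every run of characters
// outside [a-z0-9] into a single dash, then trims leading and trailing
// dashes. These rules are an assumption, not the rdev implementation.
func sanitizeForGCP(name string) string {
	var b strings.Builder
	lastDash := false
	for _, r := range strings.ToLower(name) {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') {
			b.WriteRune(r)
			lastDash = false
			continue
		}
		if !lastDash {
			b.WriteByte('-')
			lastDash = true
		}
	}
	return strings.Trim(b.String(), "-")
}

// bucketNameFor applies the project-{name}-media convention.
func bucketNameFor(project string) string {
	return "project-" + sanitizeForGCP(project) + "-media"
}

// serviceAccountEmailFor applies the
// project-{name}-storage@{gcp-project}.iam.gserviceaccount.com convention.
func serviceAccountEmailFor(project, gcpProject string) string {
	return fmt.Sprintf("project-%s-storage@%s.iam.gserviceaccount.com",
		sanitizeForGCP(project), gcpProject)
}

func main() {
	fmt.Println(bucketNameFor("My_App v2!"))
	fmt.Println(serviceAccountEmailFor("My_App v2!", "example-gcp"))
}
```

GCP limits service account IDs to 30 characters and bucket names to 63, so a real implementation would also need to truncate or hash long project names; that handling is not shown here.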

### Integration Tests

**GCS Provisioner:**

```go
// Requires GCS_TEST_PROJECT_ID and GCS_TEST_CREDENTIALS_PATH env vars.
// Construction of ctx and provisioner is elided.
func TestGCSProvisionerIntegration(t *testing.T) {
	// Create bucket
	creds, err := provisioner.CreateProjectBucket(ctx, "test-project-123")
	if err != nil {
		t.Fatalf("CreateProjectBucket: %v", err)
	}
	// Cleanup runs even if a verification step below fails
	defer provisioner.DeleteProjectBucket(ctx, "test-project-123", true)

	// Verify bucket exists in GCS (using creds)
	// Verify service account exists
	// Verify IAM bindings
	_ = creds // placeholder until the verification steps use it
}
```

**pkg/storage:**

```go
// Requires test GCS bucket
func TestGCSStorageIntegration(t *testing.T) {
	// Upload file
	// Verify file exists
	// Download file
	// Delete file
}
```

### E2E Tests (Cookbook Tree)

See `cookbooks/trees/media-upload-flow.yaml`:

1. Create project
2. Provision GCS component
3. Add media-upload service component
4. Wait for CI/CD pipeline
5. Test upload endpoint
6. Verify file in GCS bucket
7. Test download endpoint
8. Cleanup (delete project, bucket)

### Mocks

For projects using `pkg/storage`, provide a mock implementation:

```go
// pkg/storage/mock.go (generated projects can create this)
type MockStorage struct {
	UploadFunc   func(ctx context.Context, path string, r io.Reader, opts UploadOptions) (string, error)
	DownloadFunc func(ctx context.Context, path string) (io.ReadCloser, *ObjectAttrs, error)
	// ... other methods
}
```

## Operational Concerns

### Bucket Lifecycle

**Creation:**

- Triggered by `POST /projects/{id}/components` with `type=gcs`
- Returns immediately after bucket + credentials created
- Credentials stored in rdev credential store

**Deletion:**

- Triggered by `DELETE /projects/{id}`
- Deletes all objects first (if `force=true`)
- Deletes bucket
- Deletes service account and keys

**Orphan Prevention:**

- Project deletion hook cleans up all infra (postgres, redis, gcs)
- If cleanup fails, logs a warning but continues (manual cleanup required)

### Cost Management

**Estimates (per project):**

- **Storage:** $0.020/GB/month (Standard class, US region)
- **Operations:** $0.005/10k reads, $0.05/10k writes
- **Network:** $0.12/GB egress (to internet)

**Typical project (1k users, 10GB media):**

- Storage: $0.20/month (10 GB × $0.020)
- Operations: ~$0.01/month (10k reads at $0.005/10k plus 1k writes at $0.05/10k)
- **Total:** ~$0.21/month, excluding egress

**Cost optimization:**

- Use lifecycle rules to auto-delete temp files
- Serve images via CDN (reduce GCS egress)
- Use signed URLs (avoid API proxy overhead)

### Monitoring

**Metrics to track (Prometheus):**

- `rdev_gcs_buckets_total`: Total buckets created
- `rdev_gcs_provision_duration_seconds`: Bucket creation latency
- `rdev_gcs_provision_errors_total`: Provisioning failures
- `storage_upload_duration_seconds`: Upload latency (in generated projects)
- `storage_upload_errors_total`: Upload failures
- `storage_upload_bytes_total`: Total bytes uploaded

**Logs to monitor:**

- Provisioning errors (insufficient permissions, quota exceeded)
- Upload errors (file too large, invalid content type)
- Download 404s (broken links, deleted files)

### Quotas

**GCS limits:**

- **Bucket creation:** 100/day per GCP project (sufficient for small deployments)
- **Service accounts:** 100 per GCP project (shared quota with other services)
- **IAM policies:** 1500 bindings per bucket (one per service account)

**Scaling beyond limits:**

- Use multiple GCP projects (shard by project ID hash)
- Use a single bucket with path prefixes (less isolation, not recommended)

### Backup and Recovery

**Bucket versioning:**

- Enable versioning for critical projects: `gsutil versioning set on gs://bucket`
- Allows recovery from accidental deletions

**Cross-region replication:**

- For high-availability projects, enable dual-region buckets
- Example: `Location: "US"` (multi-region) → `Location: "NAM4"` (dual-region)

## Implementation Phases

### Phase 1: Platform - GCS Provisioner ✅

- Define `port.StorageProvisioner` interface
- Implement `adapter/gcs.Provisioner`
- Wire into `ComponentService`
- Add GCS config to `cmd/rdev-api`

### Phase 2: Skeleton - pkg/storage ✅

- Define `storage.Storage` interface
- Implement `GCSStorage`
- Implement `MemoryStorage` (testing)
- Add to skeleton templates

### Phase 3: Component - media-upload ✅

- Create component template with upload/download handlers
- Add Woodpecker build/deploy steps
- Add to component registry

### Phase 4: Testing - Cookbook Tree ✅

- Write E2E cookbook tree
- Run in CI pipeline
- Document in guides

## Security Checklist

- [ ] Service accounts have minimal IAM roles (objectAdmin only)
- [ ] Credentials stored encrypted in rdev credential store
- [ ] Bucket names do not expose sensitive project details
- [ ] CORS origins restricted to *.threesix.ai (or custom domains)
- [ ] Signed URLs have reasonable expiry times (≤1 hour for reads, ≤15 min for writes)
- [ ] File size limits enforced (prevent DoS via large uploads)
- [ ] Content-Type validation (prevent malicious file uploads)
- [ ] Public read ACLs only set when explicitly requested (`Public: true`)

## Future Enhancements

1. **Multi-Backend Support:** Add S3, MinIO, R2 adapters
2. **Image Processing:** Automatic thumbnail generation, format conversion
3. **CDN Integration:** Cloudflare R2 + cache purging
4. **Quota Management:** Per-project storage limits, alerting
5. **Virus Scanning:** ClamAV integration for uploaded files
6. **Resumable Uploads:** For large files (>100MB)
7. **Streaming:** Direct browser-to-GCS uploads (bypass API)

## References

- **GCS Client Docs:** https://cloud.google.com/go/docs/reference/cloud.google.com/go/storage/latest
- **IAM Best Practices:** https://cloud.google.com/iam/docs/best-practices
- **Signed URLs:** https://cloud.google.com/storage/docs/access-control/signed-urls
- **rdev Postgres Provisioner:** `internal/adapter/postgres/provisioner.go`
- **rdev Redis Provisioner:** `internal/adapter/redis/provisioner.go`