# Network Topology Diagram

## Port Scheme Overview

```
┌────────────────────────────────────────────────────────────────┐
│                    StemeDB Port Allocation (181XX)             │
├────────┬──────────┬─────────────────────┬──────────────────────┤
│ Port   │ Protocol │ Service             │ Purpose              │
├────────┼──────────┼─────────────────────┼──────────────────────┤
│ 18180  │ TCP/HTTP │ API Server          │ Queries, ingest      │
│ 18181  │ TCP/HTTP │ Cluster Gateway     │ Coordination         │
│ 18182  │ TCP/gRPC │ Cluster RPC         │ Replication          │
│ 18183  │ UDP      │ SWIM Gossip         │ Membership           │
│ 18184  │ -        │ (Reserved)          │ Future metrics       │
│ 18185  │ -        │ (Reserved)          │ Future admin         │
│ 18186  │ TCP/HTTP │ Latent Signal       │ AE detection         │
│ 18187  │ TCP/HTTP │ Community App       │ Community corpus     │
│ 18188  │ TCP/HTTP │ StemeDB Dashboard   │ Web UI               │
│ 18189  │ TCP/HTTP │ Aphoria Dashboard   │ Aphoria UI           │
└────────┴──────────┴─────────────────────┴──────────────────────┘
```
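
The allocation above can also be kept as data so that scripts and checks refer to one source. The following is a minimal sketch that mirrors the table; `PortInfo`, `ports_using`, and `describe` are hypothetical names chosen for illustration and are not part of StemeDB.

```python
from typing import Dict, List, NamedTuple, Optional


class PortInfo(NamedTuple):
    protocol: Optional[str]  # None marks a reserved, unassigned port
    service: str
    purpose: str


# Mirrors the allocation table above (the 181XX range).
PORTS: Dict[int, PortInfo] = {
    18180: PortInfo("TCP/HTTP", "API Server", "Queries, ingest"),
    18181: PortInfo("TCP/HTTP", "Cluster Gateway", "Coordination"),
    18182: PortInfo("TCP/gRPC", "Cluster RPC", "Replication"),
    18183: PortInfo("UDP", "SWIM Gossip", "Membership"),
    18184: PortInfo(None, "(Reserved)", "Future metrics"),
    18185: PortInfo(None, "(Reserved)", "Future admin"),
    18186: PortInfo("TCP/HTTP", "Latent Signal", "AE detection"),
    18187: PortInfo("TCP/HTTP", "Community App", "Community corpus"),
    18188: PortInfo("TCP/HTTP", "StemeDB Dashboard", "Web UI"),
    18189: PortInfo("TCP/HTTP", "Aphoria Dashboard", "Aphoria UI"),
}


def ports_using(transport: str) -> List[int]:
    """Return the ports whose protocol starts with `transport` ("TCP" or "UDP")."""
    return sorted(
        port
        for port, info in PORTS.items()
        if info.protocol is not None and info.protocol.startswith(transport)
    )


def describe(port: int) -> str:
    """Return a one-line description of a port in the 181XX range."""
    info = PORTS.get(port)
    if info is None:
        return f"{port}: outside the StemeDB allocation"
    if info.protocol is None:
        return f"{port}: reserved ({info.purpose})"
    return f"{port}: {info.service} over {info.protocol} ({info.purpose})"
```

A firewall or health-check script could call `ports_using("UDP")` to find the gossip port instead of hard-coding it.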

## Single-Node Network Topology

```
┌─────────────────────────────────────────────────────────────────┐
│                         Internet                                │
│                            │                                    │
│                            │ HTTPS (443)                        │
│                            ▼                                    │
│                    ┌───────────────┐                            │
│                    │ Reverse Proxy │                            │
│                    │ (Nginx/Envoy) │                            │
│                    │ • TLS term    │                            │
│                    │ • Rate limit  │                            │
│                    └───────┬───────┘                            │
│                            │                                    │
│                            │ HTTP (18180)                       │
└────────────────────────────┼────────────────────────────────────┘
                             │
          ┌──────────────────┼──────────────────┐
          │ Internal Network (10.0.0.0/8)       │
          │                  ▼                  │
          │         ┌─────────────────┐         │
          │         │  StemeDB Node   │         │
          │         │  10.0.1.50      │         │
          │         │                 │         │
          │         │  :18180 (API)   │◀────────┼─── Clients (internal)
          │         │  :18188 (Dash)  │         │
          │         └────────┬────────┘         │
          │                  │                  │
          │                  ▼                  │
          │         ┌─────────────────┐         │
          │         │  Prometheus     │         │
          │         │  10.0.1.100     │         │
          │         │  Scrapes :18180 │         │
          │         └─────────────────┘         │
          └─────────────────────────────────────┘

Security Zones:
- Public: Internet → Reverse Proxy (443)
- DMZ: Reverse Proxy → StemeDB (18180)
- Internal: Prometheus → StemeDB (18180/metrics)
```

## Three-Node Cluster Network Topology

```
┌──────────────────────────────────────────────────────────────────┐
│                          Internet                                │
│                             │                                    │
│                             │ HTTPS (443)                        │
│                             ▼                                    │
│                     ┌───────────────┐                            │
│                     │ Load Balancer │                            │
│                     │ (ALB/ELB)     │                            │
│                     │ • TLS term    │                            │
│                     │ • Health chks │                            │
│                     └───────┬───────┘                            │
│                             │                                    │
│                             │ HTTP (18180)                       │
└─────────────────────────────┼────────────────────────────────────┘
                              │
              ┌───────────────┴───────────────┐
              │                               │
┌─────────────┼───────────────────────────────┼──────────────────┐
│ Private Network (10.0.1.0/24)               │                  │
│             ▼                               ▼                  │
│  ┌─────────────────┐            ┌─────────────────┐           │
│  │   Node 1        │            │   Node 2        │           │
│  │   10.0.1.51     │            │   10.0.1.52     │           │
│  │                 │            │                 │           │
│  │ :18180 (API)    │            │ :18180 (API)    │           │
│  │ :18181 (Gate)   │            │ :18181 (Gate)   │           │
│  │ :18182 (RPC)────┼────────────┼────:18182 (RPC) │           │
│  │ :18183 (SWIM)···┼···········UDP···:18183 (SWIM)│           │
│  └────────┬────────┘            └────────┬────────┘           │
│           │                              │                     │
│           │                              │                     │
│           │                              │                     │
│           │         ┌─────────────────┐  │                     │
│           │         │   Node 3        │  │                     │
│           │         │   10.0.1.53     │  │                     │
│           │         │                 │  │                     │
│           │         │ :18180 (API)    │  │                     │
│           │         │ :18181 (Gate)   │  │                     │
│           └─────────┼────:18182 (RPC) │──┘                     │
│                 ···UDP···:18183 (SWIM)│                        │
│                     └────────┬────────┘                        │
│                              │                                 │
│                              ▼                                 │
│                     ┌─────────────────┐                        │
│                     │  Prometheus     │                        │
│                     │  10.0.1.100     │                        │
│                     │  Scrapes all 3  │                        │
│                     └─────────────────┘                        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Security Zones:
- Public: Internet → Load Balancer (443)
- DMZ: Load Balancer → Nodes (18180)
- Cluster: Node ↔ Node (18181-18183)
- Internal: Prometheus → Nodes (18180/metrics)

Firewall Rules:
- Allow 18180 from Load Balancer to all nodes
- Allow 18181-18183 within cluster (node ↔ node)
- Allow 18180 from Prometheus (scrapes /metrics; packet filters match ports, not URL paths)
- Block 18181-18183 from outside the cluster (18181 exposes admin endpoints)
```

## Inter-Node Communication Detail

```
Node 1 (10.0.1.51)                    Node 2 (10.0.1.52)

Port 18182 (TCP/gRPC)
  │
  ├─────────────────────────────────────▶ :18182
  │  Push Replication                    (receive assertions)
  │  • Assertion payload
  │  • BLAKE3 hash
  │  • Signature
  │
  ◀─────────────────────────────────────┤
     ACK (received)                     │
                                        │
Port 18183 (UDP)
  │
  ├───────────────────────────────────▶ :18183
  │  SWIM Gossip (every 1s)             (membership)
  │  • Ping: "Are you alive?"
  │  • Membership: "Node 3 is UP"
  │
  ◀───────────────────────────────────┤
     Ack: "I'm alive"                  │
     Membership: "Node 1 is UP"        │

Port 18181 (TCP/HTTP)
  │
  ├─────────────────────────────────────▶ :18181
  │  Merkle Sync (periodic)               (compare trees)
  │  GET /cluster/merkle
  │  • Root hash: ABC123
  │
  ◀─────────────────────────────────────┤
     Merkle tree response               │
     • Root hash: ABC123 (same!)        │
     • No sync needed                   │
```
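
The Merkle comparison in the last exchange can be sketched as follows. This is an illustration under stated assumptions, not StemeDB's implementation: the source does not specify the tree layout or the `/cluster/merkle` response format, so the sorted leaves, the pairing rule, and the names `merkle_root` and `needs_sync` are all hypothetical. BLAKE2b from the standard library stands in for BLAKE3, which `hashlib` does not provide.

```python
import hashlib
from typing import Iterable, List

DIGEST_SIZE = 32  # bytes; BLAKE2b stands in for BLAKE3 in this sketch


def _h(tag: bytes, data: bytes) -> bytes:
    """Hash with a one-byte domain tag so leaves and inner nodes cannot collide."""
    return hashlib.blake2b(tag + data, digest_size=DIGEST_SIZE).digest()


def merkle_root(assertions: Iterable[bytes]) -> str:
    """Return the hex root hash over a set of assertion payloads.

    Assumption: leaves are sorted so that two nodes holding the same
    assertions compute the same root regardless of arrival order.
    """
    level: List[bytes] = [_h(b"\x00", a) for a in sorted(assertions)]
    if not level:
        return _h(b"\x02", b"").hex()  # fixed root for an empty store
    while len(level) > 1:
        # Hash adjacent pairs; an unpaired last node is promoted unchanged.
        paired = [
            _h(b"\x01", level[i] + level[i + 1])
            for i in range(0, len(level) - 1, 2)
        ]
        if len(level) % 2:
            paired.append(level[-1])
        level = paired
    return level[0].hex()


def needs_sync(local_assertions: Iterable[bytes], remote_root: str) -> bool:
    """True when the peer's root differs, meaning the trees must be walked."""
    return merkle_root(local_assertions) != remote_root
```

When the roots match, as in the diagram, a single hash comparison is enough to skip the sync; only a mismatch requires comparing subtrees.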

## Firewall Configuration (iptables)

```
# On each cluster node (iptables-save syntax; prefix each rule with `iptables` to run it directly):

# Allow API from load balancer
-A INPUT -s 10.0.1.10 -p tcp --dport 18180 -j ACCEPT

# Allow cluster gateway (18181) and RPC (18182) from cluster nodes
-A INPUT -s 10.0.1.51 -p tcp --dport 18181:18182 -j ACCEPT
-A INPUT -s 10.0.1.52 -p tcp --dport 18181:18182 -j ACCEPT
-A INPUT -s 10.0.1.53 -p tcp --dport 18181:18182 -j ACCEPT

# Allow SWIM gossip (UDP) from other nodes
-A INPUT -s 10.0.1.51 -p udp --dport 18183 -j ACCEPT
-A INPUT -s 10.0.1.52 -p udp --dport 18183 -j ACCEPT
-A INPUT -s 10.0.1.53 -p udp --dport 18183 -j ACCEPT

# Allow metrics from Prometheus
-A INPUT -s 10.0.1.100 -p tcp --dport 18180 -j ACCEPT

# Allow SSH from bastion
-A INPUT -s 10.0.1.200 -p tcp --dport 22 -j ACCEPT

# Drop all other traffic to the StemeDB ports
-A INPUT -p tcp --dport 18180:18189 -j DROP
-A INPUT -p udp --dport 18183 -j DROP
```
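
Because the per-node rules repeat for every cluster member, they can be generated from an address list. The sketch below reproduces the rules above in the same order; `node_rules` is a hypothetical helper, and the addresses are the example values from this document.

```python
import ipaddress
from typing import Iterable, List


def node_rules(
    nodes: Iterable[str], load_balancer: str, prometheus: str, bastion: str
) -> List[str]:
    """Build the INPUT-chain rules for one cluster node, in evaluation order."""
    nodes = list(nodes)
    for ip in [*nodes, load_balancer, prometheus, bastion]:
        ipaddress.ip_address(ip)  # raises ValueError on a malformed address

    rules = [f"-A INPUT -s {load_balancer} -p tcp --dport 18180 -j ACCEPT"]
    # Cluster gateway (18181) and RPC (18182) from every node
    rules += [f"-A INPUT -s {ip} -p tcp --dport 18181:18182 -j ACCEPT" for ip in nodes]
    # SWIM gossip (UDP 18183) from every node
    rules += [f"-A INPUT -s {ip} -p udp --dport 18183 -j ACCEPT" for ip in nodes]
    rules.append(f"-A INPUT -s {prometheus} -p tcp --dport 18180 -j ACCEPT")
    rules.append(f"-A INPUT -s {bastion} -p tcp --dport 22 -j ACCEPT")
    # Drops go last: iptables stops at the first matching rule
    rules.append("-A INPUT -p tcp --dport 18180:18189 -j DROP")
    rules.append("-A INPUT -p udp --dport 18183 -j DROP")
    return rules


if __name__ == "__main__":
    cluster = ["10.0.1.51", "10.0.1.52", "10.0.1.53"]
    print("\n".join(node_rules(cluster, "10.0.1.10", "10.0.1.100", "10.0.1.200")))
```

Adding a fourth node then means extending one list rather than editing two rule groups by hand.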

## AWS Security Group Example

```
Security Group: sg-stemedb-cluster

Inbound Rules:
┌────────────┬──────────┬───────────────┬─────────────────────────┐
│ Type       │ Protocol │ Port Range    │ Source                  │
├────────────┼──────────┼───────────────┼─────────────────────────┤
│ Custom TCP │ TCP      │ 18180         │ sg-load-balancer        │
│ Custom TCP │ TCP      │ 18180         │ 10.0.1.100/32 (metrics) │
│ Custom TCP │ TCP      │ 18181-18182   │ sg-stemedb-cluster      │
│ Custom UDP │ UDP      │ 18183         │ sg-stemedb-cluster      │
│ SSH        │ TCP      │ 22            │ sg-bastion              │
└────────────┴──────────┴───────────────┴─────────────────────────┘

Outbound Rules:
┌──────────┬──────────┬─────────────────┬─────────────────────────┐
│ All      │ All      │ All             │ 0.0.0.0/0               │
└──────────┴──────────┴─────────────────┴─────────────────────────┘
```

## Network Latency Requirements

```
Client → Load Balancer: <100ms (internet typical)
        │
        ▼
Load Balancer → Node: <10ms (same region)
        │
        ├───────────────────────────────────────┐
        ▼                                       ▼
   Node 1 ◀─────<5ms (CRITICAL)─────────▶ Node 2
        ▲                                       ▲
        │                                       │
        └───────────<5ms (CRITICAL)─────────────┘
                        Node 3

Why <5ms inter-node?
- SWIM gossip requires fast ping/ack
- Replication lag increases with latency
- Merkle sync performance degrades

Test: ping -c 100 node2 (should show avg <5ms)
```
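
The ping test can be automated by parsing the summary line that `ping` prints. This is a minimal sketch that assumes the Linux (`rtt ... mdev`) or BSD/macOS (`round-trip ... stddev`) summary format; `average_rtt_ms` and `within_budget` are hypothetical helper names.

```python
import re
from typing import Optional

# Matches e.g. "rtt min/avg/max/mdev = 0.212/0.348/1.204/0.101 ms"
_SUMMARY = re.compile(
    r"(?:rtt|round-trip) min/avg/max/(?:mdev|stddev) = "
    r"([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def average_rtt_ms(ping_output: str) -> Optional[float]:
    """Return the average round-trip time in ms, or None if no summary is found."""
    match = _SUMMARY.search(ping_output)
    return float(match.group(2)) if match else None


def within_budget(ping_output: str, budget_ms: float = 5.0) -> bool:
    """True when the average RTT is below the inter-node budget (<5ms by default)."""
    avg = average_rtt_ms(ping_output)
    return avg is not None and avg < budget_ms
```

The input would typically come from something like `subprocess.run(["ping", "-c", "100", "node2"], capture_output=True, text=True).stdout`, run from each node toward its peers.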

## Bandwidth Usage

```
┌─────────────────────────────────────────────────────────────┐
│                    Bandwidth Breakdown                      │
├─────────────────┬───────────────────────────────────────────┤
│ Direction       │ Usage (per node)                          │
├─────────────────┼───────────────────────────────────────────┤
│ Inbound (API)   │ 100 assertions/sec × 1KB = 0.8 Mbps       │
│ Outbound (API)  │ 100 queries/sec × 5KB = 4 Mbps            │
│ Replication     │ 100 assertions/sec × 1KB × 2 = 1.6 Mbps   │
│ SWIM Gossip     │ ~10 KB/sec (negligible)                   │
├─────────────────┼───────────────────────────────────────────┤
│ Total           │ ~7 Mbps per node                          │
│ Recommended     │ 1 Gbps NIC (100× headroom)                │
└─────────────────┴───────────────────────────────────────────┘
```
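
The arithmetic behind the table can be reproduced in a few lines. This sketch assumes 1 KB = 1000 bytes, which is what the table's figures imply, and reads the "× 2" in the replication row as the two peers of a three-node cluster. The components sum to about 6.5 Mbps, which the table rounds up to ~7 Mbps; against a 1 Gbps NIC that leaves roughly 150× headroom.

```python
def mbps(events_per_sec: float, kb_per_event: float, fanout: int = 1) -> float:
    """Convert an event rate to megabits per second (1 KB = 1000 bytes)."""
    return events_per_sec * kb_per_event * fanout * 8 / 1000


BREAKDOWN = {
    "inbound_api": mbps(100, 1),            # 100 assertions/sec × 1KB
    "outbound_api": mbps(100, 5),           # 100 queries/sec × 5KB
    "replication": mbps(100, 1, fanout=2),  # each assertion pushed to 2 peers
    "swim_gossip": mbps(1, 10),             # ~10 KB/sec
}

TOTAL_MBPS = sum(BREAKDOWN.values())
HEADROOM = 1000 / TOTAL_MBPS  # relative to a 1 Gbps NIC

if __name__ == "__main__":
    for name, value in BREAKDOWN.items():
        print(f"{name:14s} {value:5.2f} Mbps")
    print(f"{'total':14s} {TOTAL_MBPS:5.2f} Mbps, headroom ~{HEADROOM:.0f}x")
```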

## Monitoring Endpoints

```
┌─────────────────────────────────────────────────────────────┐
│                 Prometheus Scrape Targets                   │
├─────────────────┬───────────────────────────────────────────┤
│ Target          │ URL                                       │
├─────────────────┼───────────────────────────────────────────┤
│ Node 1          │ http://10.0.1.51:18180/metrics            │
│ Node 2          │ http://10.0.1.52:18180/metrics            │
│ Node 3          │ http://10.0.1.53:18180/metrics            │
├─────────────────┼───────────────────────────────────────────┤
│ Scrape Interval │ 15 seconds                                │
│ Timeout         │ 10 seconds                                │
└─────────────────┴───────────────────────────────────────────┘

Key Metrics:
- up{job="stemedb", instance="node1"} == 1
- stemedb_query_latency_seconds{quantile="0.99", instance="node1"}
- replication_lag_seconds{instance="node1"}
- process_resident_memory_bytes{instance="node1"}
```
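
The scrape targets follow directly from the node addresses, so they can be derived rather than typed out. The sketch below builds the URLs from the table and checks the timing values; `scrape_targets` and `validate_timing` are hypothetical helpers. The timing check reflects Prometheus's own rule that a scrape timeout may not exceed the scrape interval.

```python
from typing import Dict

NODES: Dict[str, str] = {
    "node1": "10.0.1.51",
    "node2": "10.0.1.52",
    "node3": "10.0.1.53",
}
METRICS_PORT = 18180        # metrics are served by the API server port
SCRAPE_INTERVAL_S = 15
SCRAPE_TIMEOUT_S = 10


def scrape_targets(nodes: Dict[str, str], port: int = METRICS_PORT) -> Dict[str, str]:
    """Map each node name to its /metrics URL."""
    return {name: f"http://{ip}:{port}/metrics" for name, ip in nodes.items()}


def validate_timing(interval_s: int, timeout_s: int) -> None:
    """Raise ValueError if the timeout is not positive or exceeds the interval."""
    if timeout_s <= 0 or interval_s <= 0:
        raise ValueError("interval and timeout must be positive")
    if timeout_s > interval_s:
        raise ValueError("scrape timeout must not exceed scrape interval")


if __name__ == "__main__":
    validate_timing(SCRAPE_INTERVAL_S, SCRAPE_TIMEOUT_S)
    for name, url in scrape_targets(NODES).items():
        print(name, url)
```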

## DNS Configuration

```
Public DNS (example.com):
┌──────────────────────────────────────────────────────────────────┐
│ stemedb.example.com.     300  IN  CNAME  stemedb-lb.example.com. │
│ stemedb-lb.example.com.  60   IN  A      203.0.113.10            │
└──────────────────────────────────────────────────────────────────┘

Private DNS (cluster.local):
┌────────────────────────────────────────────────────────────┐
│ node1.cluster.local.  60   IN  A  10.0.1.51                │
│ node2.cluster.local.  60   IN  A  10.0.1.52                │
│ node3.cluster.local.  60   IN  A  10.0.1.53                │
└────────────────────────────────────────────────────────────┘

TTL Recommendations:
- Public: 300s (5 min) - balance caching vs failover speed
- Private: 60s (1 min) - faster convergence within cluster
```