Operations Dashboard

Real-time system metrics, distributed tracing, and performance monitoring

All Systems Operational
Uptime 99.97%
Avg Response 45ms
Requests/min 2.4K
Error Rate 0.02%
CPU Usage 34%
Memory Usage 62%
Disk I/O 2.1K
Network 145 Mbps

Request Metrics

[Chart: Requests/min and P95 Latency (ms), 12:00 to 12:50; y-axis 0 to 3K]
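
P95 latency is the value at or below which 95% of observed request latencies fall. The dashboard does not say how it computes this figure, so the following is only a minimal sketch of one common approach, the nearest-rank method. The function name and the sample latencies are hypothetical and are not taken from the dashboard.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # 1-based rank of the smallest value that covers pct percent of the samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds, for illustration only.
latencies_ms = [40, 42, 45, 47, 50, 52, 55, 60, 300, 700]
p95 = percentile_nearest_rank(latencies_ms, 95)
print(f"P95 latency: {p95} ms")
```

With only ten samples the P95 is simply the slowest request, which is why tail percentiles are usually computed over a window containing many requests.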

Distributed Tracing

POST /api/chat/message trace_a1b2c3d4e5f6
300ms 2 min ago
GET /api/kb/search trace_b2c3d4e5f6g7
700ms 5 min ago
POST /api/billing/checkout trace_c3d4e5f6g7h8
250ms 8 min ago
GET /api/user/profile trace_d4e5f6g7h8i9
50ms 12 min ago
Legend: API, LLM, Database, Vector DB, Cache, External

Service Health

API Gateway 4 instances | Load: 34%
2.4K req/min 45ms avg
PostgreSQL Primary + 2 Replicas
450 conn 12ms avg
Qdrant (Vector DB) 3 nodes cluster
2.4M vectors 8ms search
Redis Cache Cluster mode | 3 shards
94% hit rate 0.5ms avg
LLM Service OpenAI GPT-4 | High latency
125K tok/min 1.2s avg
Object Storage S3 Compatible
34.5 GB used 50ms avg

Error Tracking

23 Total Errors (24h)
5 Critical
8 Warnings
10 Info
Critical
Payment processing failed: Stripe API timeout billing/checkout.rs:145
3 occurrences 15 min ago
Warning
Rate limit approaching for OpenAI API llm/openai.rs:89
12 occurrences 30 min ago
Warning
Slow database query detected (>1s) core/analytics.rs:234
5 occurrences 45 min ago
Info
Cache miss rate above threshold cache/redis.rs:67
8 occurrences 1 hour ago
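
A severity summary like the one at the top of this section (total, critical, warnings, info) can be produced by tallying individual error records. The sketch below assumes each record is a dictionary with a "severity" field. That record shape, the function name, and the sample records are hypothetical, and nothing here reflects how the dashboard itself stores errors.

```python
from collections import Counter


def summarize_by_severity(records):
    """Count error records per severity level and include an overall total."""
    counts = Counter(record["severity"] for record in records)
    return {"total": sum(counts.values()), **counts}


# Hypothetical records, for illustration only.
sample_records = [
    {"severity": "critical", "message": "payment processing failed"},
    {"severity": "critical", "message": "payment processing failed"},
    {"severity": "warning", "message": "rate limit approaching"},
    {"severity": "info", "message": "cache miss rate above threshold"},
]
summary = summarize_by_severity(sample_records)
print(summary)
```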

Endpoint Performance

POST /api/chat/message
845 req/min
320ms P95
0.1% errors
GET /api/kb/search
523 req/min
580ms