Operations Dashboard

Real-time system metrics, distributed tracing, and performance monitoring

All Systems Operational

Uptime 99.97%

Avg Response 45ms

Requests/min 2.4K

Error Rate 0.02%
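
For a rough sense of scale, the headline percentages can be converted into absolute numbers. The sketch below is illustrative only: the 30-day window and both function names are assumptions, not something the dashboard states.

```python
def downtime_minutes(uptime_pct: float, window_days: int = 30) -> float:
    """Minutes of downtime implied by an uptime percentage over a window.

    The 30-day default window is an assumption made for illustration.
    """
    return (100.0 - uptime_pct) / 100.0 * window_days * 24 * 60


def failed_requests_per_min(requests_per_min: float, error_rate_pct: float) -> float:
    """Failed requests per minute implied by a request rate and an error rate."""
    return requests_per_min * error_rate_pct / 100.0


# Figures taken from the tiles above: 99.97% uptime, 2.4K req/min, 0.02% errors.
print(round(downtime_minutes(99.97), 2))
print(round(failed_requests_per_min(2400, 0.02), 2))
```

Under these assumptions, 99.97% uptime corresponds to roughly 13 minutes of downtime per 30 days, and a 0.02% error rate at 2.4K requests per minute is a little under one failed request every two minutes.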

CPU Usage 34%

Memory Usage 62%

Disk I/O 2.1K

Network 145 Mbps

Request Metrics

[Chart: Requests/min and P95 Latency (ms); tabs: Requests, Latency, Errors]
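
The dashboard does not say how its P95 latency is derived. One common definition is the nearest-rank percentile, sketched below; the function name and the sample values are hypothetical, and a production system may well use histogram buckets or another estimator instead.

```python
def percentile_nearest_rank(samples_ms: list[float], pct: int) -> float:
    """Return the pct-th percentile (pct is an integer from 1 to 100) by nearest rank.

    Nearest rank: sort the samples and take the value at position
    ceil(pct / 100 * n), counting from 1.
    """
    if not samples_ms:
        raise ValueError("need at least one sample")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples_ms)
    # Integer ceiling division avoids floating-point rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds, for illustration only.
latencies = [40, 42, 45, 47, 50, 52, 55, 60, 75, 320]
p95 = percentile_nearest_rank(latencies, 95)
```

With only ten samples the 95th percentile falls on the largest value, which shows why P95 is sensitive to a single slow request when traffic is low.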

Distributed Tracing

POST /api/chat/message | trace_a1b2c3d4e5f6 | 300ms | 2 min ago

GET /api/kb/search | trace_b2c3d4e5f6g7 | 700ms | 5 min ago

POST /api/billing/checkout | trace_c3d4e5f6g7h8 | 250ms | 8 min ago

GET /api/user/profile | trace_d4e5f6g7h8i9 | 50ms | 12 min ago

Span legend: API, LLM, Database, Vector DB, Cache, External
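
The span categories in the legend suggest each trace is broken down by the component that handled each step. A minimal sketch of that kind of breakdown follows; the span durations are invented for illustration and are not taken from any trace listed here, and the function name is an assumption.

```python
from collections import defaultdict


def time_by_component(spans: list[tuple[str, int]]) -> dict[str, int]:
    """Sum span durations (ms) per component category."""
    totals: dict[str, int] = defaultdict(int)
    for component, duration_ms in spans:
        totals[component] += duration_ms
    return dict(totals)


# Hypothetical spans for a single 300ms request (values are made up).
example_spans = [
    ("API", 20),
    ("Cache", 5),
    ("Vector DB", 15),
    ("Database", 20),
    ("LLM", 240),
]

breakdown = time_by_component(example_spans)
slowest = max(breakdown, key=breakdown.get)
```

In this made-up example the LLM call dominates the request. Note that summing durations only matches the end-to-end latency when spans run sequentially; concurrent spans would add up to more than the wall-clock time.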

Service Health

API Gateway | 4 instances | Load: 34% | 2.4K req/min | 45ms avg

PostgreSQL | Primary + 2 Replicas | 450 conn | 12ms avg

Qdrant (Vector DB) | 3-node cluster | 2.4M vectors | 8ms search

Redis Cache | Cluster mode | 3 shards | 94% hit rate | 0.5ms avg

LLM Service | OpenAI GPT-4 | High latency | 125K tok/min | 1.2s avg

Object Storage | S3 Compatible | 34.5 GB used | 50ms avg

Error Tracking

23 Total Errors (24h)

5 Critical

8 Warnings

10 Info

Critical | Payment processing failed: Stripe API timeout | billing/checkout.rs:145 | 3 occurrences | 15 min ago

Warning | Rate limit approaching for OpenAI API | llm/openai.rs:89 | 12 occurrences | 30 min ago

Warning | Slow database query detected (>1s) | core/analytics.rs:234 | 5 occurrences | 45 min ago

Info | Cache miss rate above threshold | cache/redis.rs:67 | 8 occurrences | 1 hour ago

Endpoint Performance

POST /api/chat/message | 845 req/min | 320ms P95 | 0.1% errors

GET /api/kb/search | 523 req/min | 580ms