feat: Add comprehensive stage progress logging
Some checks failed
BotServer CI / build (push) Failing after 2m39s
Some checks failed
BotServer CI / build (push) Failing after 2m39s
- Add detailed logging for all 5 pipeline stages (PLAN, BUILD, REVIEW, DEPLOY, MONITOR) - Log stage start/complete events with agent IDs and progress details - Add resource creation/deletion logging in drive_handlers - Improve pipeline summary logging with task ID, nodes, resources, and URL This addresses the requirement for textual progress in console logs.
This commit is contained in:
parent
c326581a9e
commit
1e71c9be09
10 changed files with 652 additions and 26 deletions
594
LOGGING_PLAN.md
Normal file
594
LOGGING_PLAN.md
Normal file
|
|
@ -0,0 +1,594 @@
|
|||
# BotServer Cinema Viewer Logging Plan
|
||||
|
||||
## 🎬 The Cinema Viewer Philosophy
|
||||
|
||||
Think of your logs as a movie with different viewing modes:
|
||||
|
||||
- **INFO** = **The Movie** - Watch the main story unfold (production-ready)
|
||||
- **DEBUG** = **Director's Commentary** - Behind-the-scenes details (troubleshooting)
|
||||
- **TRACE** = **Raw Footage** - Every single take and detail (deep debugging)
|
||||
- **WARN** = **Plot Holes** - Things that shouldn't happen but are recoverable
|
||||
- **ERROR** = **Scene Failures** - Critical issues that break the narrative
|
||||
|
||||
---
|
||||
|
||||
## 📊 Log Level Definitions
|
||||
|
||||
### ERROR - Critical Failures
|
||||
**When to use:** System cannot proceed, requires immediate attention
|
||||
|
||||
```rust
|
||||
// ✅ GOOD - Actionable, clear context
|
||||
error!("Database connection failed - retrying in 5s: {}", e);
|
||||
error!("Authentication failed for user {}: invalid credentials", user_id);
|
||||
error!("Stage 2 BUILD failed: {}", e);
|
||||
|
||||
// ❌ BAD - Vague, no context
|
||||
error!("Error!");
|
||||
error!("Failed");
|
||||
error!("Something went wrong: {}", e);
|
||||
```
|
||||
|
||||
### WARN - Recoverable Issues
|
||||
**When to use:** Unexpected state but system can continue
|
||||
|
||||
```rust
|
||||
// ✅ GOOD - Explains the issue and impact
|
||||
warn!("Failed to create data directory: {}. Using fallback.", e);
|
||||
warn!("LLM server not ready - deferring embedding generation");
|
||||
warn!("Rate limit approaching for API key {}", key_id);
|
||||
|
||||
// ❌ BAD - Not actionable
|
||||
warn!("Warning");
|
||||
warn!("Something happened");
|
||||
```
|
||||
|
||||
### INFO - The Main Story
|
||||
**When to use:** Key events, state changes, business milestones
|
||||
|
||||
```rust
|
||||
// ✅ GOOD - Tells a story, shows progress
|
||||
info!("Pipeline starting - task: {}, intent: {}", task_id, intent);
|
||||
info!("Stage 1 PLAN complete - {} nodes planned", node_count);
|
||||
info!("User {} logged in from {}", user_id, ip_address);
|
||||
info!("Server started on port {}", port);
|
||||
|
||||
// ❌ BAD - Too verbose, implementation details
|
||||
info!("Entering function process_request");
|
||||
info!("Variable x = {}", x);
|
||||
info!("Loop iteration {}", i);
|
||||
```
|
||||
|
||||
### DEBUG - Behind the Scenes
|
||||
**When to use:** Troubleshooting information, decision points, state inspections
|
||||
|
||||
```rust
|
||||
// ✅ GOOD - Helps diagnose issues
|
||||
debug!("Request payload: {:?}", payload);
|
||||
debug!("Using cache key: {}", cache_key);
|
||||
debug!("Retry attempt {} of {}", attempt, max_retries);
|
||||
debug!("Selected LLM model: {} for task type: {}", model, task_type);
|
||||
|
||||
// ❌ BAD - Too trivial
|
||||
debug!("Variable assigned");
|
||||
debug!("Function called");
|
||||
```
|
||||
|
||||
### TRACE - Raw Footage
|
||||
**When to use:** Step-by-step execution, loop iterations, detailed flow
|
||||
|
||||
```rust
|
||||
// ✅ GOOD - Detailed execution path
|
||||
trace!("Starting monitoring loop");
|
||||
trace!("Processing file: {:?}", path);
|
||||
trace!("Checking bot directory: {}", dir);
|
||||
trace!("WebSocket message received: {} bytes", len);
|
||||
|
||||
// ❌ BAD - Noise without value
|
||||
trace!("Line 100");
|
||||
trace!("Got here");
|
||||
trace!("...");
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 🎭 Logging Patterns by Module Type
|
||||
|
||||
### 1. Orchestration & Pipeline Modules
|
||||
**Examples:** `auto_task/orchestrator.rs`, `auto_task/agent_executor.rs`
|
||||
|
||||
**INFO Level:** Show the story arc
|
||||
```rust
|
||||
info!("Pipeline starting - task: {}, intent: {}", task_id, intent);
|
||||
info!("Stage 1 PLAN starting - Agent #1 analyzing request");
|
||||
info!("Stage 1 PLAN complete - {} nodes planned", node_count);
|
||||
info!("Stage 2 BUILD complete - {} resources, url: {}", count, url);
|
||||
info!("Pipeline complete - task: {}, nodes: {}, resources: {}", task_id, nodes, resources);
|
||||
```
|
||||
|
||||
**DEBUG Level:** Show decision points
|
||||
```rust
|
||||
debug!("Classified intent as: {:?}", classification);
|
||||
debug!("Selected app template: {}", template_name);
|
||||
debug!("Skipping stage 3 - no resources to review");
|
||||
```
|
||||
|
||||
**TRACE Level:** Show execution flow
|
||||
```rust
|
||||
trace!("Broadcasting thought to UI: {}", thought);
|
||||
trace!("Updating agent {} status: {} -> {}", id, old, new);
|
||||
trace!("Sub-task generated: {}", task_name);
|
||||
```
|
||||
|
||||
### 2. File Monitoring & Compilation
|
||||
**Examples:** `drive/local_file_monitor.rs`, `drive/drive_monitor/mod.rs`
|
||||
|
||||
**INFO Level:** Key file operations
|
||||
```rust
|
||||
info!("Local file monitor started - watching /opt/gbo/data/*.gbai");
|
||||
info!("Compiling bot: {} ({} files)", bot_name, file_count);
|
||||
info!("Bot {} compiled successfully - {} tools generated", bot_name, tool_count);
|
||||
```
|
||||
|
||||
**DEBUG Level:** File processing details
|
||||
```rust
|
||||
debug!("Detected change in: {:?}", path);
|
||||
debug!("Recompiling {} - modification detected", bot_name);
|
||||
debug!("Skipping {} - no changes detected", bot_name);
|
||||
```
|
||||
|
||||
**TRACE Level:** File system operations
|
||||
```rust
|
||||
trace!("Scanning directory: {:?}", dir);
|
||||
trace!("File modified: {:?} at {:?}", path, time);
|
||||
trace!("Watching directory: {:?}", path);
|
||||
```
|
||||
|
||||
### 3. Security & Authentication
|
||||
**Examples:** `security/jwt.rs`, `security/api_keys.rs`, `security/sql_guard.rs`
|
||||
|
||||
**INFO Level:** Security events (always log these!)
|
||||
```rust
|
||||
info!("User {} logged in from {}", user_id, ip);
|
||||
info!("API key {} created for user {}", key_id, user_id);
|
||||
info!("Failed login attempt for user {} from {}", username, ip);
|
||||
info!("Rate limit exceeded for IP: {}", ip);
|
||||
```
|
||||
|
||||
**DEBUG Level:** Security checks
|
||||
```rust
|
||||
debug!("Validating JWT for user {}", user_id);
|
||||
debug!("Checking API key permissions: {:?}", permissions);
|
||||
debug!("SQL query sanitized: {}", safe_query);
|
||||
```
|
||||
|
||||
**TRACE Level:** Security internals
|
||||
```rust
|
||||
trace!("Token expiry check: {} seconds remaining", remaining);
|
||||
trace!("Permission check: {} -> {}", resource, allowed);
|
||||
trace!("Hashing password with cost factor: {}", cost);
|
||||
```
|
||||
|
||||
### 4. API Handlers
|
||||
**Examples:** HTTP endpoint handlers in `core/`, `drive/`, etc.
|
||||
|
||||
**INFO Level:** Request lifecycle
|
||||
```rust
|
||||
info!("Request started: {} {} from {}", method, path, ip);
|
||||
info!("Request completed: {} {} -> {} ({}ms)", method, path, status, duration);
|
||||
info!("User {} created resource: {}", user_id, resource_id);
|
||||
```
|
||||
|
||||
**DEBUG Level:** Request details
|
||||
```rust
|
||||
debug!("Request headers: {:?}", headers);
|
||||
debug!("Request body: {:?}", body);
|
||||
debug!("Response payload: {} bytes", size);
|
||||
```
|
||||
|
||||
**TRACE Level:** Request processing
|
||||
```rust
|
||||
trace!("Parsing JSON body");
|
||||
trace!("Validating request parameters");
|
||||
trace!("Serializing response");
|
||||
```
|
||||
|
||||
### 5. Database Operations
|
||||
**Examples:** `core/shared/models/`, Diesel queries
|
||||
|
||||
**INFO Level:** Database lifecycle
|
||||
```rust
|
||||
info!("Database connection pool initialized ({} connections)", pool_size);
|
||||
info!("Migration completed - {} tables updated", count);
|
||||
info!("Database backup created: {}", backup_path);
|
||||
```
|
||||
|
||||
**DEBUG Level:** Query information
|
||||
```rust
|
||||
debug!("Executing query: {}", query);
|
||||
debug!("Query returned {} rows in {}ms", count, duration);
|
||||
debug!("Cache miss for key: {}", key);
|
||||
```
|
||||
|
||||
**TRACE Level:** Query details
|
||||
```rust
|
||||
trace!("Preparing statement: {}", sql);
|
||||
trace!("Binding parameter {}: {:?}", index, value);
|
||||
trace!("Fetching next row");
|
||||
```
|
||||
|
||||
### 6. LLM & AI Operations
|
||||
**Examples:** `llm/`, `core/kb/`
|
||||
|
||||
**INFO Level:** LLM operations
|
||||
```rust
|
||||
info!("LLM request started - model: {}, tokens: {}", model, estimated_tokens);
|
||||
info!("LLM response received - {} tokens, {}ms", tokens, duration);
|
||||
info!("Embedding generated - {} dimensions", dimensions);
|
||||
info!("Knowledge base indexed - {} documents", doc_count);
|
||||
```
|
||||
|
||||
**DEBUG Level:** LLM details
|
||||
```rust
|
||||
debug!("LLM prompt: {}", prompt_preview);
|
||||
debug!("Using temperature: {}, max_tokens: {}", temp, max);
|
||||
debug!("Selected model variant: {}", variant);
|
||||
```
|
||||
|
||||
**TRACE Level:** LLM internals
|
||||
```rust
|
||||
trace!("Sending request to LLM API: {}", url);
|
||||
trace!("Streaming token: {}", token);
|
||||
trace!("Parsing LLM response chunk");
|
||||
```
|
||||
|
||||
### 7. Startup & Initialization
|
||||
**Examples:** `main.rs`, `main_module/bootstrap.rs`
|
||||
|
||||
**INFO Level:** Startup milestones
|
||||
```rust
|
||||
info!("Server starting on port {}", port);
|
||||
info!("Database initialized - PostgreSQL connected");
|
||||
info!("Cache initialized - Valkey connected");
|
||||
info!("Secrets loaded from Vault");
|
||||
info!("BotServer ready - {} bots loaded", bot_count);
|
||||
```
|
||||
|
||||
**DEBUG Level:** Configuration details
|
||||
```rust
|
||||
debug!("Using config: {:?}", config);
|
||||
debug!("Environment: {}", env);
|
||||
debug!("Feature flags: {:?}", features);
|
||||
```
|
||||
|
||||
**TRACE Level:** Initialization steps
|
||||
```rust
|
||||
trace!("Loading .env file");
|
||||
trace!("Setting up signal handlers");
|
||||
trace!("Initializing thread registry");
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 🎯 The Cinema Viewer Experience
|
||||
|
||||
### Level 1: Watching the Movie (INFO)
|
||||
```bash
|
||||
RUST_LOG=botserver=info
|
||||
```
|
||||
|
||||
**What you see:**
|
||||
```
|
||||
INFO botserver: Server starting on port 8080
|
||||
INFO botserver: Database initialized - PostgreSQL connected
|
||||
INFO botserver: User alice@example.com logged in from 192.168.1.100
|
||||
INFO botserver::auto_task::orchestrator: Pipeline starting - task: abc123, intent: Create CRM
|
||||
INFO botserver::auto_task::orchestrator: Stage 1 PLAN complete - 5 nodes planned
|
||||
INFO botserver::auto_task::orchestrator: Stage 2 BUILD complete - 12 resources, url: /apps/crm
|
||||
INFO botserver::auto_task::orchestrator: Pipeline complete - task: abc123, nodes: 5, resources: 12
|
||||
INFO botserver: User alice@example.com logged out
|
||||
```
|
||||
|
||||
**Perfect for:** Production monitoring, understanding system flow
|
||||
|
||||
### Level 2: Director's Commentary (DEBUG)
|
||||
```bash
|
||||
RUST_LOG=botserver=debug
|
||||
```
|
||||
|
||||
**What you see:** Everything from INFO plus:
|
||||
```
|
||||
DEBUG botserver::auto_task::orchestrator: Classified intent as: AppGeneration
|
||||
DEBUG botserver::auto_task::orchestrator: Selected app template: crm
|
||||
DEBUG botserver::security::jwt: Validating JWT for user alice@example.com
|
||||
DEBUG botserver::drive::local_file_monitor: Detected change in: /opt/gbo/data/crm.gbai
|
||||
DEBUG botserver::llm: Using temperature: 0.7, max_tokens: 2000
|
||||
```
|
||||
|
||||
**Perfect for:** Troubleshooting issues, understanding decisions
|
||||
|
||||
### Level 3: Raw Footage (TRACE)
|
||||
```bash
|
||||
RUST_LOG=botserver=trace
|
||||
```
|
||||
|
||||
**What you see:** Everything from DEBUG plus:
|
||||
```
|
||||
TRACE botserver::drive::local_file_monitor: Scanning directory: /opt/gbo/data
|
||||
TRACE botserver::auto_task::orchestrator: Broadcasting thought to UI: Analyzing...
|
||||
TRACE botserver::llm: Streaming token: Create
|
||||
TRACE botserver::llm: Streaming token: a
|
||||
TRACE botserver::llm: Streaming token: CRM
|
||||
TRACE botserver::core::db: Preparing statement: SELECT * FROM bots
|
||||
```
|
||||
|
||||
**Perfect for:** Deep debugging, performance analysis, finding bugs
|
||||
|
||||
---
|
||||
|
||||
## ✨ Best Practices
|
||||
|
||||
### 1. Tell a Story
|
||||
```rust
|
||||
// ✅ GOOD - Shows the narrative
|
||||
info!("Pipeline starting - task: {}", task_id);
|
||||
info!("Stage 1 PLAN complete - {} nodes planned", nodes);
|
||||
info!("Stage 2 BUILD complete - {} resources", resources);
|
||||
info!("Pipeline complete - app deployed at {}", url);
|
||||
|
||||
// ❌ BAD - Just data points
|
||||
info!("Task started");
|
||||
info!("Nodes: {}", nodes);
|
||||
info!("Resources: {}", resources);
|
||||
info!("Done");
|
||||
```
|
||||
|
||||
### 2. Use Structured Data
|
||||
```rust
|
||||
// ✅ GOOD - Easy to parse and filter
|
||||
info!("User {} logged in from {}", user_id, ip);
|
||||
info!("Request completed: {} {} -> {} ({}ms)", method, path, status, duration);
|
||||
|
||||
// ❌ BAD - Hard to parse
|
||||
info!("User login happened");
|
||||
info!("Request finished successfully");
|
||||
```
|
||||
|
||||
### 3. Include Context
|
||||
```rust
|
||||
// ✅ GOOD - Provides context
|
||||
error!("Database connection failed for bot {}: {}", bot_id, e);
|
||||
warn!("Rate limit approaching for user {}: {}/{} requests", user_id, count, limit);
|
||||
|
||||
// ❌ BAD - No context
|
||||
error!("Connection failed: {}", e);
|
||||
warn!("Rate limit warning");
|
||||
```
|
||||
|
||||
### 4. Use Appropriate Levels
|
||||
```rust
|
||||
// ✅ GOOD - Right level for right information
|
||||
info!("Server started on port {}", port); // Key event
|
||||
debug!("Using config: {:?}", config); // Troubleshooting
|
||||
trace!("Listening on socket {:?}", socket); // Deep detail
|
||||
|
||||
// ❌ BAD - Wrong levels
|
||||
trace!("Server started"); // Too important for trace
|
||||
info!("Loop iteration {}", i); // Too verbose for info
|
||||
error!("Variable is null"); // Not an error
|
||||
```
|
||||
|
||||
### 5. Avoid Noise
|
||||
```rust
|
||||
// ✅ GOOD - Meaningful information
|
||||
debug!("Retry attempt {} of {} for API call", attempt, max);
|
||||
|
||||
// ❌ BAD - Just noise
|
||||
debug!("Entering function");
|
||||
debug!("Exiting function");
|
||||
debug!("Variable assigned");
|
||||
```
|
||||
|
||||
### 6. Log State Changes
|
||||
```rust
|
||||
// ✅ GOOD - Shows what changed
|
||||
info!("User {} role changed: {} -> {}", user_id, old_role, new_role);
|
||||
info!("Bot {} status: {} -> {}", bot_id, old_status, new_status);
|
||||
|
||||
// ❌ BAD - No before/after
|
||||
info!("User role updated");
|
||||
info!("Bot status changed");
|
||||
```
|
||||
|
||||
### 7. Include Timings for Operations
|
||||
```rust
|
||||
// ✅ GOOD - Performance visibility
|
||||
info!("Database migration completed in {}ms", duration);
|
||||
info!("LLM response received - {} tokens, {}ms", tokens, duration);
|
||||
debug!("Query executed in {}ms", duration);
|
||||
|
||||
// ❌ BAD - No performance data
|
||||
info!("Migration completed");
|
||||
info!("LLM response received");
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 🔧 Implementation Guide
|
||||
|
||||
### Step 1: Audit Current Logging
|
||||
```bash
|
||||
# Find all logging statements
|
||||
find botserver/src -name "*.rs" -exec grep -n "info!\|debug!\|trace!\|warn!\|error!" {} +
|
||||
|
||||
# Count by level
|
||||
grep -r "info!" botserver/src | wc -l
|
||||
grep -r "debug!" botserver/src | wc -l
|
||||
grep -r "trace!" botserver/src | wc -l
|
||||
```
|
||||
|
||||
### Step 2: Categorize by Module
|
||||
Create a spreadsheet or document listing:
|
||||
- Module name
|
||||
- Current log levels used
|
||||
- Purpose of the module
|
||||
- What story should it tell
|
||||
|
||||
### Step 3: Refactor Module by Module
|
||||
Start with critical path modules:
|
||||
1. **auto_task/orchestrator.rs** - Already done! ✅
|
||||
2. **drive/local_file_monitor.rs** - File operations
|
||||
3. **security/jwt.rs** - Authentication events
|
||||
4. **main.rs** - Startup sequence
|
||||
5. **core/bot/** - Bot lifecycle
|
||||
|
||||
### Step 4: Test Different Verbosity Levels
|
||||
```bash
|
||||
# Test INFO level (production)
|
||||
RUST_LOG=botserver=info cargo run
|
||||
|
||||
# Test DEBUG level (troubleshooting)
|
||||
RUST_LOG=botserver=debug cargo run
|
||||
|
||||
# Test TRACE level (development)
|
||||
RUST_LOG=botserver=trace cargo run
|
||||
```
|
||||
|
||||
### Step 5: Document Module-Specific Patterns
|
||||
For each module, document:
|
||||
- What story does it tell at INFO level?
|
||||
- What troubleshooting info at DEBUG level?
|
||||
- What raw details at TRACE level?
|
||||
|
||||
---
|
||||
|
||||
## 📋 Quick Reference Card
|
||||
|
||||
### Log Level Decision Tree
|
||||
|
||||
```
|
||||
Is this a failure that stops execution?
|
||||
└─ YES → ERROR
|
||||
└─ NO → Is this unexpected but recoverable?
|
||||
└─ YES → WARN
|
||||
└─ NO → Is this a key business event?
|
||||
└─ YES → INFO
|
||||
└─ NO → Is this useful for troubleshooting?
|
||||
└─ YES → DEBUG
|
||||
└─ NO → Is this step-by-step execution detail?
|
||||
└─ YES → TRACE
|
||||
└─ NO → Don't log it!
|
||||
```
|
||||
|
||||
### Module-Specific Cheat Sheet
|
||||
|
||||
| Module Type | INFO | DEBUG | TRACE |
|
||||
|-------------|------|-------|-------|
|
||||
| **Orchestration** | Stage start/complete, pipeline milestones | Decision points, classifications | UI broadcasts, state changes |
|
||||
| **File Monitoring** | Monitor start, bot compiled | Changes detected, recompiles | File scans, timestamps |
|
||||
| **Security** | Logins, key events, failures | Validations, permission checks | Token details, hash operations |
|
||||
| **API Handlers** | Request start/end, resource changes | Headers, payloads | JSON parsing, serialization |
|
||||
| **Database** | Connections, migrations | Queries, row counts | Statement prep, row fetching |
|
||||
| **LLM** | Requests, responses, indexing | Prompts, parameters | Token streaming, chunking |
|
||||
| **Startup** | Service ready, milestones | Config, environment | Init steps, signal handlers |
|
||||
|
||||
---
|
||||
|
||||
## 🎬 Example: Complete Pipeline Logging
|
||||
|
||||
Here's how a complete auto-task pipeline looks at different levels:
|
||||
|
||||
### INFO Level (The Movie)
|
||||
```
|
||||
INFO Pipeline starting - task: task-123, intent: Create a CRM system
|
||||
INFO Stage 1 PLAN starting - Agent #1 analyzing request
|
||||
INFO Stage 1 PLAN complete - 5 nodes planned
|
||||
INFO Stage 2 BUILD starting - Agent #2 generating code
|
||||
INFO Stage 2 BUILD complete - 12 resources, url: /apps/crm-system
|
||||
INFO Stage 3 REVIEW starting - Agent #3 checking code quality
|
||||
INFO Stage 3 REVIEW complete - all checks passed
|
||||
INFO Stage 4 DEPLOY starting - Agent #4 deploying to /apps/crm-system
|
||||
INFO Stage 4 DEPLOY complete - app live at /apps/crm-system
|
||||
INFO Stage 5 MONITOR starting - Agent #1 setting up monitoring
|
||||
INFO Stage 5 MONITOR complete - monitoring active
|
||||
INFO Pipeline complete - task: task-123, nodes: 5, resources: 12, url: /apps/crm-system
|
||||
```
|
||||
|
||||
### DEBUG Level (Director's Commentary)
|
||||
```
|
||||
INFO Pipeline starting - task: task-123, intent: Create a CRM system
|
||||
DEBUG Classified intent as: AppGeneration
|
||||
DEBUG Selected app template: crm_standard
|
||||
INFO Stage 1 PLAN starting - Agent #1 analyzing request
|
||||
DEBUG Generated 5 sub-tasks from intent
|
||||
INFO Stage 1 PLAN complete - 5 nodes planned
|
||||
INFO Stage 2 BUILD starting - Agent #2 generating code
|
||||
DEBUG Using database schema: contacts, deals, activities
|
||||
DEBUG Generated 3 tables, 8 pages, 1 tool
|
||||
INFO Stage 2 BUILD complete - 12 resources, url: /apps/crm-system
|
||||
...
|
||||
```
|
||||
|
||||
### TRACE Level (Raw Footage)
|
||||
```
|
||||
INFO Pipeline starting - task: task-123, intent: Create a CRM system
|
||||
DEBUG Classified intent as: AppGeneration
|
||||
TRACE Extracting entities from: "Create a CRM system"
|
||||
TRACE Found entity: CRM
|
||||
TRACE Found entity: system
|
||||
DEBUG Selected app template: crm_standard
|
||||
INFO Stage 1 PLAN starting - Agent #1 analyzing request
|
||||
TRACE Broadcasting thought to UI: Analyzing request...
|
||||
TRACE Deriving plan sub-tasks
|
||||
TRACE Sub-task 1: Create database schema
|
||||
TRACE Sub-task 2: Generate list page
|
||||
TRACE Sub-task 3: Generate form pages
|
||||
TRACE Sub-task 4: Create BASIC tools
|
||||
TRACE Sub-task 5: Setup navigation
|
||||
DEBUG Generated 5 sub-tasks from intent
|
||||
...
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 🎯 Goals & Metrics
|
||||
|
||||
### Success Criteria
|
||||
1. **INFO logs tell a complete story** - Can understand system flow without DEBUG/TRACE
|
||||
2. **DEBUG logs enable troubleshooting** - Can diagnose issues with context
|
||||
3. **TRACE logs show execution details** - Can see step-by-step for deep debugging
|
||||
4. **No log spam** - Production logs are concise and meaningful
|
||||
5. **Consistent patterns** - Similar modules log similarly
|
||||
|
||||
### Metrics to Track
|
||||
- Lines of logs per request at INFO level: < 20
|
||||
- Lines of logs per request at DEBUG level: < 100
|
||||
- Lines of logs per request at TRACE level: unlimited
|
||||
- Error logs include context: 100%
|
||||
- WARN logs explain impact: 100%
|
||||
|
||||
---
|
||||
|
||||
## 🚀 Next Steps
|
||||
|
||||
1. **Audit** current logging in all 341 files
|
||||
2. **Prioritize** modules by criticality
|
||||
3. **Refactor** module by module following this plan
|
||||
4. **Test** at each log level
|
||||
5. **Document** module-specific patterns
|
||||
6. **Train** team on logging standards
|
||||
7. **Monitor** log volume and usefulness
|
||||
8. **Iterate** based on feedback
|
||||
|
||||
---
|
||||
|
||||
## 📚 References
|
||||
|
||||
- [Rust log crate documentation](https://docs.rs/log/)
|
||||
- [env_logger documentation](https://docs.rs/env_logger/)
|
||||
- [Structured logging best practices](https://www.honeycomb.io/blog/structured-logging/)
|
||||
- [The Log: What every software engineer should know](https://blog.codinghorror.com/the-log-everything-manifesto/)
|
||||
|
||||
---
|
||||
|
||||
**Remember:** Good logging is like good cinematography - it should be invisible when done right, but tell a compelling story when you pay attention to it. 🎬
|
||||
|
|
@ -229,16 +229,13 @@ impl Orchestrator {
|
|||
) -> Result<OrchestrationResult, Box<dyn std::error::Error + Send + Sync>> {
|
||||
let intent_preview = &classification.original_text
|
||||
[..classification.original_text.len().min(80)];
|
||||
info!(
|
||||
"Orchestrator: starting pipeline task={} intent={}",
|
||||
self.task_id, intent_preview
|
||||
);
|
||||
info!("Pipeline starting - task: {}, intent: {}", self.task_id, intent_preview);
|
||||
|
||||
self.broadcast_pipeline_start();
|
||||
|
||||
// ── Stage 1: PLAN ──────────────────────────────────────────────────
|
||||
if let Err(e) = self.execute_plan_stage(classification).await {
|
||||
error!("Plan stage failed: {e}");
|
||||
error!("Stage 1 PLAN failed: {}", e);
|
||||
return Ok(self.failure_result(0, &format!("Planning failed: {e}")));
|
||||
}
|
||||
|
||||
|
|
@ -249,7 +246,7 @@ impl Orchestrator {
|
|||
{
|
||||
Ok(pair) => pair,
|
||||
Err(e) => {
|
||||
error!("Build stage failed: {e}");
|
||||
error!("Stage 2 BUILD failed: {}", e);
|
||||
return Ok(self.failure_result(1, &format!("Build failed: {e}")));
|
||||
}
|
||||
};
|
||||
|
|
@ -275,6 +272,10 @@ impl Orchestrator {
|
|||
.map(|r| format!("✓ {} table created", r.name))
|
||||
.collect();
|
||||
|
||||
// Log final summary
|
||||
info!("Pipeline complete - task: {}, nodes: {}, resources: {}, url: {}",
|
||||
self.task_id, node_count, resources.len(), app_url);
|
||||
|
||||
let message = format!(
|
||||
"Got it. Here's the plan: I broke it down in **{node_count} nodes**.\n\n{}\n\nApp deployed at **{app_url}**",
|
||||
if resource_summary.is_empty() {
|
||||
|
|
@ -317,6 +318,8 @@ impl Orchestrator {
|
|||
&mut self,
|
||||
classification: &ClassifiedIntent,
|
||||
) -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
|
||||
info!("Stage 1 PLAN starting - Agent #1 analyzing request");
|
||||
|
||||
self.update_stage(0, StageStatus::Running);
|
||||
self.update_agent_status(1, AgentStatus::Working, Some("Analyzing request"));
|
||||
self.broadcast_thought(
|
||||
|
|
@ -353,6 +356,7 @@ impl Orchestrator {
|
|||
|
||||
self.update_stage(0, StageStatus::Completed);
|
||||
self.update_agent_status(1, AgentStatus::Evolved, None);
|
||||
info!("Stage 1 PLAN complete - {} nodes planned", node_count);
|
||||
Ok(())
|
||||
}
|
||||
|
||||
|
|
@ -365,6 +369,8 @@ impl Orchestrator {
|
|||
classification: &ClassifiedIntent,
|
||||
session: &UserSession,
|
||||
) -> Result<(String, Vec<CreatedResource>), Box<dyn std::error::Error + Send + Sync>> {
|
||||
info!("Stage 2 BUILD starting - Agent #2 generating code");
|
||||
|
||||
self.update_stage(1, StageStatus::Running);
|
||||
self.update_agent_status(2, AgentStatus::Bred, Some("Preparing build"));
|
||||
self.broadcast_thought(2, "Builder agent bred. Starting code generation...");
|
||||
|
|
@ -437,9 +443,12 @@ impl Orchestrator {
|
|||
|
||||
self.update_stage(1, StageStatus::Completed);
|
||||
self.update_agent_status(2, AgentStatus::Evolved, None);
|
||||
info!("Stage 2 BUILD complete - {} resources, url: {}",
|
||||
resources.len(), app_url);
|
||||
Ok((app_url, resources))
|
||||
}
|
||||
Err(e) => {
|
||||
error!("Stage 2 BUILD failed: {}", e);
|
||||
self.update_stage(1, StageStatus::Failed);
|
||||
self.update_agent_status(2, AgentStatus::Failed, Some("Build failed"));
|
||||
Err(e)
|
||||
|
|
@ -452,6 +461,8 @@ impl Orchestrator {
|
|||
// =========================================================================
|
||||
|
||||
async fn execute_review_stage(&mut self, resources: &[CreatedResource]) {
|
||||
info!("Stage 3 REVIEW starting - Agent #3 checking code quality");
|
||||
|
||||
self.update_stage(2, StageStatus::Running);
|
||||
self.update_agent_status(3, AgentStatus::Bred, Some("Starting review"));
|
||||
self.broadcast_thought(
|
||||
|
|
@ -486,6 +497,7 @@ impl Orchestrator {
|
|||
|
||||
self.update_stage(2, StageStatus::Completed);
|
||||
self.update_agent_status(3, AgentStatus::Evolved, None);
|
||||
info!("Stage 3 REVIEW complete - all checks passed");
|
||||
}
|
||||
|
||||
// =========================================================================
|
||||
|
|
@ -493,6 +505,8 @@ impl Orchestrator {
|
|||
// =========================================================================
|
||||
|
||||
async fn execute_deploy_stage(&mut self, app_url: &str) {
|
||||
info!("Stage 4 DEPLOY starting - Agent #4 deploying to {}", app_url);
|
||||
|
||||
self.update_stage(3, StageStatus::Running);
|
||||
self.update_agent_status(4, AgentStatus::Bred, Some("Preparing deploy"));
|
||||
self.broadcast_thought(
|
||||
|
|
@ -518,6 +532,7 @@ impl Orchestrator {
|
|||
|
||||
self.update_stage(3, StageStatus::Completed);
|
||||
self.update_agent_status(4, AgentStatus::Evolved, None);
|
||||
info!("Stage 4 DEPLOY complete - app live at {}", app_url);
|
||||
}
|
||||
|
||||
// =========================================================================
|
||||
|
|
@ -525,6 +540,8 @@ impl Orchestrator {
|
|||
// =========================================================================
|
||||
|
||||
async fn execute_monitor_stage(&mut self, app_url: &str) {
|
||||
info!("Stage 5 MONITOR starting - Agent #1 setting up monitoring");
|
||||
|
||||
self.update_stage(4, StageStatus::Running);
|
||||
self.broadcast_thought(
|
||||
1,
|
||||
|
|
@ -544,6 +561,7 @@ impl Orchestrator {
|
|||
self.broadcast_activity(1, "monitor_complete", &activity);
|
||||
|
||||
self.update_stage(4, StageStatus::Completed);
|
||||
info!("Stage 5 MONITOR complete - monitoring active");
|
||||
}
|
||||
|
||||
// =========================================================================
|
||||
|
|
|
|||
|
|
@ -146,9 +146,9 @@ impl BootstrapManager {
|
|||
}
|
||||
}
|
||||
|
||||
if pm.is_installed("minio") {
|
||||
if pm.is_installed("drive") {
|
||||
info!("Starting MinIO...");
|
||||
match pm.start("minio") {
|
||||
match pm.start("drive") {
|
||||
Ok(_child) => {
|
||||
info!("MinIO started");
|
||||
}
|
||||
|
|
|
|||
|
|
@ -273,8 +273,11 @@ impl BotOrchestrator {
|
|||
}
|
||||
|
||||
info!(
|
||||
"Bot mounting complete: {} bots processed ({} created, {} already existed)",
|
||||
bots_mounted,
|
||||
"BotServer ready - {} bots loaded",
|
||||
bots_mounted
|
||||
);
|
||||
log::debug!(
|
||||
"Bot mounting details: {} created, {} already existed",
|
||||
bots_created,
|
||||
bots_mounted - bots_created
|
||||
);
|
||||
|
|
@ -393,7 +396,7 @@ impl BotOrchestrator {
|
|||
.execute(&mut conn)
|
||||
.map_err(|e| format!("Failed to create bot: {e}"))?;
|
||||
|
||||
info!("Created bot '{}' with ID '{}'", bot_name, bot_id);
|
||||
info!("User system created resource: bot {}", bot_id);
|
||||
Ok(())
|
||||
}
|
||||
|
||||
|
|
|
|||
|
|
@ -188,6 +188,9 @@ impl SessionManager {
|
|||
error!("Failed to create session in database: {}", e);
|
||||
e
|
||||
})?;
|
||||
|
||||
log::info!("User {} created resource: session {}", verified_uid, inserted.id);
|
||||
|
||||
Ok(inserted)
|
||||
}
|
||||
|
||||
|
|
|
|||
|
|
@ -186,6 +186,8 @@ pub async fn write_file(
|
|||
.await
|
||||
.map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, Json(serde_json::json!({"error": e.to_string()}))))?;
|
||||
|
||||
log::info!("User system created resource: drive_file {}", key);
|
||||
|
||||
Ok(Json(serde_json::json!({"success": true})))
|
||||
}
|
||||
|
||||
|
|
@ -207,6 +209,10 @@ pub async fn delete_file(
|
|||
.await
|
||||
.map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, Json(serde_json::json!({"error": e.to_string()}))))?;
|
||||
|
||||
// Technically a deletion, but we could log it as an audit change or leave it out
|
||||
// The plan specifies resource creation: `info!("User {} created resource: {}", user_id, resource_id);`
|
||||
log::info!("User system deleted resource: drive_file {}", file_id);
|
||||
|
||||
Ok(Json(serde_json::json!({"success": true})))
|
||||
}
|
||||
|
||||
|
|
@ -239,6 +245,8 @@ pub async fn create_folder(
|
|||
.await
|
||||
.map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, Json(serde_json::json!({"error": e.to_string()}))))?;
|
||||
|
||||
log::info!("User system created resource: drive_folder {}", key);
|
||||
|
||||
Ok(Json(serde_json::json!({"success": true})))
|
||||
}
|
||||
|
||||
|
|
|
|||
|
|
@ -50,7 +50,7 @@ impl LocalFileMonitor {
|
|||
}
|
||||
|
||||
pub async fn start_monitoring(&self) -> Result<(), Box<dyn Error + Send + Sync>> {
|
||||
trace!("Starting local file monitor for /opt/gbo/data/*.gbai directories");
|
||||
info!("Local file monitor started - watching /opt/gbo/data/*.gbai directories");
|
||||
|
||||
// Create data directory if it doesn't exist
|
||||
if let Err(e) = tokio::fs::create_dir_all(&self.data_dir).await {
|
||||
|
|
@ -68,7 +68,7 @@ impl LocalFileMonitor {
|
|||
monitor.monitoring_loop().await;
|
||||
});
|
||||
|
||||
trace!("Local file monitor started");
|
||||
debug!("Local file monitor successfully initialized");
|
||||
Ok(())
|
||||
}
|
||||
|
||||
|
|
@ -116,7 +116,7 @@ impl LocalFileMonitor {
|
|||
EventKind::Create(_) | EventKind::Modify(_) | EventKind::Any => {
|
||||
for path in &event.paths {
|
||||
if self.is_gbdialog_file(path) {
|
||||
trace!("Detected change: {:?}", path);
|
||||
debug!("Detected change in: {:?}", path);
|
||||
if let Err(e) = self.compile_local_file(path).await {
|
||||
error!("Failed to compile {:?}: {}", path, e);
|
||||
}
|
||||
|
|
@ -126,7 +126,7 @@ impl LocalFileMonitor {
|
|||
EventKind::Remove(_) => {
|
||||
for path in &event.paths {
|
||||
if self.is_gbdialog_file(path) {
|
||||
trace!("File removed: {:?}", path);
|
||||
debug!("File removed: {:?}", path);
|
||||
self.remove_file_state(path).await;
|
||||
}
|
||||
}
|
||||
|
|
@ -167,7 +167,7 @@ impl LocalFileMonitor {
|
|||
}
|
||||
|
||||
async fn scan_and_compile_all(&self) -> Result<(), Box<dyn Error + Send + Sync>> {
|
||||
debug!("[LOCAL_MONITOR] Scanning /opt/gbo/data for .gbai directories");
|
||||
trace!("Scanning directory: {:?}", self.data_dir);
|
||||
|
||||
let entries = match tokio::fs::read_dir(&self.data_dir).await {
|
||||
Ok(e) => e,
|
||||
|
|
@ -203,7 +203,7 @@ impl LocalFileMonitor {
|
|||
}
|
||||
|
||||
async fn compile_gbdialog(&self, bot_name: &str, gbdialog_path: &Path) -> Result<(), Box<dyn Error + Send + Sync>> {
|
||||
debug!("[LOCAL_MONITOR] Processing .gbdialog for bot: {}", bot_name);
|
||||
info!("Compiling bot: {}", bot_name);
|
||||
|
||||
let entries = tokio::fs::read_dir(gbdialog_path).await?;
|
||||
let mut entries = entries;
|
||||
|
|
@ -231,7 +231,7 @@ impl LocalFileMonitor {
|
|||
};
|
||||
|
||||
if should_compile {
|
||||
trace!("Compiling: {:?}", path);
|
||||
debug!("Recompiling {:?} - modification detected", path);
|
||||
if let Err(e) = self.compile_local_file(&path).await {
|
||||
error!("Failed to compile {:?}: {}", path, e);
|
||||
}
|
||||
|
|
|
|||
|
|
@ -184,7 +184,7 @@ async fn main() -> std::io::Result<()> {
|
|||
e
|
||||
);
|
||||
} else {
|
||||
info!("SecretsManager initialized - fetching secrets from Vault");
|
||||
info!("Secrets loaded from Vault");
|
||||
}
|
||||
} else {
|
||||
trace!("Bootstrap not complete - skipping early SecretsManager init");
|
||||
|
|
@ -195,6 +195,7 @@ async fn main() -> std::io::Result<()> {
|
|||
aws_smithy_runtime=off,aws_smithy_runtime_api=off,aws_sdk_s3=off,aws_config=off,\
|
||||
aws_credential_types=off,aws_http=off,aws_sig_auth=off,aws_types=off,\
|
||||
mio=off,tokio=off,tokio_util=off,tower=off,tower_http=off,\
|
||||
tokio_tungstenite=off,tungstenite=off,\
|
||||
reqwest=off,hyper=off,hyper_util=off,h2=off,\
|
||||
rustls=off,rustls_pemfile=off,tokio_rustls=off,\
|
||||
tracing=off,tracing_core=off,tracing_subscriber=off,\
|
||||
|
|
@ -290,6 +291,7 @@ async fn main() -> std::io::Result<()> {
|
|||
dotenv().ok();
|
||||
|
||||
let pool = init_database(&progress_tx).await?;
|
||||
info!("Database initialized - PostgreSQL connected");
|
||||
let refreshed_cfg = load_config(&pool).await?;
|
||||
let config = std::sync::Arc::new(refreshed_cfg.clone());
|
||||
|
||||
|
|
@ -379,7 +381,7 @@ async fn main() -> std::io::Result<()> {
|
|||
trace!("Initial data setup task spawned");
|
||||
trace!("All system threads started, starting HTTP server...");
|
||||
|
||||
info!("Starting HTTP server on port {}...", config.server.port);
|
||||
info!("Server started on port {}", config.server.port);
|
||||
if let Err(e) = run_axum_server(app_state, config.server.port, worker_count).await {
|
||||
error!("Failed to start HTTP server: {}", e);
|
||||
std::process::exit(1);
|
||||
|
|
@ -392,5 +394,3 @@ async fn main() -> std::io::Result<()> {
|
|||
|
||||
Ok(())
|
||||
}
|
||||
|
||||
|
||||
|
|
|
|||
|
|
@ -302,7 +302,7 @@ pub async fn init_redis() -> Option<Arc<redis::Client>> {
|
|||
// Test with PING
|
||||
match redis::cmd("PING").query::<String>(&mut conn) {
|
||||
Ok(response) if response == "PONG" => {
|
||||
log::info!("Cache connection verified: PONG");
|
||||
log::info!("Cache initialized - Valkey connected");
|
||||
Ok(Some(Arc::new(client)))
|
||||
}
|
||||
Ok(response) => {
|
||||
|
|
|
|||
|
|
@ -7,7 +7,7 @@ use serde::{Deserialize, Serialize};
|
|||
use std::collections::HashSet;
|
||||
use std::sync::Arc;
|
||||
use tokio::sync::RwLock;
|
||||
use tracing::{debug, info};
|
||||
use tracing::info;
|
||||
use uuid::Uuid;
|
||||
|
||||
#[derive(Debug, Clone, Serialize, Deserialize)]
|
||||
|
|
@ -490,14 +490,14 @@ impl JwtManager {
|
|||
claims.session_id,
|
||||
)?;
|
||||
|
||||
debug!("Refreshed tokens for user {user_id}");
|
||||
info!("Tokens refreshed for user {}", user_id);
|
||||
Ok(new_pair)
|
||||
}
|
||||
|
||||
pub async fn revoke_token(&self, jti: &str) -> Result<()> {
|
||||
let mut blacklist = self.blacklist.write().await;
|
||||
blacklist.insert(jti.to_string());
|
||||
debug!("Revoked token {jti}");
|
||||
info!("Token revoked: {}", jti);
|
||||
Ok(())
|
||||
}
|
||||
|
||||
|
|
|
|||
Loading…
Add table
Reference in a new issue