- Move system_prompt retrieval inside spawn_blocking closure
- Include system_prompt in the return tuple to fix scope issue
- Add trace logging for debugging system-prompt loading
- GLM-5 and other LLM providers now correctly receive custom system prompts
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add detailed logging for all 5 pipeline stages (PLAN, BUILD, REVIEW, DEPLOY, MONITOR)
- Log stage start/complete events with agent IDs and progress details
- Add resource creation/deletion logging in drive_handlers
- Improve pipeline summary logging with task ID, nodes, resources, and URL
This addresses the requirement for textual progress in console logs.
- Fix PAT extraction timing with retry loop (waits up to 60s for PAT in logs)
- Add sync command to flush filesystem buffers before extraction
- Improve logging with progress messages and PAT verification
- Refactor setup code into consolidated setup.rs module
- Fix YAML indentation for PatPath and MachineKeyPath
- Change Zitadel init parameter from --config to --steps
The timing issue occurred because:
1. Zitadel writes PAT to logs at startup (~18:08:59)
2. Post-install extraction ran too early (~18:09:35)
3. PAT file wasn't created until ~18:10:38 (63s after installation)
4. OAuth client creation failed because PAT file didn't exist yet
With the retry loop:
- Waits for PAT to appear in logs with sync+grep check
- Extracts PAT immediately when found
- OAuth client creation succeeds
- directory_config.json saved with valid credentials
- Login flow works end-to-end
Tested: Full reset.sh and login verification successful
- Added OAuth client creation to the 'already running' branch
- Previously, OAuth client was only created when Zitadel was started
- This fixes the issue where OAuth client wasn't created on subsequent runs
Fixes the OAuth client creation issue discovered during testing.
Fixed compilation errors by adding .await to all ensure_admin_token() calls:
- create_organization()
- create_user()
- save_config()
The method was made async but the calls weren't updated.
- Add password grant authentication support in DirectorySetup
- Extract initial admin credentials from Zitadel log file
- Fix race condition in Zitadel startup (wait for health check before starting)
- Create parent directories before saving config
- Add retry logic for OAuth client creation
- Improve error handling with detailed messages
Fixes authentication service not configured error after bootstrap.
- Add vector_db_health_check() function to verify Qdrant availability
- Add wait loop for vector_db startup in bootstrap (15 seconds)
- Fallback to local LLM when external URL configured but no API key provided
- Prevent external LLM (api.z.ai) usage without authentication key
This fixes the production issues:
- Qdrant vector database not available at https://localhost:6333
- External LLM being used instead of local when no key is configured
- Ensures vector_db is properly started and ready before use
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
The previous /dev/tcp test was giving false positives, reporting that
Valkey was running when it was actually down. This caused bootstrap to
skip starting Valkey, leading to botserver hanging on cache connection.
Changes:
- Use nc (netcat) with -z flag for reliable port checking
- Final fallback: /dev/tcp with actual PING/PONG verification
- Only returns true if port is open AND responds correctly
This ensures cache_health_check() accurately reports Valkey status.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Try valkey-cli first (preferred for Valkey installations)
- Fall back to redis-cli (for Redis installations)
- Fall back to TCP connection test (works for both)
This fixes environments that only have Valkey installed without
Redis symlinks or redis-cli.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Fix component name mismatch: "redis" -> "cache" in bootstrap_manager
- Add cache_health_check() function to verify Valkey is responding
- Add health check loop after starting cache (12s wait with PING test)
- Ensures cache is ready before proceeding with bootstrap
This fixes the issue where botserver would hang waiting for cache
connection because the cache component was never started.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Use local pg_isready path when available (./botserver-stack/bin/tables/bin/pg_isready)
- Fall back to system pg_isready if local binary not found
- Prevents 30-second timeout during bootstrap when PostgreSQL is actually running
- Applied to both readiness checks in start_all() method
- Removed -d 'postgres' parameter from pg_isready health checks
- Health check now only verifies server connection on port 5432
- Fixes false positive failures when PostgreSQL is running but specific database has issues
- PostgreSQL logs showed 'database system is ready' but health check was failing
Add pg_isready health check to the 'already running' branch to ensure
PostgreSQL is properly detected as ready, even when running as a
non-interactive user (sudo -u gbuser).
This complements the previous fix for fresh PostgreSQL starts.
Changed pg_isready checks from '-U gbuser' to '-d postgres' to properly
detect PostgreSQL readiness during bootstrap. The gbuser database doesn't
exist yet during startup, causing pg_isready to fail and bootstrap to timeout.
This fixes the issue when running botserver as a non-interactive user
(e.g., sudo -u gbuser).
- Modify bootstrap to read .valid file and validate templates before loading
- Templates not in .valid file are skipped during bootstrap
- Backward compatible: if .valid file missing, all templates are loaded
- Enables controlled template loading during bootstrap
- shell_script_arg blocks $( and backticks for user input safety
- trusted_shell_script_arg allows these for internal installer scripts
- Internal scripts need shell features like command substitution
- Updated bootstrap, installer, facade, and llm modules
- Allow &, ?, = in URL arguments (http:// or https://)
- Allow // pattern in URLs (needed for protocol)
- These are safe since Command::new().args() doesn't use shell
- Fixes Vault health check with query parameters
- Add debug logging to safe_curl and vault_health_check
- Add vault_health_check() function that checks if client certs exist
- If certs exist: use mTLS (secure, post-installation)
- If certs don't exist yet: use plain TLS (during initial bootstrap)
- This allows bootstrap to complete while maintaining mTLS security after setup
- No security hole: mTLS is enforced once certs are generated
- Remove tls_client_ca_file from vault config templates
- Remove --cert/--key from health checks
- TLS still enabled for encryption, just no client cert required
- TODO: Re-enable mTLS when binary with cert health checks is compiled
- Keep mTLS enabled for security (even in dev)
- Add --cert and --key to all curl commands for Vault health checks
- Fix fetch_vault_credentials to use https and mTLS
- Fix Zitadel commands to use https with VAULT_CACERT
- All Vault communications now use proper mutual TLS
- Remove tls_client_ca_file from vault config in installer.rs (Linux and macOS)
- Remove tls_client_ca_file from vault config in bootstrap/mod.rs
- TLS encryption still enabled, just no client certificate required
- Health checks now work with simple -sk curl flags
Major additions:
- Video editing engine with AI features (transcription, captions, TTS, scene detection)
- RBAC middleware and organization management
- Security enhancements (MFA, passkey, DLP, encryption, audit)
- Billing and subscription management
- Contacts management
- Dashboards module
- Learn/LMS module
- Social features
- Compliance (SOC2, SOP middleware, vulnerability scanner)
- New migrations for RBAC, learn, and video tables
- Fix MinIO health check to use HTTPS instead of HTTP
- Add Vault connectivity check before fetching credentials
- Add CA cert configuration for S3 client
- Add Qdrant vector_db setup with TLS configuration
- Fix Qdrant default URL to use HTTPS
- Always sync templates to S3 buckets (not just on create)
- Skip .gbkb root files, only index files in subfolders
Database Schema v7.0.0:
- Create new 'gb' schema with PostgreSQL ENUMs instead of VARCHAR for all domain values
- Add sharding infrastructure (shard_config, tenant_shard_map tables)
- Implement partitioned tables for sessions, messages, and analytics (monthly partitions)
- Add Snowflake-like ID generation for distributed systems
- Design for billion-user scale with proper indexing strategies
Rust Enums:
- Add comprehensive enum types in core/shared/enums.rs
- Implement ToSql/FromSql for Diesel ORM integration
- Include: ChannelType, MessageRole, MessageType, LlmProvider, ContextProvider
- Include: TaskStatus, TaskPriority, ExecutionMode, RiskLevel, ApprovalStatus, IntentType
- All enums stored as SMALLINT for efficiency
Other fixes:
- Fix hardcoded gpt-4 model in auto_task modules to use bot config
- Add vector_db to required bootstrap components
- Add Qdrant health check before KB indexing
- Change verbose START messages to trace level
- Fix episodic memory role handling in Claude client
- Disable auth for /api routes during development
This is a DESTRUCTIVE migration - only for fresh installations.