- Add vector_db_health_check() function to verify Qdrant availability
- Add wait loop for vector_db startup in bootstrap (15 seconds)
- Fallback to local LLM when external URL configured but no API key provided
- Prevent external LLM (api.z.ai) usage without authentication key
This fixes the production issues:
- Qdrant vector database not available at https://localhost:6333
- External LLM being used instead of local when no key is configured
- Ensures vector_db is properly started and ready before use
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
The previous /dev/tcp test was giving false positives, reporting that
Valkey was running when it was actually down. This caused bootstrap to
skip starting Valkey, leading to botserver hanging on cache connection.
Changes:
- Use nc (netcat) with -z flag for reliable port checking
- Final fallback: /dev/tcp with actual PING/PONG verification
- Only returns true if port is open AND responds correctly
This ensures cache_health_check() accurately reports Valkey status.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Try valkey-cli first (preferred for Valkey installations)
- Fall back to redis-cli (for Redis installations)
- Fall back to TCP connection test (works for both)
This fixes environments that only have Valkey installed without
Redis symlinks or redis-cli.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Fix component name mismatch: "redis" -> "cache" in bootstrap_manager
- Add cache_health_check() function to verify Valkey is responding
- Add health check loop after starting cache (12s wait with PING test)
- Ensures cache is ready before proceeding with bootstrap
This fixes the issue where botserver would hang waiting for cache
connection because the cache component was never started.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Use local pg_isready path when available (./botserver-stack/bin/tables/bin/pg_isready)
- Fall back to system pg_isready if local binary not found
- Prevents 30-second timeout during bootstrap when PostgreSQL is actually running
- Applied to both readiness checks in start_all() method
- Removed -d 'postgres' parameter from pg_isready health checks
- Health check now only verifies server connection on port 5432
- Fixes false positive failures when PostgreSQL is running but specific database has issues
- PostgreSQL logs showed 'database system is ready' but health check was failing
Add pg_isready health check to the 'already running' branch to ensure
PostgreSQL is properly detected as ready, even when running as a
non-interactive user (sudo -u gbuser).
This complements the previous fix for fresh PostgreSQL starts.
Changed pg_isready checks from '-U gbuser' to '-d postgres' to properly
detect PostgreSQL readiness during bootstrap. The gbuser database doesn't
exist yet during startup, causing pg_isready to fail and bootstrap to timeout.
This fixes the issue when running botserver as a non-interactive user
(e.g., sudo -u gbuser).
- Modify bootstrap to read .valid file and validate templates before loading
- Templates not in .valid file are skipped during bootstrap
- Backward compatible: if .valid file missing, all templates are loaded
- Enables controlled template loading during bootstrap
- shell_script_arg blocks $( and backticks for user input safety
- trusted_shell_script_arg allows these for internal installer scripts
- Internal scripts need shell features like command substitution
- Updated bootstrap, installer, facade, and llm modules
- Allow &, ?, = in URL arguments (http:// or https://)
- Allow // pattern in URLs (needed for protocol)
- These are safe since Command::new().args() doesn't use shell
- Fixes Vault health check with query parameters
- Add debug logging to safe_curl and vault_health_check
- Add vault_health_check() function that checks if client certs exist
- If certs exist: use mTLS (secure, post-installation)
- If certs don't exist yet: use plain TLS (during initial bootstrap)
- This allows bootstrap to complete while maintaining mTLS security after setup
- No security hole: mTLS is enforced once certs are generated
- Remove tls_client_ca_file from vault config templates
- Remove --cert/--key from health checks
- TLS still enabled for encryption, just no client cert required
- TODO: Re-enable mTLS when binary with cert health checks is compiled
- Keep mTLS enabled for security (even in dev)
- Add --cert and --key to all curl commands for Vault health checks
- Fix fetch_vault_credentials to use https and mTLS
- Fix Zitadel commands to use https with VAULT_CACERT
- All Vault communications now use proper mutual TLS
- Remove tls_client_ca_file from vault config in installer.rs (Linux and macOS)
- Remove tls_client_ca_file from vault config in bootstrap/mod.rs
- TLS encryption still enabled, just no client certificate required
- Health checks now work with simple -sk curl flags
Major additions:
- Video editing engine with AI features (transcription, captions, TTS, scene detection)
- RBAC middleware and organization management
- Security enhancements (MFA, passkey, DLP, encryption, audit)
- Billing and subscription management
- Contacts management
- Dashboards module
- Learn/LMS module
- Social features
- Compliance (SOC2, SOP middleware, vulnerability scanner)
- New migrations for RBAC, learn, and video tables
- Fix MinIO health check to use HTTPS instead of HTTP
- Add Vault connectivity check before fetching credentials
- Add CA cert configuration for S3 client
- Add Qdrant vector_db setup with TLS configuration
- Fix Qdrant default URL to use HTTPS
- Always sync templates to S3 buckets (not just on create)
- Skip .gbkb root files, only index files in subfolders
Database Schema v7.0.0:
- Create new 'gb' schema with PostgreSQL ENUMs instead of VARCHAR for all domain values
- Add sharding infrastructure (shard_config, tenant_shard_map tables)
- Implement partitioned tables for sessions, messages, and analytics (monthly partitions)
- Add Snowflake-like ID generation for distributed systems
- Design for billion-user scale with proper indexing strategies
Rust Enums:
- Add comprehensive enum types in core/shared/enums.rs
- Implement ToSql/FromSql for Diesel ORM integration
- Include: ChannelType, MessageRole, MessageType, LlmProvider, ContextProvider
- Include: TaskStatus, TaskPriority, ExecutionMode, RiskLevel, ApprovalStatus, IntentType
- All enums stored as SMALLINT for efficiency
Other fixes:
- Fix hardcoded gpt-4 model in auto_task modules to use bot config
- Add vector_db to required bootstrap components
- Add Qdrant health check before KB indexing
- Change verbose START messages to trace level
- Fix episodic memory role handling in Claude client
- Disable auth for /api routes during development
This is a DESTRUCTIVE migration - only for fresh installations.
- Change from error to warn when bucket creation fails
- Continue bootstrap without drive if MinIO not available
- Prevents startup failure when S3 not configured
- Fix match arms with identical bodies by consolidating patterns
- Fix case-insensitive file extension comparisons using eq_ignore_ascii_case
- Fix unnecessary Debug formatting in log/format macros
- Fix clone_from usage instead of clone assignment
- Fix let...else patterns where appropriate
- Fix format! append to String using write! macro
- Fix unwrap_or with function calls to use unwrap_or_else
- Add missing fields to manual Debug implementations
- Fix duplicate code in if blocks
- Add type aliases for complex types
- Rename struct fields to avoid common prefixes
- Various other clippy warning fixes
Note: Some 'unused async' warnings remain for functions that are
called with .await but don't contain await internally - these are
kept async for API compatibility.