4313 commits

Author SHA1 Message Date
723407cfd6 fix: add 60s timeout to LLM stream reads and add concurrent scan guard
All checks were successful
BotServer CI/CD / build (push) Successful in 3m53s
- Add tokio timeout to SSE stream reads in OpenAI client (60s)
- Prevents indefinite hang when Kimi/Nvidia stops responding
- Add scanning AtomicBool to prevent concurrent check_gbkb_changes calls
- Skip GBKB scan entirely when all KBs already indexed in Qdrant

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-13 12:58:11 -03:00
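The concurrent scan guard described in 723407cfd6 can be sketched with a compare-and-swap on an AtomicBool. This is a minimal illustration using only the standard library, not the actual BotServer code: ScanGuard and run_scan_once are hypothetical names, and the real check_gbkb_changes is async.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical RAII guard: holds the "scanning" flag until it is dropped.
pub struct ScanGuard<'a> {
    flag: &'a AtomicBool,
}

impl<'a> ScanGuard<'a> {
    /// Returns `Some(guard)` if no scan was running, `None` otherwise.
    pub fn try_acquire(flag: &'a AtomicBool) -> Option<Self> {
        flag.compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .ok()
            .map(|_| ScanGuard { flag })
    }
}

impl Drop for ScanGuard<'_> {
    fn drop(&mut self) {
        // Clear the flag even when the scan returns early or panics.
        self.flag.store(false, Ordering::Release);
    }
}

/// Runs `scan` unless another scan is in progress; returns whether it ran.
pub fn run_scan_once(flag: &AtomicBool, scan: impl FnOnce()) -> bool {
    match ScanGuard::try_acquire(flag) {
        Some(_guard) => {
            scan();
            true
        }
        None => false,
    }
}
```

Releasing the flag in Drop rather than at the end of the scan means an early return cannot leave the guard stuck in the "scanning" state.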
c1df15eb48 fix: skip GBKB scan when all KBs already indexed in Qdrant
All checks were successful
BotServer CI/CD / build (push) Successful in 3m39s
- Check kb_indexed_folders before acquiring file_states write lock
- Eliminates deadlock from concurrent check_gbkb_changes calls
- Prevents unnecessary PDF re-downloads every 10 seconds
- Removes debug logging, adds clean early-return

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-13 12:22:11 -03:00
326305d55e debug: add LLM output traces to diagnose blank HTML rendering issue
All checks were successful
BotServer CI/CD / build (push) Successful in 4m0s
- Log full LLM response preview (500 chars) with has_html detection
- Log WebSocket send with message type, completeness, and content preview
- Use clone() for chunk in BotResponse to ensure accurate logging

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-13 11:57:43 -03:00
d1652fc413 feat: add build_date to health endpoint for CI deploy verification
All checks were successful
BotServer CI/CD / build (push) Successful in 4m21s
- Add BOTSERVER_BUILD_DATE env var to /api/health response
- Set build date during CI compilation via environment variable
- Enables checking deployed binary age without SSH access

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-13 11:49:10 -03:00
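One way to get a build date into the binary, as d1652fc413 describes, is to read the variable at compile time. The sketch below assumes BOTSERVER_BUILD_DATE is exported in the CI environment when the compiler runs; build_date and health_body are hypothetical helpers, and the JSON is assembled by hand without escaping purely for illustration.

```rust
/// Build date captured at compile time, or "unknown" when the variable
/// was not set while compiling (for example in a local build).
pub fn build_date() -> &'static str {
    option_env!("BOTSERVER_BUILD_DATE").unwrap_or("unknown")
}

/// Hypothetical health payload; real code would use a JSON serializer.
pub fn health_body(status: &str) -> String {
    format!(
        "{{\"status\":\"{}\",\"build_date\":\"{}\"}}",
        status,
        build_date()
    )
}
```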
4fb626399d fix: prevent infinite KB reindexing loop by using last_modified as primary change detector
All checks were successful
BotServer CI/CD / build (push) Successful in 4m2s
- Use last_modified timestamp instead of ETag for change detection
- Skip re-queueing KBs that are already indexed in Qdrant
- Preserve indexed status across scans when content unchanged
- Add normalize_etag helper for consistent ETag comparison

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-13 11:24:37 -03:00
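The change detection in 4fb626399d could look roughly like the following. The ObjectState struct, the has_changed and merge_state functions, and the exact normalization rules are assumptions made for illustration; only the name normalize_etag and the idea of comparing last_modified first come from the commit message.

```rust
/// Hypothetical per-object state kept between scans.
#[derive(Debug, Clone, PartialEq)]
pub struct ObjectState {
    pub etag: String,
    pub last_modified: Option<u64>, // seconds since the Unix epoch
    pub indexed: bool,
}

/// Strips whitespace, a weak-validator prefix, and surrounding quotes
/// so that `"abc"` and `abc` compare equal.
pub fn normalize_etag(etag: &str) -> String {
    let trimmed = etag.trim();
    let trimmed = trimmed.strip_prefix("W/").unwrap_or(trimmed);
    trimmed.trim_matches('"').to_string()
}

/// last_modified is the primary signal; the ETag is only consulted
/// when a timestamp is missing on either side.
pub fn has_changed(prev: &ObjectState, etag: &str, last_modified: Option<u64>) -> bool {
    match (prev.last_modified, last_modified) {
        (Some(old), Some(new)) => old != new,
        _ => normalize_etag(&prev.etag) != normalize_etag(etag),
    }
}

/// Builds the new state, preserving `indexed` when the content is unchanged.
pub fn merge_state(
    prev: Option<&ObjectState>,
    etag: &str,
    last_modified: Option<u64>,
) -> ObjectState {
    let indexed = prev.map_or(false, |p| p.indexed && !has_changed(p, etag, last_modified));
    ObjectState {
        etag: normalize_etag(etag),
        last_modified,
        indexed,
    }
}
```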
e98dc47ea1 fix: TOOL_EXEC with USE KB now falls through to LLM pipeline for KB-injected response
All checks were successful
BotServer CI/CD / build (push) Successful in 3m50s
When a tool button like Cartas activates a KB via USE KB, instead of
returning just the tool result (empty/label), the handler now checks
if session has active KBs. If so and result is empty/trivial,
falls through to the full LLM pipeline which injects KB context.
2026-04-13 10:02:47 -03:00
1f77d7f099 fix: skip KB re-indexing when kb_collections already has docs, prevents vector DB loop
All checks were successful
BotServer CI/CD / build (push) Successful in 4m5s
2026-04-13 09:53:25 -03:00
86939c17d8 fix: stop KB re-indexing every cycle, add kb_indexed_folders tracking
All checks were successful
BotServer CI/CD / build (push) Successful in 6m13s
- Add kb_indexed_folders set to track successfully indexed KB folders
- Skip re-queuing KB for indexing if already indexed and files unchanged
- Remove kb_key from indexed set when files change (forces re-index)
- Clear indexed set on KB folder deletion
- Fix hardcoded salesianos in drive_monitor prompt key (from previous commit)
2026-04-13 09:37:15 -03:00
dd68cdbe6c fix: remove hardcoded salesianos, strip think tags globally, block reasoning_content leak
All checks were successful
BotServer CI/CD / build (push) Successful in 6m38s
- drive_monitor: replace hardcoded salesianos.gbot with dynamic bot_name
- llm/mod.rs: stop falling back to reasoning_content as content
- llm/claude.rs: same fix for Claude handler
- deepseek_r3: export strip_think_tags for reuse
- gpt_oss_20b: use strip_think_tags so all models strip tags
- gpt_oss_120b: use strip_think_tags so all models strip tags
2026-04-13 09:04:22 -03:00
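For readers unfamiliar with the think-tag cleanup mentioned in dd68cdbe6c, here is a minimal sketch of what a helper like strip_think_tags might do. The real function's behaviour is not shown in this log; in particular, dropping everything after an unclosed tag and trimming the result are assumptions.

```rust
/// Removes every `<think>...</think>` span from `text`.
/// Assumption: an unclosed `<think>` drops the rest of the input.
pub fn strip_think_tags(text: &str) -> String {
    const OPEN: &str = "<think>";
    const CLOSE: &str = "</think>";
    let mut out = String::with_capacity(text.len());
    let mut rest = text;
    while let Some(start) = rest.find(OPEN) {
        // Keep the text before the opening tag.
        out.push_str(&rest[..start]);
        match rest[start..].find(CLOSE) {
            // Skip past the closing tag and keep scanning.
            Some(end) => rest = &rest[start + end + CLOSE.len()..],
            // No closing tag: discard the remainder.
            None => return out.trim().to_string(),
        }
    }
    out.push_str(rest);
    out.trim().to_string()
}
```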
dbec0df923 fix: DriveMonitor config.csv sync uses Last-Modified in addition to ETag
All checks were successful
BotServer CI/CD / build (push) Successful in 5m46s
ETag in MinIO is an MD5 content hash, so re-uploading the same content
preserves the ETag. Add last_modified comparison so config.csv changes
that don't alter content hash still get synced. Also fixes EmbeddingConfig
fallback from previous commit.
2026-04-13 08:33:37 -03:00
1148069652 fix: EmbeddingConfig::from_bot_config fallback to default bot config
All checks were successful
BotServer CI/CD / build (push) Successful in 6m9s
When a bot lacks embedding-url in its own config, from_bot_config now
falls back to the default bot's config via ConfigManager::get_config.
Previously it returned empty string, causing embedding server connection
failures for bots without explicit embedding configuration.
2026-04-13 08:19:42 -03:00
782618e265 fix: ADD_SUGGESTION compilation error - AS keyword case mismatch
All checks were successful
BotServer CI/CD / build (push) Successful in 2m50s
Root cause: compiler converts AS -> as (lowercase keywords) but Rhai
custom syntax expected uppercase 'AS'. Rhai syntax is case-sensitive.

Changed:
- ADD_SUGGESTION_TOOL: 'AS' -> 'as'
- ADD_SUGGESTION_TEXT: 'AS' -> 'as'
- ADD_SUGGESTION: 'AS' -> 'as'

This fixes: 'Syntax error: Expecting AS for ADD_SUGGESTION_TOOL expression'

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-13 07:40:16 -03:00
666acb9360 fix: DEADLOCK in check_gbkb_changes - removed nested file_states read lock
All checks were successful
BotServer CI/CD / build (push) Successful in 3m44s
Root cause: file_states.write().await was held while trying to acquire
file_states.read().await for KB backoff check. Tokio RwLock is not
reentrant - this caused permanent deadlock.

Fix: Removed the file_states.read() backoff check. KB processor now
just checks files_being_indexed set and queues to pending_kb_index.
Backoff is handled by the KB processor itself based on fail_count.

This fixes salesianos DriveMonitor hanging for 5+ minutes every cycle.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-12 22:28:02 -03:00
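The deadlock in 666acb9360 comes from asking for a read lock while the same task still holds the write lock. The sketch below uses std::sync::RwLock as a stand-in for the Tokio lock and shows one safe shape, where every read goes through the write guard that is already held. Note that the commit itself took a different route and removed the check entirely; FileState, FileStates, and mark_indexed_if_healthy are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

#[derive(Debug, Default, Clone)]
pub struct FileState {
    pub fail_count: u32,
    pub indexed: bool,
}

pub type FileStates = RwLock<HashMap<String, FileState>>;

// Deadlock-prone shape (do not do this):
//   let mut w = states.write().unwrap();
//   let r = states.read().unwrap(); // blocks or panics while `w` is alive

/// Safe shape: take one write guard and do the read through it.
/// Returns true when the file was marked as indexed.
pub fn mark_indexed_if_healthy(states: &FileStates, path: &str, max_failures: u32) -> bool {
    let mut guard = states.write().unwrap();
    // Unknown files are treated as healthy (an assumption of this sketch).
    let healthy = guard
        .get(path)
        .map_or(true, |s| s.fail_count < max_failures);
    if healthy {
        guard.entry(path.to_string()).or_default().indexed = true;
    }
    healthy
}
```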
3322234712 debug: add logging to track check_gbkb_changes hang
All checks were successful
BotServer CI/CD / build (push) Successful in 3m40s
Added debug logging at key points in check_gbkb_changes:
- ENTER with bot ID and prefix
- Object listing results
- File states lock acquisition
- New/modified file detection
- PDF detection
- File download batches
- Final remaining files download
- EXIT confirmation

This will help identify exactly where the 5-minute timeout occurs.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-12 22:09:38 -03:00
8e539206d4 fix: KB processor works with and without llm/research features
All checks were successful
BotServer CI/CD / build (push) Successful in 3m55s
- Added stub start_kb_processor() for non-llm builds
- Added _pending_kb_index field for non-llm builds
- Extracted KB processor logic into start_kb_processor_inner()
- Removed unused is_embedding_server_ready import

This ensures DriveMonitor compiles and runs correctly in production
where CI builds without --features llm.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-12 21:40:06 -03:00
112ac51da3 fix: KB processor runs as background task, no longer blocks check_for_changes
All checks were successful
BotServer CI/CD / build (push) Successful in 3m50s
- Added start_kb_processor() method: long-running background task per bot
- check_gbkb_changes now queues KB folders to pending_kb_index (non-blocking)
- KB processor polls pending_kb_index and processes one at a time per bot
- Removed inline tokio::spawn from check_gbkb_changes that was causing 5min timeouts
- Added pending_kb_index field to DriveMonitor struct

This fixes salesianos DriveMonitor timeout - check_for_changes now completes
in seconds instead of hanging on KB embedding/indexing.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-12 21:28:03 -03:00
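The queue-and-worker split from 112ac51da3 can be illustrated with a plain thread and a shared set. This is only a sketch: the real start_kb_processor is a Tokio task on DriveMonitor, whereas the signature below, the stop flag, the 10 ms poll interval, and the queue_kb helper are all assumptions.

```rust
use std::collections::HashSet;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

pub type PendingKbIndex = Arc<Mutex<HashSet<String>>>;

/// Called from the scan: queues a KB folder without blocking on indexing.
/// Returns false when the folder was already queued.
pub fn queue_kb(pending: &PendingKbIndex, kb_key: &str) -> bool {
    pending.lock().unwrap().insert(kb_key.to_string())
}

/// Removes and returns one pending key, if any.
fn take_one(pending: &PendingKbIndex) -> Option<String> {
    let mut set = pending.lock().unwrap();
    let key = set.iter().next().cloned()?;
    set.remove(&key);
    Some(key)
}

/// Background worker: indexes one KB folder at a time until `stop` is set.
pub fn start_kb_processor<F>(
    pending: PendingKbIndex,
    stop: Arc<AtomicBool>,
    mut index_kb: F,
) -> thread::JoinHandle<()>
where
    F: FnMut(&str) + Send + 'static,
{
    thread::spawn(move || {
        while !stop.load(Ordering::Acquire) {
            match take_one(&pending) {
                // The lock is released before the slow indexing work starts.
                Some(key) => index_kb(&key),
                None => thread::sleep(Duration::from_millis(10)),
            }
        }
    })
}
```

Because the scan only inserts into the set, its running time no longer depends on how long embedding and indexing take.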
ad998b52d4 fix: check_gbot only scans .gbot/ folder, not entire bucket
All checks were successful
BotServer CI/CD / build (push) Successful in 4m21s
- Added prefix filter to list_objects_v2 call: only scans {bot}.gbot/
- Removed scanning of .gbkb and .gbdialog paths which caused 5min timeouts
- This fixes salesianos DriveMonitor timeout and embed/index failure

Also fixed header detection for name,value CSV format.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-12 21:02:01 -03:00
4dbc418aab fix: detect name,value header in config.csv
Header detection was only checking for key,value but the actual
CSV uses name,value as header row. Now both are detected and skipped.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-12 20:47:56 -03:00
36fdf52780 fix: sync_gbot_config now handles CSV with or without header row
All checks were successful
BotServer CI/CD / build (push) Successful in 3m32s
- Removed unconditional .skip(1) that was skipping first config line
- Added header detection: skips first line only if it looks like 'key,value' header
- Added validation to skip empty keys
- Also fixed indentation in drive_monitor gbkb file processing

This fixes the issue where config.csv changes on Drive weren't being
synced to bot_configuration database table for salesianos bot.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-12 20:32:30 -03:00
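The header handling from 36fdf52780, together with the name,value follow-up in 4dbc418aab, can be sketched as below. The function names and the simple split on the first comma are assumptions; the actual sync_gbot_config parser is not shown in this log.

```rust
/// True when the line looks like a `key,value` or `name,value` header.
fn is_header(line: &str) -> bool {
    let normalized: String = line
        .chars()
        .filter(|c| !c.is_whitespace())
        .collect::<String>()
        .to_ascii_lowercase();
    normalized == "key,value" || normalized == "name,value"
}

/// Parses config.csv content into (key, value) pairs.
/// The first line is skipped only when it is a header; rows with an
/// empty key or without a comma are ignored.
pub fn parse_config_csv(content: &str) -> Vec<(String, String)> {
    content
        .lines()
        .enumerate()
        .filter(|(i, line)| !(*i == 0 && is_header(line)))
        .filter_map(|(_, line)| {
            let (key, value) = line.split_once(',')?;
            let key = key.trim();
            if key.is_empty() {
                return None;
            }
            Some((key.to_string(), value.trim().to_string()))
        })
        .collect()
}
```

Compared with an unconditional .skip(1), a file without a header row keeps its first setting.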
4cd469afc3 fix: track config.csv ETag to avoid unnecessary syncs
All checks were successful
BotServer CI/CD / build (push) Successful in 5m2s
- Add ETag tracking for config.csv files in DriveMonitor
- Only download and sync config.csv when ETag changes
- Prevents unnecessary database updates on every check
- Uses __config__ prefix for config.csv state keys
2026-04-12 19:49:28 -03:00
1977c4c0af fix: extract base URL for embedding health checks
All checks were successful
BotServer CI/CD / build (push) Successful in 4m2s
- Add extract_base_url() helper to parse scheme://host:port from full URLs
- Fix health check to use base URL instead of full endpoint path
- Allows embedding-url config like http://host:port/v1/embeddings to work correctly
- Health check now goes to http://host:port/health instead of http://host:port/v1/embeddings/health
2026-04-12 19:33:35 -03:00
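A helper along the lines of extract_base_url from 1977c4c0af might be written as follows. This is a dependency-free sketch under the assumption that the URL has the form scheme://authority/path; the real helper's signature and error handling are not shown here, and health_url is a hypothetical companion.

```rust
/// Returns `scheme://host[:port]` from a full URL, or None if the input
/// has no `://` separator or an empty scheme or host part.
pub fn extract_base_url(url: &str) -> Option<String> {
    let (scheme, rest) = url.trim().split_once("://")?;
    // The authority ends at the first path, query, or fragment delimiter.
    let authority = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()?;
    if scheme.is_empty() || authority.is_empty() {
        return None;
    }
    Some(format!("{}://{}", scheme, authority))
}

/// Health endpoint derived from the base URL rather than the full path.
pub fn health_url(embedding_url: &str) -> Option<String> {
    extract_base_url(embedding_url).map(|base| format!("{}/health", base))
}
```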
c6a47c84ac fix: use ADD_SUGGESTION_TOOL instead of ADD_SUGG_TOOL
All checks were successful
BotServer CI/CD / build (push) Successful in 3m17s
2026-04-12 18:33:34 -03:00
efe45bb296 fix: use .ast files in tool_executor
All checks were successful
BotServer CI/CD / build (push) Successful in 3m14s
2026-04-12 17:56:33 -03:00
20af25e9e2 fix: use compile_preprocessed for .ast files
All checks were successful
BotServer CI/CD / build (push) Successful in 3m29s
2026-04-12 17:48:41 -03:00
af85426ed4 fix: delete orphaned .gbkb files when removed from MinIO
All checks were successful
BotServer CI/CD / build (push) Successful in 3m6s
When a .gbkb file is deleted from the bucket, DriveMonitor now:
- Deletes the downloaded file from work directory
- When entire KB folder is empty, removes the folder too
- Prevents disk accumulation of orphaned knowledge base files
2026-04-12 16:49:05 -03:00
135dfb06d5 fix: delete orphaned .ast files when .bas is removed from MinIO
All checks were successful
BotServer CI/CD / build (push) Successful in 3m4s
When a .bas file is deleted from the bucket, DriveMonitor now:
- Deletes the corresponding .ast compiled file
- Deletes .bas, .mcp.json, .tool.json files from work directory
- Removes the path from file_states tracking

This prevents stale compiled files from accumulating in production.
2026-04-12 16:43:29 -03:00
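The cleanup in 135dfb06d5 amounts to deriving the sibling files of a .bas script and removing whichever exist. The sketch below assumes the derived files share the script's stem and directory (start.bas, start.ast, start.mcp.json, start.tool.json); derived_artifacts and remove_orphans are hypothetical names.

```rust
use std::path::{Path, PathBuf};

/// Paths of the files generated from a .bas script (assumed layout).
pub fn derived_artifacts(bas_path: &Path) -> Vec<PathBuf> {
    ["ast", "mcp.json", "tool.json"]
        .iter()
        .map(|ext| bas_path.with_extension(*ext))
        .collect()
}

/// Deletes the script and its derived files; missing files are not an error.
/// Returns how many files were actually removed.
pub fn remove_orphans(bas_path: &Path) -> std::io::Result<usize> {
    let mut removed = 0;
    let targets = std::iter::once(bas_path.to_path_buf()).chain(derived_artifacts(bas_path));
    for path in targets {
        match std::fs::remove_file(&path) {
            Ok(()) => removed += 1,
            Err(e) if e.kind() == std::io::ErrorKind::NotFound => {}
            Err(e) => return Err(e),
        }
    }
    Ok(removed)
}
```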
9cf176008d fix: preserve indexed status after .bas compilation
All checks were successful
BotServer CI/CD / build (push) Successful in 3m20s
Fixed bug where DriveMonitor would overwrite indexed=true status after
successful compilation, causing files to be recompiled on every cycle.

Changes:
- Track successful compilations in HashSet before acquiring write lock
- Set indexed=true for successfully compiled files in merge loop
- Preserve indexed status for unchanged files
- Handle compilation failures with proper fail_count tracking

This ensures new .bas files are compiled to .ast once and the indexed
status is preserved, preventing unnecessary recompilation.
2026-04-12 16:36:03 -03:00
7c4ec37700 fix: properly track compilation status in DriveMonitor
All checks were successful
BotServer CI/CD / build (push) Successful in 3m15s
- Do not mark .bas files as indexed unconditionally
- Only set indexed=true when compile_tool() completes successfully
- Reset fail_count and last_failed_at on successful compilation
- Retry failed compilations automatically on next cycle
- Fixes permanent compilation failure state for salesianos start.bas
2026-04-12 16:06:23 -03:00
9de9bc983b Fix ADD SUGGESTION preprocessor transform to avoid ADD BOT keyword conflict
All checks were successful
BotServer CI/CD / build (push) Successful in 3m20s
2026-04-12 15:32:31 -03:00
78130caaa1 chore: add clarification comment for closure clones
All checks were successful
BotServer CI/CD / build (push) Successful in 2m3s
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-12 15:07:47 -03:00
e34481b7f8 fix: separate clones for each closure to satisfy borrow checker
All checks were successful
BotServer CI/CD / build (push) Successful in 2m54s
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-12 14:26:31 -03:00
ff6f2200f0 fix: correct cache3/user_session3 reference in ADD_SUGG handler
Some checks failed
BotServer CI/CD / build (push) Failing after 45s
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-12 14:21:29 -03:00
74ac734253 fix: use single-token ADD_SUGG_TOOL to avoid ADD keyword conflicts
Some checks failed
BotServer CI/CD / build (push) Failing after 4m0s
- Replace ADD SUGGESTION TOOL with ADD_SUGG_TOOL (single token)
- Replace ADD SUGGESTION TEXT with ADD_SUGG_TEXT
- Replace ADD SUGGESTION with ADD_SUGG
- Keep ADD_SUGGESTION_TOOL as legacy alias for backward compat
- Preprocessor converts ADD SUGGESTION TOOL -> ADD_SUGG_TOOL automatically
- Eliminates collision with ADD BOT, ADD MEMBER in Rhai parser

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-12 14:14:03 -03:00
2f3dd957e3 fix: resolve kb_collections and kb_group_associations imports for directory feature
All checks were successful
BotServer CI/CD / build (push) Successful in 7m50s
- Extract kb_collections and kb_group_associations into dedicated schema/kb.rs module
- Gate kb module behind rbac feature (directory depends on rbac)
- Remove duplicate definitions from research.rs
- Fix import paths in directory/groups/kbs.rs
- Remove dead rbac_kb imports from settings/rbac.rs
- Gate llm::local module behind llm feature to fix missing set_embedding_server_ready

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-12 12:48:42 -03:00
504e6e12ad Enable directory feature in default build
Some checks failed
BotServer CI/CD / build (push) Failing after 12m19s
2026-04-12 11:46:24 -03:00
180bab0358 fix: mark embedding server ready when already running
All checks were successful
BotServer CI/CD / build (push) Successful in 3m36s
Previously, ensure_llama_servers_running() would return early
when both LLM and embedding servers were already running, without
calling set_embedding_server_ready(true). This caused DriveMonitor
to skip KB indexing with 'Embedding server not yet marked ready'.

Fix: call set_embedding_server_ready(true) before returning early
when servers are already running.
2026-04-12 10:27:23 -03:00
694fb91efe Add comment about batch_size reduction for llama-server stability
All checks were successful
BotServer CI/CD / build (push) Successful in 2m9s
2026-04-12 09:59:49 -03:00
d3673e1f34 Add KB fail state migration: fail_count and last_failed_at columns
All checks were successful
BotServer CI/CD / build (push) Successful in 48s
New migration 6.3.0-01-kb-fail-state to add columns to kb_documents
for intelligent backoff retry logic.
2026-04-12 09:43:47 -03:00
73f1898b62 Add fail_count and last_failed_at to kb_documents
All checks were successful
BotServer CI/CD / build (push) Successful in 3m7s
Simplified KB indexing state tracking - added columns directly
to kb_documents instead of separate table. This enables per-file
backoff retry logic.
2026-04-12 09:36:39 -03:00
256d55fc93 Add smart sleep based on fail_count to prevent excessive monitoring cycles
All checks were successful
BotServer CI/CD / build (push) Successful in 3m9s
- fail_count >= 3: sleep 1 hour
- fail_count >= 2: sleep 15 min
- fail_count >= 1: sleep 5 min
- fail_count = 0: sleep 10 sec (default)
2026-04-12 09:20:17 -03:00
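The sleep schedule listed in 256d55fc93 maps directly onto a match expression. The function name sleep_for is hypothetical; the durations are taken from the commit message.

```rust
use std::time::Duration;

/// Sleep between monitoring cycles, based on the current fail_count.
pub fn sleep_for(fail_count: u32) -> Duration {
    match fail_count {
        0 => Duration::from_secs(10),      // default cycle
        1 => Duration::from_secs(5 * 60),  // 5 minutes
        2 => Duration::from_secs(15 * 60), // 15 minutes
        _ => Duration::from_secs(60 * 60), // 3 or more failures: 1 hour
    }
}
```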
789789e313 Fix backoff logic to be per KB folder instead of global
Some checks failed
BotServer CI/CD / build (push) Has been cancelled
- Filter states by kb_folder_pattern (e.g. 'cartas/', 'proc/')
- Only apply backoff based on files in that specific KB folder
- Each KB folder has independent retry timing
2026-04-12 09:15:32 -03:00
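Scoping the backoff to a single KB folder, as 789789e313 describes, means only the files under that folder contribute to the decision. A minimal sketch, assuming file states are keyed by path and that the worst file in the folder determines the folder's fail count (both are assumptions, as is the name folder_fail_count):

```rust
use std::collections::HashMap;

/// Highest fail_count among files whose path contains `kb_folder_pattern`
/// (for example "cartas/"). Folders with no tracked files report 0.
pub fn folder_fail_count(states: &HashMap<String, u32>, kb_folder_pattern: &str) -> u32 {
    states
        .iter()
        .filter(|(path, _)| path.contains(kb_folder_pattern))
        .map(|(_, fail_count)| *fail_count)
        .max()
        .unwrap_or(0)
}
```

With this shape, repeated failures in one folder do not delay retries for another.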
ee273256fb Add backoff logic to KB indexing to prevent excessive retries
Some checks failed
BotServer CI/CD / build (push) Has been cancelled
- fail_count 1: wait 5 minutes before retry
- fail_count 2: wait 15 minutes before retry
- fail_count 3+: wait 1 hour before retry

This prevents the 'already being indexed, skipping duplicate task' loop.
2026-04-12 09:13:33 -03:00
cd0e049e81 Reduce embedding batch_size from 16 to 2 to prevent llama-server crash
All checks were successful
BotServer CI/CD / build (push) Successful in 2m3s
The bge-small-en-v1.5-f32.gguf model has n_ctx_train=512. With batch_size=16
and ~300+ tokens per chunk, total tokens exceed 512 causing GGML_ASSERT crash.

Now with batch_size=2, embeddings are processed safely.
2026-04-12 08:21:39 -03:00
f48fa6d5f0 Add fail_count/last_failed_at to FileState for indexing retries
All checks were successful
BotServer CI/CD / build (push) Successful in 3m21s
- Skip re-indexing files that failed 3+ times within 1 hour
- Update file_states on indexing success (indexed=true, fail_count=0)
- Update file_states on indexing failure (fail_count++, last_failed_at=now)
- Don't skip KB indexing when embedding server not marked ready yet
- Embedding server health will be detected via wait_for_server() in kb_indexer
- Remove drive_monitor bypass of embedding check - let kb_indexer handle it
2026-04-12 07:47:13 -03:00
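The retry bookkeeping in f48fa6d5f0 can be pictured as three small operations on a per-file record. The field names follow the commit message, but the struct layout, the use of plain Unix seconds, and the method names are assumptions of this sketch.

```rust
/// Hypothetical per-file indexing state.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct FileState {
    pub indexed: bool,
    pub fail_count: u32,
    pub last_failed_at: Option<u64>, // seconds since the Unix epoch
}

impl FileState {
    /// On success: indexed=true, fail_count=0.
    pub fn record_success(&mut self) {
        self.indexed = true;
        self.fail_count = 0;
    }

    /// On failure: fail_count++, last_failed_at=now.
    pub fn record_failure(&mut self, now: u64) {
        self.fail_count += 1;
        self.last_failed_at = Some(now);
    }

    /// Skip files that failed 3 or more times within the last hour.
    pub fn should_skip(&self, now: u64) -> bool {
        self.fail_count >= 3
            && self
                .last_failed_at
                .map_or(false, |t| now.saturating_sub(t) < 3600)
    }
}
```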
cdab04e999 Fix embedding health check: behavior-based instead of URL whitelist
All checks were successful
BotServer CI/CD / build (push) Successful in 3m32s
- Remove hardcoded URL list for remote API detection
- Try /health first, then probe with HEAD if 404/405
- Re-enable embedding server ready check in drive_monitor
- No more embedding_key hack that skipped health checks entirely
2026-04-12 07:15:54 -03:00
2bafd57046 Temp fix: Skip embedding server ready check in DriveMonitor KB indexing
All checks were successful
BotServer CI/CD / build (push) Successful in 3m19s
2026-04-12 06:58:55 -03:00
be3e4c4e54 Fix: Handle 'reasoning' field from NVIDIA kimi-k2.5 model
All checks were successful
BotServer CI/CD / build (push) Successful in 3m6s
2026-04-11 22:58:50 -03:00
47cb470c8e Fix: Handle reasoning_content from NVIDIA reasoning models (gpt-oss-120b)
All checks were successful
BotServer CI/CD / build (push) Successful in 3m16s
2026-04-11 22:30:39 -03:00
7a1ec157f1 Fix KB indexing: upsert kb_collections, consistent collection names, preserve indexed flag
All checks were successful
BotServer CI/CD / build (push) Successful in 3m23s
- Bug 1: check_gbkb_changes now preserves indexed=true from previous
  state when etag matches, preventing redundant re-indexing every cycle
- Bug 2: USE KB fallback uses bot_id_short (8 chars) instead of random
  UUID, matching the collection name convention used by DriveMonitor
- Bug 3: handle_gbkb_change now upserts into kb_collections table after
  successful indexing, so USE KB can find the collection at runtime
- Changed ON CONFLICT DO NOTHING to DO UPDATE for kb_collections inserts
- Changed process_gbkb_folder return type to Result<IndexingResult>
2026-04-11 21:26:02 -03:00
e81aee6221 fix: use bucket_name instead of bot_id (UUID) for file_states.json path
All checks were successful
BotServer CI/CD / build (push) Successful in 3m22s
File states were stored under /opt/gbo/work/{UUID}/file_states.json
but should be under /opt/gbo/work/{bucket_name}/file_states.json
like other bot data (e.g. /opt/gbo/work/salesianos.gbai/)

Also fixed file_states_static signature to use bucket_name consistently.
2026-04-11 20:40:23 -03:00