- Previous logic strictly limited results to 1 chunk per document
- This caused large documents (like ramais PDFs) to lose 90% of their
content since only the single highest-scoring chunk was kept
- Now we allow up to 10 chunks per document, while still sorting
by relevance and letting filter_by_tokens cap the overall size
- Detect HTML content (starts with <) in streaming messages and
bypass marked.parse() to render directly as innerHTML
- marked.parse() was corrupting the LLM's raw HTML output by
treating it as Markdown (escaping tags, wrapping in <p>, etc.)
- Updated PROMPT.md for Salesianos to be more explicit about
returning ramal data directly from KB context without asking
for unnecessary clarification
- Fixed ramais.bas tool (removed invalid BEGIN/END syntax)
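The per-document chunk cap described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the `Chunk` struct, its field names, and `cap_chunks_per_document` are hypothetical, and the final size cap applied by filter_by_tokens is left out.

```rust
use std::collections::HashMap;

/// Hypothetical search hit; the field names are assumptions for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    pub doc_id: String,
    pub score: f32,
    pub text: String,
}

/// Sorts chunks by relevance (highest score first), then keeps at most
/// `max_per_doc` chunks for each document. Relative order is preserved.
pub fn cap_chunks_per_document(mut chunks: Vec<Chunk>, max_per_doc: usize) -> Vec<Chunk> {
    chunks.sort_by(|a, b| b.score.total_cmp(&a.score));

    // Number of chunks already kept for each document id.
    let mut kept: HashMap<String, usize> = HashMap::new();

    chunks
        .into_iter()
        .filter(|chunk| {
            let count = kept.entry(chunk.doc_id.clone()).or_insert(0);
            if *count < max_per_doc {
                *count += 1;
                true
            } else {
                false
            }
        })
        .collect()
}
```

With `max_per_doc` set to 1 this reproduces the old behavior (only the top chunk of each document survives); with 10 a long PDF can contribute several relevant passages.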
The ADD_SUGGESTION_TOOL, ADD_SUGGESTION_TEXT, ADD_SUGGESTION, and
ADD_SWITCHER Rhai custom syntaxes expect lowercase 'as' but the
preprocessor was outputting uppercase 'AS'. This caused start.bas
to fail with 'Syntax error: Expecting as for ADD_SUGGESTION_TOOL',
which prevented KB context (USE KB) from being registered for the
session, so queries like 'ramal da Andressa' had no KB data.
Also re-export CHECK_INTERVAL_SECS from the drive_monitor module
to fix a pre-existing private-module access error.
- CHECK_INTERVAL_SECS: shared constant (1 second)
- Reentrancy protection using is_processing
- Scan-time logging for debugging
- DriveCompiler now uses the same constant
- Ideal for long PDFs and large .bas files
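A reentrancy guard of the kind listed above could look roughly like this. It is a sketch under assumptions: the `ScanGuard` type, `run_exclusive`, and the `ResetOnDrop` helper are hypothetical names; only `is_processing` and `CHECK_INTERVAL_SECS` come from the commit text, and the real monitor may be async rather than synchronous.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::time::{Duration, Instant};

/// Shared scan interval; the 1-second value is taken from the commit text.
pub const CHECK_INTERVAL_SECS: u64 = 1;

/// Hypothetical guard that lets only one scan run at a time.
pub struct ScanGuard {
    is_processing: AtomicBool,
}

/// Clears the flag when dropped, so a panicking scan cannot leave it set.
struct ResetOnDrop<'a>(&'a AtomicBool);

impl Drop for ResetOnDrop<'_> {
    fn drop(&mut self) {
        self.0.store(false, Ordering::Release);
    }
}

impl ScanGuard {
    pub const fn new() -> Self {
        Self { is_processing: AtomicBool::new(false) }
    }

    /// Runs `scan` unless another scan is already in progress.
    /// Returns the scan result and its duration, or `None` if it was skipped.
    pub fn run_exclusive<T>(&self, scan: impl FnOnce() -> T) -> Option<(T, Duration)> {
        // Atomically flip false -> true; failure means a scan is still running.
        if self
            .is_processing
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .is_err()
        {
            return None;
        }
        let _reset = ResetOnDrop(&self.is_processing);
        let started = Instant::now();
        let result = scan();
        Some((result, started.elapsed()))
    }
}
```

With a tick every `CHECK_INTERVAL_SECS`, a scan that takes longer than one interval simply causes the overlapping ticks to be skipped, and the returned duration can be logged for debugging.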
Two fixes for KB indexing failures with Cloudflare Workers AI:
1. check_health() now short-circuits for HTTPS URLs (remote APIs like
Cloudflare don't have /health endpoints and return 401/301/403 on
probes, which were incorrectly treated as 'unreachable')
2. index_single_file_with_id() now calls wait_for_server(30) instead
of immediately failing, giving the embedding server time to become
ready
Root cause: EMBEDDING_SERVER_READY is a global flag. When the default
bot's local embedding server check fails, it blocks ALL bots including
those using remote HTTPS APIs that don't need a local health check.
Remote APIs like Cloudflare Workers AI return 401 on /health and
301 on HEAD requests. These indicate the server IS reachable,
not down. Previously only 404/405 were treated as reachable,
causing all KB indexing to fail with 'Embedding server not available'.
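The reachability rules described above can be summarized in two small predicates. This is an illustrative sketch: `needs_health_probe` and `status_means_reachable` are hypothetical names, and the exact set of accepted status codes is an assumption assembled from the codes named in these notes (plus 2xx); the real check_health() may differ.

```rust
/// Remote HTTPS APIs are assumed reachable and are not probed at all;
/// only other (typically local, plain-HTTP) servers get a health probe.
pub fn needs_health_probe(url: &str) -> bool {
    !url.trim().to_ascii_lowercase().starts_with("https://")
}

/// Decides whether an HTTP status from a health probe shows that the server
/// is up. 401/403 (auth required), 301 (redirect), and 404/405 (no such
/// endpoint or method) all prove that something answered the request.
pub fn status_means_reachable(status: u16) -> bool {
    matches!(status, 200..=299 | 301 | 401 | 403 | 404 | 405)
}
```

A 5xx response or a connection error would still count as unreachable under this sketch, which is when waiting via wait_for_server(30) becomes relevant.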
- SCP to botui-new/botserver-new first, then mv into place
- Avoids 'dest open: Failure' when overwriting running binary
- pkill + systemctl stop before deploy, enable + start after
- botui was running outside systemd, so systemctl stop did nothing
- Add pkill -x as fallback after systemctl stop
- Enable service before starting so it persists across reboots
- Same pattern for both botui and botserver
- Move env block from workflow root to job level (Forgejo requirement)
- Replace hardcoded IP with ${{ vars.SYSTEM_HOST }} variable
- Fixes 'yaml: line 11: did not find expected key' error
- Applies to all 4 workflows: botlib, botserver, bottest, botui
- Add SKIP_INTEGRATION_TESTS and SKIP_E2E_TESTS env vars to bottest CI
- Add #[ignore] to email_integration_test.rs tests (need localhost:8080)
- Add #[ignore] to e2e/mod.rs tests that call TestHarness::full()
- Most integration tests already respect SKIP_INTEGRATION_TESTS env var
- Most e2e tests already respect SKIP_E2E_TESTS env var
The test was creating BotRunner::new() without setting a bot, causing
execute_bot_logic to fail with 'No bot configured' and return
response: None. Now calls set_bot(Bot::default()) before the session.
- Add shutdown tracing and 15s forced exit to prevent SIGTERM hangs
- Fix E0583: remove self-referential mod declarations in bottest integration files
- Fix E0599: correct .status() call on Result in performance.rs
- Fix botui CI deploy: use systemctl stop/start instead of pkill+nohup
- Update PROD.md with DB-driven CI log retrieval method
- Stop botserver via 'sudo systemctl stop' before SCP
- Start botserver via 'sudo systemctl start' after copy
- Use health check endpoint to verify deployment
- CI runner runs on alm-ci container but must deploy to system container
- Use scp to transfer binary from alm-ci to system (10.157.134.196)
- SSH to system container to stop old process, copy binary, restart
The forgejo-runner service inherits RUSTC_WRAPPER=sccache from
systemd environment. Set RUSTC_WRAPPER="" in workflow env to
override and prevent permission denied errors.
- Remove RUSTC_WRAPPER=sccache from all workflows (permission denied
in act container environment)
- Fix deploy paths to use CARGO_TARGET_DIR=/opt/gbo/work/target
instead of relative target/debug
- Remove path triggers from botserver workflow (all pushes trigger)
- Add mkdir for target and bin dirs in setup steps
- Fix all workflows to use /opt/gbo/work/generalbots (monorepo)
- Add proper env vars (SCCACHE, CARGO_TARGET_DIR, PATH) to all workflows
- Add deploy steps for botui (with process restart)
- Remove broken workflows for non-Rust packages (botapp, botbook,
botdevice, botmodels, botplugin)
- Add botlib test workflow
- drop(stream_tx) after spawning LLM task so stream_rx.recv() loop ends
when LLM finishes. Without this, the streaming loop hung forever and
is_complete:true + suggestions were never sent to WebSocket clients.
- Add single-arg ADD_SUGGESTION "text" syntax (registered LAST for
highest Rhai priority so it matches before 2-arg form).
- convert_keywords_to_lowercase() now only lowercases Rhai built-in
keywords (IF, ELSE, WHILE, etc.), not custom syntax keywords (TALK,
HEAR, ADD_SUGGESTION) which are case-sensitive in Rhai.
- sync_bas_to_work() downloads changed .bas files from S3 to work dir
when etag changes, preventing stale local copies used by compiler.
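The drop(stream_tx) fix relies on a general channel property: a receive loop ends only once every sender has been dropped. The sketch below shows this with std::sync::mpsc and a thread standing in for the LLM task; the real code presumably uses an async channel, and `stream_tokens` is a hypothetical name.

```rust
use std::sync::mpsc;
use std::thread;

/// Streams `tokens` through a channel from a worker thread and collects them.
/// Returns the received chunks and an `is_complete` flag set after the loop ends.
pub fn stream_tokens(tokens: Vec<String>) -> (Vec<String>, bool) {
    let (stream_tx, stream_rx) = mpsc::channel::<String>();

    // The worker gets its own clone of the sender, as a spawned LLM task would.
    let worker_tx = stream_tx.clone();
    let worker = thread::spawn(move || {
        for token in tokens {
            if worker_tx.send(token).is_err() {
                break; // receiver went away
            }
        }
        // worker_tx is dropped here, when the worker finishes.
    });

    // Drop the original sender. If it stayed alive, recv() below would block
    // forever after the worker finished, because one sender would still exist.
    drop(stream_tx);

    let mut received = Vec::new();
    while let Ok(token) = stream_rx.recv() {
        received.push(token);
    }

    worker.join().expect("worker thread panicked");

    // Only reachable because the loop terminated; this is where the final
    // is_complete message and suggestions would be sent.
    (received, true)
}
```

Removing the `drop(stream_tx)` line makes the function hang, which matches the symptom described above.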
- Only upsert drive_files when the ETag actually changed (previously all files were re-processed on every 60s cycle)
- Skip S3 directory entries (keys ending with '/') to avoid storing stale directory markers
- Add debug-level logging for unchanged file skips
- Fixes noisy 'Added/updated drive_files' spam on every scan cycle
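The ETag-based skip logic above might be sketched like this. The in-memory `HashMap` stands in for the drive_files table, and `SyncAction` and `classify_object` are hypothetical names used only for illustration.

```rust
use std::collections::HashMap;

/// What the scan should do with one listed S3 object.
#[derive(Debug, PartialEq, Eq)]
pub enum SyncAction {
    Upsert,
    SkipUnchanged,
    SkipDirectory,
}

/// Classifies an S3 object by key and ETag. `known` maps keys to the last
/// ETag seen and is updated only when an upsert is required.
pub fn classify_object(
    known: &mut HashMap<String, String>,
    key: &str,
    etag: &str,
) -> SyncAction {
    // Keys ending with '/' are directory markers, not files.
    if key.ends_with('/') {
        return SyncAction::SkipDirectory;
    }
    // Same ETag as last time: nothing to write (log at debug level instead).
    if known.get(key).map(String::as_str) == Some(etag) {
        return SyncAction::SkipUnchanged;
    }
    known.insert(key.to_string(), etag.to_string());
    SyncAction::Upsert
}
```

On a steady-state scan every file falls into `SkipUnchanged`, so no rows are written and no 'Added/updated' messages are logged.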
- Replace docs/sheet/slides with kb-extraction in default features (~4-6min compile time savings, ~300MB less disk)
- Add kb-extraction feature using zip+quick-xml+calamine for lightweight KB extraction
- Split document_processor.rs (829 lines) into mod.rs+types.rs+ooxml_extract.rs+rtf.rs
- Move DOCX/PPTX ZIP-based extraction to document_processor::ooxml_extract (no ooxmlsdk needed)
- Remove dead code: save_docx_preserving(), save_pptx_preserving() (zero callers)
- Fix dep: prefix for optional dependencies in feature definitions
- DriveMonitor: full S3 sync, ETag change detection, KB incremental indexing, config.csv sync
- ConfigManager: real DB reads from bot_configuration table
- 0 warnings, 0 errors on both default and full feature builds
- Fixed hardcoded port 9000 to 8300 (Zitadel default)
- Added base_url default with fallback to Vault URL
- Allows external Zitadel server configuration via Vault
- facade.rs: Updated help message with correct port
- Install Rust via rustup as SUDO_USER (not root)
- Install mold linker via system packages (apt/dnf/pacman)
- Install sccache via cargo install as SUDO_USER
- Set default toolchain to stable on install
- Use run_as_user helper for all cargo/rustup commands