Always manage services with `systemctl` inside the `system` Incus container. Never run `/opt/gbo/bin/botserver` or `/opt/gbo/bin/botui` directly — they will fail because they won't load the `.env` file containing Vault credentials and paths. The correct commands are `sudo incus exec system -- systemctl start|stop|restart|status botserver` and the same for `ui`. Systemctl handles environment loading, auto-restart, logging, and dependencies.
Never push secrets (API keys, passwords, tokens) to git. Never commit `init.json` (it contains Vault unseal keys). All secrets must come from Vault — only `VAULT_*` variables are allowed in `.env`. Never deploy manually via scp or ssh; always use CI/CD. Always push all submodules (botserver, botui, botlib) before or alongside the main repo. Always ask before pushing to ALM.
---
## Infrastructure Overview
The host machine is accessed via `ssh user@<hostname>`, running Incus (an LXD fork) as hypervisor. All services run inside named Incus containers. You enter containers with `sudo incus exec <container> -- <command>` and list them with `sudo incus list`.
### Container Architecture
| Container | Service | Technology | Binary Path | Logs Path | Data Path | Notes |
**Prevention:** Always configure static IP when creating new containers.
---
### Alert: "ALM botserver problem" / CI Build Failed
**Symptoms:** Deploy not working, CI status shows failure
**Quick Diagnostics:**
```bash
# Check CI database for recent runs
sudo incus exec tables -- bash -c 'export PGPASSWORD=<password>; psql -h localhost -U postgres -d PROD-ALM -c "SELECT id, status, created FROM action_run ORDER BY id DESC LIMIT 5;"'
# Status codes: 0=pending, 1=success, 2=failure, 3=cancelled, 6=running
```
**Quick Fixes:**
1.**If stuck at status 6 (running):**
```bash
RUN_ID=<stuck-run-id>
sudo incus exec tables -- bash -c "export PGPASSWORD=<password>; psql -h localhost -U postgres -d PROD-ALM -c \"UPDATE action_task SET status = 0 WHERE id = $RUN_ID; UPDATE action_run_job SET status = 0 WHERE run_id = $RUN_ID; UPDATE action_run SET status = 0 WHERE id = $RUN_ID;\""
sudo incus exec system -- systemctl restart botserver
sudo incus exec system -- systemctl restart ui
```
2.**If GLIBC mismatch:** Binary compiled with wrong glibc. Must rebuild inside system container (Debian 12, glibc 2.36).
3.**If port conflict:**
```bash
sudo incus exec system -- lsof -i :8080
sudo incus exec system -- killall botserver
sudo incus exec system -- systemctl start botserver
```
---
## Services Detail
Botserver runs as user `gbuser`, binary at `/opt/gbo/bin/botserver`, logs at `/opt/gbo/logs/out.log` and `/opt/gbo/logs/err.log`, systemd unit at `/etc/systemd/system/botserver.service`, env loaded from `/opt/gbo/bin/.env`. Bot BASIC scripts are stored in MinIO Drive under `{bot}.gbai/{bot}.gbdialog/*.bas` and are downloaded/compiled by DriveMonitor to `/opt/gbo/work/{bot}.gbai/{bot}.gbdialog/*.ast`.
The directory service runs Zitadel as user `root`, binary at `/opt/gbo/bin/zitadel`, logs at `/opt/gbo/logs/zitadel.log`, systemd unit at `/etc/systemd/system/directory.service`, and loads environment from the service configuration. Zitadel provides identity management and OAuth2 services for the platform.
Internally, Zitadel listens on port 8080 within the directory container. For external access:
- Via public domain (HTTPS): `https://<login-domain>` (configured through proxy container)
- Via host IP (HTTP): `http://<host-ip>:9000` (direct container port forwarding)
- Via container IP (HTTP): `http://<directory-container-ip>:9000` (direct container access)
Access the Zitadel console at `https://<login-domain>/ui/console` with admin credentials. Zitadel implements v1 Management API (deprecated) and v2 Organization/User services. Always use the v2 endpoints under `/v2/organizations` and `/v2/users` for all operations.
The botserver bootstrap also manages: Vault (secrets), PostgreSQL (database), Valkey (cache, password auth), MinIO (object storage), Zitadel (identity provider), and llama.cpp (LLM).
To obtain a PAT for Zitadel API access, check /opt/gbo/conf/directory/admin-pat.txt in the directory container. Use it with curl by setting the Authorization header: `Authorization: Bearer $(cat /opt/gbo/conf/directory/admin-pat.txt)` and include `-H "Host: <directory-ip> "` for correct host resolution.
### CI/CD Log Retrieval from Database (PREFERRED METHOD)
The most reliable way to get CI build logs — including compiler errors — is from the Forgejo ALM database and compressed log files. The runner logs (`/opt/gbo/logs/forgejo-runner.log`) show live activity but scroll away quickly. The database retains everything.
DriveMonitor is a background service inside botserver that watches MinIO buckets and syncs changes to the local filesystem and database every 10 seconds. It monitors three directory types per bot: the `.gbdialog/` folder for BASIC scripts (downloads and recompiles on change), the `.gbot/` folder for `config.csv` (syncs to the `bot_configuration` database table), and the `.gbkb/` folder for knowledge base documents (downloads and indexes for vector search).
Bot configuration is stored in two PostgreSQL tables inside the `botserver` database. The `bot_configuration` table holds key-value pairs with columns `bot_id`, `config_key`, `config_value`, `config_type`, `is_encrypted`, and `updated_at`. The `gbot_config_sync` table tracks sync state with columns `bot_id`, `config_file_path`, `last_sync_at`, `file_hash`, and `sync_count`.
The `config.csv` format is a plain CSV with no header: each line is `key,value`, for example `llm-provider,groq` or `theme-color1,#cc0000`. DriveMonitor syncs it when the file ETag changes in MinIO, on botserver startup, or after a restart.
**Check config status:** Query `bot_configuration` via `sudo incus exec tables -- psql -h localhost -U postgres -d botserver -c "SELECT config_key, config_value FROM bot_configuration WHERE bot_id = (SELECT id FROM bots WHERE name = '<botname>') ORDER BY config_key;"`. Check sync state via the `gbot_config_sync` table. Inspect the bucket directly with `sudo incus exec drive -- /opt/gbo/bin/mc cat local/<botname>.gbai/<botname>.gbot/config.csv`.
**Debug DriveMonitor:** Monitor live logs with `sudo incus exec system -- tail -f /opt/gbo/logs/out.log | grep -E "(DRIVE_MONITOR|check_gbot|config)"`. An empty `gbot_config_sync` table means DriveMonitor has not synced yet. If no new log entries appear after 30 seconds, the loop may be stuck — restart botserver with systemctl to clear the state.
**Common config issues:** If config.csv is missing from the bucket, create and upload it with `mc cp`. If the database shows stale values, restart botserver to force a fresh sync, or as a temporary fix update the database directly with `UPDATE bot_configuration SET config_value = 'groq', updated_at = NOW() WHERE ...`. To force a re-sync without restarting, copy config.csv over itself with `mc cp local/... local/...` to change the ETag.
---
## MinIO (Drive) Operations
All bot files live in MinIO buckets. Use the `mc` CLI at `/opt/gbo/bin/mc` from inside the `drive` container. The bucket structure per bot is: `{bot}.gbai/` as root, `{bot}.gbai/{bot}.gbdialog/` for BASIC scripts, `{bot}.gbai/{bot}.gbot/` for config.csv, and `{bot}.gbai/{bot}.gbkb/` for knowledge base folders.
Common mc commands: `mc ls local/` lists all buckets; `mc ls local/botname.gbai/` lists a bucket; `mc cat local/.../start.bas` prints a file; `mc cp local/.../file /tmp/file` downloads; `mc cp /tmp/file local/.../file` uploads (this triggers DriveMonitor recompile); `mc stat local/.../config.csv` shows ETag and metadata; `mc mb local/newbot.gbai` creates a bucket; `mc rb local/oldbot.gbai` removes an empty bucket.
If mc is not found, use the full path `/opt/gbo/bin/mc`. If alias `local` is not configured, check with `mc config host list`. If MinIO is not running, check with `sudo incus exec drive -- systemctl status minio`.
---
## Vault Security Architecture
HashiCorp Vault is the single source of truth for all secrets. Botserver reads `VAULT_ADDR` and `VAULT_TOKEN` from `/opt/gbo/bin/.env` at startup, initializes a TLS/mTLS client, then reads credentials from Vault paths. If Vault is unavailable, it falls back to defaults. The `.env` file must only contain `VAULT_*` variables plus `PORT`, `DATA_DIR`, `WORK_DIR`, and `LOAD_ONLY`.
**Global Vault paths:** `gbo/tables` holds PostgreSQL credentials; `gbo/drive` holds MinIO access key and secret; `gbo/cache` holds Valkey password; `gbo/llm` holds LLM URL and API keys; `gbo/directory` holds Zitadel config; `gbo/email` holds SMTP credentials; `gbo/vectordb` holds Qdrant config; `gbo/jwt` holds JWT signing secret; `gbo/encryption` holds the master encryption key. Organization-scoped secrets follow patterns like `gbo/orgs/{org_id}/bots/{bot_id}` and tenant infrastructure uses `gbo/tenants/{tenant_id}/infrastructure`.
**Credential resolution:** For any service, botserver checks the most specific Vault path first (org+bot level), falls back to a default bot path, then falls back to the global path, and only uses environment variables as a last resort in development.
**Verify Vault health:** `sudo incus exec vault -- curl -k -sf https://localhost:8200/v1/sys/health` should return JSON with `"sealed":false`. To read a secret: set `VAULT_ADDR`, `VAULT_TOKEN`, and `VAULT_CACERT` then run `vault kv get secret/gbo/tables`. To test from the system container, use curl with `--cacert /opt/gbo/conf/system/certificates/ca/ca.crt` and `-H "X-Vault-Token: <token>"`.
**init.json** is stored at `/opt/gbo/bin/botserver-stack/conf/vault/vault-conf/init.json` and contains the root token and 5 unseal keys (3 needed to unseal). Never commit this file to git. Store it encrypted in a secure location.
**Vault troubleshooting — cannot connect:** Check that the vault container's systemd unit is running, verify the token in `.env` is not expired with `vault token lookup`, confirm the CA cert path in `.env` matches the actual file location, and test network connectivity from system to vault container. To generate a new token: `vault token create -policy="botserver" -ttl="8760h" -format=json` then update `.env` and restart botserver.
**Vault troubleshooting — secrets missing:** Run `vault kv get secret/gbo/tables` (and other paths) to check if secrets exist. If a path returns NOT FOUND, add secrets with `vault kv put secret/gbo/tables host=<ip> port=5432 database=botserver username=gbuser password=<pw>` and similar for other paths.
**Vault sealed after restart:** Run `vault operator unseal <key1>`, repeat with key2 and key3 (3 of 5 keys from init.json), then verify with `vault status`.
**TLS certificate errors:** Confirm `/opt/gbo/conf/system/certificates/ca/ca.crt` exists in the system container. If missing, copy it from the vault container using `incus file pull vault/opt/gbo/conf/vault/ca.crt /tmp/ca.crt` then place it at the expected path.
**Vault snapshots:** Stop vault, run `sudo incus snapshot create vault backup-$(date +%Y%m%d-%H%M)`, start vault. Restore with `sudo incus snapshot restore vault <name>` while vault is stopped.
---
## DNS Management
### Updating DNS Records
**CRITICAL:** When updating DNS zone files, you MUST:
1.**Update the serial number** in the SOA record (format: YYYYMMDDNN)
2.**Run sync-zones.sh** to propagate changes to secondary nameservers
3.**Anonymize IPs and credentials** in all documentation and logs
**Workflow:**
```bash
# 1. Edit zone file
sudo incus exec dns -- nano /opt/gbo/data/pragmatismo.com.br.zone
# 2. Update serial (YYYYMMDDNN format)
# Example: 2026041801 (April 18, 2026, change #1)
sudo incus exec dns -- sed -i 's/2026041801/2026041802/' /opt/gbo/data/pragmatismo.com.br.zone
- Caddy will fail to obtain certificate if DNS is not propagated
- Wait up to 1 hour for DNS propagation before adding Caddy config
- Check Caddy logs for "challenge failed" errors - indicates DNS not ready
- Certificate is automatically renewed by Caddy
---
## Troubleshooting Quick Reference
**botserver won't start:** Run `sudo incus exec system -- ldd /opt/gbo/bin/botserver | grep "not found"` to check for missing libraries. Run `sudo incus exec system -- timeout 10 /opt/gbo/bin/botserver 2>&1` to see startup errors. Confirm `/opt/gbo/work/` exists and is accessible.
**botui can't reach botserver:** Check that the `ui.service` systemd file has `BOTSERVER_URL=http://localhost:5858` — not the external HTTPS URL. Fix with `sed -i 's|BOTSERVER_URL=.*|BOTSERVER_URL=http://localhost:5858|'` on the service file, then `systemctl daemon-reload` and `systemctl restart ui`.
**Suggestions not showing:** Confirm bot `.bas` files exist in MinIO Drive under `{bot}.gbai/{bot}.gbdialog/`. Check logs for compilation errors. Clear the AST cache in `/opt/gbo/work/` and restart botserver.
**IPv6 DNS timeouts on external APIs (Groq, Cloudflare):** The container's DNS may return AAAA records without IPv6 connectivity. The container should have `IPV6=no` in its network config and `gai.conf` set appropriately. Check for `RES_OPTIONS=inet4` in `botserver.service` if issues persist.
**Logs show development paths instead of Drive:** Botserver is using hardcoded dev paths. Check `.env` has `DATA_DIR=/opt/gbo/work/` and `WORK_DIR=/opt/gbo/work/`, verify the systemd unit has `EnvironmentFile=/opt/gbo/bin/.env`, and confirm Vault is reachable so service discovery works. Expected startup log lines include `info watcher:Watching data directory /opt/gbo/work` and `info botserver:BotServer started successfully on port 5858`.
**Migrations not running after push:** If `stat /opt/gbo/bin/botserver` shows old timestamp and `__diesel_schema_migrations` table has no new entries, CI did not rebuild. Make a trivial code change (e.g., add a comment) in botserver and push again to force rebuild.
---
## Drive (MinIO) File Operations Cheatsheet
All `mc` commands run inside the `drive` container with `PATH` set: `sudo incus exec drive -- bash -c 'export PATH=/opt/gbo/bin:$PATH && mc <command>'`. If `local` alias is missing, create it with credentials from Vault path `gbo/drive`.
**List bucket contents recursively:** `mc ls local/<bot>.gbai/ --recursive`
**Read a file from Drive:** `mc cat local/<bot>.gbai/<bot>.gbdialog/start.bas`
**Download a file:** `mc cp local/<bot>.gbai/<bot>.gbdialog/start.bas /tmp/start.bas`
**Upload a file to Drive (triggers DriveMonitor recompile):** Transfer file to host via `scp`, push into drive container with `sudo incus file push /tmp/file drive/tmp/file`, then `mc put /tmp/file local/<bot>.gbai/<bot>.gbdialog/start.bas`
**Full upload workflow example — updating config.csv:**
**Application logs** (searchable, timestamped, most useful): `sudo incus exec system -- tail -f /opt/gbo/logs/err.log` (errors and debug) or `/opt/gbo/logs/out.log` (stdout). The systemd journal only captures process lifecycle events, not application output.
**Search logs for specific bot activity:** `grep -i "botname\|llm\|Model:\|KB\|USE_KB\|drive_monitor" /opt/gbo/logs/err.log | tail -30`
**Check which LLM model a bot is using:** `grep "Model:" /opt/gbo/logs/err.log | tail -5`
| zitadel | directory | `/opt/gbo/bin/zitadel` | Runs as root on port 8080 internally |
**Quick psql query — bot config:** `sudo incus exec tables -- psql -h localhost -U postgres -d botserver -c "SELECT config_key, config_value FROM bot_configuration WHERE bot_id = (SELECT id FROM bots WHERE name = 'botname') ORDER BY config_key;"`
**Quick psql query — active KBs for session:** `sudo incus exec tables -- psql -h localhost -U postgres -d botserver -c "SELECT * FROM session_kb_associations WHERE session_id = '<uuid>' AND is_active = true;"`
---
## BASIC Compilation Architecture
Compilation and runtime are now strictly separated. **Compilation** happens only in `BasicCompiler` inside DriveMonitor when it detects `.bas` file changes. The output is a fully preprocessed `.ast` file written to `work/<bot>.gbai/<bot>.gbdialog/<tool>.ast`. **Runtime** (start.bas, TOOL_EXEC, automation, schedule) loads only `.ast` files and calls `ScriptService::run()` which does `engine.compile() + eval_ast_with_scope()` on the already-preprocessed Rhai source — no preprocessing at runtime.
The `.ast` file has all transforms applied: `USE KB "cartas"` becomes `USE_KB("cartas")`, `IF/END IF` → `if/{ }`, `WHILE/WEND` → `while/{ }`, `BEGIN TALK/END TALK` → function calls, `SAVE`, `FOR EACH/NEXT`, `SELECT CASE`, `SET SCHEDULE`, `WEBHOOK`, `USE WEBSITE`, `LLM` keyword expansion, variable predeclaration, and keyword lowercasing. Runtime never calls `compile()`, `compile_tool_script()`, or `compile_preprocessed()` — those methods no longer exist.
**Tools (TOOL_EXEC) load `.ast` only** — there is no `.bas` fallback. If an `.ast` file is missing, the tool fails with "Failed to read tool .ast file". DriveMonitor must have compiled it first.
**Suggestion deduplication** uses Redis `SADD` (set) instead of `RPUSH` (list). This prevents duplicate suggestion buttons when `start.bas` runs multiple times per session. The key format is `suggestions:{bot_id}:{session_id}` and `get_suggestions` uses `SMEMBERS` to read it.