Add Chapter 19: Maintenance and Updates documentation

- Create maintenance chapter with component update guides
- Add updating-components.md with step-by-step procedures for all stack components
- Add component-reference.md with versions, URLs, checksums, and alternatives for each service
- Add security-auditing.md with cargo audit, CVE monitoring, Trivy/Grype scanning
- Add backup-recovery.md with full backup/restore procedures
- Add troubleshooting.md for common issues and solutions
- Update SUMMARY.md with new chapter entry
Rodrigo Rodriguez (Pragmatismo) 2025-12-10 12:55:05 -03:00
parent ebe7e19ab7
commit be661f7cf2
7 changed files with 2589 additions and 0 deletions


@ -0,0 +1,78 @@
# Chapter 19: Maintenance and Updates
BotServer includes a complete stack of self-hosted services that power your bots. This chapter covers how to maintain, update, and troubleshoot these components.
## Stack Components Overview
BotServer automatically installs and manages these services:
| Component | Service | Default Port | Purpose |
|-----------|---------|--------------|---------|
| **vault** | HashiCorp Vault | 8200 | Secrets management |
| **tables** | PostgreSQL | 5432 | Primary database |
| **directory** | Zitadel | 8080 | Identity & access management |
| **drive** | MinIO | 9000, 9001 | Object storage (S3-compatible) |
| **cache** | Valkey | 6379 | In-memory cache (Redis-compatible) |
| **llm** | llama.cpp | 8081, 8082 | Local LLM & embedding server |
| **email** | Stalwart | 25, 993 | Mail server |
| **proxy** | Caddy | 443, 80 | HTTPS reverse proxy |
| **dns** | CoreDNS | 53 | Local DNS resolution |
| **alm** | Forgejo | 3000 | Git repository (ALM) |
| **alm_ci** | Forgejo Runner | - | CI/CD runner |
| **meeting** | LiveKit | 7880 | Video conferencing |
## Directory Structure
```
botserver-stack/
├── bin/ # Service binaries
│ ├── vault/
│ ├── tables/
│ ├── drive/
│ ├── cache/
│ ├── llm/
│ └── ...
├── conf/ # Configuration files
├── data/ # Persistent data
└── logs/ # Service logs
botserver-installers/ # Downloaded archives (cache)
```
## Why Self-Hosted?
1. **Privacy** - Data never leaves your infrastructure
2. **Offline** - Works without internet after initial setup
3. **Cost** - No per-user or API fees
4. **Control** - Full access to all services
5. **Compliance** - Meet data residency requirements
## Chapter Contents
- [Updating Components](./updating-components.md) - How to update individual services
- [Component Reference](./component-reference.md) - Detailed info for each component
- [Security Auditing](./security-auditing.md) - Running security audits
- [Backup and Recovery](./backup-recovery.md) - Data protection strategies
- [Troubleshooting](./troubleshooting.md) - Common issues and solutions
## Quick Commands
```bash
# Check service status
./botserver status
# View logs
tail -f botserver-stack/logs/llm.log
# Restart all services
./botserver restart
# Update a specific component
./botserver update llm
```
## Related Documentation
- [Installation](../01-introduction/installation.md) - Initial setup
- [Secrets Management](../08-config/secrets-management.md) - Vault configuration
- [LLM Configuration](../08-config/llm-config.md) - AI model settings


@ -0,0 +1,449 @@
# Backup and Recovery
Protecting your BotServer data requires regular backups of databases, configurations, and file storage. This guide covers backup strategies, procedures, and disaster recovery.
---
## What to Back Up
| Component | Data Location | Priority | Method |
|-----------|---------------|----------|--------|
| PostgreSQL | `botserver-stack/data/tables/` | **Critical** | pg_dump |
| Vault | `botserver-stack/data/vault/` | **Critical** | Vault snapshot |
| MinIO | `botserver-stack/data/drive/` | **Critical** | mc mirror |
| Configurations | `botserver-stack/conf/` | High | File copy |
| Bot Packages | S3 buckets (*.gbai) | High | mc mirror |
| Models | `botserver-stack/data/llm/` | Medium | File copy |
| Logs | `botserver-stack/logs/` | Low | Optional |
---
## Quick Backup Commands
```bash
# Full backup (all components)
./botserver backup
# Backup specific component
./botserver backup tables
./botserver backup drive
./botserver backup vault
# Backup to specific location
./botserver backup --output /mnt/backup/$(date +%Y%m%d)
```
---
## Database Backup (PostgreSQL)
### Full Database Dump
```bash
# Using pg_dump
pg_dump $DATABASE_URL > backup-$(date +%Y%m%d-%H%M%S).sql
# Compressed backup
pg_dump $DATABASE_URL | gzip > backup-$(date +%Y%m%d).sql.gz
# Custom format (faster restore)
pg_dump -Fc $DATABASE_URL > backup-$(date +%Y%m%d).dump
```
### Incremental Backups with WAL
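Enable WAL archiving in `postgresql.conf`. Archived WAL segments are only restorable together with a base backup (for example one taken with `pg_basebackup`), and changing `archive_mode` requires a server restart: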
```ini
wal_level = replica
archive_mode = on
archive_command = 'cp %p /backup/wal/%f'
```
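The plain `cp` in the `archive_command` above overwrites a segment that already exists in the archive. The PostgreSQL documentation recommends that an archive command refuse to overwrite existing files. The following is a minimal sketch of a wrapper that does this; the script name `archive-wal.sh`, the function `archive_wal`, and the `WAL_ARCHIVE_DIR` variable are hypothetical and not part of BotServer.

```bash
#!/bin/bash
# archive-wal.sh (hypothetical helper, not shipped with BotServer)
# Copies one WAL segment into the archive and refuses to overwrite an
# existing file. PostgreSQL treats a non-zero exit status as a failed
# archive attempt and retries the segment later.

WAL_ARCHIVE_DIR="${WAL_ARCHIVE_DIR:-/backup/wal}"

archive_wal() {
    local src="$1"    # %p: path of the segment to archive
    local name="$2"   # %f: file name of the segment
    local dest="$WAL_ARCHIVE_DIR/$name"

    mkdir -p "$WAL_ARCHIVE_DIR" || return 1
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        return 1
    fi
    # Copy under a temporary name, then rename, so a partial copy is
    # never mistaken for a complete segment.
    cp "$src" "$dest.tmp" && mv "$dest.tmp" "$dest"
}

# Run only when called with the two arguments PostgreSQL passes.
if [ "$#" -eq 2 ]; then
    archive_wal "$1" "$2"
fi
```

Under these assumptions the setting would become `archive_command = '/opt/botserver/scripts/archive-wal.sh %p %f'`, where the install path is also only an example.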
### Automated Database Backup Script
```bash
#!/bin/bash
# backup-database.sh
set -euo pipefail   # stop before pruning old backups if the dump fails
BACKUP_DIR="/backup/postgres"
RETENTION_DAYS=30
DATE=$(date +%Y%m%d-%H%M%S)
mkdir -p "$BACKUP_DIR"
# Create backup
pg_dump -Fc "$DATABASE_URL" > "$BACKUP_DIR/botserver-$DATE.dump"
# Remove old backups
find "$BACKUP_DIR" -name "*.dump" -mtime +"$RETENTION_DAYS" -delete
echo "Backup complete: botserver-$DATE.dump"
```
### Database Restore
```bash
# From SQL dump
psql $DATABASE_URL < backup.sql
# From custom format (faster)
pg_restore -d $DATABASE_URL backup.dump
# Drop and recreate (clean restore)
pg_restore -c -d $DATABASE_URL backup.dump
```
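When restoring from dumps written by `backup-database.sh`, the newest file can be picked by name, because the `botserver-YYYYMMDD-HHMMSS.dump` timestamps sort chronologically. A minimal sketch, assuming that naming scheme; the function name `latest_dump` is hypothetical:

```bash
# Print the newest botserver-*.dump in a directory, or fail if none exist.
# Usage: latest_dump [DIR]
latest_dump() {
    local dir="${1:-/backup/postgres}" newest="" f
    for f in "$dir"/botserver-*.dump; do
        [ -e "$f" ] || continue   # the glob matched nothing
        newest="$f"               # globs expand in sorted order, so the last match is the newest
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}
```

A restore could then look like `pg_restore -d "$DATABASE_URL" "$(latest_dump)"`.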
---
## Vault Backup
### Snapshot Method
```bash
# Create Vault snapshot
VAULT_ADDR=http://localhost:8200 vault operator raft snapshot save vault-backup-$(date +%Y%m%d).snap
```
### File-Based Backup
```bash
# Stop Vault first
./botserver stop vault
# Copy data directory
tar -czvf vault-data-$(date +%Y%m%d).tar.gz botserver-stack/data/vault/
# Copy unseal keys (store securely!)
cp botserver-stack/conf/vault/init.json /secure/location/
```
### Vault Restore
```bash
# Stop Vault
./botserver stop vault
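# Restore data (run from the directory that contains botserver-stack/;
# the archive stores paths beginning with botserver-stack/data/vault/)
rm -rf botserver-stack/data/vault/*
tar -xzvf vault-data-backup.tar.gz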
# Start and unseal
./botserver start vault
./botserver unseal
```
**Warning:** Keep `init.json` (unseal keys and root token) in a secure, separate location!
---
## Object Storage Backup (MinIO)
### Using MinIO Client (mc)
```bash
# Configure mc
mc alias set local http://localhost:9000 $DRIVE_ACCESS_KEY $DRIVE_SECRET_KEY
# Backup all buckets
mc mirror local/ /backup/minio/
# Backup specific bot
mc mirror local/mybot.gbai /backup/bots/mybot.gbai/
```
### Sync to Remote Storage
```bash
# Backup to S3
mc mirror local/ s3/botserver-backup/
# Backup to Backblaze B2
mc mirror local/ b2/botserver-backup/
# Backup to another MinIO
mc mirror local/ remote/botserver-backup/
```
### Restore from Backup
```bash
# Restore all buckets
mc mirror /backup/minio/ local/
# Restore specific bucket
mc mirror /backup/bots/mybot.gbai/ local/mybot.gbai/
```
---
## Configuration Backup
### Full Configuration Backup
```bash
# Backup all configs
tar -czvf config-backup-$(date +%Y%m%d).tar.gz \
    botserver-stack/conf/ \
    3rdparty.toml \
    .env
# Alternative: exclude certificates (back them up separately with encryption).
# This writes to the same file name as the command above, so run one or the other.
tar -czvf config-backup-$(date +%Y%m%d).tar.gz \
    --exclude='certificates' \
    botserver-stack/conf/
```
### Certificate Backup (Encrypted)
```bash
# Backup certificates with encryption
tar -cz botserver-stack/conf/system/certificates/ | \
gpg --symmetric --cipher-algo AES256 > certs-backup.tar.gz.gpg
```
### Restore Configuration
```bash
# Restore configs
tar -xzvf config-backup.tar.gz
# Restore encrypted certificates
gpg --decrypt certs-backup.tar.gz.gpg | tar -xz
```
---
## Full System Backup
### Complete Backup Script
```bash
#!/bin/bash
# full-backup.sh
set -e
BACKUP_DIR="/backup/botserver/$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"
echo "Starting full backup to $BACKUP_DIR"
# 1. Database
echo "Backing up database..."
pg_dump -Fc $DATABASE_URL > "$BACKUP_DIR/database.dump"
# 2. Vault snapshot
echo "Backing up Vault..."
VAULT_ADDR=http://localhost:8200 vault operator raft snapshot save "$BACKUP_DIR/vault.snap" 2>/dev/null || \
tar -czvf "$BACKUP_DIR/vault-data.tar.gz" botserver-stack/data/vault/
# 3. Object storage
echo "Backing up drive..."
mc mirror local/ "$BACKUP_DIR/drive/" --quiet
# 4. Configurations
echo "Backing up configurations..."
tar -czvf "$BACKUP_DIR/config.tar.gz" \
botserver-stack/conf/ \
3rdparty.toml \
.env \
config/
# 5. Models (optional, large files)
if [ "${1:-}" = "--include-models" ]; then
    echo "Backing up models..."
    tar -czvf "$BACKUP_DIR/models.tar.gz" botserver-stack/data/llm/
fi
# Create manifest
echo "Creating manifest..."
cat > "$BACKUP_DIR/manifest.txt" << EOF
BotServer Backup
Date: $(date)
Host: $(hostname)
Contents:
- database.dump: PostgreSQL database
- vault.snap (or vault-data.tar.gz if the snapshot failed): Vault secrets
- drive/: Object storage contents
- config.tar.gz: Configuration files
EOF
echo "Backup complete: $BACKUP_DIR"
du -sh "$BACKUP_DIR"
```
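The script above assumes that `DATABASE_URL` is set and that `pg_dump`, `mc`, and `tar` are installed. A pre-flight check makes a missing prerequisite fail early instead of halfway through a backup. This is a sketch only; the helper names `require_env` and `require_cmd` are hypothetical.

```bash
# Hypothetical pre-flight helpers; the names are illustrative only.

# Succeeds only if every named variable is set and non-empty.
require_env() {
    local name missing=0
    for name in "$@"; do
        if [ -z "${!name:-}" ]; then
            echo "missing environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Succeeds only if every named command is found on PATH.
require_cmd() {
    local cmd missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing command: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Near the top of `full-backup.sh` this could be used as `require_env DATABASE_URL && require_cmd pg_dump mc tar || exit 1`.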
### Scheduled Backups
Add to crontab:
```bash
# Daily database backup at 2 AM
0 2 * * * /opt/botserver/scripts/backup-database.sh
# Weekly full backup on Sunday at 3 AM
0 3 * * 0 /opt/botserver/scripts/full-backup.sh
# Monthly backup with models
0 4 1 * * /opt/botserver/scripts/full-backup.sh --include-models
```
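With several cron entries, a slow run can still be in progress when the next one starts (for example the monthly backup with models). One common guard is a lock directory, since `mkdir` either creates the directory or fails atomically. A minimal sketch; the `LOCK_DIR` path and the function names are assumptions, not BotServer features:

```bash
# Hypothetical lock helpers built on mkdir, which fails atomically
# if the directory already exists.
LOCK_DIR="${LOCK_DIR:-/tmp/botserver-backup.lock}"

acquire_lock() {
    mkdir "$LOCK_DIR" 2>/dev/null
}

release_lock() {
    rmdir "$LOCK_DIR" 2>/dev/null
}

# Run a command under the lock; skip it if another run holds the lock.
# Returns the command's exit status, or 1 if the lock was taken.
run_locked() {
    if ! acquire_lock; then
        echo "another backup is still running; skipping" >&2
        return 1
    fi
    "$@"
    local status=$?
    release_lock
    return "$status"
}
```

A wrapper script called from cron could then run `run_locked /opt/botserver/scripts/full-backup.sh`. If a run is killed before it releases the lock, the directory has to be removed by hand.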
---
## Disaster Recovery
### Recovery Procedure
1. **Install fresh BotServer**
```bash
./botserver --skip-bootstrap
```
2. **Restore configurations**
```bash
tar -xzvf config-backup.tar.gz
```
3. **Restore Vault**
```bash
tar -xzvf vault-data.tar.gz
./botserver start vault
./botserver unseal
```
4. **Restore database**
```bash
./botserver start tables
pg_restore -d $DATABASE_URL database.dump
```
5. **Restore object storage**
```bash
./botserver start drive
mc mirror /backup/drive/ local/
```
6. **Start remaining services**
```bash
./botserver start
```
7. **Verify**
```bash
./botserver status
./botserver test
```
### Recovery Time Objectives
| Scenario | RTO Target | Method |
|----------|------------|--------|
| Single component failure | < 15 min | Restart/restore component |
| Database corruption | < 1 hour | pg_restore from backup |
| Full server failure | < 4 hours | Full restore procedure |
| Data center failure | < 24 hours | Geo-replicated restore |
---
## Backup Verification
### Test Restore Regularly
```bash
# Restore to test environment
./botserver --test-restore /backup/latest/
# Verify database integrity
pg_restore --list database.dump
psql $DATABASE_URL -c "SELECT COUNT(*) FROM bots;"
# Verify drive contents
mc ls local/
```
### Backup Integrity Checks
```bash
# Verify backup file integrity
sha256sum /backup/*/database.dump > /backup/checksums.txt
# Verify on restore
sha256sum -c /backup/checksums.txt
```
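The commands above checksum only `database.dump` and record absolute paths. A per-backup manifest with relative paths covers every file and still verifies after the backup directory is moved or copied. A sketch under the assumption of GNU coreutils and findutils; the manifest name `SHA256SUMS` and the function names are hypothetical:

```bash
# Write SHA-256 checksums for every file in one backup directory.
# Paths are stored relative to the directory so the backup can be moved.
write_checksums() {
    local dir="$1"
    (
        cd "$dir" || exit 1
        find . -type f ! -name SHA256SUMS -print0 \
            | sort -z \
            | xargs -0 -r sha256sum > SHA256SUMS
    )
}

# Re-check every file listed in the manifest; prints only failures.
verify_checksums() {
    local dir="$1"
    ( cd "$dir" && sha256sum --quiet -c SHA256SUMS )
}
```

For example, `write_checksums "$BACKUP_DIR"` could be the last step of a backup, and `verify_checksums /backup/botserver/20251210` the first step of a restore (the dated path is only an example).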
---
## Cloud Backup Integration
### AWS S3
```bash
# Configure AWS CLI
aws configure
# Sync backups to S3
aws s3 sync /backup/botserver/ s3://my-backup-bucket/botserver/
# Enable versioning for point-in-time recovery
aws s3api put-bucket-versioning \
--bucket my-backup-bucket \
--versioning-configuration Status=Enabled
```
### Backblaze B2
```bash
# Configure rclone
rclone config
# Sync backups
rclone sync /backup/botserver/ b2:my-backup-bucket/botserver/
```
### Encrypted Remote Backup
```bash
# Encrypt before upload
tar -cz /backup/botserver/ | \
gpg --symmetric --cipher-algo AES256 | \
aws s3 cp - s3://my-backup-bucket/botserver-$(date +%Y%m%d).tar.gz.gpg
```
---
## Retention Policy
| Backup Type | Retention | Storage |
|-------------|-----------|---------|
| Hourly snapshots | 24 hours | Local |
| Daily backups | 30 days | Local + Remote |
| Weekly backups | 12 weeks | Remote |
| Monthly backups | 12 months | Remote (cold) |
| Yearly backups | 7 years | Archive |
### Cleanup Script
```bash
#!/bin/bash
# cleanup-backups.sh
BACKUP_DIR="/backup/botserver"
# Remove daily backups older than 30 days
# (-mindepth 1 keeps find from deleting the top-level directory itself)
find "$BACKUP_DIR/daily" -mindepth 1 -mtime +30 -delete
# Remove weekly backups older than 12 weeks
find "$BACKUP_DIR/weekly" -mindepth 1 -mtime +84 -delete
# Remove monthly backups older than 12 months
find "$BACKUP_DIR/monthly" -mindepth 1 -mtime +365 -delete
```
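Because retention cleanup deletes data, it can help to list what would be removed before deleting anything. The following sketch prints matching files by default and deletes only when asked; the function name `prune_old` and its `--delete` argument are hypothetical, and GNU `find` is assumed.

```bash
# List (or delete) regular files older than a given number of days.
# Usage: prune_old DIR DAYS [--delete]
prune_old() {
    local dir="$1" days="$2" mode="${3:-}"
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    if [ "$mode" = "--delete" ]; then
        find "$dir" -mindepth 1 -type f -mtime +"$days" -print -delete
    else
        # Dry run: print what would be deleted, change nothing.
        find "$dir" -mindepth 1 -type f -mtime +"$days" -print
    fi
}
```

Running `prune_old /backup/botserver/daily 30` shows the candidates; adding `--delete` removes them.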
---
## See Also
- [Updating Components](./updating-components.md) - Safe update procedures
- [Troubleshooting](./troubleshooting.md) - Recovery from common issues
- [Security Auditing](./security-auditing.md) - Protecting backup data


@ -0,0 +1,501 @@
# Component Reference
This reference provides detailed information about each component in the BotServer stack, including current versions, alternatives, and configuration options.
---
## Core Components
### Vault (Secrets Management)
| Property | Value |
|----------|-------|
| **Service** | HashiCorp Vault |
| **Current Version** | 1.15.4 |
| **Default Port** | 8200 |
| **Binary Path** | `botserver-stack/bin/vault/vault` |
| **Config Path** | `botserver-stack/conf/vault/` |
| **Data Path** | `botserver-stack/data/vault/` |
| **Log File** | `botserver-stack/logs/vault.log` |
**Download URL:**
```
https://releases.hashicorp.com/vault/1.15.4/vault_1.15.4_linux_amd64.zip
```
**Purpose:**
- Stores all service credentials (database, drive, cache)
- Manages encryption keys
- Provides secrets rotation
- Issues short-lived tokens
**Alternatives:**
| Alternative | License | Notes |
|-------------|---------|-------|
| [OpenBao](https://openbao.org/) | MPL-2.0 | Fork of Vault, fully open source |
| [Infisical](https://infisical.com/) | MIT | Modern secrets management |
| [SOPS](https://github.com/getsops/sops) | MPL-2.0 | File-based encryption |
| [Doppler](https://doppler.com/) | Proprietary | Cloud-based alternative |
---
### PostgreSQL (Tables/Database)
| Property | Value |
|----------|-------|
| **Service** | PostgreSQL |
| **Current Version** | 17.2.0 |
| **Default Port** | 5432 |
| **Binary Path** | `botserver-stack/bin/tables/` |
| **Config Path** | `botserver-stack/conf/tables/` |
| **Data Path** | `botserver-stack/data/tables/` |
| **Log File** | `botserver-stack/logs/postgres.log` |
**Download URL:**
```
https://github.com/theseus-rs/postgresql-binaries/releases/download/17.2.0/postgresql-17.2.0-x86_64-unknown-linux-gnu.tar.gz
```
**Purpose:**
- Primary relational database
- Stores bot configurations, users, conversations
- Supports full-text search
- Handles transactions and ACID compliance
**Alternatives:**
| Alternative | License | Notes |
|-------------|---------|-------|
| [CockroachDB](https://www.cockroachlabs.com/) | BSL/CCL | Distributed SQL, PostgreSQL-compatible |
| [YugabyteDB](https://www.yugabyte.com/) | Apache-2.0 | Distributed PostgreSQL |
| [Neon](https://neon.tech/) | Apache-2.0 | Serverless PostgreSQL |
| [Supabase](https://supabase.com/) | Apache-2.0 | PostgreSQL with extras |
---
### Zitadel (Directory/Identity)
| Property | Value |
|----------|-------|
| **Service** | Zitadel |
| **Current Version** | 2.70.4 |
| **Default Port** | 8080 |
| **Binary Path** | `botserver-stack/bin/directory/zitadel` |
| **Config Path** | `botserver-stack/conf/directory/` |
| **Data Path** | Uses PostgreSQL |
| **Log File** | `botserver-stack/logs/zitadel.log` |
**Download URL:**
```
https://github.com/zitadel/zitadel/releases/download/v2.70.4/zitadel-linux-amd64.tar.gz
```
**Purpose:**
- User authentication and authorization
- OAuth2/OIDC provider
- Single Sign-On (SSO)
- Multi-factor authentication
- Service credential provisioning
**Alternatives:**
| Alternative | License | Notes |
|-------------|---------|-------|
| [Keycloak](https://www.keycloak.org/) | Apache-2.0 | Java-based, feature-rich |
| [Authentik](https://goauthentik.io/) | Custom OSS | Python-based, modern UI |
| [Authelia](https://www.authelia.com/) | Apache-2.0 | Lightweight, Nginx integration |
| [Ory](https://www.ory.sh/) | Apache-2.0 | Modular identity infrastructure |
| [Casdoor](https://casdoor.org/) | Apache-2.0 | Go-based, UI-focused |
---
### MinIO (Drive/Object Storage)
| Property | Value |
|----------|-------|
| **Service** | MinIO |
| **Current Version** | Latest |
| **Default Ports** | 9000 (API), 9001 (Console) |
| **Binary Path** | `botserver-stack/bin/drive/minio` |
| **Config Path** | `botserver-stack/conf/drive/` |
| **Data Path** | `botserver-stack/data/drive/` |
| **Log File** | `botserver-stack/logs/minio.log` |
**Download URL:**
```
https://dl.min.io/server/minio/release/linux-amd64/minio
```
**Purpose:**
- S3-compatible object storage
- Stores bot packages (.gbai, .gbkb, etc.)
- File uploads and downloads
- Static asset hosting
**Alternatives:**
| Alternative | License | Notes |
|-------------|---------|-------|
| [SeaweedFS](https://github.com/seaweedfs/seaweedfs) | Apache-2.0 | Distributed, fast |
| [Garage](https://garagehq.deuxfleurs.fr/) | AGPL-3.0 | Lightweight, geo-distributed |
| [Ceph](https://ceph.io/) | LGPL-2.1 | Enterprise-grade, complex |
| [LakeFS](https://lakefs.io/) | Apache-2.0 | Git-like versioning for data |
---
### Valkey (Cache)
| Property | Value |
|----------|-------|
| **Service** | Valkey |
| **Current Version** | 8.0.2 |
| **Default Port** | 6379 |
| **Binary Path** | `botserver-stack/bin/cache/valkey-server` |
| **Config Path** | `botserver-stack/conf/cache/` |
| **Data Path** | `botserver-stack/data/cache/` |
| **Log File** | `botserver-stack/logs/valkey.log` |
**Download URL:**
```
https://github.com/valkey-io/valkey/archive/refs/tags/8.0.2.tar.gz
```
**Note:** Valkey requires compilation from source. Build dependencies: `gcc`, `make`
**Purpose:**
- In-memory caching
- Session storage
- Rate limiting
- Pub/Sub messaging
- Queue management
**Alternatives:**
| Alternative | License | Notes |
|-------------|---------|-------|
| [KeyDB](https://docs.keydb.dev/) | BSD-3 | Multi-threaded Redis fork |
| [Dragonfly](https://www.dragonflydb.io/) | BSL | High-performance, Redis-compatible |
| [Garnet](https://github.com/microsoft/garnet) | MIT | Microsoft's cache store |
| [Skytable](https://skytable.io/) | AGPL-3.0 | Modern NoSQL |
---
### llama.cpp (LLM Server)
| Property | Value |
|----------|-------|
| **Service** | llama.cpp |
| **Current Version** | b7345 |
| **Default Ports** | 8081 (LLM), 8082 (Embedding) |
| **Binary Path** | `botserver-stack/bin/llm/llama-server` |
| **Config Path** | `botserver-stack/conf/llm/` |
| **Data Path** | `botserver-stack/data/llm/` (models) |
| **Log File** | `botserver-stack/logs/llm.log` |
**Download URLs by Platform:**
| Platform | URL |
|----------|-----|
| Linux x64 | `https://github.com/ggml-org/llama.cpp/releases/download/b7345/llama-b7345-bin-ubuntu-x64.zip` |
| Linux x64 Vulkan | `https://github.com/ggml-org/llama.cpp/releases/download/b7345/llama-b7345-bin-ubuntu-vulkan-x64.zip` |
| macOS ARM64 | `https://github.com/ggml-org/llama.cpp/releases/download/b7345/llama-b7345-bin-macos-arm64.zip` |
| macOS x64 | `https://github.com/ggml-org/llama.cpp/releases/download/b7345/llama-b7345-bin-macos-x64.zip` |
| Windows x64 | `https://github.com/ggml-org/llama.cpp/releases/download/b7345/llama-b7345-bin-win-cpu-x64.zip` |
| Windows CUDA 12 | `https://github.com/ggml-org/llama.cpp/releases/download/b7345/llama-b7345-bin-win-cuda-12.4-x64.zip` |
| Windows CUDA 13 | `https://github.com/ggml-org/llama.cpp/releases/download/b7345/llama-b7345-bin-win-cuda-13.1-x64.zip` |
**SHA256 Checksums:**
```
llama-b7345-bin-ubuntu-x64.zip: 91b066ecc53c20693a2d39703c12bc7a69c804b0768fee064d47df702f616e52
llama-b7345-bin-ubuntu-vulkan-x64.zip: 03f0b3acbead2ddc23267073a8f8e0207937c849d3704c46c61cf167c1001442
llama-b7345-bin-macos-arm64.zip: 72ae9b4a4605aa1223d7aabaa5326c66c268b12d13a449fcc06f61099cd02a52
llama-b7345-bin-macos-x64.zip: bec6b805cf7533f66b38f29305429f521dcb2be6b25dbce73a18df448ec55cc5
llama-b7345-bin-win-cpu-x64.zip: ea449082c8e808a289d9a1e8331f90a0379ead4dd288a1b9a2d2c0a7151836cd
llama-b7345-bin-win-cuda-12.4-x64.zip: 7a82aba2662fa7d4477a7a40894de002854bae1ab8b0039888577c9a2ca24cae
llama-b7345-bin-win-cuda-13.1-x64.zip: 06ea715cefb07e9862394e6d1ffa066f4c33add536b1f1aa058723f86ae05572
```
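When an archive is downloaded by hand, it can be compared against the checksums listed above before it is unpacked. A minimal sketch, assuming GNU `sha256sum`; the function name `verify_sha256` is hypothetical:

```bash
# Verify a downloaded file against an expected SHA-256 digest.
# Usage: verify_sha256 FILE EXPECTED_HEX
verify_sha256() {
    local file="$1" expected="$2" actual
    actual=$(sha256sum "$file" | awk '{print $1}')
    if [ "$actual" != "$expected" ]; then
        echo "checksum mismatch for $file" >&2
        echo "  expected: $expected" >&2
        echo "  actual:   $actual" >&2
        return 1
    fi
}
```

For the Linux x64 build this would be `verify_sha256 llama-b7345-bin-ubuntu-x64.zip 91b066ecc53c20693a2d39703c12bc7a69c804b0768fee064d47df702f616e52`.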
**Purpose:**
- Local LLM inference
- Text embeddings for semantic search
- OpenAI-compatible API
- Supports GGUF model format
**Alternatives:**
| Alternative | License | Notes |
|-------------|---------|-------|
| [Ollama](https://ollama.ai/) | MIT | User-friendly, model management |
| [vLLM](https://github.com/vllm-project/vllm) | Apache-2.0 | High throughput, production-grade |
| [Text Generation Inference](https://github.com/huggingface/text-generation-inference) | Apache-2.0 | HuggingFace's solution |
| [LocalAI](https://localai.io/) | MIT | Drop-in OpenAI replacement |
| [LM Studio](https://lmstudio.ai/) | Proprietary | Desktop GUI application |
---
## Supporting Components
### Stalwart (Email Server)
| Property | Value |
|----------|-------|
| **Service** | Stalwart Mail Server |
| **Current Version** | 0.10.7 |
| **Default Ports** | 25 (SMTP), 993 (IMAPS), 587 (Submission) |
| **Binary Path** | `botserver-stack/bin/email/stalwart-mail` |
| **Config Path** | `botserver-stack/conf/email/` |
| **Data Path** | `botserver-stack/data/email/` |
| **Log File** | `botserver-stack/logs/stalwart.log` |
**Download URL:**
```
https://github.com/stalwartlabs/mail-server/releases/download/v0.10.7/stalwart-mail-x86_64-linux.tar.gz
```
**Purpose:**
- Full email server (SMTP, IMAP, JMAP)
- Email sending and receiving
- Spam filtering
- DKIM/SPF/DMARC support
**Alternatives:**
| Alternative | License | Notes |
|-------------|---------|-------|
| [Maddy](https://maddy.email/) | GPL-3.0 | Composable mail server |
| [Mail-in-a-Box](https://mailinabox.email/) | CC0 | All-in-one solution |
| [Postal](https://postalserver.io/) | MIT | Sending-focused |
| [Haraka](https://haraka.github.io/) | MIT | Node.js SMTP |
---
### Caddy (Proxy)
| Property | Value |
|----------|-------|
| **Service** | Caddy |
| **Current Version** | 2.9.1 |
| **Default Ports** | 443 (HTTPS), 80 (HTTP) |
| **Binary Path** | `botserver-stack/bin/proxy/caddy` |
| **Config Path** | `botserver-stack/conf/proxy/Caddyfile` |
| **Data Path** | `botserver-stack/data/proxy/` |
| **Log File** | `botserver-stack/logs/caddy.log` |
**Download URL:**
```
https://github.com/caddyserver/caddy/releases/download/v2.9.1/caddy_2.9.1_linux_amd64.tar.gz
```
**Purpose:**
- Automatic HTTPS with Let's Encrypt
- Reverse proxy for all services
- Load balancing
- HTTP/2 and HTTP/3 support
**Alternatives:**
| Alternative | License | Notes |
|-------------|---------|-------|
| [Nginx](https://nginx.org/) | BSD-2 | Industry standard |
| [Traefik](https://traefik.io/) | MIT | Cloud-native, auto-discovery |
| [HAProxy](https://www.haproxy.org/) | GPL-2.0 | High performance |
| [Envoy](https://www.envoyproxy.io/) | Apache-2.0 | Service mesh ready |
---
### CoreDNS (DNS)
| Property | Value |
|----------|-------|
| **Service** | CoreDNS |
| **Current Version** | 1.11.1 |
| **Default Port** | 53 |
| **Binary Path** | `botserver-stack/bin/dns/coredns` |
| **Config Path** | `botserver-stack/conf/dns/Corefile` |
| **Log File** | `botserver-stack/logs/coredns.log` |
**Download URL:**
```
https://github.com/coredns/coredns/releases/download/v1.11.1/coredns_1.11.1_linux_amd64.tgz
```
**Purpose:**
- Local DNS resolution
- Service discovery (*.botserver.local)
- DNS-based load balancing
**Alternatives:**
| Alternative | License | Notes |
|-------------|---------|-------|
| [PowerDNS](https://www.powerdns.com/) | GPL-2.0 | Feature-rich, authoritative |
| [Unbound](https://nlnetlabs.nl/projects/unbound/) | BSD | Validating resolver |
| [dnsmasq](https://thekelleys.org.uk/dnsmasq/doc.html) | GPL-2.0 | Lightweight |
---
### Forgejo (ALM/Git)
| Property | Value |
|----------|-------|
| **Service** | Forgejo |
| **Current Version** | 10.0.2 |
| **Default Port** | 3000 |
| **Binary Path** | `botserver-stack/bin/alm/forgejo` |
| **Config Path** | `botserver-stack/conf/alm/` |
| **Data Path** | `botserver-stack/data/alm/` |
| **Log File** | `botserver-stack/logs/forgejo.log` |
**Download URL:**
```
https://codeberg.org/forgejo/forgejo/releases/download/v10.0.2/forgejo-10.0.2-linux-amd64
```
**Purpose:**
- Git repository hosting
- Issue tracking
- CI/CD pipelines
- Code review
**Alternatives:**
| Alternative | License | Notes |
|-------------|---------|-------|
| [Gitea](https://gitea.io/) | MIT | Original project |
| [GitLab](https://gitlab.com/) | MIT (CE) | Full DevOps platform |
| [Gogs](https://gogs.io/) | MIT | Lightweight |
| [OneDev](https://onedev.io/) | MIT | Built-in CI/CD |
---
### LiveKit (Meeting/Video)
| Property | Value |
|----------|-------|
| **Service** | LiveKit |
| **Current Version** | 2.8.2 |
| **Default Ports** | 7880 (HTTP), 7881 (RTC) |
| **Binary Path** | `botserver-stack/bin/meeting/livekit-server` |
| **Config Path** | `botserver-stack/conf/meeting/` |
| **Log File** | `botserver-stack/logs/livekit.log` |
**Download URL:**
```
https://github.com/livekit/livekit/releases/download/v2.8.2/livekit_2.8.2_linux_amd64.tar.gz
```
**Purpose:**
- Real-time video/audio communication
- WebRTC infrastructure
- Screen sharing
- Recording
**Alternatives:**
| Alternative | License | Notes |
|-------------|---------|-------|
| [Jitsi](https://jitsi.org/) | Apache-2.0 | Full-featured, established |
| [BigBlueButton](https://bigbluebutton.org/) | LGPL-3.0 | Education-focused |
| [Janus](https://janus.conf.meetecho.com/) | GPL-3.0 | WebRTC gateway |
| [mediasoup](https://mediasoup.org/) | ISC | Node.js SFU |
---
## Optional Components
### Qdrant (Vector Database)
| Property | Value |
|----------|-------|
| **Service** | Qdrant |
| **Current Version** | Latest |
| **Default Ports** | 6333 (HTTP), 6334 (gRPC) |
| **Binary Path** | `botserver-stack/bin/vector_db/qdrant` |
**Download URL:**
```
https://github.com/qdrant/qdrant/releases/latest/download/qdrant-x86_64-unknown-linux-gnu.tar.gz
```
**Purpose:**
- Vector similarity search
- Knowledge base embeddings
- Semantic search
**Alternatives:**
| Alternative | License | Notes |
|-------------|---------|-------|
| [Milvus](https://milvus.io/) | Apache-2.0 | Distributed, scalable |
| [Weaviate](https://weaviate.io/) | BSD-3 | GraphQL API |
| [Chroma](https://www.trychroma.com/) | Apache-2.0 | Simple, embedded |
| [pgvector](https://github.com/pgvector/pgvector) | PostgreSQL | PostgreSQL extension |
---
### InfluxDB (Time Series)
| Property | Value |
|----------|-------|
| **Service** | InfluxDB |
| **Current Version** | 2.7.5 |
| **Default Port** | 8086 |
| **Binary Path** | `botserver-stack/bin/timeseries_db/influxd` |
**Download URL:**
```
https://download.influxdata.com/influxdb/releases/influxdb2-2.7.5-linux-amd64.tar.gz
```
**Purpose:**
- Metrics storage
- Time-series analytics
- Monitoring dashboards
**Alternatives:**
| Alternative | License | Notes |
|-------------|---------|-------|
| [TimescaleDB](https://www.timescale.com/) | Apache-2.0 | PostgreSQL extension |
| [VictoriaMetrics](https://victoriametrics.com/) | Apache-2.0 | Prometheus-compatible |
| [QuestDB](https://questdb.io/) | Apache-2.0 | High-performance SQL |
| [Prometheus](https://prometheus.io/) | Apache-2.0 | Monitoring-focused |
---
## Default LLM Models
### DeepSeek R1 Distill Qwen 1.5B
| Property | Value |
|----------|-------|
| **Filename** | `DeepSeek-R1-Distill-Qwen-1.5B-Q3_K_M.gguf` |
| **Size** | ~1.1 GB |
| **RAM Required** | 4 GB |
| **Use Case** | Default conversational model |
**Download URL:**
```
https://huggingface.co/bartowski/DeepSeek-R1-Distill-Qwen-1.5B-GGUF/resolve/main/DeepSeek-R1-Distill-Qwen-1.5B-Q3_K_M.gguf
```
### BGE Small EN v1.5
| Property | Value |
|----------|-------|
| **Filename** | `bge-small-en-v1.5-f32.gguf` |
| **Size** | ~130 MB |
| **RAM Required** | 512 MB |
| **Use Case** | Text embeddings for semantic search |
**Download URL:**
```
https://huggingface.co/CompendiumLabs/bge-small-en-v1.5-gguf/resolve/main/bge-small-en-v1.5-f32.gguf
```
---
## Configuration Files Reference
| File | Purpose |
|------|---------|
| `3rdparty.toml` | Component download URLs and checksums |
| `config/llm_releases.json` | Platform-specific LLM builds |
| `botserver-stack/conf/*/` | Per-component configuration |
| `.env` | Environment variables (generated) |
---
## See Also
- [Updating Components](./updating-components.md) - How to update
- [Security Auditing](./security-auditing.md) - Vulnerability scanning
- [Troubleshooting](./troubleshooting.md) - Common issues


@ -0,0 +1,427 @@
# Security Auditing
Regular security audits ensure your BotServer installation remains protected against known vulnerabilities. This guide covers automated scanning, manual reviews, and best practices.
---
## Rust Dependency Auditing
### cargo-audit
BotServer uses `cargo-audit` to scan Rust dependencies for known vulnerabilities.
**Install cargo-audit:**
```bash
cargo install cargo-audit
```
**Run audit:**
```bash
cd botserver
cargo audit
```
**Expected output (clean):**
```
Fetching advisory database from `https://github.com/RustSec/advisory-db`
Loaded 650 security advisories (from ~/.cargo/advisory-db)
Scanning Cargo.lock for vulnerabilities (425 crate dependencies)
```
**Output with vulnerabilities:**
```
Crate: openssl
Version: 0.10.38
Title: `openssl` `X509StoreRef::objects` is unsound
Date: 2023-11-23
ID: RUSTSEC-2023-0072
URL: https://rustsec.org/advisories/RUSTSEC-2023-0072
Severity: medium
Solution: Upgrade to >=0.10.60
```
### Automated CI/CD Auditing
Add to your CI pipeline (`.github/workflows/security.yml`):
```yaml
name: Security Audit
on:
  push:
    branches: [main]
  pull_request:
  schedule:
    - cron: '0 0 * * *' # Daily at midnight
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: rustsec/audit-check@v1
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
```
### Strict Auditing
Fail on any warning:
```bash
cargo audit --deny warnings
```
Fail on unmaintained crates:
```bash
cargo audit --deny unmaintained
```
Generate JSON report:
```bash
cargo audit --json > audit-report.json
```
---
## Stack Component Vulnerabilities
### CVE Monitoring
Monitor security advisories for each component:
| Component | Security Feed |
|-----------|---------------|
| PostgreSQL | [postgresql.org/support/security](https://www.postgresql.org/support/security/) |
| Vault | [hashicorp.com/security](https://www.hashicorp.com/security) |
| MinIO | [github.com/minio/minio/security](https://github.com/minio/minio/security/advisories) |
| Zitadel | [github.com/zitadel/zitadel/security](https://github.com/zitadel/zitadel/security/advisories) |
| llama.cpp | [github.com/ggml-org/llama.cpp/security](https://github.com/ggml-org/llama.cpp/security/advisories) |
| Valkey | [github.com/valkey-io/valkey/security](https://github.com/valkey-io/valkey/security/advisories) |
| Caddy | [github.com/caddyserver/caddy/security](https://github.com/caddyserver/caddy/security/advisories) |
| Stalwart | [github.com/stalwartlabs/mail-server/security](https://github.com/stalwartlabs/mail-server/security/advisories) |
### Trivy Container Scanning
Trivy is best known for container image scanning; its `fs` mode also scans files on disk, which is what the commands below use:
```bash
# Install Trivy
curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | sh -s -- -b /usr/local/bin
# Scan filesystem (recent Trivy releases use --scanners; older releases used --security-checks vuln,config)
trivy fs --scanners vuln,misconfig ./botserver-stack/
# Scan a single component's directory
trivy fs --scanners vuln ./botserver-stack/bin/vault/
```
### Grype Binary Scanning
Scan binaries for vulnerabilities:
```bash
# Install Grype
curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | sh -s -- -b /usr/local/bin
# Scan directory
grype dir:./botserver-stack/bin/
```
---
## Network Security Audit
### Port Scanning
Verify only expected ports are open:
```bash
# Local port check
ss -tlnp | grep LISTEN
# Expected ports
# 8200 - Vault
# 5432 - PostgreSQL
# 8080 - Zitadel / API
# 9000 - MinIO API
# 9001 - MinIO Console
# 6379 - Valkey
# 8081 - LLM Server
# 8082 - Embedding Server
# 443 - HTTPS Proxy
# 53 - DNS
```
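Full TCP port scan. The command below scans the local host; to see what is reachable externally, run `nmap` from another machine against the server's address: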
```bash
nmap -sT -p- localhost
```
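The comparison between listening ports and the expected list can be automated. The sketch below reads `ss -tln` output on standard input and prints any listening port that is not on the allowlist. The `EXPECTED_PORTS` value mirrors the list in this section and should be adjusted to the services actually enabled; the function name `unexpected_ports` is hypothetical.

```bash
# Ports this section lists as expected (adjust to the services you run).
EXPECTED_PORTS="53 443 5432 6379 8080 8081 8082 8200 9000 9001"

# Read `ss -tln` output on stdin and print listening ports that are
# not in EXPECTED_PORTS, one per line, sorted and de-duplicated.
unexpected_ports() {
    awk -v allowed="$EXPECTED_PORTS" '
        BEGIN { n = split(allowed, a, " "); for (i = 1; i <= n; i++) ok[a[i]] = 1 }
        $1 == "LISTEN" {
            # Column 4 is the local address, e.g. 0.0.0.0:8200 or [::]:443
            port = $4
            sub(/.*:/, "", port)
            if (!(port in ok)) print port
        }
    ' | sort -un
}
```

Typical use is `ss -tln | unexpected_ports`; empty output means every listening TCP port is on the list.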
### TLS Certificate Audit
Check certificate validity:
```bash
# Check expiration
openssl x509 -in botserver-stack/conf/system/certificates/api/server.crt -noout -dates
# Check certificate chain
openssl verify -CAfile botserver-stack/conf/system/certificates/ca/ca.crt \
botserver-stack/conf/system/certificates/api/server.crt
```
### Firewall Rules
Ensure proper firewall configuration:
```bash
# UFW (Ubuntu)
sudo ufw status verbose
# iptables
sudo iptables -L -n -v
```
Recommended rules:
```bash
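# Allow only necessary ports
# NOTE: if you manage this host over SSH, run "sudo ufw allow ssh" first,
# otherwise "default deny incoming" can lock you out.
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow 443/tcp # HTTPS
sudo ufw allow 8080/tcp # API (if exposed)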
```
---
## Secrets Audit
### Vault Health Check
```bash
# Check Vault seal status
curl -s http://localhost:8200/v1/sys/seal-status | jq
# List enabled auth methods
VAULT_ADDR=http://localhost:8200 vault auth list
# Audit enabled secrets engines
VAULT_ADDR=http://localhost:8200 vault secrets list
```
### Environment Variable Audit
Check for leaked secrets:
```bash
# Search for hardcoded secrets
grep -r "password" --include="*.toml" --include="*.json" --include="*.csv" .
grep -r "secret" --include="*.toml" --include="*.json" --include="*.csv" .
grep -r "api_key" --include="*.toml" --include="*.json" --include="*.csv" .
# Check .env file permissions
ls -la .env
# Should be: -rw------- (600)
```
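The permission check can be scripted so that it fails loudly instead of relying on reading `ls` output. This is a small sketch under stated assumptions: `check_mode` is a hypothetical helper, and it relies on GNU `stat -c` (on macOS/BSD the equivalent is `stat -f %Lp`).

```bash
# Prints OK and returns 0 when FILE has exactly the EXPECTED permission bits
# (default 600); prints a warning and returns 1 otherwise.
check_mode() {
  local file="$1" expected="${2:-600}" actual
  actual=$(stat -c '%a' "$file") || return 2   # GNU stat: octal mode, e.g. 600
  if [ "$actual" != "$expected" ]; then
    echo "WARN: $file has mode $actual, expected $expected"
    return 1
  fi
  echo "OK: $file has mode $actual"
}
```

For example, `check_mode .env` verifies the default of 600, and `check_mode some-dir 700` checks a directory against a different mode.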
### Rotate Secrets
Regular rotation schedule:
```bash
# Generate new database password
./botserver rotate-secret tables
# Generate new drive credentials
./botserver rotate-secret drive
# Rotate all secrets
./botserver rotate-secrets --all
```
---
## Code Security Analysis
### Static Analysis with Clippy
```bash
# Run Clippy with all lints
cargo clippy -- -W clippy::all -W clippy::pedantic -W clippy::nursery
# Security-focused lints
cargo clippy -- -W clippy::unwrap_used -W clippy::expect_used
```
### SAST with Semgrep
```bash
# Install Semgrep
pip install semgrep
# Run Rust security rules
semgrep --config p/rust .
# Run all security rules
semgrep --config p/security-audit .
```
### Dependency Review
Check for outdated dependencies:
```bash
# List outdated crates
cargo outdated
# Check for yanked crates
cargo audit --deny yanked
```
---
## Database Security
### PostgreSQL Audit
```bash
# Check authentication settings
cat botserver-stack/conf/tables/pg_hba.conf
# Verify SSL is enabled
psql $DATABASE_URL -c "SHOW ssl;"
# Check user permissions
psql $DATABASE_URL -c "SELECT * FROM pg_roles WHERE rolname NOT LIKE 'pg_%';"
```
### Connection Security
Ensure encrypted connections:
```sql
-- Check current connections
SELECT datname, usename, ssl, client_addr
FROM pg_stat_ssl
JOIN pg_stat_activity ON pg_stat_ssl.pid = pg_stat_activity.pid;
```
---
## Compliance Checks
### OWASP Top 10
| Risk | Mitigation | Status Check |
|------|------------|--------------|
| Injection | Parameterized queries | `grep -r "raw_sql" src/` |
| Broken Auth | Zitadel handles auth | Check Zitadel config |
| Sensitive Data | Vault encryption | `vault status` |
| XXE | No XML parsing | N/A |
| Broken Access | RBAC via Zitadel | Check permissions |
| Security Misconfig | Audit configs | Review `conf/` |
| XSS | Template escaping | Askama auto-escapes |
| Insecure Deserialization | Serde validation | Code review |
| Vulnerable Components | `cargo audit` | Automated |
| Insufficient Logging | Structured logs | Check log config |
### SOC 2 Checklist
- [ ] Access controls documented
- [ ] Encryption at rest enabled
- [ ] Encryption in transit (TLS)
- [ ] Audit logging enabled
- [ ] Backup procedures documented
- [ ] Incident response plan
- [ ] Vulnerability management process
---
## Audit Schedule
| Audit Type | Frequency | Tool |
|------------|-----------|------|
| Dependency vulnerabilities | Daily (CI) | cargo-audit |
| Container scanning | Weekly | Trivy |
| Secret rotation | Monthly | Vault |
| Port scanning | Monthly | nmap |
| Full security review | Quarterly | Manual |
| Penetration testing | Annually | External |
---
## Automated Security Script
Create `security-audit.sh`:
```bash
#!/bin/bash
set -e
echo "=== BotServer Security Audit ==="
echo "Date: $(date)"
echo
echo "--- Rust Dependency Audit ---"
cargo audit --deny warnings || echo "WARN: Vulnerabilities found"
echo
echo "--- Checking for Hardcoded Secrets ---"
if grep -r "password.*=" --include="*.rs" src/ 2>/dev/null | grep -v "fn\|let\|//"; then
    echo "WARN: Potential hardcoded passwords found"
fi
echo
echo "--- Port Scan ---"
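# "|| echo" keeps "set -e" from aborting the script when nothing is listening
ss -tlnp | grep LISTEN || echo "No listening TCP sockets found"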
echo
echo "--- Certificate Expiry ---"
for cert in botserver-stack/conf/system/certificates/*/server.crt; do
    if [ -f "$cert" ]; then
        expiry=$(openssl x509 -in "$cert" -noout -enddate 2>/dev/null | cut -d= -f2)
        echo "$cert: $expiry"
    fi
done
echo
echo "--- Vault Status ---"
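# Test curl's own exit status: in a pipeline, jq succeeds on empty input,
# so the "not running" message would never be printed.
if status=$(curl -sf http://localhost:8200/v1/sys/seal-status 2>/dev/null); then
    echo "sealed: $(echo "$status" | jq -r '.sealed')"
else
    echo "Vault not running"
fi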
echo
echo "=== Audit Complete ==="
```
Run periodically:
```bash
chmod +x security-audit.sh
./security-audit.sh > audit-$(date +%Y%m%d).log
```
---
## Reporting Vulnerabilities
If you discover a security vulnerability in BotServer:
1. **Do NOT** create a public GitHub issue
2. Email security@generalbots.ai with details
3. Include steps to reproduce
4. Allow 90 days for fix before disclosure
---
## See Also
- [Secrets Management](../08-config/secrets-management.md) - Vault configuration
- [Updating Components](./updating-components.md) - Applying security updates
- [Backup and Recovery](./backup-recovery.md) - Data protection


@ -0,0 +1,576 @@
# Troubleshooting
This guide covers common issues you may encounter with BotServer and their solutions.
---
## Quick Diagnostics
### Check Overall Status
```bash
# View all service status
./botserver status
# Check specific service
./botserver status llm
./botserver status tables
./botserver status vault
```
### View Logs
```bash
# All logs
tail -f botserver-stack/logs/*.log
# Specific service
tail -100 botserver-stack/logs/llm.log
tail -100 botserver-stack/logs/postgres.log
tail -100 botserver-stack/logs/vault.log
# With filtering
grep -i error botserver-stack/logs/*.log
grep -i "failed\|error\|panic" botserver-stack/logs/*.log
```
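When many services log at once, a per-file count is often easier to scan than raw `grep` output. The sketch below assumes the log directory layout used above; `error_summary` is a hypothetical helper name, and the match pattern is the same `failed|error|panic` filter shown in the previous block.

```bash
# Prints "<file>: <count>" for each *.log file in LOG_DIR that has at least
# one line matching error, failed, or panic (case-insensitive).
error_summary() {
  local log_dir="${1:-botserver-stack/logs}" f count
  for f in "$log_dir"/*.log; do
    [ -f "$f" ] || continue
    # grep -c exits non-zero when the count is 0, so neutralise the status
    count=$(grep -ciE 'error|failed|panic' "$f" || true)
    if [ "$count" -gt 0 ]; then
      echo "$(basename "$f"): $count"
    fi
  done
}
```

Calling `error_summary` with no argument scans `botserver-stack/logs`; pass another directory as the first argument to scan archived logs.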
### System Resources
```bash
# Memory usage
free -h
# Disk space
df -h botserver-stack/
# Process list
ps aux | grep -E "llama|postgres|minio|vault|valkey"
# Open ports
ss -tlnp | grep LISTEN
```
---
## Startup Issues
### Bootstrap Fails
**Symptom:** `./botserver` fails during initial setup
**Common Causes & Solutions:**
1. **Port already in use**
```bash
# Find what's using the port
lsof -i :8080
lsof -i :5432
# Stop the conflicting process (use kill -9 only if it ignores the default signal)
kill <PID>
# Or change port in config
```
2. **Insufficient disk space**
```bash
# Check available space
df -h
# Clean up old installers
rm -rf botserver-installers/*.old
# Clean logs
rm -f botserver-stack/logs/*.log.old
```
3. **Download failure**
```bash
# Clear cache and retry
rm -rf botserver-installers/component-name*
./botserver bootstrap
# Manual download
curl -L -o botserver-installers/file.zip "URL"
```
4. **Permission denied**
```bash
# Fix permissions
chmod +x botserver
chmod -R u+rwX botserver-stack/
```
### Vault Won't Start
**Symptom:** Vault fails to initialize or unseal
**Solutions:**
1. **First-time setup failed**
```bash
# Reset Vault completely
rm -rf botserver-stack/data/vault/*
rm -f botserver-stack/conf/vault/init.json
./botserver bootstrap
```
2. **Vault is sealed**
```bash
# Check seal status
curl http://localhost:8200/v1/sys/seal-status
# Unseal manually
./botserver unseal
```
3. **Lost unseal keys**
```bash
# Check init.json exists
cat botserver-stack/conf/vault/init.json
# If lost, must reset Vault (DATA LOSS)
./botserver reset vault
```
### Database Won't Start
**Symptom:** PostgreSQL fails to start
**Solutions:**
1. **Corrupted data directory**
```bash
# Check PostgreSQL logs
tail -50 botserver-stack/logs/postgres.log
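# Last resort: pg_resetwal discards WAL and can lose recent transactions.
# Copy the data directory somewhere safe before running it.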
./botserver-stack/bin/tables/bin/pg_resetwal -f botserver-stack/data/tables/
```
2. **Port conflict**
```bash
# Check if another PostgreSQL is running
lsof -i :5432
# Stop system PostgreSQL
sudo systemctl stop postgresql
```
3. **Incorrect permissions**
```bash
chmod 700 botserver-stack/data/tables/
```
---
## Service Issues
### LLM Server Not Responding
**Symptom:** Requests to port 8081/8082 fail
**Solutions:**
1. **Check if running**
```bash
pgrep llama-server
curl -k https://localhost:8081/health
```
2. **Model not found**
```bash
# Verify model exists
ls -la botserver-stack/data/llm/
# Re-download model
./botserver update llm
```
3. **Out of memory**
```bash
# Check memory usage
free -h
# Use smaller model or reduce context
# Edit config.csv:
# llm-server-ctx-size,2048
```
4. **GPU issues**
```bash
# Check CUDA
nvidia-smi
# Fall back to CPU
# Edit config.csv:
# llm-server-gpu-layers,0
```
5. **Restart LLM server**
```bash
pkill llama-server
./botserver start llm
```
### Drive (MinIO) Issues
**Symptom:** File uploads/downloads fail
**Solutions:**
1. **Check MinIO status**
```bash
curl http://localhost:9000/minio/health/live
```
2. **Credential issues**
```bash
# Verify credentials from Vault
./botserver show-secret drive
# Test with mc client
mc alias set local http://localhost:9000 ACCESS_KEY SECRET_KEY
mc ls local/
```
3. **Disk full**
```bash
df -h botserver-stack/data/drive/
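# Remove non-current object versions (only applies to versioned buckets).
# Never delete .minio.sys: it holds MinIO's internal metadata.
mc rm --recursive --force --versions --non-current local/bucket/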
```
### Cache (Valkey) Issues
**Symptom:** Session errors, slow responses
**Solutions:**
1. **Check Valkey status**
```bash
./botserver-stack/bin/cache/valkey-cli ping
# Expected: PONG
```
2. **Memory issues**
```bash
./botserver-stack/bin/cache/valkey-cli info memory
# Flush cache if needed (WARNING: deletes every key, including active sessions)
./botserver-stack/bin/cache/valkey-cli FLUSHALL
```
3. **Connection refused**
```bash
# Check if running
pgrep valkey-server
# Restart
./botserver restart cache
```
### Directory (Zitadel) Issues
**Symptom:** Login fails, authentication errors
**Solutions:**
1. **Check Zitadel logs**
```bash
tail -100 botserver-stack/logs/zitadel.log
```
2. **Database connection**
```bash
# Zitadel uses PostgreSQL
psql $DATABASE_URL -c "SELECT 1;"
```
3. **Certificate issues**
```bash
# Regenerate certificates
./botserver regenerate-certs
```
---
## Connection Issues
### Cannot Connect to Database
**Error:** `connection refused` or `authentication failed`
**Solutions:**
1. **Verify DATABASE_URL**
```bash
echo $DATABASE_URL
# Should be: postgres://user:pass@localhost:5432/dbname
```
2. **Check PostgreSQL is running**
```bash
pgrep postgres
./botserver status tables
```
3. **Test connection**
```bash
psql $DATABASE_URL -c "SELECT 1;"
```
4. **Check pg_hba.conf**
```bash
cat botserver-stack/conf/tables/pg_hba.conf
# Ensure local connections are allowed
```
### SSL/TLS Certificate Errors
**Error:** `certificate verify failed` or `SSL handshake failed`
**Solutions:**
1. **Regenerate certificates**
```bash
./botserver regenerate-certs
```
2. **Check certificate validity**
```bash
openssl x509 -in botserver-stack/conf/system/certificates/api/server.crt -noout -dates
```
3. **Skip verification (development only)**
```bash
curl -k https://localhost:8081/health
```
### Network Timeouts
**Error:** Requests time out after waiting
**Solutions:**
1. **Check DNS resolution**
```bash
nslookup api.botserver.local
```
2. **Verify firewall rules**
```bash
sudo ufw status
sudo iptables -L
```
3. **Check service is listening**
```bash
ss -tlnp | grep 8080
```
---
## Performance Issues
### Slow Response Times
**Solutions:**
1. **Check system resources**
```bash
top -b -n 1 | head -20
iostat -x 1 3
```
2. **Database performance**
```bash
psql $DATABASE_URL -c "SELECT * FROM pg_stat_activity;"
# Vacuum database
psql $DATABASE_URL -c "VACUUM ANALYZE;"
```
3. **LLM performance**
```bash
# Reduce context size
# config.csv: llm-server-ctx-size,2048
# Use GPU layers
# config.csv: llm-server-gpu-layers,35
```
4. **Enable caching**
```bash
# Verify cache is working
./botserver-stack/bin/cache/valkey-cli info stats
```
### High Memory Usage
**Solutions:**
1. **Identify memory hogs**
```bash
ps aux --sort=-%mem | head -10
```
2. **Reduce LLM memory**
```bash
# Use quantized model (Q3_K_M instead of F16)
# Reduce context: llm-server-ctx-size,1024
# Reduce batch: llm-server-batch-size,256
```
3. **Limit PostgreSQL memory**
```bash
# Edit postgresql.conf
shared_buffers = 256MB
# work_mem applies per sort/hash operation in each session; lower it if memory is still tight
work_mem = 64MB
```
### High Disk Usage
**Solutions:**
1. **Find large files**
```bash
du -sh botserver-stack/*
du -sh botserver-stack/data/*
```
2. **Clean logs**
```bash
truncate -s 0 botserver-stack/logs/*.log
```
3. **Clean old installers**
```bash
# Keep only latest versions
ls -la botserver-installers/
rm botserver-installers/old-*
```
4. **Prune drive storage**
```bash
# mc refuses recursive removal without --force
mc rm --recursive --force --older-than 30d local/bucket/
```
---
## Update Issues
### Component Update Failed
**Symptom:** Update command fails or service won't start after update
**Solutions:**
1. **Clear cache and retry**
```bash
rm botserver-installers/component-name*
./botserver update component-name
```
2. **Checksum mismatch**
```bash
# Verify checksum
sha256sum botserver-installers/file.zip
# Compare with 3rdparty.toml (the sha256 line sits inside the component's table)
grep -A 4 '^\[components\.component-name\]' 3rdparty.toml | grep sha256
# Update checksum if release changed
```
3. **Rollback to previous version**
```bash
# If old version cached
ls botserver-installers/
# Restore old binary
cp botserver-installers/old-version.zip /tmp/
# -o overwrites the newer files without prompting
unzip -o /tmp/old-version.zip -d botserver-stack/bin/component/
```
### Database Migration Failed
**Solutions:**
1. **Check migration status**
```bash
./botserver migrate --status
```
2. **Run migrations manually**
```bash
./botserver migrate
```
3. **Rollback migration**
```bash
./botserver migrate --rollback
```
4. **Reset from backup**
```bash
pg_restore -c -d $DATABASE_URL backup.dump
```
---
## Common Error Messages
| Error | Cause | Solution |
|-------|-------|----------|
| `connection refused` | Service not running | Start the service |
| `permission denied` | File permissions | `chmod +x` on binary |
| `address already in use` | Port conflict | Kill conflicting process |
| `out of memory` | Insufficient RAM | Reduce model/context size |
| `no such file or directory` | Missing binary/config | Re-run bootstrap |
| `certificate verify failed` | SSL issues | Regenerate certificates |
| `authentication failed` | Wrong credentials | Check Vault secrets |
| `disk quota exceeded` | Disk full | Clean logs/old files |
| `too many open files` | ulimit too low | `ulimit -n 65536` |
| `connection timed out` | Network/firewall | Check firewall rules |
---
## Getting Help
### Collect Diagnostics
```bash
# Generate diagnostic report
./botserver diagnose > diagnostics-$(date +%Y%m%d).txt
# Include in bug reports:
# - BotServer version
# - OS and architecture
# - Error messages
# - Relevant logs
```
### Debug Logging
```bash
# Enable verbose logging
RUST_LOG=debug ./botserver
# Trace level (very verbose)
RUST_LOG=trace ./botserver
```
### Community Support
- GitHub Issues: [github.com/GeneralBots/BotServer/issues](https://github.com/GeneralBots/BotServer/issues)
- Documentation: [docs.generalbots.ai](https://docs.generalbots.ai)
---
## See Also
- [Updating Components](./updating-components.md) - Safe update procedures
- [Backup and Recovery](./backup-recovery.md) - Data protection
- [Security Auditing](./security-auditing.md) - Security checks


@ -0,0 +1,552 @@
# Updating Components
BotServer's stack components are regularly updated by their respective maintainers. This guide explains how to check for updates, apply them safely, and verify everything works correctly.
## Update Philosophy
BotServer uses a **conservative update strategy**:
1. **Pinned Versions** - Each component has a tested version in `3rdparty.toml`
2. **Checksum Verification** - Downloads are verified with SHA256 hashes
3. **Cached Downloads** - Updates are cached in `botserver-installers/` for offline use
4. **Rollback Ready** - Previous binaries can be restored from cache
## Checking for Updates
### View Current Versions
Check installed versions:
```bash
./botserver version --all
```
Example output:
```
BotServer Stack Versions:
vault: 1.15.4
tables: 17.2.0 (PostgreSQL)
directory: 2.70.4 (Zitadel)
drive: latest (MinIO)
cache: 8.0.2 (Valkey)
llm: b7345 (llama.cpp)
email: 0.10.7 (Stalwart)
proxy: 2.9.1 (Caddy)
dns: 1.11.1 (CoreDNS)
alm: 10.0.2 (Forgejo)
meeting: 2.8.2 (LiveKit)
```
### Check Upstream Releases
| Component | Release Page |
|-----------|--------------|
| llama.cpp | [github.com/ggml-org/llama.cpp/releases](https://github.com/ggml-org/llama.cpp/releases) |
| PostgreSQL | [postgresql.org/download](https://www.postgresql.org/download/) |
| MinIO | [github.com/minio/minio/releases](https://github.com/minio/minio/releases) |
| Valkey | [github.com/valkey-io/valkey/releases](https://github.com/valkey-io/valkey/releases) |
| Zitadel | [github.com/zitadel/zitadel/releases](https://github.com/zitadel/zitadel/releases) |
| Vault | [releases.hashicorp.com/vault](https://releases.hashicorp.com/vault/) |
| Stalwart | [github.com/stalwartlabs/mail-server/releases](https://github.com/stalwartlabs/mail-server/releases) |
| Caddy | [github.com/caddyserver/caddy/releases](https://github.com/caddyserver/caddy/releases) |
| CoreDNS | [github.com/coredns/coredns/releases](https://github.com/coredns/coredns/releases) |
| Forgejo | [codeberg.org/forgejo/forgejo/releases](https://codeberg.org/forgejo/forgejo/releases) |
| LiveKit | [github.com/livekit/livekit/releases](https://github.com/livekit/livekit/releases) |
---
## Updating the Configuration
Component URLs and checksums are defined in `3rdparty.toml`. To update a component:
### 1. Edit `3rdparty.toml`
```toml
[components.llm]
name = "Llama.cpp Server"
url = "https://github.com/ggml-org/llama.cpp/releases/download/b7345/llama-b7345-bin-ubuntu-x64.zip"
filename = "llama-b7345-bin-ubuntu-x64.zip"
sha256 = "91b066ecc53c20693a2d39703c12bc7a69c804b0768fee064d47df702f616e52"
```
### 2. Get the New Checksum
Most releases publish SHA256 checksums. If a release does not, calculate the checksum yourself:
```bash
# Download and calculate checksum
curl -L -o new-release.zip "https://github.com/.../new-release.zip"
sha256sum new-release.zip
```
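Comparing two 64-character hex strings by eye invites mistakes, so the comparison is worth scripting. The helper below is a sketch, not BotServer's own verification code; `verify_sha256` is a hypothetical name, and it assumes the GNU coreutils `sha256sum` already used above.

```bash
# Returns 0 when FILE's SHA256 digest equals EXPECTED (hex, any letter case);
# prints a mismatch message to stderr and returns 1 otherwise.
verify_sha256() {
  local file="$1" expected actual
  expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')     # normalise to lowercase
  actual=$(sha256sum "$file" | cut -d' ' -f1)       # first field is the digest
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file (expected $expected, got $actual)" >&2
    return 1
  fi
}
```

For example, `verify_sha256 new-release.zip <published-checksum>` can gate the edit to `3rdparty.toml`: only paste the checksum into the file once the command reports `OK`.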
### 3. Update Both Files
Update both configuration files to stay in sync:
- `3rdparty.toml` - Main component registry
- `config/llm_releases.json` - LLM-specific builds and checksums
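
Keeping the two files in sync is easy to forget. As a rough sketch, assuming `3rdparty.toml` uses the `[components.<name>]` layout shown earlier and `config/llm_releases.json` stores checksums with a `sha256:` prefix, the helpers below read the pinned checksum for one component and confirm that it also appears in the JSON file. The names `toml_sha256` and `checksums_in_sync` are hypothetical and not part of BotServer.

```bash
# Prints the sha256 value from the [components.<name>] table of a TOML file.
# Usage: toml_sha256 <name> <toml-file>
toml_sha256() {
  awk -v section="[components.$1]" '
    $0 == section                { in_section = 1; next }
    /^\[/                        { in_section = 0 }      # any other table ends it
    in_section && $1 == "sha256" { gsub(/"/, "", $3); print $3; exit }
  ' "$2"
}

# Succeeds when the checksum pinned in the TOML file also appears, with the
# "sha256:" prefix, somewhere in the JSON file.
# Usage: checksums_in_sync <name> <toml-file> <json-file>
checksums_in_sync() {
  local sum
  sum=$(toml_sha256 "$1" "$2")
  [ -n "$sum" ] && grep -q "sha256:$sum" "$3"
}
```

A pre-commit check could then run `checksums_in_sync llm 3rdparty.toml config/llm_releases.json || echo "checksums differ"`. The line-based parsing is deliberately simple and would need a real TOML parser for anything beyond this flat layout.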
---
## Component Update Procedures
### Updating llama.cpp (LLM Server)
The LLM server powers local AI inference. Updates often include performance improvements and new model support.
**Step 1: Check the latest release**
Visit [github.com/ggml-org/llama.cpp/releases](https://github.com/ggml-org/llama.cpp/releases)
**Step 2: Update `3rdparty.toml`**
```toml
[components.llm]
name = "Llama.cpp Server"
url = "https://github.com/ggml-org/llama.cpp/releases/download/b7345/llama-b7345-bin-ubuntu-x64.zip"
filename = "llama-b7345-bin-ubuntu-x64.zip"
sha256 = "91b066ecc53c20693a2d39703c12bc7a69c804b0768fee064d47df702f616e52"
```
**Step 3: Update `config/llm_releases.json`**
This file contains platform-specific builds:
```json
{
"llama_cpp": {
"version": "b7345",
"base_url": "https://github.com/ggml-org/llama.cpp/releases/download",
"checksums": {
"llama-b7345-bin-ubuntu-x64.zip": "sha256:91b066ecc53c20693a2d39703c12bc7a69c804b0768fee064d47df702f616e52",
"llama-b7345-bin-macos-arm64.zip": "sha256:72ae9b4a4605aa1223d7aabaa5326c66c268b12d13a449fcc06f61099cd02a52"
}
}
}
```
**Step 4: Update the version constant in `installer.rs`**
```rust
const LLAMA_CPP_VERSION: &str = "b7345";
```
**Step 5: Apply the update**
```bash
# Stop LLM service
pkill llama-server
# Remove old binary
rm -rf botserver-stack/bin/llm/*
# Re-run bootstrap (downloads new version)
./botserver bootstrap
# Or manually trigger download
./botserver update llm
```
**Available llama.cpp Builds (b7345)**
| Platform | Architecture | Variant | Filename |
|----------|-------------|---------|----------|
| Linux | x64 | CPU | `llama-b7345-bin-ubuntu-x64.zip` |
| Linux | x64 | Vulkan | `llama-b7345-bin-ubuntu-vulkan-x64.zip` |
| Linux | s390x | CPU | `llama-b7345-bin-ubuntu-s390x.zip` |
| macOS | ARM64 | Metal | `llama-b7345-bin-macos-arm64.zip` |
| macOS | x64 | CPU | `llama-b7345-bin-macos-x64.zip` |
| Windows | x64 | CPU | `llama-b7345-bin-win-cpu-x64.zip` |
| Windows | x64 | CUDA 12.4 | `llama-b7345-bin-win-cuda-12.4-x64.zip` |
| Windows | x64 | CUDA 13.1 | `llama-b7345-bin-win-cuda-13.1-x64.zip` |
| Windows | x64 | Vulkan | `llama-b7345-bin-win-vulkan-x64.zip` |
| Windows | ARM64 | CPU | `llama-b7345-bin-win-cpu-arm64.zip` |
> **Note:** Linux releases are transitioning from `.zip` to `.tar.gz` format.
---
### Updating PostgreSQL (Tables)
**Warning:** Database updates require careful planning. Always back up first!
```bash
# Back up database
pg_dump "$DATABASE_URL" > backup-$(date +%Y%m%d).sql
# Edit 3rdparty.toml (TOML, not a shell command) so that the section reads:
#   [components.tables]
#   url = "https://github.com/theseus-rs/postgresql-binaries/releases/download/17.2.0/postgresql-17.2.0-x86_64-unknown-linux-gnu.tar.gz"
#   filename = "postgresql-17.2.0-x86_64-unknown-linux-gnu.tar.gz"
# Stop services
./botserver stop
# Apply update
./botserver update tables
# Start services
./botserver start
# Verify
psql $DATABASE_URL -c "SELECT version();"
```
---
### Updating MinIO (Drive)
MinIO updates are generally safe and backward-compatible.
```bash
# Edit 3rdparty.toml (TOML, not a shell command) so that the section reads:
#   [components.drive]
#   url = "https://dl.min.io/server/minio/release/linux-amd64/minio"
#   filename = "minio"
# Apply update
./botserver update drive
# Verify
curl http://localhost:9000/minio/health/live
```
---
### Updating Valkey (Cache)
Valkey requires compilation from source.
```bash
# Edit 3rdparty.toml (TOML, not a shell command) so that the section reads:
#   [components.cache]
#   url = "https://github.com/valkey-io/valkey/archive/refs/tags/8.0.2.tar.gz"
#   filename = "valkey-8.0.2.tar.gz"
# Stop cache
./botserver stop cache
# Remove old build
rm -rf botserver-stack/bin/cache/*
# Rebuild
./botserver update cache
# Verify
./botserver-stack/bin/cache/valkey-cli ping
```
---
### Updating Zitadel (Directory)
**Warning:** Directory service updates may require database migrations.
```bash
# Backup Zitadel database
pg_dump -d zitadel > zitadel-backup-$(date +%Y%m%d).sql
# Edit 3rdparty.toml (TOML, not a shell command) so that the section reads:
#   [components.directory]
#   url = "https://github.com/zitadel/zitadel/releases/download/v2.70.4/zitadel-linux-amd64.tar.gz"
#   filename = "zitadel-linux-amd64.tar.gz"
# Stop directory
./botserver stop directory
# Apply update
./botserver update directory
# Run migrations (if needed)
./botserver-stack/bin/directory/zitadel setup
# Start
./botserver start directory
```
---
### Updating Vault (Secrets)
**Critical:** Vault updates require unsealing after restart.
```bash
# Edit 3rdparty.toml (TOML, not a shell command) so that the section reads:
#   [components.vault]
#   url = "https://releases.hashicorp.com/vault/1.15.4/vault_1.15.4_linux_amd64.zip"
#   filename = "vault_1.15.4_linux_amd64.zip"
# Stop Vault
./botserver stop vault
# Apply update
./botserver update vault
# Start and unseal
./botserver start vault
./botserver unseal
```
---
## Platform-Specific Builds
### Automatic Detection
BotServer automatically detects your platform and downloads the appropriate build:
1. **Operating System** - Linux, macOS, Windows
2. **Architecture** - x64, ARM64, s390x
3. **GPU Support** - CUDA, Vulkan, Metal, ROCm
### Manual Override
Force a specific build variant:
```toml
# In 3rdparty.toml - use Vulkan build instead of CPU
[components.llm]
url = "https://github.com/ggml-org/llama.cpp/releases/download/b7345/llama-b7345-bin-ubuntu-vulkan-x64.zip"
```
### GPU Detection
The installer checks for GPU support:
```rust
// Linux CUDA detection
if Path::new("/usr/local/cuda").exists() || env::var("CUDA_HOME").is_ok() {
// Use CUDA build
}
// Vulkan detection
if Path::new("/usr/share/vulkan").exists() || env::var("VULKAN_SDK").is_ok() {
// Use Vulkan build
}
```
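To preview which variant a host is likely to get before running bootstrap, the same checks can be reproduced in the shell. This is only an approximation of the snippet above: the function name `detect_gpu_variant` is hypothetical, and the priority order (CUDA first, then Vulkan, then CPU) is an assumption rather than confirmed installer behaviour.

```bash
# Prints "cuda", "vulkan", or "cpu" using the path and environment checks
# from the installer snippet. The order of the checks is an assumption.
detect_gpu_variant() {
  if [ -e /usr/local/cuda ] || [ -n "${CUDA_HOME+set}" ]; then
    echo cuda
  elif [ -e /usr/share/vulkan ] || [ -n "${VULKAN_SDK+set}" ]; then
    echo vulkan
  else
    echo cpu
  fi
}
```

If the printed variant is not the one you want, the manual override in `3rdparty.toml` described earlier takes precedence over detection.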
---
## Offline Updates
### Pre-download for Air-Gapped Systems
1. Download releases on a connected machine:
```bash
# Download all components
mkdir offline-updates
cd offline-updates
# LLM
curl -LO https://github.com/ggml-org/llama.cpp/releases/download/b7345/llama-b7345-bin-ubuntu-x64.zip
# Database
curl -LO https://github.com/theseus-rs/postgresql-binaries/releases/download/17.2.0/postgresql-17.2.0-x86_64-unknown-linux-gnu.tar.gz
# ... other components
```
2. Transfer to air-gapped system
3. Copy to cache directory:
```bash
cp offline-updates/* /path/to/botserver-installers/
```
4. Run bootstrap (uses cached files):
```bash
./botserver bootstrap
```
---
## Verifying Updates
### Run Tests
```bash
# Run test suite
cargo test
# Integration tests
./botserver test
```
### Health Checks
```bash
# Check all services
./botserver status
# Individual service checks
curl -k https://localhost:8081/health # LLM
curl -k https://localhost:8082/health # Embedding
curl http://localhost:9000/minio/health/live # Drive
```
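Services can take a few seconds to accept connections after a restart, so a single `curl` immediately after `./botserver start` may fail spuriously. One way to handle this, sketched here with a hypothetical `retry` helper that is not part of BotServer, is to poll each health endpoint a bounded number of times.

```bash
# Runs a command up to ATTEMPTS times, sleeping DELAY seconds between tries.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only when another attempt will follow
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

For example, `retry 10 3 curl -fsS http://localhost:9000/minio/health/live` waits up to roughly half a minute for the drive service. The `-f` flag makes `curl` return a non-zero status on HTTP errors, which is what lets `retry` detect an unhealthy response.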
### Security Audit
After updating dependencies:
```bash
# Rust dependencies
cargo audit
# Check for known vulnerabilities
cargo audit --deny warnings
```
---
## Rollback Procedure
If an update causes issues:
### Quick Rollback
```bash
# Stop services
./botserver stop
# Restore from cache (previous version must exist)
cp botserver-installers/llama-b4547-bin-ubuntu-x64.zip /tmp/
# -o overwrites the newer files without prompting
unzip -o /tmp/llama-b4547-bin-ubuntu-x64.zip -d botserver-stack/bin/llm/
# Restart
./botserver start
```
### Full Rollback
```bash
# Restore database from backup
psql $DATABASE_URL < backup-20241210.sql
# Restore old binaries
rm -rf botserver-stack/bin/
tar -xzf botserver-stack-backup.tar.gz
# Restart
./botserver start
```
---
## Update Schedule Recommendations
| Component | Update Frequency | Risk Level |
|-----------|-----------------|------------|
| llama.cpp | Weekly/Monthly | Low |
| MinIO | Monthly | Low |
| Valkey | Quarterly | Low |
| Caddy | Monthly | Low |
| CoreDNS | Quarterly | Low |
| PostgreSQL | Quarterly | Medium |
| Zitadel | Quarterly | Medium |
| Vault | Quarterly | High |
| Stalwart | Monthly | Medium |
### Security Updates
Apply security patches immediately for:
- Vault (secrets management)
- PostgreSQL (database)
- Zitadel (authentication)
---
## Automating Updates
### Update Script
Create `update-components.sh`:
```bash
#!/bin/bash
set -e
echo "Backing up current state..."
./botserver backup
echo "Stopping services..."
./botserver stop
echo "Updating components..."
for component in llm drive cache; do
    echo "Updating $component..."
    ./botserver update "$component"
done
echo "Starting services..."
./botserver start
echo "Running health checks..."
./botserver status
echo "Update complete!"
```
### Scheduled Updates
Use cron for automated updates (use with caution):
```bash
# Weekly LLM updates (low risk)
0 3 * * 0 /path/to/botserver update llm
# Monthly full updates
0 3 1 * * /path/to/update-components.sh
```
---
## Troubleshooting Updates
### Download Failures
```bash
# Clear cache and retry
rm botserver-installers/component-name*
./botserver update component-name
```
### Checksum Mismatch
```bash
# Verify checksum manually
sha256sum botserver-installers/llama-b7345-bin-ubuntu-x64.zip
# Compare with 3rdparty.toml
```
### Service Won't Start
```bash
# Check logs
tail -100 botserver-stack/logs/llm.log
# Check permissions
ls -la botserver-stack/bin/llm/
# Make executable
chmod +x botserver-stack/bin/llm/llama-server
```
### Database Migration Errors
```bash
# Run migrations manually
./botserver migrate
# Or reset (WARNING: data loss)
./botserver reset tables
```
---
## See Also
- [Component Reference](./component-reference.md) - Detailed component documentation
- [Security Auditing](./security-auditing.md) - Vulnerability scanning
- [Backup and Recovery](./backup-recovery.md) - Data protection


@ -379,6 +379,12 @@
- [Multimodal](./18-appendix-external-services/multimodal.md)
- [Console (XtreeUI)](./18-appendix-external-services/console.md)
- [Appendix C: Maintenance](./19-maintenance/README.md)
- [Updating Components](./19-maintenance/updating-components.md)
- [Component Reference](./19-maintenance/component-reference.md)
- [Security Auditing](./19-maintenance/security-auditing.md)
- [Backup and Recovery](./19-maintenance/backup-recovery.md)
- [Troubleshooting](./19-maintenance/troubleshooting.md)
- [Appendix D: Documentation Style](./16-appendix-docs-style/conversation-examples.md)
- [SVG and Conversation Standards](./16-appendix-docs-style/svg.md)