From af865b87e733f5caf4a27e232a47a13519750ea6 Mon Sep 17 00:00:00 2001 From: "Rodrigo Rodriguez (Pragmatismo)" Date: Mon, 24 Nov 2025 08:42:58 -0300 Subject: [PATCH] - 7 docs revised. --- docs/src/SUMMARY.md | 3 +- docs/src/chapter-06-gbdialog/keyword-find.md | 382 ++++++----- .../chapter-06-gbdialog/keyword-send-mail.md | 14 +- docs/src/chapter-07-gbapp/README.md | 616 +++++++++--------- docs/src/chapter-07-gbapp/architecture.md | 49 +- .../src/chapter-07-gbapp/assets/data-flow.svg | 139 ++++ .../assets/system-architecture.svg | 155 +++++ docs/src/chapter-07-gbapp/building.md | 13 +- docs/src/chapter-07-gbapp/example-gbapp.md | 354 ++++++++++ docs/src/chapter-07-gbapp/philosophy.md | 279 ++++++++ docs/src/chapter-07-gbapp/prompt-manager.md | 261 -------- docs/src/chapter-08-config/README.md | 133 +++- docs/src/chapter-08-config/answer-modes.md | 1 - docs/src/chapter-08-config/config-csv.md | 458 +++++++------ docs/src/chapter-08-config/llm-config.md | 262 +++++--- docs/src/chapter-08-config/parameters.md | 399 +++++------- 16 files changed, 2084 insertions(+), 1434 deletions(-) create mode 100644 docs/src/chapter-07-gbapp/assets/data-flow.svg create mode 100644 docs/src/chapter-07-gbapp/assets/system-architecture.svg create mode 100644 docs/src/chapter-07-gbapp/example-gbapp.md create mode 100644 docs/src/chapter-07-gbapp/philosophy.md delete mode 100644 docs/src/chapter-08-config/answer-modes.md diff --git a/docs/src/SUMMARY.md b/docs/src/SUMMARY.md index d24178828..b4a6d4226 100644 --- a/docs/src/SUMMARY.md +++ b/docs/src/SUMMARY.md @@ -88,11 +88,12 @@ - [Architecture Overview](./chapter-07-gbapp/architecture.md) - [Building from Source](./chapter-07-gbapp/building.md) - [Container Deployment (LXC)](./chapter-07-gbapp/containers.md) + - [Philosophy](./chapter-07-gbapp/philosophy.md) + - [Example gbapp](./chapter-07-gbapp/example-gbapp.md) - [Module Structure](./chapter-07-gbapp/crates.md) - [Service Layer](./chapter-07-gbapp/services.md) - [Creating Custom 
Keywords](./chapter-07-gbapp/custom-keywords.md) - [Adding Dependencies](./chapter-07-gbapp/dependencies.md) - - [Prompt Manager](./chapter-07-gbapp/prompt-manager.md) # Part VIII - Bot Configuration diff --git a/docs/src/chapter-06-gbdialog/keyword-find.md b/docs/src/chapter-06-gbdialog/keyword-find.md index c2f20c7d9..5ec24256e 100644 --- a/docs/src/chapter-06-gbdialog/keyword-find.md +++ b/docs/src/chapter-06-gbdialog/keyword-find.md @@ -1,251 +1,293 @@ # FIND -**Search for specific data in storage or knowledge bases.** The FIND keyword performs targeted searches in bot memory, databases, or document collections to locate specific information. +Search and retrieve data from database tables using filter criteria. ## Syntax ```basic -result = FIND(pattern) -result = FIND(pattern, location) -result = FIND(pattern, location, options) +result = FIND "table_name", "filter_criteria" ``` ## Parameters -- `pattern` - Search pattern or query string -- `location` - Where to search (optional, defaults to current KB) -- `options` - Search options like case sensitivity, limit (optional) +- `table_name` - The name of the database table to search +- `filter_criteria` - Filter expression in the format "field=value" ## Description -FIND searches for data matching a pattern. Unlike semantic search with LLM, FIND does exact or pattern-based matching. Useful for structured data, IDs, specific values. 
- -## Search Locations - -### Bot Memory -```basic -' Find in bot's permanent storage -user_data = FIND("user_*", "BOT_MEMORY") -' Returns all keys starting with "user_" -``` - -### Session Variables -```basic -' Find in current session -form_fields = FIND("form_*", "SESSION") -' Returns all form-related variables -``` - -### Knowledge Base -```basic -' Find specific documents -policies = FIND("*.pdf", "policies") -' Returns all PDFs in policies collection -``` - -### Database -```basic -' Find in database tables -orders = FIND("status:pending", "orders") -' Returns pending orders -``` +FIND searches database tables for records matching specified criteria. It returns an array of matching records that can be iterated over using FOR EACH loops. ## Examples ### Basic Search ```basic -' Find a specific user -user = FIND("email:john@example.com") - -if user - TALK "Found user: " + user.name -else - TALK "User not found" -end -``` - -### Pattern Matching -```basic -' Find all items matching pattern -items = FIND("SKU-2024-*") +' Find records with specific action +items = FIND "gb.rob", "ACTION=EMUL" FOR EACH item IN items - TALK item.name + ": " + item.price -END + TALK "Found: " + item.company +NEXT ``` -### Multi-Criteria Search +### Single Field Filter ```basic -' Complex search with multiple conditions -results = FIND("type:invoice AND status:unpaid AND date>2024-01-01") +' Find pending orders +orders = FIND "orders", "status=pending" -total = 0 -FOR EACH invoice IN results - total = total + invoice.amount -END -TALK "Total unpaid: $" + total +FOR EACH order IN orders + TALK "Order #" + order.id + " is pending" +NEXT ``` -### Search with Options +### Working with Results ```basic -' Limited, case-insensitive search -matches = FIND("john", "customers", { - case_sensitive: false, - limit: 10, - fields: ["name", "email"] -}) +' Find and process customer records +customers = FIND "customers", "city=Seattle" + +FOR EACH customer IN customers + TALK customer.name + " from 
" + customer.address + + ' Access fields with dot notation + email = customer.email + phone = customer.phone + + ' Update related data + SET "contacts", "id=" + customer.id, "last_contacted=" + NOW() +NEXT ``` -## Return Values +## Return Value -FIND returns different types based on matches: +FIND returns an array of records from the specified table. Each record is an object with fields accessible via dot notation. -- **Single match** - Returns the item directly -- **Multiple matches** - Returns array of items -- **No matches** - Returns null or empty array -- **Error** - Returns null with error in ERROR variable +- Returns empty array if no matches found +- Returns array of matching records if successful +- Each record contains all columns from the table + +## Field Access + +Access fields in returned records using dot notation: + +```basic +items = FIND "products", "category=electronics" + +FOR EACH item IN items + ' Access fields directly + TALK item.name + TALK item.price + TALK item.description + + ' Use null coalescing for optional fields + website = item.website ?? 
"" + + ' Check field existence + IF item.discount != "" THEN + TALK "On sale: " + item.discount + "% off" + END IF +NEXT +``` ## Common Patterns -### Check Existence +### Process All Matching Records ```basic -exists = FIND("id:" + user_id) -if exists - TALK "User already registered" -else - ' Create new user -end +tasks = FIND "tasks", "status=open" + +FOR EACH task IN tasks + ' Process each task + TALK "Processing task: " + task.title + + ' Update task status + SET "tasks", "id=" + task.id, "status=in_progress" +NEXT ``` -### Filter Results +### Check If Records Exist ```basic -all_products = FIND("*", "products") -in_stock = [] +users = FIND "users", "email=john@example.com" -FOR EACH product IN all_products - if product.quantity > 0 - in_stock.append(product) - end -END +IF LENGTH(users) > 0 THEN + TALK "User exists" +ELSE + TALK "User not found" +END IF ``` -### Aggregate Data +### Data Enrichment ```basic -sales = FIND("date:" + TODAY(), "transactions") -daily_total = 0 +companies = FIND "companies", "needs_update=true" -FOR EACH sale IN sales - daily_total = daily_total + sale.amount -END - -TALK "Today's sales: $" + daily_total +FOR EACH company IN companies + ' Get additional data + website = company.website ?? "" + + IF website == "" THEN + ' Look up website + website = WEBSITE OF company.name + + ' Update record + SET "companies", "id=" + company.id, "website=" + website + END IF + + ' Fetch and process website data + page = GET website + ' Process page content... 
+NEXT ``` -### Search History +### Batch Processing with Delays ```basic -' Find previous conversations -history = FIND("session:" + user_id, "messages", { - sort: "timestamp DESC", - limit: 50 -}) +emails = FIND "email_queue", "sent=false" -TALK "Your last conversation:" -FOR EACH message IN history - TALK message.timestamp + ": " + message.content -END +FOR EACH email IN emails + ' Send email + SEND MAIL email.to, email.subject, email.body + + ' Mark as sent + SET "email_queue", "id=" + email.id, "sent=true" + + ' Rate limiting + WAIT 1000 +NEXT ``` -## Performance Tips +## Filter Expressions + +The filter parameter uses simple equality expressions: + +- `"field=value"` - Match exact value +- Multiple conditions must be handled in BASIC code after retrieval -### Use Specific Patterns ```basic -' Good - Specific pattern -orders = FIND("order_2024_01_*") +' Get all records then filter in BASIC +all_orders = FIND "orders", "status=active" -' Bad - Too broad -everything = FIND("*") +FOR EACH order IN all_orders + ' Additional filtering in code + IF order.amount > 1000 AND order.priority == "high" THEN + ' Process high-value orders + TALK "Priority order: " + order.id + END IF +NEXT ``` -### Limit Results -```basic -' Get only what you need -recent = FIND("*", "logs", {limit: 100}) -``` +## Working with Different Data Types -### Cache Repeated Searches ```basic -' Cache for session -if not cached_products - cached_products = FIND("*", "products") -end -' Use cached_products instead of searching again +products = FIND "products", "active=true" + +FOR EACH product IN products + ' String fields + name = product.name + + ' Numeric fields + price = product.price + quantity = product.quantity + + ' Date fields + created = product.created_at + + ' Boolean-like fields (stored as strings) + IF product.featured == "true" THEN + TALK "Featured: " + name + END IF +NEXT ``` ## Error Handling ```basic -try - results = FIND(user_query) - if results - TALK "Found " + LENGTH(results) 
+ " matches" - else - TALK "No results found" - end -catch error - TALK "Search failed. Please try again." - LOG "FIND error: " + error -end +' Handle potential errors +items = FIND "inventory", "warehouse=main" + +IF items == null THEN + TALK "Error accessing inventory data" +ELSE IF LENGTH(items) == 0 THEN + TALK "No items found in main warehouse" +ELSE + TALK "Found " + LENGTH(items) + " items" + ' Process items... +END IF ``` -## Comparison with Other Keywords +## Performance Considerations -| Keyword | Purpose | Use When | -|---------|---------|----------| -| FIND | Exact/pattern search | Looking for specific values | -| LLM | Semantic search | Understanding meaning | -| GET | Direct retrieval | Know exact key | -| USE KB | Activate knowledge | Need document context | +1. **Limit Results**: The system automatically limits to 10 results for safety +2. **Use Specific Filters**: More specific filters reduce processing time +3. **Avoid Full Table Scans**: Always provide a filter criterion +4. 
**Process in Batches**: For large datasets, process in chunks -## Advanced Usage - -### Dynamic Location ```basic -department = GET user.department -data = FIND("*", department + "_records") +' Process records in batches +batch = FIND "large_table", "processed=false" + +count = 0 +FOR EACH record IN batch + ' Process record + SET "large_table", "id=" + record.id, "processed=true" + + count = count + 1 + IF count >= 10 THEN + EXIT FOR ' Process max 10 at a time + END IF +NEXT ``` -### Compound Searches +## Integration with Other Keywords + +### With SET for Updates ```basic -' Find in multiple places -local = FIND(query, "local_db") -remote = FIND(query, "remote_api") -results = MERGE(local, remote) +users = FIND "users", "newsletter=true" + +FOR EACH user IN users + ' Update last_notified field + SET "users", "id=" + user.id, "last_notified=" + NOW() +NEXT ``` -### Conditional Fields +### With LLM for Processing ```basic -search_fields = ["name"] -if advanced_mode - search_fields.append(["email", "phone", "address"]) -end +articles = FIND "articles", "needs_summary=true" -results = FIND(term, "contacts", {fields: search_fields}) +FOR EACH article IN articles + summary = LLM "Summarize: " + article.content + SET "articles", "id=" + article.id, "summary=" + summary +NEXT ``` +### With CREATE SITE +```basic +companies = FIND "companies", "needs_site=true" + +FOR EACH company IN companies + alias = LLM "Create URL alias for: " + company.name + CREATE SITE alias, "template", "Create site for " + company.name + SET "companies", "id=" + company.id, "site_url=" + alias +NEXT +``` + +## Limitations + +- Maximum 10 records returned per query (system limit) +- Filter supports simple equality only +- Complex queries require post-processing in BASIC +- Table must exist in the database +- User must have read permissions on the table + ## Best Practices -✅ **Be specific** - Use precise patterns to avoid large result sets -✅ **Handle empty results** - Always check if FIND returned 
data -✅ **Use appropriate location** - Search where data actually lives -✅ **Limit when possible** - Don't retrieve more than needed +✅ **Always check results** - Verify FIND returned data before processing +✅ **Use specific filters** - Reduce result set size with precise criteria +✅ **Handle empty results** - Check LENGTH before iterating +✅ **Update as you go** - Mark records as processed to avoid reprocessing -❌ **Don't search everything** - Avoid FIND("*") without limits -❌ **Don't assume order** - Results may not be sorted unless specified -❌ **Don't ignore errors** - Wrap in try/catch for production +❌ **Don't assume order** - Results may not be sorted +❌ **Don't ignore limits** - Remember the 10-record limit +❌ **Don't use without filter** - Always provide filter criteria ## See Also -- [GET](./keyword-get.md) - Direct key retrieval -- [SET](./keyword-set.md) - Store data -- [USE KB](./keyword-use-kb.md) - Semantic document search -- [LLM](./keyword-llm.md) - AI-powered search \ No newline at end of file +- [SET](./keyword-set.md) - Update database records +- [GET](./keyword-get.md) - Retrieve single values +- [FOR EACH](./keyword-for-each.md) - Iterate over results +- [LLM](./keyword-llm.md) - Process found data with AI \ No newline at end of file diff --git a/docs/src/chapter-06-gbdialog/keyword-send-mail.md b/docs/src/chapter-06-gbdialog/keyword-send-mail.md index 63eb9c0cc..091b2582b 100644 --- a/docs/src/chapter-06-gbdialog/keyword-send-mail.md +++ b/docs/src/chapter-06-gbdialog/keyword-send-mail.md @@ -242,15 +242,11 @@ END IF ### Authentication Failed -```basic -' Test SMTP connection -TEST_SMTP_CONNECTION() -IF CONNECTION_OK THEN - TALK "SMTP connection successful" -ELSE - TALK "Check email-user and email-pass in config.csv" -END IF -``` +Check SMTP configuration: +1. Verify credentials in `config.csv` +2. Ensure SMTP server allows your connection +3. Check if port 587/465 is open +4. 
Verify TLS/SSL settings match server requirements ### Emails Going to Spam diff --git a/docs/src/chapter-07-gbapp/README.md b/docs/src/chapter-07-gbapp/README.md index 6f3249338..7085c4f86 100644 --- a/docs/src/chapter-07-gbapp/README.md +++ b/docs/src/chapter-07-gbapp/README.md @@ -1,405 +1,379 @@ -# Extending BotServer +# gbapp: Virtual Crates Architecture -This chapter covers how to extend and customize BotServer to meet specific requirements, from creating custom keywords to building new channel adapters and integrating with external systems. +This chapter explains how BotServer uses the gbapp concept as virtual crates within the `src/` directory, elegantly mapping the old package system to the new Rust architecture. -## Overview +## The gbapp Evolution: From Packages to Virtual Crates -BotServer is designed to be extensible at multiple levels: -- **BASIC Keywords**: Add new commands to the scripting language -- **Channel Adapters**: Support new messaging platforms -- **Storage Backends**: Integrate different storage systems -- **Authentication Providers**: Connect to various identity services -- **LLM Providers**: Add support for new language models +### Historical Context (Node.js Era) +In previous versions, `.gbapp` packages were external Node.js modules that extended BotServer functionality through a plugin system. -## Extension Points +### Current Architecture (Rust Era) +The `.gbapp` concept now lives as **virtual crates** inside `src/`: +- **Virtual Crates**: Each gbapp is a module inside `src/` (like `src/core`, `src/basic`, `src/channels`) +- **Same Mental Model**: Developers familiar with the old system can think of each directory as a "package" +- **Native Performance**: All code compiles into a single optimized binary +- **Contribution Path**: Add new gbapps by creating modules in `src/` -### 1. 
Custom BASIC Keywords +## How gbapp Virtual Crates Work -Create new keywords by implementing them in Rust: +``` +src/ +├── core/ # core.gbapp (virtual crate) +├── basic/ # basic.gbapp (BASIC interpreter) +├── channels/ # channels.gbapp (communication) +├── storage/ # storage.gbapp (persistence) +├── auth/ # auth.gbapp (authentication) +├── llm/ # llm.gbapp (AI integration) +└── your_feature/ # your_feature.gbapp (your contribution!) +``` + +Each directory is conceptually a gbapp - a self-contained module that contributes functionality to the whole. + +## Why This Change? + +1. **Simplicity**: One cohesive codebase instead of fragmented extensions +2. **Performance**: Native Rust performance without extension overhead +3. **Reliability**: Thoroughly tested core features vs. variable-quality plugins +4. **BASIC Power**: BASIC + LLM combination eliminates need for custom code +5. **Maintenance**: Easier to maintain one strong core than many extensions + +## Contributing New Keywords + +### Contributing a New gbapp Virtual Crate + +To add functionality, create a new gbapp as a module in `src/`: ```rust -// src/basic/keywords/my_keyword.rs -use rhai::{Engine, Dynamic}; +// src/your_feature/mod.rs - Your gbapp virtual crate +pub mod keywords; +pub mod services; +pub mod models; + +// src/your_feature/keywords/mod.rs use crate::shared::state::AppState; +use rhai::Engine; -pub fn register_my_keyword(engine: &mut Engine, state: Arc<AppState>) { - engine.register_fn("MY_KEYWORD", move |param: String| -> String { - // Your implementation here - format!("Processed: {}", param) +pub fn register_keywords(engine: &mut Engine, state: Arc<AppState>) { + engine.register_fn("YOUR KEYWORD", move |param: String| -> String { + // Implementation + format!("Result: {}", param) }); } ``` -Register in the keyword module: +This maintains the conceptual model of packages while leveraging Rust's module system. + +### Contribution Process for gbapp Virtual Crates + +1. **Fork** the BotServer repository +2. 
**Create** your gbapp module in `src/your_feature/` +3. **Structure** it like existing gbapps (core, basic, etc.) +4. **Test** thoroughly with unit and integration tests +5. **Document** in the appropriate chapter +6. **Submit PR** describing your gbapp's purpose + +Example structure for a new gbapp: +``` +src/analytics/ # analytics.gbapp +├── mod.rs # Module definition +├── keywords.rs # BASIC keywords +├── services.rs # Core services +├── models.rs # Data models +└── tests.rs # Unit tests +``` + +## Adding New Components + +Components are features compiled into BotServer via Cargo features: + +### Current Components in Cargo.toml + +```toml +[features] +# Core features +chat = [] # Chat functionality +drive = [] # Storage system +tasks = [] # Task management +calendar = [] # Calendar integration +meet = [] # Video meetings +mail = [] # Email system + +# Enterprise features +compliance = [] # Compliance tools +attendance = [] # Attendance tracking +directory = [] # User directory +``` + +### Adding a New Component + +1. **Define Feature** in `Cargo.toml`: +```toml +[features] +your_feature = ["dep:required_crate"] +``` + +2. **Implement** in appropriate module: ```rust -// src/basic/keywords/mod.rs -pub fn register_all_keywords(engine: &mut Engine, state: Arc<AppState>) { - // ... existing keywords - register_my_keyword(engine, state.clone()); +#[cfg(feature = "your_feature")] +pub mod your_feature { + // Implementation } ``` -Use in BASIC scripts: -```basic -result = MY_KEYWORD "input data" -TALK result +3. **Register** in `installer.rs`: +```rust +fn register_your_feature(&mut self) { + self.components.insert( + "your_feature", + Component { + name: "Your Feature", + description: "Feature description", + port: None, + setup_required: false, + }, + ); +} ``` -### 2. 
Channel Adapters +## Understanding the gbapp → Virtual Crate Mapping -Implement a new messaging channel: +The transition from Node.js packages to Rust modules maintains conceptual familiarity: + +| Old (Node.js) | New (Rust) | Location | Purpose | +|---------------|------------|----------|---------| +| `core.gbapp` | `core` module | `src/core/` | Core engine functionality | +| `basic.gbapp` | `basic` module | `src/basic/` | BASIC interpreter | +| `whatsapp.gbapp` | `channels::whatsapp` | `src/channels/whatsapp/` | WhatsApp integration | +| `kb.gbapp` | `storage::kb` | `src/storage/kb/` | Knowledge base | +| `custom.gbapp` | `custom` module | `src/custom/` | Your contribution | + +### Creating Private gbapp Virtual Crates + +For proprietary features, you can still create private gbapps: ```rust -// src/channels/my_channel.rs -use async_trait::async_trait; -use crate::channels::traits::ChannelAdapter; - -pub struct MyChannelAdapter { - config: MyChannelConfig, -} - -#[async_trait] -impl ChannelAdapter for MyChannelAdapter { - async fn send_message(&self, recipient: &str, message: &str) -> Result<()> { - // Send message implementation - } - - async fn receive_message(&self) -> Result { - // Receive message implementation - } - - async fn send_attachment(&self, recipient: &str, file: &[u8]) -> Result<()> { - // Send file implementation - } +// Fork BotServer, then add your private gbapp +// src/proprietary/mod.rs +#[cfg(feature = "proprietary")] +pub mod my_private_feature { + // Your private implementation } ``` -### 3. 
Storage Providers - -Add support for new storage backends: - -```rust -// src/storage/my_storage.rs -use async_trait::async_trait; -use crate::storage::traits::StorageProvider; - -pub struct MyStorageProvider { - client: MyStorageClient, -} - -#[async_trait] -impl StorageProvider for MyStorageProvider { - async fn get(&self, key: &str) -> Result> { - // Retrieve object - } - - async fn put(&self, key: &str, data: &[u8]) -> Result<()> { - // Store object - } - - async fn delete(&self, key: &str) -> Result<()> { - // Delete object - } - - async fn list(&self, prefix: &str) -> Result> { - // List objects - } -} +Then in `Cargo.toml`: +```toml +[features] +proprietary = [] ``` -## Architecture for Extensions +This keeps your code separate while benefiting from core updates. -### Plugin System +### Benefits of the Virtual Crate Approach -BotServer uses a modular architecture that supports plugins: +1. **Familiar Mental Model**: Developers understand "packages" +2. **Clean Separation**: Each gbapp is self-contained +3. **Easy Discovery**: All gbapps visible in `src/` +4. **Native Performance**: Everything compiles together +5. **Type Safety**: Rust ensures interfaces are correct + +## Real Examples of gbapp Virtual Crates in src/ ``` -botserver/ -├── src/ -│ ├── core/ # Core functionality -│ ├── basic/ # BASIC interpreter -│ │ └── keywords/ # Keyword implementations -│ ├── channels/ # Channel adapters -│ ├── storage/ # Storage providers -│ ├── auth/ # Authentication modules -│ └── llm/ # LLM integrations +src/ +├── core/ # Core gbapp - Bootstrap, package manager +│ ├── mod.rs +│ ├── bootstrap.rs +│ └── package_manager/ +│ +├── basic/ # BASIC gbapp - Interpreter and keywords +│ ├── mod.rs +│ ├── interpreter.rs +│ └── keywords/ +│ ├── mod.rs +│ ├── talk.rs +│ ├── hear.rs +│ └── llm.rs +│ +├── channels/ # Channels gbapp - Communication adapters +│ ├── mod.rs +│ ├── whatsapp.rs +│ ├── teams.rs +│ └── email.rs +│ +└── analytics/ # Your new gbapp! 
+ ├── mod.rs + ├── keywords.rs # ADD ANALYTICS, GET METRICS + └── services.rs # Analytics engine ``` -### Dependency Injection +## Development Environment -Extensions use dependency injection for configuration: +### System Requirements -```rust -// Configuration -#[derive(Deserialize)] -pub struct ExtensionConfig { - pub enabled: bool, - pub options: HashMap, -} +- **Disk Space**: 8GB minimum for development +- **RAM**: 8GB recommended +- **Database**: Any SQL database (abstracted) +- **Storage**: Any S3-compatible storage (abstracted) -// Registration -pub fn register_extension(app_state: &mut AppState, config: ExtensionConfig) { - if config.enabled { - let extension = MyExtension::new(config.options); - app_state.extensions.push(Box::new(extension)); - } -} +### No Brand Lock-in + +BotServer uses generic terms: +- ❌ PostgreSQL → ✅ "database" +- ❌ MinIO → ✅ "drive storage" +- ❌ Qdrant → ✅ "vector database" +- ❌ Redis → ✅ "cache" + +This ensures vendor neutrality and flexibility. + +## Security Best Practices + +### Regular Audits + +Run security audits regularly: +```bash +cargo audit ``` -## Common Extension Patterns +This checks for known vulnerabilities in dependencies. -### 1. 
API Integration +### Secure Coding -Create a keyword for external API calls: +When contributing: +- Validate all inputs +- Use safe Rust patterns +- Avoid `unsafe` blocks +- Handle errors properly +- Add security tests -```rust -pub fn register_api_keyword(engine: &mut Engine) { - engine.register_fn("CALL_API", |url: String, method: String| -> Dynamic { - let runtime = tokio::runtime::Runtime::new().unwrap(); - runtime.block_on(async { - let client = reqwest::Client::new(); - let response = match method.as_str() { - "GET" => client.get(&url).send().await, - "POST" => client.post(&url).send().await, - _ => return Dynamic::from("Invalid method"), - }; - - match response { - Ok(resp) => Dynamic::from(resp.text().await.unwrap_or_default()), - Err(_) => Dynamic::from("Error calling API"), - } - }) - }); -} -``` - -### 2. Database Operations - -Add custom database queries: - -```rust -pub fn register_db_keyword(engine: &mut Engine, state: Arc<AppState>) { - let state_clone = state.clone(); - engine.register_fn("QUERY_DB", move |sql: String| -> Vec<Dynamic> { - let mut conn = state_clone.conn.get().unwrap(); - - // Execute query (with proper sanitization) - let results = diesel::sql_query(sql) - .load::(&mut conn) - .unwrap_or_default(); - - // Convert to Dynamic - results.into_iter() - .map(|r| Dynamic::from(r)) - .collect() - }); -} -``` - -### 3. 
Event Handlers - -Implement custom event processing: - -```rust -pub trait EventHandler: Send + Sync { - fn handle_event(&self, event: Event) -> Result<()>; -} - -pub struct CustomEventHandler; - -impl EventHandler for CustomEventHandler { - fn handle_event(&self, event: Event) -> Result<()> { - match event { - Event::MessageReceived(msg) => { - // Process incoming message - }, - Event::SessionStarted(session) => { - // Initialize session - }, - Event::Error(err) => { - // Handle errors - }, - _ => Ok(()), - } - } -} -``` - -## Testing Extensions +## Testing Your Contributions ### Unit Tests - ```rust #[cfg(test)] mod tests { - use super::*; - #[test] - fn test_my_keyword() { - let mut engine = Engine::new(); - register_my_keyword(&mut engine); - - let result: String = engine.eval(r#"MY_KEYWORD "test""#).unwrap(); - assert_eq!(result, "Processed: test"); + fn test_keyword() { + // Test your keyword } } ``` ### Integration Tests - ```rust #[tokio::test] -async fn test_channel_adapter() { - let adapter = MyChannelAdapter::new(test_config()); - - // Test sending - let result = adapter.send_message("user123", "Test message").await; - assert!(result.is_ok()); - - // Test receiving - let message = adapter.receive_message().await.unwrap(); - assert_eq!(message.content, "Expected response"); +async fn test_feature() { + // Test feature integration } ``` -## Deployment Considerations - -### Configuration - -Extensions are configured in `config.csv`: - -```csv -name,value -extension_my_feature,enabled -extension_my_feature_option1,value1 -extension_my_feature_option2,value2 +### BASIC Script Tests +```basic +' test_script.bas +result = YOUR KEYWORD "test" +IF result != "expected" THEN + TALK "Test failed" +ELSE + TALK "Test passed" +END IF ``` -### Performance Impact +## Documentation Requirements -Consider performance when adding extensions: -- Use async operations for I/O -- Implement caching where appropriate -- Profile resource usage -- Add metrics and monitoring +All 
contributions must include: -### Security +1. **Keyword Documentation** in Chapter 6 +2. **Architecture Updates** if structural changes +3. **API Documentation** for new endpoints +4. **BASIC Examples** showing usage +5. **Migration Guide** if breaking changes -Ensure extensions are secure: -- Validate all input -- Use prepared statements for database queries -- Implement rate limiting -- Add authentication where needed -- Follow least privilege principle +## Performance Considerations -## Best Practices +### Benchmarking -### 1. Error Handling - -Always handle errors gracefully: - -```rust -pub fn my_extension_function() -> Result { - // Use ? operator for error propagation - let data = fetch_data()?; - let processed = process_data(data)?; - Ok(format!("Success: {}", processed)) -} +Before submitting: +```bash +cargo bench ``` -### 2. Logging +### Profiling -Add comprehensive logging: - -```rust -use log::{info, warn, error, debug}; - -pub fn process_request(req: Request) { - debug!("Processing request: {:?}", req); - - match handle_request(req) { - Ok(result) => info!("Request successful: {}", result), - Err(e) => error!("Request failed: {}", e), - } -} +Identify bottlenecks: +```bash +cargo flamegraph ``` -### 3. 
Documentation +## Community Guidelines -Document your extensions: +### What We Accept -```rust -/// Custom keyword for data processing -/// -/// # Arguments -/// * `input` - The data to process -/// -/// # Returns -/// Processed data as a string -/// -/// # Example -/// ```basic -/// result = PROCESS_DATA "raw input" -/// ``` -pub fn process_data_keyword(input: String) -> String { - // Implementation -} +✅ New BASIC keywords that benefit many users +✅ Performance improvements +✅ Bug fixes with tests +✅ Documentation improvements +✅ Security enhancements + +### What We Don't Accept + +❌ Vendor-specific integrations (use generic interfaces) +❌ Extensions that bypass BASIC +❌ Features achievable with existing keywords +❌ Undocumented code +❌ Code without tests + +## The Power of BASIC + LLM + +Remember: In 2025, 100% BASIC/LLM applications are reality. Before adding a keyword, consider: + +1. Can this be done with existing keywords + LLM? +2. Will this keyword benefit multiple use cases? +3. Does it follow the BASIC philosophy of simplicity? 
+ +### Example: No Custom Code Needed + +Instead of custom integration code: +```basic +' Everything in BASIC +data = GET "api.example.com/data" +processed = LLM "Process this data: " + data +result = FIND "table", "criteria=" + processed +SEND MAIL user, "Results", result ``` -## Examples of Extensions +## Future Direction -### Weather Integration +BotServer's future is: +- **Stronger Core**: More powerful built-in keywords +- **Better LLM Integration**: Smarter AI capabilities +- **Simpler BASIC**: Even easier scripting +- **Community-Driven**: Features requested by users -```rust -pub fn register_weather_keyword(engine: &mut Engine) { - engine.register_fn("GET_WEATHER", |city: String| -> String { - // Call weather API - let api_key = std::env::var("WEATHER_API_KEY").unwrap_or_default(); - let url = format!("https://api.weather.com/v1/weather?city={}&key={}", city, api_key); - - // Fetch and parse response - // Return weather information - }); -} -``` +## How to Get Started -### Custom Analytics - -```rust -pub struct AnalyticsExtension { - client: AnalyticsClient, -} - -impl AnalyticsExtension { - pub fn track_event(&self, event: &str, properties: HashMap) { - self.client.track(Event { - name: event.to_string(), - properties, - timestamp: Utc::now(), - }); - } -} -``` +1. **Fork** the repository +2. **Read** existing code in `src/basic/keywords/` +3. **Discuss** your idea in GitHub Issues +4. **Implement** following the patterns +5. **Test** thoroughly +6. **Document** completely +7. **Submit** PR with clear explanation ## Summary -Extending BotServer allows you to: -- Add domain-specific functionality -- Integrate with existing systems -- Support new communication channels -- Implement custom business logic -- Enhance the platform's capabilities +The `.gbapp` concept has elegantly evolved from external Node.js packages to **virtual crates** within `src/`. 
This approach: +- **Preserves the mental model** developers are familiar with +- **Maps perfectly** to Rust's module system +- **Encourages contribution** by making the structure clear +- **Maintains separation** while compiling to a single binary -The modular architecture and clear extension points make it straightforward to add new features while maintaining system stability and performance. +Each directory in `src/` is effectively a gbapp - contribute by adding your own! With BASIC + LLM handling the complexity, your gbapp just needs to provide the right keywords and services. ## See Also -- [Prompt Manager](./prompt-manager.md) - Managing AI prompts and responses -- [Hooks System](./hooks.md) - Event-driven extensions -- [Adapter Development](./adapters.md) - Creating custom adapters -- [Chapter 2: Packages](../chapter-02/README.md) - Understanding bot components -- [Chapter 3: KB and Tools](../chapter-03/kb-and-tools.md) - Knowledge base and tool system -- [Chapter 5: BASIC Reference](../chapter-05/README.md) - Complete command reference -- [Chapter 8: External APIs](../chapter-08/external-apis.md) - API integration patterns -- [Chapter 9: Advanced Topics](../chapter-09/README.md) - Advanced features -- [Chapter 10: Development](../chapter-10/README.md) - Development tools and practices \ No newline at end of file +- [Philosophy](./philosophy.md) - The gbapp philosophy: Let machines do machine work +- [Architecture](./architecture.md) - System architecture +- [Building](./building.md) - Build process +- [Custom Keywords](./custom-keywords.md) - Keyword implementation +- [Services](./services.md) - Core services +- [Chapter 6: BASIC Reference](../chapter-06-gbdialog/README.md) - BASIC language +- [Chapter 9: API](../chapter-09-api/README.md) - API documentation diff --git a/docs/src/chapter-07-gbapp/architecture.md b/docs/src/chapter-07-gbapp/architecture.md index 79fefd607..e9c53bd52 100644 --- a/docs/src/chapter-07-gbapp/architecture.md +++ 
b/docs/src/chapter-07-gbapp/architecture.md @@ -4,50 +4,15 @@ BotServer follows a modular architecture designed for scalability, maintainabili ## Core Architecture -### System Architecture Diagram +### Data Flow Architecture -``` -┌─────────────────────────────────────────────────────────────────────────┐ -│ BotServer (Binary) │ -├─────────────────────────────────────────────────────────────────────────┤ -│ │ -│ ┌────────────────────────────────────────────────────────────────────┐ │ -│ │ Core Engine │ │ -│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │ -│ │ │ BASIC │ │ Session │ │ Context │ │ │ -│ │ │ Interpreter │ │ Manager │ │ Manager │ │ │ -│ │ │ (Rhai) │ │ (Tokio) │ │ (Memory) │ │ │ -│ │ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │ │ -│ └─────────┼──────────────────┼──────────────────┼───────────────────┘ │ -│ │ │ │ │ -│ ┌─────────▼──────────────────▼──────────────────▼───────────────────┐ │ -│ │ AI & NLP Layer │ │ -│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │ -│ │ │ LLM │ │ Embeddings │ │ Knowledge │ │ │ -│ │ │ Integration │ │ (BGE) │ │ Base │ │ │ -│ │ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │ │ -│ └─────────┼──────────────────┼──────────────────┼───────────────────┘ │ -│ │ │ │ │ -│ ┌─────────▼──────────────────▼──────────────────▼───────────────────┐ │ -│ │ Communication Layer │ │ -│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │ -│ │ │ WebSocket │ │ REST API │ │ Channels │ │ │ -│ │ │ Server │ │ (Axum) │ │ Adapters │ │ │ -│ │ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │ │ -│ └─────────┼──────────────────┼──────────────────┼───────────────────┘ │ -│ │ │ │ │ -│ ┌─────────▼──────────────────▼──────────────────▼───────────────────┐ │ -│ │ Storage Layer │ │ -│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │ -│ │ │PostgreSQL│ │ Valkey │ │ SeaweedFS│ │ Qdrant │ │ │ -│ │ │ (Diesel) │ │ Cache │ │ S3 │ │ Vectors │ │ │ -│ │ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │ │ -│ 
└─────────────────────────────────────────────────────────────────────┘ │ -│ │ -└─────────────────────────────────────────────────────────────────────────┘ -``` +![BotServer Data Flow Architecture](./assets/data-flow.svg) -### Module Dependency Graph +### System Architecture + +![BotServer System Architecture](./assets/system-architecture.svg) + +## Module Dependency Graph ``` main.rs diff --git a/docs/src/chapter-07-gbapp/assets/data-flow.svg b/docs/src/chapter-07-gbapp/assets/data-flow.svg new file mode 100644 index 000000000..ca221fa1c --- /dev/null +++ b/docs/src/chapter-07-gbapp/assets/data-flow.svg @@ -0,0 +1,139 @@ + + + + + + + + + + + + + + + + + + + + + BotServer Data Flow Architecture + + + + + User Input Layer + + + + Web UI + + + WhatsApp + + + Teams + + + Email + + + API + + + + + + + + + Core Processing Engine + + + + Session Manager + User Context + + + + BASIC Interpreter + Script Execution + + + + LLM Integration + AI Processing + + + + Knowledge Base + Vector Search + + + + + + + + + Tool System + External APIs & Functions + + + + Cache Layer + Response Optimization + + + + + + + + + Storage & Persistence Layer + + + + Database + User Data + + + + Vector DB + Embeddings + + + + Drive Storage + Files & Assets + + + + Cache + Fast Access + + + + + + + + + + + + + + Data Flow: + + Request/Response + + + Data Access + + + + All components run in async Rust for maximum performance + diff --git a/docs/src/chapter-07-gbapp/assets/system-architecture.svg b/docs/src/chapter-07-gbapp/assets/system-architecture.svg new file mode 100644 index 000000000..ec8cb2c91 --- /dev/null +++ b/docs/src/chapter-07-gbapp/assets/system-architecture.svg @@ -0,0 +1,155 @@ + + + + + + + + + BotServer Architecture - Virtual Crates System + + + + + + + BotServer Binary + + + + compiles to + + + + + Core Engine (src/core/) + + + + Bootstrap + System Init + Service Start + + + Package Manager + Component Registry + Module Loader + + + Session Manager + Context Handling + State 
Management + + + Shared State + AppState + Configuration + + + Utils + Helpers + Common + + + + + + + + + + Virtual Crates (gbapp modules in src/) + + + + basic.gbapp + src/basic/ + + • BASIC Interpreter + • Keywords Registry + • Script Execution + • Rhai Engine + + + + channels.gbapp + src/channels/ + + • WhatsApp + • Teams + • Email + • Web UI + + + + storage.gbapp + src/storage/ + + • Knowledge Base + • Drive Integration + • Vector DB + • Cache + + + + your_feature.gbapp + src/your_feature/ + + • Your Keywords + • Your Services + • Your Models + + Add yours! + + + + + + + + + AI & LLM Integration + + + LLM Service + + + Embeddings + + + Semantic Search + + + + + + Persistence Layer + + + Database + + + Vector DB + + + Drive + + + Cache + + + + + Key Concepts: + + + Virtual Crates = Modules in src/ + + + Your Contribution Space + + All compile to single optimized binary + + + + gbapp virtual crates: The bridge between old Node.js packages and new Rust modules + diff --git a/docs/src/chapter-07-gbapp/building.md b/docs/src/chapter-07-gbapp/building.md index e5af26722..13c99d9d4 100644 --- a/docs/src/chapter-07-gbapp/building.md +++ b/docs/src/chapter-07-gbapp/building.md @@ -9,7 +9,7 @@ This guide covers building BotServer from source, including dependencies, featur - **Operating System**: Linux, macOS, or Windows - **Rust**: 1.70 or later (2021 edition) - **Memory**: 4GB RAM minimum (8GB recommended) -- **Disk Space**: 2GB for dependencies and build artifacts +- **Disk Space**: 8GB for development environment ### Install Rust @@ -412,6 +412,17 @@ Find duplicate dependencies: cargo tree --duplicates ``` +### Security Audit + +Run security audit to check for known vulnerabilities in dependencies: + +```bash +cargo install cargo-audit +cargo audit +``` + +This should be run regularly during development to ensure dependencies are secure. 
+ ## Build Artifacts After a successful release build, you'll have: diff --git a/docs/src/chapter-07-gbapp/example-gbapp.md b/docs/src/chapter-07-gbapp/example-gbapp.md new file mode 100644 index 000000000..bf4659a94 --- /dev/null +++ b/docs/src/chapter-07-gbapp/example-gbapp.md @@ -0,0 +1,354 @@ +# Example: Creating a New gbapp Virtual Crate + +This guide walks through creating a new gbapp virtual crate called `analytics` that adds analytics capabilities to BotServer. + +## Step 1: Create the Module Structure + +Create your gbapp directory in `src/`: + +``` +src/analytics/ # analytics.gbapp virtual crate +├── mod.rs # Module definition +├── keywords.rs # BASIC keywords +├── services.rs # Core functionality +├── models.rs # Data structures +└── tests.rs # Unit tests +``` + +## Step 2: Define the Module + +**src/analytics/mod.rs** +```rust +//! Analytics gbapp - Provides analytics and reporting functionality +//! +//! This virtual crate adds analytics keywords to BASIC and provides +//! services for tracking and reporting bot interactions. 
+ +pub mod keywords; +pub mod services; +pub mod models; + +#[cfg(test)] +mod tests; + +use crate::shared::state::AppState; +use std::sync::Arc; + +/// Initialize the analytics gbapp +pub fn init(state: Arc) -> Result<(), Box> { + log::info!("Initializing analytics.gbapp virtual crate"); + + // Initialize analytics services + services::init_analytics_service(&state)?; + + Ok(()) +} +``` + +## Step 3: Add BASIC Keywords + +**src/analytics/keywords.rs** +```rust +use crate::shared::state::AppState; +use rhai::{Engine, Dynamic}; +use std::sync::Arc; + +/// Register analytics keywords with the BASIC interpreter +pub fn register_keywords(engine: &mut Engine, state: Arc) { + let state_clone = state.clone(); + + // TRACK EVENT keyword + engine.register_fn("TRACK EVENT", move |event_name: String, properties: String| -> String { + let result = tokio::task::block_in_place(|| { + tokio::runtime::Handle::current().block_on(async { + crate::analytics::services::track_event(&state_clone, &event_name, &properties).await + }) + }); + + match result { + Ok(_) => format!("Event '{}' tracked", event_name), + Err(e) => format!("Failed to track event: {}", e), + } + }); + + // GET ANALYTICS keyword + engine.register_fn("GET ANALYTICS", move |metric: String, timeframe: String| -> Dynamic { + let result = tokio::task::block_in_place(|| { + tokio::runtime::Handle::current().block_on(async { + crate::analytics::services::get_analytics(&metric, &timeframe).await + }) + }); + + match result { + Ok(data) => Dynamic::from(data), + Err(_) => Dynamic::UNIT, + } + }); + + // GENERATE REPORT keyword + engine.register_fn("GENERATE REPORT", move |report_type: String| -> String { + // Use LLM to generate natural language report + let data = crate::analytics::services::get_report_data(&report_type); + + let prompt = format!( + "Generate a {} report from this data: {}", + report_type, data + ); + + // This would call the LLM service + format!("Report generated for: {}", report_type) + }); +} +``` + +## 
Step 4: Implement Services + +**src/analytics/services.rs** +```rust +use crate::shared::state::AppState; +use crate::shared::models::AnalyticsEvent; +use std::sync::Arc; +use anyhow::Result; + +/// Initialize analytics service +pub fn init_analytics_service(state: &Arc) -> Result<()> { + // Set up database tables, connections, etc. + log::debug!("Analytics service initialized"); + Ok(()) +} + +/// Track an analytics event +pub async fn track_event( + state: &Arc, + event_name: &str, + properties: &str, +) -> Result<()> { + // Store event in database + let conn = state.conn.get()?; + + // Implementation details... + log::debug!("Tracked event: {}", event_name); + + Ok(()) +} + +/// Get analytics data +pub async fn get_analytics(metric: &str, timeframe: &str) -> Result { + // Query analytics data + let results = match metric { + "user_count" => get_user_count(timeframe).await?, + "message_volume" => get_message_volume(timeframe).await?, + "engagement_rate" => get_engagement_rate(timeframe).await?, + _ => return Err(anyhow::anyhow!("Unknown metric: {}", metric)), + }; + + Ok(results) +} + +/// Get data for report generation +pub fn get_report_data(report_type: &str) -> String { + // Gather data based on report type + match report_type { + "daily" => get_daily_report_data(), + "weekly" => get_weekly_report_data(), + "monthly" => get_monthly_report_data(), + _ => "{}".to_string(), + } +} + +// Helper functions +async fn get_user_count(timeframe: &str) -> Result { + // Implementation + Ok("100".to_string()) +} + +async fn get_message_volume(timeframe: &str) -> Result { + // Implementation + Ok("5000".to_string()) +} + +async fn get_engagement_rate(timeframe: &str) -> Result { + // Implementation + Ok("75%".to_string()) +} + +fn get_daily_report_data() -> String { + // Gather daily metrics + r#"{"users": 100, "messages": 1500, "sessions": 50}"#.to_string() +} + +fn get_weekly_report_data() -> String { + // Gather weekly metrics + r#"{"users": 500, "messages": 8000, 
"sessions": 300}"#.to_string() +} + +fn get_monthly_report_data() -> String { + // Gather monthly metrics + r#"{"users": 2000, "messages": 35000, "sessions": 1200}"#.to_string() +} +``` + +## Step 5: Define Data Models + +**src/analytics/models.rs** +```rust +use serde::{Deserialize, Serialize}; +use chrono::{DateTime, Utc}; + +#[derive(Debug, Serialize, Deserialize)] +pub struct AnalyticsEvent { + pub id: uuid::Uuid, + pub event_name: String, + pub properties: serde_json::Value, + pub user_id: Option, + pub session_id: String, + pub timestamp: DateTime, +} + +#[derive(Debug, Serialize, Deserialize)] +pub struct MetricSnapshot { + pub metric_name: String, + pub value: f64, + pub timestamp: DateTime, + pub dimensions: serde_json::Value, +} + +#[derive(Debug, Serialize, Deserialize)] +pub struct Report { + pub report_type: String, + pub generated_at: DateTime, + pub data: serde_json::Value, + pub summary: String, +} +``` + +## Step 6: Register with Core + +Update `src/basic/keywords/mod.rs` to include your gbapp: + +```rust +use crate::analytics; + +pub fn register_all_keywords(engine: &mut Engine, state: Arc) { + // ... 
existing keywords + + // Register analytics.gbapp keywords + analytics::keywords::register_keywords(engine, state.clone()); +} +``` + +Update `src/main.rs` or initialization code: + +```rust +// Initialize analytics gbapp +analytics::init(state.clone())?; +``` + +## Step 7: Add Tests + +**src/analytics/tests.rs** +```rust +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_track_event() { + // Test event tracking + let event_name = "user_login"; + let properties = r#"{"user_id": "123"}"#; + + // Test implementation + assert!(true); + } + + #[tokio::test] + async fn test_get_analytics() { + // Test analytics retrieval + let metric = "user_count"; + let timeframe = "daily"; + + // Test implementation + assert!(true); + } +} +``` + +## Step 8: Use in BASIC Scripts + +Now your gbapp keywords are available in BASIC: + +```basic +' Track user actions +TRACK EVENT "button_clicked", "button=submit" + +' Get metrics +daily_users = GET ANALYTICS "user_count", "daily" +TALK "Daily active users: " + daily_users + +' Generate AI-powered report +report = GENERATE REPORT "weekly" +TALK report + +' Combine with LLM for insights +metrics = GET ANALYTICS "all", "monthly" +insights = LLM "Analyze these metrics and provide insights: " + metrics +TALK insights +``` + +## Step 9: Add Feature Flag (Optional) + +If your gbapp should be optional, add it to `Cargo.toml`: + +```toml +[features] +analytics = [] + +# Include in default features if always needed +default = ["ui-server", "chat", "analytics"] +``` + +Then conditionally compile: + +```rust +#[cfg(feature = "analytics")] +pub mod analytics; + +#[cfg(feature = "analytics")] +analytics::keywords::register_keywords(engine, state.clone()); +``` + +## Benefits of This Approach + +1. **Clean Separation**: Your gbapp is self-contained +2. **Easy Discovery**: Visible in `src/analytics/` +3. **Type Safety**: Rust compiler checks everything +4. **Native Performance**: Compiles into the main binary +5. 
**Familiar Structure**: Like the old `.gbapp` packages + +## Best Practices + +✅ **DO:** +- Keep your gbapp focused on one domain +- Provide clear BASIC keywords +- Use LLM for complex logic +- Write comprehensive tests +- Document your keywords + +❌ **DON'T:** +- Create overly complex implementations +- Duplicate existing functionality +- Skip error handling +- Forget about async/await +- Ignore the BASIC-first philosophy + +## Summary + +Creating a gbapp virtual crate is straightforward: +1. Create a module in `src/` +2. Define keywords for BASIC +3. Implement services +4. Register with core +5. Use in BASIC scripts + +Your gbapp becomes part of BotServer's compiled binary, providing native performance while maintaining the conceptual clarity of the package system. Most importantly, remember that the implementation should be minimal - let BASIC + LLM handle the complexity! \ No newline at end of file diff --git a/docs/src/chapter-07-gbapp/philosophy.md b/docs/src/chapter-07-gbapp/philosophy.md new file mode 100644 index 000000000..a873e189f --- /dev/null +++ b/docs/src/chapter-07-gbapp/philosophy.md @@ -0,0 +1,279 @@ +# The gbapp Philosophy: Let Machines Do Machine Work + +## Core Principle: Automation First + +In 2025, the gbapp philosophy is simple and powerful: + +**"If a machine can do the work, let it do the work."** + +## The Hierarchy of Development + +### 1. LLM First (90% of cases) +Let AI write the code for you: +```basic +' Don't write complex logic - describe what you want +result = LLM "Generate a function that validates email addresses and returns true/false: " + email +``` + +### 2. BASIC for Flow Control (9% of cases) +Use BASIC only for orchestration: +```basic +' BASIC is just glue between AI calls +data = GET "api/data" +processed = LLM "Process this: " + data +SET "results", processed +``` + +### 3. 
Rust for Core Only (1% of cases) +Write Rust only when: +- Contributing new keywords to core +- Building fundamental infrastructure +- Optimizing critical performance paths + +## What gbapp Really Is + +**gbapp is NOT:** +- ❌ External plugin packages +- ❌ Separate npm modules +- ❌ A way to bypass BASIC +- ❌ Runtime extensions + +**gbapp IS:** +- ✅ Virtual crates inside `src/` +- ✅ Rust modules that compile together +- ✅ The bridge between old and new thinking +- ✅ A familiar mental model for contributions +- ✅ A mindset: "Code through automation" + +## Real-World Examples + +### Wrong Approach (Old Thinking) +```javascript +// 500 lines of custom Node.js, Python or C# code for data validation +function validateComplexBusinessRules(data) { + // ... hundreds of lines of logic +} +``` + +### Right Approach (2025 Reality) +```basic +' 3 lines - let AI handle complexity +rules = GET "business-rules.txt" +validation = LLM "Validate this data against these rules: " + data + " Rules: " + rules +IF validation CONTAINS "valid" THEN TALK "Approved" ELSE TALK "Rejected: " + validation +``` + +## The Multi-SDK Reality + +You don't need separate SDKs or plugins. 
Everything integrates through BASIC + LLM: + +### Integrating Any API +```basic +' No SDK needed - just describe what you want +data = GET "https://server/data" +answer = LLM "Write a clear report from this JSON: " + data +TALK answer +``` + +### Working with Any Database +```basic +' No ORM needed - AI understands SQL +results = FIND "users", "all users who logged in today" +``` + +### Processing Any Format +```basic +' No parser library needed +xml_data = GET "complex.xml" +json = LLM "Convert this XML to JSON: " + xml_data +SET BOT MEMORY "processed_data", json +``` + +## When to Write Code + +### Use LLM When: +- Processing unstructured data +- Implementing business logic +- Transforming between formats +- Making decisions +- Generating content +- Analyzing patterns +- **Basically: 90% of everything** + +### Use BASIC When: +- Orchestrating AI calls +- Simple flow control +- Managing state +- Connecting systems +- **Just the glue** + +### Use Rust When: +- Building new keywords in your gbapp virtual crate +- Creating a new gbapp module in `src/` +- System-level optimization +- Contributing new features as gbapps +- **Only for core enhancements** + +## The gbapp Mindset + +Stop thinking about: +- "How do I code this?" +- "What library do I need?" +- "How do I extend the system?" + +Start thinking about: +- "How do I describe this to AI?" +- "What's the simplest BASIC flow?" +- "How does this help everyone?" 
+ +## Examples of Getting Real + +### Data Enrichment (Old Way) +```javascript +// 1000+ lines of code +// Multiple NPM packages +// Complex error handling +// Maintenance nightmare +``` + +### Data Enrichment (Get Real Way) +```basic +items = FIND "companies", "needs_enrichment=true" +FOR EACH item IN items + website = WEBSITE OF item.company + page = GET website + enriched = LLM "Extract company info from: " + page + SET "companies", "id=" + item.id, "data=" + enriched +NEXT +``` + +### Report Generation (Old Way) +```python +# Custom reporting engine +# Template systems +# Complex formatting logic +# PDF libraries +``` + +### Report Generation (Get Real Way) +```basic +data = FIND "sales", "month=current" +report = LLM "Create executive summary from: " + data +CREATE SITE "report", "template", report +``` + +## The Ultimate Test + +Before writing ANY code, ask yourself: + +1. **Can LLM do this?** (Usually YES) +2. **Can BASIC orchestrate it?** (Almost always YES) +3. **Do I really need Rust?** (Almost never) + +## Benefits of This Approach + +### For Developers +- 100x faster development +- No dependency management +- No version conflicts +- No maintenance burden +- Focus on business logic, not implementation + +### For Organizations +- Reduced complexity +- Lower maintenance costs +- Faster iterations +- No vendor lock-in +- Anyone can contribute + +### For the Community +- Shared improvements benefit everyone +- No fragmentation +- Consistent experience +- Collective advancement + +## The Future is Already Here + +In 2025, this isn't aspirational - it's reality: + +- **100% BASIC/LLM applications** are production-ready +- **Zero custom code** for most use cases +- **AI handles complexity** better than humans +- **Machines do machine work** while humans do human work + +## Migration Path + +### From Extensions to Virtual Crates +``` +Old: node_modules/ + └── my-plugin.gbapp/ + ├── index.js (500 lines) + ├── package.json + └── complex logic + +New: src/ + └── my_feature/ # 
my_feature.gbapp (virtual crate) + ├── mod.rs # 50 lines + └── keywords.rs # Register BASIC keywords + +Plus: my-bot.gbdialog/ + └── logic.bas (5 lines using LLM) +``` + +### From Code to Descriptions +``` +Old: Write algorithm to process data +New: Describe what you want to LLM +``` + +### From Libraries to LLM +``` +Old: Import 20 NPM packages +New: Single LLM call with description +``` + +## Get Real Guidelines + +✅ **DO:** +- Describe problems to LLM +- Use BASIC as glue +- Contribute keywords to core +- Share your patterns +- Think automation-first + +❌ **DON'T:** +- Write complex algorithms +- Build separate plugins +- Create custom frameworks +- Maintain separate codebases +- Fight the machine + +## The Virtual Crate Architecture + +Each gbapp is now a module in `src/`: +``` +src/ +├── core/ # core.gbapp +├── basic/ # basic.gbapp +├── channels/ # channels.gbapp +└── your_feature/ # your_feature.gbapp (your contribution!) +``` + +This elegant mapping preserves the conceptual model while leveraging Rust's power. + +## Conclusion + +gbapp in 2025 has evolved from external packages to virtual crates - Rust modules inside `src/` that compile into a single, optimized binary. This preserves the familiar mental model while delivering native performance. + +The philosophy remains: machines are better at machine work. Your job is to describe what you want, not implement how to do it. The combination of BASIC + LLM eliminates the need for traditional programming in almost all cases. + + +## Examples Repository + +See `/templates/` for real-world examples of 100% BASIC/LLM applications: +- CRM system: 50 lines of BASIC +- Email automation: 30 lines of BASIC +- Data pipeline: 20 lines of BASIC +- Report generator: 15 lines of BASIC + +Each would have been thousands of lines in traditional code. 
diff --git a/docs/src/chapter-07-gbapp/prompt-manager.md b/docs/src/chapter-07-gbapp/prompt-manager.md index 4d371e5ba..b2c790257 100644 --- a/docs/src/chapter-07-gbapp/prompt-manager.md +++ b/docs/src/chapter-07-gbapp/prompt-manager.md @@ -1,262 +1 @@ # Prompt Manager - -The Prompt Manager module provides centralized management of LLM prompts, templates, and system instructions used throughout BotServer. - -## Overview - -Located in `src/prompt_manager/`, this module maintains a library of reusable prompts that can be: -- Versioned and updated without code changes -- Customized per bot instance -- Composed dynamically based on context -- Optimized for different LLM models - -## Architecture - -``` -src/prompt_manager/ -├── mod.rs # Main module interface -├── prompts.csv # Default prompt library -└── templates/ # Complex prompt templates - ├── system.md # System instructions - ├── tools.md # Tool-use prompts - └── context.md # Context formatting -``` - -## Prompt Library Format - -The `prompts.csv` file stores prompts in a structured format: - -```csv -id,category,name,content,model,version -1,system,default,"You are a helpful assistant...",gpt-4,1.0 -2,tools,function_call,"To use a tool, follow this format...",any,1.0 -3,context,kb_search,"Search the knowledge base for: {query}",any,1.0 -``` - -### Fields - -| Field | Description | -|-------|-------------| -| `id` | Unique identifier | -| `category` | Prompt category (system, tools, context, etc.) 
| -| `name` | Prompt name for retrieval | -| `content` | The actual prompt text with placeholders | -| `model` | Target model or "any" for universal | -| `version` | Version for tracking changes | - -## Usage in BASIC - -Prompts are automatically loaded and can be referenced in dialogs: - -```basic -' For background processing only - not for interactive conversations -' Generate content for storage -summary = LLM "Use prompt: customer_service" -SET BOT MEMORY "service_info", summary - -' For interactive conversations, use SET CONTEXT -SET CONTEXT "support_issue", issue -TALK "How can I help you with your technical issue?" -``` - -## Rust API - -### Loading Prompts - -```rust -use crate::prompt_manager::PromptManager; - -let manager = PromptManager::new(); -manager.load_from_csv("prompts.csv")?; -``` - -### Retrieving Prompts - -```rust -// Get a specific prompt -let prompt = manager.get_prompt("system", "default")?; - -// Get prompt with variable substitution -let mut vars = HashMap::new(); -vars.insert("query", "user question"); -let formatted = manager.format_prompt("context", "kb_search", vars)?; -``` - -### Dynamic Composition - -```rust -// Compose multiple prompts -let system = manager.get_prompt("system", "default")?; -let tools = manager.get_prompt("tools", "available")?; -let context = manager.get_prompt("context", "current")?; - -let full_prompt = manager.compose(vec![system, tools, context])?; -``` - -## Prompt Categories - -### System Prompts -Define the AI assistant's role and behavior: -- `default`: Standard helpful assistant -- `professional`: Business-focused responses -- `technical`: Developer-oriented assistance -- `creative`: Creative writing and ideation - -### Tool Prompts -Instructions for tool usage: -- `function_call`: How to invoke functions -- `parameter_format`: Parameter formatting rules -- `error_handling`: Tool error responses - -### Context Prompts -Templates for providing context: -- `kb_search`: Knowledge base query format -- 
`conversation_history`: Previous message format -- `user_context`: User information format - -### Guardrail Prompts -Safety and compliance instructions: -- `content_filter`: Inappropriate content handling -- `pii_protection`: Personal data protection -- `compliance`: Regulatory compliance rules - -## Custom Prompts - -Bots can override default prompts by providing their own: - -``` -mybot.gbai/ -└── mybot.gbot/ - ├── config.csv - └── prompts.csv # Custom prompts override defaults -``` - -## Model-Specific Optimization - -Prompts can be optimized for different models: - -```csv -id,category,name,content,model,version -1,system,default,"You are Claude...",claude-3,1.0 -2,system,default,"You are GPT-4...",gpt-4,1.0 -3,system,default,"You are a helpful assistant",llama-3,1.0 -``` - -The manager automatically selects the best match for the current model. - -## Variables and Placeholders - -Prompts support variable substitution using `{variable}` syntax: - -``` -"Search for {query} in {collection} and return {limit} results" -``` - -Variables are replaced at runtime: - -```rust -let vars = hashmap!{ - "query" => "pricing information", - "collection" => "docs", - "limit" => "5" -}; -let prompt = manager.format_prompt("search", "template", vars)?; -``` - -## Prompt Versioning - -Track prompt evolution: - -```csv -id,category,name,content,model,version -1,system,default,"Original prompt...",gpt-4,1.0 -2,system,default,"Updated prompt...",gpt-4,1.1 -3,system,default,"Latest prompt...",gpt-4,2.0 -``` - -The manager uses the latest version by default but can retrieve specific versions: - -```rust -let prompt = manager.get_prompt_version("system", "default", "1.0")?; -``` - -## Performance Optimization - -### Caching -Frequently used prompts are cached in memory: - -```rust -manager.cache_prompt("system", "default"); -``` - -### Token Counting -Estimate token usage before sending: - -```rust -let tokens = manager.estimate_tokens(prompt, "gpt-4")?; -if tokens > MAX_TOKENS { - 
prompt = manager.compress_prompt(prompt, MAX_TOKENS)?; -} -``` - -### Compression -Automatically compress prompts while maintaining meaning: - -```rust -let compressed = manager.compress_prompt(original, target_tokens)?; -``` - -## Best Practices - -1. **Modularity**: Keep prompts focused on single responsibilities -2. **Versioning**: Always version prompts for rollback capability -3. **Testing**: Test prompts across different models -4. **Documentation**: Document the purpose and expected output -5. **Variables**: Use placeholders for dynamic content -6. **Optimization**: Tailor prompts to specific model capabilities - -## Integration with BASIC - -The Prompt Manager is automatically available in BASIC dialogs: - -```basic -' Load custom prompt library -LOAD_PROMPTS "custom_prompts.csv" - -' For background processing - generate content once -greeting = LLM PROMPT("customer_greeting") -SET BOT MEMORY "standard_greeting", greeting - -' For interactive conversations with variables -SET CONTEXT "customer_name", customer_name -SET CONTEXT "support_ticket", support_ticket -TALK "Let me help you with your support request." 
-``` - -## Monitoring and Analytics - -Track prompt performance: - -```rust -// Log prompt usage -manager.log_usage("system", "default", response_quality); - -// Get analytics -let stats = manager.get_prompt_stats("system", "default")?; -println!("Success rate: {}%", stats.success_rate); -println!("Avg response time: {}ms", stats.avg_latency); -``` - -## Error Handling - -Handle missing or invalid prompts gracefully: - -```rust -match manager.get_prompt("custom", "missing") { - Ok(prompt) => use_prompt(prompt), - Err(PromptError::NotFound) => use_default(), - Err(PromptError::Invalid) => log_and_fallback(), - Err(e) => return Err(e), -} -``` - diff --git a/docs/src/chapter-08-config/README.md b/docs/src/chapter-08-config/README.md index c6573bfad..32b959471 100644 --- a/docs/src/chapter-08-config/README.md +++ b/docs/src/chapter-08-config/README.md @@ -1,42 +1,111 @@ -## gbot Reference -`config.csv` defines the bot’s behaviour and parameters. +# Bot Configuration + +This chapter covers bot configuration through the `config.csv` file system. Each bot's behavior is controlled by a simple CSV configuration file in its `.gbot` package. + +## Configuration System + +BotServer uses a straightforward name-value CSV format for configuration: ```csv -# config.csv – Bot configuration -bot_name,GeneralBot -language,en -theme,default.gbtheme -knowledge_base,default.gbkb -max_context_tokens,2048 +name,value +setting_name,setting_value +another_setting,another_value ``` -### Key Columns -- **bot_name** – Display name of the bot. -- **language** – Locale for formatting (used by `FORMAT`). -- **theme** – UI theme package (`.gbtheme`). -- **knowledge_base** – Default knowledge‑base package (`.gbkb`). -- **max_context_tokens** – Maximum number of tokens retained in the session context. -- **max_context_tokens** – Limit for the amount of context sent to the LLM. +## File Location -### Editing the Configuration -The file is a simple CSV; each line is `key,value`. 
Comments start with `#`. After editing, restart the server to apply changes. +``` +mybot.gbai/ +└── mybot.gbot/ + └── config.csv +``` -### Runtime Effects -- Changing **theme** updates the UI served from `web/static/`. -- Modifying **knowledge_base** switches the vector collection used for semantic search. -- Adjusting **answer_mode** influences the order of tool invocation and LLM calls. +## Configuration Categories -For advanced configuration, see `src/bot/config.rs` which parses this file into the `BotConfig` struct. +### Server Settings +- Web server binding and ports +- Site generation paths +- Service endpoints + +### LLM Configuration +- Model paths (local GGUF files) +- Service URLs +- Cache settings +- Server parameters (when embedded) + +### Prompt Management +- Context compaction levels +- History retention +- Token management + +### Email Integration +- SMTP server settings +- Authentication credentials +- Sender configuration + +### Theme Customization +- Color schemes +- Logo URLs +- Bot titles + +### Custom Database +- External database connections +- Authentication details + +## Key Features + +### Simple Format +- Plain CSV with name-value pairs +- No complex syntax +- Human-readable + +### Flexible Structure +- Empty rows for visual grouping +- Optional settings with defaults +- Extensible for custom needs + +### Local-First +- Designed for local LLM models +- Self-hosted services +- No cloud dependency by default + +## Example Configurations + +### Minimal Setup +Just the essentials to run a bot: +```csv +name,value +llm-url,http://localhost:8081 +llm-model,../../../../data/llm/model.gguf +``` + +### Production Setup +Full configuration with all services: +```csv +name,value +, +server_host,0.0.0.0 +server_port,8080 +, +llm-url,http://localhost:8081 +llm-model,../../../../data/llm/production-model.gguf +llm-cache,true +, +email-server,smtp.company.com +email-from,bot@company.com +, +theme-title,Company Assistant +``` + +## Configuration 
Philosophy + +1. **Defaults Work**: Most settings have sensible defaults +2. **Local First**: Assumes local services, not cloud APIs +3. **Simple Values**: All values are strings, parsed as needed +4. **No Magic**: What you see is what you get ## See Also -- [config.csv Reference](./config-csv.md) - Complete configuration options -- [PostgreSQL Setup](./postgresql.md) - Database configuration -- [MinIO Storage](./minio.md) - Object storage setup -- [Qdrant Vector DB](./qdrant.md) - Vector database configuration -- [Valkey Cache](./valkey.md) - Caching layer setup -- [Chapter 2: .gbot](../chapter-02/gbot.md) - Bot configuration package -- [Chapter 3: Knowledge Base](../chapter-03/README.md) - KB configuration -- [Chapter 5: BASIC Reference](../chapter-05/README.md) - Script configuration -- [Chapter 9: Storage](../chapter-09/storage.md) - Storage architecture -- [Chapter 11: Infrastructure](../chapter-11/README.md) - Complete infrastructure guide +- [config.csv Format](./config-csv.md) - Complete reference +- [LLM Configuration](./llm-config.md) - Language model settings +- [Parameters](./parameters.md) - All available parameters \ No newline at end of file diff --git a/docs/src/chapter-08-config/answer-modes.md b/docs/src/chapter-08-config/answer-modes.md deleted file mode 100644 index 387fb473f..000000000 --- a/docs/src/chapter-08-config/answer-modes.md +++ /dev/null @@ -1 +0,0 @@ -# Answer Modes diff --git a/docs/src/chapter-08-config/config-csv.md b/docs/src/chapter-08-config/config-csv.md index 2b65cc91e..e846ce585 100644 --- a/docs/src/chapter-08-config/config-csv.md +++ b/docs/src/chapter-08-config/config-csv.md @@ -1,263 +1,243 @@ # config.csv Format -The `config.csv` file is the central configuration for each bot instance. Located in the `.gbot` package directory, it controls all bot behavior, integrations, and system settings. +The `config.csv` file is the central configuration for each bot, located in the `.gbot` package. 
It uses a simple name-value pair format. -## File Location - -``` -mybot.gbai/ -└── mybot.gbot/ - └── config.csv -``` - -## Format - -Configuration uses simple CSV format with two columns: `key` and `value`. +## File Format ```csv -key,value -botId,00000000-0000-0000-0000-000000000000 -title,My Bot Name -description,Bot description here +name,value +setting_name,setting_value +another_setting,another_value ``` -## Core Settings +- **Empty rows** are used for visual grouping +- **No quotes** needed for string values +- **Case-sensitive** names -### Bot Identity +## Core Server Settings -| Key | Description | Default | Example | -|-----|-------------|---------|---------| -| `botId` | Unique bot identifier (UUID) | Generated | `00000000-0000-0000-0000-000000000000` | -| `title` | Bot display name | Required | `Customer Support Bot` | -| `description` | Bot description | Empty | `Handles customer inquiries` | -| `logoUrl` | Bot avatar/logo URL | Empty | `https://example.com/logo.png` | -| `welcomeMessage` | Initial greeting | Empty | `Hello! 
How can I help you today?` | - -### LLM Configuration - -| Key | Description | Default | Example | -|-----|-------------|---------|---------| -| `llm-model` | Model path or name | Local model | `../../../../data/llm/model.gguf` | -| `llm-key` | API key (if using external) | `none` | `sk-...` for external APIs | -| `llm-url` | LLM endpoint URL | `http://localhost:8081` | Local or external endpoint | -| `llm-cache` | Enable LLM caching | `false` | `true` | -| `llm-cache-ttl` | Cache time-to-live (seconds) | `3600` | `7200` | - -### Knowledge Base - -| Key | Description | Default | Example | -|-----|-------------|---------|---------| -| `qdrantUrl` | Vector database URL | `http://localhost:6333` | `http://qdrant:6333` | -| `qdrantApiKey` | Qdrant API key | Empty | `your-api-key` | -| `embeddingModel` | Model for embeddings | `text-embedding-ada-002` | `all-MiniLM-L6-v2` | -| `chunkSize` | Text chunk size | `1000` | `500` | -| `chunkOverlap` | Overlap between chunks | `200` | `100` | -| `topK` | Number of search results | `5` | `10` | - -### Storage Configuration ### Server Configuration - -| Key | Description | Default | Example | -|-----|-------------|---------|---------| -| `server_host` | Server bind address | `0.0.0.0` | `localhost` | -| `server_port` | Server port | `8080` | `3000` | -| `sites_root` | Sites root directory | `/tmp` | `/var/www` | -| `mcp-server` | Enable MCP server | `false` | `true` | - -### Database - -| Key | Description | Default | Example | -|-----|-------------|---------|---------| -| `databaseUrl` | PostgreSQL connection | Required | `postgresql://user:pass@localhost/botdb` | -| `maxConnections` | Connection pool size | `10` | `25` | -| `connectionTimeout` | Timeout in seconds | `30` | `60` | - -### Email Integration - -| Key | Description | Default | Example | -|-----|-------------|---------|---------| -| `smtpHost` | SMTP server | Empty | `smtp.gmail.com` | -| `smtpPort` | SMTP port | `587` | `465` | -| `smtpUser` | Email username | Empty 
| `bot@example.com` | -| `smtpPassword` | Email password | Empty | `app-specific-password` | -| `smtpFrom` | From address | Empty | `noreply@example.com` | -| `smtpUseTls` | Use TLS | `true` | `false` | - -### Calendar Integration - -| Key | Description | Default | Example | -|-----|-------------|---------|---------| -| `calendarEnabled` | Enable calendar features | `false` | `true` | -| `calendarProvider` | Calendar service | `google` | `microsoft`, `caldav` | -| `calendarApiKey` | Calendar API key | Empty | `your-api-key` | -| `workingHoursStart` | Business hours start | `09:00` | `08:30` | -| `workingHoursEnd` | Business hours end | `17:00` | `18:00` | -| `timezone` | Default timezone | `UTC` | `America/New_York` | - -### Authentication - -| Key | Description | Default | Example | -|-----|-------------|---------|---------| -| `authEnabled` | Require authentication | `false` | `true` | -| `authProvider` | Auth provider | `local` | `oauth`, `saml`, `ldap` | -| `authClientId` | OAuth client ID | Empty | `client-id` | -| `authClientSecret` | OAuth secret | Empty | `client-secret` | -| `authCallbackUrl` | OAuth callback | Empty | `https://bot.example.com/auth/callback` | -| `jwtSecret` | JWT signing secret | Generated | `your-secret-key` | -| `sessionTimeout` | Session duration (min) | `1440` | `60` | - -### Channel Configuration - -| Key | Description | Default | Example | -|-----|-------------|---------|---------| -| `webEnabled` | Enable web interface | `true` | `false` | -| `whatsappEnabled` | Enable WhatsApp | `false` | `true` | -| `whatsappToken` | WhatsApp API token | Empty | `EAAI...` | -| `whatsappPhoneId` | WhatsApp phone ID | Empty | `123456789` | -| `teamsEnabled` | Enable MS Teams | `false` | `true` | -| `teamsAppId` | Teams app ID | Empty | `app-id` | -| `teamsAppPassword` | Teams app password | Empty | `app-password` | -| `slackEnabled` | Enable Slack | `false` | `true` | -| `slackToken` | Slack bot token | Empty | `xoxb-...` | - -### Security - -| Key 
| Description | Default | Example | -|-----|-------------|---------|---------| -| `corsOrigins` | Allowed CORS origins | `*` | `https://example.com` | -| `rateLimitPerMinute` | API rate limit | `60` | `100` | -| `maxFileSize` | Max upload size (MB) | `10` | `50` | -| `allowedFileTypes` | Permitted file types | `pdf,doc,txt` | `*` | -| `encryptionKey` | Data encryption key | Generated | `base64-key` | -| `requireHttps` | Force HTTPS | `false` | `true` | - -### Monitoring - -| Key | Description | Default | Example | -|-----|-------------|---------|---------| -| `metricsEnabled` | Enable metrics | `false` | `true` | -| `metricsEndpoint` | Metrics endpoint | `/metrics` | `/admin/metrics` | -| `loggingLevel` | Log level | `info` | `debug`, `warn`, `error` | -| `logToFile` | Log to file | `false` | `true` | -| `logFilePath` | Log file location | `./logs` | `/var/log/botserver` | -| `sentryDsn` | Sentry error tracking | Empty | `https://...@sentry.io/...` | - -### Advanced Features - -| Key | Description | Default | Example | -|-----|-------------|---------|---------| -| `webAutomationEnabled` | Enable web scraping | `false` | `true` | -| `ocrEnabled` | Enable OCR | `false` | `true` | -| `speechEnabled` | Enable speech | `false` | `true` | -| `translationEnabled` | Enable translation | `false` | `true` | -| `cacheEnabled` | Enable cache component | `false` | `true` | -| `cacheUrl` | Cache URL | `redis://localhost:6379` | `redis://cache:6379` | - -## Environment Variable Override - -Any config value can be overridden using environment variables: - -```bash -# Override LLM model -export BOT_LLM_MODEL=gpt-4-turbo - -# Override database URL -export BOT_DATABASE_URL=postgresql://prod@db/botserver +```csv +server_host,0.0.0.0 +server_port,8080 +sites_root,/tmp ``` -## Multiple Bots Configuration +| Name | Description | Default | Example | +|------|-------------|---------|---------| +| `server_host` | Bind address for the web server | `0.0.0.0` | `0.0.0.0` | +| `server_port` | 
Port for the web interface | `8080` | `8080` | +| `sites_root` | Directory for generated sites | `/tmp` | `/tmp` | -Each bot has its own `config.csv`. The system loads all bot configurations on startup: +## LLM Configuration -``` -templates/ -├── support.gbai/ -│ └── support.gbot/ -│ └── config.csv # Support bot config -├── sales.gbai/ -│ └── sales.gbot/ -│ └── config.csv # Sales bot config -└── default.gbai/ - └── default.gbot/ - └── config.csv # Default bot config +### LLM Connection +```csv +llm-key,none +llm-url,http://localhost:8081 +llm-model,../../../../data/llm/DeepSeek-R1-Distill-Qwen-1.5B-Q3_K_M.gguf ``` -## Configuration Validation +| Name | Description | Default | Example | +|------|-------------|---------|---------| +| `llm-key` | API key for LLM service | `none` | `none` or API key | +| `llm-url` | LLM service endpoint | `http://localhost:8081` | `http://localhost:8081` | +| `llm-model` | Path to GGUF model file | Model path | `../../../../data/llm/model.gguf` | -The system validates configuration on startup: -- Required fields must be present -- UUIDs must be valid format -- URLs must be reachable -- API keys are tested -- File paths must exist +### LLM Cache Settings +```csv +llm-cache,false +llm-cache-ttl,3600 +llm-cache-semantic,true +llm-cache-threshold,0.95 +``` -## Hot Reload +| Name | Description | Default | Example | +|------|-------------|---------|---------| +| `llm-cache` | Enable response caching | `false` | `true` or `false` | +| `llm-cache-ttl` | Cache TTL in seconds | `3600` | `3600` | +| `llm-cache-semantic` | Enable semantic similarity caching | `true` | `true` or `false` | +| `llm-cache-threshold` | Similarity threshold (0-1) | `0.95` | `0.95` | -Changes to `config.csv` can be reloaded without restart: -1. Edit the file -2. Call `/api/admin/reload-config` endpoint -3. 
Or use the admin UI reload button +### LLM Server Settings (when running embedded) +```csv +llm-server,false +llm-server-path,botserver-stack/bin/llm/build/bin +llm-server-host,0.0.0.0 +llm-server-port,8081 +llm-server-gpu-layers,0 +llm-server-n-moe,0 +llm-server-ctx-size,4096 +llm-server-n-predict,1024 +llm-server-parallel,6 +llm-server-cont-batching,true +llm-server-mlock,false +llm-server-no-mmap,false +``` -## Security Best Practices +| Name | Description | Default | +|------|-------------|---------| +| `llm-server` | Run embedded LLM server | `false` | +| `llm-server-path` | Path to LLM server binaries | `botserver-stack/bin/llm/build/bin` | +| `llm-server-host` | LLM server bind address | `0.0.0.0` | +| `llm-server-port` | LLM server port | `8081` | +| `llm-server-gpu-layers` | GPU layers to offload | `0` | +| `llm-server-n-moe` | Number of MoE experts | `0` | +| `llm-server-ctx-size` | Context size in tokens | `4096` | +| `llm-server-n-predict` | Max prediction tokens | `1024` | +| `llm-server-parallel` | Parallel requests | `6` | +| `llm-server-cont-batching` | Continuous batching | `true` | +| `llm-server-mlock` | Lock model in memory | `false` | +| `llm-server-no-mmap` | Disable memory mapping | `false` | -1. **Never commit API keys** - Use environment variables -2. **Encrypt sensitive values** - Use `encryptionKey` setting -3. **Rotate credentials regularly** - Update keys monthly -4. **Use strong JWT secrets** - At least 32 characters -5. **Restrict CORS origins** - Don't use `*` in production -6. **Enable HTTPS** - Set `requireHttps=true` -7. **Set rate limits** - Prevent abuse -8. 
**Monitor access** - Enable logging and metrics - -## Troubleshooting - -### Bot Won't Start -- Check required fields are set -- Verify database connection -- Ensure bot ID is unique - -### LLM Not Responding -- Verify API key is valid -- Check endpoint URL -- Test rate limits - -### Storage Issues -- Verify drive is running -- Check access credentials -- Test bucket permissions - -### Authentication Problems -- Verify JWT secret matches -- Check session timeout -- Test OAuth callback URL - -## Example Configuration - -Complete example for a production bot: +## Prompt Settings ```csv -key,value -botId,a1b2c3d4-e5f6-7890-abcd-ef1234567890 -title,Customer Support Assistant -description,24/7 automated customer support -welcomeMessage,Hello! I'm here to help with any questions. -llmModel,gpt-4 -llmApiKey,${LLM_API_KEY} -llmTemperature,0.3 -databaseUrl,${DATABASE_URL} -minioEndpoint,storage.example.com -minioAccessKey,${MINIO_ACCESS} -minioSecretKey,${MINIO_SECRET} -minioBucket,support-bot -minioUseSsl,true -authEnabled,true -authProvider,oauth -authClientId,${OAUTH_CLIENT_ID} -authClientSecret,${OAUTH_CLIENT_SECRET} -corsOrigins,https://app.example.com -requireHttps,true -loggingLevel,info -metricsEnabled,true -cacheEnabled,true -cacheUrl,redis://cache:6379 +prompt-compact,4 +prompt-history,2 ``` + +| Name | Description | Default | Example | +|------|-------------|---------|---------| +| `prompt-compact` | Context compaction level | `4` | `4` | +| `prompt-history` | Messages to keep in history | Not set | `2` | + +## Embedding Configuration + +```csv +embedding-url,http://localhost:8082 +embedding-model,../../../../data/llm/bge-small-en-v1.5-f32.gguf +``` + +| Name | Description | Default | +|------|-------------|---------| +| `embedding-url` | Embedding service endpoint | `http://localhost:8082` | +| `embedding-model` | Path to embedding model | Model path | + +## Email Configuration + +```csv +email-from,from@domain.com +email-server,mail.domain.com +email-port,587 
+email-user,user@domain.com +email-pass, +``` + +| Name | Description | Example | +|------|-------------|---------| +| `email-from` | Sender email address | `noreply@example.com` | +| `email-server` | SMTP server hostname | `smtp.gmail.com` | +| `email-port` | SMTP port | `587` | +| `email-user` | SMTP username | `user@example.com` | +| `email-pass` | SMTP password | Password (empty if not set) | + +## Theme Configuration + +```csv +theme-color1,#0d2b55 +theme-color2,#fff9c2 +theme-logo,https://pragmatismo.com.br/icons/general-bots.svg +theme-title,Announcements General Bots +``` + +| Name | Description | Example | +|------|-------------|---------| +| `theme-color1` | Primary theme color | `#0d2b55` | +| `theme-color2` | Secondary theme color | `#fff9c2` | +| `theme-logo` | Logo URL | `https://example.com/logo.svg` | +| `theme-title` | Bot display title | `My Bot` | + +## Custom Database + +```csv +custom-server,localhost +custom-port,5432 +custom-database,mycustomdb +custom-username, +custom-password, +``` + +| Name | Description | Example | +|------|-------------|---------| +| `custom-server` | Database server | `localhost` | +| `custom-port` | Database port | `5432` | +| `custom-database` | Database name | `mydb` | +| `custom-username` | Database user | Username | +| `custom-password` | Database password | Password | + +## MCP Server + +```csv +mcp-server,false +``` + +| Name | Description | Default | +|------|-------------|---------| +| `mcp-server` | Enable MCP server | `false` | + +## Complete Example + +### Minimal Configuration +```csv +name,value +server_port,8080 +llm-url,http://localhost:8081 +llm-model,../../../../data/llm/model.gguf +``` + +### Production Configuration +```csv +name,value +, +server_host,0.0.0.0 +server_port,443 +sites_root,/var/www/sites +, +llm-key,sk-... 
+llm-url,https://api.openai.com +llm-model,gpt-4 +, +llm-cache,true +llm-cache-ttl,7200 +, +email-from,bot@company.com +email-server,smtp.company.com +email-port,587 +email-user,bot@company.com +email-pass,secure_password +, +theme-title,Company Assistant +theme-color1,#003366 +theme-color2,#ffffff +``` + +## Configuration Loading + +1. Default values are applied first +2. `config.csv` values override defaults +3. Environment variables override config.csv (if implemented) +4. All values are strings - parsed as needed by the application + +## Best Practices + +✅ **DO:** +- Group related settings with empty rows +- Use descriptive values +- Keep sensitive data in environment variables when possible +- Test configuration changes in development first + +❌ **DON'T:** +- Include quotes around values +- Use spaces around commas +- Leave trailing commas +- Include comments with # (use empty name field instead) + +## Validation + +The system validates: +- Required fields are present +- Port numbers are valid (1-65535) +- URLs are properly formatted +- File paths exist (for model files) +- Email settings are complete if email features are used \ No newline at end of file diff --git a/docs/src/chapter-08-config/llm-config.md b/docs/src/chapter-08-config/llm-config.md index 399e52b68..1c821952e 100644 --- a/docs/src/chapter-08-config/llm-config.md +++ b/docs/src/chapter-08-config/llm-config.md @@ -1,16 +1,10 @@ # LLM Configuration -Configure Large Language Model providers for bot conversations. BotServer prioritizes local models for privacy and cost-effectiveness. +Configuration for Language Model integration in BotServer, supporting both local GGUF models and external API services. -## Overview +## Local Model Configuration -BotServer supports both local models (GGUF format) and cloud APIs. The default configuration uses local models running on your hardware. 
- -## Local Models (Default) - -### Configuration - -From `default.gbai/default.gbot/config.csv`: +BotServer is designed to work with local GGUF models by default: ```csv llm-key,none @@ -18,13 +12,35 @@ llm-url,http://localhost:8081 llm-model,../../../../data/llm/DeepSeek-R1-Distill-Qwen-1.5B-Q3_K_M.gguf ``` -### LLM Server Settings +### Model Path + +The `llm-model` parameter accepts: +- **Relative paths**: `../../../../data/llm/model.gguf` +- **Absolute paths**: `/opt/models/model.gguf` +- **Model names**: When using external APIs like `gpt-4` + +### Supported Model Formats + +- **GGUF**: Quantized models for CPU/GPU inference +- **Q3_K_M, Q4_K_M, Q5_K_M**: Different quantization levels +- **F16, F32**: Full precision models + +## LLM Server Configuration + +### Running Embedded Server + +BotServer can run its own LLM server: ```csv -llm-server,false +llm-server,true llm-server-path,botserver-stack/bin/llm/build/bin llm-server-host,0.0.0.0 llm-server-port,8081 +``` + +### Server Performance Parameters + +```csv llm-server-gpu-layers,0 llm-server-ctx-size,4096 llm-server-n-predict,1024 @@ -32,136 +48,174 @@ llm-server-parallel,6 llm-server-cont-batching,true ``` -### Supported Local Models +| Parameter | Description | Impact | +|-----------|-------------|---------| +| `llm-server-gpu-layers` | Layers to offload to GPU | 0 = CPU only, higher = more GPU | +| `llm-server-ctx-size` | Context window size | More context = more memory | +| `llm-server-n-predict` | Max tokens to generate | Limits response length | +| `llm-server-parallel` | Concurrent requests | Higher = more throughput | +| `llm-server-cont-batching` | Continuous batching | Improves multi-user performance | -- **DeepSeek-R1-Distill-Qwen** - Efficient reasoning model -- **Llama-3** - Open source, high quality -- **Mistral** - Fast and capable -- **Phi-3** - Microsoft's small but powerful model -- **Qwen** - Multilingual support - -### GPU Acceleration +### Memory Management ```csv 
-llm-server-gpu-layers,33 # Number of layers to offload to GPU +llm-server-mlock,false +llm-server-no-mmap,false ``` -Set to 0 for CPU-only operation. +- **mlock**: Locks model in RAM (prevents swapping) +- **no-mmap**: Disables memory mapping (uses more RAM) -## Embeddings Configuration +## Cache Configuration -For semantic search and vector operations: +### Basic Cache Settings ```csv -embedding-url,http://localhost:8082 -embedding-model,../../../../data/llm/bge-small-en-v1.5-f32.gguf -``` - -## Caching Configuration - -Reduce latency and costs with intelligent caching: - -```csv -llm-cache,true +llm-cache,false llm-cache-ttl,3600 +``` + +Caching reduces repeated LLM calls for identical inputs. + +### Semantic Cache + +```csv llm-cache-semantic,true llm-cache-threshold,0.95 ``` -## Cloud Providers (Optional) +Semantic caching matches similar (not just identical) queries: +- **threshold**: 0.95 = 95% similarity required +- Lower threshold = more cache hits but less accuracy -### External API Configuration +## External API Configuration -For cloud LLM services, configure: +### OpenAI-Compatible APIs ```csv -llm-key,your-api-key -llm-url,https://api.provider.com/v1 -llm-model,model-name +llm-key,sk-your-api-key +llm-url,https://api.openai.com/v1 +llm-model,gpt-4 ``` -### Provider Examples +### Local API Servers -| Provider | URL | Model Examples | -|----------|-----|----------------| -| Local | http://localhost:8081 | GGUF models | -| API Compatible | Various | Various models | -| Custom | Your endpoint | Your models | +```csv +llm-key,none +llm-url,http://localhost:8081 +llm-model,local-model-name +``` + +## Configuration Examples + +### Minimal Local Setup +```csv +name,value +llm-url,http://localhost:8081 +llm-model,../../../../data/llm/model.gguf +``` + +### High-Performance Local +```csv +name,value +llm-server,true +llm-server-gpu-layers,32 +llm-server-ctx-size,8192 +llm-server-parallel,8 +llm-server-cont-batching,true +llm-cache,true +llm-cache-semantic,true 
+``` + +### Low-Resource Setup +```csv +name,value +llm-server-ctx-size,2048 +llm-server-n-predict,512 +llm-server-parallel,2 +llm-cache,false +llm-server-mlock,false +``` + +### External API +```csv +name,value +llm-key,sk-... +llm-url,https://api.anthropic.com +llm-model,claude-3 +llm-cache,true +llm-cache-ttl,7200 +``` ## Performance Tuning -### Context Size +### For Responsiveness +- Decrease `llm-server-ctx-size` +- Decrease `llm-server-n-predict` +- Enable `llm-cache` +- Enable `llm-cache-semantic` -```csv -llm-server-ctx-size,4096 # Maximum context window -prompt-compact,4 # Compact after N exchanges -``` +### For Quality +- Increase `llm-server-ctx-size` +- Increase `llm-server-n-predict` +- Use higher quantization (Q5_K_M or F16) +- Disable semantic cache or increase threshold -### Parallel Processing +### For Multiple Users +- Enable `llm-server-cont-batching` +- Increase `llm-server-parallel` +- Enable caching +- Consider GPU offloading -```csv -llm-server-parallel,6 # Concurrent requests -llm-server-cont-batching,true # Continuous batching -``` +## Model Selection Guidelines -### Memory Settings +### Small Models (1-3B parameters) +- Fast responses +- Low memory usage +- Good for simple tasks +- Example: `DeepSeek-R1-Distill-Qwen-1.5B` -```csv -llm-server-mlock,false # Lock model in memory -llm-server-no-mmap,false # Disable memory mapping -``` +### Medium Models (7-13B parameters) +- Balanced performance +- Moderate memory usage +- Good general purpose +- Example: `Llama-2-7B`, `Mistral-7B` -## Model Selection Guide - -| Use Case | Recommended Model | Configuration | -|----------|------------------|---------------| -| General chat | DeepSeek-R1-Distill | Default config | -| Code assistance | Qwen-Coder | Increase context | -| Multilingual | Qwen-Multilingual | Add language params | -| Fast responses | Phi-3-mini | Reduce predict tokens | -| High accuracy | Llama-3-70B | Increase GPU layers | - -## Monitoring - -Check LLM server status: - -```bash -curl 
http://localhost:8081/health -``` - -View model information: - -```bash -curl http://localhost:8081/v1/models -``` +### Large Models (30B+ parameters) +- Best quality +- High memory requirements +- Complex reasoning +- Example: `Llama-2-70B`, `Mixtral-8x7B` ## Troubleshooting -### Model Not Loading - -1. Check file path is correct -2. Verify GGUF format -3. Ensure sufficient memory -4. Check GPU drivers (if using GPU) +### Model Won't Load +- Check file path exists +- Verify sufficient RAM +- Ensure compatible GGUF version ### Slow Responses +- Reduce context size +- Enable caching +- Use GPU offloading +- Choose smaller model -1. Reduce context size -2. Enable GPU acceleration -3. Use smaller model -4. Enable caching +### Out of Memory +- Reduce `llm-server-ctx-size` +- Reduce `llm-server-parallel` +- Use more quantized model (Q3 instead of Q5) +- Disable `llm-server-mlock` -### High Memory Usage - -1. Use quantized models (Q4, Q5) -2. Reduce batch size -3. Enable memory mapping -4. Lower context size +### Connection Refused +- Verify `llm-server` is true +- Check port not in use +- Ensure firewall allows connection ## Best Practices -1. **Start with local models** - Better privacy and no API costs -2. **Use appropriate model size** - Balance quality vs speed -3. **Enable caching** - Reduce redundant computations -4. **Monitor resources** - Watch CPU/GPU/memory usage -5. **Test different models** - Find the best fit for your use case \ No newline at end of file +1. **Start Small**: Begin with small models and scale up +2. **Use Caching**: Enable for production deployments +3. **Monitor Memory**: Watch RAM usage during operation +4. **Test Thoroughly**: Verify responses before production +5. **Document Models**: Keep notes on model performance +6. 
**Version Control**: Track config.csv changes \ No newline at end of file diff --git a/docs/src/chapter-08-config/parameters.md b/docs/src/chapter-08-config/parameters.md index a86b60a3f..b0fe66904 100644 --- a/docs/src/chapter-08-config/parameters.md +++ b/docs/src/chapter-08-config/parameters.md @@ -1,295 +1,188 @@ -# Bot Parameters +# Configuration Parameters -Comprehensive reference for all bot configuration parameters available in `config.csv`. +Complete reference of all available parameters in `config.csv`. -## Parameter Categories +## Server Parameters -Bot parameters are organized into functional groups for easier management and understanding. +### Web Server +| Parameter | Description | Default | Type | +|-----------|-------------|---------|------| +| `server_host` | Server bind address | `0.0.0.0` | IP address | +| `server_port` | Server listen port | `8080` | Number (1-65535) | +| `sites_root` | Generated sites directory | `/tmp` | Path | -## Core Bot Settings - -### Identity Parameters - -| Parameter | Type | Required | Default | Description | -|-----------|------|----------|---------|-------------| -| `botId` | UUID | Yes | Generated | Unique bot identifier | -| `title` | String | Yes | None | Bot display name | -| `description` | String | No | Empty | Bot description | -| `version` | String | No | "1.0" | Bot version | -| `author` | String | No | Empty | Bot creator | -| `language` | String | No | "en" | Default language | -| `timezone` | String | No | "UTC" | Bot timezone | - -### Behavior Parameters - -| Parameter | Type | Required | Default | Description | -|-----------|------|----------|---------|-------------| -| `welcomeMessage` | String | No | Empty | Initial greeting | -| `fallbackMessage` | String | No | "I don't understand" | Default error response | -| `goodbyeMessage` | String | No | "Goodbye!" 
| Session end message |
-| `typingDelay` | Number | No | 1000 | Typing indicator delay (ms) |
-| `responseTimeout` | Number | No | 30000 | Response timeout (ms) |
-| `maxRetries` | Number | No | 3 | Maximum retry attempts |
-| `debugMode` | Boolean | No | false | Enable debug logging |
+### MCP Server
+| Parameter | Description | Default | Type |
+|-----------|-------------|---------|------|
+| `mcp-server` | Enable MCP protocol server | `false` | Boolean |

 ## LLM Parameters

-### Model Configuration
+### Core LLM Settings
+| Parameter | Description | Default | Type |
+|-----------|-------------|---------|------|
+| `llm-key` | API key for LLM service | `none` | String |
+| `llm-url` | LLM service endpoint | `http://localhost:8081` | URL |
+| `llm-model` | Model path or identifier | Required | Path/String |

-| Parameter | Type | Required | Default | Description |
-|-----------|------|----------|---------|-------------|
-| `llmProvider` | String | Yes | "openai" | LLM provider (openai, anthropic, google, local) |
-| `llmModel` | String | Yes | "gpt-4" | Model name |
-| `llmApiKey` | String | Yes* | None | API key (*not required for local) |
-| `llmEndpoint` | String | No | Provider default | Custom API endpoint |
-| `llmOrganization` | String | No | Empty | Organization ID (OpenAI) |
-| `llmProject` | String | No | Empty | Project ID (Google) |
+### LLM Cache
+| Parameter | Description | Default | Type |
+|-----------|-------------|---------|------|
+| `llm-cache` | Enable response caching | `false` | Boolean |
+| `llm-cache-ttl` | Cache time-to-live | `3600` | Seconds |
+| `llm-cache-semantic` | Semantic similarity cache | `true` | Boolean |
+| `llm-cache-threshold` | Similarity threshold | `0.95` | Float (0-1) |

-### Response Control
+### Embedded LLM Server
+| Parameter | Description | Default | Type |
+|-----------|-------------|---------|------|
+| `llm-server` | Run embedded server | `false` | Boolean |
+| `llm-server-path` | Server binary path | `botserver-stack/bin/llm/build/bin` | Path |
+| `llm-server-host` | Server bind address | `0.0.0.0` | IP address |
+| `llm-server-port` | Server port | `8081` | Number |
+| `llm-server-gpu-layers` | GPU offload layers | `0` | Number |
+| `llm-server-n-moe` | MoE experts count | `0` | Number |
+| `llm-server-ctx-size` | Context size | `4096` | Tokens |
+| `llm-server-n-predict` | Max predictions | `1024` | Tokens |
+| `llm-server-parallel` | Parallel requests | `6` | Number |
+| `llm-server-cont-batching` | Continuous batching | `true` | Boolean |
+| `llm-server-mlock` | Lock in memory | `false` | Boolean |
+| `llm-server-no-mmap` | Disable mmap | `false` | Boolean |

-| Parameter | Type | Required | Default | Description |
-|-----------|------|----------|---------|-------------|
-| `llmTemperature` | Float | No | 0.7 | Creativity (0.0-1.0) |
-| `llmMaxTokens` | Number | No | 2000 | Max response tokens |
-| `llmTopP` | Float | No | 1.0 | Nucleus sampling |
-| `llmFrequencyPenalty` | Float | No | 0.0 | Reduce repetition |
-| `llmPresencePenalty` | Float | No | 0.0 | Encourage new topics |
-| `llmStopSequences` | String | No | Empty | Stop generation sequences |
-| `llmSystemPrompt` | String | No | Default | System instruction |
+## Embedding Parameters

-### Cost Management
+| Parameter | Description | Default | Type |
+|-----------|-------------|---------|------|
+| `embedding-url` | Embedding service endpoint | `http://localhost:8082` | URL |
+| `embedding-model` | Embedding model path | Required for KB | Path |

-| Parameter | Type | Required | Default | Description |
-|-----------|------|----------|---------|-------------|
-| `llmCostLimit` | Number | No | 100 | Monthly cost limit ($) |
-| `llmTokenLimit` | Number | No | 1000000 | Monthly token limit |
-| `llmRequestLimit` | Number | No | 10000 | Daily request limit |
-| `llmCacheEnabled` | Boolean | No | true | Enable response caching |
-| `llmCacheTTL` | Number | No | 3600 | Cache duration (seconds) |
+## Prompt Parameters

-## Knowledge Base Parameters
+| Parameter | Description | Default | Type |
+|-----------|-------------|---------|------|
+| `prompt-compact` | Context compaction level | `4` | Number |
+| `prompt-history` | Messages in history | Not set | Number |

-### Vector Database
+## Email Parameters

-| Parameter | Type | Required | Default | Description |
-|-----------|------|----------|---------|-------------|
-| `vectorDbUrl` | String | No | "http://localhost:6333" | Qdrant URL |
-| `vectorDbApiKey` | String | No | Empty | Qdrant API key |
-| `vectorDbCollection` | String | No | Bot name | Default collection |
-| `embeddingModel` | String | No | "text-embedding-ada-002" | Embedding model |
-| `embeddingDimension` | Number | No | 1536 | Vector dimension |
+| Parameter | Description | Default | Type |
+|-----------|-------------|---------|------|
+| `email-from` | Sender address | Required for email | Email |
+| `email-server` | SMTP hostname | Required for email | Hostname |
+| `email-port` | SMTP port | `587` | Number |
+| `email-user` | SMTP username | Required for email | String |
+| `email-pass` | SMTP password | Required for email | String |

-### Search Configuration
+## Theme Parameters

-| Parameter | Type | Required | Default | Description |
-|-----------|------|----------|---------|-------------|
-| `searchTopK` | Number | No | 5 | Results to return |
-| `searchThreshold` | Float | No | 0.7 | Minimum similarity |
-| `searchRerank` | Boolean | No | false | Enable reranking |
-| `chunkSize` | Number | No | 1000 | Text chunk size |
-| `chunkOverlap` | Number | No | 200 | Chunk overlap |
+| Parameter | Description | Default | Type |
+|-----------|-------------|---------|------|
+| `theme-color1` | Primary color | Not set | Hex color |
+| `theme-color2` | Secondary color | Not set | Hex color |
+| `theme-logo` | Logo URL | Not set | URL |
+| `theme-title` | Bot display title | Not set | String |

-## Storage Parameters
+## Custom Database Parameters

-### Object Storage
+| Parameter | Description | Default | Type |
+|-----------|-------------|---------|------|
+| `custom-server` | Database server | `localhost` | Hostname |
+| `custom-port` | Database port | `5432` | Number |
+| `custom-database` | Database name | Not set | String |
+| `custom-username` | Database user | Not set | String |
+| `custom-password` | Database password | Not set | String |

-| Parameter | Type | Required | Default | Description |
-|-----------|------|----------|---------|-------------|
-| `storageProvider` | String | No | "drive" | Storage provider |
-| `storageEndpoint` | String | Yes | "localhost:9000" | S3-compatible drive endpoint |
-| `storageAccessKey` | String | Yes | None | Access key |
-| `storageSecretKey` | String | Yes | None | Secret key |
-| `storageBucket` | String | No | "botserver" | Default bucket |
-| `storageRegion` | String | No | "us-east-1" | AWS region |
-| `storageUseSsl` | Boolean | No | false | Use HTTPS |
+## Parameter Types

-### File Handling
+### Boolean
+Values: `true` or `false` (case-sensitive)

-| Parameter | Type | Required | Default | Description |
-|-----------|------|----------|---------|-------------|
-| `maxFileSize` | Number | No | 10 | Max file size (MB) |
-| `allowedFileTypes` | String | No | "pdf,doc,txt,csv" | Allowed extensions |
-| `fileRetention` | Number | No | 90 | Days to keep files |
-| `autoDeleteTemp` | Boolean | No | true | Auto-delete temp files |
+### Number
+Integer values, must be within valid ranges:
+- Ports: 1-65535
+- Tokens: Positive integers
+- Percentages: 0-100

-## Communication Parameters
+### Float
+Decimal values:
+- Thresholds: 0.0 to 1.0

-### Email Settings
+### Path
+File system paths:
+- Relative: `../../../../data/model.gguf`
+- Absolute: `/opt/models/model.gguf`

-| Parameter | Type | Required | Default | Description |
-|-----------|------|----------|---------|-------------|
-| `emailEnabled` | Boolean | No | false | Enable email |
-| `smtpHost` | String | No* | Empty | SMTP server |
-| `smtpPort` | Number | No | 587 | SMTP port |
-| `smtpUser` | String | No* | Empty | Email username |
-| `smtpPassword` | String | No* | Empty | Email password |
-| `smtpFrom` | String | No* | Empty | From address |
-| `smtpUseTls` | Boolean | No | true | Use TLS |
-| `smtpUseStarttls` | Boolean | No | true | Use STARTTLS |
+### URL
+Valid URLs:
+- HTTP: `http://localhost:8081`
+- HTTPS: `https://api.example.com`

-### Channel Configuration
+### String
+Any text value (no quotes needed in CSV)

-| Parameter | Type | Required | Default | Description |
-|-----------|------|----------|---------|-------------|
-| `webEnabled` | Boolean | No | true | Web interface |
-| `webPort` | Number | No | 8080 | Web port |
-| `whatsappEnabled` | Boolean | No | false | WhatsApp integration |
-| `whatsappToken` | String | No* | Empty | WhatsApp token |
-| `teamsEnabled` | Boolean | No | false | Teams integration |
-| `teamsAppId` | String | No* | Empty | Teams app ID |
-| `slackEnabled` | Boolean | No | false | Slack integration |
-| `slackToken` | String | No* | Empty | Slack token |
+### Email
+Valid email format: `user@domain.com`

-## Security Parameters
+### Hex Color
+HTML color codes: `#RRGGBB` format

-### Authentication
+## Required vs Optional

-| Parameter | Type | Required | Default | Description |
-|-----------|------|----------|---------|-------------|
-| `authRequired` | Boolean | No | false | Require authentication |
-| `authProvider` | String | No | "local" | Auth provider |
-| `jwtSecret` | String | Yes* | Generated | JWT secret |
-| `jwtExpiration` | Number | No | 86400 | Token expiration (s) |
-| `sessionTimeout` | Number | No | 3600 | Session timeout (s) |
-| `maxSessions` | Number | No | 100 | Max concurrent sessions |
+### Always Required
+- None - all parameters have defaults or are optional

-### Access Control
+### Required for Features
+- **LLM**: `llm-model` must be set
+- **Email**: `email-from`, `email-server`, `email-user`
+- **Embeddings**: `embedding-model` for knowledge base
+- **Custom DB**: `custom-database` if using external database

-| Parameter | Type | Required | Default | Description |
-|-----------|------|----------|---------|-------------|
-| `corsOrigins` | String | No | "*" | Allowed origins |
-| `ipWhitelist` | String | No | Empty | Allowed IPs |
-| `ipBlacklist` | String | No | Empty | Blocked IPs |
-| `rateLimitPerMinute` | Number | No | 60 | Requests per minute |
-| `rateLimitPerHour` | Number | No | 1000 | Requests per hour |
-| `requireHttps` | Boolean | No | false | Force HTTPS |
+## Configuration Precedence

-### Data Protection
+1. **Built-in defaults** (hardcoded)
+2. **config.csv values** (override defaults)
+3. **Environment variables** (if implemented, override config)

-| Parameter | Type | Required | Default | Description |
-|-----------|------|----------|---------|-------------|
-| `encryptData` | Boolean | No | true | Encrypt stored data |
-| `encryptionKey` | String | Yes* | Generated | Encryption key |
-| `maskPii` | Boolean | No | true | Mask personal data |
-| `auditLogging` | Boolean | No | true | Enable audit logs |
-| `dataRetention` | Number | No | 365 | Data retention (days) |
+## Special Values

-## Performance Parameters
+- `none` - Explicitly no value (for `llm-key`)
+- Empty string - Unset/use default
+- `false` - Feature disabled
+- `true` - Feature enabled

-### Caching
+## Performance Tuning

-| Parameter | Type | Required | Default | Description |
-|-----------|------|----------|---------|-------------|
-| `cacheEnabled` | Boolean | No | true | Enable caching |
-| `cacheProvider` | String | No | "cache" | Cache provider |
-| `cacheUrl` | String | No | "redis://localhost:6379" | Cache URL |
-| `cacheTtl` | Number | No | 3600 | Default TTL (s) |
-| `cacheMaxSize` | Number | No | 100 | Max cache size (MB) |
-
-### Resource Limits
-
-| Parameter | Type | Required | Default | Description |
-|-----------|------|----------|---------|-------------|
-| `maxCpu` | Number | No | 2 | CPU cores limit |
-| `maxMemory` | Number | No | 2048 | Memory limit (MB) |
-| `maxConnections` | Number | No | 100 | DB connections |
-| `maxWorkers` | Number | No | 4 | Worker threads |
-| `queueSize` | Number | No | 1000 | Task queue size |
-
-## Monitoring Parameters
-
-### Logging
-
-| Parameter | Type | Required | Default | Description |
-|-----------|------|----------|---------|-------------|
-| `logLevel` | String | No | "info" | Log level |
-| `logToFile` | Boolean | No | true | Log to file |
-| `logFilePath` | String | No | "./logs" | Log directory |
-| `logRotation` | String | No | "daily" | Rotation schedule |
-| `logRetention` | Number | No | 30 | Keep logs (days) |
-
-### Metrics
-
-| Parameter | Type | Required | Default | Description |
-|-----------|------|----------|---------|-------------|
-| `metricsEnabled` | Boolean | No | false | Enable metrics |
-| `metricsEndpoint` | String | No | "/metrics" | Metrics endpoint |
-| `sentryDsn` | String | No | Empty | Sentry DSN |
-| `datadogApiKey` | String | No | Empty | Datadog API key |
-| `prometheusPort` | Number | No | 9090 | Prometheus port |
-
-## Feature Flags
-
-### Experimental Features
-
-| Parameter | Type | Required | Default | Description |
-|-----------|------|----------|---------|-------------|
-| `betaFeatures` | Boolean | No | false | Enable beta features |
-| `webAutomation` | Boolean | No | false | Web scraping |
-| `ocrEnabled` | Boolean | No | false | OCR support |
-| `speechEnabled` | Boolean | No | false | Speech I/O |
-| `visionEnabled` | Boolean | No | false | Image analysis |
-| `codeExecution` | Boolean | No | false | Code running |
-
-## Environment-Specific Parameters
-
-### Development
-
-| Parameter | Type | Required | Default | Description |
-|-----------|------|----------|---------|-------------|
-| `devMode` | Boolean | No | false | Development mode |
-| `hotReload` | Boolean | No | false | Hot reload |
-| `mockServices` | Boolean | No | false | Use mock services |
-| `verboseErrors` | Boolean | No | false | Detailed errors |
-
-### Production
-
-| Parameter | Type | Required | Default | Description |
-|-----------|------|----------|---------|-------------|
-| `prodMode` | Boolean | No | true | Production mode |
-| `clustering` | Boolean | No | false | Enable clustering |
-| `loadBalancing` | Boolean | No | false | Load balancing |
-| `autoScale` | Boolean | No | false | Auto-scaling |
-
-## Parameter Validation
-
-Parameters are validated on startup:
-1. Required parameters must be present
-2. Types are checked and coerced
-3. Ranges are enforced
-4. Dependencies verified
-5. Conflicts detected
-
-## Environment Variable Override
-
-Any parameter can be overridden via environment:
-```bash
-BOT_TITLE="My Bot" BOT_LLM_MODEL="gpt-4-turbo" botserver
+### For Local Models
+```csv
+llm-server-ctx-size,8192
+llm-server-n-predict,2048
+llm-server-parallel,4
+llm-cache,true
+llm-cache-ttl,7200
 ```

-## Dynamic Parameter Updates
-
-Some parameters can be updated at runtime:
-- Log level
-- Rate limits
-- Cache settings
-- Feature flags
-
-Use the admin API to update:
-```
-POST /api/admin/config
-{
-  "logLevel": "debug",
-  "rateLimitPerMinute": 120
-}
+### For Production
+```csv
+llm-server-cont-batching,true
+llm-cache-semantic,true
+llm-cache-threshold,0.90
+llm-server-parallel,8
 ```

-## Best Practices
+### For Low Memory
+```csv
+llm-server-ctx-size,2048
+llm-server-n-predict,512
+llm-server-mlock,false
+llm-server-no-mmap,false
+llm-cache,false
+```

-1. **Start with defaults**: Most parameters have sensible defaults
-2. **Override only what's needed**: Don't set everything
-3. **Use environment variables**: For sensitive values
-4. **Document custom values**: Explain why changed
-5. **Test configuration**: Validate before production
-6. **Monitor performance**: Adjust based on metrics
-7. **Version control**: Track configuration changes
\ No newline at end of file
+## Validation Rules
+
+1. **Paths**: Model files must exist
+2. **URLs**: Must be valid format
+3. **Ports**: Must be 1-65535
+4. **Emails**: Must contain @ and domain
+5. **Colors**: Must be valid hex format
+6. **Booleans**: Exactly `true` or `false`
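
The validation rules listed in the new `parameters.md` could be checked along the following lines. This is only an illustrative Python sketch, not botserver's actual validator: the function names, the `CHECKS` table, and the choice of which parameters map to which rule are assumptions made for the example. Only the parameter names and the rules themselves come from the document.

```python
import re
from pathlib import Path
from urllib.parse import urlparse

# "#RRGGBB" format, as described under "Hex Color".
HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")
# "Must contain @ and domain": non-empty text on both sides of a single @.
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+$")


def is_valid_port(value: str) -> bool:
    """Ports must be integers in the range 1-65535."""
    return value.isascii() and value.isdigit() and 1 <= int(value) <= 65535


def is_valid_url(value: str) -> bool:
    """Accept http/https URLs that include a host part."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def is_valid_email(value: str) -> bool:
    return EMAIL.match(value) is not None


def is_valid_color(value: str) -> bool:
    return HEX_COLOR.match(value) is not None


def is_valid_bool(value: str) -> bool:
    """Booleans are exactly `true` or `false` (case-sensitive)."""
    return value in ("true", "false")


def is_valid_path(value: str) -> bool:
    """Model files must exist on disk."""
    return Path(value).is_file()


# Hypothetical mapping from parameter name to rule, for illustration only.
CHECKS = {
    "llm-url": is_valid_url,
    "embedding-url": is_valid_url,
    "llm-server-port": is_valid_port,
    "email-port": is_valid_port,
    "custom-port": is_valid_port,
    "email-from": is_valid_email,
    "theme-color1": is_valid_color,
    "theme-color2": is_valid_color,
    "llm-cache": is_valid_bool,
    "llm-server": is_valid_bool,
    "llm-model": is_valid_path,
    "embedding-model": is_valid_path,
}


def validate(config: dict[str, str]) -> list[str]:
    """Return one error message per parameter that fails its rule."""
    errors = []
    for name, value in config.items():
        check = CHECKS.get(name)
        if check is not None and not check(value):
            errors.append(f"{name}: invalid value {value!r}")
    return errors
```

Calling `validate` on the key/value pairs read from `config.csv` would return an empty list when every checked parameter passes. Parameters without an entry in `CHECKS` are skipped here, which matches the document's statement that `String` accepts any text value.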