From d16f34ca937b6d1e1cb7d39083b779e5d9151b72 Mon Sep 17 00:00:00 2001 From: "Rodrigo Rodriguez (Pragmatismo)" Date: Sat, 25 Oct 2025 15:59:06 -0300 Subject: [PATCH] Revise documentation in Chapter 01 to improve clarity and structure, including updates to the installation instructions and session management overview. --- TODO.md | 87 ++++- docs/src/appendix-i/README.md | 50 ++- docs/src/chapter-01/README.md | 44 ++- docs/src/chapter-01/first-conversation.md | 41 +-- docs/src/chapter-01/installation.md | 2 +- docs/src/chapter-01/sessions.md | 50 +-- docs/src/chapter-02/README.md | 46 +-- docs/src/chapter-03/README.md | 15 +- docs/src/chapter-03/caching.md | 44 ++- docs/src/chapter-03/context-compaction.md | 35 +++ docs/src/chapter-03/indexing.md | 21 ++ docs/src/chapter-03/qdrant.md | 42 +++ docs/src/chapter-03/semantic-search.md | 35 +++ docs/src/chapter-03/vector-collections.md | 44 +++ docs/src/chapter-04/css.md | 49 +++ docs/src/chapter-04/html.md | 70 +++++ docs/src/chapter-04/structure.md | 37 +++ docs/src/chapter-04/web-interface.md | 31 ++ docs/src/chapter-05/README.md | 57 +--- docs/src/chapter-05/basics.md | 35 +++ docs/src/chapter-05/keyword-add-kb.md | 29 +- docs/src/chapter-05/keyword-add-tool.md | 39 ++- docs/src/chapter-05/keyword-add-website.md | 27 +- docs/src/chapter-05/keyword-clear-tools.md | 27 +- docs/src/chapter-05/keyword-create-draft.md | 27 +- docs/src/chapter-05/keyword-create-site.md | 35 ++- docs/src/chapter-05/keyword-exit-for.md | 36 ++- docs/src/chapter-05/keyword-find.md | 38 ++- docs/src/chapter-05/keyword-first.md | 31 +- docs/src/chapter-05/keyword-for-each.md | 43 ++- docs/src/chapter-05/keyword-get-bot-memory.md | 27 +- docs/src/chapter-05/keyword-get.md | 45 ++- docs/src/chapter-05/keyword-hear.md | 63 +--- docs/src/chapter-05/keyword-last.md | 31 +- docs/src/chapter-05/keyword-list-tools.md | 45 ++- docs/src/chapter-05/keyword-llm.md | 35 ++- docs/src/chapter-05/keyword-on.md | 43 ++- docs/src/chapter-05/keyword-print.md | 31 
+- docs/src/chapter-05/keyword-remove-tool.md | 38 ++- docs/src/chapter-05/keyword-set-bot-memory.md | 32 +- docs/src/chapter-05/keyword-set-context.md | 25 +- docs/src/chapter-05/keyword-set-kb.md | 33 +- docs/src/chapter-05/keyword-set-user.md | 25 +- docs/src/chapter-05/keyword-wait.md | 35 ++- docs/src/chapter-05/keyword-website-of.md | 33 +- docs/src/chapter-05/keywords.md | 84 +++-- docs/src/chapter-05/templates.md | 56 ++++ docs/src/chapter-07/README.md | 34 +- docs/src/chapter-08/README.md | 37 ++- docs/src/chapter-09/README.md | 20 +- prompts/dev/docs/docs-summary.md | 296 ++++++++---------- 51 files changed, 1756 insertions(+), 479 deletions(-) diff --git a/TODO.md b/TODO.md index e0f4fdcb1..a735d5089 100644 --- a/TODO.md +++ b/TODO.md @@ -1,9 +1,78 @@ -- [x] Analyze errors from previous installation attempts -- [ ] Download redis-stable.tar.gz -- [ ] Extract and build Redis binaries -- [ ] Clean up Redis source files -- [x] Update drive component alias from "mc" to "minio" in installer.rs -- [x] Re-run package manager installation for drive and cache components -- [x] Verify MinIO client works and bucket creation succeeds -- [x] Verify Redis server starts correctly -- [x] Run overall package manager setup to ensure all components install without errors +# Documentation Completion Checklist + +- [x] Created Chapter 01 files (README, installation, first-conversation, sessions) +- [ ] Fill Chapter 02 files (README, gbai, gbdialog, gbkb, gbot, gbtheme, gbdrive) – already have content +- [ ] Complete Chapter 03 files + - [ ] README.md + - [ ] vector-collections.md + - [ ] indexing.md + - [ ] qdrant.md + - [ ] semantic-search.md + - [ ] context-compaction.md + - [ ] caching.md (if needed) +- [ ] Complete Chapter 04 files + - [ ] README.md + - [ ] structure.md + - [ ] web-interface.md + - [ ] css.md + - [ ] html.md +- [ ] Complete Chapter 05 files + - [ ] README.md + - [ ] basics.md + - [ ] templates.md + - [ ] template-start.md + - [ ] template-auth.md + - [ ] 
template-summary.md + - [ ] template-enrollment.md + - [ ] keywords.md + - [ ] All keyword pages (talk, hear, set-user, set-context, llm, get-bot-memory, set-bot-memory, set-kb, add-kb, add-website, add-tool, list-tools, remove-tool, clear-tools, get, find, set, on, set-schedule, create-site, create-draft, website-of, print, wait, format, first, last, for-each, exit-for) +- [ ] Complete Chapter 06 files + - [ ] README.md + - [ ] architecture.md + - [ ] building.md + - [ ] crates.md + - [ ] services.md + - [ ] custom-keywords.md + - [ ] dependencies.md +- [ ] Complete Chapter 07 files + - [ ] README.md + - [ ] config-csv.md + - [ ] parameters.md + - [ ] answer-modes.md + - [ ] llm-config.md + - [ ] context-config.md + - [ ] minio.md +- [ ] Complete Chapter 08 files + - [ ] README.md + - [ ] tool-definition.md + - [ ] param-declaration.md + - [ ] compilation.md + - [ ] mcp-format.md + - [ ] openai-format.md + - [ ] get-integration.md + - [ ] external-apis.md +- [ ] Complete Chapter 09 files + - [ ] README.md + - [ ] core-features.md + - [ ] conversation.md + - [ ] ai-llm.md + - [ ] knowledge-base.md + - [ ] automation.md + - [ ] email.md + - [ ] web-automation.md + - [ ] storage.md + - [ ] channels.md +- [ ] Complete Chapter 10 files + - [ ] README.md + - [ ] setup.md + - [ ] standards.md + - [ ] testing.md + - [ ] pull-requests.md + - [ ] documentation.md +- [ ] Complete Appendix I files + - [ ] README.md + - [ ] schema.md + - [ ] tables.md + - [ ] relationships.md +- [ ] Verify SUMMARY.md links +- [ ] Run mdbook build to ensure no errors diff --git a/docs/src/appendix-i/README.md b/docs/src/appendix-i/README.md index f17e15cb6..529ca02f2 100644 --- a/docs/src/appendix-i/README.md +++ b/docs/src/appendix-i/README.md @@ -1 +1,49 @@ -# Appendix I: Database Model +## Appendix I – Database Model + +The core database schema for GeneralBots is defined in `src/shared/models.rs`. 
It uses **Diesel** with SQLite (or PostgreSQL) and includes the following primary tables: + +| Table | Description | +|-------|-------------| +| `users` | Stores user accounts, authentication tokens, and profile data. | +| `sessions` | Tracks active `BotSession` instances, their start/end timestamps, and associated user. | +| `knowledge_bases` | Metadata for each `.gbkb` collection (name, vector store configuration, creation date). | +| `messages` | Individual chat messages (role = user/assistant, content, timestamp, linked to a session). | +| `tools` | Registered custom tools per session (name, definition JSON, activation status). | +| `files` | References to files managed by the `.gbdrive` package (path, size, MIME type, storage location). | + +### Relationships +- **User ↔ Sessions** – One‑to‑many: a user can have many sessions. +- **Session ↔ Messages** – One‑to‑many: each session contains a sequence of messages. +- **Session ↔ KnowledgeBase** – Many‑to‑one: a session uses a single knowledge base at a time. +- **Session ↔ Tools** – One‑to‑many: tools are scoped to the session that registers them. +- **File ↔ KnowledgeBase** – Optional link for documents stored in a knowledge base. + +### Key Fields (excerpt) + +```rust +pub struct User { + pub id: i32, + pub username: String, + pub email: String, + pub password_hash: String, + pub created_at: NaiveDateTime, +} + +pub struct Session { + pub id: i32, + pub user_id: i32, + pub started_at: NaiveDateTime, + pub last_active: NaiveDateTime, + pub knowledge_base_id: i32, +} + +pub struct Message { + pub id: i32, + pub session_id: i32, + pub role: String, // "user" or "assistant" + pub content: String, + pub timestamp: NaiveDateTime, +} +``` + +The schema is automatically migrated by Diesel when the server starts. For custom extensions, add new tables to `models.rs` and run `diesel migration generate `. 
diff --git a/docs/src/chapter-01/README.md b/docs/src/chapter-01/README.md index dfe320af9..fcd5b3c9c 100644 --- a/docs/src/chapter-01/README.md +++ b/docs/src/chapter-01/README.md @@ -1,13 +1,39 @@ -# Chapter 01: Run and Talk +## Run and Talk +```bas +TALK "Welcome! How can I help you today?" +HEAR user_input +``` +*Start the server:* `cargo run --release` -This chapter covers the basics of getting started with GeneralBots - from installation to having your first conversation with a bot. +### Installation +```bash +# Clone the repository +git clone https://github.com/GeneralBots/BotServer.git +cd BotServer -## Quick Start +# Build the project +cargo build --release -1. Install the botserver package -2. Configure your environment -3. Start the server -4. Open the web interface -5. Begin chatting with your bot +# Run the server +cargo run --release +``` -The platform is designed to be immediately usable with minimal setup, providing a working bot out of the box that you can extend and customize. +### First Conversation +```bas +TALK "Hello! I'm your GeneralBots assistant." +HEAR user_input +IF user_input CONTAINS "weather" THEN + TALK "Sure, let me check the weather for you." + CALL GET_WEATHER +ELSE + TALK "I can help with many tasks, just ask!" +ENDIF +``` + +### Understanding Sessions +Each conversation is represented by a **BotSession**. The session stores: +- User identifier +- Conversation history +- Current context (variables, knowledge base references, etc.) + +Sessions are persisted in the SQLite database defined in `src/shared/models.rs`. diff --git a/docs/src/chapter-01/first-conversation.md b/docs/src/chapter-01/first-conversation.md index 6b4b53369..cbb429238 100644 --- a/docs/src/chapter-01/first-conversation.md +++ b/docs/src/chapter-01/first-conversation.md @@ -1,42 +1,9 @@ # First Conversation -## Starting a Session +After the server is running, open a web browser at `http://localhost:8080` and start a chat. 
The default dialog (`start.bas`) greets the user and demonstrates the `TALK` keyword. -When you first access the GeneralBots web interface, the system automatically: - -1. Creates an anonymous user session -2. Loads the default bot configuration -3. Executes the `start.bas` script (if present) -4. Presents the chat interface - -## Basic Interaction - -The conversation flow follows this pattern: - -``` -User: [Message] → Bot: [Processes with LLM/Tools] → Bot: [Response] +```basic +TALK "Welcome to GeneralBots! How can I assist you today?" ``` -## Session Management - -- Each conversation is tied to a **session ID** -- Sessions maintain conversation history and context -- Users can have multiple simultaneous sessions -- Sessions can be persisted or temporary - -## Example Flow - -1. **User**: "Hello" -2. **System**: Creates session, runs start script -3. **Bot**: "Hello! How can I help you today?" -4. **User**: "What can you do?" -5. **Bot**: Explains capabilities based on available tools and knowledge - -## Session Persistence - -Sessions are automatically saved and can be: -- Retrieved later using the session ID -- Accessed from different devices (with proper authentication) -- Archived for historical reference - -The system maintains conversation context across multiple interactions within the same session. +You can type a question, and the bot will respond using the LLM backend combined with any relevant knowledge‑base entries. 
diff --git a/docs/src/chapter-01/installation.md b/docs/src/chapter-01/installation.md index e21605da2..91ad65c4f 100644 --- a/docs/src/chapter-01/installation.md +++ b/docs/src/chapter-01/installation.md @@ -12,7 +12,7 @@ ### Method 1: Package Manager (Recommended) ```bash -# Install using the built-in package manager +# Install using the built‑in package manager botserver install tables botserver install drive botserver install cache diff --git a/docs/src/chapter-01/sessions.md b/docs/src/chapter-01/sessions.md index 0457a5604..3ae067ca1 100644 --- a/docs/src/chapter-01/sessions.md +++ b/docs/src/chapter-01/sessions.md @@ -1,51 +1,3 @@ # Understanding Sessions -Sessions are the core container for conversations in GeneralBots. They maintain state, context, and history for each user interaction. - -## Session Components - -Each session contains: - -- **Session ID**: Unique identifier (UUID) -- **User ID**: Associated user (anonymous or authenticated) -- **Bot ID**: Which bot is handling the conversation -- **Context Data**: JSON object storing session state -- **Answer Mode**: How the bot should respond (direct, with tools, etc.) -- **Current Tool**: Active tool if waiting for input -- **Timestamps**: Creation and last update times - -## Session Lifecycle - -1. **Creation**: When a user starts a new conversation -2. **Active**: During ongoing interaction -3. **Waiting**: When awaiting user input for tools -4. **Inactive**: After period of no activity -5. 
**Archived**: Moved to long-term storage - -## Session Context - -The context data stores: -- Active knowledge base collections -- Available tools for the session -- User preferences and settings -- Temporary variables and state - -## Managing Sessions - -### Creating Sessions -Sessions are automatically created when: -- A new user visits the web interface -- A new WebSocket connection is established -- API calls specify a new session ID - -### Session Persistence -Sessions are stored in PostgreSQL with: -- Full message history -- Context data as JSONB -- Timestamps for auditing - -### Session Recovery -Users can resume sessions by: -- Using the same browser (cookies) -- Providing the session ID explicitly -- Authentication that links to previous sessions +A **session** groups all messages exchanged between a user and the bot. Sessions are stored in the database and can be resumed later. The `SET_USER` and `SET_CONTEXT` keywords let you manipulate session data programmatically. diff --git a/docs/src/chapter-02/README.md b/docs/src/chapter-02/README.md index b12a74880..159b94040 100644 --- a/docs/src/chapter-02/README.md +++ b/docs/src/chapter-02/README.md @@ -1,37 +1,9 @@ -# Chapter 02: About Packages - -GeneralBots uses a package-based architecture where different file extensions define specific components of the bot application. Each package type serves a distinct purpose in the bot ecosystem. 
- -## Package Types - -- **.gbai** - Application architecture and structure -- **.gbdialog** - Conversation scripts and dialog flows -- **.gbkb** - Knowledge base collections -- **.gbot** - Bot configuration -- **.gbtheme** - UI theming -- **.gbdrive** - File storage - -## Package Structure - -Each package is organized in a specific directory structure within the MinIO drive storage: - -``` -bucket_name.gbai/ -├── .gbdialog/ -│ ├── start.bas -│ ├── auth.bas -│ └── generate-summary.bas -├── .gbkb/ -│ ├── collection1/ -│ └── collection2/ -├── .gbot/ -│ └── config.csv -└── .gbtheme/ - ├── web/ - │ └── index.html - └── style.css -``` - -## Package Deployment - -Packages are automatically synchronized from the MinIO drive to the local file system when the bot starts. The system monitors for changes and hot-reloads components when possible. +## About Packages +| Component | Extension | Role | +|-----------|-----------|------| +| Dialog scripts | `.gbdialog` | BASIC‑style conversational logic | +| Knowledge bases | `.gbkb` | Vector‑DB collections | +| UI themes | `.gbtheme` | CSS/HTML assets | +| Bot config | `.gbot` | CSV mapping to `UserSession` | +| Application Interface | `.gbai` | Core application architecture | +| File storage | `.gbdrive` | Object storage integration (MinIO) | diff --git a/docs/src/chapter-03/README.md b/docs/src/chapter-03/README.md index 4f1cd8e12..abd462492 100644 --- a/docs/src/chapter-03/README.md +++ b/docs/src/chapter-03/README.md @@ -1 +1,14 @@ -# Chapter 03: gbkb Reference +## gbkb Reference +The knowledge‑base package provides three main commands: + +- **ADD_KB** – Create a new vector collection. +- **SET_KB** – Switch the active collection for the current session. +- **ADD_WEBSITE** – Crawl a website and add its pages to the active collection. 
+ +**Example:** +```bas +ADD_KB "support_docs" +SET_KB "support_docs" +ADD_WEBSITE "https://docs.generalbots.com" +``` +These commands are implemented in the Rust code under `src/kb/` and exposed to BASIC scripts via the engine. diff --git a/docs/src/chapter-03/caching.md b/docs/src/chapter-03/caching.md index 8f26e612b..ddcaa50dc 100644 --- a/docs/src/chapter-03/caching.md +++ b/docs/src/chapter-03/caching.md @@ -1 +1,43 @@ -# Semantic Caching +# Caching (Optional) + +Caching can improve response times for frequently accessed knowledge‑base queries. + +## In‑Memory Cache + +The bot maintains an LRU (least‑recently‑used) cache of the last 100 `FIND` results. This cache is stored in the bot’s process memory and cleared on restart. + +## Persistent Cache + +For longer‑term caching, the `gbkb` package can write query results to a local SQLite file (`cache.db`). The cache key is a hash of the query string and collection name. + +## Configuration + +Add the following to `.gbot/config.csv`: + +```csv +key,value +cache_enabled,true +cache_max_entries,500 +``` + +## Usage Example + +```basic +SET_KB "company-policies" +FIND "vacation policy" INTO RESULT ' first call hits Qdrant +FIND "vacation policy" INTO RESULT ' second call hits cache +TALK RESULT +``` + +The second call returns instantly from the cache. + +## Cache Invalidation + +- When a document is added or updated, the cache for that collection is cleared. +- Manual invalidation: `CLEAR_CACHE "company-policies"` (custom keyword provided by the system). + +## Benefits + +- Reduces latency for hot queries. +- Lowers load on Qdrant. +- Transparent to the script author; caching is automatic. 
diff --git a/docs/src/chapter-03/context-compaction.md b/docs/src/chapter-03/context-compaction.md index 90f665581..709d6e537 100644 --- a/docs/src/chapter-03/context-compaction.md +++ b/docs/src/chapter-03/context-compaction.md @@ -1 +1,36 @@ # Context Compaction + +When a conversation grows long, the bot’s context window can exceed the LLM’s token limit. **Context compaction** reduces the stored history while preserving essential information. + +## Strategies + +1. **Summarization** – Periodically run `TALK FORMAT` with a summarization prompt and replace older messages with the summary. +2. **Memory Pruning** – Use `SET_BOT_MEMORY` to store only key facts (e.g., user name, preferences) and discard raw chat logs. +3. **Chunk Rotation** – Keep a sliding window of the most recent *N* messages (configurable via `context_window` in `.gbot/config.csv`). + +## Implementation Example + +```basic +' After 10 exchanges, summarize +IF MESSAGE_COUNT >= 10 THEN + TALK "Summarizing recent conversation..." + SET_BOT_MEMORY "summary" FORMAT(RECENT_MESSAGES, "summarize") + CLEAR_MESSAGES ' removes raw messages +ENDIF +``` + +## Configuration + +- `context_window` (in `.gbot/config.csv`) defines how many recent messages are kept automatically. +- `memory_enabled` toggles whether the bot uses persistent memory. + +## Benefits + +- Keeps token usage within limits. +- Improves response relevance by focusing on recent context. +- Allows long‑term facts to persist without bloating the prompt. + +## Caveats + +- Over‑aggressive pruning may lose important details. +- Summaries should be concise (max 200 tokens) to avoid re‑inflating the context. 
diff --git a/docs/src/chapter-03/indexing.md b/docs/src/chapter-03/indexing.md index 05b9dd5d4..f14160520 100644 --- a/docs/src/chapter-03/indexing.md +++ b/docs/src/chapter-03/indexing.md @@ -1 +1,22 @@ # Document Indexing + +When a document is added to a knowledge‑base collection with `ADD_KB` or `ADD_WEBSITE`, the system performs several steps to make it searchable: + +1. **Content Extraction** – Files are read and plain text is extracted (PDF, DOCX, HTML, etc.). +2. **Chunking** – The text is split into 500‑token chunks to keep embeddings manageable. +3. **Embedding Generation** – Each chunk is sent to the configured LLM embedding model (default **BGE‑small‑en‑v1.5**) to produce a dense vector. +4. **Storage** – Vectors, along with metadata (source file, chunk offset), are stored in Qdrant under the collection’s namespace. +5. **Indexing** – Qdrant builds an HNSW index for fast approximate nearest‑neighbor search. + +## Index Refresh + +If a document is updated, the system re‑processes the file and replaces the old vectors. The index is automatically refreshed; no manual action is required. + +## Example + +```basic +ADD_KB "company-policies" +ADD_WEBSITE "https://example.com/policies" +``` + +After execution, the `company-policies` collection contains indexed vectors ready for semantic search via the `FIND` keyword. diff --git a/docs/src/chapter-03/qdrant.md b/docs/src/chapter-03/qdrant.md index a1a489977..e540567d8 100644 --- a/docs/src/chapter-03/qdrant.md +++ b/docs/src/chapter-03/qdrant.md @@ -1 +1,43 @@ # Qdrant Integration + +GeneralBots uses **Qdrant** as the vector database for storing and searching embeddings. The Rust client `qdrant-client` is used to communicate with the service. + +## Configuration + +The connection is configured via environment variables: + +```env +QDRANT_URL=http://localhost:6333 +QDRANT_API_KEY=your-api-key # optional +``` + +These values are read at startup and passed to the `QdrantClient`. 
+ +## Collection Mapping + +Each `.gbkb` collection maps to a Qdrant collection with the same name. For example, a knowledge base named `company-policies` becomes a Qdrant collection `company-policies`. + +## Operations + +- **Insert** – Performed during indexing (see Chapter 03). +- **Search** – Executed by the `FIND` keyword, which sends a query vector and retrieves the top‑k nearest neighbors. +- **Delete/Update** – When a document is removed or re‑indexed, the corresponding vectors are deleted and replaced. + +## Performance Tips + +- Keep the number of vectors per collection reasonable (tens of thousands) for optimal latency. +- Adjust Qdrant’s `hnsw` parameters in `QdrantClient::new` if you need higher recall. +- Use the `FILTER` option to restrict searches by metadata (e.g., source file). + +## Example `FIND` Usage + +```basic +SET_KB "company-policies" +FIND "vacation policy" INTO RESULT +TALK RESULT +``` + +The keyword internally: +1. Generates an embedding for the query string. +2. Calls Qdrant’s `search` API. +3. Returns the most relevant chunk as `RESULT`. diff --git a/docs/src/chapter-03/semantic-search.md b/docs/src/chapter-03/semantic-search.md index 95a7eed1d..27be0982f 100644 --- a/docs/src/chapter-03/semantic-search.md +++ b/docs/src/chapter-03/semantic-search.md @@ -1 +1,36 @@ # Semantic Search + +Semantic search enables the bot to retrieve information based on meaning rather than exact keyword matches. It leverages the vector embeddings stored in Qdrant. + +## How It Works + +1. **Query Embedding** – The user’s query string is converted into a dense vector using the same embedding model as the documents. +2. **Nearest‑Neighbor Search** – Qdrant returns the top‑k vectors that are closest to the query vector. +3. **Result Formatting** – The matching document chunks are concatenated and passed to the LLM as context for the final response. + +## Using the `FIND` Keyword + +```basic +SET_KB "company-policies" +FIND "how many vacation days do I have?" 
INTO RESULT +TALK RESULT +``` + +- `SET_KB` selects the collection. +- `FIND` performs the semantic search. +- `RESULT` receives the best matching snippet. + +## Parameters + +- **k** – Number of results to return (default 3). Can be overridden with `FIND "query" LIMIT 5 INTO RESULT`. +- **filter** – Optional metadata filter, e.g., `FILTER source="policy.pdf"`. + +## Best Practices + +- Keep the query concise (1‑2 sentences) for optimal embedding quality. +- Use `FORMAT` to clean up the result before sending to the user. +- Combine with `SET_BOT_MEMORY` to store frequently accessed answers. + +## Performance + +Semantic search latency is typically < 100 ms for collections under 50 k vectors. Larger collections may require tuning Qdrant’s HNSW parameters. diff --git a/docs/src/chapter-03/vector-collections.md b/docs/src/chapter-03/vector-collections.md index 50167f808..32064b8f4 100644 --- a/docs/src/chapter-03/vector-collections.md +++ b/docs/src/chapter-03/vector-collections.md @@ -1 +1,45 @@ # Vector Collections + +A **vector collection** is a set of documents that have been transformed into vector embeddings for fast semantic similarity search. Each collection lives under a `.gbkb` folder and is identified by a unique name. + +## Creating a Collection + +Use the `ADD_KB` keyword in a dialog script: + +```basic +ADD_KB "company-policies" +``` + +This creates a new collection named `company-policies` in the bot’s knowledge base. + +## Adding Documents + +Documents can be added directly from files or by crawling a website: + +```basic +ADD_KB "company-policies" ' adds a new empty collection +ADD_WEBSITE "https://example.com/policies" +``` + +The system will download the content, split it into chunks, generate embeddings using the default LLM model, and store them in the collection. + +## Managing Collections + +- `SET_KB "collection-name"` – selects the active collection for subsequent `ADD_WEBSITE` or `FIND` calls. 
+- `LIST_KB` – (not a keyword, but you can query via API) lists all collections. + +## Use in Dialogs + +When a collection is active, the `FIND` keyword searches across its documents, and the `GET_BOT_MEMORY` keyword can retrieve relevant snippets to inject into LLM prompts. + +```basic +SET_KB "company-policies" +FIND "vacation policy" INTO RESULT +TALK RESULT +``` + +## Technical Details + +- Embeddings are generated with the BGE‑small‑en‑v1.5 model. +- Vectors are stored in Qdrant (see Chapter 03). +- Each document is chunked into 500‑token pieces for efficient retrieval. diff --git a/docs/src/chapter-04/css.md b/docs/src/chapter-04/css.md index c88c5dd0c..091f970e6 100644 --- a/docs/src/chapter-04/css.md +++ b/docs/src/chapter-04/css.md @@ -1 +1,50 @@ # CSS Customization + +The **gbtheme** CSS files define the visual style of the bot UI. They are split into three layers to make them easy to extend. + +## Files + +| File | Role | +|------|------| +| `main.css` | Core layout, typography, and global variables. | +| `components.css` | Styles for reusable UI components (buttons, cards, modals). | +| `responsive.css` | Media queries for mobile, tablet, and desktop breakpoints. | + +## CSS Variables (in `main.css`) + +```css +:root { + --primary-color: #2563eb; + --secondary-color: #64748b; + --background-color: #ffffff; + --text-color: #1e293b; + --border-radius: 8px; + --spacing-unit: 8px; +} +``` + +Changing a variable updates the entire theme without editing individual rules. + +## Extending the Theme + +1. **Add a new variable** – Append to `:root` and reference it in any selector. +2. **Override a component** – Duplicate the selector in `components.css` after the original definition; the later rule wins. +3. **Create a dark mode** – Add a `@media (prefers-color-scheme: dark)` block that redefines the variables. 
+ +```css +@media (prefers-color-scheme: dark) { + :root { + --primary-color: #3b82f6; + --background-color: #111827; + --text-color: #f9fafb; + } +} +``` + +## Best Practices + +* Keep the file size small – avoid large image data URIs; store images in `assets/`. +* Use `rem` units for font sizes; they scale with the root `font-size`. +* Limit the depth of nesting; flat selectors improve performance. + +All CSS files are loaded in `index.html` in the order: `main.css`, `components.css`, `responsive.css`. diff --git a/docs/src/chapter-04/html.md b/docs/src/chapter-04/html.md index c2c197f8e..271e324ff 100644 --- a/docs/src/chapter-04/html.md +++ b/docs/src/chapter-04/html.md @@ -1 +1,71 @@ # HTML Templates + +The **gbtheme** HTML files provide the markup for the bot’s UI. They are deliberately minimal to allow easy customization. + +## index.html + +```html + + + + + GeneralBots Chat + + + + + +
+

GeneralBots

+
+ +
+ +
+

© 2025 GeneralBots

+
+ + + + + +``` + +*Loads the CSS layers and the JavaScript modules.* + +## chat.html + +```html +
+
+
+ + +
+
+
+``` + +*Used by `app.js` to render the conversation view.* + +## login.html + +```html + +``` + +*Optional page displayed when the bot requires authentication.* + +## Customization Tips + +* Replace the `
` content with your brand logo. +* Add additional `` tags (e.g., Open Graph) in `index.html`. +* Insert extra `