Update documentation to reflect transition from Qdrant to VectorDB, including caching, indexing, and semantic search sections. Add comprehensive overview for Chapter 03.
parent a50cce7f27
commit 8e775cdacb
7 changed files with 61 additions and 16 deletions
22
docs/src/chapter-02/summary.md
Normal file
@@ -0,0 +1,22 @@
+# Chapter 02 – Package Documentation Overview
+
+This chapter provides a concise overview of the GeneralBots package types introduced in Chapter 02. Each package type is documented in its own markdown file. Below is a quick reference with brief descriptions and links to the full documentation.
+
+| Package | File | Description |
+|---------|------|-------------|
+| **.gbai** | [gbai.md](gbai.md) | Defines the overall application architecture, metadata, and package hierarchy. |
+| **.gbdialog** | [gbdialog.md](gbdialog.md) | Contains BASIC‑style dialog scripts that drive conversation flow and tool integration. |
+| **.gbdrive** | [gbdrive.md](gbdrive.md) | Manages file storage and retrieval via MinIO (or other S3‑compatible backends). |
+| **.gbkb** | [gbkb.md](gbkb.md) | Handles knowledge‑base collections, vector embeddings, and semantic search. |
+| **.gbot** | [gbot.md](gbot.md) | Stores bot configuration (CSV) for identity, LLM settings, answer modes, and runtime parameters. |
+| **.gbtheme** | [gbtheme.md](gbtheme.md) | Provides UI theming assets: CSS, HTML templates, JavaScript, and static resources. |
+
+## How to Use This Overview
+
+- **Navigate**: Click the file links above to read the detailed documentation for each package.
+- **Reference**: Use this table as a quick lookup when developing or extending a GeneralBots application.
+- **Extend**: When adding new package types, update this table and create a corresponding markdown file.
+
+---
+
+*This summary was added to fill the missing documentation for Chapter 02.*
@@ -24,7 +24,7 @@ cache_max_entries,500
 
 ```basic
 SET_KB "company-policies"
-FIND "vacation policy" INTO RESULT ' first call hits Qdrant
+FIND "vacation policy" INTO RESULT ' first call hits VectorDB
 FIND "vacation policy" INTO RESULT ' second call hits cache
 TALK RESULT
 ```
@@ -39,5 +39,5 @@ The second call returns instantly from the cache.
 ## Benefits
 
 - Reduces latency for hot queries.
 - Lowers load on Qdrant.
+- Lowers load on VectorDB.
 - Transparent to the script author; caching is automatic.
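
The caching behaviour described in the hunks above (first `FIND` reaches VectorDB, the repeat is served from the cache) can be illustrated with a small sketch. This is not the GeneralBots implementation: `FindCache`, `fake_vectordb`, and the least-recently-used eviction policy are assumptions made for illustration. Only the `cache_max_entries` value of 500 comes from the diff context.

```python
from collections import OrderedDict


class FindCache:
    """Hypothetical LRU cache keyed by (collection, query)."""

    def __init__(self, backend, max_entries=500):
        self.backend = backend          # callable(collection, query); stands in for VectorDB
        self.max_entries = max_entries  # mirrors the cache_max_entries setting
        self._entries = OrderedDict()
        self.backend_calls = 0          # counts how often the backend was actually queried

    def find(self, collection, query):
        key = (collection, query)
        if key in self._entries:
            self._entries.move_to_end(key)     # mark as most recently used
            return self._entries[key]
        self.backend_calls += 1
        result = self.backend(collection, query)
        self._entries[key] = result
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return result


def fake_vectordb(collection, query):
    """Placeholder backend; a real one would run a vector search."""
    return f"[{collection}] best chunk for '{query}'"


cache = FindCache(fake_vectordb, max_entries=500)
first = cache.find("company-policies", "vacation policy")   # reaches the backend
second = cache.find("company-policies", "vacation policy")  # served from the cache
```

Keying on both the collection and the query string keeps results from different knowledge bases apart, which matters once a script switches collections with `SET_KB`.
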
@@ -5,8 +5,8 @@ When a document is added to a knowledge‑base collection with `ADD_KB` or `ADD_
 1. **Content Extraction** – Files are read and plain‑text is extracted (PDF, DOCX, HTML, etc.).
 2. **Chunking** – The text is split into 500‑token chunks to keep embeddings manageable.
 3. **Embedding Generation** – Each chunk is sent to the configured LLM embedding model (default **BGE‑small‑en‑v1.5**) to produce a dense vector.
-4. **Storage** – Vectors, along with metadata (source file, chunk offset), are stored in Qdrant under the collection’s namespace.
-5. **Indexing** – Qdrant builds an IVF‑PQ index for fast approximate nearest‑neighbor search.
+4. **Storage** – Vectors, along with metadata (source file, chunk offset), are stored in VectorDB under the collection’s namespace.
+5. **Indexing** – VectorDB builds an IVF‑PQ index for fast approximate nearest‑neighbor search.
 
 ## Index Refresh
 
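
The chunking, embedding, and storage steps listed in the hunk above can be sketched roughly as follows. Everything here is assumed for illustration: whitespace-separated words stand in for real tokens, `toy_embed` is a placeholder rather than BGE‑small‑en‑v1.5, and a plain dictionary stands in for VectorDB. Only the 500‑token chunk size and the stored metadata (source file, chunk offset) come from the documentation.

```python
def chunk_tokens(text, chunk_size=500):
    """Split text into (offset, chunk) pairs of at most chunk_size tokens."""
    tokens = text.split()  # whitespace tokens: a stand-in for the real tokenizer
    return [
        (start, " ".join(tokens[start:start + chunk_size]))
        for start in range(0, len(tokens), chunk_size)
    ]


def toy_embed(chunk):
    """Placeholder embedding; a real model returns a dense semantic vector."""
    words = chunk.split()
    return [float(len(words)), float(len(chunk))]


def index_document(store, collection, source_file, text, embed=toy_embed, chunk_size=500):
    """Chunk a document, embed each chunk, and append records to the collection."""
    records = store.setdefault(collection, [])
    added = 0
    for offset, chunk in chunk_tokens(text, chunk_size):
        records.append({
            "vector": embed(chunk),
            "source": source_file,  # metadata: originating file
            "offset": offset,       # metadata: token offset of the chunk
            "text": chunk,
        })
        added += 1
    return added


store = {}
document = " ".join(f"word{i}" for i in range(1200))
added = index_document(store, "company-policies", "handbook.pdf", document)
offsets = [record["offset"] for record in store["company-policies"]]
```

A 1200-token document yields three chunks here, the last one shorter than the others, and each stored record carries enough metadata to trace a search hit back to its source file.
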
@@ -1,21 +1,21 @@
-# Qdrant Integration
+# VectorDB Integration
 
-GeneralBots uses **Qdrant** as the vector database for storing and searching embeddings. The Rust client `qdrant-client` is used to communicate with the service.
+GeneralBots uses **VectorDB** as the vector database for storing and searching embeddings. The Rust client for the configured VectorDB is used to communicate with the service.
 
 ## Configuration
 
 The connection is configured via environment variables:
 
 ```env
-QDRANT_URL=http://localhost:6333
-QDRANT_API_KEY=your-api-key # optional
+VECTORDB_URL=http://localhost:6333
+VECTORDB_API_KEY=your-api-key # optional
 ```
 
-These values are read at startup and passed to the `QdrantClient`.
+These values are read at startup and passed to the `VectorDBClient`.
 
 ## Collection Mapping
 
-Each `.gbkb` collection maps to a Qdrant collection with the same name. For example, a knowledge base named `company-policies` becomes a Qdrant collection `company-policies`.
+Each `.gbkb` collection maps to a VectorDB collection with the same name. For example, a knowledge base named `company-policies` becomes a VectorDB collection `company-policies`.
 
 ## Operations
 
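
As an illustration of the renamed environment variables in the hunk above, the following sketch reads them at startup. It is written in Python for brevity, although the documentation describes a Rust client. The function name `load_vectordb_config` and the fallback to `http://localhost:6333` when `VECTORDB_URL` is unset are assumptions; the documentation only states that the API key is optional.

```python
import os


def load_vectordb_config(env=None):
    """Read VectorDB connection settings from an environment mapping."""
    env = os.environ if env is None else env
    return {
        # Assumed default: the example value shown in the documentation.
        "url": env.get("VECTORDB_URL", "http://localhost:6333"),
        # Optional: an unset or empty key is treated as "no authentication".
        "api_key": env.get("VECTORDB_API_KEY") or None,
    }


config = load_vectordb_config({"VECTORDB_URL": "http://vectordb.internal:6333"})
```

Passing the mapping as a parameter keeps the function easy to exercise without touching the real process environment.
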
@@ -26,7 +26,7 @@ Each `.gbkb` collection maps to a Qdrant collection with the same name. For exam
 ## Performance Tips
 
 - Keep the number of vectors per collection reasonable (tens of thousands) for optimal latency.
-- Adjust Qdrant’s `hnsw` parameters in `QdrantClient::new` if you need higher recall.
+- Adjust VectorDB’s `hnsw` parameters in `VectorDBClient::new` if you need higher recall.
 - Use the `FILTER` option to restrict searches by metadata (e.g., source file).
 
 ## Example `FIND` Usage
@@ -39,5 +39,5 @@ TALK RESULT
 
 The keyword internally:
 1. Generates an embedding for the query string.
-2. Calls Qdrant’s `search` API.
+2. Calls VectorDB’s `search` API.
 3. Returns the most relevant chunk as `RESULT`.
@@ -1,11 +1,11 @@
 # Semantic Search
 
-Semantic search enables the bot to retrieve information based on meaning rather than exact keyword matches. It leverages the vector embeddings stored in Qdrant.
+Semantic search enables the bot to retrieve information based on meaning rather than exact keyword matches. It leverages the vector embeddings stored in VectorDB.
 
 ## How It Works
 
 1. **Query Embedding** – The user’s query string is converted into a dense vector using the same embedding model as the documents.
-2. **Nearest‑Neighbor Search** – Qdrant returns the top‑k vectors that are closest to the query vector.
+2. **Nearest‑Neighbor Search** – VectorDB returns the top‑k vectors that are closest to the query vector.
 3. **Result Formatting** – The matching document chunks are concatenated and passed to the LLM as context for the final response.
 
 ## Using the `FIND` Keyword
@@ -33,4 +33,4 @@ TALK RESULT
 
 ## Performance
 
-Semantic search latency is typically < 100 ms for collections under 50 k vectors. Larger collections may require tuning Qdrant’s HNSW parameters.
+Semantic search latency is typically < 100 ms for collections under 50 k vectors. Larger collections may require tuning VectorDB’s HNSW parameters.
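
The three "How It Works" steps in the hunk above amount to a top‑k similarity search followed by concatenation. The sketch below shows that idea with an exhaustive cosine-similarity scan. This is a deliberate simplification: a real vector database uses an approximate index rather than scanning every vector, and the `find` function, the record layout, and the two-dimensional vectors are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find(query_vector, records, top_k=3):
    """Rank records by similarity to the query and join the top-k chunk texts."""
    ranked = sorted(
        records,
        key=lambda record: cosine(query_vector, record["vector"]),
        reverse=True,
    )
    return "\n".join(record["text"] for record in ranked[:top_k])


records = [
    {"vector": [1.0, 0.0], "text": "Vacation policy: 20 days per year."},
    {"vector": [0.0, 1.0], "text": "Expense reports are due monthly."},
    {"vector": [0.9, 0.1], "text": "Public holidays are listed in the handbook."},
]
context = find([1.0, 0.0], records, top_k=2)
```

The joined string plays the role of the context handed to the LLM in step 3; the query vector would come from the same embedding model used at indexing time.
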
23
docs/src/chapter-03/summary.md
Normal file
@@ -0,0 +1,23 @@
+# Chapter 03 – Knowledge‑Base (VectorDB) Documentation Overview
+
+This chapter explains how GeneralBots manages knowledge‑base collections, indexing, caching, and semantic search. The implementation now references a generic **VectorDB** (instead of a specific Qdrant instance) and highlights the use of the **.gbdrive** package for storage when needed.
+
+| Document | File | Description |
+|----------|------|-------------|
+| **README** | [README.md](README.md) | High‑level reference for the `.gbkb` package and its core commands (`ADD_KB`, `SET_KB`, `ADD_WEBSITE`). |
+| **Caching** | [caching.md](caching.md) | Optional in‑memory and persistent SQLite caching to speed up frequent `FIND` queries. |
+| **Context Compaction** | [context-compaction.md](context-compaction.md) | Techniques to keep the LLM context window within limits (summarization, memory pruning, sliding window). |
+| **Indexing** | [indexing.md](indexing.md) | Process of extracting, chunking, embedding, and storing document vectors in the VectorDB. |
+| **VectorDB Integration** | [qdrant.md](qdrant.md) | (Renamed) Details the VectorDB connection, collection mapping, and operations. References to **Qdrant** have been generalized to **VectorDB**. |
+| **Semantic Search** | [semantic-search.md](semantic-search.md) | How the `FIND` keyword performs meaning‑based retrieval using the VectorDB. |
+| **Vector Collections** | [vector-collections.md](vector-collections.md) | Definition and management of vector collections, including creation, document addition, and usage in dialogs. |
+
+## How to Use This Overview
+
+- **Navigate**: Click the file links to read the full documentation for each topic.
+- **Reference**: Use this table as a quick lookup when developing or extending knowledge‑base functionality.
+- **Update**: When the underlying storage or VectorDB implementation changes, edit the corresponding markdown files and keep this summary in sync.
+
+---
+
+*This summary was added to provide a cohesive overview of Chapter 03, aligning terminology with the current architecture (VectorDB, .gbdrive, etc.).*
@@ -41,5 +41,5 @@ TALK RESULT
 ## Technical Details
 
 - Embeddings are generated with the BGE‑small‑en‑v1.5 model.
-- Vectors are stored in Qdrant (see Chapter 04).
+- Vectors are stored in VectorDB (see Chapter 04).
 - Each document is chunked into 500‑token pieces for efficient retrieval.