# .gbkb Knowledge Base

The `.gbkb` package manages knowledge base collections that provide contextual information to the bot during conversations.

## What is .gbkb?

`.gbkb` (General Bot Knowledge Base) collections store:

- Document collections for semantic search
- Vector embeddings for similarity matching
- Metadata and indexing information
- Access control and organization

## Knowledge Base Structure

Each `.gbkb` collection is organized as:

```
collection-name.gbkb/
  documents/
    doc1.pdf
    doc2.txt
    doc3.html
  embeddings/       # Auto-generated
  metadata.json     # Collection info
  index.json        # Search indexes
```

## Supported Formats

The knowledge base can process:

- **Text files**: .txt, .md, .html
- **Documents**: .pdf, .docx
- **Web content**: URLs and web pages
- **Structured data**: .csv, .json

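
As a rough illustration of the format list above, the following Python sketch maps a file name or URL to one of the four groups. The category names and the `classify_source` helper are hypothetical and are not part of the `.gbkb` package; only the extensions come from the list.

```python
from pathlib import PurePosixPath
from typing import Optional

# Extension groups copied from the list above; the category names are illustrative.
FORMAT_CATEGORIES = {
    "text": {".txt", ".md", ".html"},
    "document": {".pdf", ".docx"},
    "structured": {".csv", ".json"},
}


def classify_source(source: str) -> Optional[str]:
    """Return a format category for a file name or URL, or None if unsupported."""
    # Anything that looks like a URL is treated as web content.
    if source.startswith(("http://", "https://")):
        return "web"
    suffix = PurePosixPath(source).suffix.lower()
    for category, extensions in FORMAT_CATEGORIES.items():
        if suffix in extensions:
            return category
    return None
```

For example, `classify_source("doc1.pdf")` falls into the document group, while a file such as `image.png` is reported as unsupported.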
## Vector Embeddings

Each document is processed into vector embeddings using:

- BGE-small-en-v1.5 model (default)
- Chunking for large documents
- Metadata extraction and indexing
- Semantic similarity scoring

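
To make chunking and similarity scoring concrete, here is a minimal Python sketch. It assumes word-based chunks with a fixed overlap and cosine similarity as the scoring function; the actual chunk sizes and scoring used by the knowledge base are not specified here, and the function names are hypothetical. In practice the vectors would come from the embedding model rather than being written by hand.

```python
import math


def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into windows of `chunk_size` words that share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Overlapping windows are a common way to avoid cutting a sentence's context at a chunk boundary, at the cost of storing some text twice.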
## Collection Management

### Creating Collections

```basic
USE KB "company-policies"
ADD WEBSITE "https://company.com/docs"
```

### Using Collections

```basic
USE KB "company-policies"
LLM "What is the vacation policy?"
```

### Multiple Collections

```basic
USE KB "policies"
USE KB "procedures"
USE KB "faqs"
REM All active collections contribute to context
```

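
How results from several active collections are combined is not described here. One plausible approach, shown below purely as an assumption, is to pool the hits from every collection and keep the highest-scoring passages. The `merge_hits` helper and its data layout are hypothetical.

```python
def merge_hits(hits_by_collection, top_k=3):
    """Pool per-collection hits and return the top_k as (score, collection, passage).

    `hits_by_collection` maps a collection name to a list of (passage, score) pairs.
    """
    merged = [
        (score, name, passage)
        for name, hits in hits_by_collection.items()
        for passage, score in hits
    ]
    # Highest similarity first, regardless of which collection a hit came from.
    merged.sort(key=lambda item: item[0], reverse=True)
    return merged[:top_k]


hits = {
    "policies": [("Vacation is 20 days per year.", 0.91)],
    "procedures": [("Submit leave requests in the HR portal.", 0.78)],
    "faqs": [("Office hours are 9 to 5.", 0.35)],
}
top = merge_hits(hits, top_k=2)
```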
## Semantic Search

The knowledge base provides:

- **Similarity search**: Find documents relevant to a query
- **Hybrid search**: Combine semantic and keyword matching
- **Context injection**: Automatically add retrieved content to LLM prompts
- **Relevance scoring**: Filter results by similarity threshold

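
The sketch below illustrates hybrid search and threshold filtering in Python. It assumes a simple weighted sum of a semantic score and a keyword-overlap score; the weight `alpha`, the default threshold, and all function names are hypothetical and do not reflect the knowledge base's actual scoring.

```python
def keyword_score(query, passage):
    """Fraction of distinct query terms that also appear in the passage."""
    query_terms = set(query.lower().split())
    passage_terms = set(passage.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & passage_terms) / len(query_terms)


def hybrid_score(semantic, keyword, alpha=0.7):
    """Weighted sum: alpha weights the semantic score, (1 - alpha) the keyword score."""
    return alpha * semantic + (1 - alpha) * keyword


def filter_by_threshold(scored, threshold=0.5):
    """Keep (passage, score) pairs at or above the threshold, best first."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [(passage, score) for passage, score in ranked if score >= threshold]
```

A higher `alpha` favors meaning over exact wording, which tends to help with paraphrased questions, while a lower value favors literal term matches such as product codes.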
## Integration with Dialogs

Knowledge bases are automatically used when:

- `USE KB` is called
- Answer mode is set to use documents
- LLM queries benefit from contextual information
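
Conceptually, using a knowledge base in a dialog means that retrieved passages are placed in front of the user's question before it reaches the LLM. The following Python sketch shows the idea. The prompt template and the `build_prompt` helper are assumptions for illustration, not the format the bot actually uses.

```python
def build_prompt(question, passages):
    """Prepend retrieved passages to the question; return the question alone if none."""
    if not passages:
        return question
    context = "\n".join(f"- {passage}" for passage in passages)
    return f"Context:\n{context}\n\nQuestion: {question}"


prompt = build_prompt(
    "What is the vacation policy?",
    ["Vacation is 20 days per year."],
)
```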