.gbkb Knowledge Base
The .gbkb package manages knowledge base collections that provide contextual information to the bot during conversations.
What is .gbkb?
.gbkb (General Bot Knowledge Base) collections store:
- Document collections for semantic search
- Vector embeddings for similarity matching
- Metadata and indexing information
- Access control and organization
Knowledge Base Structure
Each .gbkb collection is organized as:
collection-name.gbkb/
├── documents/
│ ├── doc1.pdf
│ ├── doc2.txt
│ └── doc3.html
├── embeddings/ # Auto-generated
├── metadata.json # Collection info
└── index.json # Search indexes
Supported Formats
The knowledge base can process:
- Text files: .txt, .md, .html
- Documents: .pdf, .docx
- Web content: URLs and web pages
- Structured data: .csv, .json
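The format list above can be read as a simple lookup from file extension to content type. The following Python sketch illustrates that idea only; the group names (`text`, `document`, `structured`, `web`) and the `classify_source` helper are hypothetical and are not part of the GeneralBots API.

```python
from pathlib import PurePosixPath
from typing import Optional

# Extension groups copied from the list above. The group names are
# hypothetical labels for this sketch, not GeneralBots identifiers.
FORMAT_GROUPS = {
    "text": {".txt", ".md", ".html"},
    "document": {".pdf", ".docx"},
    "structured": {".csv", ".json"},
}


def classify_source(source: str) -> Optional[str]:
    """Return the format group for a file name or URL, or None if unsupported."""
    # Anything that looks like a URL is treated as web content.
    if source.startswith(("http://", "https://")):
        return "web"
    suffix = PurePosixPath(source).suffix.lower()
    for group, extensions in FORMAT_GROUPS.items():
        if suffix in extensions:
            return group
    return None
```

For example, `classify_source("doc1.pdf")` falls into the `document` group, while a file such as `image.png` returns `None` because its extension is not in the list.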
Vector Embeddings
Each document is processed into vector embeddings using:
- BGE-small-en-v1.5 model (default)
- Chunking for large documents
- Metadata extraction and indexing
- Semantic similarity scoring
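To make the chunking step concrete, here is a minimal Python sketch that splits a long document into overlapping word windows before embedding. It is an illustration under assumptions: the source does not say how GeneralBots chunks documents, so the word-based strategy, the `chunk_text` name, and the default `chunk_size` and `overlap` values are all hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into word windows of chunk_size words that share overlap words.

    Assumption: word-based windows. The real platform may chunk by tokens,
    sentences, or characters instead.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a window has reached the end of the document.
        if start + chunk_size >= len(words):
            break
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help similarity matching. Each chunk would then be passed to the embedding model.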
Collection Management
Creating Collections
ADD_KB "company-policies"
ADD_WEBSITE "https://company.com/docs"
Using Collections
SET_KB "company-policies"
LLM "What is the vacation policy?"
Multiple Collections
ADD_KB "policies"
ADD_KB "procedures"
ADD_KB "faqs"
REM All active collections contribute to context
Semantic Search
The knowledge base provides:
- Similarity search: Find relevant documents
- Hybrid search: Combine semantic and keyword matching
- Context injection: Automatically add to LLM prompts
- Relevance scoring: Filter by similarity threshold
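The search features listed above can be sketched in a few lines of Python. This is a toy illustration, not the GeneralBots implementation: the function names, the `alpha` weighting between semantic and keyword scores, and the default `threshold` are assumptions, and the vectors are supplied by hand rather than produced by an embedding model.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def keyword_overlap(query: str, text: str) -> float:
    """Fraction of distinct query words that also occur in the text."""
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words)


def hybrid_search(query, query_vector, documents, alpha=0.7, threshold=0.3):
    """Rank (text, vector) pairs by a weighted semantic + keyword score.

    alpha weights the semantic part; results scoring below threshold are
    dropped. Both values are hypothetical defaults for this sketch.
    """
    scored = []
    for text, vector in documents:
        semantic = cosine_similarity(query_vector, vector)
        keyword = keyword_overlap(query, text)
        score = alpha * semantic + (1 - alpha) * keyword
        if score >= threshold:
            scored.append((score, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored
```

Setting `alpha=1.0` reduces this to pure similarity search, and raising `threshold` trades recall for precision.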
Integration with Dialogs
Knowledge bases are automatically used when:
- SET_KB or ADD_KB is called
- Answer mode is set to use documents
- An LLM query can benefit from contextual information
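Context injection can be pictured as prepending retrieved passages to the user's question before it reaches the model. The Python sketch below shows one possible shape for that step. The prompt layout, the `build_prompt` name, and the `max_chars` budget are assumptions, since the source does not describe the prompt format GeneralBots actually uses.

```python
def build_prompt(question: str, passages: list[str], max_chars: int = 1000) -> str:
    """Prepend retrieved passages to a question, within a character budget.

    Passages are expected in relevance order. The layout is a hypothetical
    format chosen for this sketch.
    """
    context_lines = []
    used = 0
    for passage in passages:
        # Stop adding context once the budget would be exceeded.
        if used + len(passage) > max_chars:
            break
        context_lines.append(f"- {passage}")
        used += len(passage)

    # With no usable context, the question is sent unchanged.
    if not context_lines:
        return question

    context = "\n".join(context_lines)
    return f"Context:\n{context}\n\nQuestion: {question}"
```

In this sketch, a question asked with no active collection passes through unchanged, which mirrors the idea that knowledge bases only contribute once they have been activated.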