
# Semantic Search

Semantic search in BotServer happens automatically when you use `USE KB`. The system searches for relevant information based on meaning, not just keywords, and makes it available to the system AI during conversations.

## Search Pipeline

<img src="./assets/search-pipeline.svg" alt="Semantic Search Pipeline" style="max-height: 500px; width: 100%; object-fit: contain;">

## How It Works Automatically

1. **User asks a question** - Natural language input
2. **Query converted to vector** - Using the embedding model
3. **Search active collections** - Finds semantically similar content
4. **Inject into context** - Relevant chunks provided to system AI
5. **Generate response** - System AI answers using the knowledge
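The steps above can be sketched in a few lines of Python. This is only an illustration, not BotServer's implementation: `embed`, `search`, and `build_prompt` are hypothetical names, the sample chunks are made up, and the "embedding" is a toy word-count vector standing in for a real embedding model (so, unlike a real model, it only matches shared words).

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(query, collections, top_k=2):
    """Steps 2 and 3: embed the query, then score every chunk in every active collection."""
    q = embed(query)
    scored = [(cosine(q, embed(chunk)), name, chunk)
              for name, chunks in collections.items()
              for chunk in chunks]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(name, chunk) for score, name, chunk in scored[:top_k] if score > 0]

def build_prompt(question, hits):
    """Step 4: place the retrieved chunks ahead of the user's question."""
    context = "\n".join(f"[{name}] {chunk}" for name, chunk in hits)
    return f"Context:\n{context}\n\nQuestion: {question}"

# Made-up sample content for one active collection.
active = {
    "policies": [
        "vacation policy: employees get 20 vacation days per year",
        "dress code: business casual attire is expected",
    ]
}

question = "what is the dress code"
hits = search(question, active, top_k=1)
prompt = build_prompt(question, hits)  # Step 5 would send this prompt to the LLM.
```

In a real deployment the embedding model and vector store do the work of `embed` and `search`; the sketch only shows the order in which the pieces connect.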

## Activating Semantic Search

Call `USE KB` to enable search for a collection:

```basic
USE KB "policies"
USE KB "procedures"
' Both collections are now searchable
' No explicit search commands needed
```

When users ask questions, the system automatically searches these collections and provides relevant context to the system AI.

## How Meaning-Based Search Works

Unlike keyword search, semantic search understands meaning:

- "How many days off do I get?" matches "vacation policy"
- "What's the return policy?" matches "refund procedures"
- "I'm feeling sick" matches "medical leave guidelines"

The system uses vector embeddings to find conceptually similar content, even when exact words don't match.
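To make "conceptually similar" concrete, here is a minimal sketch of ranking by cosine similarity. The three-dimensional vectors are invented by hand for illustration (real embeddings have hundreds of dimensions and come from the model), and `cosine_similarity` is a hypothetical helper, not a BotServer function.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made vectors; the axes loosely mean (time off, money, health).
query = [0.9, 0.1, 0.2]  # "How many days off do I get?"
documents = {
    "vacation policy": [0.8, 0.2, 0.1],
    "refund procedures": [0.1, 0.9, 0.0],
    "medical leave guidelines": [0.5, 0.1, 0.8],
}

# Rank documents from most to least similar to the query.
ranked = sorted(documents,
                key=lambda name: cosine_similarity(query, documents[name]),
                reverse=True)
best = ranked[0]
```

With these vectors the query lands closest to "vacation policy" even though the two share no words, which is the behavior the examples above describe.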

## Configuration

Context handling is controlled by `config.csv`:

```csv
prompt-history,2     # How many previous messages to include
prompt-compact,4     # Compact context after N exchanges
```

These settings manage how much context the system AI receives, not the search itself.
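As a rough illustration of what a setting like `prompt-history` implies, the sketch below reads the two keys and keeps only the most recent messages. It assumes that `#` starts a trailing comment and that `prompt-history` means "the last N messages"; how BotServer actually parses and applies these values is not shown in this chapter, and `load_settings` and `recent_history` are hypothetical names.

```python
import csv
import io

CONFIG = """prompt-history,2     # How many previous messages to include
prompt-compact,4     # Compact context after N exchanges
"""

def load_settings(text):
    """Parse key,value rows, dropping anything after a '#' in the value."""
    settings = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) >= 2:
            value = row[1].split("#", 1)[0].strip()
            settings[row[0].strip()] = int(value)
    return settings

def recent_history(messages, settings):
    """Keep only the last `prompt-history` messages (assumed semantics)."""
    n = settings.get("prompt-history", 0)
    return messages[-n:] if n > 0 else []

settings = load_settings(CONFIG)
history = recent_history(["hello", "hi, how can I help?", "what is the dress code?"], settings)
```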

## Multiple Collections

When multiple collections are active, the system searches all of them:

```basic
USE KB "products"
USE KB "support"
USE KB "warranty"

' User: "My laptop won't turn on"
' System searches all three collections for relevant info
```
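One plausible way to combine results from several collections is to pool the scored chunks and keep the best few overall. The sketch below assumes each collection has already returned `(score, chunk)` pairs; the scores and chunk texts are made up, and `merge_results` is a hypothetical helper rather than BotServer's actual merging logic.

```python
import heapq

def merge_results(per_collection, top_k=3):
    """Pool hits from all collections and return the top_k by score."""
    tagged = ((score, name, chunk)
              for name, hits in per_collection.items()
              for score, chunk in hits)
    return heapq.nlargest(top_k, tagged, key=lambda item: item[0])

# Invented scores for the query "My laptop won't turn on".
results = {
    "products": [(0.62, "Laptop power adapter specifications")],
    "support": [(0.91, "Troubleshooting: laptop does not power on"),
                (0.40, "Resetting your password")],
    "warranty": [(0.75, "Warranty coverage for power failures")],
}

top = merge_results(results, top_k=3)
sources = [name for _, name, _ in top]
```

Ranking across collections rather than within each one means a weak match in one collection does not displace a strong match from another.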

## Search Quality

The quality of semantic search depends on:
- **Document organization** - Well-structured folders help
- **Embedding model** - The BGE model works well and can be replaced
- **Content quality** - Clear, descriptive documents work best

## Real Example

```basic
' In start.bas
USE KB "company-handbook"

' User types: "What's the dress code?"
' System automatically:
' 1. Searches company-handbook for dress code info
' 2. Finds relevant sections about attire
' 3. Injects them into LLM context
' 4. LLM generates natural response with the information
```

## Performance

- Search happens in milliseconds
- No configuration needed
- Cached for repeated queries
- Only active collections are searched
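The chapter does not say how repeated queries are cached. A minimal sketch of the general idea, assuming an in-memory cache keyed on the query text and the set of active collections, could look like this; `expensive_search` and `cached_search` are hypothetical stand-ins for the real search call.

```python
from functools import lru_cache

calls = {"count": 0}

def expensive_search(query, collections):
    """Stand-in for a real vector search; counts how often it runs."""
    calls["count"] += 1
    return tuple(f"{name}: result for {query!r}" for name in collections)

@lru_cache(maxsize=256)
def cached_search(query, collections):
    """Memoize results per (query, collections) pair.

    `collections` must be hashable, so pass a tuple rather than a list.
    """
    return expensive_search(query, collections)

active = ("policies", "procedures")
first = cached_search("what is the dress code", active)
second = cached_search("what is the dress code", active)  # served from the cache
```

A cache like this has to be invalidated when documents change, which is one reason the key includes the active collections.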

## Best Practices

1. **Activate only needed collections** - Don't overload context
2. **Organize content well** - One topic per folder
3. **Use descriptive text** - Helps with matching
4. **Keep documents updated** - Fresh content = better answers

## Common Misconceptions

❌ **Wrong**: You need to call a search function
✅ **Right**: Search happens automatically with `USE KB`

❌ **Wrong**: You need to configure search parameters
✅ **Right**: It works out of the box

❌ **Wrong**: You need special commands to query
✅ **Right**: Users just ask questions naturally

## Troubleshooting

### Not finding relevant content?
- Check the collection is activated with `USE KB`
- Verify documents are in the right folder
- Ensure content is descriptive

### Too much irrelevant content?
- Use fewer collections simultaneously
- Organize documents into more specific folders
- Clear unused collections with `CLEAR KB`

Remember: The beauty of semantic search in BotServer is its simplicity - just `USE KB` and let the system handle the rest!