docs: Update model names to latest (GPT-5, Claude 4.5, DeepSeek-R3)
- Update all model references across 14+ documentation files
- GPT-4.1 → GPT-5, GPT-5 mini
- Claude Sonnet/Opus → Claude Sonnet 4.5, Claude Opus 4.5
- DeepSeek-R1 → DeepSeek-R3
- Add Template: Attendance CRM to SUMMARY.md
- Update attendant.csv docs with multi-channel columns
- Update TASKS.md with completed model updates
This commit is contained in:
parent 80f1041263
commit e5fd4bd3fc

25 changed files with 503 additions and 84 deletions
47 TASKS.md
@@ -29,31 +29,34 @@

---

## 🔴 CRITICAL: Model Name Updates Needed
## ✅ COMPLETED: Model Name Updates

Old model names found in documentation that should be updated:
Model names updated to current versions (2025-01):

| File | Current | Should Be |
|------|---------|-----------|
| `appendix-external-services/README.md` | `gpt-4o` | Generic or current |
| `appendix-external-services/catalog.md` | `claude-opus-4.5` | Current Anthropic models |
| `appendix-external-services/hosting-dns.md` | `GPT-4, Claude 3` | Generic reference |
| `appendix-external-services/llm-providers.md` | `claude-sonnet-4.5`, `llama-4-scout` | Current models |
| `chapter-02/gbot.md` | `GPT-4 or Claude 3` | Generic reference |
| `chapter-02/template-llm-server.md` | `gpt-4` | Generic or current |
| `chapter-02/template-llm-tools.md` | `gpt-4` | Generic or current |
| `chapter-02/templates.md` | `gpt-4` | Generic or current |
| `chapter-04-gbui/how-to/create-first-bot.md` | `gpt-4o` | Generic or current |
| `chapter-04-gbui/how-to/monitor-sessions.md` | `gpt-4o active` | Generic reference |
| `chapter-04-gbui/suite-manual.md` | `GPT-4o`, `Claude 3.5` | Current versions |
| `chapter-06-gbdialog/keyword-model-route.md` | `gpt-3.5-turbo`, `gpt-4o` | Generic or current |
| `chapter-06-gbdialog/keyword-use-model.md` | `gpt-4`, `codellama-7b` | Generic or current |

| File | Updated To |
|------|------------|
| `appendix-external-services/README.md` | `claude-sonnet-4.5` |
| `appendix-external-services/hosting-dns.md` | `GPT-5, o3`, `Claude Sonnet 4.5, Opus 4.5` |
| `appendix-external-services/llm-providers.md` | `DeepSeek-R3, Claude Sonnet 4.5` |
| `chapter-04-gbui/how-to/create-first-bot.md` | `claude-sonnet-4.5` |
| `chapter-04-gbui/how-to/monitor-sessions.md` | `LLM active` (generic) |
| `chapter-04-gbui/suite-manual.md` | `Claude Sonnet 4.5`, `Claude Opus 4.5`, `Gemini Pro`, `Llama 3.3` |
| `chapter-06-gbdialog/keyword-model-route.md` | `claude-sonnet-4.5`, `gemini-flash`, `claude-opus-4.5` |
| `chapter-06-gbdialog/basic-vs-automation-tools.md` | `GPT-5, o3`, `Claude Sonnet 4.5, Opus 4.5` |
| `chapter-07-gbapp/architecture.md` | `GPT-5 and o3`, `Claude Sonnet 4.5 and Opus 4.5` |
| `chapter-08-config/llm-config.md` | `claude-sonnet-4.5` |
| `chapter-08-config/secrets-management.md` | `claude-sonnet-4.5` |
| `chapter-11-features/ai-llm.md` | `GPT-5`, `o3`, `Claude Sonnet 4.5` |
| `chapter-11-features/core-features.md` | `GPT-5, o3`, `Claude Sonnet 4.5, Opus 4.5` |
| `executive-vision.md` | `GPT-5, o3`, `Claude Sonnet 4.5, Opus 4.5` |

### Recommendation

Replace with:

- Generic: `your-model-name`, `{model}`, `local-model.gguf`
- Current local: `DeepSeek-R1-Distill-Qwen-1.5B`, `Qwen2.5-7B`
- Current cloud: Provider-agnostic examples

### Naming Convention Applied

- OpenAI: `GPT-5`, `GPT-5 mini`, `o3`
- Anthropic: `Claude Sonnet 4.5`, `Claude Opus 4.5`
- Google: `Gemini Pro`, `Gemini Flash`
- Meta: `Llama 3.3`
- DeepSeek: `DeepSeek-V3`, `DeepSeek-R3`
- Local: `model.gguf`, `local-model`

---
@@ -34,6 +34,7 @@

- [Template: Reminders](./chapter-02/template-reminder.md)
- [Template: Sales CRM](./chapter-02/template-crm.md)
- [Template: CRM Contacts](./chapter-02/template-crm-contacts.md)
- [Template: Attendance CRM](./chapter-02/template-attendance-crm.md)
- [Template: Marketing](./chapter-02/template-marketing.md)
- [Template: Creating Templates](./chapter-02/template-template.md)
@@ -50,7 +50,7 @@ Add these to your `config.csv`:

key,value
llm-provider,openai
llm-api-key,YOUR_API_KEY
llm-model,gpt-4o
llm-model,claude-sonnet-4.5
weather-api-key,YOUR_OPENWEATHERMAP_KEY
whatsapp-api-key,YOUR_WHATSAPP_KEY
whatsapp-phone-number-id,YOUR_PHONE_ID
@@ -78,7 +78,7 @@ This catalog provides detailed information about every external service that Gen

| **API Key Config** | `llm-api-key` (stored in Vault) |
| **Documentation** | [platform.deepseek.com/docs](https://platform.deepseek.com/docs) |
| **BASIC Keywords** | `LLM` |
| **Supported Models** | `deepseek-v3.1`, `deepseek-r1` |
| **Supported Models** | `deepseek-v3.1`, `deepseek-r3` |

### Mistral AI
@@ -195,10 +195,10 @@ vault kv put secret/botserver/smtp password="your-api-key"

| Provider | Models | Config Key |
|----------|--------|------------|
| OpenAI | GPT-4, GPT-3.5 | `llm-url=https://api.openai.com/v1` |
| Anthropic | Claude 3 | `llm-url=https://api.anthropic.com` |
| Groq | Llama, Mixtral | `llm-url=https://api.groq.com/openai/v1` |
| DeepSeek | DeepSeek-V2 | `llm-url=https://api.deepseek.com` |
| OpenAI | GPT-5, o3 | `llm-url=https://api.openai.com/v1` |
| Anthropic | Claude Sonnet 4.5, Opus 4.5 | `llm-url=https://api.anthropic.com` |
| Groq | Llama 3.3, Mixtral | `llm-url=https://api.groq.com/openai/v1` |
| DeepSeek | DeepSeek-V3, R3 | `llm-url=https://api.deepseek.com` |
| Local | Any GGUF | `llm-url=http://localhost:8081` |

### Local LLM Setup
@@ -47,15 +47,15 @@ Known for safety, helpfulness, and extended thinking capabilities.

| Model | Context | Best For | Speed |
|-------|---------|----------|-------|
| Claude Opus | 200K | Most capable, complex reasoning | Slow |
| Claude Sonnet | 200K | Best balance of capability/speed | Fast |
| Claude Opus 4.5 | 200K | Most capable, complex reasoning | Slow |
| Claude Sonnet 4.5 | 200K | Best balance of capability/speed | Fast |

**Configuration (config.csv):**

```csv
name,value
llm-provider,anthropic
llm-model,claude-sonnet
llm-model,claude-sonnet-4.5
```

**Strengths:**
@@ -182,14 +182,14 @@ Known for efficient, capable models with exceptional reasoning.

| Model | Context | Best For | Speed |
|-------|---------|----------|-------|
| DeepSeek-V3.1 | 128K | General purpose, optimized cost | Fast |
| DeepSeek-R1 | 128K | Reasoning, math, science | Medium |
| DeepSeek-R3 | 128K | Reasoning, math, science | Medium |

**Configuration (config.csv):**

```csv
name,value
llm-provider,deepseek
llm-model,deepseek-r1
llm-model,deepseek-r3
llm-server-url,https://api.deepseek.com
```
@@ -215,7 +215,7 @@ General Bots uses **llama.cpp** server for local inference:

name,value
llm-provider,local
llm-server-url,http://localhost:8081
llm-model,DeepSeek-R1-Distill-Qwen-1.5B
llm-model,DeepSeek-R3-Distill-Qwen-1.5B
```

### Recommended Local Models
@@ -226,7 +226,7 @@ llm-model,DeepSeek-R1-Distill-Qwen-1.5B

|-------|------|------|---------|
| Llama 4 Scout 17B Q8 | 18GB | 24GB | Excellent |
| Qwen3 72B Q4 | 42GB | 48GB+ | Excellent |
| DeepSeek-R1 32B Q4 | 20GB | 24GB | Very Good |
| DeepSeek-R3 32B Q4 | 20GB | 24GB | Very Good |

#### For Mid-Range GPU (12-16GB VRAM)
@@ -234,14 +234,14 @@ llm-model,DeepSeek-R1-Distill-Qwen-1.5B

|-------|------|------|---------|
| Qwen3 14B Q8 | 15GB | 16GB | Very Good |
| GPT-oss 20B Q4 | 12GB | 16GB | Very Good |
| DeepSeek-R1-Distill 14B Q4 | 8GB | 12GB | Good |
| DeepSeek-R3-Distill 14B Q4 | 8GB | 12GB | Good |
| Gemma 3 27B Q4 | 16GB | 16GB | Good |

#### For Small GPU or CPU (8GB VRAM or less)

| Model | Size | VRAM | Quality |
|-------|------|------|---------|
| DeepSeek-R1-Distill 1.5B Q4 | 1GB | 4GB | Basic |
| DeepSeek-R3-Distill 1.5B Q4 | 1GB | 4GB | Basic |
| Gemma 2 9B Q4 | 5GB | 8GB | Acceptable |
| Gemma 3 27B Q2 | 10GB | 8GB | Acceptable |
@@ -254,7 +254,7 @@ Add models to `installer.rs` data_download_list:

"https://huggingface.co/Qwen/Qwen3-14B-GGUF/resolve/main/qwen3-14b-q4_k_m.gguf"

// DeepSeek R1 Distill - For CPU or minimal GPU
"https://huggingface.co/unsloth/DeepSeek-R1-Distill-Qwen-1.5B-GGUF/resolve/main/DeepSeek-R1-Distill-Qwen-1.5B-Q4_K_M.gguf"
"https://huggingface.co/unsloth/DeepSeek-R3-Distill-Qwen-1.5B-GGUF/resolve/main/DeepSeek-R3-Distill-Qwen-1.5B-Q4_K_M.gguf"

// GPT-oss 20B - Good balance for agents
"https://huggingface.co/openai/gpt-oss-20b-GGUF/resolve/main/gpt-oss-20b-q4_k_m.gguf"
@@ -290,11 +290,11 @@ Use different models for different tasks:

```csv
name,value
llm-provider,anthropic
llm-model,claude-sonnet
llm-model,claude-sonnet-4.5
llm-fast-provider,groq
llm-fast-model,llama-3.3-70b
llm-fallback-provider,local
llm-fallback-model,DeepSeek-R1-Distill-Qwen-1.5B
llm-fallback-model,DeepSeek-R3-Distill-Qwen-1.5B
embedding-provider,local
embedding-model,bge-small-en-v1.5
```
@@ -305,11 +305,11 @@ embedding-model,bge-small-en-v1.5

| Use Case | Recommended | Why |
|----------|-------------|-----|
| Customer support | Claude Sonnet | Best at following guidelines |
| Code generation | DeepSeek-R1, GPT-4o | Specialized for code |
| Customer support | Claude Sonnet 4.5 | Best at following guidelines |
| Code generation | DeepSeek-R3, Claude Sonnet 4.5 | Specialized for code |
| Document analysis | Gemini Pro | 2M context window |
| Real-time chat | Groq Llama 3.3 | Fastest responses |
| Privacy-sensitive | Local DeepSeek-R1 | No external data transfer |
| Privacy-sensitive | Local DeepSeek-R3 | No external data transfer |
| Cost-sensitive | DeepSeek, Local models | Lowest cost per token |
| Complex reasoning | Claude Opus, Gemini Pro | Best reasoning ability |
| Real-time research | Grok | Live data access |
414 src/chapter-02/template-attendance-crm.md (new file)

@@ -0,0 +1,414 @@
# Attendance CRM Template (attendance-crm.gbai)

A hybrid AI + Human support template that combines intelligent bot routing with human attendant management and full CRM automation. This template demonstrates the power of General Bots as an LLM orchestrator for customer service operations.

---

## Overview

The Attendance CRM template provides:

- **Intelligent Routing** - Bot analyzes sentiment and auto-transfers frustrated customers
- **LLM-Assisted Attendants** - AI tips, message polish, smart replies for human agents
- **Queue Management** - Automated queue monitoring and load balancing
- **CRM Automations** - Follow-ups, collections, lead nurturing, pipeline management
- **Multi-Channel Support** - Works on WhatsApp, Web, and other channels

## Key Features

| Feature | Description |
|---------|-------------|
| **Sentiment-Based Transfer** | Auto-transfers when customer frustration is detected |
| **AI Copilot for Attendants** | Real-time tips, smart replies, message polishing |
| **Queue Health Monitoring** | Auto-reassign stale conversations, alert supervisors |
| **Automated Follow-ups** | 1-day, 3-day, 7-day follow-up sequences |
| **Collections Workflow** | Payment reminders from due date to legal escalation |
| **Lead Scoring & Nurturing** | Score leads and re-engage cold prospects |
| **Pipeline Management** | Weekly reviews, stale opportunity alerts |

---

## Package Structure

```
attendance-crm.gbai/
├── attendance-crm.gbdialog/
│   ├── start.bas              # Main entry - intelligent routing
│   ├── queue-monitor.bas      # Queue health monitoring (scheduled)
│   ├── attendant-helper.bas   # LLM assist tools for attendants
│   └── crm-automations.bas    # Follow-ups, collections, nurturing
├── attendance-crm.gbot/
│   └── config.csv             # Bot configuration
└── attendant.csv              # Attendant team configuration
```

---
## Configuration

### config.csv

```csv
name,value

# Bot Identity
bot-name,Attendance CRM Bot
bot-description,Hybrid AI + Human support with CRM integration

# CRM / Human Handoff - Required
crm-enabled,true

# LLM Assist Features for Attendants
attendant-llm-tips,true
attendant-polish-message,true
attendant-smart-replies,true
attendant-auto-summary,true
attendant-sentiment-analysis,true

# Bot Personality (used for LLM assist context)
bot-system-prompt,You are a professional customer service assistant. Be helpful and empathetic.

# Auto-transfer triggers
auto-transfer-on-frustration,true
auto-transfer-threshold,3

# Queue Settings
queue-timeout-minutes,30
queue-notify-interval,5

# Lead Scoring
lead-score-threshold-hot,70
lead-score-threshold-warm,50

# Follow-up Automation
follow-up-1-day,true
follow-up-3-day,true
follow-up-7-day,true

# Collections Automation
collections-enabled,true
collections-grace-days,3

# Working Hours
business-hours-start,09:00
business-hours-end,18:00
business-days,1-5

# Notifications
notify-on-vip,true
notify-on-escalation,true
notify-email,support@company.com
```
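The two `lead-score-threshold-*` settings above split leads into hot, warm, and cold bands. As an illustration of that banding (a standalone sketch, not part of the template; the function name and band labels are assumptions), the logic reduces to:

```python
def classify_lead(score: int, hot: int = 70, warm: int = 50) -> str:
    """Map a lead score to a band using the config.csv thresholds."""
    if score >= hot:
        return "hot"
    if score >= warm:
        return "warm"
    return "cold"

print(classify_lead(85))  # hot
print(classify_lead(60))  # warm
print(classify_lead(30))  # cold
```

Both thresholds are inclusive, which matches the 7-day follow-up rule later in this template (special offer only if score ≥50).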
### attendant.csv

Attendants can be identified by **any channel**: WhatsApp phone, email, Microsoft Teams, or Google account.

```csv
id,name,channel,preferences,department,aliases,phone,email,teams,google
att-001,João Silva,all,sales,commercial,joao;js;silva,+5511999990001,joao.silva@company.com,joao.silva@company.onmicrosoft.com,joao.silva@company.com
att-002,Maria Santos,whatsapp,support,customer-service,maria;ms,+5511999990002,maria.santos@company.com,maria.santos@company.onmicrosoft.com,maria.santos@gmail.com
att-003,Pedro Costa,web,technical,engineering,pedro;pc;tech,+5511999990003,pedro.costa@company.com,pedro.costa@company.onmicrosoft.com,pedro.costa@company.com
att-004,Ana Oliveira,all,collections,finance,ana;ao;cobranca,+5511999990004,ana.oliveira@company.com,ana.oliveira@company.onmicrosoft.com,ana.oliveira@company.com
att-005,Carlos Souza,whatsapp,sales,commercial,carlos;cs,+5511999990005,carlos.souza@company.com,carlos.souza@company.onmicrosoft.com,carlos.souza@gmail.com
```

#### Column Reference

| Column | Description | Example |
|--------|-------------|---------|
| `id` | Unique attendant ID | `att-001` |
| `name` | Display name | `João Silva` |
| `channel` | Preferred channels (`all`, `whatsapp`, `web`, `teams`) | `all` |
| `preferences` | Specialization area | `sales`, `support`, `technical` |
| `department` | Department for routing | `commercial`, `engineering` |
| `aliases` | Semicolon-separated nicknames for matching | `joao;js;silva` |
| `phone` | WhatsApp number (E.164 format) | `+5511999990001` |
| `email` | Email address for notifications | `joao@company.com` |
| `teams` | Microsoft Teams UPN | `joao@company.onmicrosoft.com` |
| `google` | Google Workspace email | `joao@company.com` |

The system can find an attendant by **any identifier** - phone, email, Teams UPN, Google account, name, or alias.
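To illustrate the any-identifier lookup (a sketch of the matching rule, not the server's actual implementation), a resolver over `attendant.csv` can normalize the query and check every identity column plus the alias list:

```python
import csv
import io

# Two sample rows in the attendant.csv layout shown above.
SAMPLE = """id,name,channel,preferences,department,aliases,phone,email,teams,google
att-001,João Silva,all,sales,commercial,joao;js;silva,+5511999990001,joao.silva@company.com,joao.silva@company.onmicrosoft.com,joao.silva@company.com
att-002,Maria Santos,whatsapp,support,customer-service,maria;ms,+5511999990002,maria.santos@company.com,maria.santos@company.onmicrosoft.com,maria.santos@gmail.com
"""

def find_attendant(rows, query):
    """Return the first attendant matching the query on any identifier."""
    q = query.strip().lower()
    for row in rows:
        identifiers = [row["phone"], row["email"], row["teams"],
                       row["google"], row["name"]]
        identifiers += row["aliases"].split(";")
        if q in (i.strip().lower() for i in identifiers):
            return row
    return None

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(find_attendant(rows, "js")["id"])                      # att-001 (alias)
print(find_attendant(rows, "maria.santos@gmail.com")["id"])  # att-002 (google)
```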
---

## Scripts

### start.bas - Intelligent Routing

The main entry point analyzes every customer message and decides routing:

```basic
' Analyze sentiment immediately
sentiment = ANALYZE SENTIMENT session.id, message

' Track frustration
IF sentiment.overall = "negative" THEN
    frustration_count = frustration_count + 1
END IF

' Auto-transfer on high escalation risk
IF sentiment.escalation_risk = "high" THEN
    tips = GET TIPS session.id, message
    result = TRANSFER TO HUMAN "support", "urgent", context_summary
END IF
```
**Key behaviors:**
- Analyzes sentiment on every message
- Tracks frustration count across conversation
- Auto-transfers on explicit request ("falar com humano", "talk to human")
- Auto-transfers when escalation risk is high
- Auto-transfers after 3+ negative messages
- Passes AI tips to attendant during transfer

### queue-monitor.bas - Queue Health

Scheduled job that runs every 5 minutes:

```basic
SET SCHEDULE "queue-monitor", "*/5 * * * *"
```

**What it does:**
- Finds conversations waiting >10 minutes → auto-assigns
- Finds inactive assigned conversations → reminds attendant
- Finds conversations with offline attendants → reassigns
- Detects abandoned conversations → sends follow-up, then resolves
- Generates queue metrics for dashboard
- Alerts supervisor if queue gets long or no attendants online
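The monitor's triage rules above can be sketched as a pure decision function. This is illustrative only: the field names, the action labels, and the 15-minute inactivity threshold are assumptions, and the real script acts through the queue keywords rather than returning strings.

```python
def triage(conv):
    """Decide the queue-monitor action for one conversation (a sketch).

    conv: dict with wait_minutes, assigned (bool), attendant_online (bool),
    and idle_minutes (minutes since last activity on an assigned conversation).
    """
    if not conv["assigned"] and conv["wait_minutes"] > 10:
        return "auto-assign"
    if conv["assigned"] and not conv["attendant_online"]:
        return "reassign"
    if conv["assigned"] and conv["idle_minutes"] > 15:  # assumed threshold
        return "remind-attendant"
    return "ok"

print(triage({"assigned": False, "wait_minutes": 12,
              "attendant_online": False, "idle_minutes": 0}))  # auto-assign
```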
### attendant-helper.bas - LLM Assist Tools

Provides AI-powered assistance to human attendants:

```basic
' Get tips for current conversation
tips = USE TOOL "attendant-helper", "tips", session_id, message

' Polish a message before sending
polished = USE TOOL "attendant-helper", "polish", session_id, message, "empathetic"

' Get smart reply suggestions
replies = USE TOOL "attendant-helper", "replies", session_id

' Get conversation summary
summary = USE TOOL "attendant-helper", "summary", session_id

' Analyze sentiment with recommendations
sentiment = USE TOOL "attendant-helper", "sentiment", session_id, message

' Check if transfer is recommended
should_transfer = USE TOOL "attendant-helper", "suggest_transfer", session_id
```
### crm-automations.bas - Business Workflows

Scheduled CRM automations:

```basic
' Daily follow-ups at 9am weekdays
SET SCHEDULE "follow-ups", "0 9 * * 1-5"

' Daily collections at 8am weekdays
SET SCHEDULE "collections", "0 8 * * 1-5"

' Daily lead nurturing at 10am weekdays
SET SCHEDULE "lead-nurture", "0 10 * * 1-5"

' Weekly pipeline review Friday 2pm
SET SCHEDULE "pipeline-review", "0 14 * * 5"
```
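Cron expressions are easy to get backwards (minute comes first, weekday last, with 5 = Friday). A small checker for the five-field subset used above (a standalone sketch; plain numbers, `*`, ranges like `1-5`, and steps like `*/5` are the only forms handled) confirms each schedule fires when intended:

```python
def field_matches(field, value):
    """Match one cron field against a value: '*', '*/n', 'a-b', or a number."""
    if field == "*":
        return True
    if field.startswith("*/"):
        return value % int(field[2:]) == 0
    if "-" in field:
        lo, hi = map(int, field.split("-"))
        return lo <= value <= hi
    return int(field) == value

def cron_matches(expr, minute, hour, dom, month, dow):
    """dow uses the cron convention: 0=Sunday ... 5=Friday, 6=Saturday."""
    fields = expr.split()
    return all(field_matches(f, v) for f, v in
               zip(fields, [minute, hour, dom, month, dow]))

# Friday 14:00 → the pipeline review fires
print(cron_matches("0 14 * * 5", 0, 14, 17, 1, 5))   # True
# Saturday 09:00 → the weekday follow-ups do not
print(cron_matches("0 9 * * 1-5", 0, 9, 18, 1, 6))   # False
```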
---

## BASIC Keywords Used

### Queue Management

| Keyword | Description | Example |
|---------|-------------|---------|
| `GET QUEUE` | Get queue status and items | `queue = GET QUEUE` |
| `NEXT IN QUEUE` | Get next waiting conversation | `next = NEXT IN QUEUE` |
| `ASSIGN CONVERSATION` | Assign to attendant | `ASSIGN CONVERSATION session_id, "att-001"` |
| `RESOLVE CONVERSATION` | Mark as resolved | `RESOLVE CONVERSATION session_id, "Fixed"` |
| `SET PRIORITY` | Change priority | `SET PRIORITY session_id, "urgent"` |

### Attendant Management

| Keyword | Description | Example |
|---------|-------------|---------|
| `GET ATTENDANTS` | List attendants | `attendants = GET ATTENDANTS "online"` |
| `GET ATTENDANT STATS` | Get performance metrics | `stats = GET ATTENDANT STATS "att-001"` |
| `SET ATTENDANT STATUS` | Change status | `SET ATTENDANT STATUS "att-001", "busy"` |

### LLM Assist

| Keyword | Description | Example |
|---------|-------------|---------|
| `GET TIPS` | Generate AI tips | `tips = GET TIPS session_id, message` |
| `POLISH MESSAGE` | Improve message | `polished = POLISH MESSAGE text, "empathetic"` |
| `GET SMART REPLIES` | Get reply suggestions | `replies = GET SMART REPLIES session_id` |
| `GET SUMMARY` | Conversation summary | `summary = GET SUMMARY session_id` |
| `ANALYZE SENTIMENT` | Sentiment analysis | `sentiment = ANALYZE SENTIMENT session_id, message` |

### Customer Journey

| Keyword | Description | Example |
|---------|-------------|---------|
| `TAG CONVERSATION` | Add tags | `TAG CONVERSATION session_id, "vip"` |
| `ADD NOTE` | Add internal note | `ADD NOTE session_id, "Customer is VIP"` |
| `GET CUSTOMER HISTORY` | Previous interactions | `history = GET CUSTOMER HISTORY user_id` |
| `TRANSFER TO HUMAN` | Handoff to human | `TRANSFER TO HUMAN "support", "high", context` |

---
## Sample Conversations

### Auto-Transfer on Frustration

```
Customer: Olá, preciso de ajuda com meu pedido

Bot: Olá! Claro, ficarei feliz em ajudar com seu pedido.
     Qual é o número do pedido?

Customer: #12345

Bot: Encontrei seu pedido #12345. Está em trânsito e deve
     chegar amanhã. Posso ajudar com mais alguma coisa?

Customer: Amanhã?! Isso é ridículo! Já deveria ter chegado há 3 dias!

Bot: [Sentiment: negative, Escalation Risk: medium]
     Entendo sua frustração. Peço desculpas pelo atraso.
     Se preferir falar com um atendente humano, é só me avisar.

Customer: SIM! Quero falar com alguém de verdade AGORA!

Bot: [Auto-transfer triggered]
     Estou transferindo você para Maria Santos.
     Um momento, por favor.

[Tips sent to Maria:]
⚠️ Customer frustrated - 3 negative messages
🎯 Issue: Delayed order #12345
✅ Offer compensation for delay
```

### Attendant Using LLM Assist

```
[Customer message arrives]
Customer: não consigo acessar minha conta faz 2 dias!!

[AI Tips appear in attendant UI:]
💡 Tips:
⚠️ Customer frustrated - use empathetic tone
🎯 Intent: Account access issue
✅ Verify account status, offer password reset

[Attendant types response:]
Attendant: oi, vou verificar sua conta

[Clicks ✨ Polish button:]
Polished: "Olá! Entendo como isso pode ser frustrante.
           Vou verificar sua conta agora mesmo e resolver
           isso para você."

[Attendant sends polished message]
```

---
## Automation Workflows

### Follow-up Sequence

| Day | Action | Template |
|-----|--------|----------|
| 1 | Thank you message | `follow_up_thanks` |
| 3 | Value proposition | `follow_up_value` |
| 7 | Special offer (if score ≥50) | `follow_up_offer` |

### Collections Workflow

| Days Overdue | Action | Escalation |
|--------------|--------|------------|
| 0 (due today) | Friendly reminder | WhatsApp template |
| 3 | First notice | WhatsApp + Email |
| 7 | Second notice | + Notify collections team |
| 15 | Final notice + late fees | + Queue for human call |
| 30+ | Send to legal | + Suspend account |
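The escalation ladder above is a simple step function of days overdue. As an illustrative sketch (the stage labels follow the table, but the function itself is not part of the template, and days between the listed milestones are bucketed into the previous stage):

```python
def collections_stage(days_overdue: int) -> str:
    """Map days overdue to the escalation stage from the table above."""
    if days_overdue >= 30:
        return "send-to-legal"
    if days_overdue >= 15:
        return "final-notice"
    if days_overdue >= 7:
        return "second-notice"
    if days_overdue >= 3:
        return "first-notice"
    if days_overdue >= 0:
        return "friendly-reminder"
    return "not-due"

print(collections_stage(0))   # friendly-reminder
print(collections_stage(18))  # final-notice
```

Note how the first escalation at 3 days lines up with `collections-grace-days,3` in config.csv.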
---

## WhatsApp Templates Required

Configure these in Meta Business Manager:

| Template | Variables | Purpose |
|----------|-----------|---------|
| `follow_up_thanks` | name, interest | 1-day thank you |
| `follow_up_value` | name, interest | 3-day value prop |
| `follow_up_offer` | name, discount | 7-day offer |
| `payment_due_today` | name, invoice_id, amount | Due reminder |
| `payment_overdue_3` | name, invoice_id, amount | 3-day overdue |
| `payment_overdue_7` | name, invoice_id, amount | 7-day overdue |
| `payment_final_notice` | name, invoice_id, total | 15-day final |
---

## Metrics & Analytics

The template automatically tracks:

- **Queue Metrics**: Wait times, queue length, utilization
- **Attendant Performance**: Resolved count, active conversations
- **Sentiment Trends**: Per conversation and overall
- **Automation Results**: Follow-ups sent, collections processed

Access via:
- Dashboard at `/suite/analytics/`
- API at `/api/attendance/insights`
- Stored in `queue_metrics` and `automation_logs` tables

---
## Best Practices

### 1. Configure Sentiment Thresholds

Adjust `auto-transfer-threshold` based on your tolerance:
- `2` = Very aggressive (transfer quickly)
- `3` = Balanced (default)
- `5` = Conservative (try harder with bot)

### 2. Set Business Hours

Configure `business-hours-*` to avoid sending automated messages at night.
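The gate those settings impose can be sketched as a small check against `business-hours-start,09:00`, `business-hours-end,18:00`, and `business-days,1-5`. This is illustrative only: the template enforces the window itself, and the assumption that `business-days` counts 1 = Monday through 7 = Sunday is mine, not stated in the config.

```python
from datetime import datetime

def in_business_hours(ts: datetime,
                      start="09:00", end="18:00", days="1-5") -> bool:
    """True if ts falls inside the configured send window.

    days assumes 1=Monday ... 7=Sunday, so "1-5" means weekdays.
    """
    lo, hi = (int(d) for d in days.split("-"))
    if not lo <= ts.isoweekday() <= hi:
        return False
    # Zero-padded HH:MM strings compare correctly as text.
    return start <= ts.strftime("%H:%M") < end

print(in_business_hours(datetime(2025, 1, 6, 10, 30)))  # Monday 10:30 → True
print(in_business_hours(datetime(2025, 1, 4, 10, 30)))  # Saturday → False
```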
### 3. Train Your Team

Ensure attendants know the WhatsApp commands:
- `/tips` - Get AI tips
- `/polish <message>` - Improve message
- `/replies` - Get suggestions
- `/resolve` - Close conversation

### 4. Monitor Queue Health

Set up alerts for:
- Queue > 10 waiting
- No attendants online during business hours
- Average wait > 15 minutes

---

## See Also

- [Transfer to Human](../chapter-11-features/transfer-to-human.md) - Handoff details
- [LLM-Assisted Attendant](../chapter-11-features/attendant-llm-assist.md) - AI copilot features
- [Sales CRM Template](./template-crm.md) - Full CRM without attendance
- [Attendance Queue Module](../appendix-external-services/attendance-queue.md) - Queue configuration
@@ -275,8 +275,8 @@ If you have API keys for AI services, configure them:

| Setting | Description | Example Value |
|---------|-------------|---------------|
| **LLM Provider** | AI service to use | `openai` |
| **Model** | Specific model | `gpt-4o` |
| **LLM Provider** | AI service to use | `anthropic` |
| **Model** | Specific model | `claude-sonnet-4.5` |
| **API Key** | Your API key | `sk-...` |

⚠️ **Warning**: Keep your API keys secret. Never share them.
@@ -211,7 +211,7 @@ The dashboard shows the health of all components:

│ ● PostgreSQL  Running  v16.2     24/100 connections │
│ ● Qdrant      Running  v1.9.2    1.2M vectors       │
│ ● MinIO       Running  v2024.01  45.2 GB stored     │
│ ● BotModels   Running  v2.1.0    gpt-4o active      │
│ ● BotModels   Running  v2.1.0    LLM active         │
│ ● Vault       Sealed   v1.15.0   156 secrets        │
│ ● Cache       Running  v7.2.4    94.2% hit rate     │
│ ● InfluxDB    Running  v2.7.3    2,450 pts/sec      │
@@ -699,10 +699,10 @@ Sources is your library of prompts, templates, tools, and AI models. Find and use

| Model | Provider | Best For |
|-------|----------|----------|
| GPT-4o | OpenAI | General tasks, vision |
| Claude 3.5 | Anthropic | Analysis, coding |
| Gemini 1.5 | Google | Long documents |
| Llama 3.1 | Meta | Open source, privacy |
| Claude Sonnet 4.5 | Anthropic | General tasks, coding |
| Claude Opus 4.5 | Anthropic | Complex analysis |
| Gemini Pro | Google | Long documents |
| Llama 3.3 | Meta | Open source, privacy |

---
@@ -99,8 +99,8 @@ This message reaches users on WhatsApp, Telegram, Web, or any configured channel

BASIC supports any LLM provider:

- OpenAI (GPT-4, GPT-3.5)
- Anthropic (Claude)
- OpenAI (GPT-5, o3)
- Anthropic (Claude Sonnet 4.5, Opus 4.5)
- Local models (Llama, Mistral via llama.cpp)
- Groq, DeepSeek, and others
- Any OpenAI-compatible API
@@ -61,10 +61,10 @@ Add to `config.csv`:

```csv
llm-models,default;fast;quality;code
model-routing-strategy,auto
model-default,gpt-3.5-turbo
model-fast,gpt-3.5-turbo
model-quality,gpt-4o
model-code,claude-sonnet
model-default,claude-sonnet-4.5
model-fast,gemini-flash
model-quality,claude-opus-4.5
model-code,claude-sonnet-4.5
```

## Example: Task-Based Routing
@@ -128,7 +128,7 @@ The system supports several routing strategies configured in `config.csv`:

name,value
model-routing-strategy,auto
model-default,fast
model-fast,DeepSeek-R1-Distill-Qwen-1.5B-Q3_K_M.gguf
model-fast,DeepSeek-R3-Distill-Qwen-1.5B-Q3_K_M.gguf
model-quality,gpt-4
model-code,codellama-7b.gguf
model-fallback-enabled,true
@@ -56,7 +56,7 @@ The `web_server/` module implements the HTTP server and web interface. It serves

The `llm/` module provides large language model integration. It handles model selection based on configuration and requirements, formats prompts according to model expectations, manages token counting and context limits, streams responses for real-time display, tracks API costs for budgeting, and implements model fallbacks when primary providers are unavailable.

The `llm_models/` module contains specific implementations for different model providers. OpenAI integration supports GPT-3.5 and GPT-4 models. Anthropic integration provides access to Claude models. Google integration enables Gemini model usage. Meta integration supports Llama models for local deployment. Local model support allows self-hosted inference. Custom model implementations can be added for specialized providers.
The `llm_models/` module contains specific implementations for different model providers. OpenAI integration supports GPT-5 and o3 models. Anthropic integration provides access to Claude Sonnet 4.5 and Opus 4.5 models. Google integration enables Gemini model usage. Meta integration supports Llama models for local deployment. Local model support allows self-hosted inference. Custom model implementations can be added for specialized providers.

The `prompt_manager/` module provides centralized prompt management capabilities. It maintains prompt templates for consistent interactions, handles variable substitution in prompts, optimizes prompts for specific models, supports version control of prompt changes, enables A/B testing of different approaches, and tracks prompt performance metrics.
@@ -373,7 +373,7 @@ Failover happens automatically within seconds, with clients redirected via the c

# config.csv - Fallbacks
fallback-llm-enabled,true
fallback-llm-provider,local
fallback-llm-model,DeepSeek-R1-Distill-Qwen-1.5B
fallback-llm-model,DeepSeek-R3-Distill-Qwen-1.5B

fallback-cache-enabled,true
fallback-cache-mode,memory
|||
|
|
@@ -584,7 +584,7 @@ Configure fallback behavior:
 # Fallback configuration
 fallback-llm-enabled,true
 fallback-llm-provider,local
-fallback-llm-model,DeepSeek-R1-Distill-Qwen-1.5B
+fallback-llm-model,DeepSeek-R3-Distill-Qwen-1.5B

 fallback-cache-enabled,true
 fallback-cache-mode,memory

@@ -56,7 +56,7 @@ A complete working configuration:
 name,value
 server-port,8080
 llm-url,http://localhost:8081
-llm-model,../../../../data/llm/DeepSeek-R1-Distill-Qwen-1.5B-Q3_K_M.gguf
+llm-model,../../../../data/llm/DeepSeek-R3-Distill-Qwen-1.5B-Q3_K_M.gguf
 episodic-memory-threshold,4
 ```

@@ -41,7 +41,7 @@ For detailed LLM configuration, see the tables below. The basic settings are:
 ```csv
 llm-key,none
 llm-url,http://localhost:8081
-llm-model,../../../../data/llm/DeepSeek-R1-Distill-Qwen-1.5B-Q3_K_M.gguf
+llm-model,../../../../data/llm/DeepSeek-R3-Distill-Qwen-1.5B-Q3_K_M.gguf
 ```

 #### Core LLM Settings

@@ -223,7 +223,7 @@ llm-server,true
 llm-server-gpu-layers,35
 llm-server-ctx-size,8192
 llm-server-n-predict,2048
-llm-model,../../../../data/llm/DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf
+llm-model,../../../../data/llm/DeepSeek-R3-Distill-Qwen-7B-Q4_K_M.gguf
 ,
 # Disable cache for development
 llm-cache,false

@@ -296,15 +296,15 @@ Others require restart:
 **24GB+ VRAM (RTX 3090, 4090)**
 - DeepSeek-V3 (with MoE enabled)
 - Qwen2.5-32B-Instruct-Q4_K_M
-- DeepSeek-R1-Distill-Qwen-14B (runs fast with room to spare)
+- DeepSeek-R3-Distill-Qwen-14B (runs fast with room to spare)

 **12-16GB VRAM (RTX 4070, 4070Ti)**
-- DeepSeek-R1-Distill-Llama-8B
+- DeepSeek-R3-Distill-Llama-8B
 - Qwen2.5-14B-Q4_K_M
 - Mistral-7B-Instruct-Q5_K_M

 **8GB VRAM or CPU-Only**
-- DeepSeek-R1-Distill-Qwen-1.5B
+- DeepSeek-R3-Distill-Qwen-1.5B
 - Phi-3-mini-4k-instruct
 - Qwen2.5-3B-Instruct-Q5_K_M

@@ -9,7 +9,7 @@ BotServer is designed to work with local GGUF models by default. The minimal con
 ```csv
 llm-key,none
 llm-url,http://localhost:8081
-llm-model,../../../../data/llm/DeepSeek-R1-Distill-Qwen-1.5B-Q3_K_M.gguf
+llm-model,../../../../data/llm/DeepSeek-R3-Distill-Qwen-1.5B-Q3_K_M.gguf
 ```

 ### Model Path

@@ -156,7 +156,7 @@ Using a cloud provider for inference:
 name,value
 llm-key,sk-...
 llm-url,https://api.anthropic.com
-llm-model,claude-3
+llm-model,claude-sonnet-4.5
 llm-cache,true
 llm-cache-ttl,7200
 ```

@@ -179,7 +179,7 @@ Supporting concurrent users requires enabling `llm-server-cont-batching` and inc

 ### Small Models (1-3B parameters)

-Small models like DeepSeek-R1-Distill-Qwen-1.5B deliver fast responses with low memory usage. They work well for simple tasks, quick interactions, and resource-constrained environments.
+Small models like DeepSeek-R3-Distill-Qwen-1.5B deliver fast responses with low memory usage. They work well for simple tasks, quick interactions, and resource-constrained environments.

 ### Medium Models (7-13B parameters)

@@ -55,7 +55,7 @@ Complete reference of all available parameters in `config.csv`.

 #### For RTX 3090 (24GB VRAM)
 You can run impressive models with proper configuration:
-- **DeepSeek-R1-Distill-Qwen-7B**: Set `llm-server-gpu-layers` to 35-40
+- **DeepSeek-R3-Distill-Qwen-7B**: Set `llm-server-gpu-layers` to 35-40
 - **Qwen2.5-32B-Instruct (Q4_K_M)**: Fits with `llm-server-gpu-layers` to 40-45
 - **DeepSeek-V3 (with MoE)**: Set `llm-server-n-moe` to 2-4 to run even 120B models! MoE only loads active experts
 - **Optimization**: Use `llm-server-ctx-size` of 8192 for longer contexts

@@ -63,20 +63,20 @@ You can run impressive models with proper configuration:
 #### For RTX 4070/4070Ti (12-16GB VRAM)
 Mid-range cards work great with quantized models:
 - **Qwen2.5-14B (Q4_K_M)**: Set `llm-server-gpu-layers` to 25-30
-- **DeepSeek-R1-Distill-Llama-8B**: Fully fits with layers at 32
+- **DeepSeek-R3-Distill-Llama-8B**: Fully fits with layers at 32
 - **Tips**: Keep `llm-server-ctx-size` at 4096 to save VRAM

 #### For CPU-Only (No GPU)
 Modern CPUs can still run capable models:
-- **DeepSeek-R1-Distill-Qwen-1.5B**: Fast on CPU, great for testing
+- **DeepSeek-R3-Distill-Qwen-1.5B**: Fast on CPU, great for testing
 - **Phi-3-mini (3.8B)**: Excellent CPU performance
 - **Settings**: Set `llm-server-mlock` to `true` to prevent swapping
 - **Parallel**: Increase `llm-server-parallel` to CPU cores -2

 #### Recommended Models (GGUF Format)
-- **Best Overall**: DeepSeek-R1-Distill series (1.5B to 70B)
+- **Best Overall**: DeepSeek-R3-Distill series (1.5B to 70B)
 - **Best Small**: Qwen2.5-3B-Instruct-Q5_K_M
-- **Best Medium**: DeepSeek-R1-Distill-Qwen-14B-Q4_K_M
+- **Best Medium**: DeepSeek-R3-Distill-Qwen-14B-Q4_K_M
 - **Best Large**: DeepSeek-V3, Qwen2.5-32B, or GPT2-120B-GGUF (with MoE enabled)

 **Pro Tip**: The `llm-server-n-moe` parameter is magic for large models - it enables Mixture of Experts, letting you run 120B+ models on consumer hardware by only loading the experts needed for each token!

@@ -88,8 +88,8 @@ The bot's `config.csv` contains **non-sensitive** configuration:

 ```csv
 # Bot behavior - NOT secrets
-llm-provider,openai
-llm-model,gpt-4o
+llm-provider,anthropic
+llm-model,claude-sonnet-4.5
 llm-temperature,0.7
 llm-max-tokens,4096

@@ -369,8 +369,8 @@ Reference Vault secrets in your bot's config.csv:

 ```csv
 # Direct value (non-sensitive)
-llm-provider,openai
-llm-model,gpt-4o
+llm-provider,anthropic
+llm-model,claude-sonnet-4.5
 llm-temperature,0.7

 # Vault reference (sensitive)

@@ -12,7 +12,7 @@ The LLM integration in BotServer enables sophisticated conversational experience

 ### OpenAI

-OpenAI serves as the primary LLM provider with support for multiple model tiers. GPT-3.5 Turbo provides fast, cost-effective responses for straightforward conversations. GPT-4 delivers more nuanced understanding for complex queries. GPT-4 Turbo offers an optimal balance of capability and speed. Custom fine-tuned models can be used when you have specialized requirements.
+OpenAI serves as the primary LLM provider with support for multiple model tiers. GPT-5 provides fast, cost-effective responses for straightforward conversations. GPT-5 mini delivers efficient processing for simpler queries. The o3 series offers superior reasoning for complex tasks. Custom fine-tuned models can be used when you have specialized requirements.

 Configuration requires setting your API key and selecting a model:

@@ -181,7 +181,7 @@ Choosing the right model involves balancing several factors. Capability requirem

 ### Model Comparison

-GPT-3.5 Turbo offers the fastest responses at the lowest cost, suitable for straightforward questions. GPT-4 provides superior reasoning for complex queries at higher cost and latency. Local models like Llama variants offer privacy and cost predictability with varying capability levels. Specialized models may excel at particular domains like code or medical content.
+GPT-5 mini offers the fastest responses at the lowest cost, suitable for straightforward questions. Claude Sonnet 4.5 and GPT-5 provide superior reasoning for complex queries with good balance of cost and capability. Local models like Llama variants offer privacy and cost predictability with varying capability levels. Specialized models may excel at particular domains like code or medical content.

 ## Integration with Tools

@@ -38,7 +38,8 @@ Scripts stored as `.gbdialog` files in bot packages.

 | Provider | Models | Features |
 |----------|--------|----------|
-| OpenAI | GPT-3.5, GPT-4 | Streaming, function calling |
+| OpenAI | GPT-5, o3 | Streaming, function calling |
+| Anthropic | Claude Sonnet 4.5, Opus 4.5 | Analysis, coding, guidelines |
 | Local | GGUF models | GPU acceleration, offline |

 Features: prompt templates, context injection, token management, cost optimization.

@@ -223,7 +223,7 @@ USE MODEL "auto"
 name,value
 model-routing-strategy,auto
 model-default,fast
-model-fast,DeepSeek-R1-Distill-Qwen-1.5B-Q3_K_M.gguf
+model-fast,DeepSeek-R3-Distill-Qwen-1.5B-Q3_K_M.gguf
 model-quality,gpt-4
 model-code,codellama-7b.gguf
 ```

@@ -232,8 +232,8 @@ botserver --start
 ## INTEGRATION CAPABILITIES

 ### LLM Providers
-- OpenAI (GPT-4, GPT-3.5)
-- Anthropic (Claude)
+- OpenAI (GPT-5, o3)
+- Anthropic (Claude Sonnet 4.5, Opus 4.5)
 - Meta (Llama)
 - DeepSeek
 - Local models via Ollama