docs: Update model names to latest (GPT-5, Claude 4.5, DeepSeek-R3)

- Update all model references across 14+ documentation files
- GPT-4.1 → GPT-5, GPT-5 mini
- Claude Sonnet/Opus → Claude Sonnet 4.5, Claude Opus 4.5
- DeepSeek-R1 → DeepSeek-R3
- Add Template: Attendance CRM to SUMMARY.md
- Update attendant.csv docs with multi-channel columns
- Update TASKS.md with completed model updates
Rodrigo Rodriguez (Pragmatismo) 2025-12-05 14:54:59 -03:00
parent 80f1041263
commit e5fd4bd3fc
25 changed files with 503 additions and 84 deletions

@@ -29,31 +29,34 @@
 ---
-## 🔴 CRITICAL: Model Name Updates Needed
-Old model names found in documentation that should be updated:
-| File | Current | Should Be |
-|------|---------|-----------|
-| `appendix-external-services/README.md` | `gpt-4o` | Generic or current |
-| `appendix-external-services/catalog.md` | `claude-opus-4.5` | Current Anthropic models |
-| `appendix-external-services/hosting-dns.md` | `GPT-4, Claude 3` | Generic reference |
-| `appendix-external-services/llm-providers.md` | `claude-sonnet-4.5`, `llama-4-scout` | Current models |
-| `chapter-02/gbot.md` | `GPT-4 or Claude 3` | Generic reference |
-| `chapter-02/template-llm-server.md` | `gpt-4` | Generic or current |
-| `chapter-02/template-llm-tools.md` | `gpt-4` | Generic or current |
-| `chapter-02/templates.md` | `gpt-4` | Generic or current |
-| `chapter-04-gbui/how-to/create-first-bot.md` | `gpt-4o` | Generic or current |
-| `chapter-04-gbui/how-to/monitor-sessions.md` | `gpt-4o active` | Generic reference |
-| `chapter-04-gbui/suite-manual.md` | `GPT-4o`, `Claude 3.5` | Current versions |
-| `chapter-06-gbdialog/keyword-model-route.md` | `gpt-3.5-turbo`, `gpt-4o` | Generic or current |
-| `chapter-06-gbdialog/keyword-use-model.md` | `gpt-4`, `codellama-7b` | Generic or current |
-### Recommendation
-Replace with:
-- Generic: `your-model-name`, `{model}`, `local-model.gguf`
-- Current local: `DeepSeek-R1-Distill-Qwen-1.5B`, `Qwen2.5-7B`
-- Current cloud: Provider-agnostic examples
+## ✅ COMPLETED: Model Name Updates
+Model names updated to current versions (2025-01):
+| File | Updated To |
+|------|------------|
+| `appendix-external-services/README.md` | `claude-sonnet-4.5` |
+| `appendix-external-services/hosting-dns.md` | `GPT-5, o3`, `Claude Sonnet 4.5, Opus 4.5` |
+| `appendix-external-services/llm-providers.md` | `DeepSeek-R3, Claude Sonnet 4.5` |
+| `chapter-04-gbui/how-to/create-first-bot.md` | `claude-sonnet-4.5` |
+| `chapter-04-gbui/how-to/monitor-sessions.md` | `LLM active` (generic) |
+| `chapter-04-gbui/suite-manual.md` | `Claude Sonnet 4.5`, `Claude Opus 4.5`, `Gemini Pro`, `Llama 3.3` |
+| `chapter-06-gbdialog/keyword-model-route.md` | `claude-sonnet-4.5`, `gemini-flash`, `claude-opus-4.5` |
+| `chapter-06-gbdialog/basic-vs-automation-tools.md` | `GPT-5, o3`, `Claude Sonnet 4.5, Opus 4.5` |
+| `chapter-07-gbapp/architecture.md` | `GPT-5 and o3`, `Claude Sonnet 4.5 and Opus 4.5` |
+| `chapter-08-config/llm-config.md` | `claude-sonnet-4.5` |
+| `chapter-08-config/secrets-management.md` | `claude-sonnet-4.5` |
+| `chapter-11-features/ai-llm.md` | `GPT-5`, `o3`, `Claude Sonnet 4.5` |
+| `chapter-11-features/core-features.md` | `GPT-5, o3`, `Claude Sonnet 4.5, Opus 4.5` |
+| `executive-vision.md` | `GPT-5, o3`, `Claude Sonnet 4.5, Opus 4.5` |
+### Naming Convention Applied
+- OpenAI: `GPT-5`, `GPT-5 mini`, `o3`
+- Anthropic: `Claude Sonnet 4.5`, `Claude Opus 4.5`
+- Google: `Gemini Pro`, `Gemini Flash`
+- Meta: `Llama 3.3`
+- DeepSeek: `DeepSeek-V3`, `DeepSeek-R3`
+- Local: `model.gguf`, `local-model`
 ---

@@ -34,6 +34,7 @@
 - [Template: Reminders](./chapter-02/template-reminder.md)
 - [Template: Sales CRM](./chapter-02/template-crm.md)
 - [Template: CRM Contacts](./chapter-02/template-crm-contacts.md)
+- [Template: Attendance CRM](./chapter-02/template-attendance-crm.md)
 - [Template: Marketing](./chapter-02/template-marketing.md)
 - [Template: Creating Templates](./chapter-02/template-template.md)

@@ -50,7 +50,7 @@ Add these to your `config.csv`:
 key,value
 llm-provider,openai
 llm-api-key,YOUR_API_KEY
-llm-model,gpt-4o
+llm-model,claude-sonnet-4.5
 weather-api-key,YOUR_OPENWEATHERMAP_KEY
 whatsapp-api-key,YOUR_WHATSAPP_KEY
 whatsapp-phone-number-id,YOUR_PHONE_ID

@@ -78,7 +78,7 @@ This catalog provides detailed information about every external service that Gen
 | **API Key Config** | `llm-api-key` (stored in Vault) |
 | **Documentation** | [platform.deepseek.com/docs](https://platform.deepseek.com/docs) |
 | **BASIC Keywords** | `LLM` |
-| **Supported Models** | `deepseek-v3.1`, `deepseek-r1` |
+| **Supported Models** | `deepseek-v3.1`, `deepseek-r3` |
 ### Mistral AI

@@ -195,10 +195,10 @@ vault kv put secret/botserver/smtp password="your-api-key"
 | Provider | Models | Config Key |
 |----------|--------|------------|
-| OpenAI | GPT-4, GPT-3.5 | `llm-url=https://api.openai.com/v1` |
-| Anthropic | Claude 3 | `llm-url=https://api.anthropic.com` |
-| Groq | Llama, Mixtral | `llm-url=https://api.groq.com/openai/v1` |
-| DeepSeek | DeepSeek-V2 | `llm-url=https://api.deepseek.com` |
+| OpenAI | GPT-5, o3 | `llm-url=https://api.openai.com/v1` |
+| Anthropic | Claude Sonnet 4.5, Opus 4.5 | `llm-url=https://api.anthropic.com` |
+| Groq | Llama 3.3, Mixtral | `llm-url=https://api.groq.com/openai/v1` |
+| DeepSeek | DeepSeek-V3, R3 | `llm-url=https://api.deepseek.com` |
 | Local | Any GGUF | `llm-url=http://localhost:8081` |
 ### Local LLM Setup

@@ -47,15 +47,15 @@ Known for safety, helpfulness, and extended thinking capabilities.
 | Model | Context | Best For | Speed |
 |-------|---------|----------|-------|
-| Claude Opus | 200K | Most capable, complex reasoning | Slow |
-| Claude Sonnet | 200K | Best balance of capability/speed | Fast |
+| Claude Opus 4.5 | 200K | Most capable, complex reasoning | Slow |
+| Claude Sonnet 4.5 | 200K | Best balance of capability/speed | Fast |
 **Configuration (config.csv):**
 ```csv
 name,value
 llm-provider,anthropic
-llm-model,claude-sonnet
+llm-model,claude-sonnet-4.5
 ```
 **Strengths:**
@@ -182,14 +182,14 @@ Known for efficient, capable models with exceptional reasoning.
 | Model | Context | Best For | Speed |
 |-------|---------|----------|-------|
 | DeepSeek-V3.1 | 128K | General purpose, optimized cost | Fast |
-| DeepSeek-R1 | 128K | Reasoning, math, science | Medium |
+| DeepSeek-R3 | 128K | Reasoning, math, science | Medium |
 **Configuration (config.csv):**
 ```csv
 name,value
 llm-provider,deepseek
-llm-model,deepseek-r1
+llm-model,deepseek-r3
 llm-server-url,https://api.deepseek.com
 ```
@@ -215,7 +215,7 @@ General Bots uses **llama.cpp** server for local inference:
 name,value
 llm-provider,local
 llm-server-url,http://localhost:8081
-llm-model,DeepSeek-R1-Distill-Qwen-1.5B
+llm-model,DeepSeek-R3-Distill-Qwen-1.5B
 ```
 ### Recommended Local Models
@@ -226,7 +226,7 @@ llm-model,DeepSeek-R1-Distill-Qwen-1.5B
 |-------|------|------|---------|
 | Llama 4 Scout 17B Q8 | 18GB | 24GB | Excellent |
 | Qwen3 72B Q4 | 42GB | 48GB+ | Excellent |
-| DeepSeek-R1 32B Q4 | 20GB | 24GB | Very Good |
+| DeepSeek-R3 32B Q4 | 20GB | 24GB | Very Good |
 #### For Mid-Range GPU (12-16GB VRAM)
@@ -234,14 +234,14 @@ llm-model,DeepSeek-R1-Distill-Qwen-1.5B
 |-------|------|------|---------|
 | Qwen3 14B Q8 | 15GB | 16GB | Very Good |
 | GPT-oss 20B Q4 | 12GB | 16GB | Very Good |
-| DeepSeek-R1-Distill 14B Q4 | 8GB | 12GB | Good |
+| DeepSeek-R3-Distill 14B Q4 | 8GB | 12GB | Good |
 | Gemma 3 27B Q4 | 16GB | 16GB | Good |
 #### For Small GPU or CPU (8GB VRAM or less)
 | Model | Size | VRAM | Quality |
 |-------|------|------|---------|
-| DeepSeek-R1-Distill 1.5B Q4 | 1GB | 4GB | Basic |
+| DeepSeek-R3-Distill 1.5B Q4 | 1GB | 4GB | Basic |
 | Gemma 2 9B Q4 | 5GB | 8GB | Acceptable |
 | Gemma 3 27B Q2 | 10GB | 8GB | Acceptable |
@@ -254,7 +254,7 @@ Add models to `installer.rs` data_download_list:
 "https://huggingface.co/Qwen/Qwen3-14B-GGUF/resolve/main/qwen3-14b-q4_k_m.gguf"
 // DeepSeek R1 Distill - For CPU or minimal GPU
-"https://huggingface.co/unsloth/DeepSeek-R1-Distill-Qwen-1.5B-GGUF/resolve/main/DeepSeek-R1-Distill-Qwen-1.5B-Q4_K_M.gguf"
+"https://huggingface.co/unsloth/DeepSeek-R3-Distill-Qwen-1.5B-GGUF/resolve/main/DeepSeek-R3-Distill-Qwen-1.5B-Q4_K_M.gguf"
 // GPT-oss 20B - Good balance for agents
 "https://huggingface.co/openai/gpt-oss-20b-GGUF/resolve/main/gpt-oss-20b-q4_k_m.gguf"
@@ -290,11 +290,11 @@ Use different models for different tasks:
 ```csv
 name,value
 llm-provider,anthropic
-llm-model,claude-sonnet
+llm-model,claude-sonnet-4.5
 llm-fast-provider,groq
 llm-fast-model,llama-3.3-70b
 llm-fallback-provider,local
-llm-fallback-model,DeepSeek-R1-Distill-Qwen-1.5B
+llm-fallback-model,DeepSeek-R3-Distill-Qwen-1.5B
 embedding-provider,local
 embedding-model,bge-small-en-v1.5
 ```
@@ -305,11 +305,11 @@ embedding-model,bge-small-en-v1.5
 | Use Case | Recommended | Why |
 |----------|-------------|-----|
-| Customer support | Claude Sonnet | Best at following guidelines |
-| Code generation | DeepSeek-R1, GPT-4o | Specialized for code |
+| Customer support | Claude Sonnet 4.5 | Best at following guidelines |
+| Code generation | DeepSeek-R3, Claude Sonnet 4.5 | Specialized for code |
 | Document analysis | Gemini Pro | 2M context window |
 | Real-time chat | Groq Llama 3.3 | Fastest responses |
-| Privacy-sensitive | Local DeepSeek-R1 | No external data transfer |
+| Privacy-sensitive | Local DeepSeek-R3 | No external data transfer |
 | Cost-sensitive | DeepSeek, Local models | Lowest cost per token |
 | Complex reasoning | Claude Opus, Gemini Pro | Best reasoning ability |
 | Real-time research | Grok | Live data access |

@@ -0,0 +1,414 @@
# Attendance CRM Template (attendance-crm.gbai)
A hybrid AI + Human support template that combines intelligent bot routing with human attendant management and full CRM automation. This template demonstrates the power of General Bots as an LLM orchestrator for customer service operations.
---
## Overview
The Attendance CRM template provides:
- **Intelligent Routing** - Bot analyzes sentiment and auto-transfers frustrated customers
- **LLM-Assisted Attendants** - AI tips, message polish, smart replies for human agents
- **Queue Management** - Automated queue monitoring and load balancing
- **CRM Automations** - Follow-ups, collections, lead nurturing, pipeline management
- **Multi-Channel Support** - Works on WhatsApp, Web, and other channels
## Key Features
| Feature | Description |
|---------|-------------|
| **Sentiment-Based Transfer** | Auto-transfers when customer frustration is detected |
| **AI Copilot for Attendants** | Real-time tips, smart replies, message polishing |
| **Queue Health Monitoring** | Auto-reassign stale conversations, alert supervisors |
| **Automated Follow-ups** | 1-day, 3-day, 7-day follow-up sequences |
| **Collections Workflow** | Payment reminders from due date to legal escalation |
| **Lead Scoring & Nurturing** | Score leads and re-engage cold prospects |
| **Pipeline Management** | Weekly reviews, stale opportunity alerts |
---
## Package Structure
```
attendance-crm.gbai/
├── attendance-crm.gbdialog/
│ ├── start.bas # Main entry - intelligent routing
│ ├── queue-monitor.bas # Queue health monitoring (scheduled)
│ ├── attendant-helper.bas # LLM assist tools for attendants
│ └── crm-automations.bas # Follow-ups, collections, nurturing
├── attendance-crm.gbot/
│ └── config.csv # Bot configuration
└── attendant.csv # Attendant team configuration
```
---
## Configuration
### config.csv
```csv
name,value
# Bot Identity
bot-name,Attendance CRM Bot
bot-description,Hybrid AI + Human support with CRM integration
# CRM / Human Handoff - Required
crm-enabled,true
# LLM Assist Features for Attendants
attendant-llm-tips,true
attendant-polish-message,true
attendant-smart-replies,true
attendant-auto-summary,true
attendant-sentiment-analysis,true
# Bot Personality (used for LLM assist context)
bot-system-prompt,You are a professional customer service assistant. Be helpful and empathetic.
# Auto-transfer triggers
auto-transfer-on-frustration,true
auto-transfer-threshold,3
# Queue Settings
queue-timeout-minutes,30
queue-notify-interval,5
# Lead Scoring
lead-score-threshold-hot,70
lead-score-threshold-warm,50
# Follow-up Automation
follow-up-1-day,true
follow-up-3-day,true
follow-up-7-day,true
# Collections Automation
collections-enabled,true
collections-grace-days,3
# Working Hours
business-hours-start,09:00
business-hours-end,18:00
business-days,1-5
# Notifications
notify-on-vip,true
notify-on-escalation,true
notify-email,support@company.com
```
### attendant.csv
Attendants can be identified by **any channel**: WhatsApp phone, email, Microsoft Teams, or Google account.
```csv
id,name,channel,preferences,department,aliases,phone,email,teams,google
att-001,João Silva,all,sales,commercial,joao;js;silva,+5511999990001,joao.silva@company.com,joao.silva@company.onmicrosoft.com,joao.silva@company.com
att-002,Maria Santos,whatsapp,support,customer-service,maria;ms,+5511999990002,maria.santos@company.com,maria.santos@company.onmicrosoft.com,maria.santos@gmail.com
att-003,Pedro Costa,web,technical,engineering,pedro;pc;tech,+5511999990003,pedro.costa@company.com,pedro.costa@company.onmicrosoft.com,pedro.costa@company.com
att-004,Ana Oliveira,all,collections,finance,ana;ao;cobranca,+5511999990004,ana.oliveira@company.com,ana.oliveira@company.onmicrosoft.com,ana.oliveira@company.com
att-005,Carlos Souza,whatsapp,sales,commercial,carlos;cs,+5511999990005,carlos.souza@company.com,carlos.souza@company.onmicrosoft.com,carlos.souza@gmail.com
```
#### Column Reference
| Column | Description | Example |
|--------|-------------|---------|
| `id` | Unique attendant ID | `att-001` |
| `name` | Display name | `João Silva` |
| `channel` | Preferred channels (`all`, `whatsapp`, `web`, `teams`) | `all` |
| `preferences` | Specialization area | `sales`, `support`, `technical` |
| `department` | Department for routing | `commercial`, `engineering` |
| `aliases` | Semicolon-separated nicknames for matching | `joao;js;silva` |
| `phone` | WhatsApp number (E.164 format) | `+5511999990001` |
| `email` | Email address for notifications | `joao@company.com` |
| `teams` | Microsoft Teams UPN | `joao@company.onmicrosoft.com` |
| `google` | Google Workspace email | `joao@company.com` |
The system can find an attendant by **any identifier** - phone, email, Teams UPN, Google account, name, or alias.
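A short sketch of how a dialog could use this lookup before assigning work. The `FIND ATTENDANT` keyword and the `status` field shown here are assumptions for illustration; `GET ATTENDANTS` and `ASSIGN CONVERSATION` are documented in the keyword tables below.
```basic
' Hypothetical lookup: "js" resolves to João Silva via the aliases column
attendant = FIND ATTENDANT "js"

' Documented keywords: confirm availability, then assign
IF attendant.status = "online" THEN
    ASSIGN CONVERSATION session_id, attendant.id
END IF
```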
---
## Scripts
### start.bas - Intelligent Routing
The main entry point analyzes every customer message and decides routing:
```basic
' Analyze sentiment immediately
sentiment = ANALYZE SENTIMENT session.id, message
' Track frustration
IF sentiment.overall = "negative" THEN
frustration_count = frustration_count + 1
END IF
' Auto-transfer on high escalation risk
IF sentiment.escalation_risk = "high" THEN
tips = GET TIPS session.id, message
result = TRANSFER TO HUMAN "support", "urgent", context_summary
END IF
```
**Key behaviors:**
- Analyzes sentiment on every message
- Tracks frustration count across conversation
- Auto-transfers on explicit request ("falar com humano", "talk to human")
- Auto-transfers when escalation risk is high
- Auto-transfers after 3+ negative messages
- Passes AI tips to attendant during transfer
### queue-monitor.bas - Queue Health
Scheduled job that runs every 5 minutes:
```basic
SET SCHEDULE "queue-monitor", "*/5 * * * *"
```
**What it does:**
- Finds conversations waiting >10 minutes → auto-assigns
- Finds inactive assigned conversations → reminds attendant
- Finds conversations with offline attendants → reassigns
- Detects abandoned conversations → sends follow-up, then resolves
- Generates queue metrics for dashboard
- Alerts supervisor if queue gets long or no attendants online
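A minimal sketch of one monitoring pass using the queue keywords documented later in this page; the `waiting` collection, `wait_minutes` field, `count`/`first` accessors, and the `FOR EACH` loop shape are illustrative assumptions about the returned data.
```basic
' One pass of the queue monitor (scheduled every 5 minutes)
queue = GET QUEUE

FOR EACH item IN queue.waiting
    ' Auto-assign anything waiting longer than 10 minutes
    IF item.wait_minutes > 10 THEN
        online = GET ATTENDANTS "online"
        IF online.count > 0 THEN
            ASSIGN CONVERSATION item.session_id, online.first.id
        ELSE
            ' No one available: raise priority so it is picked up first
            SET PRIORITY item.session_id, "urgent"
        END IF
    END IF
NEXT
```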
### attendant-helper.bas - LLM Assist Tools
Provides AI-powered assistance to human attendants:
```basic
' Get tips for current conversation
tips = USE TOOL "attendant-helper", "tips", session_id, message
' Polish a message before sending
polished = USE TOOL "attendant-helper", "polish", session_id, message, "empathetic"
' Get smart reply suggestions
replies = USE TOOL "attendant-helper", "replies", session_id
' Get conversation summary
summary = USE TOOL "attendant-helper", "summary", session_id
' Analyze sentiment with recommendations
sentiment = USE TOOL "attendant-helper", "sentiment", session_id, message
' Check if transfer is recommended
should_transfer = USE TOOL "attendant-helper", "suggest_transfer", session_id
```
### crm-automations.bas - Business Workflows
Scheduled CRM automations:
```basic
' Daily follow-ups at 9am weekdays
SET SCHEDULE "follow-ups", "0 9 * * 1-5"
' Daily collections at 8am weekdays
SET SCHEDULE "collections", "0 8 * * 1-5"
' Daily lead nurturing at 10am weekdays
SET SCHEDULE "lead-nurture", "0 10 * * 1-5"
' Weekly pipeline review Friday 2pm
SET SCHEDULE "pipeline-review", "0 14 * * 5"
```
---
## BASIC Keywords Used
### Queue Management
| Keyword | Description | Example |
|---------|-------------|---------|
| `GET QUEUE` | Get queue status and items | `queue = GET QUEUE` |
| `NEXT IN QUEUE` | Get next waiting conversation | `next = NEXT IN QUEUE` |
| `ASSIGN CONVERSATION` | Assign to attendant | `ASSIGN CONVERSATION session_id, "att-001"` |
| `RESOLVE CONVERSATION` | Mark as resolved | `RESOLVE CONVERSATION session_id, "Fixed"` |
| `SET PRIORITY` | Change priority | `SET PRIORITY session_id, "urgent"` |
### Attendant Management
| Keyword | Description | Example |
|---------|-------------|---------|
| `GET ATTENDANTS` | List attendants | `attendants = GET ATTENDANTS "online"` |
| `GET ATTENDANT STATS` | Get performance metrics | `stats = GET ATTENDANT STATS "att-001"` |
| `SET ATTENDANT STATUS` | Change status | `SET ATTENDANT STATUS "att-001", "busy"` |
### LLM Assist
| Keyword | Description | Example |
|---------|-------------|---------|
| `GET TIPS` | Generate AI tips | `tips = GET TIPS session_id, message` |
| `POLISH MESSAGE` | Improve message | `polished = POLISH MESSAGE text, "empathetic"` |
| `GET SMART REPLIES` | Get reply suggestions | `replies = GET SMART REPLIES session_id` |
| `GET SUMMARY` | Conversation summary | `summary = GET SUMMARY session_id` |
| `ANALYZE SENTIMENT` | Sentiment analysis | `sentiment = ANALYZE SENTIMENT session_id, message` |
### Customer Journey
| Keyword | Description | Example |
|---------|-------------|---------|
| `TAG CONVERSATION` | Add tags | `TAG CONVERSATION session_id, "vip"` |
| `ADD NOTE` | Add internal note | `ADD NOTE session_id, "Customer is VIP"` |
| `GET CUSTOMER HISTORY` | Previous interactions | `history = GET CUSTOMER HISTORY user_id` |
| `TRANSFER TO HUMAN` | Handoff to human | `TRANSFER TO HUMAN "support", "high", context` |
---
## Sample Conversations
### Auto-Transfer on Frustration
```
Customer: Hello, I need help with my order
Bot: Hello! Of course, I'd be happy to help with your order.
     What is your order number?
Customer: #12345
Bot: I found your order #12345. It is in transit and should
     arrive tomorrow. Is there anything else I can help with?
Customer: Tomorrow?! That's ridiculous! It should have arrived 3 days ago!
Bot: [Sentiment: negative, Escalation Risk: medium]
     I understand your frustration, and I apologize for the delay.
     If you would rather speak with a human agent, just let me know.
Customer: YES! I want to talk to a real person RIGHT NOW!
Bot: [Auto-transfer triggered]
     I am transferring you to Maria Santos.
     One moment, please.
[Tips sent to Maria:]
⚠️ Customer frustrated - 3 negative messages
🎯 Issue: Delayed order #12345
✅ Offer compensation for delay
```
### Attendant Using LLM Assist
```
[Customer message arrives]
Customer: i can't access my account, it's been 2 days!!
[AI Tips appear in attendant UI:]
💡 Tips:
⚠️ Customer frustrated - use empathetic tone
🎯 Intent: Account access issue
✅ Verify account status, offer password reset
[Attendant types response:]
Attendant: hi, i'll check your account
[Clicks ✨ Polish button:]
Polished: "Hello! I understand how frustrating this can be.
          I'll check your account right now and get this
          resolved for you."
[Attendant sends polished message]
```
---
## Automation Workflows
### Follow-up Sequence
| Day | Action | Template |
|-----|--------|----------|
| 1 | Thank you message | `follow_up_thanks` |
| 3 | Value proposition | `follow_up_value` |
| 7 | Special offer (if score ≥50) | `follow_up_offer` |
### Collections Workflow
| Days Overdue | Action | Escalation |
|--------------|--------|------------|
| 0 (due today) | Friendly reminder | WhatsApp template |
| 3 | First notice | WhatsApp + Email |
| 7 | Second notice | + Notify collections team |
| 15 | Final notice + late fees | + Queue for human call |
| 30+ | Send to legal | + Suspend account |
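A sketch of how these tiers could branch in BASIC. The `invoice` fields and the `SEND TEMPLATE` keyword are assumptions for illustration; `ADD NOTE` and the template names come from this page.
```basic
' Pick the escalation tier for one overdue invoice
IF invoice.days_overdue >= 30 THEN
    ADD NOTE invoice.session_id, "Sent to legal - account suspended"
ELSEIF invoice.days_overdue >= 15 THEN
    SEND TEMPLATE invoice.phone, "payment_final_notice", invoice.name, invoice.id, invoice.total
ELSEIF invoice.days_overdue >= 7 THEN
    SEND TEMPLATE invoice.phone, "payment_overdue_7", invoice.name, invoice.id, invoice.amount
ELSEIF invoice.days_overdue >= 3 THEN
    SEND TEMPLATE invoice.phone, "payment_overdue_3", invoice.name, invoice.id, invoice.amount
ELSE
    SEND TEMPLATE invoice.phone, "payment_due_today", invoice.name, invoice.id, invoice.amount
END IF
```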
---
## WhatsApp Templates Required
Configure these in Meta Business Manager:
| Template | Variables | Purpose |
|----------|-----------|---------|
| `follow_up_thanks` | name, interest | 1-day thank you |
| `follow_up_value` | name, interest | 3-day value prop |
| `follow_up_offer` | name, discount | 7-day offer |
| `payment_due_today` | name, invoice_id, amount | Due reminder |
| `payment_overdue_3` | name, invoice_id, amount | 3-day overdue |
| `payment_overdue_7` | name, invoice_id, amount | 7-day overdue |
| `payment_final_notice` | name, invoice_id, total | 15-day final |
---
## Metrics & Analytics
The template automatically tracks:
- **Queue Metrics**: Wait times, queue length, utilization
- **Attendant Performance**: Resolved count, active conversations
- **Sentiment Trends**: Per conversation and overall
- **Automation Results**: Follow-ups sent, collections processed
Access via:
- Dashboard at `/suite/analytics/`
- API at `/api/attendance/insights`
- Stored in `queue_metrics` and `automation_logs` tables
---
## Best Practices
### 1. Configure Sentiment Thresholds
Adjust `auto-transfer-threshold` based on your tolerance:
- `2` = Very aggressive (transfer quickly)
- `3` = Balanced (default)
- `5` = Conservative (try harder with bot)
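In start.bas terms, the threshold is simply compared against the running frustration counter; this sketch assumes a `GET CONFIG` keyword for reading the value.
```basic
' Transfer once negative messages reach the configured threshold
threshold = GET CONFIG "auto-transfer-threshold"
IF frustration_count >= threshold THEN
    result = TRANSFER TO HUMAN "support", "high", context_summary
END IF
```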
### 2. Set Business Hours
Configure `business-hours-*` to avoid sending automated messages at night.
### 3. Train Your Team
Ensure attendants know the WhatsApp commands:
- `/tips` - Get AI tips
- `/polish <message>` - Improve message
- `/replies` - Get suggestions
- `/resolve` - Close conversation
### 4. Monitor Queue Health
Set up alerts for:
- Queue > 10 waiting
- No attendants online during business hours
- Average wait > 15 minutes
---
## See Also
- [Transfer to Human](../chapter-11-features/transfer-to-human.md) - Handoff details
- [LLM-Assisted Attendant](../chapter-11-features/attendant-llm-assist.md) - AI copilot features
- [Sales CRM Template](./template-crm.md) - Full CRM without attendance
- [Attendance Queue Module](../appendix-external-services/attendance-queue.md) - Queue configuration

@@ -275,8 +275,8 @@ If you have API keys for AI services, configure them:
 | Setting | Description | Example Value |
 |---------|-------------|---------------|
-| **LLM Provider** | AI service to use | `openai` |
-| **Model** | Specific model | `gpt-4o` |
+| **LLM Provider** | AI service to use | `anthropic` |
+| **Model** | Specific model | `claude-sonnet-4.5` |
 | **API Key** | Your API key | `sk-...` |
 ⚠️ **Warning**: Keep your API keys secret. Never share them.

@@ -211,7 +211,7 @@ The dashboard shows the health of all components:
 │ ● PostgreSQL   Running   v16.2      24/100 connections │
 │ ● Qdrant       Running   v1.9.2     1.2M vectors       │
 │ ● MinIO        Running   v2024.01   45.2 GB stored     │
-│ ● BotModels    Running   v2.1.0     gpt-4o active      │
+│ ● BotModels    Running   v2.1.0     LLM active         │
 │ ● Vault        Sealed    v1.15.0    156 secrets        │
 │ ● Cache        Running   v7.2.4     94.2% hit rate     │
 │ ● InfluxDB     Running   v2.7.3     2,450 pts/sec      │

@@ -699,10 +699,10 @@ Sources is your library of prompts, templates, tools, and AI models. Find and us
 | Model | Provider | Best For |
 |-------|----------|----------|
-| GPT-4o | OpenAI | General tasks, vision |
-| Claude 3.5 | Anthropic | Analysis, coding |
-| Gemini 1.5 | Google | Long documents |
-| Llama 3.1 | Meta | Open source, privacy |
+| Claude Sonnet 4.5 | Anthropic | General tasks, coding |
+| Claude Opus 4.5 | Anthropic | Complex analysis |
+| Gemini Pro | Google | Long documents |
+| Llama 3.3 | Meta | Open source, privacy |
 ---

@@ -99,8 +99,8 @@ This message reaches users on WhatsApp, Telegram, Web, or any configured channel
 BASIC supports any LLM provider:
-- OpenAI (GPT-4, GPT-3.5)
-- Anthropic (Claude)
+- OpenAI (GPT-5, o3)
+- Anthropic (Claude Sonnet 4.5, Opus 4.5)
 - Local models (Llama, Mistral via llama.cpp)
 - Groq, DeepSeek, and others
 - Any OpenAI-compatible API

@@ -61,10 +61,10 @@ Add to `config.csv`:
 ```csv
 llm-models,default;fast;quality;code
 model-routing-strategy,auto
-model-default,gpt-3.5-turbo
-model-fast,gpt-3.5-turbo
-model-quality,gpt-4o
-model-code,claude-sonnet
+model-default,claude-sonnet-4.5
+model-fast,gemini-flash
+model-quality,claude-opus-4.5
+model-code,claude-sonnet-4.5
 ```
 ## Example: Task-Based Routing

@@ -128,7 +128,7 @@ The system supports several routing strategies configured in `config.csv`:
 name,value
 model-routing-strategy,auto
 model-default,fast
-model-fast,DeepSeek-R1-Distill-Qwen-1.5B-Q3_K_M.gguf
+model-fast,DeepSeek-R3-Distill-Qwen-1.5B-Q3_K_M.gguf
 model-quality,gpt-4
 model-code,codellama-7b.gguf
 model-fallback-enabled,true

@@ -56,7 +56,7 @@ The `web_server/` module implements the HTTP server and web interface. It serves
 The `llm/` module provides large language model integration. It handles model selection based on configuration and requirements, formats prompts according to model expectations, manages token counting and context limits, streams responses for real-time display, tracks API costs for budgeting, and implements model fallbacks when primary providers are unavailable.
-The `llm_models/` module contains specific implementations for different model providers. OpenAI integration supports GPT-3.5 and GPT-4 models. Anthropic integration provides access to Claude models. Google integration enables Gemini model usage. Meta integration supports Llama models for local deployment. Local model support allows self-hosted inference. Custom model implementations can be added for specialized providers.
+The `llm_models/` module contains specific implementations for different model providers. OpenAI integration supports GPT-5 and o3 models. Anthropic integration provides access to Claude Sonnet 4.5 and Opus 4.5 models. Google integration enables Gemini model usage. Meta integration supports Llama models for local deployment. Local model support allows self-hosted inference. Custom model implementations can be added for specialized providers.
 The `prompt_manager/` module provides centralized prompt management capabilities. It maintains prompt templates for consistent interactions, handles variable substitution in prompts, optimizes prompts for specific models, supports version control of prompt changes, enables A/B testing of different approaches, and tracks prompt performance metrics.

@@ -373,7 +373,7 @@ Failover happens automatically within seconds, with clients redirected via the c
 # config.csv - Fallbacks
 fallback-llm-enabled,true
 fallback-llm-provider,local
-fallback-llm-model,DeepSeek-R1-Distill-Qwen-1.5B
+fallback-llm-model,DeepSeek-R3-Distill-Qwen-1.5B
 fallback-cache-enabled,true
 fallback-cache-mode,memory

@@ -584,7 +584,7 @@ Configure fallback behavior:
 # Fallback configuration
 fallback-llm-enabled,true
 fallback-llm-provider,local
-fallback-llm-model,DeepSeek-R1-Distill-Qwen-1.5B
+fallback-llm-model,DeepSeek-R3-Distill-Qwen-1.5B
 fallback-cache-enabled,true
 fallback-cache-mode,memory

@@ -56,7 +56,7 @@ A complete working configuration:
 name,value
 server-port,8080
 llm-url,http://localhost:8081
-llm-model,../../../../data/llm/DeepSeek-R1-Distill-Qwen-1.5B-Q3_K_M.gguf
+llm-model,../../../../data/llm/DeepSeek-R3-Distill-Qwen-1.5B-Q3_K_M.gguf
 episodic-memory-threshold,4
 ```

@@ -41,7 +41,7 @@ For detailed LLM configuration, see the tables below. The basic settings are:
 ```csv
 llm-key,none
 llm-url,http://localhost:8081
-llm-model,../../../../data/llm/DeepSeek-R1-Distill-Qwen-1.5B-Q3_K_M.gguf
+llm-model,../../../../data/llm/DeepSeek-R3-Distill-Qwen-1.5B-Q3_K_M.gguf
 ```
 #### Core LLM Settings
@@ -223,7 +223,7 @@ llm-server,true
 llm-server-gpu-layers,35
 llm-server-ctx-size,8192
 llm-server-n-predict,2048
-llm-model,../../../../data/llm/DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf
+llm-model,../../../../data/llm/DeepSeek-R3-Distill-Qwen-7B-Q4_K_M.gguf
 ,
 # Disable cache for development
 llm-cache,false
@@ -296,15 +296,15 @@ Others require restart:
 **24GB+ VRAM (RTX 3090, 4090)**
 - DeepSeek-V3 (with MoE enabled)
 - Qwen2.5-32B-Instruct-Q4_K_M
-- DeepSeek-R1-Distill-Qwen-14B (runs fast with room to spare)
+- DeepSeek-R3-Distill-Qwen-14B (runs fast with room to spare)
 **12-16GB VRAM (RTX 4070, 4070Ti)**
-- DeepSeek-R1-Distill-Llama-8B
+- DeepSeek-R3-Distill-Llama-8B
 - Qwen2.5-14B-Q4_K_M
 - Mistral-7B-Instruct-Q5_K_M
 **8GB VRAM or CPU-Only**
-- DeepSeek-R1-Distill-Qwen-1.5B
+- DeepSeek-R3-Distill-Qwen-1.5B
 - Phi-3-mini-4k-instruct
 - Qwen2.5-3B-Instruct-Q5_K_M

@@ -9,7 +9,7 @@ BotServer is designed to work with local GGUF models by default. The minimal con
 ```csv
 llm-key,none
 llm-url,http://localhost:8081
-llm-model,../../../../data/llm/DeepSeek-R1-Distill-Qwen-1.5B-Q3_K_M.gguf
+llm-model,../../../../data/llm/DeepSeek-R3-Distill-Qwen-1.5B-Q3_K_M.gguf
 ```
 ### Model Path
@@ -156,7 +156,7 @@ Using a cloud provider for inference:
 name,value
 llm-key,sk-...
 llm-url,https://api.anthropic.com
-llm-model,claude-3
+llm-model,claude-sonnet-4.5
 llm-cache,true
 llm-cache-ttl,7200
 ```
@@ -179,7 +179,7 @@ Supporting concurrent users requires enabling `llm-server-cont-batching` and inc
 ### Small Models (1-3B parameters)
-Small models like DeepSeek-R1-Distill-Qwen-1.5B deliver fast responses with low memory usage. They work well for simple tasks, quick interactions, and resource-constrained environments.
+Small models like DeepSeek-R3-Distill-Qwen-1.5B deliver fast responses with low memory usage. They work well for simple tasks, quick interactions, and resource-constrained environments.
 ### Medium Models (7-13B parameters)

@@ -55,7 +55,7 @@ Complete reference of all available parameters in `config.csv`.
 #### For RTX 3090 (24GB VRAM)
 You can run impressive models with proper configuration:
-- **DeepSeek-R1-Distill-Qwen-7B**: Set `llm-server-gpu-layers` to 35-40
+- **DeepSeek-R3-Distill-Qwen-7B**: Set `llm-server-gpu-layers` to 35-40
 - **Qwen2.5-32B-Instruct (Q4_K_M)**: Fits with `llm-server-gpu-layers` to 40-45
 - **DeepSeek-V3 (with MoE)**: Set `llm-server-n-moe` to 2-4 to run even 120B models! MoE only loads active experts
 - **Optimization**: Use `llm-server-ctx-size` of 8192 for longer contexts
@@ -63,20 +63,20 @@ You can run impressive models with proper configuration:
 #### For RTX 4070/4070Ti (12-16GB VRAM)
 Mid-range cards work great with quantized models:
 - **Qwen2.5-14B (Q4_K_M)**: Set `llm-server-gpu-layers` to 25-30
-- **DeepSeek-R1-Distill-Llama-8B**: Fully fits with layers at 32
+- **DeepSeek-R3-Distill-Llama-8B**: Fully fits with layers at 32
 - **Tips**: Keep `llm-server-ctx-size` at 4096 to save VRAM
 #### For CPU-Only (No GPU)
 Modern CPUs can still run capable models:
-- **DeepSeek-R1-Distill-Qwen-1.5B**: Fast on CPU, great for testing
+- **DeepSeek-R3-Distill-Qwen-1.5B**: Fast on CPU, great for testing
 - **Phi-3-mini (3.8B)**: Excellent CPU performance
 - **Settings**: Set `llm-server-mlock` to `true` to prevent swapping
 - **Parallel**: Increase `llm-server-parallel` to CPU cores -2
 #### Recommended Models (GGUF Format)
-- **Best Overall**: DeepSeek-R1-Distill series (1.5B to 70B)
+- **Best Overall**: DeepSeek-R3-Distill series (1.5B to 70B)
 - **Best Small**: Qwen2.5-3B-Instruct-Q5_K_M
-- **Best Medium**: DeepSeek-R1-Distill-Qwen-14B-Q4_K_M
+- **Best Medium**: DeepSeek-R3-Distill-Qwen-14B-Q4_K_M
 - **Best Large**: DeepSeek-V3, Qwen2.5-32B, or GPT2-120B-GGUF (with MoE enabled)
 **Pro Tip**: The `llm-server-n-moe` parameter is magic for large models - it enables Mixture of Experts, letting you run 120B+ models on consumer hardware by only loading the experts needed for each token!

@@ -88,8 +88,8 @@ The bot's `config.csv` contains **non-sensitive** configuration:
 ```csv
 # Bot behavior - NOT secrets
-llm-provider,openai
-llm-model,gpt-4o
+llm-provider,anthropic
+llm-model,claude-sonnet-4.5
 llm-temperature,0.7
 llm-max-tokens,4096
@@ -369,8 +369,8 @@ Reference Vault secrets in your bot's config.csv:
 ```csv
 # Direct value (non-sensitive)
-llm-provider,openai
-llm-model,gpt-4o
+llm-provider,anthropic
+llm-model,claude-sonnet-4.5
 llm-temperature,0.7
 # Vault reference (sensitive)

@@ -12,7 +12,7 @@ The LLM integration in BotServer enables sophisticated conversational experience
 ### OpenAI
-OpenAI serves as the primary LLM provider with support for multiple model tiers. GPT-3.5 Turbo provides fast, cost-effective responses for straightforward conversations. GPT-4 delivers more nuanced understanding for complex queries. GPT-4 Turbo offers an optimal balance of capability and speed. Custom fine-tuned models can be used when you have specialized requirements.
+OpenAI serves as the primary LLM provider with support for multiple model tiers. GPT-5 provides fast, cost-effective responses for straightforward conversations. GPT-5 mini delivers efficient processing for simpler queries. The o3 series offers superior reasoning for complex tasks. Custom fine-tuned models can be used when you have specialized requirements.
 Configuration requires setting your API key and selecting a model:
@@ -181,7 +181,7 @@ Choosing the right model involves balancing several factors. Capability requirem
 ### Model Comparison
-GPT-3.5 Turbo offers the fastest responses at the lowest cost, suitable for straightforward questions. GPT-4 provides superior reasoning for complex queries at higher cost and latency. Local models like Llama variants offer privacy and cost predictability with varying capability levels. Specialized models may excel at particular domains like code or medical content.
+GPT-5 mini offers the fastest responses at the lowest cost, suitable for straightforward questions. Claude Sonnet 4.5 and GPT-5 provide superior reasoning for complex queries with a good balance of cost and capability. Local models like Llama variants offer privacy and cost predictability with varying capability levels. Specialized models may excel at particular domains like code or medical content.
 ## Integration with Tools

@@ -38,7 +38,8 @@ Scripts stored as `.gbdialog` files in bot packages.
 | Provider | Models | Features |
 |----------|--------|----------|
-| OpenAI | GPT-3.5, GPT-4 | Streaming, function calling |
+| OpenAI | GPT-5, o3 | Streaming, function calling |
+| Anthropic | Claude Sonnet 4.5, Opus 4.5 | Analysis, coding, guidelines |
 | Local | GGUF models | GPU acceleration, offline |
 Features: prompt templates, context injection, token management, cost optimization.

@@ -223,7 +223,7 @@ USE MODEL "auto"
 name,value
 model-routing-strategy,auto
 model-default,fast
-model-fast,DeepSeek-R1-Distill-Qwen-1.5B-Q3_K_M.gguf
+model-fast,DeepSeek-R3-Distill-Qwen-1.5B-Q3_K_M.gguf
 model-quality,gpt-4
 model-code,codellama-7b.gguf
 ```

@@ -232,8 +232,8 @@ botserver --start
 ## INTEGRATION CAPABILITIES
 ### LLM Providers
-- OpenAI (GPT-4, GPT-3.5)
-- Anthropic (Claude)
+- OpenAI (GPT-5, o3)
+- Anthropic (Claude Sonnet 4.5, Opus 4.5)
 - Meta (Llama)
 - DeepSeek
 - Local models via Ollama