18 KiB
18 KiB
Architecture Reference
System architecture and design overview for General Bots.
High-Level Architecture
┌─────────────────────────────────────────────────────────────────┐
│ Clients │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Web │ │ Mobile │ │ API │ │ WhatsApp │ │
│ │ (HTMX) │ │ App │ │ Clients │ │ Telegram │ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ │
└───────┼─────────────┼─────────────┼─────────────┼───────────────┘
│ │ │ │
└─────────────┴──────┬──────┴─────────────┘
│
┌────────▼────────┐
│ HTTP Server │
│ (Axum) │
└────────┬────────┘
│
┌────────────────────┼────────────────────┐
│ │ │
┌───────▼───────┐ ┌─────────▼─────────┐ ┌──────▼──────┐
│ REST API │ │ WebSocket │ │ Static │
│ Handlers │ │ Handlers │ │ Files │
└───────┬───────┘ └─────────┬─────────┘ └─────────────┘
│ │
└────────┬───────────┘
│
┌────────▼────────┐
│ App State │
│ (Shared Arc) │
└────────┬────────┘
│
┌─────────────┼─────────────┬─────────────┐
│ │ │ │
┌──▼──┐ ┌────▼────┐ ┌────▼────┐ ┌───▼───┐
│ DB │ │ Cache │ │ Storage │ │ LLM │
│(PG) │ │ (Redis) │ │ (S3) │ │(Multi)│
└─────┘ └─────────┘ └─────────┘ └───────┘
Core Components
HTTP Server (Axum)
The main entry point for all requests.
// Main router structure
Router::new()
.nest("/api", api_routes())
.nest("/ws", websocket_routes())
.nest_service("/", static_files())
.layer(middleware_stack())
Responsibilities:
- Request routing
- Middleware execution
- CORS handling
- Rate limiting
- Authentication
App State
Shared application state passed to all handlers.
pub struct AppState {
pub db_pool: Pool<ConnectionManager<PgConnection>>,
pub redis: RedisPool,
pub s3_client: S3Client,
pub llm_client: LlmClient,
pub qdrant: QdrantClient,
pub config: AppConfig,
}
Contains:
- Database connection pool
- Redis cache connection
- S3 storage client
- LLM provider clients
- Vector database client
- Application configuration
Request Flow
Request → Router → Middleware → Handler → Response
│
├── Auth Check
├── Rate Limit
├── Logging
└── Error Handling
Module Structure
botserver/src/
├── main.rs # Entry point
├── lib.rs # Library exports
├── core/ # Core infrastructure
│ ├── shared/ # Shared types and state
│ │ ├── state.rs # AppState definition
│ │ ├── models.rs # Common models
│ │ └── schema.rs # Database schema
│ ├── urls.rs # URL constants
│ ├── secrets/ # Vault integration
│ └── rate_limit.rs # Rate limiting
├── basic/ # BASIC language
│ ├── compiler/ # Compiler/parser
│ ├── runtime/ # Execution engine
│ └── keywords/ # Keyword implementations
├── llm/ # LLM integration
│ ├── mod.rs # Provider abstraction
│ ├── openai.rs # OpenAI client
│ ├── anthropic.rs # Anthropic client
│ └── prompt_manager/ # Prompt management
├── multimodal/ # Media processing
│ ├── vision.rs # Image analysis
│ ├── audio.rs # Speech processing
│ └── document.rs # Document parsing
├── security/ # Authentication
│ ├── auth.rs # Auth middleware
│ ├── zitadel.rs # Zitadel client
│ └── jwt.rs # Token handling
├── analytics/ # Analytics module
├── calendar/ # Calendar/CalDAV
├── designer/ # Bot builder
├── drive/ # File storage
├── email/ # Email (IMAP/SMTP)
├── meet/ # Video conferencing
├── paper/ # Document editor
├── research/ # KB search
├── sources/ # Templates
└── tasks/ # Task management
Data Flow
Chat Message Flow
User Input
│
▼
┌──────────────┐
│ WebSocket │
│ Handler │
└──────┬───────┘
│
▼
┌──────────────┐ ┌──────────────┐
│ Session │────▶│ Message │
│ Manager │ │ History │
└──────┬───────┘ └──────────────┘
│
▼
┌──────────────┐
│ KB Search │◀────── Vector DB (Qdrant)
│ (Context) │
└──────┬───────┘
│
▼
┌──────────────┐
│ Tool Check │◀────── Registered Tools
│ │
└──────┬───────┘
│
▼
┌──────────────┐
│ LLM Call │◀────── Semantic Cache (Redis)
│ │
└──────┬───────┘
│
▼
┌──────────────┐
│ Response │
│ Streaming │
└──────┬───────┘
│
▼
User Response
Tool Execution Flow
LLM Response (tool_call)
│
▼
┌──────────────┐
│ Tool Router │
└──────┬───────┘
│
▼
┌──────────────┐
│BASIC Runtime │
└──────┬───────┘
│
├──▶ Database Operations
├──▶ HTTP Requests
├──▶ File Operations
└──▶ Email/Notifications
│
▼
┌──────────────┐
│ Tool Result │
└──────┬───────┘
│
▼
LLM (continue or respond)
Database Schema
Core Tables
-- User sessions
CREATE TABLE user_sessions (
id UUID PRIMARY KEY,
user_id VARCHAR(255),
bot_id VARCHAR(100),
status VARCHAR(20),
metadata JSONB,
created_at TIMESTAMPTZ,
updated_at TIMESTAMPTZ
);
-- Message history
CREATE TABLE message_history (
id UUID PRIMARY KEY,
session_id UUID REFERENCES user_sessions(id),
role VARCHAR(20),
content TEXT,
tokens_used INTEGER,
created_at TIMESTAMPTZ
);
-- Bot configurations
CREATE TABLE bot_configs (
id UUID PRIMARY KEY,
bot_id VARCHAR(100) UNIQUE,
config JSONB,
created_at TIMESTAMPTZ,
updated_at TIMESTAMPTZ
);
-- Knowledge base documents
CREATE TABLE kb_documents (
id UUID PRIMARY KEY,
collection VARCHAR(100),
content TEXT,
embedding_id VARCHAR(100),
metadata JSONB,
created_at TIMESTAMPTZ
);
Indexes
CREATE INDEX idx_sessions_user ON user_sessions(user_id);
CREATE INDEX idx_sessions_bot ON user_sessions(bot_id);
CREATE INDEX idx_messages_session ON message_history(session_id);
CREATE INDEX idx_kb_collection ON kb_documents(collection);
Caching Strategy
Cache Layers
┌─────────────────────────────────────────┐
│ Semantic Cache │
│ (LLM responses by query similarity) │
└───────────────────┬─────────────────────┘
│
┌───────────────────▼─────────────────────┐
│ Session Cache │
│ (Active sessions, user context) │
└───────────────────┬─────────────────────┘
│
┌───────────────────▼─────────────────────┐
│ Data Cache │
│ (KB results, config, templates) │
└─────────────────────────────────────────┘
Cache Keys
| Pattern | TTL | Description |
|---|---|---|
session:{id} |
30m | Active session data |
semantic:{hash} |
24h | LLM response cache |
kb:{collection}:{hash} |
1h | KB search results |
config:{bot_id} |
5m | Bot configuration |
user:{id} |
15m | User preferences |
Semantic Cache
// Query similarity check
let cache_key = compute_embedding_hash(query);
if let Some(cached) = redis.get(&cache_key).await? {
if similarity(query, cached.query) > 0.95 {
return cached.response;
}
}
LLM Integration
Provider Abstraction
pub trait LlmProvider: Send + Sync {
async fn complete(&self, request: CompletionRequest)
-> Result<CompletionResponse>;
async fn complete_stream(&self, request: CompletionRequest)
-> Result<impl Stream<Item = StreamChunk>>;
async fn embed(&self, text: &str)
-> Result<Vec<f32>>;
}
Supported Providers
| Provider | Models | Features |
|---|---|---|
| OpenAI | GPT-5, GPT-4o, o3 | Streaming, Functions, Vision |
| Anthropic | Claude Sonnet 4.5, Claude Opus 4.5 | Streaming, Long context |
| Groq | Llama 3.3, Mixtral | Fast inference |
| Ollama | Any local | Self-hosted |
Request Flow
// 1. Build messages with context
let messages = build_messages(history, kb_context, system_prompt);
// 2. Add tools if registered
let tools = get_registered_tools(session);
// 3. Check semantic cache
if let Some(cached) = semantic_cache.get(&messages).await? {
return Ok(cached);
}
// 4. Call LLM
let response = llm.complete(CompletionRequest {
messages,
tools,
temperature: config.temperature,
max_tokens: config.max_tokens,
}).await?;
// 5. Cache response
semantic_cache.set(&messages, &response).await?;
BASIC Runtime
Compilation Pipeline
Source Code (.bas)
│
▼
┌──────────────┐
│ Lexer │──▶ Tokens
└──────────────┘
│
▼
┌──────────────┐
│ Parser │──▶ AST
└──────────────┘
│
▼
┌──────────────┐
│ Analyzer │──▶ Validated AST
└──────────────┘
│
▼
┌──────────────┐
│ Runtime │──▶ Execution
└──────────────┘
Execution Context
pub struct RuntimeContext {
pub session: Session,
pub variables: HashMap<String, Value>,
pub tools: Vec<Tool>,
pub kb: Vec<KnowledgeBase>,
pub state: Arc<AppState>,
}
Security Architecture
Authentication Flow
Client Request
│
▼
┌──────────────┐
│ Extract │
│ Token │
└──────┬───────┘
│
▼
┌──────────────┐ ┌──────────────┐
│ Validate │────▶│ Zitadel │
│ JWT │ │ (OIDC) │
└──────┬───────┘ └──────────────┘
│
▼
┌──────────────┐
│ Check │
│ Permissions │
└──────┬───────┘
│
▼
Handler Execution
Security Layers
- Transport: TLS 1.3 (rustls)
- Authentication: JWT/OAuth 2.0 (Zitadel)
- Authorization: Role-based access control
- Rate Limiting: Per-IP token bucket
- Input Validation: Type-safe parameters
- Output Sanitization: HTML escaping
Deployment Architecture
Single Instance
┌─────────────────────────────────────┐
│ Single Server │
│ ┌─────────────────────────────┐ │
│ │ BotServer │ │
│ └─────────────────────────────┘ │
│ ┌────────┐ ┌────────┐ ┌───────┐ │
│ │PostgreSQL│ │ Redis │ │ MinIO │ │
│ └────────┘ └────────┘ └───────┘ │
└─────────────────────────────────────┘
Clustered
┌─────────────────────────────────────────────────────┐
│ Load Balancer │
└───────────────────────┬─────────────────────────────┘
│
┌───────────────┼───────────────┐
│ │ │
┌────▼────┐ ┌─────▼────┐ ┌────▼────┐
│BotServer│ │BotServer │ │BotServer│
│ #1 │ │ #2 │ │ #3 │
└────┬────┘ └─────┬────┘ └────┬────┘
│ │ │
└───────────────┼───────────────┘
│
┌────────────────────┼────────────────────┐
│ │ │
┌──▼───┐ ┌─────▼────┐ ┌────▼────┐
│ PG │ │ Redis │ │ S3 │
│Cluster│ │ Cluster │ │ Cluster │
└──────┘ └──────────┘ └─────────┘
Performance Characteristics
| Operation | Latency | Throughput |
|---|---|---|
| REST API | < 10ms | 10,000 req/s |
| WebSocket message | < 5ms | 50,000 msg/s |
| KB search | < 50ms | 1,000 req/s |
| LLM call (cached) | < 20ms | 5,000 req/s |
| LLM call (uncached) | 500-3000ms | 50 req/s |
Monitoring Points
| Metric | Description |
|---|---|
http_requests_total |
Total HTTP requests |
http_request_duration |
Request latency |
ws_connections |
Active WebSocket connections |
llm_requests_total |
LLM API calls |
llm_cache_hits |
Semantic cache hit rate |
db_pool_size |
Active DB connections |
memory_usage |
Process memory |
Extension Points
Adding a New Module
- Create module in
src/ - Define routes in
mod.rs - Register in
lib.rs - Add to router in
main.rs
Adding a New LLM Provider
- Implement
LlmProvidertrait - Add provider enum variant
- Register in provider factory
Adding a New BASIC Keyword
- Add keyword to lexer
- Implement AST node
- Add runtime handler
- Update documentation