botserver/docs/reference/architecture.md
Rodrigo Rodriguez (Pragmatismo) d1301c9cd8 Add balanced documentation structure
Documentation organized with equilibrium:
- Small (50-100 lines): Index files
- Medium (250-400 lines): Guides
- Large (450-600 lines): Complete references

Structure:
- docs/api/ - REST endpoints, WebSocket
- docs/guides/ - Getting started, deployment, templates
- docs/reference/ - BASIC language, configuration, architecture

Updated README.md to point to new docs location.
2025-12-04 12:44:18 -03:00

18 KiB

Architecture Reference

System architecture and design overview for General Bots.

High-Level Architecture

┌─────────────────────────────────────────────────────────────────┐
│                         Clients                                  │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐        │
│  │   Web    │  │  Mobile  │  │   API    │  │ WhatsApp │        │
│  │  (HTMX)  │  │   App    │  │ Clients  │  │ Telegram │        │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘        │
└───────┼─────────────┼─────────────┼─────────────┼───────────────┘
        │             │             │             │
        └─────────────┴──────┬──────┴─────────────┘
                             │
                    ┌────────▼────────┐
                    │   HTTP Server   │
                    │     (Axum)      │
                    └────────┬────────┘
                             │
        ┌────────────────────┼────────────────────┐
        │                    │                    │
┌───────▼───────┐  ┌─────────▼─────────┐  ┌──────▼──────┐
│  REST API     │  │    WebSocket      │  │   Static    │
│  Handlers     │  │    Handlers       │  │   Files     │
└───────┬───────┘  └─────────┬─────────┘  └─────────────┘
        │                    │
        └────────┬───────────┘
                 │
        ┌────────▼────────┐
        │   App State     │
        │  (Shared Arc)   │
        └────────┬────────┘
                 │
   ┌─────────────┼─────────────┬─────────────┐
   │             │             │             │
┌──▼──┐     ┌────▼────┐   ┌────▼────┐   ┌───▼───┐
│ DB  │     │  Cache  │   │ Storage │   │  LLM  │
│(PG) │     │ (Redis) │   │  (S3)   │   │(Multi)│
└─────┘     └─────────┘   └─────────┘   └───────┘

Core Components

HTTP Server (Axum)

The main entry point for all requests.

// Main router structure
Router::new()
    .nest("/api", api_routes())
    .nest("/ws", websocket_routes())
    .nest_service("/", static_files())
    .layer(middleware_stack())

Responsibilities:

  • Request routing
  • Middleware execution
  • CORS handling
  • Rate limiting
  • Authentication

App State

Shared application state passed to all handlers.

pub struct AppState {
    pub db_pool: Pool<ConnectionManager<PgConnection>>,
    pub redis: RedisPool,
    pub s3_client: S3Client,
    pub llm_client: LlmClient,
    pub qdrant: QdrantClient,
    pub config: AppConfig,
}

Contains:

  • Database connection pool
  • Redis cache connection
  • S3 storage client
  • LLM provider clients
  • Vector database client
  • Application configuration

Request Flow

Request → Router → Middleware → Handler → Response
                      │
                      ├── Auth Check
                      ├── Rate Limit
                      ├── Logging
                      └── Error Handling

Module Structure

botserver/src/
├── main.rs              # Entry point
├── lib.rs               # Library exports
├── core/                # Core infrastructure
│   ├── shared/          # Shared types and state
│   │   ├── state.rs     # AppState definition
│   │   ├── models.rs    # Common models
│   │   └── schema.rs    # Database schema
│   ├── urls.rs          # URL constants
│   ├── secrets/         # Vault integration
│   └── rate_limit.rs    # Rate limiting
├── basic/               # BASIC language
│   ├── compiler/        # Compiler/parser
│   ├── runtime/         # Execution engine
│   └── keywords/        # Keyword implementations
├── llm/                 # LLM integration
│   ├── mod.rs           # Provider abstraction
│   ├── openai.rs        # OpenAI client
│   ├── anthropic.rs     # Anthropic client
│   └── prompt_manager/  # Prompt management
├── multimodal/          # Media processing
│   ├── vision.rs        # Image analysis
│   ├── audio.rs         # Speech processing
│   └── document.rs      # Document parsing
├── security/            # Authentication
│   ├── auth.rs          # Auth middleware
│   ├── zitadel.rs       # Zitadel client
│   └── jwt.rs           # Token handling
├── analytics/           # Analytics module
├── calendar/            # Calendar/CalDAV
├── designer/            # Bot builder
├── drive/               # File storage
├── email/               # Email (IMAP/SMTP)
├── meet/                # Video conferencing
├── paper/               # Document editor
├── research/            # KB search
├── sources/             # Templates
└── tasks/               # Task management

Data Flow

Chat Message Flow

User Input
    │
    ▼
┌──────────────┐
│  WebSocket   │
│   Handler    │
└──────┬───────┘
       │
       ▼
┌──────────────┐     ┌──────────────┐
│   Session    │────▶│   Message    │
│   Manager    │     │   History    │
└──────┬───────┘     └──────────────┘
       │
       ▼
┌──────────────┐
│  KB Search   │◀────── Vector DB (Qdrant)
│  (Context)   │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│ Tool Check   │◀────── Registered Tools
│              │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  LLM Call    │◀────── Semantic Cache (Redis)
│              │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  Response    │
│  Streaming   │
└──────┬───────┘
       │
       ▼
User Response

Tool Execution Flow

LLM Response (tool_call)
    │
    ▼
┌──────────────┐
│ Tool Router  │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│BASIC Runtime │
└──────┬───────┘
       │
       ├──▶ Database Operations
       ├──▶ HTTP Requests
       ├──▶ File Operations
       └──▶ Email/Notifications
       │
       ▼
┌──────────────┐
│ Tool Result  │
└──────┬───────┘
       │
       ▼
LLM (continue or respond)

Database Schema

Core Tables

-- User sessions
CREATE TABLE user_sessions (
    id UUID PRIMARY KEY,
    user_id VARCHAR(255),
    bot_id VARCHAR(100),
    status VARCHAR(20),
    metadata JSONB,
    created_at TIMESTAMPTZ,
    updated_at TIMESTAMPTZ
);

-- Message history
CREATE TABLE message_history (
    id UUID PRIMARY KEY,
    session_id UUID REFERENCES user_sessions(id),
    role VARCHAR(20),
    content TEXT,
    tokens_used INTEGER,
    created_at TIMESTAMPTZ
);

-- Bot configurations
CREATE TABLE bot_configs (
    id UUID PRIMARY KEY,
    bot_id VARCHAR(100) UNIQUE,
    config JSONB,
    created_at TIMESTAMPTZ,
    updated_at TIMESTAMPTZ
);

-- Knowledge base documents
CREATE TABLE kb_documents (
    id UUID PRIMARY KEY,
    collection VARCHAR(100),
    content TEXT,
    embedding_id VARCHAR(100),
    metadata JSONB,
    created_at TIMESTAMPTZ
);

Indexes

CREATE INDEX idx_sessions_user ON user_sessions(user_id);
CREATE INDEX idx_sessions_bot ON user_sessions(bot_id);
CREATE INDEX idx_messages_session ON message_history(session_id);
CREATE INDEX idx_kb_collection ON kb_documents(collection);

Caching Strategy

Cache Layers

┌─────────────────────────────────────────┐
│            Semantic Cache               │
│  (LLM responses by query similarity)    │
└───────────────────┬─────────────────────┘
                    │
┌───────────────────▼─────────────────────┐
│            Session Cache                │
│    (Active sessions, user context)      │
└───────────────────┬─────────────────────┘
                    │
┌───────────────────▼─────────────────────┐
│            Data Cache                   │
│   (KB results, config, templates)       │
└─────────────────────────────────────────┘

Cache Keys

Pattern TTL Description
session:{id} 30m Active session data
semantic:{hash} 24h LLM response cache
kb:{collection}:{hash} 1h KB search results
config:{bot_id} 5m Bot configuration
user:{id} 15m User preferences

Semantic Cache

// Query similarity check
let cache_key = compute_embedding_hash(query);
if let Some(cached) = redis.get(&cache_key).await? {
    if similarity(query, cached.query) > 0.95 {
        return cached.response;
    }
}

LLM Integration

Provider Abstraction

pub trait LlmProvider: Send + Sync {
    async fn complete(&self, request: CompletionRequest) 
        -> Result<CompletionResponse>;
    
    async fn complete_stream(&self, request: CompletionRequest) 
        -> Result<impl Stream<Item = StreamChunk>>;
    
    async fn embed(&self, text: &str) 
        -> Result<Vec<f32>>;
}

Supported Providers

Provider Models Features
OpenAI GPT-4, GPT-3.5 Streaming, Functions, Vision
Anthropic Claude 3 Streaming, Long context
Groq Llama, Mixtral Fast inference
Ollama Any local Self-hosted

Request Flow

// 1. Build messages with context
let messages = build_messages(history, kb_context, system_prompt);

// 2. Add tools if registered
let tools = get_registered_tools(session);

// 3. Check semantic cache
if let Some(cached) = semantic_cache.get(&messages).await? {
    return Ok(cached);
}

// 4. Call LLM
let response = llm.complete(CompletionRequest {
    messages,
    tools,
    temperature: config.temperature,
    max_tokens: config.max_tokens,
}).await?;

// 5. Cache response
semantic_cache.set(&messages, &response).await?;

BASIC Runtime

Compilation Pipeline

Source Code (.bas)
       │
       ▼
┌──────────────┐
│    Lexer     │──▶ Tokens
└──────────────┘
       │
       ▼
┌──────────────┐
│    Parser    │──▶ AST
└──────────────┘
       │
       ▼
┌──────────────┐
│   Analyzer   │──▶ Validated AST
└──────────────┘
       │
       ▼
┌──────────────┐
│   Runtime    │──▶ Execution
└──────────────┘

Execution Context

pub struct RuntimeContext {
    pub session: Session,
    pub variables: HashMap<String, Value>,
    pub tools: Vec<Tool>,
    pub kb: Vec<KnowledgeBase>,
    pub state: Arc<AppState>,
}

Security Architecture

Authentication Flow

Client Request
       │
       ▼
┌──────────────┐
│   Extract    │
│    Token     │
└──────┬───────┘
       │
       ▼
┌──────────────┐     ┌──────────────┐
│   Validate   │────▶│   Zitadel    │
│     JWT      │     │   (OIDC)     │
└──────┬───────┘     └──────────────┘
       │
       ▼
┌──────────────┐
│  Check       │
│  Permissions │
└──────┬───────┘
       │
       ▼
Handler Execution

Security Layers

  1. Transport: TLS 1.3 (rustls)
  2. Authentication: JWT/OAuth 2.0 (Zitadel)
  3. Authorization: Role-based access control
  4. Rate Limiting: Per-IP token bucket
  5. Input Validation: Type-safe parameters
  6. Output Sanitization: HTML escaping

Deployment Architecture

Single Instance

┌─────────────────────────────────────┐
│           Single Server             │
│  ┌─────────────────────────────┐   │
│  │        BotServer            │   │
│  └─────────────────────────────┘   │
│  ┌────────┐ ┌────────┐ ┌───────┐   │
│  │PostgreSQL│ │ Redis │ │ MinIO │   │
│  └────────┘ └────────┘ └───────┘   │
└─────────────────────────────────────┘

Clustered

┌─────────────────────────────────────────────────────┐
│                   Load Balancer                      │
└───────────────────────┬─────────────────────────────┘
                        │
        ┌───────────────┼───────────────┐
        │               │               │
   ┌────▼────┐    ┌─────▼────┐    ┌────▼────┐
   │BotServer│    │BotServer │    │BotServer│
   │   #1    │    │    #2    │    │   #3    │
   └────┬────┘    └─────┬────┘    └────┬────┘
        │               │               │
        └───────────────┼───────────────┘
                        │
   ┌────────────────────┼────────────────────┐
   │                    │                    │
┌──▼───┐          ┌─────▼────┐         ┌────▼────┐
│  PG  │          │  Redis   │         │   S3    │
│Cluster│         │  Cluster │         │ Cluster │
└──────┘          └──────────┘         └─────────┘

Performance Characteristics

Operation Latency Throughput
REST API < 10ms 10,000 req/s
WebSocket message < 5ms 50,000 msg/s
KB search < 50ms 1,000 req/s
LLM call (cached) < 20ms 5,000 req/s
LLM call (uncached) 500-3000ms 50 req/s

Monitoring Points

Metric Description
http_requests_total Total HTTP requests
http_request_duration Request latency
ws_connections Active WebSocket connections
llm_requests_total LLM API calls
llm_cache_hits Semantic cache hit rate
db_pool_size Active DB connections
memory_usage Process memory

Extension Points

Adding a New Module

  1. Create module in src/
  2. Define routes in mod.rs
  3. Register in lib.rs
  4. Add to router in main.rs

Adding a New LLM Provider

  1. Implement LlmProvider trait
  2. Add provider enum variant
  3. Register in provider factory

Adding a New BASIC Keyword

  1. Add keyword to lexer
  2. Implement AST node
  3. Add runtime handler
  4. Update documentation