diff --git a/src/13-devices/README.md b/src/13-devices/README.md
new file mode 100644
index 00000000..4153b4ac
--- /dev/null
+++ b/src/13-devices/README.md
@@ -0,0 +1,54 @@
+# Chapter 13: Device & Offline Deployment
+
+Deploy General Bots to any device - from smartphones to Raspberry Pi to industrial kiosks - with local LLM inference for fully offline AI capabilities.
+
+## Overview
+
+General Bots can run on any device, from mobile phones to minimal embedded hardware with displays as small as 16x2 character LCDs, enabling AI-powered interactions anywhere:
+
+- **Kiosks** - Self-service terminals in stores, airports, hospitals
+- **Industrial IoT** - Factory floor assistants, machine interfaces
+- **Smart Home** - Wall panels, kitchen displays, door intercoms
+- **Retail** - Point-of-sale systems, product information terminals
+- **Education** - Classroom assistants, lab equipment interfaces
+- **Healthcare** - Patient check-in, medication reminders
+
+```
+┌─────────────────────────────────────────────────────────────────────────────┐
+│                          Embedded GB Architecture                           │
+├─────────────────────────────────────────────────────────────────────────────┤
+│                                                                             │
+│   ┌──────────────┐      ┌──────────────┐      ┌──────────────┐              │
+│   │   Display    │      │  botserver   │      │  llama.cpp   │              │
+│   │   LCD/OLED   │────▶│    (Rust)    │────▶│   (Local)    │              │
+│   │   TFT/HDMI   │      │  Port 8088   │      │  Port 8080   │              │
+│   └──────────────┘      └──────────────┘      └──────────────┘              │
+│          │                     │                     │                      │
+│          │                     │                     │                      │
+│   ┌──────▼──────┐       ┌──────▼──────┐       ┌──────▼──────┐               │
+│   │  Keyboard   │       │   SQLite    │       │  TinyLlama  │               │
+│   │  Buttons    │       │   (Data)    │       │    GGUF     │               │
+│   │  Touch      │       │             │       │  (~700MB)   │               │
+│   └─────────────┘       └─────────────┘       └─────────────┘               │
+│                                                                             │
+└─────────────────────────────────────────────────────────────────────────────┘
+```
+
+## What's in This Chapter
+
+### Mobile Deployment
+- [Mobile (Android & HarmonyOS)](./mobile.md) - BotOS for smartphones and tablets
+
+### Embedded Deployment
+- [Supported Hardware](./hardware.md) - SBCs, displays, and peripherals
+- [Quick Start](./quick-start.md) - Deploy in 5 minutes
+- [Local LLM](./local-llm.md) - Offline AI with llama.cpp
+
+### Deployment Options
+
+| Platform | Use Case | Requirements |
+|----------|----------|--------------|
+| **Android/HarmonyOS** | Smartphones, tablets, kiosks | Any Android 8+ device |
+| **Raspberry Pi** | IoT, displays, terminals | 1GB+ RAM |
+| **Orange Pi** | Full offline AI | 4GB+ RAM for LLM |
+| **Industrial** | Factory, retail, healthcare | Any ARM/x86 SBC |
diff --git a/src/13-devices/hardware.md b/src/13-devices/hardware.md
new file mode 100644
index 00000000..f8d6d416
--- /dev/null
+++ b/src/13-devices/hardware.md
@@ -0,0 +1,190 @@
+# Supported Hardware
+
+## Single Board Computers (SBCs)
+
+### Recommended Boards
+
+| Board | CPU | RAM | Best For | Price |
+|-------|-----|-----|----------|-------|
+| **Orange Pi 5** | RK3588S | 4-16GB | Full LLM, NPU accel | $89-149 |
+| **Raspberry Pi 5** | BCM2712 | 4-8GB | General purpose | $60-80 |
+| **Orange Pi Zero 3** | H618 | 1-4GB | Minimal deployments | $20-35 |
+| **Raspberry Pi 4** | BCM2711 | 2-8GB | Established ecosystem | $45-75 |
+| **Raspberry Pi Zero 2W** | RP3A0 | 512MB | Ultra-compact | $15 |
+| **Rock Pi 4** | RK3399 | 4GB | Proven RK3399 platform | $75 |
+| **NVIDIA Jetson Nano** | Tegra X1 | 4GB | GPU inference | $149 |
+| **BeagleBone Black** | AM3358 | 512MB | Industrial | $55 |
+| **LattePanda 3 Delta** | N5105 | 8GB | x86 compatibility | $269 |
+| **ODROID-N2+** | S922X | 4GB | High performance | $79 |
+
+### Minimum Requirements
+
+**For UI only (connect to 
remote botserver):** +- Any ARM/x86 Linux board +- 256MB RAM +- Network connection +- Display output + +**For local botserver:** +- ARM64 or x86_64 +- 1GB RAM minimum +- 4GB storage + +**For local LLM (llama.cpp):** +- ARM64 or x86_64 +- 2GB+ RAM (4GB recommended) +- 2GB+ storage for model + +### Orange Pi 5 (Recommended for LLM) + +The Orange Pi 5 with RK3588S is ideal for embedded LLM: + +``` +┌─────────────────────────────────────────────────────────────┐ +│ Orange Pi 5 - Best for Offline AI │ +├─────────────────────────────────────────────────────────────┤ +│ CPU: Rockchip RK3588S (4x A76 + 4x A55) │ +│ NPU: 6 TOPS (Neural Processing Unit) │ +│ GPU: Mali-G610 MP4 │ +│ RAM: 4GB / 8GB / 16GB LPDDR4X │ +│ Storage: M.2 NVMe + eMMC + microSD │ +│ │ +│ LLM Performance: │ +│ ├─ TinyLlama 1.1B Q4: ~8-12 tokens/sec │ +│ ├─ Phi-2 2.7B Q4: ~4-6 tokens/sec │ +│ └─ With NPU (rkllm): ~20-30 tokens/sec │ +└─────────────────────────────────────────────────────────────┘ +``` + +## Displays + +### Character LCDs (Minimal) + +For text-only interfaces: + +| Display | Resolution | Interface | Use Case | +|---------|------------|-----------|----------| +| HD44780 16x2 | 16 chars × 2 lines | I2C/GPIO | Status, simple Q&A | +| HD44780 20x4 | 20 chars × 4 lines | I2C/GPIO | More context | +| LCD2004 | 20 chars × 4 lines | I2C | Industrial | + +**Example output on 16x2:** +``` +┌────────────────┐ +│> How can I help│ +│< Processing... │ +└────────────────┘ +``` + +### OLED Displays + +For graphical monochrome interfaces: + +| Display | Resolution | Interface | Size | +|---------|------------|-----------|------| +| SSD1306 | 128×64 | I2C/SPI | 0.96" | +| SSD1309 | 128×64 | I2C/SPI | 2.42" | +| SH1106 | 128×64 | I2C/SPI | 1.3" | +| SSD1322 | 256×64 | SPI | 3.12" | + +### TFT/IPS Color Displays + +For full graphical interface: + +| Display | Resolution | Interface | Notes | +|---------|------------|-----------|-------| +| ILI9341 | 320×240 | SPI | Common, cheap | +| ST7789 | 240×320 | SPI | Fast refresh | +| ILI9488 | 480×320 | SPI | Larger | +| Waveshare 5" | 800×480 | HDMI | Touch optional | +| Waveshare 7" | 1024×600 | HDMI | Touch, IPS | +| Official Pi 7" | 800×480 | DSI | Best for Pi | + +### E-Ink/E-Paper + +For low-power, readable in sunlight: + +| Display | Resolution | Colors | Refresh | +|---------|------------|--------|---------| +| Waveshare 2.13" | 250×122 | B/W | 2s | +| Waveshare 4.2" | 400×300 | B/W | 4s | +| Waveshare 7.5" | 800×480 | B/W | 5s | +| Good Display 9.7" | 1200×825 | B/W | 6s | + +**Best for:** Menu displays, signs, low-update applications + +### Industrial Displays + +| Display | Resolution | Features | +|---------|------------|----------| +| Advantech | Various | Wide temp, sunlight | +| Winstar | Various | Industrial grade | +| Newhaven | Various | Long availability | + +## Input Devices + +### Keyboards + +- **USB Keyboard** - Standard, any USB keyboard works +- **PS/2 Keyboard** - Via adapter, lower latency +- **Matrix Keypad** - 4x4 or 3x4, GPIO connected +- **I2C Keypad** - Fewer GPIO pins needed + +### Touch Input + +- **Capacitive Touch** - Better response, needs driver +- **Resistive Touch** - Works with gloves, pressure-based +- **IR Touch Frame** - Large displays, vandal-resistant + +### Buttons & GPIO + +``` +┌─────────────────────────────────────────────┐ +│ Simple 4-Button Interface │ +├─────────────────────────────────────────────┤ +│ │ +│ [◄ PREV] [▲ UP] [▼ DOWN] [► SELECT] │ +│ │ +│ GPIO 17 GPIO 27 GPIO 22 GPIO 23 │ +│ │ +└─────────────────────────────────────────────┘ 
+``` + +## Enclosures + +### Commercial Options + +- **Hammond Manufacturing** - Industrial metal enclosures +- **Polycase** - Plastic, IP65 rated +- **Bud Industries** - Various sizes +- **Pi-specific cases** - Argon, Flirc, etc. + +### DIY Options + +- **3D Printed** - Custom fit, PLA/PETG +- **Laser Cut** - Acrylic, wood +- **Metal Fabrication** - Professional look + +## Power + +### Power Requirements + +| Configuration | Power | Recommended PSU | +|---------------|-------|-----------------| +| Pi Zero + LCD | 1-2W | 5V 1A | +| Pi 4 + Display | 5-10W | 5V 3A | +| Orange Pi 5 | 8-15W | 5V 4A or 12V 2A | +| With NVMe SSD | +2-3W | Add 1A headroom | + +### Power Options + +- **USB-C PD** - Modern, efficient +- **PoE HAT** - Power over Ethernet +- **12V Barrel** - Industrial standard +- **Battery** - UPS, solar applications + +### UPS Solutions + +- **PiJuice** - Pi-specific UPS HAT +- **UPS PIco** - Small form factor +- **Powerboost** - Adafruit, lithium battery diff --git a/src/13-devices/local-llm.md b/src/13-devices/local-llm.md new file mode 100644 index 00000000..3feb4199 --- /dev/null +++ b/src/13-devices/local-llm.md @@ -0,0 +1,382 @@ +# Local LLM - Offline AI with llama.cpp + +Run AI inference completely offline on embedded devices. No internet, no API costs, full privacy. + +## Overview + +``` +┌─────────────────────────────────────────────────────────────────────────────┐ +│ Local LLM Architecture │ +├─────────────────────────────────────────────────────────────────────────────┤ +│ │ +│ User Input ──▶ botserver ──▶ llama.cpp ──▶ Response │ +│ │ │ │ +│ │ ┌────┴────┐ │ +│ │ │ Model │ │ +│ │ │ GGUF │ │ +│ │ │ (Q4_K) │ │ +│ │ └─────────┘ │ +│ │ │ +│ SQLite DB │ +│ (sessions) │ +│ │ +└─────────────────────────────────────────────────────────────────────────────┘ +``` + +## Recommended Models + +### By Device RAM + +| RAM | Model | Size | Speed | Quality | +|-----|-------|------|-------|---------| +| **2GB** | TinyLlama 1.1B Q4_K_M | 670MB | ~5 tok/s | Basic | +| **4GB** | Phi-2 2.7B Q4_K_M | 1.6GB | ~3-4 tok/s | Good | +| **4GB** | Gemma 2B Q4_K_M | 1.4GB | ~4 tok/s | Good | +| **8GB** | Llama 3.2 3B Q4_K_M | 2GB | ~3 tok/s | Better | +| **8GB** | Mistral 7B Q4_K_M | 4.1GB | ~2 tok/s | Great | +| **16GB** | Llama 3.1 8B Q4_K_M | 4.7GB | ~2 tok/s | Excellent | + +### By Use Case + +**Simple Q&A, Commands:** +``` +TinyLlama 1.1B - Fast, basic understanding +``` + +**Customer Service, FAQ:** +``` +Phi-2 or Gemma 2B - Good comprehension, reasonable speed +``` + +**Complex Reasoning:** +``` +Llama 3.2 3B or Mistral 7B - Better accuracy, slower +``` + +## Installation + +### Automatic (via deploy script) + +```bash +./scripts/deploy-embedded.sh pi@device --with-llama +``` + +### Manual Installation + +```bash +# SSH to device +ssh pi@raspberrypi.local + +# Install dependencies +sudo apt update +sudo apt install -y build-essential cmake git wget + +# Clone llama.cpp +cd /opt +sudo git clone https://github.com/ggerganov/llama.cpp +sudo chown -R $(whoami):$(whoami) llama.cpp +cd llama.cpp + +# Build for ARM (auto-optimizes) +mkdir build && cd build +cmake .. 
-DLLAMA_NATIVE=ON -DCMAKE_BUILD_TYPE=Release
+make -j$(nproc)
+
+# Download model
+mkdir -p /opt/llama.cpp/models
+cd /opt/llama.cpp/models
+wget https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf
+```
+
+### Start Server
+
+```bash
+# Test run
+/opt/llama.cpp/build/bin/llama-server \
+  -m /opt/llama.cpp/models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf \
+  --host 0.0.0.0 \
+  --port 8080 \
+  -c 2048 \
+  --threads 4
+
+# Verify
+curl http://localhost:8080/v1/models
+```
+
+### Systemd Service
+
+Create `/etc/systemd/system/llama-server.service`:
+
+```ini
+[Unit]
+Description=llama.cpp Server - Local LLM
+After=network.target
+
+[Service]
+Type=simple
+User=root
+WorkingDirectory=/opt/llama.cpp
+ExecStart=/opt/llama.cpp/build/bin/llama-server \
+  -m /opt/llama.cpp/models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf \
+  --host 0.0.0.0 \
+  --port 8080 \
+  -c 2048 \
+  -ngl 0 \
+  --threads 4
+Restart=always
+RestartSec=5
+
+[Install]
+WantedBy=multi-user.target
+```
+
+Enable and start:
+```bash
+sudo systemctl daemon-reload
+sudo systemctl enable llama-server
+sudo systemctl start llama-server
+```
+
+## Configuration
+
+### botserver .env
+
+```env
+# Use local llama.cpp
+LLM_PROVIDER=llamacpp
+LLM_API_URL=http://127.0.0.1:8080
+LLM_MODEL=tinyllama
+
+# Memory limits
+MAX_CONTEXT_TOKENS=2048
+MAX_RESPONSE_TOKENS=512
+STREAMING_ENABLED=true
+```
+
+### llama.cpp Parameters
+
+| Parameter | Default | Description |
+|-----------|---------|-------------|
+| `-c` | 2048 | Context size (tokens) |
+| `--threads` | 4 | CPU threads |
+| `-ngl` | 0 | GPU layers (0 for CPU only) |
+| `--host` | 127.0.0.1 | Bind address |
+| `--port` | 8080 | Server port |
+| `-b` | 512 | Batch size |
+| `--mlock` | off | Lock model in RAM |
+
+### Memory vs Context Size
+
+```
+Context 512:   ~400MB RAM, fast, limited conversation
+Context 1024:  ~600MB RAM, moderate
+Context 2048:  ~900MB RAM, good for most uses
+Context 4096:  ~1.5GB RAM, long conversations
+```
+
+## Performance Optimization
+
+### CPU Optimization
+
+```bash
+# Check CPU features
+cat /proc/cpuinfo | grep -E "(model name|Features)"
+
+# Build with specific optimizations
+cmake .. -DLLAMA_NATIVE=ON \
+  -DCMAKE_BUILD_TYPE=Release \
+  -DLLAMA_ARM_FMA=ON \
+  -DLLAMA_ARM_DOTPROD=ON
+```
+
+### Memory Optimization
+
+```bash
+# For 2GB RAM devices
+# Use a smaller context
+-c 1024
+
+# Keep memory mapping enabled (the default): the model is paged in from
+# storage on demand instead of being fully loaded, so resident RAM stays
+# lower. Do not pass --no-mmap on low-RAM boards.
+
+# Disable mlock (don't pin to RAM)
+# (default is disabled)
+```
+
+### Swap Configuration
+
+For devices with limited RAM:
+
+```bash
+# Create 2GB swap
+sudo fallocate -l 2G /swapfile
+sudo chmod 600 /swapfile
+sudo mkswap /swapfile
+sudo swapon /swapfile
+
+# Make permanent
+echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
+
+# Optimize swap usage
+echo 'vm.swappiness=10' | sudo tee -a /etc/sysctl.conf
+```
+
+## NPU Acceleration (Orange Pi 5)
+
+The Orange Pi 5 has a 6 TOPS NPU that can accelerate inference:
+
+### Using rkllm (Rockchip NPU)
+
+```bash
+# Install rkllm runtime
+git clone https://github.com/airockchip/rknn-llm
+cd rknn-llm
+./install.sh
+
+# Convert model to RKNN format
+python3 convert_model.py \
+  --model tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf \
+  --output tinyllama.rkllm
+
+# Run with NPU
+rkllm-server \
+  --model tinyllama.rkllm \
+  --port 8080
+```
+
+Expected speedup: **3-5x faster** than CPU only.
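+
+To sanity-check that speedup on your own board, time an identical request against the CPU build and the NPU build. A minimal sketch, assuming whichever server you started (llama-server or rkllm-server) listens on port 8080 and speaks the OpenAI-compatible chat API shown later on this page - adjust the URL if the rkllm build exposes a different endpoint:
+
+```bash
+# Rough throughput check: run once per backend and compare wall-clock time.
+# max_tokens divided by elapsed seconds gives an approximate tokens/sec figure.
+PROMPT='{"model":"tinyllama","messages":[{"role":"user","content":"Count from 1 to 50."}],"max_tokens":128}'
+
+time curl -s http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d "$PROMPT" > /dev/null
+```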
+
+## Model Download URLs
+
+### TinyLlama 1.1B (Recommended for 2GB)
+```bash
+wget https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf
+```
+
+### Phi-2 2.7B (Recommended for 4GB)
+```bash
+wget https://huggingface.co/TheBloke/phi-2-GGUF/resolve/main/phi-2.Q4_K_M.gguf
+```
+
+### Gemma 2B
+```bash
+wget https://huggingface.co/bartowski/gemma-2-2b-it-GGUF/resolve/main/gemma-2-2b-it-Q4_K_M.gguf
+```
+
+### Llama 3.2 3B (Recommended for 8GB)
+```bash
+wget https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-GGUF/resolve/main/Llama-3.2-3B-Instruct-Q4_K_M.gguf
+```
+
+### Mistral 7B
+```bash
+wget https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/resolve/main/mistral-7b-instruct-v0.2.Q4_K_M.gguf
+```
+
+## API Usage
+
+llama.cpp exposes an OpenAI-compatible API:
+
+### Chat Completion
+
+```bash
+curl http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "tinyllama",
+    "messages": [
+      {"role": "user", "content": "What is 2+2?"}
+    ],
+    "max_tokens": 100
+  }'
+```
+
+### Streaming
+
+```bash
+curl http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "tinyllama",
+    "messages": [{"role": "user", "content": "Tell me a story"}],
+    "stream": true
+  }'
+```
+
+### Health Check
+
+```bash
+curl http://localhost:8080/health
+curl http://localhost:8080/v1/models
+```
+
+## Monitoring
+
+### Check Performance
+
+```bash
+# Watch resource usage
+htop
+
+# Check inference speed in logs
+sudo journalctl -u llama-server -f | grep "tokens/s"
+
+# Memory usage
+free -h
+```
+
+### Benchmarking
+
+```bash
+# Run llama.cpp benchmark
+/opt/llama.cpp/build/bin/llama-bench \
+  -m /opt/llama.cpp/models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf \
+  -p 512 -n 128 -t 4
+```
+
+## Troubleshooting
+
+### Model Loading Fails
+
+```bash
+# Check available RAM
+free -h
+
+# Try a smaller context
+-c 512
+
+# Make sure memory mapping is on (the default); remove any --no-mmap flag
+```
+
+### Slow Inference
+
+```bash
+# Increase threads (up to CPU cores)
+--threads $(nproc)
+
+# Use optimized build
+cmake .. -DLLAMA_NATIVE=ON
+
+# Consider a smaller model
+```
+
+### Out of Memory Killer
+
+```bash
+# Check if OOM killed the process
+dmesg | grep -i "killed process"
+
+# Increase swap
+# Use a smaller model
+# Reduce context size
+```
+
+## Best Practices
+
+1. **Start small** - Begin with TinyLlama, upgrade if needed
+2. **Monitor memory** - Use `htop` during initial tests
+3. **Set appropriate context** - 1024-2048 tokens for most embedded use
+4. **Use quantized models** - Q4_K_M is a good balance
+5. **Enable streaming** - Better UX on slow inference
+6. **Test offline** - Verify it works without internet before deployment
diff --git a/src/13-devices/mobile.md b/src/13-devices/mobile.md
new file mode 100644
index 00000000..6cc3c1ca
--- /dev/null
+++ b/src/13-devices/mobile.md
@@ -0,0 +1,323 @@
+# Mobile Deployment - Android & HarmonyOS
+
+Deploy General Bots as the primary interface on Android and HarmonyOS devices, transforming them into dedicated AI assistants.
+
+## Overview
+
+BotOS transforms any Android or HarmonyOS device into a dedicated General Bots system, removing manufacturer bloatware and installing GB as the default launcher.
+ +``` +┌─────────────────────────────────────────────────────────────────────────────┐ +│ BotOS Architecture │ +├─────────────────────────────────────────────────────────────────────────────┤ +│ │ +│ ┌──────────────────────────────────────────────────────────────────┐ │ +│ │ BotOS App (Tauri) │ │ +│ ├──────────────────────────────────────────────────────────────────┤ │ +│ │ botui/ui/suite │ Tauri Android │ src/lib.rs (Rust) │ │ +│ │ (Web Interface) │ (WebView + NDK) │ (Backend + Hardware) │ │ +│ └──────────────────────────────────────────────────────────────────┘ │ +│ │ │ +│ ┌─────────────────────────┴────────────────────────────┐ │ +│ │ Android/HarmonyOS System │ │ +│ │ ┌─────────┐ ┌──────────┐ ┌────────┐ ┌─────────┐ │ │ +│ │ │ Camera │ │ GPS │ │ WiFi │ │ Storage │ │ │ +│ │ └─────────┘ └──────────┘ └────────┘ └─────────┘ │ │ +│ └───────────────────────────────────────────────────────┘ │ +│ │ +└─────────────────────────────────────────────────────────────────────────────┘ +``` + +## Supported Platforms + +### Android +- **AOSP** - Pure Android +- **Samsung One UI** - Galaxy devices +- **Xiaomi MIUI** - Mi, Redmi, Poco +- **OPPO ColorOS** - OPPO, OnePlus, Realme +- **Vivo Funtouch/OriginOS** +- **Google Pixel** + +### HarmonyOS +- **Huawei** - P series, Mate series, Nova +- **Honor** - Magic series, X series + +## Installation Levels + +| Level | Requirements | What It Does | +|-------|-------------|--------------| +| **Level 1** | ADB only | Removes bloatware, installs BotOS as app | +| **Level 2** | Root + Magisk | GB boot animation, BotOS as system app | +| **Level 3** | Unlocked bootloader | Full Android replacement with BotOS | + +## Quick Installation + +### Level 1: Debloat + App (No Root) + +```bash +# Clone botos repository +git clone https://github.com/GeneralBots/botos.git +cd botos/rom + +# Connect device via USB (enable USB debugging first) +./install.sh +``` + +The interactive installer will: +1. Detect your device and manufacturer +2. Remove bloatware automatically +3. Install BotOS APK +4. Optionally set as default launcher + +### Level 2: Magisk Module (Root Required) + +```bash +# Generate Magisk module +cd botos/rom/scripts +./build-magisk-module.sh + +# Copy to device +adb push botos-magisk-v1.0.zip /sdcard/ + +# Install via Magisk app +# Magisk → Modules → + → Select ZIP → Reboot +``` + +This adds: +- Custom boot animation +- BotOS as system app (privileged permissions) +- Debloat via overlay + +### Level 3: GSI (Full Replacement) + +For advanced users with unlocked bootloader. See `botos/rom/gsi/README.md`. 
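+
+Whichever level you choose, verify the result over ADB before handing the device out. A minimal sketch - the `com.generalbots.botos` package id and `.MainActivity` name are assumptions here, so confirm the real identifiers from the package list first:
+
+```bash
+# Confirm the BotOS package is installed (note the actual package id in the output)
+adb shell pm list packages | grep -i bot
+
+# Set it as the default HOME app so it comes back after every reboot (Android 7+)
+adb shell cmd package set-home-activity com.generalbots.botos/.MainActivity
+```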
+
+## Bloatware Removed
+
+### Samsung One UI
+- Bixby, Samsung Pay, Samsung Pass
+- Duplicate apps (Email, Calendar, Browser)
+- AR Zone, Game Launcher
+- Samsung Free, Samsung Global Goals
+
+### Huawei EMUI/HarmonyOS
+- AppGallery, HiCloud, HiCar
+- Huawei Browser, Music, Video
+- Petal Maps, Petal Search
+- AI Life, HiSuite
+
+### Honor MagicOS
+- Honor Store, MagicRing
+- Honor Browser, Music
+
+### Xiaomi MIUI
+- MSA (analytics), Mi Apps
+- GetApps, Mi Cloud
+- Mi Browser, Mi Music
+
+### Universal (All Devices)
+- Pre-installed Facebook, Instagram
+- Pre-installed Netflix, Spotify
+- Games like Candy Crush
+- Carrier bloatware
+
+## Building from Source
+
+### Prerequisites
+
+```bash
+# Install Rust and Android targets
+rustup target add aarch64-linux-android armv7-linux-androideabi
+
+# Set up Android SDK/NDK
+export ANDROID_HOME=$HOME/Android/Sdk
+export NDK_HOME=$ANDROID_HOME/ndk/25.2.9519653
+
+# Install Tauri CLI
+cargo install tauri-cli
+
+# For icons/boot animation
+sudo apt install librsvg2-bin imagemagick
+```
+
+### Build APK
+
+```bash
+cd botos
+
+# Generate icons from SVG
+./scripts/generate-icons.sh
+
+# Initialize Android project
+cargo tauri android init
+
+# Build release APK
+cargo tauri android build --release
+```
+
+Output: `gen/android/app/build/outputs/apk/release/app-release.apk`
+
+### Development Mode
+
+```bash
+# Connect device and run
+cargo tauri android dev
+
+# Watch logs
+adb logcat -s BotOS:*
+```
+
+## Configuration
+
+### AndroidManifest.xml
+
+BotOS registers as the device launcher (HOME activity):
+
+```xml
+<activity android:name=".MainActivity" android:exported="true">
+    <!-- HOME + DEFAULT categories make this activity the launcher -->
+    <intent-filter>
+        <action android:name="android.intent.action.MAIN" />
+        <category android:name="android.intent.category.HOME" />
+        <category android:name="android.intent.category.DEFAULT" />
+    </intent-filter>
+</activity>
+```
+
+### Permissions
+
+Default capabilities in `capabilities/default.json`:
+- Internet access
+- Camera (for QR codes, photos)
+- Location (GPS)
+- Storage (files)
+- Notifications
+
+### Connecting to Server
+
+Point the app at the bundled front-end in `tauri.conf.json`:
+
+```json
+{
+  "build": {
+    "frontendDist": "../botui/ui/suite"
+  }
+}
+```
+
+Or configure the botserver URL at runtime:
+```javascript
+window.BOTSERVER_URL = "https://your-server.com";
+```
+
+## Boot Animation
+
+Create a custom boot animation with GB branding:
+
+```bash
+# Generate animation
+cd botos/scripts
+./create-bootanimation.sh
+
+# Install (requires root)
+adb root
+adb remount
+adb push bootanimation.zip /system/media/
+adb reboot
+```
+
+## Project Structure
+
+```
+botos/
+├── Cargo.toml              # Rust/Tauri dependencies
+├── tauri.conf.json         # Tauri config → botui/ui/suite
+├── build.rs                # Build script
+├── src/lib.rs              # Android entry point
+│
+├── icons/
+│   ├── gb-bot.svg          # Source icon
+│   ├── icon.png            # Main icon (512x512)
+│   └── */ic_launcher.png   # Icons by density
+│
+├── scripts/
+│   ├── generate-icons.sh   # Generate PNGs from SVG
+│   └── create-bootanimation.sh
+│
+├── capabilities/
+│   └── default.json        # Tauri permissions
+│
+├── gen/android/            # Generated Android project
+│   └── app/src/main/
+│       ├── AndroidManifest.xml
+│       └── res/values/themes.xml
+│
+└── rom/                    # Installation tools
+    ├── install.sh          # Interactive installer
+    ├── scripts/
+    │   ├── debloat.sh      # Remove bloatware
+    │   └── build-magisk-module.sh
+    └── gsi/
+        └── README.md       # GSI instructions
+```
+
+## Offline Mode
+
+BotOS can work fully offline with a local LLM:
+
+1. Install botserver on the device (see [Local LLM](./local-llm.md))
+2. Configure it to use localhost:
+   ```javascript
+   window.BOTSERVER_URL = "http://127.0.0.1:8088";
+   ```
+3. Run llama.cpp with a small model (TinyLlama on 4GB+ devices)
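+
+Before shipping, confirm the offline stack is actually answering locally. A quick check using the ports configured above (run it on the device, ideally with networking disabled):
+
+```bash
+# botserver health endpoint
+curl -s http://127.0.0.1:8088/health
+
+# llama.cpp model listing
+curl -s http://127.0.0.1:8080/v1/models
+```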
+
+## Use Cases
+
+### Dedicated Kiosk
+- Retail product information
+- Hotel check-in
+- Restaurant ordering
+- Museum guides
+
+### Enterprise Device
+- Field service assistant
+- Warehouse scanner with AI
+- Delivery driver companion
+- Healthcare bedside terminal
+
+### Consumer Device
+- Elder-friendly phone
+- Child-safe device
+- Single-purpose assistant
+- Smart home controller
+
+## Troubleshooting
+
+### App Won't Install
+```bash
+# Enable installation from unknown sources
+# Settings → Security → Unknown Sources
+
+# Or use ADB
+adb install -r botos.apk
+```
+
+### Debloat Not Working
+```bash
+# Some packages require root
+# Use Level 2 (Magisk) for complete removal
+
+# Check which packages failed (replace <package-name> with the app in question)
+adb shell pm list packages | grep <package-name>
+```
+
+### Boot Loop After GSI
+```bash
+# Boot into recovery
+# Wipe data/factory reset
+# Reflash stock ROM
+```
+
+### WebView Crashes
+```bash
+# Update Android System WebView
+adb shell pm enable com.google.android.webview
+```
diff --git a/src/13-devices/quick-start.md b/src/13-devices/quick-start.md
new file mode 100644
index 00000000..b7a15fb2
--- /dev/null
+++ b/src/13-devices/quick-start.md
@@ -0,0 +1,209 @@
+# Quick Start - Deploy in 5 Minutes
+
+Get General Bots running on your embedded device with local AI in just a few commands.
+
+## Prerequisites
+
+- An SBC (Raspberry Pi, Orange Pi, etc.) running Armbian or Raspberry Pi OS
+- SSH access to the device
+- Internet connection (for initial setup only)
+
+## One-Line Deploy
+
+From your development machine:
+
+```bash
+# Clone and run the deployment script
+git clone https://github.com/GeneralBots/botserver.git
+cd botserver
+
+# Deploy to Orange Pi (replace with your device IP)
+./scripts/deploy-embedded.sh orangepi@192.168.1.100 --with-ui --with-llama
+```
+
+That's it!
After ~10-15 minutes: +- BotServer runs on port 8088 +- llama.cpp runs on port 8080 with TinyLlama +- Embedded UI available at `http://your-device:8088/embedded/` + +## Step-by-Step Guide + +### Step 1: Prepare Your Device + +Flash your SBC with a compatible OS: + +**Raspberry Pi:** +```bash +# Download Raspberry Pi Imager +# Select: Raspberry Pi OS Lite (64-bit) +# Enable SSH in settings +``` + +**Orange Pi:** +```bash +# Download Armbian from armbian.com +# Flash with balenaEtcher +``` + +### Step 2: First Boot Configuration + +```bash +# SSH into your device +ssh pi@raspberrypi.local # or orangepi@orangepi.local + +# Update system +sudo apt update && sudo apt upgrade -y + +# Set timezone +sudo timedatectl set-timezone America/Sao_Paulo + +# Enable I2C/SPI if using GPIO displays +sudo raspi-config # or armbian-config +``` + +### Step 3: Run Deployment Script + +From your development PC: + +```bash +# Basic deployment (botserver only) +./scripts/deploy-embedded.sh pi@raspberrypi.local + +# With embedded UI +./scripts/deploy-embedded.sh pi@raspberrypi.local --with-ui + +# With local LLM (requires 4GB+ RAM) +./scripts/deploy-embedded.sh pi@raspberrypi.local --with-ui --with-llama + +# Specify a different model +./scripts/deploy-embedded.sh pi@raspberrypi.local --with-llama --model phi-2-Q4_K_M.gguf +``` + +### Step 4: Verify Installation + +```bash +# Check services +ssh pi@raspberrypi.local 'sudo systemctl status botserver' +ssh pi@raspberrypi.local 'sudo systemctl status llama-server' + +# Test botserver +curl http://raspberrypi.local:8088/health + +# Test llama.cpp +curl http://raspberrypi.local:8080/v1/models +``` + +### Step 5: Access the Interface + +Open in your browser: +``` +http://raspberrypi.local:8088/embedded/ +``` + +Or set up kiosk mode (auto-starts on boot): +```bash +# Already configured if you used --with-ui +# Just reboot: +ssh pi@raspberrypi.local 'sudo reboot' +``` + +## Local Installation (On the Device) + +If you prefer to install directly on the device: + +```bash +# SSH into the device +ssh pi@raspberrypi.local + +# Install Rust +curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh +source ~/.cargo/env + +# Clone and build +git clone https://github.com/GeneralBots/botserver.git +cd botserver + +# Run local deployment +./scripts/deploy-embedded.sh --local --with-ui --with-llama +``` + +⚠️ **Note:** Building on ARM devices is slow (1-2 hours). Cross-compilation is faster. 
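+
+If you take the cross-compilation route instead, the standard Rust toolchain setup looks like this. A hedged sketch from an x86_64 Linux host - the `botserver` binary name and install path follow this guide's layout, but check the repository's `Cargo.toml` for the actual binary name:
+
+```bash
+# On the development PC: add the ARM64 target and a cross-linker
+rustup target add aarch64-unknown-linux-gnu
+sudo apt install -y gcc-aarch64-linux-gnu
+
+# Build for ARM64 (minutes on a PC instead of hours on the device)
+CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER=aarch64-linux-gnu-gcc \
+  cargo build --release --target aarch64-unknown-linux-gnu
+
+# Copy the result to the device
+scp target/aarch64-unknown-linux-gnu/release/botserver pi@raspberrypi.local:/opt/botserver/
+```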
+
+## Configuration
+
+After deployment, edit the config file:
+
+```bash
+ssh pi@raspberrypi.local
+sudo nano /opt/botserver/.env
+```
+
+Key settings:
+```env
+# Server
+HOST=0.0.0.0
+PORT=8088
+
+# Local LLM
+LLM_PROVIDER=llamacpp
+LLM_API_URL=http://127.0.0.1:8080
+LLM_MODEL=tinyllama
+
+# Memory limits for small devices
+MAX_CONTEXT_TOKENS=2048
+MAX_RESPONSE_TOKENS=512
+```
+
+Restart after changes:
+```bash
+sudo systemctl restart botserver
+```
+
+## Troubleshooting
+
+### Out of Memory
+
+```bash
+# Check memory usage
+free -h
+
+# Reduce llama.cpp context
+sudo nano /etc/systemd/system/llama-server.service
+# Change -c 2048 to -c 1024
+
+# Or use a smaller model
+# TinyLlama uses ~700MB, Phi-2 uses ~1.6GB
+```
+
+### Service Won't Start
+
+```bash
+# Check logs
+sudo journalctl -u botserver -f
+sudo journalctl -u llama-server -f
+
+# Common issues:
+# - Port already in use
+# - Missing model file
+# - Database permissions
+```
+
+### Display Not Working
+
+```bash
+# Check if display is detected
+ls /dev/fb*      # HDMI/DSI
+ls /dev/i2c*     # I2C displays
+ls /dev/spidev*  # SPI displays
+
+# For HDMI, check config
+sudo nano /boot/config.txt       # Raspberry Pi
+sudo nano /boot/armbianEnv.txt   # Orange Pi
+```
+
+## Next Steps
+
+- [Supported Hardware](./hardware.md) - Displays, input devices, and power
+- [Local LLM Configuration](./local-llm.md) - Optimize AI performance
+- [Mobile Deployment](./mobile.md) - Android & HarmonyOS devices
diff --git a/src/20-embedding/README.md b/src/20-embedding/README.md
index 69fce7df..21ffe7e0 100644
--- a/src/20-embedding/README.md
+++ b/src/20-embedding/README.md
@@ -1,47 +1 @@
-# Chapter 20: Embedded & Offline Deployment
-
-Deploy General Bots to any device - from Raspberry Pi to industrial kiosks - with local LLM inference for fully offline AI capabilities.
- -## Overview - -General Bots can run on minimal hardware with displays as small as 16x2 character LCDs, enabling AI-powered interactions anywhere: - -- **Kiosks** - Self-service terminals in stores, airports, hospitals -- **Industrial IoT** - Factory floor assistants, machine interfaces -- **Smart Home** - Wall panels, kitchen displays, door intercoms -- **Retail** - Point-of-sale systems, product information terminals -- **Education** - Classroom assistants, lab equipment interfaces -- **Healthcare** - Patient check-in, medication reminders - -``` -┌─────────────────────────────────────────────────────────────────────────────┐ -│ Embedded GB Architecture │ -├─────────────────────────────────────────────────────────────────────────────┤ -│ │ -│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ -│ │ Display │ │ botserver │ │ llama.cpp │ │ -│ │ LCD/OLED │────▶│ (Rust) │────▶│ (Local) │ │ -│ │ TFT/HDMI │ │ Port 8088 │ │ Port 8080 │ │ -│ └──────────────┘ └──────────────┘ └──────────────┘ │ -│ │ │ │ │ -│ │ │ │ │ -│ ┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐ │ -│ │ Keyboard │ │ SQLite │ │ TinyLlama │ │ -│ │ Buttons │ │ (Data) │ │ GGUF │ │ -│ │ Touch │ │ │ │ (~700MB) │ │ -│ └─────────────┘ └─────────────┘ └─────────────┘ │ -│ │ -└─────────────────────────────────────────────────────────────────────────────┘ -``` - -## What's in This Chapter - -- [Supported Hardware](./hardware.md) - Boards, displays, and peripherals -- [Quick Start](./quick-start.md) - Deploy in 5 minutes -- [Embedded UI](./embedded-ui.md) - Interface for small displays -- [Local LLM](./local-llm.md) - Offline AI with llama.cpp -- [Display Modes](./display-modes.md) - LCD, OLED, TFT, E-ink configurations -- [Kiosk Mode](./kiosk-mode.md) - Locked-down production deployments -- [Performance Tuning](./performance.md) - Optimize for limited resources -- [Offline Operation](./offline.md) - No internet required -- [Use Cases](./use-cases.md) - Real-world deployment examples +# Chapter 20: Embedded Deployment diff --git a/src/20-embedding/hardware.md b/src/20-embedding/hardware.md index f8d6d416..1bbe111c 100644 --- a/src/20-embedding/hardware.md +++ b/src/20-embedding/hardware.md @@ -1,190 +1 @@ # Supported Hardware - -## Single Board Computers (SBCs) - -### Recommended Boards - -| Board | CPU | RAM | Best For | Price | -|-------|-----|-----|----------|-------| -| **Orange Pi 5** | RK3588S | 4-16GB | Full LLM, NPU accel | $89-149 | -| **Raspberry Pi 5** | BCM2712 | 4-8GB | General purpose | $60-80 | -| **Orange Pi Zero 3** | H618 | 1-4GB | Minimal deployments | $20-35 | -| **Raspberry Pi 4** | BCM2711 | 2-8GB | Established ecosystem | $45-75 | -| **Raspberry Pi Zero 2W** | RP3A0 | 512MB | Ultra-compact | $15 | -| **Rock Pi 4** | RK3399 | 4GB | NPU available | $75 | -| **NVIDIA Jetson Nano** | Tegra X1 | 4GB | GPU inference | $149 | -| **BeagleBone Black** | AM3358 | 512MB | Industrial | $55 | -| **LattePanda 3 Delta** | N100 | 8GB | x86 compatibility | $269 | -| **ODROID-N2+** | S922X | 4GB | High performance | $79 | - -### Minimum Requirements - -**For UI only (connect to remote botserver):** -- Any ARM/x86 Linux board -- 256MB RAM -- Network connection -- Display output - -**For local botserver:** -- ARM64 or x86_64 -- 1GB RAM minimum -- 4GB storage - -**For local LLM (llama.cpp):** -- ARM64 or x86_64 -- 2GB+ RAM (4GB recommended) -- 2GB+ storage for model - -### Orange Pi 5 (Recommended for LLM) - -The Orange Pi 5 with RK3588S is ideal for embedded LLM: - -``` 
-┌─────────────────────────────────────────────────────────────┐ -│ Orange Pi 5 - Best for Offline AI │ -├─────────────────────────────────────────────────────────────┤ -│ CPU: Rockchip RK3588S (4x A76 + 4x A55) │ -│ NPU: 6 TOPS (Neural Processing Unit) │ -│ GPU: Mali-G610 MP4 │ -│ RAM: 4GB / 8GB / 16GB LPDDR4X │ -│ Storage: M.2 NVMe + eMMC + microSD │ -│ │ -│ LLM Performance: │ -│ ├─ TinyLlama 1.1B Q4: ~8-12 tokens/sec │ -│ ├─ Phi-2 2.7B Q4: ~4-6 tokens/sec │ -│ └─ With NPU (rkllm): ~20-30 tokens/sec │ -└─────────────────────────────────────────────────────────────┘ -``` - -## Displays - -### Character LCDs (Minimal) - -For text-only interfaces: - -| Display | Resolution | Interface | Use Case | -|---------|------------|-----------|----------| -| HD44780 16x2 | 16 chars × 2 lines | I2C/GPIO | Status, simple Q&A | -| HD44780 20x4 | 20 chars × 4 lines | I2C/GPIO | More context | -| LCD2004 | 20 chars × 4 lines | I2C | Industrial | - -**Example output on 16x2:** -``` -┌────────────────┐ -│> How can I help│ -│< Processing... │ -└────────────────┘ -``` - -### OLED Displays - -For graphical monochrome interfaces: - -| Display | Resolution | Interface | Size | -|---------|------------|-----------|------| -| SSD1306 | 128×64 | I2C/SPI | 0.96" | -| SSD1309 | 128×64 | I2C/SPI | 2.42" | -| SH1106 | 128×64 | I2C/SPI | 1.3" | -| SSD1322 | 256×64 | SPI | 3.12" | - -### TFT/IPS Color Displays - -For full graphical interface: - -| Display | Resolution | Interface | Notes | -|---------|------------|-----------|-------| -| ILI9341 | 320×240 | SPI | Common, cheap | -| ST7789 | 240×320 | SPI | Fast refresh | -| ILI9488 | 480×320 | SPI | Larger | -| Waveshare 5" | 800×480 | HDMI | Touch optional | -| Waveshare 7" | 1024×600 | HDMI | Touch, IPS | -| Official Pi 7" | 800×480 | DSI | Best for Pi | - -### E-Ink/E-Paper - -For low-power, readable in sunlight: - -| Display | Resolution | Colors | Refresh | -|---------|------------|--------|---------| -| Waveshare 2.13" | 250×122 | B/W | 2s | -| Waveshare 4.2" | 400×300 | B/W | 4s | -| Waveshare 7.5" | 800×480 | B/W | 5s | -| Good Display 9.7" | 1200×825 | B/W | 6s | - -**Best for:** Menu displays, signs, low-update applications - -### Industrial Displays - -| Display | Resolution | Features | -|---------|------------|----------| -| Advantech | Various | Wide temp, sunlight | -| Winstar | Various | Industrial grade | -| Newhaven | Various | Long availability | - -## Input Devices - -### Keyboards - -- **USB Keyboard** - Standard, any USB keyboard works -- **PS/2 Keyboard** - Via adapter, lower latency -- **Matrix Keypad** - 4x4 or 3x4, GPIO connected -- **I2C Keypad** - Fewer GPIO pins needed - -### Touch Input - -- **Capacitive Touch** - Better response, needs driver -- **Resistive Touch** - Works with gloves, pressure-based -- **IR Touch Frame** - Large displays, vandal-resistant - -### Buttons & GPIO - -``` -┌─────────────────────────────────────────────┐ -│ Simple 4-Button Interface │ -├─────────────────────────────────────────────┤ -│ │ -│ [◄ PREV] [▲ UP] [▼ DOWN] [► SELECT] │ -│ │ -│ GPIO 17 GPIO 27 GPIO 22 GPIO 23 │ -│ │ -└─────────────────────────────────────────────┘ -``` - -## Enclosures - -### Commercial Options - -- **Hammond Manufacturing** - Industrial metal enclosures -- **Polycase** - Plastic, IP65 rated -- **Bud Industries** - Various sizes -- **Pi-specific cases** - Argon, Flirc, etc. 
- -### DIY Options - -- **3D Printed** - Custom fit, PLA/PETG -- **Laser Cut** - Acrylic, wood -- **Metal Fabrication** - Professional look - -## Power - -### Power Requirements - -| Configuration | Power | Recommended PSU | -|---------------|-------|-----------------| -| Pi Zero + LCD | 1-2W | 5V 1A | -| Pi 4 + Display | 5-10W | 5V 3A | -| Orange Pi 5 | 8-15W | 5V 4A or 12V 2A | -| With NVMe SSD | +2-3W | Add 1A headroom | - -### Power Options - -- **USB-C PD** - Modern, efficient -- **PoE HAT** - Power over Ethernet -- **12V Barrel** - Industrial standard -- **Battery** - UPS, solar applications - -### UPS Solutions - -- **PiJuice** - Pi-specific UPS HAT -- **UPS PIco** - Small form factor -- **Powerboost** - Adafruit, lithium battery diff --git a/src/20-embedding/local-llm.md b/src/20-embedding/local-llm.md index 3feb4199..86b1bc90 100644 --- a/src/20-embedding/local-llm.md +++ b/src/20-embedding/local-llm.md @@ -1,382 +1 @@ -# Local LLM - Offline AI with llama.cpp - -Run AI inference completely offline on embedded devices. No internet, no API costs, full privacy. - -## Overview - -``` -┌─────────────────────────────────────────────────────────────────────────────┐ -│ Local LLM Architecture │ -├─────────────────────────────────────────────────────────────────────────────┤ -│ │ -│ User Input ──▶ botserver ──▶ llama.cpp ──▶ Response │ -│ │ │ │ -│ │ ┌────┴────┐ │ -│ │ │ Model │ │ -│ │ │ GGUF │ │ -│ │ │ (Q4_K) │ │ -│ │ └─────────┘ │ -│ │ │ -│ SQLite DB │ -│ (sessions) │ -│ │ -└─────────────────────────────────────────────────────────────────────────────┘ -``` - -## Recommended Models - -### By Device RAM - -| RAM | Model | Size | Speed | Quality | -|-----|-------|------|-------|---------| -| **2GB** | TinyLlama 1.1B Q4_K_M | 670MB | ~5 tok/s | Basic | -| **4GB** | Phi-2 2.7B Q4_K_M | 1.6GB | ~3-4 tok/s | Good | -| **4GB** | Gemma 2B Q4_K_M | 1.4GB | ~4 tok/s | Good | -| **8GB** | Llama 3.2 3B Q4_K_M | 2GB | ~3 tok/s | Better | -| **8GB** | Mistral 7B Q4_K_M | 4.1GB | ~2 tok/s | Great | -| **16GB** | Llama 3.1 8B Q4_K_M | 4.7GB | ~2 tok/s | Excellent | - -### By Use Case - -**Simple Q&A, Commands:** -``` -TinyLlama 1.1B - Fast, basic understanding -``` - -**Customer Service, FAQ:** -``` -Phi-2 or Gemma 2B - Good comprehension, reasonable speed -``` - -**Complex Reasoning:** -``` -Llama 3.2 3B or Mistral 7B - Better accuracy, slower -``` - -## Installation - -### Automatic (via deploy script) - -```bash -./scripts/deploy-embedded.sh pi@device --with-llama -``` - -### Manual Installation - -```bash -# SSH to device -ssh pi@raspberrypi.local - -# Install dependencies -sudo apt update -sudo apt install -y build-essential cmake git wget - -# Clone llama.cpp -cd /opt -sudo git clone https://github.com/ggerganov/llama.cpp -sudo chown -R $(whoami):$(whoami) llama.cpp -cd llama.cpp - -# Build for ARM (auto-optimizes) -mkdir build && cd build -cmake .. 
-DLLAMA_NATIVE=ON -DCMAKE_BUILD_TYPE=Release -make -j$(nproc) - -# Download model -mkdir -p /opt/llama.cpp/models -cd /opt/llama.cpp/models -wget https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf -``` - -### Start Server - -```bash -# Test run -/opt/llama.cpp/build/bin/llama-server \ - -m /opt/llama.cpp/models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf \ - --host 0.0.0.0 \ - --port 8080 \ - -c 2048 \ - --threads 4 - -# Verify -curl http://localhost:8080/v1/models -``` - -### Systemd Service - -Create `/etc/systemd/system/llama-server.service`: - -```ini -[Unit] -Description=llama.cpp Server - Local LLM -After=network.target - -[Service] -Type=simple -User=root -WorkingDirectory=/opt/llama.cpp -ExecStart=/opt/llama.cpp/build/bin/llama-server \ - -m /opt/llama.cpp/models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf \ - --host 0.0.0.0 \ - --port 8080 \ - -c 2048 \ - -ngl 0 \ - --threads 4 -Restart=always -RestartSec=5 - -[Install] -WantedBy=multi-user.target -``` - -Enable and start: -```bash -sudo systemctl daemon-reload -sudo systemctl enable llama-server -sudo systemctl start llama-server -``` - -## Configuration - -### botserver .env - -```env -# Use local llama.cpp -LLM_PROVIDER=llamacpp -LLM_API_URL=http://127.0.0.1:8080 -LLM_MODEL=tinyllama - -# Memory limits -MAX_CONTEXT_TOKENS=2048 -MAX_RESPONSE_TOKENS=512 -STREAMING_ENABLED=true -``` - -### llama.cpp Parameters - -| Parameter | Default | Description | -|-----------|---------|-------------| -| `-c` | 2048 | Context size (tokens) | -| `--threads` | 4 | CPU threads | -| `-ngl` | 0 | GPU layers (0 for CPU only) | -| `--host` | 127.0.0.1 | Bind address | -| `--port` | 8080 | Server port | -| `-b` | 512 | Batch size | -| `--mlock` | off | Lock model in RAM | - -### Memory vs Context Size - -``` -Context 512: ~400MB RAM, fast, limited conversation -Context 1024: ~600MB RAM, moderate -Context 2048: ~900MB RAM, good for most uses -Context 4096: ~1.5GB RAM, long conversations -``` - -## Performance Optimization - -### CPU Optimization - -```bash -# Check CPU features -cat /proc/cpuinfo | grep -E "(model name|Features)" - -# Build with specific optimizations -cmake .. -DLLAMA_NATIVE=ON \ - -DCMAKE_BUILD_TYPE=Release \ - -DLLAMA_ARM_FMA=ON \ - -DLLAMA_ARM_DOTPROD=ON -``` - -### Memory Optimization - -```bash -# For 2GB RAM devices -# Use smaller context --c 1024 - -# Use memory mapping (slower but less RAM) ---mmap - -# Disable mlock (don't pin to RAM) -# (default is disabled) -``` - -### Swap Configuration - -For devices with limited RAM: - -```bash -# Create 2GB swap -sudo fallocate -l 2G /swapfile -sudo chmod 600 /swapfile -sudo mkswap /swapfile -sudo swapon /swapfile - -# Make permanent -echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab - -# Optimize swap usage -echo 'vm.swappiness=10' | sudo tee -a /etc/sysctl.conf -``` - -## NPU Acceleration (Orange Pi 5) - -Orange Pi 5 has a 6 TOPS NPU that can accelerate inference: - -### Using rkllm (Rockchip NPU) - -```bash -# Install rkllm runtime -git clone https://github.com/airockchip/rknn-llm -cd rknn-llm -./install.sh - -# Convert model to RKNN format -python3 convert_model.py \ - --model tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf \ - --output tinyllama.rkllm - -# Run with NPU -rkllm-server \ - --model tinyllama.rkllm \ - --port 8080 -``` - -Expected speedup: **3-5x faster** than CPU only. 
- -## Model Download URLs - -### TinyLlama 1.1B (Recommended for 2GB) -```bash -wget https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf -``` - -### Phi-2 2.7B (Recommended for 4GB) -```bash -wget https://huggingface.co/TheBloke/phi-2-GGUF/resolve/main/phi-2.Q4_K_M.gguf -``` - -### Gemma 2B -```bash -wget https://huggingface.co/bartowski/gemma-2-2b-it-GGUF/resolve/main/gemma-2-2b-it-Q4_K_M.gguf -``` - -### Llama 3.2 3B (Recommended for 8GB) -```bash -wget https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-GGUF/resolve/main/Llama-3.2-3B-Instruct-Q4_K_M.gguf -``` - -### Mistral 7B -```bash -wget https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/resolve/main/mistral-7b-instruct-v0.2.Q4_K_M.gguf -``` - -## API Usage - -llama.cpp exposes an OpenAI-compatible API: - -### Chat Completion - -```bash -curl http://localhost:8080/v1/chat/completions \ - -H "Content-Type: application/json" \ - -d '{ - "model": "tinyllama", - "messages": [ - {"role": "user", "content": "What is 2+2?"} - ], - "max_tokens": 100 - }' -``` - -### Streaming - -```bash -curl http://localhost:8080/v1/chat/completions \ - -H "Content-Type: application/json" \ - -d '{ - "model": "tinyllama", - "messages": [{"role": "user", "content": "Tell me a story"}], - "stream": true - }' -``` - -### Health Check - -```bash -curl http://localhost:8080/health -curl http://localhost:8080/v1/models -``` - -## Monitoring - -### Check Performance - -```bash -# Watch resource usage -htop - -# Check inference speed in logs -sudo journalctl -u llama-server -f | grep "tokens/s" - -# Memory usage -free -h -``` - -### Benchmarking - -```bash -# Run llama.cpp benchmark -/opt/llama.cpp/build/bin/llama-bench \ - -m /opt/llama.cpp/models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf \ - -p 512 -n 128 -t 4 -``` - -## Troubleshooting - -### Model Loading Fails - -```bash -# Check available RAM -free -h - -# Try smaller context --c 512 - -# Use memory mapping ---mmap -``` - -### Slow Inference - -```bash -# Increase threads (up to CPU cores) ---threads $(nproc) - -# Use optimized build -cmake .. -DLLAMA_NATIVE=ON - -# Consider smaller model -``` - -### Out of Memory Killer - -```bash -# Check if OOM killed the process -dmesg | grep -i "killed process" - -# Increase swap -# Use smaller model -# Reduce context size -``` - -## Best Practices - -1. **Start small** - Begin with TinyLlama, upgrade if needed -2. **Monitor memory** - Use `htop` during initial tests -3. **Set appropriate context** - 1024-2048 for most embedded use -4. **Use quantized models** - Q4_K_M is a good balance -5. **Enable streaming** - Better UX on slow inference -6. **Test offline** - Verify it works without internet before deployment +# Local LLM with llama.cpp diff --git a/src/20-embedding/quick-start.md b/src/20-embedding/quick-start.md index b7a15fb2..05cf8c1f 100644 --- a/src/20-embedding/quick-start.md +++ b/src/20-embedding/quick-start.md @@ -1,209 +1 @@ -# Quick Start - Deploy in 5 Minutes - -Get General Bots running on your embedded device with local AI in just a few commands. - -## Prerequisites - -- An SBC (Raspberry Pi, Orange Pi, etc.) 
with Armbian/Raspbian -- SSH access to the device -- Internet connection (for initial setup only) - -## One-Line Deploy - -From your development machine: - -```bash -# Clone and run the deployment script -git clone https://github.com/GeneralBots/botserver.git -cd botserver - -# Deploy to Orange Pi (replace with your device IP) -./scripts/deploy-embedded.sh orangepi@192.168.1.100 --with-ui --with-llama -``` - -That's it! After ~10-15 minutes: -- BotServer runs on port 8088 -- llama.cpp runs on port 8080 with TinyLlama -- Embedded UI available at `http://your-device:8088/embedded/` - -## Step-by-Step Guide - -### Step 1: Prepare Your Device - -Flash your SBC with a compatible OS: - -**Raspberry Pi:** -```bash -# Download Raspberry Pi Imager -# Select: Raspberry Pi OS Lite (64-bit) -# Enable SSH in settings -``` - -**Orange Pi:** -```bash -# Download Armbian from armbian.com -# Flash with balenaEtcher -``` - -### Step 2: First Boot Configuration - -```bash -# SSH into your device -ssh pi@raspberrypi.local # or orangepi@orangepi.local - -# Update system -sudo apt update && sudo apt upgrade -y - -# Set timezone -sudo timedatectl set-timezone America/Sao_Paulo - -# Enable I2C/SPI if using GPIO displays -sudo raspi-config # or armbian-config -``` - -### Step 3: Run Deployment Script - -From your development PC: - -```bash -# Basic deployment (botserver only) -./scripts/deploy-embedded.sh pi@raspberrypi.local - -# With embedded UI -./scripts/deploy-embedded.sh pi@raspberrypi.local --with-ui - -# With local LLM (requires 4GB+ RAM) -./scripts/deploy-embedded.sh pi@raspberrypi.local --with-ui --with-llama - -# Specify a different model -./scripts/deploy-embedded.sh pi@raspberrypi.local --with-llama --model phi-2-Q4_K_M.gguf -``` - -### Step 4: Verify Installation - -```bash -# Check services -ssh pi@raspberrypi.local 'sudo systemctl status botserver' -ssh pi@raspberrypi.local 'sudo systemctl status llama-server' - -# Test botserver -curl http://raspberrypi.local:8088/health - -# Test llama.cpp -curl http://raspberrypi.local:8080/v1/models -``` - -### Step 5: Access the Interface - -Open in your browser: -``` -http://raspberrypi.local:8088/embedded/ -``` - -Or set up kiosk mode (auto-starts on boot): -```bash -# Already configured if you used --with-ui -# Just reboot: -ssh pi@raspberrypi.local 'sudo reboot' -``` - -## Local Installation (On the Device) - -If you prefer to install directly on the device: - -```bash -# SSH into the device -ssh pi@raspberrypi.local - -# Install Rust -curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -source ~/.cargo/env - -# Clone and build -git clone https://github.com/GeneralBots/botserver.git -cd botserver - -# Run local deployment -./scripts/deploy-embedded.sh --local --with-ui --with-llama -``` - -⚠️ **Note:** Building on ARM devices is slow (1-2 hours). Cross-compilation is faster. 
- -## Configuration - -After deployment, edit the config file: - -```bash -ssh pi@raspberrypi.local -sudo nano /opt/botserver/.env -``` - -Key settings: -```env -# Server -HOST=0.0.0.0 -PORT=8088 - -# Local LLM -LLM_PROVIDER=llamacpp -LLM_API_URL=http://127.0.0.1:8080 -LLM_MODEL=tinyllama - -# Memory limits for small devices -MAX_CONTEXT_TOKENS=2048 -MAX_RESPONSE_TOKENS=512 -``` - -Restart after changes: -```bash -sudo systemctl restart botserver -``` - -## Troubleshooting - -### Out of Memory - -```bash -# Check memory usage -free -h - -# Reduce llama.cpp context -sudo nano /etc/systemd/system/llama-server.service -# Change -c 2048 to -c 1024 - -# Or use a smaller model -# TinyLlama uses ~700MB, Phi-2 uses ~1.6GB -``` - -### Service Won't Start - -```bash -# Check logs -sudo journalctl -u botserver -f -sudo journalctl -u llama-server -f - -# Common issues: -# - Port already in use -# - Missing model file -# - Database permissions -``` - -### Display Not Working - -```bash -# Check if display is detected -ls /dev/fb* # HDMI/DSI -ls /dev/i2c* # I2C displays -ls /dev/spidev* # SPI displays - -# For HDMI, check config -sudo nano /boot/config.txt # Raspberry Pi -sudo nano /boot/armbianEnv.txt # Orange Pi -``` - -## Next Steps - -- [Embedded UI Guide](./embedded-ui.md) - Customize the interface -- [Local LLM Configuration](./local-llm.md) - Optimize AI performance -- [Kiosk Mode](./kiosk-mode.md) - Production deployment -- [Offline Operation](./offline.md) - Disconnected environments +# Quick Start diff --git a/src/SUMMARY.md b/src/SUMMARY.md index 5c52ab4e..02663eba 100644 --- a/src/SUMMARY.md +++ b/src/SUMMARY.md @@ -320,9 +320,17 @@ - [Permissions Matrix](./12-auth/permissions-matrix.md) - [User Context vs System Context](./12-auth/user-system-context.md) -# Part XII - Community +# Part XII - Device & Offline Deployment -- [Chapter 13: Contributing](./13-community/README.md) +- [Chapter 13: Device Deployment](./13-devices/README.md) + - [Mobile (Android & HarmonyOS)](./13-devices/mobile.md) + - [Supported Hardware (SBCs)](./13-devices/hardware.md) + - [Quick Start](./13-devices/quick-start.md) + - [Local LLM with llama.cpp](./13-devices/local-llm.md) + +# Part XIII - Community + +- [Chapter 14: Contributing](./13-community/README.md) - [Development Setup](./13-community/setup.md) - [Testing Guide](./13-community/testing.md) - [Documentation](./13-community/documentation.md) @@ -330,9 +338,9 @@ - [Community Guidelines](./13-community/community.md) - [IDEs](./13-community/ide-extensions.md) -# Part XIII - Migration +# Part XIV - Migration -- [Chapter 14: Migration Guide](./14-migration/README.md) +- [Chapter 15: Migration Guide](./14-migration/README.md) - [Migration Overview](./14-migration/overview.md) - [Platform Comparison Matrix](./14-migration/comparison-matrix.md) - [Migration Resources](./14-migration/resources.md) @@ -350,9 +358,9 @@ - [Automation Migration](./14-migration/automation.md) - [Validation and Testing](./14-migration/validation.md) -# Part XIV - Testing +# Part XV - Testing -- [Chapter 17: Testing](./17-testing/README.md) +- [Chapter 16: Testing](./17-testing/README.md) - [End-to-End Testing](./17-testing/e2e-testing.md) - [Testing Architecture](./17-testing/architecture.md) - [Performance Testing](./17-testing/performance.md) @@ -390,12 +398,5 @@ - [Appendix D: Documentation Style](./16-appendix-docs-style/conversation-examples.md) - [SVG and Conversation Standards](./16-appendix-docs-style/svg.md) -# Part XV - Embedded & Offline - -- [Chapter 20: Embedded 
Deployment](./20-embedding/README.md) - - [Supported Hardware](./20-embedding/hardware.md) - - [Quick Start](./20-embedding/quick-start.md) - - [Local LLM with llama.cpp](./20-embedding/local-llm.md) - [Glossary](./glossary.md) [Contact](./contact/README.md)