Add Chapter 13: Device & Offline Deployment documentation
- Add mobile deployment guide for Android & HarmonyOS (BotOS)
- Add hardware guide for SBCs (Raspberry Pi, Orange Pi, etc.)
- Add quick start guide for 5-minute deployment
- Add local LLM guide with llama.cpp for offline AI
- Update SUMMARY.md to place chapter after Security (Part XII)
- Include bloatware removal, Magisk module, GSI instructions
- Cover NPU acceleration on Orange Pi 5 with rkllm
This commit is contained in:
parent ff5d2ac12c
commit d3bc28fac6

10 changed files with 1175 additions and 840 deletions
src/13-devices/README.md (Normal file, 54 lines)
@ -0,0 +1,54 @@
# Chapter 13: Device & Offline Deployment

Deploy General Bots to any device - from smartphones to Raspberry Pi to industrial kiosks - with local LLM inference for fully offline AI capabilities.

## Overview

General Bots can run on any device, from mobile phones to minimal embedded hardware with displays as small as 16x2 character LCDs, enabling AI-powered interactions anywhere:

- **Kiosks** - Self-service terminals in stores, airports, hospitals
- **Industrial IoT** - Factory floor assistants, machine interfaces
- **Smart Home** - Wall panels, kitchen displays, door intercoms
- **Retail** - Point-of-sale systems, product information terminals
- **Education** - Classroom assistants, lab equipment interfaces
- **Healthcare** - Patient check-in, medication reminders

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                          Embedded GB Architecture                           │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   ┌──────────────┐      ┌──────────────┐      ┌──────────────┐              │
│   │   Display    │      │  botserver   │      │  llama.cpp   │              │
│   │  LCD/OLED    │────▶ │    (Rust)    │────▶ │   (Local)    │              │
│   │  TFT/HDMI    │      │  Port 8088   │      │  Port 8080   │              │
│   └──────────────┘      └──────────────┘      └──────────────┘              │
│          │                     │                     │                      │
│          │                     │                     │                      │
│   ┌──────▼──────┐       ┌──────▼──────┐       ┌──────▼──────┐               │
│   │  Keyboard   │       │   SQLite    │       │  TinyLlama  │               │
│   │  Buttons    │       │   (Data)    │       │    GGUF     │               │
│   │  Touch      │       │             │       │  (~700MB)   │               │
│   └─────────────┘       └─────────────┘       └─────────────┘               │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

## What's in This Chapter

### Mobile Deployment
- [Mobile (Android & HarmonyOS)](./mobile.md) - BotOS for smartphones and tablets

### Embedded Deployment
- [Supported Hardware](./hardware.md) - SBCs, displays, and peripherals
- [Quick Start](./quick-start.md) - Deploy in 5 minutes
- [Local LLM](./local-llm.md) - Offline AI with llama.cpp

### Deployment Options

| Platform | Use Case | Requirements |
|----------|----------|--------------|
| **Android/HarmonyOS** | Smartphones, tablets, kiosks | Any Android 8+ device |
| **Raspberry Pi** | IoT, displays, terminals | 1GB+ RAM |
| **Orange Pi** | Full offline AI | 4GB+ RAM for LLM |
| **Industrial** | Factory, retail, healthcare | Any ARM/x86 SBC |
src/13-devices/hardware.md (Normal file, 190 lines)
@ -0,0 +1,190 @@
# Supported Hardware

## Single Board Computers (SBCs)

### Recommended Boards

| Board | CPU | RAM | Best For | Price |
|-------|-----|-----|----------|-------|
| **Orange Pi 5** | RK3588S | 4-16GB | Full LLM, NPU accel | $89-149 |
| **Raspberry Pi 5** | BCM2712 | 4-8GB | General purpose | $60-80 |
| **Orange Pi Zero 3** | H618 | 1-4GB | Minimal deployments | $20-35 |
| **Raspberry Pi 4** | BCM2711 | 2-8GB | Established ecosystem | $45-75 |
| **Raspberry Pi Zero 2W** | RP3A0 | 512MB | Ultra-compact | $15 |
| **Rock Pi 4** | RK3399 | 4GB | NPU available | $75 |
| **NVIDIA Jetson Nano** | Tegra X1 | 4GB | GPU inference | $149 |
| **BeagleBone Black** | AM3358 | 512MB | Industrial | $55 |
| **LattePanda 3 Delta** | N100 | 8GB | x86 compatibility | $269 |
| **ODROID-N2+** | S922X | 4GB | High performance | $79 |

### Minimum Requirements

**For UI only (connect to remote botserver):**
- Any ARM/x86 Linux board
- 256MB RAM
- Network connection
- Display output

**For local botserver:**
- ARM64 or x86_64
- 1GB RAM minimum
- 4GB storage

**For local LLM (llama.cpp):**
- ARM64 or x86_64
- 2GB+ RAM (4GB recommended)
- 2GB+ storage for model

### Orange Pi 5 (Recommended for LLM)

The Orange Pi 5 with RK3588S is ideal for embedded LLM:

```
┌─────────────────────────────────────────────────────────────┐
│              Orange Pi 5 - Best for Offline AI              │
├─────────────────────────────────────────────────────────────┤
│  CPU:     Rockchip RK3588S (4x A76 + 4x A55)                │
│  NPU:     6 TOPS (Neural Processing Unit)                   │
│  GPU:     Mali-G610 MP4                                     │
│  RAM:     4GB / 8GB / 16GB LPDDR4X                          │
│  Storage: M.2 NVMe + eMMC + microSD                         │
│                                                             │
│  LLM Performance:                                           │
│  ├─ TinyLlama 1.1B Q4:  ~8-12 tokens/sec                    │
│  ├─ Phi-2 2.7B Q4:      ~4-6 tokens/sec                     │
│  └─ With NPU (rkllm):   ~20-30 tokens/sec                   │
└─────────────────────────────────────────────────────────────┘
```

## Displays

### Character LCDs (Minimal)

For text-only interfaces:

| Display | Resolution | Interface | Use Case |
|---------|------------|-----------|----------|
| HD44780 16x2 | 16 chars × 2 lines | I2C/GPIO | Status, simple Q&A |
| HD44780 20x4 | 20 chars × 4 lines | I2C/GPIO | More context |
| LCD2004 | 20 chars × 4 lines | I2C | Industrial |

**Example output on 16x2:**
```
┌────────────────┐
│> How can I help│
│< Processing... │
└────────────────┘
```
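A reply rarely fits a 16x2 panel as-is, so the UI layer has to wrap and pad text before writing it to the controller. A minimal sketch in Python (`format_lcd` is a hypothetical helper for illustration, not part of botserver):

```python
import textwrap

def format_lcd(text, cols=16, rows=2):
    """Wrap a reply to fit a character LCD, padding each row to full width."""
    lines = textwrap.wrap(text, width=cols)[:rows]  # truncate overflow
    while len(lines) < rows:
        lines.append("")
    return [line.ljust(cols) for line in lines]

# Render with borders, as in the example above
for row in format_lcd("How can I help you today?"):
    print("|" + row + "|")
```

A real driver would page longer replies across multiple frames instead of truncating.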
### OLED Displays

For graphical monochrome interfaces:

| Display | Resolution | Interface | Size |
|---------|------------|-----------|------|
| SSD1306 | 128×64 | I2C/SPI | 0.96" |
| SSD1309 | 128×64 | I2C/SPI | 2.42" |
| SH1106 | 128×64 | I2C/SPI | 1.3" |
| SSD1322 | 256×64 | SPI | 3.12" |

### TFT/IPS Color Displays

For full graphical interface:

| Display | Resolution | Interface | Notes |
|---------|------------|-----------|-------|
| ILI9341 | 320×240 | SPI | Common, cheap |
| ST7789 | 240×320 | SPI | Fast refresh |
| ILI9488 | 480×320 | SPI | Larger |
| Waveshare 5" | 800×480 | HDMI | Touch optional |
| Waveshare 7" | 1024×600 | HDMI | Touch, IPS |
| Official Pi 7" | 800×480 | DSI | Best for Pi |

### E-Ink/E-Paper

For low-power, readable in sunlight:

| Display | Resolution | Colors | Refresh |
|---------|------------|--------|---------|
| Waveshare 2.13" | 250×122 | B/W | 2s |
| Waveshare 4.2" | 400×300 | B/W | 4s |
| Waveshare 7.5" | 800×480 | B/W | 5s |
| Good Display 9.7" | 1200×825 | B/W | 6s |

**Best for:** Menu displays, signs, low-update applications

### Industrial Displays

| Display | Resolution | Features |
|---------|------------|----------|
| Advantech | Various | Wide temp, sunlight |
| Winstar | Various | Industrial grade |
| Newhaven | Various | Long availability |

## Input Devices

### Keyboards

- **USB Keyboard** - Standard, any USB keyboard works
- **PS/2 Keyboard** - Via adapter, lower latency
- **Matrix Keypad** - 4x4 or 3x4, GPIO connected
- **I2C Keypad** - Fewer GPIO pins needed

### Touch Input

- **Capacitive Touch** - Better response, needs driver
- **Resistive Touch** - Works with gloves, pressure-based
- **IR Touch Frame** - Large displays, vandal-resistant

### Buttons & GPIO

```
┌─────────────────────────────────────────────┐
│         Simple 4-Button Interface           │
├─────────────────────────────────────────────┤
│                                             │
│  [◄ PREV]  [▲ UP]  [▼ DOWN]  [► SELECT]     │
│                                             │
│  GPIO 17   GPIO 27  GPIO 22   GPIO 23       │
│                                             │
└─────────────────────────────────────────────┘
```
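On a Pi-class board these buttons are typically wired as active-low GPIO inputs with internal pull-ups. A hedged sketch of the read path, assuming a `read_pin(bcm) -> bool` callable backed by whatever GPIO library is in use (RPi.GPIO, gpiozero, etc.):

```python
# Pin map from the wiring diagram above (BCM numbering)
BUTTONS = {"prev": 17, "up": 27, "down": 22, "select": 23}

def scan_buttons(read_pin, buttons=BUTTONS):
    """Return names of currently pressed buttons.

    Active low: a pressed button pulls the pin to ground, so read_pin()
    returns False for a press.
    """
    return [name for name, bcm in buttons.items() if not read_pin(bcm)]

# Quick check with a fake reader: prev and select held down
fake = {17: False, 27: True, 22: True, 23: False}
print(scan_buttons(lambda bcm: fake[bcm]))  # ['prev', 'select']
```

A real deployment would debounce (ignore changes shorter than ~50ms) and poll this from the UI loop.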
## Enclosures

### Commercial Options

- **Hammond Manufacturing** - Industrial metal enclosures
- **Polycase** - Plastic, IP65 rated
- **Bud Industries** - Various sizes
- **Pi-specific cases** - Argon, Flirc, etc.

### DIY Options

- **3D Printed** - Custom fit, PLA/PETG
- **Laser Cut** - Acrylic, wood
- **Metal Fabrication** - Professional look

## Power

### Power Requirements

| Configuration | Power | Recommended PSU |
|---------------|-------|-----------------|
| Pi Zero + LCD | 1-2W | 5V 1A |
| Pi 4 + Display | 5-10W | 5V 3A |
| Orange Pi 5 | 8-15W | 5V 4A or 12V 2A |
| With NVMe SSD | +2-3W | Add 1A headroom |
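The headroom rule in the table can be applied mechanically: sum the worst-case wattage of each component, add margin, and convert to amps at the rail voltage. A rough sizing helper (the wattage figures are the table's estimates, and the 25% margin is a conservative assumption, not a specification):

```python
def psu_amps(watts_parts, volts=5.0, margin=1.25):
    """Minimum PSU current: total worst-case watts, plus margin, at the rail voltage."""
    total_w = sum(watts_parts) * margin
    return total_w / volts

# Orange Pi 5 (15W worst case) + NVMe SSD (3W) on a 5V rail
print(round(psu_amps([15, 3]), 1), "A")  # → 4.5 A, so a 5V 5A supply is safe
```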
### Power Options

- **USB-C PD** - Modern, efficient
- **PoE HAT** - Power over Ethernet
- **12V Barrel** - Industrial standard
- **Battery** - UPS, solar applications

### UPS Solutions

- **PiJuice** - Pi-specific UPS HAT
- **UPS PIco** - Small form factor
- **Powerboost** - Adafruit, lithium battery
src/13-devices/local-llm.md (Normal file, 382 lines)
@ -0,0 +1,382 @@
# Local LLM - Offline AI with llama.cpp

Run AI inference completely offline on embedded devices. No internet, no API costs, full privacy.

## Overview

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                           Local LLM Architecture                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   User Input ──▶ botserver ──▶ llama.cpp ──▶ Response                       │
│                      │              │                                       │
│                      │         ┌────┴────┐                                  │
│                      │         │  Model  │                                  │
│                      │         │  GGUF   │                                  │
│                      │         │ (Q4_K)  │                                  │
│                      │         └─────────┘                                  │
│                      │                                                      │
│                 SQLite DB                                                   │
│                (sessions)                                                   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

## Recommended Models

### By Device RAM

| RAM | Model | Size | Speed | Quality |
|-----|-------|------|-------|---------|
| **2GB** | TinyLlama 1.1B Q4_K_M | 670MB | ~5 tok/s | Basic |
| **4GB** | Phi-2 2.7B Q4_K_M | 1.6GB | ~3-4 tok/s | Good |
| **4GB** | Gemma 2B Q4_K_M | 1.4GB | ~4 tok/s | Good |
| **8GB** | Llama 3.2 3B Q4_K_M | 2GB | ~3 tok/s | Better |
| **8GB** | Mistral 7B Q4_K_M | 4.1GB | ~2 tok/s | Great |
| **16GB** | Llama 3.1 8B Q4_K_M | 4.7GB | ~2 tok/s | Excellent |
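The table maps naturally onto a small lookup: pick the largest model whose RAM tier the device satisfies. A sketch, keeping one representative model per tier (tiers and names copied from the table above; the helper itself is illustrative, not part of botserver):

```python
# (min_ram_gb, model_name) in ascending order, one entry per tier
MODELS = [
    (2, "TinyLlama 1.1B Q4_K_M"),
    (4, "Phi-2 2.7B Q4_K_M"),
    (8, "Llama 3.2 3B Q4_K_M"),
    (16, "Llama 3.1 8B Q4_K_M"),
]

def recommend(ram_gb):
    """Largest model whose RAM tier fits the device, or None if too small."""
    fitting = [name for min_ram, name in MODELS if min_ram <= ram_gb]
    return fitting[-1] if fitting else None

print(recommend(4))  # → Phi-2 2.7B Q4_K_M
```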
### By Use Case

**Simple Q&A, Commands:**
```
TinyLlama 1.1B - Fast, basic understanding
```

**Customer Service, FAQ:**
```
Phi-2 or Gemma 2B - Good comprehension, reasonable speed
```

**Complex Reasoning:**
```
Llama 3.2 3B or Mistral 7B - Better accuracy, slower
```

## Installation

### Automatic (via deploy script)

```bash
./scripts/deploy-embedded.sh pi@device --with-llama
```

### Manual Installation

```bash
# SSH to device
ssh pi@raspberrypi.local

# Install dependencies
sudo apt update
sudo apt install -y build-essential cmake git wget

# Clone llama.cpp
cd /opt
sudo git clone https://github.com/ggerganov/llama.cpp
sudo chown -R $(whoami):$(whoami) llama.cpp
cd llama.cpp

# Build for ARM (auto-optimizes)
mkdir build && cd build
cmake .. -DLLAMA_NATIVE=ON -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)

# Download model
mkdir -p /opt/llama.cpp/models
cd /opt/llama.cpp/models
wget https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf
```

### Start Server

```bash
# Test run
/opt/llama.cpp/build/bin/llama-server \
  -m /opt/llama.cpp/models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf \
  --host 0.0.0.0 \
  --port 8080 \
  -c 2048 \
  --threads 4

# Verify
curl http://localhost:8080/v1/models
```

### Systemd Service

Create `/etc/systemd/system/llama-server.service`:

```ini
[Unit]
Description=llama.cpp Server - Local LLM
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/opt/llama.cpp
ExecStart=/opt/llama.cpp/build/bin/llama-server \
  -m /opt/llama.cpp/models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf \
  --host 0.0.0.0 \
  --port 8080 \
  -c 2048 \
  -ngl 0 \
  --threads 4
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Enable and start:
```bash
sudo systemctl daemon-reload
sudo systemctl enable llama-server
sudo systemctl start llama-server
```

## Configuration

### botserver .env

```env
# Use local llama.cpp
LLM_PROVIDER=llamacpp
LLM_API_URL=http://127.0.0.1:8080
LLM_MODEL=tinyllama

# Memory limits
MAX_CONTEXT_TOKENS=2048
MAX_RESPONSE_TOKENS=512
STREAMING_ENABLED=true
```

### llama.cpp Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `-c` | 2048 | Context size (tokens) |
| `--threads` | 4 | CPU threads |
| `-ngl` | 0 | GPU layers (0 for CPU only) |
| `--host` | 127.0.0.1 | Bind address |
| `--port` | 8080 | Server port |
| `-b` | 512 | Batch size |
| `--mlock` | off | Lock model in RAM |

### Memory vs Context Size

```
Context  512: ~400MB RAM, fast, limited conversation
Context 1024: ~600MB RAM, moderate
Context 2048: ~900MB RAM, good for most uses
Context 4096: ~1.5GB RAM, long conversations
```
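The growth above is driven by the KV cache, which scales linearly with context: two tensors (keys and values) per layer, per position. A back-of-envelope estimate of just that cache (layer and embedding counts for TinyLlama are taken from its model card; treat the result as an approximation, since llama.cpp adds its own buffers on top of weights and cache):

```python
def kv_cache_mb(n_layers, n_embd, n_ctx, bytes_per_elem=2):
    """Rough f16 KV-cache size: keys + values for every layer and position."""
    return 2 * n_layers * n_ctx * n_embd * bytes_per_elem / (1024 ** 2)

# TinyLlama 1.1B: 22 layers, embedding dim 2048
print(round(kv_cache_mb(22, 2048, 2048)))  # → 352 (MB, on top of model weights)
```

Halving `-c` halves this term, which is why shrinking the context is the first lever on 2GB boards.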
## Performance Optimization

### CPU Optimization

```bash
# Check CPU features
cat /proc/cpuinfo | grep -E "(model name|Features)"

# Build with specific optimizations
cmake .. -DLLAMA_NATIVE=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLAMA_ARM_FMA=ON \
  -DLLAMA_ARM_DOTPROD=ON
```

### Memory Optimization

```bash
# For 2GB RAM devices

# Use a smaller context
-c 1024

# Memory mapping is the default: weights are paged in from disk,
# keeping resident RAM lower at some speed cost

# Leave --mlock off (the default) so the model is not pinned in RAM
```

### Swap Configuration

For devices with limited RAM:

```bash
# Create 2GB swap
sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

# Make permanent
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab

# Optimize swap usage
echo 'vm.swappiness=10' | sudo tee -a /etc/sysctl.conf
```

## NPU Acceleration (Orange Pi 5)

The Orange Pi 5 has a 6 TOPS NPU that can accelerate inference:

### Using rkllm (Rockchip NPU)

```bash
# Install rkllm runtime
git clone https://github.com/airockchip/rknn-llm
cd rknn-llm
./install.sh

# Convert model to RKNN format
python3 convert_model.py \
  --model tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf \
  --output tinyllama.rkllm

# Run with NPU
rkllm-server \
  --model tinyllama.rkllm \
  --port 8080
```

Expected speedup: **3-5x faster** than CPU only.

## Model Download URLs

### TinyLlama 1.1B (Recommended for 2GB)
```bash
wget https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf
```

### Phi-2 2.7B (Recommended for 4GB)
```bash
wget https://huggingface.co/TheBloke/phi-2-GGUF/resolve/main/phi-2.Q4_K_M.gguf
```

### Gemma 2B
```bash
wget https://huggingface.co/bartowski/gemma-2-2b-it-GGUF/resolve/main/gemma-2-2b-it-Q4_K_M.gguf
```

### Llama 3.2 3B (Recommended for 8GB)
```bash
wget https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-GGUF/resolve/main/Llama-3.2-3B-Instruct-Q4_K_M.gguf
```

### Mistral 7B
```bash
wget https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/resolve/main/mistral-7b-instruct-v0.2.Q4_K_M.gguf
```

## API Usage

llama.cpp exposes an OpenAI-compatible API:

### Chat Completion

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tinyllama",
    "messages": [
      {"role": "user", "content": "What is 2+2?"}
    ],
    "max_tokens": 100
  }'
```
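The same call from Python, using only the standard library. The endpoint and payload mirror the curl example above; `chat` and `build_chat_request` are illustrative helpers, not a botserver API:

```python
import json
from urllib import request

def build_chat_request(prompt, model="tinyllama", max_tokens=100):
    """Payload for llama.cpp's OpenAI-compatible /v1/chat/completions."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def chat(prompt, base_url="http://localhost:8080"):
    """Send one user turn and return the assistant's reply text."""
    req = request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Usage (with the server from the systemd unit running): `print(chat("What is 2+2?"))`.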
### Streaming

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tinyllama",
    "messages": [{"role": "user", "content": "Tell me a story"}],
    "stream": true
  }'
```
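With `"stream": true` the server emits Server-Sent Events: one `data:` line per token delta, terminated by `data: [DONE]`. A sketch of reassembling the reply, assuming the OpenAI-style delta format that the endpoint emits:

```python
import json

def parse_sse_chunks(lines):
    """Join content deltas from an OpenAI-style SSE stream into one string."""
    out = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip comments/keep-alives
        data = line[len("data: "):].strip()
        if data == "[DONE]":
            break
        delta = json.loads(data)["choices"][0].get("delta", {})
        if "content" in delta:
            out.append(delta["content"])
    return "".join(out)

sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print(parse_sse_chunks(sample))  # → Hello
```

Streaming matters on slow hardware: at ~5 tok/s the user sees text immediately instead of waiting many seconds for the full reply.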
### Health Check

```bash
curl http://localhost:8080/health
curl http://localhost:8080/v1/models
```
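Model loading can take tens of seconds on an SBC, so anything that depends on the server should poll the health endpoint rather than assume it is up. A generic retry helper (a sketch; pair it with an HTTP check of your choice):

```python
import time

def wait_until(check, attempts=30, delay=1.0):
    """Call check() until it returns truthy or attempts run out."""
    for _ in range(attempts):
        if check():
            return True
        time.sleep(delay)
    return False

# e.g. with urllib:
#   wait_until(lambda: urllib.request.urlopen("http://localhost:8080/health").status == 200)
```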
## Monitoring

### Check Performance

```bash
# Watch resource usage
htop

# Check inference speed in logs
sudo journalctl -u llama-server -f | grep "tokens/s"

# Memory usage
free -h
```

### Benchmarking

```bash
# Run llama.cpp benchmark
/opt/llama.cpp/build/bin/llama-bench \
  -m /opt/llama.cpp/models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf \
  -p 512 -n 128 -t 4
```

## Troubleshooting

### Model Loading Fails

```bash
# Check available RAM
free -h

# Try a smaller context
-c 512

# Ensure the model is not pinned in RAM
# (memory mapping is the default; leave --mlock off)
```

### Slow Inference

```bash
# Increase threads (up to CPU cores)
--threads $(nproc)

# Use an optimized build
cmake .. -DLLAMA_NATIVE=ON

# Consider a smaller model
```

### Out of Memory Killer

```bash
# Check if the OOM killer terminated the process
dmesg | grep -i "killed process"

# Then: increase swap, use a smaller model, or reduce context size
```

## Best Practices

1. **Start small** - Begin with TinyLlama, upgrade if needed
2. **Monitor memory** - Use `htop` during initial tests
3. **Set appropriate context** - 1024-2048 for most embedded use
4. **Use quantized models** - Q4_K_M is a good balance
5. **Enable streaming** - Better UX on slow inference
6. **Test offline** - Verify it works without internet before deployment
src/13-devices/mobile.md (Normal file, 323 lines)
@ -0,0 +1,323 @@
# Mobile Deployment - Android & HarmonyOS

Deploy General Bots as the primary interface on Android and HarmonyOS devices, transforming them into dedicated AI assistants.

## Overview

BotOS transforms any Android or HarmonyOS device into a dedicated General Bots system, removing manufacturer bloatware and installing GB as the default launcher.

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                             BotOS Architecture                              │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   ┌──────────────────────────────────────────────────────────────────┐      │
│   │                        BotOS App (Tauri)                         │      │
│   ├──────────────────────────────────────────────────────────────────┤      │
│   │  botui/ui/suite   │   Tauri Android   │   src/lib.rs (Rust)      │      │
│   │  (Web Interface)  │  (WebView + NDK)  │  (Backend + Hardware)    │      │
│   └──────────────────────────────────────────────────────────────────┘      │
│                                   │                                         │
│          ┌────────────────────────┴─────────────────────────┐               │
│          │             Android/HarmonyOS System             │               │
│          │  ┌─────────┐ ┌──────────┐ ┌────────┐ ┌─────────┐ │               │
│          │  │ Camera  │ │   GPS    │ │  WiFi  │ │ Storage │ │               │
│          │  └─────────┘ └──────────┘ └────────┘ └─────────┘ │               │
│          └──────────────────────────────────────────────────┘               │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

## Supported Platforms

### Android
- **AOSP** - Pure Android
- **Samsung One UI** - Galaxy devices
- **Xiaomi MIUI** - Mi, Redmi, Poco
- **OPPO ColorOS** - OPPO, OnePlus, Realme
- **Vivo Funtouch/OriginOS**
- **Google Pixel**

### HarmonyOS
- **Huawei** - P series, Mate series, Nova
- **Honor** - Magic series, X series

## Installation Levels

| Level | Requirements | What It Does |
|-------|-------------|--------------|
| **Level 1** | ADB only | Removes bloatware, installs BotOS as app |
| **Level 2** | Root + Magisk | GB boot animation, BotOS as system app |
| **Level 3** | Unlocked bootloader | Full Android replacement with BotOS |

## Quick Installation

### Level 1: Debloat + App (No Root)

```bash
# Clone botos repository
git clone https://github.com/GeneralBots/botos.git
cd botos/rom

# Connect device via USB (enable USB debugging first)
./install.sh
```

The interactive installer will:
1. Detect your device and manufacturer
2. Remove bloatware automatically
3. Install BotOS APK
4. Optionally set as default launcher
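Under the hood, non-root debloating amounts to `adb shell pm uninstall --user 0 <package>` per package: removal for user 0 only, reversible with `pm install-existing`. A sketch of that loop, with the command runner injected so it can be dry-run; the package list is an example, not the installer's actual list:

```python
import subprocess

# Example targets only - the installer selects per-manufacturer lists
EXAMPLE_PACKAGES = ["com.facebook.katana", "com.king.candycrushsaga"]

def debloat(packages, run=subprocess.run):
    """Uninstall each package for user 0 via adb (no root required)."""
    results = {}
    for pkg in packages:
        cmd = ["adb", "shell", "pm", "uninstall", "--user", "0", pkg]
        results[pkg] = run(cmd)
    return results

# Dry run: print the commands instead of executing adb
debloat(EXAMPLE_PACKAGES, run=lambda cmd: print(" ".join(cmd)))
```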
### Level 2: Magisk Module (Root Required)
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Generate Magisk module
|
||||||
|
cd botos/rom/scripts
|
||||||
|
./build-magisk-module.sh
|
||||||
|
|
||||||
|
# Copy to device
|
||||||
|
adb push botos-magisk-v1.0.zip /sdcard/
|
||||||
|
|
||||||
|
# Install via Magisk app
|
||||||
|
# Magisk → Modules → + → Select ZIP → Reboot
|
||||||
|
```
|
||||||
|
|
||||||
|
This adds:
|
||||||
|
- Custom boot animation
|
||||||
|
- BotOS as system app (privileged permissions)
|
||||||
|
- Debloat via overlay
|
||||||
|
|
||||||
|
### Level 3: GSI (Full Replacement)
|
||||||
|
|
||||||
|
For advanced users with unlocked bootloader. See `botos/rom/gsi/README.md`.
|
||||||
|
|
||||||
|
## Bloatware Removed
|
||||||
|
|
||||||
|
### Samsung One UI
|
||||||
|
- Bixby, Samsung Pay, Samsung Pass
|
||||||
|
- Duplicate apps (Email, Calendar, Browser)
|
||||||
|
- AR Zone, Game Launcher
|
||||||
|
- Samsung Free, Samsung Global Goals
|
||||||
|
|
||||||
|
### Huawei EMUI/HarmonyOS
|
||||||
|
- AppGallery, HiCloud, HiCar
|
||||||
|
- Huawei Browser, Music, Video
|
||||||
|
- Petal Maps, Petal Search
|
||||||
|
- AI Life, HiSuite
|
||||||
|
|
||||||
|
### Honor MagicOS
|
||||||
|
- Honor Store, MagicRing
|
||||||
|
- Honor Browser, Music
|
||||||
|
|
||||||
|
### Xiaomi MIUI
|
||||||
|
- MSA (analytics), Mi Apps
|
||||||
|
- GetApps, Mi Cloud
|
||||||
|
- Mi Browser, Mi Music
|
||||||
|
|
||||||
|
### Universal (All Devices)
|
||||||
|
- Pre-installed Facebook, Instagram
|
||||||
|
- Pre-installed Netflix, Spotify
|
||||||
|
- Games like Candy Crush
|
||||||
|
- Carrier bloatware
|
||||||
|
|
||||||
|
## Building from Source
|
||||||
|
|
||||||
|
### Prerequisites
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Install Rust and Android targets
|
||||||
|
rustup target add aarch64-linux-android armv7-linux-androideabi
|
||||||
|
|
||||||
|
# Set up Android SDK/NDK
|
||||||
|
export ANDROID_HOME=$HOME/Android/Sdk
|
||||||
|
export NDK_HOME=$ANDROID_HOME/ndk/25.2.9519653
|
||||||
|
|
||||||
|
# Install Tauri CLI
|
||||||
|
cargo install tauri-cli
|
||||||
|
|
||||||
|
# For icons/boot animation
|
||||||
|
sudo apt install librsvg2-bin imagemagick
|
||||||
|
```
|
||||||
|
|
||||||
|
### Build APK
|
||||||
|
|
||||||
|
```bash
|
||||||
|
cd botos
|
||||||
|
|
||||||
|
# Generate icons from SVG
|
||||||
|
./scripts/generate-icons.sh
|
||||||
|
|
||||||
|
# Initialize Android project
|
||||||
|
cargo tauri android init
|
||||||
|
|
||||||
|
# Build release APK
|
||||||
|
cargo tauri android build --release
|
||||||
|
```
|
||||||
|
|
||||||
|
Output: `gen/android/app/build/outputs/apk/release/app-release.apk`
|
||||||
|
|
||||||
|
### Development Mode
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Connect device and run
|
||||||
|
cargo tauri android dev
|
||||||
|
|
||||||
|
# Watch logs
|
||||||
|
adb logcat -s BotOS:*
|
||||||
|
```
|
||||||
|
|
||||||
|
## Configuration
|
||||||
|
|
||||||
|
### AndroidManifest.xml
|
||||||
|
|
||||||
|
BotOS is configured as a launcher:
|
||||||
|
|
||||||
|
```xml
|
||||||
|
<intent-filter>
|
||||||
|
<action android:name="android.intent.action.MAIN" />
|
||||||
|
<category android:name="android.intent.category.HOME" />
|
||||||
|
<category android:name="android.intent.category.DEFAULT" />
|
||||||
|
<category android:name="android.intent.category.LAUNCHER" />
|
||||||
|
</intent-filter>
|
||||||
|
```
|
||||||
|
|
||||||
|
### Permissions
|
||||||
|
|
||||||
|
Default capabilities in `capabilities/default.json`:
|
||||||
|
- Internet access
|
||||||
|
- Camera (for QR codes, photos)
|
||||||
|
- Location (GPS)
|
||||||
|
- Storage (files)
|
||||||
|
- Notifications
|
||||||
|
|
||||||
|
### Connecting to Server
|
||||||
|
|
||||||
|
Edit the embedded URL in `tauri.conf.json`:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"build": {
|
||||||
|
"frontendDist": "../botui/ui/suite"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Or configure botserver URL at runtime:
|
||||||
|
```javascript
|
||||||
|
window.BOTSERVER_URL = "https://your-server.com";
|
||||||
|
```
|
||||||
|
|
||||||
|
## Boot Animation
|
||||||
|
|
||||||
|
Create custom boot animation with GB branding:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Generate animation
|
||||||
|
cd botos/scripts
|
||||||
|
./create-bootanimation.sh
|
||||||
|
|
||||||
|
# Install (requires root)
|
||||||
|
adb root
|
||||||
|
adb remount
|
||||||
|
adb push bootanimation.zip /system/media/
|
||||||
|
adb reboot
|
||||||
|
```
|
||||||
|
|
||||||
|
## Project Structure
|
||||||
|
|
||||||
|
```
|
||||||
|
botos/
|
||||||
|
├── Cargo.toml # Rust/Tauri dependencies
|
||||||
|
├── tauri.conf.json # Tauri config → botui/ui/suite
|
||||||
|
├── build.rs # Build script
|
||||||
|
├── src/lib.rs # Android entry point
|
||||||
|
│
|
||||||
|
├── icons/
|
||||||
|
│ ├── gb-bot.svg # Source icon
|
||||||
|
│ ├── icon.png # Main icon (512x512)
|
||||||
|
│ └── */ic_launcher.png # Icons by density
|
||||||
|
│
|
||||||
|
├── scripts/
|
||||||
|
│ ├── generate-icons.sh # Generate PNGs from SVG
|
||||||
|
│ └── create-bootanimation.sh
|
||||||
|
│
|
||||||
|
├── capabilities/
|
||||||
|
│ └── default.json # Tauri permissions
|
||||||
|
│
|
||||||
|
├── gen/android/ # Generated Android project
|
||||||
|
│ └── app/src/main/
|
||||||
|
│ ├── AndroidManifest.xml
|
||||||
|
│ └── res/values/themes.xml
|
||||||
|
│
|
||||||
|
└── rom/ # Installation tools
|
||||||
|
├── install.sh # Interactive installer
|
||||||
|
├── scripts/
|
||||||
|
│ ├── debloat.sh # Remove bloatware
|
||||||
|
│ └── build-magisk-module.sh
|
||||||
|
└── gsi/
|
||||||
|
└── README.md # GSI instructions
|
||||||
|
```
|
||||||
|
|
||||||

## Offline Mode

BotOS can work offline with a local LLM:

1. Install botserver on the device (see [Local LLM](./local-llm.md))
2. Configure it to use localhost:
   ```javascript
   window.BOTSERVER_URL = "http://127.0.0.1:8088";
   ```
3. Run llama.cpp with a small model (TinyLlama on 4GB+ devices)
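
The local-first setup above can be scripted: a minimal sketch that prefers the on-device stack when the local LLM answers, and falls back to a remote server otherwise. `probe_llm`, `choose_botserver_url`, and the fallback URL are illustrative assumptions, not part of BotOS itself.

```shell
#!/bin/sh
# Sketch: prefer the on-device botserver when the local llama.cpp endpoint
# responds, otherwise fall back to a remote server. The function names and
# the fallback URL are illustrative assumptions.

LOCAL_URL="http://127.0.0.1:8088"
REMOTE_URL="https://your-server.com"

probe_llm() {
    # Succeeds (exit 0) when the local llama.cpp health endpoint answers
    curl -fsS --max-time 2 "http://127.0.0.1:8080/health" >/dev/null 2>&1
}

choose_botserver_url() {
    if probe_llm; then
        echo "$LOCAL_URL"
    else
        echo "$REMOTE_URL"
    fi
}
```

The chosen URL could then be injected into `window.BOTSERVER_URL` by whatever wrapper launches the UI.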

## Use Cases

### Dedicated Kiosk
- Retail product information
- Hotel check-in
- Restaurant ordering
- Museum guides

### Enterprise Device
- Field service assistant
- Warehouse scanner with AI
- Delivery driver companion
- Healthcare bedside terminal

### Consumer Device
- Elder-friendly phone
- Child-safe device
- Single-purpose assistant
- Smart home controller

## Troubleshooting

### App Won't Install

```bash
# Enable installation from unknown sources:
# Settings → Security → Unknown Sources

# Or use ADB
adb install -r botos.apk
```

### Debloat Not Working

```bash
# Some packages require root.
# Use Level 2 (Magisk) for complete removal.

# Check which packages failed
adb shell pm list packages | grep <manufacturer>
```

### Boot Loop After GSI

```bash
# Boot into recovery
# Wipe data/factory reset
# Reflash the stock ROM
```

### WebView Crashes

```bash
# Update Android System WebView
adb shell pm enable com.google.android.webview
```

209
src/13-devices/quick-start.md
Normal file

@@ -0,0 +1,209 @@
# Quick Start - Deploy in 5 Minutes

Get General Bots running on your embedded device with local AI in just a few commands.

## Prerequisites

- An SBC (Raspberry Pi, Orange Pi, etc.) running Armbian or Raspberry Pi OS
- SSH access to the device
- Internet connection (for initial setup only)

## One-Line Deploy

From your development machine:

```bash
# Clone and run the deployment script
git clone https://github.com/GeneralBots/botserver.git
cd botserver

# Deploy to an Orange Pi (replace with your device's IP)
./scripts/deploy-embedded.sh orangepi@192.168.1.100 --with-ui --with-llama
```

That's it! After ~10-15 minutes:
- BotServer runs on port 8088
- llama.cpp runs on port 8080 with TinyLlama
- The embedded UI is available at `http://your-device:8088/embedded/`

## Step-by-Step Guide

### Step 1: Prepare Your Device

Flash your SBC with a compatible OS:

**Raspberry Pi:**
```bash
# Download Raspberry Pi Imager
# Select: Raspberry Pi OS Lite (64-bit)
# Enable SSH in the settings
```

**Orange Pi:**
```bash
# Download Armbian from armbian.com
# Flash with balenaEtcher
```

### Step 2: First Boot Configuration

```bash
# SSH into your device
ssh pi@raspberrypi.local   # or orangepi@orangepi.local

# Update the system
sudo apt update && sudo apt upgrade -y

# Set the timezone
sudo timedatectl set-timezone America/Sao_Paulo

# Enable I2C/SPI if using GPIO displays
sudo raspi-config   # or armbian-config
```

### Step 3: Run Deployment Script

From your development PC:

```bash
# Basic deployment (botserver only)
./scripts/deploy-embedded.sh pi@raspberrypi.local

# With the embedded UI
./scripts/deploy-embedded.sh pi@raspberrypi.local --with-ui

# With a local LLM (requires 4GB+ RAM)
./scripts/deploy-embedded.sh pi@raspberrypi.local --with-ui --with-llama

# Specify a different model
./scripts/deploy-embedded.sh pi@raspberrypi.local --with-llama --model phi-2-Q4_K_M.gguf
```

### Step 4: Verify Installation

```bash
# Check the services
ssh pi@raspberrypi.local 'sudo systemctl status botserver'
ssh pi@raspberrypi.local 'sudo systemctl status llama-server'

# Test botserver
curl http://raspberrypi.local:8088/health

# Test llama.cpp
curl http://raspberrypi.local:8080/v1/models
```
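
For scripted deployments, the checks above can be wrapped in a retry loop that waits for a service to come up; a minimal sketch, assuming the health endpoint answers over HTTP once the service is ready (`check_health` and `wait_for_service` are illustrative helpers, not part of the deploy script):

```shell
#!/bin/sh
# Sketch: poll a health endpoint until it answers or a retry budget runs
# out. check_health is split out so it can be swapped in tests; the URL
# and defaults below are examples, not values the deploy script defines.

check_health() {
    curl -fsS --max-time 2 "$1" >/dev/null 2>&1
}

wait_for_service() {
    url="$1"
    tries="${2:-30}"   # 30 tries x 2s sleep = about a minute by default
    i=0
    while [ "$i" -lt "$tries" ]; do
        if check_health "$url"; then
            echo "up"
            return 0
        fi
        i=$((i + 1))
        sleep 2
    done
    echo "timeout"
    return 1
}

# e.g. wait_for_service http://raspberrypi.local:8088/health || exit 1
```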

### Step 5: Access the Interface

Open in your browser:

```
http://raspberrypi.local:8088/embedded/
```

Or set up kiosk mode (auto-starts on boot):

```bash
# Already configured if you used --with-ui
# Just reboot:
ssh pi@raspberrypi.local 'sudo reboot'
```

## Local Installation (On the Device)

If you prefer to install directly on the device:

```bash
# SSH into the device
ssh pi@raspberrypi.local

# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source ~/.cargo/env

# Clone and build
git clone https://github.com/GeneralBots/botserver.git
cd botserver

# Run the local deployment
./scripts/deploy-embedded.sh --local --with-ui --with-llama
```

⚠️ **Note:** Building on ARM devices is slow (1-2 hours). Cross-compilation is faster.
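
One way to avoid the on-device build is to cross-compile on the PC and copy the binary over; a sketch assuming a Rust toolchain and the standard glibc targets (the board names in the mapping are examples; confirm the architecture with `uname -m` on your device):

```shell
#!/bin/sh
# Sketch: map a board name to a Rust cross-compilation target triple.
# The board list is an illustrative assumption; all 64-bit boards in this
# chapter use the same aarch64 glibc target.

target_for_board() {
    case "$1" in
        rpi5|rpi4|orangepi5)  echo "aarch64-unknown-linux-gnu" ;;
        rpi-zero2w)           echo "aarch64-unknown-linux-gnu" ;;
        rpi-32bit)            echo "armv7-unknown-linux-gnueabihf" ;;
        *)                    echo "unknown" ;;
    esac
}

# Usage (on the dev PC):
#   rustup target add "$(target_for_board rpi5)"
#   cargo build --release --target "$(target_for_board rpi5)"
#   scp target/aarch64-unknown-linux-gnu/release/botserver pi@raspberrypi.local:
```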

## Configuration

After deployment, edit the config file:

```bash
ssh pi@raspberrypi.local
sudo nano /opt/botserver/.env
```

Key settings:

```env
# Server
HOST=0.0.0.0
PORT=8088

# Local LLM
LLM_PROVIDER=llamacpp
LLM_API_URL=http://127.0.0.1:8080
LLM_MODEL=tinyllama

# Memory limits for small devices
MAX_CONTEXT_TOKENS=2048
MAX_RESPONSE_TOKENS=512
```

Restart after changes:

```bash
sudo systemctl restart botserver
```

## Troubleshooting

### Out of Memory

```bash
# Check memory usage
free -h

# Reduce the llama.cpp context
sudo nano /etc/systemd/system/llama-server.service
# Change -c 2048 to -c 1024

# Or use a smaller model:
# TinyLlama uses ~700MB, Phi-2 uses ~1.6GB
```
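
Before downloading a model, a rough fit check helps avoid memory trouble entirely; a sketch assuming the model loads fully into RAM plus a fixed overhead for context buffers and the server itself (the 512MB figure is an assumption, not a measured value):

```shell
#!/bin/sh
# Sketch: rough check that a GGUF model plus overhead fits in free RAM.
# OVERHEAD_MB (context buffers + server process) is an illustrative guess.

OVERHEAD_MB=512

fits_in_ram() {
    model_mb="$1"
    free_mb="$2"
    if [ $((model_mb + OVERHEAD_MB)) -le "$free_mb" ]; then
        echo "yes"
    else
        echo "no"
    fi
}

# e.g. on the device:
#   fits_in_ram 670 "$(free -m | awk '/^Mem:/ {print $7}')"
```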

### Service Won't Start

```bash
# Check the logs
sudo journalctl -u botserver -f
sudo journalctl -u llama-server -f

# Common issues:
# - Port already in use
# - Missing model file
# - Database permissions
```

### Display Not Working

```bash
# Check whether the display is detected
ls /dev/fb*       # HDMI/DSI
ls /dev/i2c*      # I2C displays
ls /dev/spidev*   # SPI displays

# For HDMI, check the config
sudo nano /boot/config.txt       # Raspberry Pi
sudo nano /boot/armbianEnv.txt   # Orange Pi
```

## Next Steps

- [Embedded UI Guide](./embedded-ui.md) - Customize the interface
- [Local LLM Configuration](./local-llm.md) - Optimize AI performance
- [Kiosk Mode](./kiosk-mode.md) - Production deployment
- [Offline Operation](./offline.md) - Disconnected environments

@@ -1,47 +1 @@
-# Chapter 20: Embedded & Offline Deployment
+# Chapter 20: Embedded Deployment

Deploy General Bots to any device - from Raspberry Pi to industrial kiosks - with local LLM inference for fully offline AI capabilities.

## Overview

General Bots can run on minimal hardware with displays as small as 16x2 character LCDs, enabling AI-powered interactions anywhere:

- **Kiosks** - Self-service terminals in stores, airports, hospitals
- **Industrial IoT** - Factory floor assistants, machine interfaces
- **Smart Home** - Wall panels, kitchen displays, door intercoms
- **Retail** - Point-of-sale systems, product information terminals
- **Education** - Classroom assistants, lab equipment interfaces
- **Healthcare** - Patient check-in, medication reminders

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                          Embedded GB Architecture                           │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   ┌──────────────┐     ┌──────────────┐     ┌──────────────┐                │
│   │   Display    │     │  botserver   │     │  llama.cpp   │                │
│   │   LCD/OLED   │────▶│   (Rust)     │────▶│   (Local)    │                │
│   │   TFT/HDMI   │     │  Port 8088   │     │  Port 8080   │                │
│   └──────────────┘     └──────────────┘     └──────────────┘                │
│          │                    │                    │                        │
│          │                    │                    │                        │
│   ┌──────▼──────┐      ┌──────▼──────┐      ┌──────▼──────┐                 │
│   │  Keyboard   │      │   SQLite    │      │  TinyLlama  │                 │
│   │  Buttons    │      │   (Data)    │      │    GGUF     │                 │
│   │  Touch      │      │             │      │  (~700MB)   │                 │
│   └─────────────┘      └─────────────┘      └─────────────┘                 │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

## What's in This Chapter

- [Supported Hardware](./hardware.md) - Boards, displays, and peripherals
- [Quick Start](./quick-start.md) - Deploy in 5 minutes
- [Embedded UI](./embedded-ui.md) - Interface for small displays
- [Local LLM](./local-llm.md) - Offline AI with llama.cpp
- [Display Modes](./display-modes.md) - LCD, OLED, TFT, E-ink configurations
- [Kiosk Mode](./kiosk-mode.md) - Locked-down production deployments
- [Performance Tuning](./performance.md) - Optimize for limited resources
- [Offline Operation](./offline.md) - No internet required
- [Use Cases](./use-cases.md) - Real-world deployment examples

@@ -1,190 +1 @@

# Supported Hardware

## Single Board Computers (SBCs)

### Recommended Boards

| Board | CPU | RAM | Best For | Price |
|-------|-----|-----|----------|-------|
| **Orange Pi 5** | RK3588S | 4-16GB | Full LLM, NPU accel | $89-149 |
| **Raspberry Pi 5** | BCM2712 | 4-8GB | General purpose | $60-80 |
| **Orange Pi Zero 3** | H618 | 1-4GB | Minimal deployments | $20-35 |
| **Raspberry Pi 4** | BCM2711 | 2-8GB | Established ecosystem | $45-75 |
| **Raspberry Pi Zero 2W** | RP3A0 | 512MB | Ultra-compact | $15 |
| **Rock Pi 4** | RK3399 | 4GB | NPU available | $75 |
| **NVIDIA Jetson Nano** | Tegra X1 | 4GB | GPU inference | $149 |
| **BeagleBone Black** | AM3358 | 512MB | Industrial | $55 |
| **LattePanda 3 Delta** | N100 | 8GB | x86 compatibility | $269 |
| **ODROID-N2+** | S922X | 4GB | High performance | $79 |

### Minimum Requirements

**For UI only (connect to a remote botserver):**
- Any ARM/x86 Linux board
- 256MB RAM
- Network connection
- Display output

**For a local botserver:**
- ARM64 or x86_64
- 1GB RAM minimum
- 4GB storage

**For a local LLM (llama.cpp):**
- ARM64 or x86_64
- 2GB+ RAM (4GB recommended)
- 2GB+ storage for the model

### Orange Pi 5 (Recommended for LLM)

The Orange Pi 5 with the RK3588S is ideal for embedded LLM work:

```
┌─────────────────────────────────────────────────────────────┐
│            Orange Pi 5 - Best for Offline AI                │
├─────────────────────────────────────────────────────────────┤
│  CPU:     Rockchip RK3588S (4x A76 + 4x A55)                │
│  NPU:     6 TOPS (Neural Processing Unit)                   │
│  GPU:     Mali-G610 MP4                                     │
│  RAM:     4GB / 8GB / 16GB LPDDR4X                          │
│  Storage: M.2 NVMe + eMMC + microSD                         │
│                                                             │
│  LLM Performance:                                           │
│  ├─ TinyLlama 1.1B Q4:  ~8-12 tokens/sec                    │
│  ├─ Phi-2 2.7B Q4:      ~4-6 tokens/sec                     │
│  └─ With NPU (rkllm):   ~20-30 tokens/sec                   │
└─────────────────────────────────────────────────────────────┘
```

## Displays

### Character LCDs (Minimal)

For text-only interfaces:

| Display | Resolution | Interface | Use Case |
|---------|------------|-----------|----------|
| HD44780 16x2 | 16 chars × 2 lines | I2C/GPIO | Status, simple Q&A |
| HD44780 20x4 | 20 chars × 4 lines | I2C/GPIO | More context |
| LCD2004 | 20 chars × 4 lines | I2C | Industrial |

**Example output on a 16x2:**
```
┌────────────────┐
│> How can I help│
│< Processing... │
└────────────────┘
```

### OLED Displays

For graphical monochrome interfaces:

| Display | Resolution | Interface | Size |
|---------|------------|-----------|------|
| SSD1306 | 128×64 | I2C/SPI | 0.96" |
| SSD1309 | 128×64 | I2C/SPI | 2.42" |
| SH1106 | 128×64 | I2C/SPI | 1.3" |
| SSD1322 | 256×64 | SPI | 3.12" |

### TFT/IPS Color Displays

For a full graphical interface:

| Display | Resolution | Interface | Notes |
|---------|------------|-----------|-------|
| ILI9341 | 320×240 | SPI | Common, cheap |
| ST7789 | 240×320 | SPI | Fast refresh |
| ILI9488 | 480×320 | SPI | Larger |
| Waveshare 5" | 800×480 | HDMI | Touch optional |
| Waveshare 7" | 1024×600 | HDMI | Touch, IPS |
| Official Pi 7" | 800×480 | DSI | Best for Pi |

### E-Ink/E-Paper

For low power, readable in sunlight:

| Display | Resolution | Colors | Refresh |
|---------|------------|--------|---------|
| Waveshare 2.13" | 250×122 | B/W | 2s |
| Waveshare 4.2" | 400×300 | B/W | 4s |
| Waveshare 7.5" | 800×480 | B/W | 5s |
| Good Display 9.7" | 1200×825 | B/W | 6s |

**Best for:** Menu displays, signs, low-update applications

### Industrial Displays

| Display | Resolution | Features |
|---------|------------|----------|
| Advantech | Various | Wide temp, sunlight |
| Winstar | Various | Industrial grade |
| Newhaven | Various | Long availability |

## Input Devices

### Keyboards

- **USB Keyboard** - Standard; any USB keyboard works
- **PS/2 Keyboard** - Via adapter, lower latency
- **Matrix Keypad** - 4x4 or 3x4, GPIO connected
- **I2C Keypad** - Fewer GPIO pins needed

### Touch Input

- **Capacitive Touch** - Better response, needs a driver
- **Resistive Touch** - Works with gloves, pressure-based
- **IR Touch Frame** - Large displays, vandal-resistant

### Buttons & GPIO

```
┌─────────────────────────────────────────────┐
│         Simple 4-Button Interface           │
├─────────────────────────────────────────────┤
│                                             │
│  [◄ PREV]  [▲ UP]  [▼ DOWN]  [► SELECT]     │
│                                             │
│   GPIO 17   GPIO 27   GPIO 22   GPIO 23     │
│                                             │
└─────────────────────────────────────────────┘
```

## Enclosures

### Commercial Options

- **Hammond Manufacturing** - Industrial metal enclosures
- **Polycase** - Plastic, IP65 rated
- **Bud Industries** - Various sizes
- **Pi-specific cases** - Argon, Flirc, etc.

### DIY Options

- **3D Printed** - Custom fit, PLA/PETG
- **Laser Cut** - Acrylic, wood
- **Metal Fabrication** - Professional look

## Power

### Power Requirements

| Configuration | Power | Recommended PSU |
|---------------|-------|-----------------|
| Pi Zero + LCD | 1-2W | 5V 1A |
| Pi 4 + Display | 5-10W | 5V 3A |
| Orange Pi 5 | 8-15W | 5V 4A or 12V 2A |
| With NVMe SSD | +2-3W | Add 1A headroom |
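
The PSU column follows from current = power / voltage plus headroom; a small helper to size a supply (the 1.5x headroom factor is an assumption, not a vendor spec):

```shell
#!/bin/sh
# Sketch: minimum PSU current in amps for a given power draw and supply
# voltage, with headroom. The 1.5x headroom factor is illustrative.

psu_amps() {
    watts="$1"
    volts="$2"
    # amps = (watts / volts) * 1.5, printed to one decimal place
    awk -v w="$watts" -v v="$volts" 'BEGIN { printf "%.1f\n", (w / v) * 1.5 }'
}
```

For example, `psu_amps 15 5` suggests about 4.5A at 5V for an Orange Pi 5 under peak load, which is in line with the supplies recommended in the table.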

### Power Options

- **USB-C PD** - Modern, efficient
- **PoE HAT** - Power over Ethernet
- **12V Barrel** - Industrial standard
- **Battery** - UPS, solar applications

### UPS Solutions

- **PiJuice** - Pi-specific UPS HAT
- **UPS PIco** - Small form factor
- **Powerboost** - Adafruit, lithium battery

@@ -1,382 +1 @@
-# Local LLM - Offline AI with llama.cpp
+# Local LLM with llama.cpp

Run AI inference completely offline on embedded devices. No internet, no API costs, full privacy.

## Overview

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                          Local LLM Architecture                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   User Input ──▶ botserver ──▶ llama.cpp ──▶ Response                       │
│                      │              │                                       │
│                      │         ┌────┴────┐                                  │
│                      │         │  Model  │                                  │
│                      │         │  GGUF   │                                  │
│                      │         │ (Q4_K)  │                                  │
│                      │         └─────────┘                                  │
│                      │                                                      │
│                  SQLite DB                                                  │
│                  (sessions)                                                 │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

## Recommended Models

### By Device RAM

| RAM | Model | Size | Speed | Quality |
|-----|-------|------|-------|---------|
| **2GB** | TinyLlama 1.1B Q4_K_M | 670MB | ~5 tok/s | Basic |
| **4GB** | Phi-2 2.7B Q4_K_M | 1.6GB | ~3-4 tok/s | Good |
| **4GB** | Gemma 2B Q4_K_M | 1.4GB | ~4 tok/s | Good |
| **8GB** | Llama 3.2 3B Q4_K_M | 2GB | ~3 tok/s | Better |
| **8GB** | Mistral 7B Q4_K_M | 4.1GB | ~2 tok/s | Great |
| **16GB** | Llama 3.1 8B Q4_K_M | 4.7GB | ~2 tok/s | Excellent |
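
For provisioning scripts, the RAM tiers above can be encoded as a default-model picker; a sketch whose cutoffs mirror the table (the filenames match the download URLs later on this page, but your model choices may differ):

```shell
#!/bin/sh
# Sketch: pick a default GGUF model from total RAM in MB, following the
# tiers in the table above. Filenames are the defaults used in this
# chapter; adjust for your own model choices.

pick_model() {
    ram_mb="$1"
    if [ "$ram_mb" -ge 8192 ]; then
        echo "Llama-3.2-3B-Instruct-Q4_K_M.gguf"
    elif [ "$ram_mb" -ge 4096 ]; then
        echo "phi-2.Q4_K_M.gguf"
    else
        echo "tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf"
    fi
}

# e.g. on the device:
#   pick_model "$(free -m | awk '/^Mem:/ {print $2}')"
```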

### By Use Case

**Simple Q&A, Commands:**
```
TinyLlama 1.1B - Fast, basic understanding
```

**Customer Service, FAQ:**
```
Phi-2 or Gemma 2B - Good comprehension, reasonable speed
```

**Complex Reasoning:**
```
Llama 3.2 3B or Mistral 7B - Better accuracy, slower
```

## Installation

### Automatic (via deploy script)

```bash
./scripts/deploy-embedded.sh pi@device --with-llama
```

### Manual Installation

```bash
# SSH to the device
ssh pi@raspberrypi.local

# Install dependencies
sudo apt update
sudo apt install -y build-essential cmake git wget

# Clone llama.cpp
cd /opt
sudo git clone https://github.com/ggerganov/llama.cpp
sudo chown -R $(whoami):$(whoami) llama.cpp
cd llama.cpp

# Build for ARM (auto-optimizes)
mkdir build && cd build
cmake .. -DLLAMA_NATIVE=ON -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)

# Download a model
mkdir -p /opt/llama.cpp/models
cd /opt/llama.cpp/models
wget https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf
```

### Start Server

```bash
# Test run
/opt/llama.cpp/build/bin/llama-server \
    -m /opt/llama.cpp/models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf \
    --host 0.0.0.0 \
    --port 8080 \
    -c 2048 \
    --threads 4

# Verify
curl http://localhost:8080/v1/models
```

### Systemd Service

Create `/etc/systemd/system/llama-server.service`:

```ini
[Unit]
Description=llama.cpp Server - Local LLM
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/opt/llama.cpp
ExecStart=/opt/llama.cpp/build/bin/llama-server \
    -m /opt/llama.cpp/models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf \
    --host 0.0.0.0 \
    --port 8080 \
    -c 2048 \
    -ngl 0 \
    --threads 4
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Enable and start:

```bash
sudo systemctl daemon-reload
sudo systemctl enable llama-server
sudo systemctl start llama-server
```

## Configuration

### botserver .env

```env
# Use local llama.cpp
LLM_PROVIDER=llamacpp
LLM_API_URL=http://127.0.0.1:8080
LLM_MODEL=tinyllama

# Memory limits
MAX_CONTEXT_TOKENS=2048
MAX_RESPONSE_TOKENS=512
STREAMING_ENABLED=true
```

### llama.cpp Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `-c` | 2048 | Context size (tokens) |
| `--threads` | 4 | CPU threads |
| `-ngl` | 0 | GPU layers (0 for CPU only) |
| `--host` | 127.0.0.1 | Bind address |
| `--port` | 8080 | Server port |
| `-b` | 512 | Batch size |
| `--mlock` | off | Lock model in RAM |

### Memory vs Context Size

```
Context 512:  ~400MB RAM, fast, limited conversation
Context 1024: ~600MB RAM, moderate
Context 2048: ~900MB RAM, good for most uses
Context 4096: ~1.5GB RAM, long conversations
```
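
For capacity planning, the figures above can be approximated with a linear rule of thumb of roughly 250MB base plus ~0.3MB per context token; a sketch (the coefficients are eyeballed from this table and hold only for this model and quantization):

```shell
#!/bin/sh
# Sketch: rough RAM estimate in MB for a given context size, fitted by eye
# to the figures above (base ~250MB + ~0.3MB per token). Indicative only.

estimate_ram_mb() {
    ctx="$1"
    # integer arithmetic: 0.3 MB/token expressed as *3/10
    echo $((250 + ctx * 3 / 10))
}
```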

## Performance Optimization

### CPU Optimization

```bash
# Check CPU features
cat /proc/cpuinfo | grep -E "(model name|Features)"

# Build with specific optimizations
cmake .. -DLLAMA_NATIVE=ON \
    -DCMAKE_BUILD_TYPE=Release \
    -DLLAMA_ARM_FMA=ON \
    -DLLAMA_ARM_DOTPROD=ON
```

### Memory Optimization

```bash
# For 2GB RAM devices:
# use a smaller context
-c 1024

# Use memory mapping (slower but less RAM)
--mmap

# Leave mlock disabled (don't pin the model to RAM)
# (this is the default)
```

### Swap Configuration

For devices with limited RAM:

```bash
# Create 2GB of swap
sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

# Make it permanent
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab

# Prefer RAM over swap
echo 'vm.swappiness=10' | sudo tee -a /etc/sysctl.conf
```
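
How much swap to create can be estimated from the model size and installed RAM; a sketch (the 1GB working-set allowance is an illustrative assumption):

```shell
#!/bin/sh
# Sketch: suggested swap in MB so that model + a working-set allowance
# fits in RAM + swap. The 1024MB allowance is an illustrative guess.

suggest_swap_mb() {
    model_mb="$1"
    ram_mb="$2"
    need=$((model_mb + 1024))
    if [ "$need" -le "$ram_mb" ]; then
        echo 0
    else
        echo $((need - ram_mb))
    fi
}

# e.g. suggest_swap_mb 4100 4096  ->  Mistral 7B Q4 on a 4GB board
```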

## NPU Acceleration (Orange Pi 5)

The Orange Pi 5 has a 6 TOPS NPU that can accelerate inference:

### Using rkllm (Rockchip NPU)

```bash
# Install the rkllm runtime
git clone https://github.com/airockchip/rknn-llm
cd rknn-llm
./install.sh

# Convert the model to RKNN format
python3 convert_model.py \
    --model tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf \
    --output tinyllama.rkllm

# Run with the NPU
rkllm-server \
    --model tinyllama.rkllm \
    --port 8080
```

Expected speedup: **3-5x faster** than CPU only.

## Model Download URLs

### TinyLlama 1.1B (Recommended for 2GB)
```bash
wget https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf
```

### Phi-2 2.7B (Recommended for 4GB)
```bash
wget https://huggingface.co/TheBloke/phi-2-GGUF/resolve/main/phi-2.Q4_K_M.gguf
```

### Gemma 2B
```bash
wget https://huggingface.co/bartowski/gemma-2-2b-it-GGUF/resolve/main/gemma-2-2b-it-Q4_K_M.gguf
```

### Llama 3.2 3B (Recommended for 8GB)
```bash
wget https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-GGUF/resolve/main/Llama-3.2-3B-Instruct-Q4_K_M.gguf
```

### Mistral 7B
```bash
wget https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/resolve/main/mistral-7b-instruct-v0.2.Q4_K_M.gguf
```

## API Usage

llama.cpp exposes an OpenAI-compatible API:

### Chat Completion

```bash
curl http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "tinyllama",
        "messages": [
            {"role": "user", "content": "What is 2+2?"}
        ],
        "max_tokens": 100
    }'
```

### Streaming

```bash
curl http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "tinyllama",
        "messages": [{"role": "user", "content": "Tell me a story"}],
        "stream": true
    }'
```

### Health Check

```bash
curl http://localhost:8080/health
curl http://localhost:8080/v1/models
```

## Monitoring

### Check Performance

```bash
# Watch resource usage
htop

# Check inference speed in the logs
sudo journalctl -u llama-server -f | grep "tokens/s"

# Memory usage
free -h
```
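
For alerting scripts, the speed figure can be pulled out of a log line; a sketch assuming a line ending in something like `... 12.34 tokens/s` (the exact log format varies between llama.cpp versions, so treat the pattern as an assumption):

```shell
#!/bin/sh
# Sketch: extract the tokens/s figure from a llama.cpp-style log line.
# The sample log format is an assumption; adjust the pattern to your logs.

extract_tok_s() {
    # prints the number immediately preceding "tokens/s", nothing if absent
    sed -n 's/.*[^0-9.]\([0-9][0-9.]*\) tokens\/s.*/\1/p'
}

# e.g. journalctl -u llama-server | extract_tok_s | tail -n 1
```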

### Benchmarking

```bash
# Run the llama.cpp benchmark
/opt/llama.cpp/build/bin/llama-bench \
    -m /opt/llama.cpp/models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf \
    -p 512 -n 128 -t 4
```

## Troubleshooting

### Model Loading Fails

```bash
# Check available RAM
free -h

# Try a smaller context
-c 512

# Use memory mapping
--mmap
```

### Slow Inference

```bash
# Increase threads (up to the number of CPU cores)
--threads $(nproc)

# Use an optimized build
cmake .. -DLLAMA_NATIVE=ON

# Consider a smaller model
```

### Out of Memory Killer

```bash
# Check whether the OOM killer stopped the process
dmesg | grep -i "killed process"

# Then: increase swap, use a smaller model, or reduce the context size
```

## Best Practices

1. **Start small** - Begin with TinyLlama, upgrade if needed
2. **Monitor memory** - Use `htop` during initial tests
3. **Set an appropriate context** - 1024-2048 for most embedded use
4. **Use quantized models** - Q4_K_M is a good balance
5. **Enable streaming** - Better UX on slow inference
6. **Test offline** - Verify it works without internet before deployment

@@ -1,209 +1 @@
-# Quick Start - Deploy in 5 Minutes
+# Quick Start
|
|
||||||
Get General Bots running on your embedded device with local AI in just a few commands.
|
|
||||||
|
|
||||||
## Prerequisites
|
|
||||||
|
|
||||||
- An SBC (Raspberry Pi, Orange Pi, etc.) with Armbian/Raspbian
|
|
||||||
- SSH access to the device
|
|
||||||
- Internet connection (for initial setup only)
|
|
||||||
|
|
||||||
## One-Line Deploy
|
|
||||||
|
|
||||||
From your development machine:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Clone and run the deployment script
|
|
||||||
git clone https://github.com/GeneralBots/botserver.git
|
|
||||||
cd botserver
|
|
||||||
|
|
||||||
# Deploy to Orange Pi (replace with your device IP)
|
|
||||||
./scripts/deploy-embedded.sh orangepi@192.168.1.100 --with-ui --with-llama
|
|
||||||
```
|
|
||||||
|
|
||||||
That's it! After ~10-15 minutes:
|
|
||||||
- BotServer runs on port 8088
|
|
||||||
- llama.cpp runs on port 8080 with TinyLlama
|
|
||||||
- Embedded UI available at `http://your-device:8088/embedded/`
|
|
||||||
|
|
||||||
## Step-by-Step Guide
|
|
||||||
|
|
||||||
### Step 1: Prepare Your Device
|
|
||||||
|
|
||||||
Flash your SBC with a compatible OS:
|
|
||||||
|
|
||||||
**Raspberry Pi:**
|
|
||||||
```bash
|
|
||||||
# Download Raspberry Pi Imager
|
|
||||||
# Select: Raspberry Pi OS Lite (64-bit)
|
|
||||||
# Enable SSH in settings
|
|
||||||
```
|
|
||||||
|
|
||||||
**Orange Pi:**
|
|
||||||
```bash
|
|
||||||
# Download Armbian from armbian.com
|
|
||||||
# Flash with balenaEtcher
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 2: First Boot Configuration
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# SSH into your device
|
|
||||||
ssh pi@raspberrypi.local # or orangepi@orangepi.local
|
|
||||||
|
|
||||||
# Update system
|
|
||||||
sudo apt update && sudo apt upgrade -y
|
|
||||||
|
|
||||||
# Set timezone
|
|
||||||
sudo timedatectl set-timezone America/Sao_Paulo
|
|
||||||
|
|
||||||
# Enable I2C/SPI if using GPIO displays
|
|
||||||
sudo raspi-config # or armbian-config
|
|
||||||
```
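On Raspberry Pi, `raspi-config` can also be driven non-interactively, which is handy in provisioning scripts (flag values follow `raspi-config`'s `nonint` interface; Orange Pi users would use `armbian-config` or device-tree overlays instead):

```shell
# 0 = enable, 1 = disable for these raspi-config nonint toggles
if command -v raspi-config >/dev/null 2>&1; then
  sudo raspi-config nonint do_i2c 0
  sudo raspi-config nonint do_spi 0
else
  echo "raspi-config not found - not a Raspberry Pi OS system?"
fi
```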
### Step 3: Run Deployment Script

From your development PC:

```bash
# Basic deployment (botserver only)
./scripts/deploy-embedded.sh pi@raspberrypi.local

# With the embedded UI
./scripts/deploy-embedded.sh pi@raspberrypi.local --with-ui

# With a local LLM (requires 4GB+ RAM)
./scripts/deploy-embedded.sh pi@raspberrypi.local --with-ui --with-llama

# Specify a different model
./scripts/deploy-embedded.sh pi@raspberrypi.local --with-llama --model phi-2-Q4_K_M.gguf
```

### Step 4: Verify Installation

```bash
# Check the services
ssh pi@raspberrypi.local 'sudo systemctl status botserver'
ssh pi@raspberrypi.local 'sudo systemctl status llama-server'

# Test botserver
curl http://raspberrypi.local:8088/health

# Test llama.cpp
curl http://raspberrypi.local:8080/v1/models
```

### Step 5: Access the Interface

Open in your browser:

```
http://raspberrypi.local:8088/embedded/
```

Or use kiosk mode (auto-starts on boot):

```bash
# Already configured if you deployed with --with-ui
# Just reboot:
ssh pi@raspberrypi.local 'sudo reboot'
```
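For reference, the kiosk autostart that `--with-ui` sets up amounts to something like the following unit; the user, browser binary, and paths here are assumptions, so adjust them for your image:

```shell
# Write a sketch of a kiosk unit to a staging path for review
cat > /tmp/gb-kiosk.service <<'EOF'
[Unit]
Description=General Bots kiosk browser
After=graphical.target

[Service]
User=pi
Environment=DISPLAY=:0
ExecStart=/usr/bin/chromium-browser --kiosk --noerrdialogs http://localhost:8088/embedded/
Restart=on-failure

[Install]
WantedBy=graphical.target
EOF

# To activate it on the device:
#   sudo cp /tmp/gb-kiosk.service /etc/systemd/system/
#   sudo systemctl daemon-reload && sudo systemctl enable gb-kiosk
```

`Restart=on-failure` makes systemd relaunch the browser if it crashes, which is what you want on an unattended terminal.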
## Local Installation (On the Device)

If you prefer to install directly on the device:

```bash
# SSH into the device
ssh pi@raspberrypi.local

# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source ~/.cargo/env

# Clone and build
git clone https://github.com/GeneralBots/botserver.git
cd botserver

# Run the local deployment
./scripts/deploy-embedded.sh --local --with-ui --with-llama
```

⚠️ **Note:** Building directly on ARM devices is slow (1-2 hours); cross-compiling from your PC is much faster.

## Configuration

After deployment, edit the config file:

```bash
ssh pi@raspberrypi.local
sudo nano /opt/botserver/.env
```

Key settings:

```env
# Server
HOST=0.0.0.0
PORT=8088

# Local LLM
LLM_PROVIDER=llamacpp
LLM_API_URL=http://127.0.0.1:8080
LLM_MODEL=tinyllama

# Memory limits for small devices
MAX_CONTEXT_TOKENS=2048
MAX_RESPONSE_TOKENS=512
```

Restart after changes:

```bash
sudo systemctl restart botserver
```
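A restart takes a few seconds, so scripts that talk to the server right after restarting should wait for it rather than race it; a small sketch polling the `/health` endpoint used in Step 4 (host and port as configured above):

```shell
# Poll the health endpoint for a few seconds after a restart
up=0
for _ in $(seq 1 5); do
  if curl -fsS http://127.0.0.1:8088/health >/dev/null 2>&1; then
    up=1
    break
  fi
  sleep 1
done

if [ "$up" -eq 1 ]; then
  echo "botserver is up"
else
  echo "botserver not reachable yet"
fi
```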
## Troubleshooting

### Out of Memory

```bash
# Check memory usage
free -h

# Reduce the llama.cpp context size
sudo nano /etc/systemd/system/llama-server.service
# Change -c 2048 to -c 1024

# Or switch to a smaller model:
# TinyLlama uses ~700MB, Phi-2 uses ~1.6GB
```
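The choice between those models can be scripted from what is actually free at the moment; the thresholds below are rough estimates for Q4_K_M residency, and the filenames are illustrative:

```shell
# Read available memory and pick the largest model that should fit
avail_kb=$(awk '/MemAvailable/ {print $2}' /proc/meminfo)
avail_kb=${avail_kb:-0}   # default to 0 on kernels without MemAvailable
avail_mb=$((avail_kb / 1024))

if [ "$avail_mb" -ge 2200 ]; then
  model="phi-2-Q4_K_M.gguf"       # needs ~1.6 GB resident
elif [ "$avail_mb" -ge 1000 ]; then
  model="tinyllama-Q4_K_M.gguf"   # needs ~700 MB resident
else
  model=""                        # too little free RAM for local inference
fi
echo "Selected model: ${model:-none}"
```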
### Service Won't Start

```bash
# Check the logs
sudo journalctl -u botserver -f
sudo journalctl -u llama-server -f

# Common causes:
# - Port already in use
# - Missing model file
# - Database permissions
```
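The port-collision case is quick to rule out; `ss` ships with iproute2 on both Raspberry Pi OS and Armbian:

```shell
# Show any listener already bound to the botserver port
ss -ltnp 2>/dev/null | grep ':8088 ' || echo "port 8088 is free"
```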
### Display Not Working

```bash
# Check whether the display is detected
ls /dev/fb*      # HDMI/DSI
ls /dev/i2c*     # I2C displays
ls /dev/spidev*  # SPI displays

# For HDMI, check the boot config
sudo nano /boot/config.txt       # Raspberry Pi
sudo nano /boot/armbianEnv.txt   # Orange Pi
```

## Next Steps

- [Embedded UI Guide](./embedded-ui.md) - Customize the interface
- [Local LLM Configuration](./local-llm.md) - Optimize AI performance
- [Kiosk Mode](./kiosk-mode.md) - Production deployment
- [Offline Operation](./offline.md) - Disconnected environments
```diff
@@ -320,9 +320,17 @@
 - [Permissions Matrix](./12-auth/permissions-matrix.md)
 - [User Context vs System Context](./12-auth/user-system-context.md)

-# Part XII - Community
+# Part XII - Device & Offline Deployment

-- [Chapter 13: Contributing](./13-community/README.md)
+- [Chapter 13: Device Deployment](./13-devices/README.md)
+  - [Mobile (Android & HarmonyOS)](./13-devices/mobile.md)
+  - [Supported Hardware (SBCs)](./13-devices/hardware.md)
+  - [Quick Start](./13-devices/quick-start.md)
+  - [Local LLM with llama.cpp](./13-devices/local-llm.md)
+
+# Part XIII - Community
+
+- [Chapter 14: Contributing](./13-community/README.md)
   - [Development Setup](./13-community/setup.md)
   - [Testing Guide](./13-community/testing.md)
   - [Documentation](./13-community/documentation.md)
@@ -330,9 +338,9 @@
   - [Community Guidelines](./13-community/community.md)
   - [IDEs](./13-community/ide-extensions.md)

-# Part XIII - Migration
+# Part XIV - Migration

-- [Chapter 14: Migration Guide](./14-migration/README.md)
+- [Chapter 15: Migration Guide](./14-migration/README.md)
   - [Migration Overview](./14-migration/overview.md)
   - [Platform Comparison Matrix](./14-migration/comparison-matrix.md)
   - [Migration Resources](./14-migration/resources.md)
@@ -350,9 +358,9 @@
   - [Automation Migration](./14-migration/automation.md)
   - [Validation and Testing](./14-migration/validation.md)

-# Part XIV - Testing
+# Part XV - Testing

-- [Chapter 17: Testing](./17-testing/README.md)
+- [Chapter 16: Testing](./17-testing/README.md)
   - [End-to-End Testing](./17-testing/e2e-testing.md)
   - [Testing Architecture](./17-testing/architecture.md)
   - [Performance Testing](./17-testing/performance.md)
@@ -390,12 +398,5 @@
 - [Appendix D: Documentation Style](./16-appendix-docs-style/conversation-examples.md)
   - [SVG and Conversation Standards](./16-appendix-docs-style/svg.md)

-# Part XV - Embedded & Offline
-
-- [Chapter 20: Embedded Deployment](./20-embedding/README.md)
-  - [Supported Hardware](./20-embedding/hardware.md)
-  - [Quick Start](./20-embedding/quick-start.md)
-  - [Local LLM with llama.cpp](./20-embedding/local-llm.md)
-
 [Glossary](./glossary.md)
 [Contact](./contact/README.md)
```