General Bots Models (Python) - Project Guidelines
Version: 1.0.0
Role: AI Inference Service for BotServer
Primary Directive: Provide access to the latest open-source AI models (Python ecosystem) that are impractical to implement in Rust.
🐍 PHILOSOPHY & SCOPE
Why Python?
While botserver (Rust) handles the heavy lifting, networking, and business logic, botmodels exists solely to leverage the extensive Python AI/ML ecosystem.
- Rust vs. Python Rule:
- If logic is deterministic, systems-level, or performance-critical: Do it in Rust (botserver).
- If logic requires cutting-edge ML models, rapid experimentation with HuggingFace, or specific Python-only libraries: Do it here.
Architecture
- Inference Only: This service should NOT hold business state. It accepts inputs, runs inference, and returns predictions.
- Stateless: Treated as a sidecar to botserver.
- API First: Exposes strict HTTP/REST endpoints (or gRPC) consumed by botserver.
🛠 TECHNOLOGY STACK
- Runtime: Python 3.10+
- Web Framework: FastAPI (preferred over Flask for async/performance) or Flask (legacy support).
- ML Frameworks: PyTorch, HuggingFace Transformers, raw ONNX (if speed needed).
- Quality: ruff (linting), black (formatting), mypy (typing).
⚡️ IMPERATIVES
1. Modern Model Usage
- Deprecate Legacy: Move away from outdated libs (e.g., old allennlp if superseded) in favor of HuggingFace Transformers and Diffusers.
- Quantization: Always consider quantized models (bitsandbytes, GGUF) to reduce VRAM usage given the "consumer/prosumer" target of General Bots.
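To see why quantization matters on consumer hardware, a back-of-the-envelope estimate of weight memory can help. The sketch below is illustrative only: the helper name `weight_memory_gib` is hypothetical, and the figure covers model weights alone, ignoring activations, KV cache, and the per-block metadata that real quantization formats add.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for model weights only, in GiB.

    Hypothetical helper: real usage is higher because of activations,
    KV cache, and quantization metadata (scales, zero points).
    """
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1024**3


if __name__ == "__main__":
    # Compare a 7-billion-parameter model at common precisions.
    for bits in (16, 8, 4):
        gib = weight_memory_gib(7e9, bits)
        print(f"7B params @ {bits}-bit: {gib:.1f} GiB")
```

Under these assumptions, moving from 16-bit to 4-bit weights cuts the weight footprint to a quarter, which is often the difference between fitting on a single consumer GPU and not fitting at all.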
2. Performance & Loading
- Lazy Loading: Do NOT load 10GB models at module import time. Load on startup lifecycle or first request with locking.
- GPU Handling: Robustly detect CUDA/MPS (Mac) and fall back to CPU gracefully.
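The two rules above can be sketched together. This is a minimal example, not the project's actual loader: `pick_device` and `LazyModel` are hypothetical names, and the `loader` callable stands in for whatever really loads the weights (for instance a HuggingFace `from_pretrained` call). The device check returns "cpu" when torch is not installed, so the sketch runs without a GPU stack.

```python
import threading
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


def pick_device() -> str:
    """Return "cuda", "mps", or "cpu"; fall back to CPU if torch is absent."""
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps is missing on older torch builds, so look it up safely.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


class LazyModel(Generic[T]):
    """Defers an expensive load until first use (hypothetical helper)."""

    def __init__(self, loader: Callable[[str], T]) -> None:
        self._loader = loader
        self._lock = threading.Lock()
        self._model: Optional[T] = None

    def get(self) -> T:
        model = self._model
        if model is None:  # fast path: no locking once the model is loaded
            with self._lock:
                model = self._model
                if model is None:  # re-check: another thread may have loaded it
                    model = self._loader(pick_device())
                    self._model = model
        return model
```

A module can then create `LazyModel(load_fn)` at import time at no cost; the weights are loaded once, on the first `get()` call, even if several requests arrive concurrently.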
3. Code Quality
- Type Hints: All functions MUST have type hints.
- Error Handling: No bare except: clauses. Catch precise exceptions and return structured JSON errors to botserver.
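As an illustration of the type-hint and error-handling rules, the sketch below validates input up front, catches only named exception types around the model call, and returns a status code with a JSON-serializable body. The error schema (`ok`, `error.code`, `error.message`), the status codes, and the function names are assumptions for the example; the real contract is whatever botserver expects.

```python
from typing import Any, Callable


def error_body(code: str, message: str) -> dict[str, Any]:
    """Build a structured error payload (assumed schema, not botserver's spec)."""
    return {"ok": False, "error": {"code": code, "message": message}}


def run_inference(
    payload: dict[str, Any],
    predict: Callable[[str], str],
) -> tuple[int, dict[str, Any]]:
    """Validate input, run the model, and map failures to structured errors."""
    text = payload.get("text")
    if not isinstance(text, str):
        return 400, error_body("invalid_input", "'text' must be a string")

    try:
        prediction = predict(text)
    except ValueError as exc:
        # The model rejected the input (e.g. empty or too long).
        return 422, error_body("invalid_value", str(exc))
    except MemoryError:
        # The host ran out of memory; the caller may retry later.
        return 503, error_body("out_of_memory", "model ran out of memory")

    return 200, {"ok": True, "prediction": prediction}
```

Any exception not listed here propagates, so an unexpected bug surfaces instead of being silently swallowed.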
📝 DEVELOPMENT WORKFLOW
- Environment: Always use a venv.
    python3 -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt
- Running:
    python app.py
    # OR if migrated to FastAPI
    uvicorn src.main:app --port 8089 --reload
🔗 INTEGRATION WITH BOTSERVER
- Port: Defaults to 8089 (internal).
- Security: Must implement the shared secret handshake (HMAC/API Key) validated against botserver.
- Keep-Alive: botserver manages the lifecycle of this process.
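The guidelines do not spell out the handshake format, so the following is only one plausible shape for an HMAC check: botserver signs the raw request body with the shared secret, and this service recomputes the signature and compares it in constant time. The choice of SHA-256, hex encoding, and the function names are all assumptions.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 of the body (assumed signature format)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature: str) -> bool:
    """Check a received signature against the body using the shared secret."""
    expected = sign(secret, body)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), signature.encode())


if __name__ == "__main__":
    secret = b"example-shared-secret"  # placeholder; load from config in practice
    body = b'{"text": "hello"}'
    signature = sign(secret, body)
    print(verify(secret, body, signature))         # True
    print(verify(secret, body + b" ", signature))  # False
```

Signing the body rather than sending the secret itself means the secret never travels over the wire, and any tampering with the payload invalidates the signature.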
✅ CONTINUATION PROMPT
When working in botmodels:
- Prioritize Ecosystem: If a new SOTA model drops (e.g., Llama 3, Mistral), enable it here immediately.
- Optimize: Ensure dependencies are minimized. Don't install tensorflow if torch suffices.
- Strict Typing: Ensure all inputs and outputs match botserver's expectations exactly.