- Add token-aware text truncation utility in core/shared/utils.rs
- Fix embedding generators to use a 600-token limit (safely under the 768-token maximum)
- Fix LLM context limit detection for local models (768 vs 4096 tokens)
- Prevent 'exceed context size' errors for both embeddings and chat
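
For reviewers, a minimal sketch of what token-aware truncation could look like. This is not the code in core/shared/utils.rs: the function names, the constant names, and the 4-characters-per-token estimate are all assumptions made for illustration. Only the 600-token budget comes from this change. A real implementation would likely count tokens with the model's own tokenizer instead of a character heuristic.

```rust
/// Assumed heuristic (not from the commit): about 4 characters per token.
const APPROX_CHARS_PER_TOKEN: usize = 4;

/// Embedding budget named in this change: 600 tokens, under the 768 limit.
const EMBEDDING_TOKEN_BUDGET: usize = 600;

/// Estimate the token count of `text`, rounding up.
fn estimate_tokens(text: &str) -> usize {
    let chars = text.chars().count();
    (chars + APPROX_CHARS_PER_TOKEN - 1) / APPROX_CHARS_PER_TOKEN
}

/// Return the longest prefix of `text` whose estimated token count is at
/// most `max_tokens`. The cut always lands on a UTF-8 character boundary.
fn truncate_to_tokens(text: &str, max_tokens: usize) -> &str {
    let max_chars = max_tokens.saturating_mul(APPROX_CHARS_PER_TOKEN);
    // `nth(max_chars)` yields the byte offset of the first character that
    // no longer fits; if there is none, the whole text is within budget.
    match text.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &text[..byte_idx],
        None => text,
    }
}

/// Hypothetical convenience wrapper for the embedding path.
fn truncate_for_embedding(text: &str) -> &str {
    truncate_to_tokens(text, EMBEDDING_TOKEN_BUDGET)
}
```

Calling `truncate_for_embedding` on input before it reaches the embedding generator keeps the estimated size at or below 600 tokens. Short inputs are returned unchanged, and multi-byte characters are never split.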