Add multimodal module for botmodels integration

Introduces IMAGE, VIDEO, AUDIO, and SEE keywords for BASIC scripts that connect to the botmodels service for AI-powered media generation and vision/captioning capabilities.

- Add BotModelsClient for HTTP communication with the botmodels service
- Implement BASIC keywords: IMAGE, VIDEO, AUDIO (generation), SEE (captioning)
- Support configuration via config.csv for models
This commit is contained in:
parent d761a076aa
commit a21292daa3

7 changed files with 1200 additions and 0 deletions

docs/multimodal-config.md (new file, 191 lines)

@@ -0,0 +1,191 @@
# Multimodal Configuration Guide

This document describes how to configure botserver to use the botmodels service for image, video, and audio generation, and for vision/captioning capabilities.

## Overview

The multimodal feature connects botserver to botmodels, a Python-based service similar to llama.cpp but for multimodal AI tasks. This enables BASIC scripts to generate images, videos, and audio, and to analyze visual content.

## Configuration Keys

Add the following configuration to your bot's `config.csv` file:
### Image Generator Settings

| Key | Default | Description |
|-----|---------|-------------|
| `image-generator-model` | - | Path to the image generation model (e.g., `../../../../data/diffusion/sd_turbo_f16.gguf`) |
| `image-generator-steps` | `4` | Number of inference steps for image generation |
| `image-generator-width` | `512` | Output image width in pixels |
| `image-generator-height` | `512` | Output image height in pixels |
| `image-generator-gpu-layers` | `20` | Number of layers to offload to the GPU |
| `image-generator-batch-size` | `1` | Batch size for generation |
### Video Generator Settings

| Key | Default | Description |
|-----|---------|-------------|
| `video-generator-model` | - | Path to the video generation model (e.g., `../../../../data/diffusion/zeroscope_v2_576w`) |
| `video-generator-frames` | `24` | Number of frames to generate |
| `video-generator-fps` | `8` | Frames per second for the output video |
| `video-generator-width` | `320` | Output video width in pixels |
| `video-generator-height` | `576` | Output video height in pixels |
| `video-generator-gpu-layers` | `15` | Number of layers to offload to the GPU |
| `video-generator-batch-size` | `1` | Batch size for generation |
### BotModels Service Settings

| Key | Default | Description |
|-----|---------|-------------|
| `botmodels-enabled` | `false` | Enable/disable botmodels integration |
| `botmodels-host` | `0.0.0.0` | Host address for the botmodels service |
| `botmodels-port` | `8085` | Port for the botmodels service |
| `botmodels-api-key` | - | API key for authentication with botmodels |
| `botmodels-https` | `false` | Use HTTPS for the connection to botmodels |
## Example config.csv

```csv
key,value
image-generator-model,../../../../data/diffusion/sd_turbo_f16.gguf
image-generator-steps,4
image-generator-width,512
image-generator-height,512
image-generator-gpu-layers,20
image-generator-batch-size,1
video-generator-model,../../../../data/diffusion/zeroscope_v2_576w
video-generator-frames,24
video-generator-fps,8
video-generator-width,320
video-generator-height,576
video-generator-gpu-layers,15
video-generator-batch-size,1
botmodels-enabled,true
botmodels-host,0.0.0.0
botmodels-port,8085
botmodels-api-key,your-secret-key
botmodels-https,false
```
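The file above is plain two-column CSV. botserver reads these values through its `ConfigManager`, but for reference the layout can be parsed with nothing more than the standard library; this is an illustrative sketch, not the actual loader:

```python
import csv
import io

SAMPLE = """key,value
botmodels-enabled,true
botmodels-port,8085
"""

def load_config(text):
    """Parse a key,value config.csv into a dict, skipping the header row."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # header row: key,value
    return {row[0]: row[1] for row in reader if row}

cfg = load_config(SAMPLE)
print(cfg["botmodels-port"])  # "8085" -- values are strings; callers parse types
```

Note that all values arrive as strings; the Rust side parses numeric and boolean keys itself, falling back to the defaults in the tables above.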
## BASIC Keywords

Once configured, the following keywords become available in BASIC scripts:

### IMAGE

Generate an image from a text prompt.

```basic
file = IMAGE "a cute cat playing with yarn"
SEND FILE TO user, file
```

### VIDEO

Generate a video from a text prompt.

```basic
file = VIDEO "a rocket launching into space"
SEND FILE TO user, file
```

### AUDIO

Generate speech audio from text.

```basic
file = AUDIO "Hello, welcome to our service!"
SEND FILE TO user, file
```

### SEE

Get a caption/description of an image or video file.

```basic
caption = SEE "/path/to/image.jpg"
TALK caption

// Also works with video files
description = SEE "/path/to/video.mp4"
TALK description
```
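Because the generation keywords return a file URL/path and SEE takes a file argument, the keywords can in principle be chained. Whether SEE accepts the remote URL returned by IMAGE (as opposed to a local path) depends on the botmodels deployment, so treat this as a sketch:

```basic
file = IMAGE "a lighthouse at sunset"
caption = SEE file
TALK caption
```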
## Starting BotModels Service

Before using multimodal features, start the botmodels service:

```bash
cd botmodels
python -m uvicorn src.main:app --host 0.0.0.0 --port 8085
```

Or with HTTPS:

```bash
python -m uvicorn src.main:app --host 0.0.0.0 --port 8085 --ssl-keyfile key.pem --ssl-certfile cert.pem
```
## API Endpoints (BotModels)

The botmodels service exposes these REST endpoints:

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/image/generate` | POST | Generate image from prompt |
| `/api/video/generate` | POST | Generate video from prompt |
| `/api/speech/generate` | POST | Generate speech from text |
| `/api/speech/totext` | POST | Convert audio to text |
| `/api/vision/describe` | POST | Get description of an image |
| `/api/vision/describe_video` | POST | Get description of a video |
| `/api/vision/vqa` | POST | Visual question answering |
| `/api/health` | GET | Health check |

All endpoints require the `X-API-Key` header for authentication.
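The request body for `/api/image/generate` follows the fields of the Rust `ImageGenerateRequest` type (`prompt`, plus optional `steps`, `width`, `height`), with unset optional fields omitted. A minimal Python sketch of building such a request is shown below; the base URL and API key are placeholders for your config.csv values, and the request is only constructed, not sent:

```python
import json
import urllib.request

# Placeholder connection details -- substitute your config.csv values.
BASE_URL = "http://0.0.0.0:8085"
API_KEY = "your-secret-key"

def image_generate_payload(prompt, steps=None, width=None, height=None):
    """Build the JSON body for /api/image/generate.

    Optional fields are dropped when unset, mirroring the
    `skip_serializing_if = "Option::is_none"` attributes on the
    Rust request structs.
    """
    fields = {"prompt": prompt, "steps": steps, "width": width, "height": height}
    return {k: v for k, v in fields.items() if v is not None}

payload = image_generate_payload("a cute cat playing with yarn",
                                 steps=4, width=512, height=512)
req = urllib.request.Request(
    f"{BASE_URL}/api/image/generate",
    data=json.dumps(payload).encode(),
    headers={"X-API-Key": API_KEY, "Content-Type": "application/json"},
    method="POST",
)
print(sorted(payload))  # ['height', 'prompt', 'steps', 'width']
```

A successful response carries `status`, `file_path`, and optionally `generation_time`; on failure an `error` string is returned instead.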
## Architecture

```
┌─────────────┐     HTTPS      ┌─────────────┐
│  botserver  │ ─────────────▶ │  botmodels  │
│   (Rust)    │                │  (Python)   │
└─────────────┘                └─────────────┘
       │                              │
       │ BASIC Keywords               │ AI Models
       │ - IMAGE                      │ - Stable Diffusion
       │ - VIDEO                      │ - Zeroscope
       │ - AUDIO                      │ - TTS/Whisper
       │ - SEE                        │ - BLIP2
       ▼                              ▼
┌─────────────┐                ┌─────────────┐
│   config    │                │   outputs   │
│    .csv     │                │   (files)   │
└─────────────┘                └─────────────┘
```
## Troubleshooting

### "BotModels is not enabled"

Set `botmodels-enabled=true` in your config.csv.

### Connection refused

1. Ensure the botmodels service is running
2. Check the host/port configuration
3. Verify firewall settings

### Authentication failed

Ensure `botmodels-api-key` in config.csv matches the `API_KEY` environment variable in botmodels.

### Model not found

Verify that model paths are correct and that models are downloaded to the expected locations.
## Security Notes

1. Always use HTTPS in production (`botmodels-https=true`)
2. Use strong, unique API keys
3. Restrict network access to the botmodels service
4. Consider running botmodels on a separate GPU server
@@ -15,6 +15,7 @@ pub mod get;
 pub mod hear_talk;
 pub mod last;
 pub mod llm_keyword;
+pub mod multimodal;
 pub mod on;
 pub mod print;
 pub mod remember;

src/basic/keywords/multimodal.rs (new file, 323 lines)

@@ -0,0 +1,323 @@
//! Multimodal keywords for image, video, audio generation and vision/captioning
//!
//! Provides BASIC keywords:
//! - IMAGE "prompt" -> generates image, returns file URL
//! - VIDEO "prompt" -> generates video, returns file URL
//! - AUDIO "text" -> generates speech audio, returns file URL
//! - SEE file -> gets caption/description of image or video

use crate::multimodal::BotModelsClient;
use crate::shared::models::UserSession;
use crate::shared::state::AppState;
use log::{error, trace};
use rhai::{Dynamic, Engine};
use std::sync::Arc;
use std::time::Duration;

/// Register all multimodal keywords
pub fn register_multimodal_keywords(state: Arc<AppState>, user: UserSession, engine: &mut Engine) {
    image_keyword(state.clone(), user.clone(), engine);
    video_keyword(state.clone(), user.clone(), engine);
    audio_keyword(state.clone(), user.clone(), engine);
    see_keyword(state.clone(), user.clone(), engine);
}

/// IMAGE "prompt" - Generate an image from text prompt
/// Returns the URL/path to the generated image file
pub fn image_keyword(state: Arc<AppState>, user: UserSession, engine: &mut Engine) {
    let state_clone = Arc::clone(&state);
    let user_clone = user.clone();

    engine
        .register_custom_syntax(&["IMAGE", "$expr$"], false, move |context, inputs| {
            let prompt = context.eval_expression_tree(&inputs[0])?.to_string();

            trace!("IMAGE keyword: generating image for prompt: {}", prompt);

            let state_for_thread = Arc::clone(&state_clone);
            let bot_id = user_clone.bot_id;

            let (tx, rx) = std::sync::mpsc::channel();

            std::thread::spawn(move || {
                let rt = tokio::runtime::Builder::new_multi_thread()
                    .worker_threads(2)
                    .enable_all()
                    .build();

                let send_err = if let Ok(rt) = rt {
                    let result = rt.block_on(async move {
                        execute_image_generation(state_for_thread, bot_id, prompt).await
                    });
                    tx.send(result).err()
                } else {
                    tx.send(Err("Failed to build tokio runtime".into())).err()
                };

                if send_err.is_some() {
                    error!("Failed to send IMAGE result");
                }
            });

            match rx.recv_timeout(Duration::from_secs(300)) {
                Ok(Ok(result)) => Ok(Dynamic::from(result)),
                Ok(Err(e)) => Err(Box::new(rhai::EvalAltResult::ErrorRuntime(
                    e.to_string().into(),
                    rhai::Position::NONE,
                ))),
                Err(std::sync::mpsc::RecvTimeoutError::Timeout) => {
                    Err(Box::new(rhai::EvalAltResult::ErrorRuntime(
                        "Image generation timed out".into(),
                        rhai::Position::NONE,
                    )))
                }
                Err(e) => Err(Box::new(rhai::EvalAltResult::ErrorRuntime(
                    format!("IMAGE thread failed: {}", e).into(),
                    rhai::Position::NONE,
                ))),
            }
        })
        .unwrap();
}

async fn execute_image_generation(
    state: Arc<AppState>,
    bot_id: uuid::Uuid,
    prompt: String,
) -> Result<String, Box<dyn std::error::Error + Send + Sync>> {
    let client = BotModelsClient::from_state(&state, &bot_id);

    if !client.is_enabled() {
        return Err("BotModels is not enabled. Set botmodels-enabled=true in config.csv".into());
    }

    client.generate_image(&prompt).await
}

/// VIDEO "prompt" - Generate a video from text prompt
/// Returns the URL/path to the generated video file
pub fn video_keyword(state: Arc<AppState>, user: UserSession, engine: &mut Engine) {
    let state_clone = Arc::clone(&state);
    let user_clone = user.clone();

    engine
        .register_custom_syntax(&["VIDEO", "$expr$"], false, move |context, inputs| {
            let prompt = context.eval_expression_tree(&inputs[0])?.to_string();

            trace!("VIDEO keyword: generating video for prompt: {}", prompt);

            let state_for_thread = Arc::clone(&state_clone);
            let bot_id = user_clone.bot_id;

            let (tx, rx) = std::sync::mpsc::channel();

            std::thread::spawn(move || {
                let rt = tokio::runtime::Builder::new_multi_thread()
                    .worker_threads(2)
                    .enable_all()
                    .build();

                let send_err = if let Ok(rt) = rt {
                    let result = rt.block_on(async move {
                        execute_video_generation(state_for_thread, bot_id, prompt).await
                    });
                    tx.send(result).err()
                } else {
                    tx.send(Err("Failed to build tokio runtime".into())).err()
                };

                if send_err.is_some() {
                    error!("Failed to send VIDEO result");
                }
            });

            // Video generation can take longer
            match rx.recv_timeout(Duration::from_secs(600)) {
                Ok(Ok(result)) => Ok(Dynamic::from(result)),
                Ok(Err(e)) => Err(Box::new(rhai::EvalAltResult::ErrorRuntime(
                    e.to_string().into(),
                    rhai::Position::NONE,
                ))),
                Err(std::sync::mpsc::RecvTimeoutError::Timeout) => {
                    Err(Box::new(rhai::EvalAltResult::ErrorRuntime(
                        "Video generation timed out".into(),
                        rhai::Position::NONE,
                    )))
                }
                Err(e) => Err(Box::new(rhai::EvalAltResult::ErrorRuntime(
                    format!("VIDEO thread failed: {}", e).into(),
                    rhai::Position::NONE,
                ))),
            }
        })
        .unwrap();
}

async fn execute_video_generation(
    state: Arc<AppState>,
    bot_id: uuid::Uuid,
    prompt: String,
) -> Result<String, Box<dyn std::error::Error + Send + Sync>> {
    let client = BotModelsClient::from_state(&state, &bot_id);

    if !client.is_enabled() {
        return Err("BotModels is not enabled. Set botmodels-enabled=true in config.csv".into());
    }

    client.generate_video(&prompt).await
}

/// AUDIO "text" - Generate speech audio from text
/// Returns the URL/path to the generated audio file
pub fn audio_keyword(state: Arc<AppState>, user: UserSession, engine: &mut Engine) {
    let state_clone = Arc::clone(&state);
    let user_clone = user.clone();

    engine
        .register_custom_syntax(&["AUDIO", "$expr$"], false, move |context, inputs| {
            let text = context.eval_expression_tree(&inputs[0])?.to_string();

            trace!("AUDIO keyword: generating speech for text: {}", text);

            let state_for_thread = Arc::clone(&state_clone);
            let bot_id = user_clone.bot_id;

            let (tx, rx) = std::sync::mpsc::channel();

            std::thread::spawn(move || {
                let rt = tokio::runtime::Builder::new_multi_thread()
                    .worker_threads(2)
                    .enable_all()
                    .build();

                let send_err = if let Ok(rt) = rt {
                    let result = rt.block_on(async move {
                        execute_audio_generation(state_for_thread, bot_id, text).await
                    });
                    tx.send(result).err()
                } else {
                    tx.send(Err("Failed to build tokio runtime".into())).err()
                };

                if send_err.is_some() {
                    error!("Failed to send AUDIO result");
                }
            });

            match rx.recv_timeout(Duration::from_secs(120)) {
                Ok(Ok(result)) => Ok(Dynamic::from(result)),
                Ok(Err(e)) => Err(Box::new(rhai::EvalAltResult::ErrorRuntime(
                    e.to_string().into(),
                    rhai::Position::NONE,
                ))),
                Err(std::sync::mpsc::RecvTimeoutError::Timeout) => {
                    Err(Box::new(rhai::EvalAltResult::ErrorRuntime(
                        "Audio generation timed out".into(),
                        rhai::Position::NONE,
                    )))
                }
                Err(e) => Err(Box::new(rhai::EvalAltResult::ErrorRuntime(
                    format!("AUDIO thread failed: {}", e).into(),
                    rhai::Position::NONE,
                ))),
            }
        })
        .unwrap();
}

async fn execute_audio_generation(
    state: Arc<AppState>,
    bot_id: uuid::Uuid,
    text: String,
) -> Result<String, Box<dyn std::error::Error + Send + Sync>> {
    let client = BotModelsClient::from_state(&state, &bot_id);

    if !client.is_enabled() {
        return Err("BotModels is not enabled. Set botmodels-enabled=true in config.csv".into());
    }

    client.generate_audio(&text, None, None).await
}

/// SEE file - Get caption/description of an image or video file
/// Returns the text description of the visual content
pub fn see_keyword(state: Arc<AppState>, user: UserSession, engine: &mut Engine) {
    let state_clone = Arc::clone(&state);
    let user_clone = user.clone();

    engine
        .register_custom_syntax(&["SEE", "$expr$"], false, move |context, inputs| {
            let file_path = context.eval_expression_tree(&inputs[0])?.to_string();

            trace!("SEE keyword: getting caption for file: {}", file_path);

            let state_for_thread = Arc::clone(&state_clone);
            let bot_id = user_clone.bot_id;

            let (tx, rx) = std::sync::mpsc::channel();

            std::thread::spawn(move || {
                let rt = tokio::runtime::Builder::new_multi_thread()
                    .worker_threads(2)
                    .enable_all()
                    .build();

                let send_err = if let Ok(rt) = rt {
                    let result = rt.block_on(async move {
                        execute_see_caption(state_for_thread, bot_id, file_path).await
                    });
                    tx.send(result).err()
                } else {
                    tx.send(Err("Failed to build tokio runtime".into())).err()
                };

                if send_err.is_some() {
                    error!("Failed to send SEE result");
                }
            });

            match rx.recv_timeout(Duration::from_secs(60)) {
                Ok(Ok(result)) => Ok(Dynamic::from(result)),
                Ok(Err(e)) => Err(Box::new(rhai::EvalAltResult::ErrorRuntime(
                    e.to_string().into(),
                    rhai::Position::NONE,
                ))),
                Err(std::sync::mpsc::RecvTimeoutError::Timeout) => {
                    Err(Box::new(rhai::EvalAltResult::ErrorRuntime(
                        "Vision/caption timed out".into(),
                        rhai::Position::NONE,
                    )))
                }
                Err(e) => Err(Box::new(rhai::EvalAltResult::ErrorRuntime(
                    format!("SEE thread failed: {}", e).into(),
                    rhai::Position::NONE,
                ))),
            }
        })
        .unwrap();
}

async fn execute_see_caption(
    state: Arc<AppState>,
    bot_id: uuid::Uuid,
    file_path: String,
) -> Result<String, Box<dyn std::error::Error + Send + Sync>> {
    let client = BotModelsClient::from_state(&state, &bot_id);

    if !client.is_enabled() {
        return Err("BotModels is not enabled. Set botmodels-enabled=true in config.csv".into());
    }

    // Determine if it's a video or image based on extension
    let lower_path = file_path.to_lowercase();
    if lower_path.ends_with(".mp4")
        || lower_path.ends_with(".avi")
        || lower_path.ends_with(".mov")
        || lower_path.ends_with(".webm")
        || lower_path.ends_with(".mkv")
    {
        client.describe_video(&file_path).await
    } else {
        client.describe_image(&file_path).await
    }
}
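Each keyword above uses the same sync/async bridge: spawn a worker thread, build a Tokio runtime there, run the async client call, and block the Rhai callback on a channel with a per-keyword timeout. The pattern can be illustrated language-agnostically; here is a minimal Python analog (hypothetical names, not part of the codebase):

```python
import queue
import threading

def run_with_timeout(work, timeout_s):
    """Run `work` on a worker thread and wait up to `timeout_s` for its
    result, mirroring the spawn + mpsc::channel + recv_timeout pattern:
    the caller blocks, but the (possibly long) job runs elsewhere."""
    out = queue.Queue(maxsize=1)

    def worker():
        try:
            out.put(("ok", work()))
        except Exception as exc:  # forwarded like the Err(...) arm
            out.put(("err", exc))

    threading.Thread(target=worker, daemon=True).start()
    try:
        status, value = out.get(timeout=timeout_s)
    except queue.Empty:
        return ("timeout", None)  # like RecvTimeoutError::Timeout
    return (status, value)

print(run_with_timeout(lambda: 21 * 2, 1.0))  # ('ok', 42)
```

The Rust version additionally logs when the receiving side has gone away (`tx.send(...).err()`), since by then the keyword has already returned a timeout error to the script.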
@@ -23,6 +23,7 @@ use self::keywords::format::format_keyword;
 use self::keywords::get::get_keyword;
 use self::keywords::hear_talk::{hear_keyword, talk_keyword};
 use self::keywords::last::last_keyword;
+use self::keywords::multimodal::register_multimodal_keywords;
 use self::keywords::remember::remember_keyword;
 use self::keywords::save_from_unstructured::save_from_unstructured_keyword;
 use self::keywords::send_mail::send_mail_keyword;

@@ -92,6 +93,10 @@ impl ScriptService {
             &mut engine,
         );
+
+        // Register multimodal keywords (IMAGE, VIDEO, AUDIO, SEE)
+        // These connect to botmodels for image/video/audio generation and vision/captioning
+        register_multimodal_keywords(state.clone(), user.clone(), &mut engine);

         ScriptService { engine }
     }

     fn preprocess_basic_script(&self, script: &str) -> String {

@@ -158,6 +163,11 @@ impl ScriptService {
             "SET USER",
             "GET BOT MEMORY",
             "SET BOT MEMORY",
+            "IMAGE",
+            "VIDEO",
+            "AUDIO",
+            "SEE",
+            "SEND FILE",
         ];
         let is_basic_command = basic_commands.iter().any(|&cmd| trimmed.starts_with(cmd));
         let is_control_flow = trimmed.starts_with("IF")

@@ -1,6 +1,7 @@
 // Core modules (always included)
 pub mod basic;
 pub mod core;
+pub mod multimodal;
 pub mod security;
 pub mod web;

src/multimodal/mod.rs (new file, 652 lines)

@@ -0,0 +1,652 @@
//! Multimodal module for botmodels integration
|
||||
//! Provides client for image, video, audio generation and vision/captioning
|
||||
|
||||
use crate::config::ConfigManager;
|
||||
use crate::shared::state::AppState;
|
||||
use log::{error, info, trace};
|
||||
use reqwest::Client;
|
||||
use serde::{Deserialize, Serialize};
|
||||
use std::sync::Arc;
|
||||
use uuid::Uuid;
|
||||
|
||||
/// Configuration for botmodels connection
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct BotModelsConfig {
|
||||
pub enabled: bool,
|
||||
pub host: String,
|
||||
pub port: u16,
|
||||
pub api_key: String,
|
||||
pub use_https: bool,
|
||||
}
|
||||
|
||||
impl BotModelsConfig {
|
||||
pub fn from_database(config_manager: &ConfigManager, bot_id: &Uuid) -> Self {
|
||||
let enabled = config_manager
|
||||
.get_config(bot_id, "botmodels-enabled", Some("false"))
|
||||
.unwrap_or_default()
|
||||
.to_lowercase()
|
||||
== "true";
|
||||
|
||||
let host = config_manager
|
||||
.get_config(bot_id, "botmodels-host", Some("0.0.0.0"))
|
||||
.unwrap_or_else(|_| "0.0.0.0".to_string());
|
||||
|
||||
let port = config_manager
|
||||
.get_config(bot_id, "botmodels-port", Some("8085"))
|
||||
.unwrap_or_else(|_| "8085".to_string())
|
||||
.parse()
|
||||
.unwrap_or(8085);
|
||||
|
||||
let api_key = config_manager
|
||||
.get_config(bot_id, "botmodels-api-key", Some(""))
|
||||
.unwrap_or_default();
|
||||
|
||||
let use_https = config_manager
|
||||
.get_config(bot_id, "botmodels-https", Some("false"))
|
||||
.unwrap_or_default()
|
||||
.to_lowercase()
|
||||
== "true";
|
||||
|
||||
Self {
|
||||
enabled,
|
||||
host,
|
||||
port,
|
||||
api_key,
|
||||
use_https,
|
||||
}
|
||||
}
|
||||
|
||||
pub fn base_url(&self) -> String {
|
||||
let protocol = if self.use_https { "https" } else { "http" };
|
||||
format!("{}://{}:{}", protocol, self.host, self.port)
|
||||
}
|
||||
}
|
||||
|
||||
/// Image generation configuration
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct ImageGeneratorConfig {
|
||||
pub model: String,
|
||||
pub steps: u32,
|
||||
pub width: u32,
|
||||
pub height: u32,
|
||||
pub gpu_layers: u32,
|
||||
pub batch_size: u32,
|
||||
}
|
||||
|
||||
impl ImageGeneratorConfig {
|
||||
pub fn from_database(config_manager: &ConfigManager, bot_id: &Uuid) -> Self {
|
||||
Self {
|
||||
model: config_manager
|
||||
.get_config(bot_id, "image-generator-model", None)
|
||||
.unwrap_or_default(),
|
||||
steps: config_manager
|
||||
.get_config(bot_id, "image-generator-steps", Some("4"))
|
||||
.unwrap_or_else(|_| "4".to_string())
|
||||
.parse()
|
||||
.unwrap_or(4),
|
||||
width: config_manager
|
||||
.get_config(bot_id, "image-generator-width", Some("512"))
|
||||
.unwrap_or_else(|_| "512".to_string())
|
||||
.parse()
|
||||
.unwrap_or(512),
|
||||
height: config_manager
|
||||
.get_config(bot_id, "image-generator-height", Some("512"))
|
||||
.unwrap_or_else(|_| "512".to_string())
|
||||
.parse()
|
||||
.unwrap_or(512),
|
||||
gpu_layers: config_manager
|
||||
.get_config(bot_id, "image-generator-gpu-layers", Some("20"))
|
||||
.unwrap_or_else(|_| "20".to_string())
|
||||
.parse()
|
||||
.unwrap_or(20),
|
||||
batch_size: config_manager
|
||||
.get_config(bot_id, "image-generator-batch-size", Some("1"))
|
||||
.unwrap_or_else(|_| "1".to_string())
|
||||
.parse()
|
||||
.unwrap_or(1),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Video generation configuration
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct VideoGeneratorConfig {
|
||||
pub model: String,
|
||||
pub frames: u32,
|
||||
pub fps: u32,
|
||||
pub width: u32,
|
||||
pub height: u32,
|
||||
pub gpu_layers: u32,
|
||||
pub batch_size: u32,
|
||||
}
|
||||
|
||||
impl VideoGeneratorConfig {
|
||||
pub fn from_database(config_manager: &ConfigManager, bot_id: &Uuid) -> Self {
|
||||
Self {
|
||||
model: config_manager
|
||||
.get_config(bot_id, "video-generator-model", None)
|
||||
.unwrap_or_default(),
|
||||
frames: config_manager
|
||||
.get_config(bot_id, "video-generator-frames", Some("24"))
|
||||
.unwrap_or_else(|_| "24".to_string())
|
||||
.parse()
|
||||
.unwrap_or(24),
|
||||
fps: config_manager
|
||||
.get_config(bot_id, "video-generator-fps", Some("8"))
|
||||
.unwrap_or_else(|_| "8".to_string())
|
||||
.parse()
|
||||
.unwrap_or(8),
|
||||
width: config_manager
|
||||
.get_config(bot_id, "video-generator-width", Some("320"))
|
||||
.unwrap_or_else(|_| "320".to_string())
|
||||
.parse()
|
||||
.unwrap_or(320),
|
||||
height: config_manager
|
||||
.get_config(bot_id, "video-generator-height", Some("576"))
|
||||
.unwrap_or_else(|_| "576".to_string())
|
||||
.parse()
|
||||
.unwrap_or(576),
|
||||
gpu_layers: config_manager
|
||||
.get_config(bot_id, "video-generator-gpu-layers", Some("15"))
|
||||
.unwrap_or_else(|_| "15".to_string())
|
||||
.parse()
|
||||
.unwrap_or(15),
|
||||
batch_size: config_manager
|
||||
.get_config(bot_id, "video-generator-batch-size", Some("1"))
|
||||
.unwrap_or_else(|_| "1".to_string())
|
||||
.parse()
|
||||
.unwrap_or(1),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// API Request/Response types
|
||||
|
||||
#[derive(Debug, Serialize)]
|
||||
pub struct ImageGenerateRequest {
|
||||
pub prompt: String,
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub steps: Option<u32>,
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub width: Option<u32>,
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub height: Option<u32>,
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub guidance_scale: Option<f32>,
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub seed: Option<i64>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Serialize)]
|
||||
pub struct VideoGenerateRequest {
|
||||
pub prompt: String,
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub num_frames: Option<u32>,
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub fps: Option<u32>,
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub steps: Option<u32>,
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub seed: Option<i64>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Serialize)]
|
||||
pub struct SpeechGenerateRequest {
|
||||
pub prompt: String,
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub voice: Option<String>,
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub language: Option<String>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Deserialize)]
|
||||
pub struct GenerationResponse {
|
||||
pub status: String,
|
||||
pub file_path: Option<String>,
|
||||
pub generation_time: Option<f64>,
|
||||
pub error: Option<String>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Deserialize)]
|
||||
pub struct DescribeResponse {
|
||||
pub description: String,
|
||||
pub confidence: Option<f64>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Deserialize)]
|
||||
pub struct VideoDescribeResponse {
|
||||
pub description: String,
|
||||
pub frame_count: Option<u32>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Deserialize)]
|
||||
pub struct SpeechToTextResponse {
|
||||
pub text: String,
|
||||
pub language: Option<String>,
|
||||
pub confidence: Option<f64>,
|
||||
}
|
||||
|
||||
/// BotModels client for multimodal operations
|
||||
pub struct BotModelsClient {
|
||||
client: Client,
|
||||
config: BotModelsConfig,
|
||||
image_config: ImageGeneratorConfig,
|
||||
video_config: VideoGeneratorConfig,
|
||||
}
|
||||
|
||||
impl BotModelsClient {
|
||||
pub fn new(
|
||||
config: BotModelsConfig,
|
||||
image_config: ImageGeneratorConfig,
|
||||
video_config: VideoGeneratorConfig,
|
||||
) -> Self {
|
||||
let client = Client::builder()
|
||||
.danger_accept_invalid_certs(true) // For self-signed certs in dev
|
||||
.timeout(std::time::Duration::from_secs(300)) // 5 min timeout for generation
|
||||
.build()
|
||||
.unwrap_or_else(|_| Client::new());
|
||||
|
||||
Self {
|
||||
client,
|
||||
config,
|
||||
image_config,
|
||||
video_config,
|
||||
}
|
||||
}
|
||||
|
||||
pub fn from_state(state: &AppState, bot_id: &Uuid) -> Self {
|
||||
let config_manager = ConfigManager::new(state.conn.clone());
|
||||
let config = BotModelsConfig::from_database(&config_manager, bot_id);
|
||||
let image_config = ImageGeneratorConfig::from_database(&config_manager, bot_id);
|
||||
let video_config = VideoGeneratorConfig::from_database(&config_manager, bot_id);
|
||||
Self::new(config, image_config, video_config)
|
||||
}
|
||||
|
||||
pub fn is_enabled(&self) -> bool {
|
||||
self.config.enabled
|
||||
}
|
||||
|
||||
/// Generate an image from a text prompt
|
||||
pub async fn generate_image(
|
||||
&self,
|
||||
prompt: &str,
|
||||
) -> Result<String, Box<dyn std::error::Error + Send + Sync>> {
|
||||
if !self.config.enabled {
|
||||
return Err("BotModels is not enabled".into());
|
||||
}
|
||||
|
||||
let url = format!("{}/api/image/generate", self.config.base_url());
|
||||
trace!("Generating image at {}: {}", url, prompt);
|
||||
|
||||
let request = ImageGenerateRequest {
|
||||
prompt: prompt.to_string(),
|
||||
steps: Some(self.image_config.steps),
|
||||
width: Some(self.image_config.width),
|
||||
height: Some(self.image_config.height),
|
||||
guidance_scale: Some(7.5),
|
||||
seed: None,
|
||||
};
|
||||
|
||||
let response = self
|
||||
.client
|
||||
.post(&url)
|
||||
.header("X-API-Key", &self.config.api_key)
|
||||
.json(&request)
|
||||
.send()
|
||||
.await?;
|
||||
|
||||
if !response.status().is_success() {
|
||||
let error_text = response.text().await.unwrap_or_default();
|
||||
error!("Image generation failed: {}", error_text);
|
||||
return Err(format!("Image generation failed: {}", error_text).into());
|
||||
}
|
||||
|
||||
let result: GenerationResponse = response.json().await?;
|
||||
|
||||
if result.status == "completed" {
|
||||
if let Some(file_path) = result.file_path {
|
||||
let full_url = format!("{}{}", self.config.base_url(), file_path);
|
||||
info!("Image generated: {}", full_url);
|
||||
return Ok(full_url);
|
||||
}
|
||||
}
|
||||
|
||||
Err(result
|
||||
.error
|
||||
.unwrap_or_else(|| "Unknown error".to_string())
|
||||
.into())
|
||||
}
|
||||
|
||||
    /// Generate a video from a text prompt
    pub async fn generate_video(
        &self,
        prompt: &str,
    ) -> Result<String, Box<dyn std::error::Error + Send + Sync>> {
        if !self.config.enabled {
            return Err("BotModels is not enabled".into());
        }

        let url = format!("{}/api/video/generate", self.config.base_url());
        trace!("Generating video at {}: {}", url, prompt);

        let request = VideoGenerateRequest {
            prompt: prompt.to_string(),
            num_frames: Some(self.video_config.frames),
            fps: Some(self.video_config.fps),
            steps: Some(50),
            seed: None,
        };

        let response = self
            .client
            .post(&url)
            .header("X-API-Key", &self.config.api_key)
            .json(&request)
            .send()
            .await?;

        if !response.status().is_success() {
            let error_text = response.text().await.unwrap_or_default();
            error!("Video generation failed: {}", error_text);
            return Err(format!("Video generation failed: {}", error_text).into());
        }

        let result: GenerationResponse = response.json().await?;

        if result.status == "completed" {
            if let Some(file_path) = result.file_path {
                let full_url = format!("{}{}", self.config.base_url(), file_path);
                info!("Video generated: {}", full_url);
                return Ok(full_url);
            }
        }

        Err(result
            .error
            .unwrap_or_else(|| "Unknown error".to_string())
            .into())
    }

    /// Generate audio/speech from text
    pub async fn generate_audio(
        &self,
        text: &str,
        voice: Option<&str>,
        language: Option<&str>,
    ) -> Result<String, Box<dyn std::error::Error + Send + Sync>> {
        if !self.config.enabled {
            return Err("BotModels is not enabled".into());
        }

        let url = format!("{}/api/speech/generate", self.config.base_url());
        trace!("Generating audio at {}: {}", url, text);

        let request = SpeechGenerateRequest {
            prompt: text.to_string(),
            voice: voice.map(String::from),
            language: language.map(String::from),
        };

        let response = self
            .client
            .post(&url)
            .header("X-API-Key", &self.config.api_key)
            .json(&request)
            .send()
            .await?;

        if !response.status().is_success() {
            let error_text = response.text().await.unwrap_or_default();
            error!("Audio generation failed: {}", error_text);
            return Err(format!("Audio generation failed: {}", error_text).into());
        }

        let result: GenerationResponse = response.json().await?;

        if result.status == "completed" {
            if let Some(file_path) = result.file_path {
                let full_url = format!("{}{}", self.config.base_url(), file_path);
                info!("Audio generated: {}", full_url);
                return Ok(full_url);
            }
        }

        Err(result
            .error
            .unwrap_or_else(|| "Unknown error".to_string())
            .into())
    }

    /// Get caption/description for an image
    pub async fn describe_image(
        &self,
        image_url_or_path: &str,
    ) -> Result<String, Box<dyn std::error::Error + Send + Sync>> {
        if !self.config.enabled {
            return Err("BotModels is not enabled".into());
        }

        let url = format!("{}/api/vision/describe", self.config.base_url());
        trace!("Describing image at {}: {}", url, image_url_or_path);

        // If it's a URL, download the image first
        let image_data = if image_url_or_path.starts_with("http") {
            let response = self.client.get(image_url_or_path).send().await?;
            response.bytes().await?.to_vec()
        } else {
            tokio::fs::read(image_url_or_path).await?
        };

        let form = reqwest::multipart::Form::new().part(
            "file",
            reqwest::multipart::Part::bytes(image_data)
                .file_name("image.png")
                .mime_str("image/png")?,
        );

        let response = self
            .client
            .post(&url)
            .header("X-API-Key", &self.config.api_key)
            .multipart(form)
            .send()
            .await?;

        if !response.status().is_success() {
            let error_text = response.text().await.unwrap_or_default();
            error!("Image description failed: {}", error_text);
            return Err(format!("Image description failed: {}", error_text).into());
        }

        let result: DescribeResponse = response.json().await?;
        info!("Image described: {}", result.description);
        Ok(result.description)
    }

    /// Get caption/description for a video
    pub async fn describe_video(
        &self,
        video_url_or_path: &str,
    ) -> Result<String, Box<dyn std::error::Error + Send + Sync>> {
        if !self.config.enabled {
            return Err("BotModels is not enabled".into());
        }

        let url = format!("{}/api/vision/describe_video", self.config.base_url());
        trace!("Describing video at {}: {}", url, video_url_or_path);

        let video_data = if video_url_or_path.starts_with("http") {
            let response = self.client.get(video_url_or_path).send().await?;
            response.bytes().await?.to_vec()
        } else {
            tokio::fs::read(video_url_or_path).await?
        };

        let form = reqwest::multipart::Form::new().part(
            "file",
            reqwest::multipart::Part::bytes(video_data)
                .file_name("video.mp4")
                .mime_str("video/mp4")?,
        );

        let response = self
            .client
            .post(&url)
            .header("X-API-Key", &self.config.api_key)
            .multipart(form)
            .send()
            .await?;

        if !response.status().is_success() {
            let error_text = response.text().await.unwrap_or_default();
            error!("Video description failed: {}", error_text);
            return Err(format!("Video description failed: {}", error_text).into());
        }

        let result: VideoDescribeResponse = response.json().await?;
        info!("Video described: {}", result.description);
        Ok(result.description)
    }

    /// Convert speech to text
    pub async fn speech_to_text(
        &self,
        audio_url_or_path: &str,
    ) -> Result<String, Box<dyn std::error::Error + Send + Sync>> {
        if !self.config.enabled {
            return Err("BotModels is not enabled".into());
        }

        let url = format!("{}/api/speech/totext", self.config.base_url());
        trace!(
            "Converting speech to text at {}: {}",
            url,
            audio_url_or_path
        );

        let audio_data = if audio_url_or_path.starts_with("http") {
            let response = self.client.get(audio_url_or_path).send().await?;
            response.bytes().await?.to_vec()
        } else {
            tokio::fs::read(audio_url_or_path).await?
        };

        let form = reqwest::multipart::Form::new().part(
            "file",
            reqwest::multipart::Part::bytes(audio_data)
                .file_name("audio.wav")
                .mime_str("audio/wav")?,
        );

        let response = self
            .client
            .post(&url)
            .header("X-API-Key", &self.config.api_key)
            .multipart(form)
            .send()
            .await?;

        if !response.status().is_success() {
            let error_text = response.text().await.unwrap_or_default();
            error!("Speech to text failed: {}", error_text);
            return Err(format!("Speech to text failed: {}", error_text).into());
        }

        let result: SpeechToTextResponse = response.json().await?;
        info!("Speech converted: {}", result.text);
        Ok(result.text)
    }

    /// Check if botmodels server is healthy
    pub async fn health_check(&self) -> bool {
        if !self.config.enabled {
            return false;
        }

        let url = format!("{}/api/health", self.config.base_url());
        match self.client.get(&url).send().await {
            Ok(response) => response.status().is_success(),
            Err(_) => false,
        }
    }

    /// Download generated file to local path
    pub async fn download_file(
        &self,
        url: &str,
        local_path: &str,
    ) -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
        let response = self.client.get(url).send().await?;
        let bytes = response.bytes().await?;
        tokio::fs::write(local_path, bytes).await?;
        Ok(())
    }
}

/// Ensure botmodels server is running (similar to llama.cpp startup)
pub async fn ensure_botmodels_running(
    app_state: Arc<AppState>,
) -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    use crate::shared::models::schema::bots::dsl::*;
    use diesel::prelude::*;

    let config_values = {
        let conn_arc = app_state.conn.clone();
        let default_bot_id = tokio::task::spawn_blocking(move || {
            let mut conn = conn_arc.get().unwrap();
            bots.filter(name.eq("default"))
                .select(id)
                .first::<uuid::Uuid>(&mut *conn)
                .unwrap_or_else(|_| uuid::Uuid::nil())
        })
        .await?;

        let config_manager = ConfigManager::new(app_state.conn.clone());
        let config = BotModelsConfig::from_database(&config_manager, &default_bot_id);
        config
    };

    if !config_values.enabled {
        info!("BotModels is disabled, skipping startup");
        return Ok(());
    }

    info!("Checking BotModels server status...");
    info!("  Host: {}", config_values.host);
    info!("  Port: {}", config_values.port);

    let client = BotModelsClient::new(
        config_values.clone(),
        ImageGeneratorConfig {
            model: String::new(),
            steps: 4,
            width: 512,
            height: 512,
            gpu_layers: 20,
            batch_size: 1,
        },
        VideoGeneratorConfig {
            model: String::new(),
            frames: 24,
            fps: 8,
            width: 320,
            height: 576,
            gpu_layers: 15,
            batch_size: 1,
        },
    );

    // Check if already running
    if client.health_check().await {
        info!("BotModels server is already running and healthy");
        return Ok(());
    }

    info!("BotModels server not responding, it should be started externally");
    info!(
        "Start botmodels with: cd botmodels && python -m uvicorn src.main:app --host {} --port {}",
        config_values.host, config_values.port
    );

    Ok(())
}
@@ -49,3 +49,25 @@ custom-password,
website-expires,1d
website-max-depth,3
website-max-pages,100


image-generator-model,../../../../data/diffusion/sd_turbo_f16.gguf
image-generator-steps,4
image-generator-width,512
image-generator-height,512
image-generator-gpu-layers,20
image-generator-batch-size,1

video-generator-model,../../../../data/diffusion/zeroscope_v2_576w
video-generator-frames,24
video-generator-fps,8
video-generator-width,320
video-generator-height,576
video-generator-gpu-layers,15
video-generator-batch-size,1

botmodels-enabled,true
botmodels-host,0.0.0.0
botmodels-port,8085

default-generator,all