Fix Bedrock config for OpenAI GPT-OSS models
All checks were successful
GBCI / build (push) Successful in 11s
parent 60fe68a693
commit c312a30461
1 changed file with 118 additions and 11 deletions
@@ -68,32 +68,43 @@ llm-model,claude-sonnet-4.5
- Premium pricing
- Newer provider, smaller ecosystem

### Google (Gemini & Vertex AI)

Google's multimodal AI models with strong reasoning capabilities. General Bots natively supports both the public AI Studio API and enterprise Vertex AI.

| Model | Context | Best For | Speed |
|-------|---------|----------|-------|
| Gemini 1.5 Pro | 2M | Complex reasoning, benchmarks | Medium |
| Gemini 1.5 Flash | 1M | Fast multimodal tasks | Fast |

**Configuration for AI Studio (Public API):**

```csv
name,value
llm-provider,google
llm-model,gemini-1.5-pro
llm-url,https://generativelanguage.googleapis.com
llm-key,AIza...
```

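For reference, these settings point at Gemini's REST `generateContent` endpoint. A minimal sketch of an equivalent raw call, assuming the public `v1beta` path (the helper names here are illustrative, not General Bots internals):

```python
import json
import urllib.request

BASE_URL = "https://generativelanguage.googleapis.com"  # same host as llm-url above

def build_request(model: str, prompt: str, api_key: str):
    """Assemble the URL and JSON body for a Gemini generateContent call."""
    url = f"{BASE_URL}/v1beta/models/{model}:generateContent?key={api_key}"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return url, json.dumps(body).encode("utf-8")

def generate(model: str, prompt: str, api_key: str) -> dict:
    """POST the request and return the decoded JSON response."""
    url, data = build_request(model, prompt, api_key)
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With a real `AIza...` key, `generate("gemini-1.5-pro", "Hello", key)` returns the model's candidates as a dict.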
**Configuration for Vertex AI (Enterprise):**

```csv
name,value
llm-provider,vertex
llm-model,gemini-1.5-pro
llm-url,https://us-central1-aiplatform.googleapis.com
llm-key,~/.vertex.json
```

*Note: The bot handles Google OAuth2 JWT authentication internally when you provide either the path to a Service Account JSON file or the raw JSON itself.*

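The OAuth2 exchange the note refers to follows Google's standard service-account flow: sign a JWT assertion with the account's private key, then trade it for a short-lived access token. A simplified sketch of the claim set involved (the sample email is hypothetical, and the signing step is omitted):

```python
import time

# Hypothetical service-account fields, normally read from ~/.vertex.json.
SA_INFO = {"client_email": "bot@my-project.iam.gserviceaccount.com"}

def build_jwt_claims(sa_info: dict, lifetime: int = 3600) -> dict:
    """Claim set of the OAuth2 service-account assertion; in the real flow
    it is signed with the private key and POSTed to the audience URL below
    in exchange for an access token."""
    now = int(time.time())
    return {
        "iss": sa_info["client_email"],  # who is asserting
        "scope": "https://www.googleapis.com/auth/cloud-platform",
        "aud": "https://oauth2.googleapis.com/token",  # token endpoint
        "iat": now,
        "exp": now + lifetime,  # Google caps assertion lifetime at one hour
    }

claims = build_jwt_claims(SA_INFO)
```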
**Strengths:**
- Largest context window (2M tokens)
- Native multimodal (text, image, video, audio)
- Vertex AI support enables enterprise VPC/IAM integration
- Good coding abilities

**Considerations:**
- Different endpoints for public vs enterprise deployments
- API changes more frequently

### xAI (Grok Series)

@@ -203,6 +214,98 @@ llm-server-url,https://api.deepseek.com
- Data processed in China
- Newer provider

### Amazon Bedrock

AWS managed service for foundation models, supporting Claude, Llama, Titan, and others.

| Model | Context | Best For | Speed |
|-------|---------|----------|-------|
| Claude 3.5 Sonnet | 200K | High capability tasks | Fast |
| Llama 3.1 70B | 128K | Open-weight performance | Fast |

**Configuration (config.csv):**

```csv
name,value
llm-provider,bedrock
llm-model,anthropic.claude-3-5-sonnet-20240620-v1:0
llm-url,https://bedrock-runtime.us-east-1.amazonaws.com/model/anthropic.claude-3-5-sonnet-20240620-v1:0/invoke
llm-key,YOUR_BEDROCK_API_KEY
```

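Bedrock's `invoke` endpoint expects a model-specific body; for Anthropic models it is the Claude Messages format plus an `anthropic_version` marker. A sketch of assembling that request, under the assumption that the configured key is a Bedrock API key sent as a bearer token:

```python
import json

def build_claude_body(prompt: str, max_tokens: int = 1024) -> dict:
    """Request body for Anthropic models behind Bedrock's invoke endpoint."""
    return {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }

def build_headers(api_key: str) -> dict:
    # Assumption: a Bedrock API key used as a bearer token; classic IAM
    # credentials would instead require SigV4 request signing.
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }

payload = json.dumps(build_claude_body("Summarize this ticket."))
```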
**Strengths:**
- Native AWS integration
- Enterprise-grade security
- Multiple model families in one API

### Azure OpenAI

Enterprise-grade deployment of OpenAI models hosted on Microsoft Azure.

| Model | Context | Best For | Speed |
|-------|---------|----------|-------|
| GPT-4o | 128K | Advanced multimodal | Fast |

**Configuration (config.csv):**

```csv
name,value
llm-provider,azureclaude
llm-model,gpt-4o
llm-url,https://YOUR_RESOURCE.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT/chat/completions?api-version=2024-02-15-preview
llm-key,YOUR_AZURE_API_KEY
```

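Since the deployment name and `api-version` already live in `llm-url`, the request itself only carries the conversation, authenticated with Azure's `api-key` header rather than a bearer token. A minimal sketch:

```python
import json

def build_azure_request(api_key: str, prompt: str):
    """Headers and body for an Azure OpenAI chat completions call;
    deployment and api-version are encoded in the URL, not the body."""
    headers = {
        "Content-Type": "application/json",
        "api-key": api_key,  # Azure OpenAI uses api-key, not Authorization: Bearer
    }
    body = json.dumps({"messages": [{"role": "user", "content": prompt}]})
    return headers, body

headers, body = build_azure_request("YOUR_AZURE_API_KEY", "Hello")
```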
**Strengths:**
- High enterprise compliance (HIPAA, SOC2)
- Azure VNet integration
- Guaranteed provisioned throughput available

### Cerebras

Ultra-fast inference powered by Wafer-Scale Engine hardware, specifically tuned for open-source models like Llama.

| Model | Context | Best For | Speed |
|-------|---------|----------|-------|
| Llama 3.1 70B | 8K | High-speed general tasks | Extremely Fast |

**Configuration (config.csv):**

```csv
name,value
llm-provider,cerebras
llm-model,llama3.1-70b
llm-url,https://api.cerebras.ai/v1/chat/completions
llm-key,YOUR_CEREBRAS_API_KEY
```

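Cerebras exposes the familiar OpenAI-style chat completions contract, so the request body has the same shape as other OpenAI-compatible providers in this chapter. A minimal sketch of that shared payload:

```python
import json

def chat_payload(model: str, prompt: str, temperature: float = 0.7) -> str:
    """OpenAI-style /v1/chat/completions body, as accepted by Cerebras
    and other OpenAI-compatible endpoints."""
    return json.dumps({
        "model": model,
        "temperature": temperature,
        "messages": [{"role": "user", "content": prompt}],
    })

payload = chat_payload("llama3.1-70b", "ping")
```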
**Strengths:**
- Highest tokens-per-second available
- Excellent for real-time agent loops

### Zhipu AI (GLM)

High-capability bilingual models (English/Chinese) directly competing with state-of-the-art global models.

| Model | Context | Best For | Speed |
|-------|---------|----------|-------|
| GLM-4 | 128K | General purpose | Medium |
| GLM-4-Long | 1M | Long document analysis | Medium |

**Configuration (config.csv):**

```csv
name,value
llm-provider,glm
llm-model,glm-4
llm-url,https://open.bigmodel.cn/api/paas/v4/chat/completions
llm-key,YOUR_ZHIPU_API_KEY
```

**Strengths:**
- Excellent bilingual performance
- Large context windows (up to 1M)

## Local Models

Run models on your own hardware for privacy, cost control, and offline operation.
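Every provider above shares the same two-column `name,value` layout in config.csv. For illustration, a sketch of how such a file reduces to a settings map (the `load_config` helper is hypothetical, not General Bots' actual loader):

```python
import csv
import io

def load_config(text: str) -> dict:
    """Reduce a name,value config.csv (header row included) to a dict."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the name,value header row
    return {row[0]: row[1] for row in reader if len(row) >= 2}

settings = load_config("name,value\nllm-provider,google\nllm-model,gemini-1.5-pro\n")
```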
@@ -456,8 +559,12 @@ llm-log-timing,true
|-------|---------|------|-----------|
| GPT-5 | OpenAI | Proprietary | Most advanced all-in-one |
| Claude Opus/Sonnet 4.5 | Anthropic | Proprietary | Extended thinking, complex reasoning |
| Gemini 1.5/3 Pro | Google | Proprietary | Benchmarks, reasoning, 2M context |
| Grok 4 | xAI | Proprietary | Real-time X data |
| Claude / Llama | Amazon Bedrock | Managed API | Enterprise AWS integration |
| GPT-4o / GPT-5 | Azure OpenAI | Managed API | Enterprise compliance, Azure VNet |
| Llama / Open Models | Cerebras | Hardware Cloud | Extreme inference speed |
| GLM-4 | Zhipu AI | Proprietary | English/Chinese bilingual, up to 1M context |
| DeepSeek-V3.1/R1 | DeepSeek | Open (MIT/Apache) | Cost-optimized, reasoning |
| Llama 4 | Meta | Open-weight | 10M context, multimodal |
| Qwen3 | Alibaba | Open (Apache) | Efficient MoE |