Models

Model Agnosticism

RadarOS is model-agnostic. You can use OpenAI, Anthropic, Google Gemini, Vertex AI, Ollama, or any custom provider through a unified ModelProvider interface. Switch models with a single line change—no refactoring required.

Unified Interface

All providers implement the same generate() and stream() methods. Your agent code stays identical.

Easy Switching

Swap openai("gpt-4o") for anthropic("claude-sonnet-4-20250514") without touching the rest of your app.

ModelProvider Interface

Every model in RadarOS implements the ModelProvider interface:
interface ModelProvider {
  readonly providerId: string;
  readonly modelId: string;
  generate(messages: ChatMessage[], options?: ModelConfig & { tools?: ToolDefinition[] }): Promise<ModelResponse>;
  stream(messages: ChatMessage[], options?: ModelConfig & { tools?: ToolDefinition[] }): AsyncGenerator<StreamChunk>;
}
`providerId` (string): Unique identifier for the provider (e.g., "openai", "anthropic").

`modelId` (string): The specific model identifier (e.g., "gpt-4o", "claude-sonnet-4-20250514").

`generate` ((messages, options?) => Promise&lt;ModelResponse&gt;): Non-streaming completion. Returns the full response with message, usage, and finish reason.

`stream` ((messages, options?) => AsyncGenerator&lt;StreamChunk&gt;): Streaming completion. Yields text deltas, tool call chunks, and finish events.
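Because the interface is just two methods plus two identifiers, a custom provider is straightforward to sketch. The minimal type shapes below (the exact fields of ChatMessage, ModelResponse, and StreamChunk) are illustrative assumptions, not RadarOS's actual definitions:

```typescript
// Hypothetical provider that echoes the last message back.
// Type shapes here are simplified assumptions for illustration.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };
type ModelResponse = { message: ChatMessage; finishReason: "stop" | "length" };
type StreamChunk =
  | { type: "text"; delta: string }
  | { type: "finish"; finishReason: "stop" };

class EchoProvider {
  readonly providerId = "echo";
  readonly modelId = "echo-1";

  async generate(messages: ChatMessage[]): Promise<ModelResponse> {
    const last = messages[messages.length - 1];
    return {
      message: { role: "assistant", content: `echo: ${last.content}` },
      finishReason: "stop",
    };
  }

  async *stream(messages: ChatMessage[]): AsyncGenerator<StreamChunk> {
    const last = messages[messages.length - 1];
    // Yield one text chunk per word, then a finish event.
    for (const word of last.content.split(" ")) {
      yield { type: "text", delta: word + " " };
    }
    yield { type: "finish", finishReason: "stop" };
  }
}
```

An instance like this can be passed anywhere a ModelProvider is expected, since agents only ever call generate() and stream().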

Factory Functions

Use factory functions to create model instances. Each returns a ModelProvider ready for agents, teams, and workflows.
| Factory | Provider | Returns | Config |
| --- | --- | --- | --- |
| openai(modelId, config?) | OpenAI | ModelProvider | { apiKey?, baseURL? } |
| anthropic(modelId, config?) | Anthropic | ModelProvider | { apiKey? } |
| google(modelId, config?) | Google Gemini | ModelProvider | { apiKey? } |
| vertex(modelId, config?) | Google Vertex AI | ModelProvider | { project?, location?, credentials? } |
| ollama(modelId, config?) | Ollama (local) | ModelProvider | { host? } (default: http://localhost:11434) |
| openaiRealtime(modelId?, config?) | OpenAI Realtime | RealtimeProvider | { apiKey?, baseURL? } |
| googleLive(modelId?, config?) | Gemini Live | RealtimeProvider | { apiKey? } |
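The config object is optional in every factory; when you do pass one, it follows the Config column above. A hedged sketch (the env-var name and host address are assumptions, not RadarOS defaults):

```typescript
import { openai, ollama } from "@radaros/core";

// Explicit OpenAI config, e.g. to route through a proxy.
const gpt = openai("gpt-4o", {
  apiKey: process.env.OPENAI_API_KEY, // assumed env var name
  baseURL: "https://api.openai.com/v1",
});

// Local Ollama on a non-default host.
const llama = ollama("llama3.1", { host: "http://192.168.1.50:11434" });
```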

Switching Models in One Line

import { Agent, openai, anthropic, google, vertex, ollama } from "@radaros/core";

// Pick one provider; each factory returns the same ModelProvider interface.
const model = openai("gpt-4o");
// const model = anthropic("claude-sonnet-4-20250514");
// const model = google("gemini-2.5-flash");                            // Google Gemini (API key)
// const model = vertex("gemini-2.5-flash", { project: "my-project" }); // Google Vertex AI (GCP auth)
// const model = ollama("llama3.1");                                    // Local Ollama (no API key)

const agent = new Agent({
  name: "Assistant",
  model,
  instructions: "You are a helpful assistant.",
});

Realtime Providers (Voice)

For voice agents, use the realtime helpers that return a RealtimeProvider:
import { VoiceAgent, openaiRealtime, googleLive } from "@radaros/core";

// OpenAI Realtime
const agent = new VoiceAgent({
  name: "assistant",
  provider: openaiRealtime("gpt-4o-realtime-preview"),
});

// Google Gemini Live (distinct name so both snippets compile side by side)
const liveAgent = new VoiceAgent({
  name: "assistant",
  provider: googleLive("gemini-2.5-flash-native-audio-preview-12-2025"),
});

ModelConfig Options

Pass these options to generate() or stream() (or set them on agents):
`temperature` (number): Sampling temperature (0–2). Lower values are more deterministic. Typical: 0.7.

`maxTokens` (number): Maximum tokens in the completion. Provider-specific limits apply.

`topP` (number): Nucleus sampling; an alternative to temperature for some providers.

`stop` (string[]): Stop sequences. Generation stops when any of these strings is produced.

`responseFormat` ('text' | 'json' | object): Output format: "text" (default), "json" (JSON object), or { type: "json_schema", schema, name? } for structured output.

`apiKey` (string): Per-request API key override. Use when you need to override the key set at construction (e.g., multi-tenant).

Example: ModelConfig Usage

const response = await model.generate(messages, {
  temperature: 0.3,
  maxTokens: 1024,
  responseFormat: "json",
});
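For structured output, responseFormat also accepts the { type: "json_schema", ... } object form described above. A sketch, assuming the response body is on res.message.content as in ModelResponse; the Recipe schema itself is a made-up example:

```typescript
const res = await model.generate(messages, {
  responseFormat: {
    type: "json_schema",
    name: "Recipe", // hypothetical schema name
    schema: {
      type: "object",
      properties: {
        title: { type: "string" },
        steps: { type: "array", items: { type: "string" } },
      },
      required: ["title", "steps"],
    },
  },
});

// The reply should parse as an object matching the schema.
const recipe = JSON.parse(res.message.content);
```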

Next Steps