Embedding providers convert text into high-dimensional vectors for semantic search. Agentium provides a unified EmbeddingProvider interface with OpenAI and Google implementations.

EmbeddingProvider Interface

interface EmbeddingProvider {
  readonly dimensions: number;
  embed(text: string): Promise<number[]>;
  embedBatch(texts: string[]): Promise<number[][]>;
  /** Optional - true for models like gemini-embedding-2 */
  readonly supportsMultimodal?: boolean;
  /** Optional - returns ONE vector for the full multimodal input */
  embedMultimodal?(input: string | ContentPart | ContentPart[]): Promise<number[]>;
}
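
Because the interface is small, a custom provider is straightforward to write. The following is a minimal sketch of a deterministic test double, assuming only the required members shown above. `HashEmbedding` is a hypothetical name, not part of Agentium, and its vectors carry no semantic meaning.

```typescript
// Local copy of the required members of the interface above,
// so this sketch compiles on its own.
interface EmbeddingProvider {
  readonly dimensions: number;
  embed(text: string): Promise<number[]>;
  embedBatch(texts: string[]): Promise<number[][]>;
}

// Hypothetical test double: buckets character codes into a fixed-size
// vector and L2-normalizes it. Deterministic, but NOT semantic.
class HashEmbedding implements EmbeddingProvider {
  readonly dimensions: number;

  constructor(dimensions = 8) {
    this.dimensions = dimensions;
  }

  async embed(text: string): Promise<number[]> {
    const v = new Array<number>(this.dimensions).fill(0);
    for (let i = 0; i < text.length; i++) {
      const bucket = (text.charCodeAt(i) + i) % this.dimensions;
      v[bucket] += 1;
    }
    // Guard against the empty string, whose vector is all zeros.
    const norm = Math.hypot(...v) || 1;
    return v.map((x) => x / norm);
  }

  async embedBatch(texts: string[]): Promise<number[][]> {
    return Promise.all(texts.map((t) => this.embed(t)));
  }
}
```

A double like this can stand in for a real provider in unit tests that should not call an external API.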

OpenAI Embeddings

npm install openai

import { OpenAIEmbedding } from "@agentium/core";

const embedder = new OpenAIEmbedding({
  apiKey: process.env.OPENAI_API_KEY,  // optional, uses env var by default
  model: "text-embedding-3-small",      // optional, this is the default
});

const vector = await embedder.embed("Hello world");
console.log(vector.length); // 1536

const vectors = await embedder.embedBatch(["Hello", "World"]);
console.log(vectors.length); // 2
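
For large inputs it can help to split the texts into smaller groups before calling embedBatch. The sketch below works with any provider that exposes embedBatch. The helper name `embedInChunks` and the default chunk size of 100 are assumptions for illustration; this page does not state a batch-size limit.

```typescript
// Only the batch method is needed for this helper.
interface BatchEmbedder {
  embedBatch(texts: string[]): Promise<number[][]>;
}

// Hypothetical helper: embeds `texts` in sequential chunks and
// returns the vectors in the same order as the input.
async function embedInChunks(
  embedder: BatchEmbedder,
  texts: string[],
  chunkSize = 100, // arbitrary default, not a documented limit
): Promise<number[][]> {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const out: number[][] = [];
  for (let i = 0; i < texts.length; i += chunkSize) {
    const vectors = await embedder.embedBatch(texts.slice(i, i + chunkSize));
    out.push(...vectors);
  }
  return out;
}
```

Running the chunks sequentially keeps the request rate low; a concurrent variant would be faster but more likely to hit provider rate limits.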

Available Models

| Model | Dimensions | Best For |
| --- | --- | --- |
| text-embedding-3-small | 1536 | General use, cost-effective |
| text-embedding-3-large | 3072 | Higher accuracy |
| text-embedding-ada-002 | 1536 | Legacy |

apiKey (string): OpenAI API key. Falls back to the OPENAI_API_KEY env var.
model (string, default: "text-embedding-3-small"): Embedding model name.

Google Embeddings

npm install @google/genai

import { GoogleEmbedding } from "@agentium/core";

const embedder = new GoogleEmbedding({
  apiKey: process.env.GOOGLE_API_KEY,  // optional, uses env var by default
  model: "text-embedding-004",         // optional, this is the default
});

const vector = await embedder.embed("Hello world");
const vectors = await embedder.embedBatch(["Hello", "World"]);

Available Models

| Model | Dimensions | Modalities | Best For |
| --- | --- | --- | --- |
| text-embedding-004 | 768 | Text | General text, cost-effective |
| embedding-001 | 768 | Text | Legacy |
| gemini-embedding-001 | 3072 | Text | Higher accuracy, text-only |
| gemini-embedding-2 | 3072 (Matryoshka, 128 to 3072) | Text, image, audio, video, PDF | Multimodal RAG and unified search |

apiKey (string): Google API key. Falls back to the GOOGLE_API_KEY env var.
model (string, default: "text-embedding-004"): Embedding model name. Switch to gemini-embedding-2 for multimodal input.
dimensions (number): Overrides the output dimension. gemini-embedding-2 supports any value from 128 to 3072 (recommended: 768, 1536, or 3072) thanks to Matryoshka Representation Learning.

Multimodal Embeddings (Gemini Embedding 2)

gemini-embedding-2 maps text, images, audio, video, and PDFs into a single unified vector space. One call returns one aggregated 3072-dim vector for the whole input - useful when you want a single semantic key for a heterogeneous document.

Index an image with a caption

import {
  GoogleEmbedding,
  InMemoryVectorStore,
  partsFromFile,
} from "@agentium/core";

const embedder = new GoogleEmbedding({ model: "gemini-embedding-2" });
const store = new InMemoryVectorStore(embedder);

const imagePart = await partsFromFile("./photo.jpg");

await store.upsert("photos", {
  id: "photo-1",
  content: "Sunset over the Sahara",
  parts: [
    { type: "text", text: "Sunset over the Sahara" },
    imagePart,
  ],
  metadata: { tags: ["nature", "africa"] },
});

Search by image

const queryImage = await partsFromFile("./query.jpg");
const hits = await store.search("photos", [queryImage], { topK: 5 });
console.log(hits);

Supported modalities

  • Text -> { type: "text", text }
  • Image -> { type: "image", data, mimeType } (data is base64 or an HTTPS URL; supported MIME types: image/png, image/jpeg, image/gif, image/webp)
  • Audio -> { type: "audio", data, mimeType }
  • Video -> { type: "file", data, mimeType: "video/mp4" }
  • PDF -> { type: "file", data, mimeType: "application/pdf" }
Helpers:
  • partsFromFile(path, mimeType?) reads a local file, infers MIME from extension, and returns the right ContentPart.
  • fetchAsBase64(url) downloads an HTTPS URL and returns { data, mimeType }.
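
When the media is already in memory, the parts can also be built by hand. The sketch below assumes the part shapes listed above; the `ContentPart` union here is inferred from that list and may differ from the type Agentium exports. `imagePart` and `captionedImage` are hypothetical helpers, not library functions.

```typescript
// Assumed shapes, inferred from the "Supported modalities" list.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image"; data: string; mimeType: string }
  | { type: "audio"; data: string; mimeType: string }
  | { type: "file"; data: string; mimeType: string };

// Image MIME types listed as supported on this page.
const SUPPORTED_IMAGE_MIME = new Set([
  "image/png",
  "image/jpeg",
  "image/gif",
  "image/webp",
]);

// Hypothetical helper: wraps base64 data (or an HTTPS URL) as an image part,
// rejecting MIME types that are not in the supported list.
function imagePart(data: string, mimeType: string): ContentPart {
  if (!SUPPORTED_IMAGE_MIME.has(mimeType)) {
    throw new TypeError(`Unsupported image MIME type: ${mimeType}`);
  }
  return { type: "image", data, mimeType };
}

// Hypothetical helper: pairs a caption with an image so that both
// can be passed together as one multimodal input.
function captionedImage(
  caption: string,
  data: string,
  mimeType: string,
): ContentPart[] {
  return [{ type: "text", text: caption }, imagePart(data, mimeType)];
}
```

Validating the MIME type before the call surfaces an unsupported format locally instead of as a provider error.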

Output dimensions (Matryoshka)

gemini-embedding-2 is trained with Matryoshka Representation Learning, so you can store smaller vectors without retraining:

const compact = new GoogleEmbedding({
  model: "gemini-embedding-2",
  dimensions: 768,  // also supported: 128-3072; recommended 768 / 1536 / 3072
});
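
The general Matryoshka property is that a prefix of the full vector is itself a usable embedding once it is re-normalized. The sketch below illustrates that idea client-side, for example to shrink vectors that were already stored at full size. It is not a description of what the `dimensions` option does internally, which this page does not specify, and `truncateAndNormalize` is a hypothetical helper.

```typescript
// Hypothetical helper: keeps the first `dims` components of a
// Matryoshka-style embedding and rescales the result to unit length.
function truncateAndNormalize(vector: number[], dims: number): number[] {
  if (!Number.isInteger(dims) || dims < 1 || dims > vector.length) {
    throw new RangeError(`dims must be an integer between 1 and ${vector.length}`);
  }
  const head = vector.slice(0, dims);
  const norm = Math.hypot(...head);
  // An all-zero prefix cannot be normalized; return it unchanged.
  return norm === 0 ? head : head.map((x) => x / norm);
}
```

Re-normalizing matters when the store ranks by dot product, since a truncated prefix no longer has unit length.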

Important: v1 and v2 vectors are NOT interchangeable

Embeddings produced by text-embedding-004 / gemini-embedding-001 live in a different semantic space than those produced by gemini-embedding-2. If you upgrade, re-index your collection - cosine similarity across the two spaces is meaningless.
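
One way to avoid mixing spaces by accident is to record the model name next to each vector and refuse to compare across models. The following is a sketch under that assumption; `TaggedVector` and `compareInSameSpace` are hypothetical names, and Agentium's vector stores may or may not track the model for you.

```typescript
// Hypothetical record: a vector tagged with the model that produced it.
interface TaggedVector {
  model: string;
  vector: number[];
}

// Standard cosine similarity; returns 0 if either vector has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Refuses to compare vectors from different models or of different sizes.
function compareInSameSpace(a: TaggedVector, b: TaggedVector): number {
  if (a.model !== b.model) {
    throw new Error(`Cannot compare vectors from ${a.model} and ${b.model}`);
  }
  if (a.vector.length !== b.vector.length) {
    throw new Error("Cannot compare vectors with different dimensions");
  }
  return cosineSimilarity(a.vector, b.vector);
}
```

The dimension check alone is not enough: gemini-embedding-001 and gemini-embedding-2 both default to 3072 dimensions, so only the model tag distinguishes them.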

Limitations

  • One aggregated vector per call. If you need separate vectors per item, call embedMultimodal once per item (the Gemini Batch API for per-item vectors is on the roadmap).
  • Vector backends do not persist media. Only the content field and the embedding are stored; the original parts are not. Put any structured payload you need to keep (image URL, file path, page number) in metadata.
  • Per-call media limits (enforced by Google): 6 images, 120s video, 180s audio, 6 PDF pages.
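
The first limitation can be worked around with a small loop that calls embedMultimodal once per item. This is a sketch, assuming the optional members from the EmbeddingProvider interface and part shapes inferred from the modalities list; `embedEachPart` is a hypothetical helper.

```typescript
// Assumed shapes, inferred from the "Supported modalities" list.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image" | "audio" | "file"; data: string; mimeType: string };

// Only the optional multimodal members of EmbeddingProvider are needed here.
interface MaybeMultimodal {
  readonly supportsMultimodal?: boolean;
  embedMultimodal?(input: string | ContentPart | ContentPart[]): Promise<number[]>;
}

// Hypothetical helper: returns one vector per item, in input order.
async function embedEachPart(
  embedder: MaybeMultimodal,
  items: ContentPart[],
): Promise<number[][]> {
  const embedOne = embedder.embedMultimodal?.bind(embedder);
  if (!embedder.supportsMultimodal || !embedOne) {
    throw new Error("This provider does not support multimodal input");
  }
  const vectors: number[][] = [];
  for (const item of items) {
    // One call per item, so each item gets its own vector.
    vectors.push(await embedOne(item));
  }
  return vectors;
}
```

Each item costs one request, so this trades throughput for per-item vectors.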

Using with KnowledgeBase

Embedding providers reach a KnowledgeBase through its vector store: most vector stores accept an EmbeddingProvider in their configuration; otherwise the KnowledgeBase handles embedding internally.

import { KnowledgeBase, InMemoryVectorStore, OpenAIEmbedding } from "@agentium/core";

const embedder = new OpenAIEmbedding();
// Pass the provider to the store, as in the multimodal example above.
const vectorStore = new InMemoryVectorStore(embedder);

const kb = new KnowledgeBase({
  name: "docs",
  vectorStore,
});

RAG Example

See a complete end-to-end RAG implementation using embeddings.