Embeddings - Agentium

Embedding providers convert text into high-dimensional vectors for semantic search. Agentium provides a unified EmbeddingProvider interface with OpenAI and Google implementations.

EmbeddingProvider Interface

interface EmbeddingProvider {
  readonly dimensions: number;
  embed(text: string): Promise<number[]>;
  embedBatch(texts: string[]): Promise<number[][]>;
  /** Optional - true for models like gemini-embedding-2 */
  readonly supportsMultimodal?: boolean;
  /** Optional - returns ONE vector for the full multimodal input */
  embedMultimodal?(input: string | ContentPart | ContentPart[]): Promise<number[]>;
}
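
For unit tests it can be handy to have a provider that never calls a remote API. The sketch below is a hypothetical test double (HashEmbedding is not part of @agentium/core) that implements the required members of the interface with a deterministic, non-semantic hash. The interface is redeclared locally only so the snippet compiles on its own.

```typescript
// Local copy of the required interface members so the sketch is self-contained.
interface EmbeddingProvider {
  readonly dimensions: number;
  embed(text: string): Promise<number[]>;
  embedBatch(texts: string[]): Promise<number[][]>;
}

// Hypothetical test double: deterministic, offline, and NOT semantically meaningful.
class HashEmbedding implements EmbeddingProvider {
  readonly dimensions: number;

  constructor(dimensions = 8) {
    this.dimensions = dimensions;
  }

  async embed(text: string): Promise<number[]> {
    const vector: number[] = new Array(this.dimensions).fill(0);
    // Accumulate each character code into a slot chosen by its position.
    for (let i = 0; i < text.length; i++) {
      vector[i % this.dimensions] += text.charCodeAt(i);
    }
    // L2-normalize so vectors are comparable; an all-zero vector stays all zeros.
    const norm = Math.hypot(...vector) || 1;
    return vector.map((value) => value / norm);
  }

  async embedBatch(texts: string[]): Promise<number[][]> {
    return Promise.all(texts.map((text) => this.embed(text)));
  }
}
```

Because the same text always produces the same vector, a double like this keeps tests of indexing and retrieval code reproducible.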

OpenAI Embeddings

npm install openai

import { OpenAIEmbedding } from "@agentium/core";

const embedder = new OpenAIEmbedding({
  apiKey: process.env.OPENAI_API_KEY,  // optional, uses env var by default
  model: "text-embedding-3-small",      // optional, this is the default
});

const vector = await embedder.embed("Hello world");
console.log(vector.length); // 1536

const vectors = await embedder.embedBatch(["Hello", "World"]);
console.log(vectors.length); // 2
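
Vectors returned by embed and embedBatch are usually compared with cosine similarity. The following is a minimal sketch of that comparison on plain number arrays. The helper names cosineSimilarity and rankBySimilarity are illustrative, not Agentium APIs, and a vector store would normally do this work for you.

```typescript
// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat a zero vector as having no similarity to anything.
  return denominator === 0 ? 0 : dot / denominator;
}

// Return candidate indices ordered from most to least similar to the query.
function rankBySimilarity(query: number[], candidates: number[][]): number[] {
  return candidates
    .map((vector, index) => ({ index, score: cosineSimilarity(query, vector) }))
    .sort((x, y) => y.score - x.score)
    .map((entry) => entry.index);
}
```

The dimension check matters in practice: vectors from text-embedding-3-small (1536) and text-embedding-3-large (3072) cannot be compared with each other.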

Available Models

Model	Dimensions	Best For
`text-embedding-3-small`	1536	General use, cost-effective
`text-embedding-3-large`	3072	Higher accuracy
`text-embedding-ada-002`	1536	Legacy

apiKey (string)
OpenAI API key. Falls back to the OPENAI_API_KEY environment variable.

model (string, default: "text-embedding-3-small")
Embedding model name.

Google Embeddings

npm install @google/genai

import { GoogleEmbedding } from "@agentium/core";

const embedder = new GoogleEmbedding({
  apiKey: process.env.GOOGLE_API_KEY,  // optional, uses env var by default
  model: "text-embedding-004",         // optional, this is the default
});

const vector = await embedder.embed("Hello world");
const vectors = await embedder.embedBatch(["Hello", "World"]);

Available Models

Model	Dimensions	Modalities	Best For
`text-embedding-004`	768	Text	General text, cost-effective
`embedding-001`	768	Text	Legacy
`gemini-embedding-001`	3072	Text	Higher accuracy text-only
`gemini-embedding-2`	3072 (Matryoshka 128 - 3072)	Text + image + audio + video + PDF	Multimodal RAG and unified search

apiKey (string)
Google API key. Falls back to the GOOGLE_API_KEY environment variable.

model (string, default: "text-embedding-004")
Embedding model name. Switch to gemini-embedding-2 for multimodal input.

dimensions (number)
Override the output dimension. gemini-embedding-2 supports any value from 128 to 3072 (recommended: 768, 1536, or 3072) thanks to Matryoshka Representation Learning.

Multimodal Embeddings (Gemini Embedding 2)

gemini-embedding-2 maps text, images, audio, video, and PDFs into a single unified vector space. One call returns one aggregated 3072-dim vector for the whole input - useful when you want a single semantic key for a heterogeneous document.

Index an image with a caption

import {
  GoogleEmbedding,
  InMemoryVectorStore,
  partsFromFile,
} from "@agentium/core";

const embedder = new GoogleEmbedding({ model: "gemini-embedding-2" });
const store = new InMemoryVectorStore(embedder);

const imagePart = await partsFromFile("./photo.jpg");

await store.upsert("photos", {
  id: "photo-1",
  content: "Sunset over the Sahara",
  parts: [
    { type: "text", text: "Sunset over the Sahara" },
    imagePart,
  ],
  metadata: { tags: ["nature", "africa"] },
});

Search by image

const queryImage = await partsFromFile("./query.jpg");
const hits = await store.search("photos", [queryImage], { topK: 5 });
console.log(hits);

Supported modalities

Text -> { type: "text", text }
Image -> { type: "image", data, mimeType } (data is base64 or an HTTPS URL; supported MIME types: image/png, image/jpeg, image/gif, image/webp)
Audio -> { type: "audio", data, mimeType }
Video -> { type: "file", data, mimeType: "video/mp4" }
PDF -> { type: "file", data, mimeType: "application/pdf" }

Helpers:

partsFromFile(path, mimeType?) reads a local file, infers MIME from extension, and returns the right ContentPart.
fetchAsBase64(url) downloads an HTTPS URL and returns { data, mimeType }.
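
To make the mapping from file type to part shape concrete, here is a rough sketch of how a MIME type could be inferred from a file extension and turned into one of the shapes listed above. It is not the implementation of partsFromFile. The ContentPart type, the partFromBase64 helper, and the extension table (in particular the audio entries) are assumptions made for illustration.

```typescript
// Assumed shapes, modelled on the modality list above.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image"; data: string; mimeType: string }
  | { type: "audio"; data: string; mimeType: string }
  | { type: "file"; data: string; mimeType: string };

// Illustrative extension table; the audio entries are an assumption.
const MIME_BY_EXTENSION: Record<string, string> = {
  png: "image/png",
  jpg: "image/jpeg",
  jpeg: "image/jpeg",
  gif: "image/gif",
  webp: "image/webp",
  mp3: "audio/mpeg",
  wav: "audio/wav",
  mp4: "video/mp4",
  pdf: "application/pdf",
};

// Hypothetical helper: wrap already-encoded base64 data in the matching part shape.
function partFromBase64(fileName: string, data: string): ContentPart {
  const extension = (fileName.split(".").pop() ?? "").toLowerCase();
  const mimeType = MIME_BY_EXTENSION[extension];
  if (!mimeType) {
    throw new Error(`Unsupported file extension: ${extension}`);
  }
  if (mimeType.startsWith("image/")) return { type: "image", data, mimeType };
  if (mimeType.startsWith("audio/")) return { type: "audio", data, mimeType };
  // Video and PDF both travel as generic "file" parts.
  return { type: "file", data, mimeType };
}
```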

Output dimensions (Matryoshka)

gemini-embedding-2 is trained with Matryoshka Representation Learning, so you can store smaller vectors without retraining:

const compact = new GoogleEmbedding({
  model: "gemini-embedding-2",
  dimensions: 768,  // also supported: 128-3072; recommended 768 / 1536 / 3072
});
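
With Matryoshka-trained models, the leading components of a vector carry most of the information, so a shorter embedding is obtained by keeping the first k components and re-normalizing. The sketch below shows that idea on a plain array. Whether the dimensions option does this for you is not stated here, so treat truncateAndNormalize as a hypothetical helper for vectors you have already stored at full size.

```typescript
// Keep the first `dimensions` components and rescale to unit length.
function truncateAndNormalize(vector: number[], dimensions: number): number[] {
  if (!Number.isInteger(dimensions) || dimensions < 1 || dimensions > vector.length) {
    throw new RangeError(`dimensions must be an integer between 1 and ${vector.length}`);
  }
  const head = vector.slice(0, dimensions);
  const norm = Math.hypot(...head);
  // A zero vector cannot be normalized; return it unchanged.
  return norm === 0 ? head : head.map((value) => value / norm);
}
```

Re-normalizing matters because a truncated vector is generally no longer unit length, which would skew dot-product comparisons.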

Important: v1 and v2 vectors are NOT interchangeable

Embeddings produced by text-embedding-004 / gemini-embedding-001 live in a different semantic space than those produced by gemini-embedding-2. If you upgrade, re-index your collection - cosine similarity across the two spaces is meaningless.
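
One way to catch accidental mixing during a migration is to store the model name next to each vector and check it before comparing. This is a minimal sketch of that guard. TaggedVector and assertSameSpace are hypothetical names, not Agentium APIs.

```typescript
// A vector labelled with the model that produced it.
interface TaggedVector {
  model: string;
  values: number[];
}

// Throw if two vectors cannot be meaningfully compared.
function assertSameSpace(a: TaggedVector, b: TaggedVector): void {
  if (a.model !== b.model) {
    throw new Error(`Cannot compare vectors from ${a.model} and ${b.model}; re-index first`);
  }
  if (a.values.length !== b.values.length) {
    throw new Error(`Dimension mismatch: ${a.values.length} vs ${b.values.length}`);
  }
}
```

The model name could equally live in the metadata field of each stored document, so the check can run at query time.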

Limitations

One aggregated vector per call. If you need separate vectors per item, call embedMultimodal once per item (the Gemini Batch API for per-item vectors is on the roadmap).
Vector backends store text only. The content field and the embedding are persisted; the original parts are not. Put any structured payload you need to keep (image URL, file path, page number) in metadata.
Per-call media limits (enforced by Google): 6 images, 120s video, 180s audio, 6 PDF pages.
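
The first limitation above can be worked around with a small loop that calls embedMultimodal once per item. The sketch below assumes the ContentPart shapes from the modality list and redeclares only the part of the provider interface it needs. embedEach is a hypothetical helper, and calls run one after another to keep request volume predictable.

```typescript
// Assumed shapes, modelled on the modality list in this page.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image" | "audio" | "file"; data: string; mimeType: string };

// Only the optional multimodal method from EmbeddingProvider is needed here.
interface MultimodalCapable {
  embedMultimodal?(input: string | ContentPart | ContentPart[]): Promise<number[]>;
}

// Hypothetical helper: one vector per item instead of one aggregated vector.
async function embedEach(
  embedder: MultimodalCapable,
  items: ContentPart[],
): Promise<number[][]> {
  const embedOne = embedder.embedMultimodal;
  if (!embedOne) {
    throw new Error("This provider does not implement embedMultimodal");
  }
  const vectors: number[][] = [];
  for (const item of items) {
    // Sequential on purpose; each call embeds exactly one part.
    vectors.push(await embedOne.call(embedder, item));
  }
  return vectors;
}
```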

Using with KnowledgeBase

Embedding providers reach a KnowledgeBase through its vector store. Most vector stores accept an EmbeddingProvider in their configuration; otherwise, the KnowledgeBase handles embedding internally.

import { KnowledgeBase, InMemoryVectorStore, OpenAIEmbedding } from "@agentium/core";

const embedder = new OpenAIEmbedding();
// Pass the provider to the store, as in the multimodal example earlier on this page
const vectorStore = new InMemoryVectorStore(embedder);

const kb = new KnowledgeBase({
  name: "docs",
  vectorStore,
});

RAG Example

See a complete end-to-end RAG implementation using embeddings.
