EmbeddingProvider interface with OpenAI and Google implementations.
EmbeddingProvider Interface
OpenAI Embeddings
Available Models
| Model | Dimensions | Best For |
|---|---|---|
| text-embedding-3-small | 1536 | General use, cost-effective |
| text-embedding-3-large | 3072 | Higher accuracy |
| text-embedding-ada-002 | 1536 | Legacy |
- OpenAI API key. Falls back to the OPENAI_API_KEY env var.
- Embedding model name.
Google Embeddings
Available Models
| Model | Dimensions | Modalities | Best For |
|---|---|---|---|
| text-embedding-004 | 768 | Text | General text, cost-effective |
| embedding-001 | 768 | Text | Legacy |
| gemini-embedding-001 | 3072 | Text | Higher accuracy, text-only |
| gemini-embedding-2 | 3072 (Matryoshka, 128 to 3072) | Text + image + audio + video + PDF | Multimodal RAG and unified search |
- Google API key. Falls back to the GOOGLE_API_KEY env var.
- Embedding model name. Switch to gemini-embedding-2 for multimodal input.
- Override the output dimension. gemini-embedding-2 supports any value from 128 to 3072 (recommended: 768, 1536, or 3072) thanks to Matryoshka Representation Learning.
Multimodal Embeddings (Gemini Embedding 2)
gemini-embedding-2 maps text, images, audio, video, and PDFs into a single unified vector space. One call returns one aggregated 3072-dim vector for the whole input - useful when you want a single semantic key for a heterogeneous document.
Index an image with a caption
Search by image
Supported modalities
- Text -> { type: "text", text }
- Image -> { type: "image", data, mimeType } (base64 or HTTPS URL; supported MIME types: image/png, image/jpeg, image/gif, image/webp)
- Audio -> { type: "audio", data, mimeType }
- Video -> { type: "file", data, mimeType: "video/mp4" }
- PDF -> { type: "file", data, mimeType: "application/pdf" }
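The shapes listed above can be modelled as a discriminated union. The sketch below is an assumption reconstructed from that list, not the library's actual definition: the type names other than ContentPart, the string type of data, and the describePart helper are all hypothetical.

```typescript
// Hypothetical types inferred from the shapes listed above; the library's
// real ContentPart definition may differ.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image"; data: string; mimeType: string };
type AudioPart = { type: "audio"; data: string; mimeType: string };
type FilePart = { type: "file"; data: string; mimeType: string };
type ContentPart = TextPart | ImagePart | AudioPart | FilePart;

// Image MIME types named in the list above.
const IMAGE_MIME_TYPES = ["image/png", "image/jpeg", "image/gif", "image/webp"];

// Returns a short label for logging, narrowing on the `type` discriminant.
function describePart(part: ContentPart): string {
  switch (part.type) {
    case "text":
      return `text (${part.text.length} chars)`;
    case "image":
      return IMAGE_MIME_TYPES.indexOf(part.mimeType) !== -1
        ? `image (${part.mimeType})`
        : `image (unsupported MIME: ${part.mimeType})`;
    case "audio":
      return `audio (${part.mimeType})`;
    case "file":
      return `file (${part.mimeType})`;
  }
}

// A caption plus an image, expressed as two parts of one input.
const exampleParts: ContentPart[] = [
  { type: "text", text: "A red bicycle leaning against a wall" },
  { type: "image", data: "<base64 or HTTPS URL>", mimeType: "image/png" },
];
```

Switching on the type field lets the compiler check that every modality is handled, which helps if more part kinds are added later.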
partsFromFile(path, mimeType?) reads a local file, infers the MIME type from the extension, and returns the right ContentPart. fetchAsBase64(url) downloads an HTTPS URL and returns { data, mimeType }.
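To illustrate the "infers MIME from extension" step, here is a minimal sketch of how such a lookup could work. It is not the library's implementation: inferMimeType and the MIME_BY_EXTENSION table are hypothetical, and the extensions partsFromFile really recognizes may differ.

```typescript
// Hypothetical extension-to-MIME table; illustrative only.
const MIME_BY_EXTENSION: Record<string, string> = {
  png: "image/png",
  jpg: "image/jpeg",
  jpeg: "image/jpeg",
  gif: "image/gif",
  webp: "image/webp",
  mp3: "audio/mpeg",
  mp4: "video/mp4",
  pdf: "application/pdf",
};

// Returns the explicit override when given, otherwise looks up the
// lower-cased file extension. Throws when the type cannot be inferred.
function inferMimeType(path: string, override?: string): string {
  if (override) return override;
  const dot = path.lastIndexOf(".");
  const ext = dot === -1 ? "" : path.slice(dot + 1).toLowerCase();
  const mime = MIME_BY_EXTENSION[ext];
  if (!mime) {
    throw new Error(`Cannot infer MIME type for "${path}"; pass one explicitly`);
  }
  return mime;
}
```

Failing loudly on an unknown extension, rather than guessing a default, mirrors the optional mimeType? argument: the caller can always supply the type when inference is not possible.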
Output dimensions (Matryoshka)
gemini-embedding-2 is trained with Matryoshka Representation Learning, so you can store smaller vectors without retraining.
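With Matryoshka-style embeddings, the leading components carry most of the signal, so a common client-side technique is to keep the first k components and re-normalize to unit length. The sketch below shows that general technique only; truncateEmbedding is a hypothetical helper, not part of the library, and requesting the smaller size through the provider's output dimension override is the alternative that avoids storing the full vector at all.

```typescript
// Keeps the first `dims` components and rescales them to unit L2 norm,
// so cosine and dot-product scores stay comparable after truncation.
function truncateEmbedding(vector: number[], dims: number): number[] {
  if (dims % 1 !== 0 || dims < 1 || dims > vector.length) {
    throw new RangeError(`dims must be an integer in [1, ${vector.length}]`);
  }
  const head = vector.slice(0, dims);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  // An all-zero prefix cannot be normalized; return it unchanged.
  return norm === 0 ? head : head.map((x) => x / norm);
}

// Toy 4-dim vector standing in for a 3072-dim embedding.
const fullVector = [3, 4, 12, 84];
const smallVector = truncateEmbedding(fullVector, 2); // direction of [3, 4]
```

All vectors in one collection should use the same truncated size; mixing lengths in a single index will fail or produce meaningless scores.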
Important: v1 and v2 vectors are NOT interchangeable
Embeddings produced by text-embedding-004 / gemini-embedding-001 live in a different semantic space than those produced by gemini-embedding-2. If you upgrade, re-index your collection - cosine similarity across the two spaces is meaningless.
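One way to guard against accidental cross-model comparisons is to store the model name next to each vector and refuse to score mismatched pairs. This is a sketch under that assumption: the StoredVector shape and the cosineSimilarity helper are hypothetical, not something the library provides.

```typescript
// Hypothetical record: a vector tagged with the model that produced it.
interface StoredVector {
  model: string;
  values: number[];
}

// Cosine similarity that rejects vectors from different embedding models
// or of different lengths instead of returning a meaningless score.
function cosineSimilarity(a: StoredVector, b: StoredVector): number {
  if (a.model !== b.model) {
    throw new Error(`Model mismatch: ${a.model} vs ${b.model}; re-index first`);
  }
  if (a.values.length !== b.values.length) {
    throw new Error("Vectors must have the same number of dimensions");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.values.length; i++) {
    dot += a.values[i] * b.values[i];
    normA += a.values[i] * a.values[i];
    normB += b.values[i] * b.values[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  // Zero vectors have no direction; treat them as unrelated.
  return denominator === 0 ? 0 : dot / denominator;
}
```

Keeping the model name in each record's metadata also makes it easy to find which entries still need re-indexing after an upgrade.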
Limitations
- One aggregated vector per call. If you need separate vectors per item, call embedMultimodal once per item (the Gemini Batch API for per-item vectors is on the roadmap).
- Vector backends store text only. The content field and the embedding are persisted; the original parts are not. Put any structured payload you need to keep (image URL, file path, page number) in metadata.
- Per-call media limits (enforced by Google): 6 images, 120s video, 180s audio, 6 PDF pages.
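Since the per-call media limits are enforced on Google's side, a local pre-flight check can surface violations before a request is sent. The sketch below only encodes the numbers listed above; the MediaUsage shape and findLimitViolations are hypothetical, and how you measure durations or page counts is left to the caller.

```typescript
// Per-call limits as listed above (enforced by Google, mirrored locally).
const MEDIA_LIMITS = {
  images: 6,
  videoSeconds: 120,
  audioSeconds: 180,
  pdfPages: 6,
};

// Hypothetical summary of what a single call would send.
interface MediaUsage {
  images: number;
  videoSeconds: number;
  audioSeconds: number;
  pdfPages: number;
}

// Returns one message per exceeded limit; an empty array means the call
// is within the documented limits.
function findLimitViolations(usage: MediaUsage): string[] {
  const keys: (keyof MediaUsage)[] = [
    "images",
    "videoSeconds",
    "audioSeconds",
    "pdfPages",
  ];
  return keys
    .filter((key) => usage[key] > MEDIA_LIMITS[key])
    .map((key) => `${key}: ${usage[key]} exceeds limit of ${MEDIA_LIMITS[key]}`);
}
```

When the check fails, splitting the input across several calls also yields one vector per call, which fits the one-aggregated-vector-per-call behaviour described in the first limitation.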
Using with KnowledgeBase
Embedding providers are passed to KnowledgeBase via the vector store. Most vector stores accept an EmbeddingProvider in their configuration, or the KnowledgeBase handles embedding internally.
RAG Example
See a complete end-to-end RAG implementation using embeddings.