Milvus vector store configuration options

Last update: May 18, 2026

This document describes the settings available when Milvus is selected as the vector store provider, that is, the system where embedding vectors and metadata are stored and searched. Milvus does not produce embeddings itself: you configure the embedding model separately (for example under AI Services > Embedding models, or via Source / Provider in the embedding section of your feature).

Cross-check field names and defaults against your product build.

Global behavior notes

Dimension must match embeddings: The dimension you set here must exactly match the output size of the embedding model used for this workflow. If they differ, inserts or searches will fail or return incorrect results.
Secrets: API key, username, and password grant access to your Milvus or Zilliz cluster. Store configurations securely and restrict who can view or export them.
Cloud vs local: Milvus Cloud and Zilliz Cloud typically require an API key. For a local Milvus instance without authentication, you can leave the API key, username, and password fields empty.
Timeouts: Values in Connection settings and Retry settings are in milliseconds unless your UI states otherwise.
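
The dimension rule above can be checked before any vector reaches the store. The following is a minimal sketch in plain Python, not part of the product: the function name `check_dimension` and its parameters are hypothetical, and it only compares the length of an embedding with the configured Dimension value.

```python
from typing import Sequence


def check_dimension(vector: Sequence[float], configured_dimension: int) -> None:
    """Raise ValueError if the vector length differs from the configured Dimension.

    Hypothetical helper for illustration; the product may perform its own check.
    """
    if configured_dimension <= 0:
        raise ValueError("configured dimension must be a positive integer")
    if len(vector) != configured_dimension:
        raise ValueError(
            f"embedding has {len(vector)} values, "
            f"but the vector store is configured for {configured_dimension}"
        )


# Example: a 384-value embedding against a store configured for 384 passes silently.
check_dimension([0.0] * 384, 384)
```

Running a check like this once per embedding model change is usually enough to catch a mismatch early, rather than discovering it through failed inserts.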

Basic configuration

Field	Description
Provider	Milvus. The selected vector store provider for this configuration.
URL	Required. Endpoint for your Milvus instance (for example `http://localhost:19530` for a local default). For Milvus Cloud or Zilliz Cloud, use the connection URI from your cluster or project details page.
Dimension	Required. Length of each embedding vector stored in this collection. Must match your embedding model’s output dimension. Common examples include 1536 (for example OpenAI `text-embedding-3-small` at default size) and 384 (for example All-MiniLM-style models). Use the value your embedding pipeline actually produces.
Collection name	Optional with a product default (often `default`). Logical collection that holds vectors and schema. If the collection does not exist, the product may create it using this dimension and index settings. Confirm auto-create behavior in your release notes.
Database name	Optional with a product default (often `default`). Milvus database name for namespace isolation. If omitted, the integration typically uses the default database.
Metric type	Distance metric for similarity search (for example Cosine, L2 (Euclidean), IP (inner product)). Choose a metric that matches how your embeddings were trained or normalized; cosine is common for normalized embedding vectors.
Index type	Vector index algorithm. Typical options include HNSW (strong query performance, higher memory), IVF_FLAT (balance of speed and memory), IVF_SQ8 (more memory-efficient), FLAT (exact search; slower at large scale), and AUTOINDEX (automatic choice where supported). Pick based on dataset size, latency budget, and memory.
API key	Optional for local Milvus without auth. For Milvus Cloud or Zilliz Cloud, usually required. Leave blank only when your deployment does not use token authentication.
Username	Optional. User name for basic authentication when the Milvus endpoint requires user/password auth. Not used when you authenticate only with an API key.
Password	Optional. Password paired with username for basic authentication. Treat stored passwords as secrets.
Top K	Optional. Maximum number of hits to return from a similarity search. Must be a positive integer when set. If empty, the product uses its default top K.
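Min score	Optional. Minimum similarity score for a hit to be returned; results below the threshold are dropped. The meaningful range depends on the metric type: cosine similarity falls between -1 and 1, IP is unbounded, and L2 is a distance where smaller means closer. Some integrations normalize scores to a range of 0 to 1, so confirm how your product reports scores before choosing a threshold. Leave empty to return all top-K results without a score cutoff.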
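
To illustrate how Top K and Min score interact, here is a small brute-force sketch in plain Python that ranks stored vectors by cosine similarity, keeps the best `top_k` hits, and then drops hits below `min_score`. It is an assumption-laden illustration only: the function names and the in-memory `stored` dictionary are hypothetical, and a real Milvus search uses the configured index rather than a full scan.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(
    query: Sequence[float],
    stored: Dict[str, Sequence[float]],
    top_k: int = 3,
    min_score: Optional[float] = None,
) -> List[Tuple[str, float]]:
    """Return up to top_k (id, score) hits, best first, filtered by min_score."""
    if top_k <= 0:
        raise ValueError("top_k must be a positive integer")
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in stored.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    hits = scored[:top_k]
    if min_score is not None:
        # Hits below the threshold are dropped, so fewer than top_k may remain.
        hits = [hit for hit in hits if hit[1] >= min_score]
    return hits


# Hypothetical data: three 2-dimensional vectors.
stored = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
print(search([1.0, 0.0], stored, top_k=2))
print(search([1.0, 0.0], stored, top_k=2, min_score=0.8))
```

In this sketch the score cutoff is applied after the top-K cut, so a strict Min score can return fewer hits than Top K, or none at all.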

---

Connection settings

These settings tune how the gRPC/HTTP client communicates with the Milvus endpoint.

Field	Description
Call timeout (ms)	Maximum time to wait for a single RPC or request to finish. Default is often 60000 ms (60 seconds).
Connection timeout (ms)	Maximum time to wait while establishing a connection. Default is often 20000 ms (20 seconds).
Keep alive	When enabled, keeps connections warm (for example HTTP/2 keep-alive), which can reduce latency on steady workloads. Default is often on.
Keep alive time (ms)	Interval between keep-alive pings when the connection is idle. Default is often 30000 ms (30 seconds).
Keep alive timeout (ms)	How long to wait for a keep-alive ping acknowledgment. Default is often 5000 ms (5 seconds).
Idle timeout (ms)	How long a connection may stay idle before the client closes it. Default is often 600000 ms (10 minutes).
Deadline (ms)	Per-call gRPC deadline. When set above 0, some clients enable waitForReady so calls queue on broken channels instead of failing immediately. 0 usually means disabled. Default is often 0.
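
As a sketch of how these values fit together, the following Python dataclass mirrors the fields in the table and uses the "often" defaults listed there. The class name, attribute names, and helper methods are hypothetical and are not the product's API; verify actual defaults against your build.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionSettings:
    """Hypothetical container for the connection fields; all times in milliseconds."""

    call_timeout_ms: int = 60_000
    connection_timeout_ms: int = 20_000
    keep_alive: bool = True
    keep_alive_time_ms: int = 30_000
    keep_alive_timeout_ms: int = 5_000
    idle_timeout_ms: int = 600_000
    deadline_ms: int = 0  # 0 is treated as "disabled" in this sketch

    def validate(self) -> None:
        """Reject negative durations in any millisecond field."""
        for name, value in vars(self).items():
            if name.endswith("_ms") and value < 0:
                raise ValueError(f"{name} must not be negative, got {value}")

    def deadline_enabled(self) -> bool:
        """A per-call deadline applies only when the value is above 0."""
        return self.deadline_ms > 0

    def call_timeout_seconds(self) -> float:
        """Convert the call timeout for clients that expect seconds."""
        return self.call_timeout_ms / 1000


settings = ConnectionSettings()
settings.validate()
print(settings.call_timeout_seconds())
```

Keeping the unit in each attribute name is a deliberate choice here: many client libraries take timeouts in seconds, and an explicit conversion helper makes a thousandfold mistake harder to miss.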

---

Retry settings

Field	Description
Max retries	How many times to retry after a failed request. Default is often 3.
Retry on rate limit	When enabled, retries when the server signals rate limiting. Default is often on.
Initial backoff (ms)	Delay before the first retry. Default is often 200 ms.
Max backoff (ms)	Upper cap on delay between retries. Default is often 1000 ms (1 second).
Backoff multiplier	Factor for exponential backoff between attempts. Default is often 2.
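
The retry fields combine into a delay schedule. The sketch below, in plain Python, computes that schedule under the assumption that each delay is the previous one multiplied by the backoff multiplier and capped at the max backoff; the function name is hypothetical, and a real client may also add random jitter, which is not modeled here.

```python
from typing import List


def backoff_delays_ms(
    max_retries: int = 3,
    initial_backoff_ms: float = 200,
    max_backoff_ms: float = 1000,
    multiplier: float = 2.0,
) -> List[float]:
    """Return the delay before each retry, in milliseconds."""
    delays = []
    delay = float(initial_backoff_ms)
    for _ in range(max_retries):
        # Cap every delay at the configured maximum.
        delays.append(min(delay, float(max_backoff_ms)))
        delay *= multiplier
    return delays


print(backoff_delays_ms())               # [200.0, 400.0, 800.0]
print(backoff_delays_ms(max_retries=5))  # [200.0, 400.0, 800.0, 1000.0, 1000.0]
```

With the defaults listed in the table, three retries add at most 1400 ms of waiting in total, on top of the time spent in the failed calls themselves.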

---

Embedding model selection (related UI)

When the product shows an Embedding model section with Source and Provider:

Field	Description
Source	One of three options. None: skip embedding configuration for this flow (only if valid for your use case). Pre-configured: use an embedding profile managed under AI Services > Embedding models. Custom: define or override embedding settings inline in this screen.
Provider	After you choose a source, pick the embedding provider (OpenAI, Mistral, Ollama, and so on). That choice drives which embedding API runs; Milvus settings above define where those vectors are stored and queried.

Manage long-lived embedding definitions in AI Services > Embedding models when you want reuse and central updates.
