Managed + PostgreSQL

In managed mode with PostgreSQL, Embedd reads data from your PostgreSQL database, generates embeddings via your chosen provider, and stores vectors in Embedd's built-in Qdrant instance. This guide walks through setup end to end.

You'll need:

  • A PostgreSQL database accessible from the internet (or peered network)
  • An embedding provider API key (OpenAI, Gemini, or Voyage)
  • An Embedd API key

Step 1: Create a Connection

Register your PostgreSQL database so Embedd can read source rows.

curl -X POST https://api.embedd.to/v1/providers/postgresql/connections \
-H "Authorization: Bearer sk_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"name": "product-db",
"mode": "managed",
"credentials": {
"host": "your-db-host.com",
"port": 5432,
"database": "myapp",
"user": "embedd_reader",
"password": "your_password"
}
}'

Response:

{
"id": "conn_abc123",
"name": "product-db",
"provider": "postgresql",
"mode": "managed",
"status": "created",
"created_at": "2026-03-13T10:00:00Z"
}

You can also pass "ssl_mode": "require" inside credentials if your database requires SSL connections (the default is "prefer").
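The same request can be issued programmatically. Below is a minimal Python sketch using only the standard library; the payload fields mirror the curl call above, but the helper names and structure are illustrative, not part of any official SDK:

```python
import json
import urllib.request

API_BASE = "https://api.embedd.to/v1"

def connection_payload(name, host, database, user, password,
                       port=5432, ssl_mode="prefer"):
    """Build the managed-mode PostgreSQL connection payload."""
    return {
        "name": name,
        "mode": "managed",
        "credentials": {
            "host": host,
            "port": port,
            "database": database,
            "user": user,
            "password": password,
            "ssl_mode": ssl_mode,
        },
    }

def create_connection(api_key, payload):
    """POST the payload to the connections endpoint (network call)."""
    req = urllib.request.Request(
        f"{API_BASE}/providers/postgresql/connections",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    payload = connection_payload(
        "product-db", "your-db-host.com", "myapp",
        "embedd_reader", "your_password", ssl_mode="require")
    print(json.dumps(payload, indent=2))
```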

Minimum permissions

The database user only needs read access to the source tables. Grant the minimum required (note that in PostgreSQL, SELECT on a table also requires USAGE on its schema, which the public schema grants to all roles by default):

GRANT SELECT ON TABLE public.products TO embedd_reader;

Step 2: Test the Connection

Verify that Embedd can reach your database before proceeding.

curl -X POST https://api.embedd.to/v1/connections/conn_abc123/test \
-H "Authorization: Bearer sk_your_api_key"

Response:

{
"status": "ok",
"latency_ms": 42
}

If the test fails, check:

  • Firewall rules — ensure Embedd's IPs can reach your database host and port.
  • Credentials — confirm the username, password, and database name are correct.
  • SSL — if your database requires SSL, set ssl_mode to require in the connection credentials.
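If the test keeps failing, it can help to rule Embedd out by reproducing the connection yourself. A sketch using the third-party psycopg2 package (the credential dict is the one from Step 1; note libpq's keyword names differ slightly from Embedd's field names):

```python
def libpq_kwargs(creds):
    """Map Embedd credential fields to libpq/psycopg2 keyword names."""
    return {
        "host": creds["host"],
        "port": creds.get("port", 5432),
        "dbname": creds["database"],  # libpq calls this "dbname"
        "user": creds["user"],
        "password": creds["password"],
        "sslmode": creds.get("ssl_mode", "prefer"),
        "connect_timeout": 5,
    }

def check_reachable(creds):
    """Try a direct connection; print the error on failure."""
    import psycopg2  # pip install psycopg2-binary
    try:
        psycopg2.connect(**libpq_kwargs(creds)).close()
        return True
    except psycopg2.OperationalError as exc:
        print(f"connection failed: {exc}")
        return False
```

If check_reachable succeeds from your network but the Embedd test endpoint fails, the issue is almost certainly a firewall rule blocking Embedd's IPs.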

Step 3: Configure an Embedding Provider

Tell Embedd which provider and model to use for generating embeddings.

curl -X POST https://api.embedd.to/v1/embedding-providers \
-H "Authorization: Bearer sk_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"name": "openai-prod",
"provider": "openai",
"api_key": "sk-proj-your-openai-key",
"default_model": "text-embedding-3-small"
}'

Response:

{
"id": "emb_xyz789",
"name": "openai-prod",
"provider": "openai",
"default_model": "text-embedding-3-small",
"created_at": "2026-03-13T10:01:00Z"
}

Available providers and models

Provider   Model                    Dimensions
openai     text-embedding-3-small   1536
openai     text-embedding-3-large   3072
openai     text-embedding-ada-002   1536
gemini     gemini-embedding-001     3072
gemini     text-embedding-005       768
gemini     text-embedding-004       768
voyage     voyage-3.5               1024
voyage     voyage-3.5-lite          512
voyage     voyage-code-3            1024
voyage     voyage-3-large           1024
voyage     voyage-3-lite            512

Step 4: Create a Vector Table

A vector table maps source columns to embedding and metadata roles. Embedd uses this configuration to read your data, generate embeddings, and store them.

curl -X POST https://api.embedd.to/v1/vector-tables \
-H "Authorization: Bearer sk_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"name": "products-search",
"connection_id": "conn_abc123",
"embedding_provider_id": "emb_xyz789",
"source_table": "public.products",
"primary_key_column": "id",
"embedding_model": "text-embedding-3-small",
"embedding_dimensions": 1536,
"columns": [
{"name": "name", "role": "embedding", "ordinal": 1, "name_prefix": "Product: "},
{"name": "description", "role": "embedding", "ordinal": 2, "name_prefix": "Description: "},
{"name": "category", "role": "metadata", "filter_type": "keyword"},
{"name": "price", "role": "metadata", "filter_type": "float"},
{"name": "in_stock", "role": "metadata", "filter_type": "boolean"}
]
}'

Response:

{
"id": "vt_abc123",
"name": "products-search",
"connection_id": "conn_abc123",
"embedding_provider_id": "emb_xyz789",
"source_table": "public.products",
"mode": "managed",
"sync_status": "pending",
"embedding_model": "text-embedding-3-small",
"embedding_dimensions": 1536,
"columns": [
{"name": "name", "role": "embedding", "ordinal": 1, "name_prefix": "Product: "},
{"name": "description", "role": "embedding", "ordinal": 2, "name_prefix": "Description: "},
{"name": "category", "role": "metadata", "filter_type": "keyword"},
{"name": "price", "role": "metadata", "filter_type": "float"},
{"name": "in_stock", "role": "metadata", "filter_type": "boolean"}
],
"created_at": "2026-03-13T10:02:00Z"
}

The sync_status starts as "pending" — no data has been indexed yet.

Columns with role: "embedding" are concatenated (in ordinal order, prefixed by name_prefix) into a single text string that gets embedded. Columns with role: "metadata" are stored alongside each vector and can be used for filtering at query time.
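The concatenation rule can be sketched as follows. This is an illustration of the described behavior, not Embedd's implementation; in particular, the separator between columns is an assumption (a newline is used here):

```python
def embedding_text(columns, row, sep="\n"):
    """Join role="embedding" columns in ordinal order, each value
    prefixed by its name_prefix (separator is an assumption)."""
    parts = []
    for col in sorted(
            (c for c in columns if c["role"] == "embedding"),
            key=lambda c: c["ordinal"]):
        value = row.get(col["name"]) or ""
        parts.append(col.get("name_prefix", "") + str(value))
    return sep.join(parts)

columns = [
    {"name": "description", "role": "embedding", "ordinal": 2,
     "name_prefix": "Description: "},
    {"name": "name", "role": "embedding", "ordinal": 1,
     "name_prefix": "Product: "},
    {"name": "category", "role": "metadata", "filter_type": "keyword"},
]
row = {"name": "Alpine Fleece Jacket",
       "description": "Lightweight fleece jacket",
       "category": "outerwear"}
print(embedding_text(columns, row))
# Product: Alpine Fleece Jacket
# Description: Lightweight fleece jacket
```

Note that the ordinal, not the order of the columns array, determines concatenation order, and metadata columns never contribute to the embedded text.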

Tier limits

In managed mode, vector table creation is subject to your organization's max_tables tier limit. See Subscription Tiers for details.


Step 5: Trigger Backfill

Kick off the initial backfill to index all existing rows from your source table.

curl -X POST https://api.embedd.to/v1/vector-tables/vt_abc123/backfill \
-H "Authorization: Bearer sk_your_api_key"

Response:

{
"task_id": "task_def456",
"task_type": "backfill",
"target_id": "vt_abc123",
"status": "pending",
"created_at": "2026-03-13T10:03:00Z"
}

Check sync status to track progress:

curl https://api.embedd.to/v1/vector-tables/vt_abc123/sync/status \
-H "Authorization: Bearer sk_your_api_key"

Response (once complete):

{
"sync_status": "synced",
"synced_rows": 12450,
"total_rows": 12450,
"last_synced_at": "2026-03-13T10:08:00Z"
}

When synced_rows matches total_rows, all your data has been embedded and is ready to query.
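A simple way to wait for that condition is to poll the status endpoint. The sketch below captures the completeness rule; the fetch_status callable stands in for the GET request shown above, and the polling cadence is illustrative:

```python
import time

def is_synced(status):
    """Backfill is complete when every source row has been embedded."""
    return (status["sync_status"] == "synced"
            and status["synced_rows"] == status["total_rows"])

def wait_for_sync(fetch_status, poll_seconds=10, timeout_seconds=600):
    """Poll until synced. fetch_status is any callable returning the
    JSON body of GET /vector-tables/{id}/sync/status."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        status = fetch_status()
        if is_synced(status):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("backfill did not finish in time")
```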


Step 6: Query

Run a semantic search with optional metadata filters.

curl -X POST https://api.embedd.to/v1/vector-tables/vt_abc123/query \
-H "Authorization: Bearer sk_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"query": "something warm for hiking",
"limit": 5,
"filters": {
"in_stock": {"eq": true},
"price": {"lte": 150}
}
}'

Response:

{
"results": [
{
"id": "4821",
"score": 0.892,
"metadata": {
"name": "Alpine Fleece Jacket",
"description": "Lightweight fleece jacket with wind-resistant outer layer",
"category": "outerwear",
"price": 129.99,
"in_stock": true
}
},
{
"id": "7733",
"score": 0.871,
"metadata": {
"name": "Merino Wool Base Layer",
"description": "Moisture-wicking merino wool top for cold-weather hiking",
"category": "base-layers",
"price": 89.00,
"in_stock": true
}
}
]
}

Filter operators use plain names like eq, lte, gte, ne — no $ prefix. See Filters for the full list of supported operators and types.
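The operator semantics can be illustrated with a small local evaluator. This is a sketch of how eq, ne, lte, and gte behave against row metadata, not Embedd's server-side implementation, and it covers only these four operators:

```python
OPS = {
    "eq":  lambda v, arg: v == arg,
    "ne":  lambda v, arg: v != arg,
    "lte": lambda v, arg: v <= arg,
    "gte": lambda v, arg: v >= arg,
}

def matches(metadata, filters):
    """True if metadata satisfies every {field: {op: arg}} condition."""
    return all(
        OPS[op](metadata[field], arg)
        for field, conds in filters.items()
        for op, arg in conds.items()
    )

row = {"category": "outerwear", "price": 129.99, "in_stock": True}
filters = {"in_stock": {"eq": True}, "price": {"lte": 150}}
print(matches(row, filters))  # True
```

All conditions are combined with AND, matching the query example above: a result must be in stock AND priced at 150 or less.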


Step 7: Monitor Sync

After the initial backfill, Embedd automatically keeps vectors in sync with your source table. Inserts, updates, and deletes in PostgreSQL are detected and reflected in the vector store.

Check sync status:

curl https://api.embedd.to/v1/vector-tables/vt_abc123/sync/status \
-H "Authorization: Bearer sk_your_api_key"

Pause sync:

curl -X POST https://api.embedd.to/v1/vector-tables/vt_abc123/sync/pause \
-H "Authorization: Bearer sk_your_api_key"

Resume sync:

curl -X POST https://api.embedd.to/v1/vector-tables/vt_abc123/sync/resume \
-H "Authorization: Bearer sk_your_api_key"

See Sync & Backfill for details on how sync works, polling intervals, and re-backfill behavior.