Platform + PostgreSQL
In platform mode with PostgreSQL, Embedd reads data from your PostgreSQL database, generates embeddings via your chosen provider, and writes vectors back to your PostgreSQL database using pgvector. Vectors live in your infrastructure — Embedd never stores them.
You'll need:
- PostgreSQL 14+ with the pgvector extension installed
- The `vector` extension created in the target schema: `CREATE EXTENSION IF NOT EXISTS vector;`
- An embedding provider API key (OpenAI, Gemini, or Voyage) — unlike Snowflake platform mode, PostgreSQL has no native embedding engine
- An Embedd API key
Step 1: Create a Connection
Register your PostgreSQL database in platform mode. This tells Embedd where to read source rows and where to write vectors.
curl -X POST https://api.embedd.to/v1/providers/postgresql/connections \
-H "Authorization: Bearer sk_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"name": "product-db",
"mode": "platform",
"credentials": {
"host": "your-db-host.com",
"port": 5432,
"database": "myapp",
"user": "embedd_user",
"password": "your_password"
}
}'
Response:
{
"id": "conn_abc123",
"name": "product-db",
"provider": "postgresql",
"mode": "platform",
"status": "created",
"created_at": "2026-03-13T10:00:00Z"
}
Platform mode needs more than read access. The user must be able to create tables and indexes for vector storage:
GRANT SELECT ON TABLE public.products TO embedd_user;
GRANT CREATE ON SCHEMA public TO embedd_user;
Step 2: Test the Connection
Verify that Embedd can reach your database before proceeding.
curl -X POST https://api.embedd.to/v1/connections/conn_abc123/test \
-H "Authorization: Bearer sk_your_api_key"
Response:
{
"status": "ok",
"latency_ms": 42
}
If the test fails, check:
- Firewall rules — ensure Embedd's IPs can reach your database host and port.
- Credentials — confirm the username, password, and database name are correct.
- SSL — if your database requires SSL, set `ssl_mode` to `require` in the connection credentials.
- pgvector — confirm the extension is installed:
SELECT * FROM pg_extension WHERE extname = 'vector';
Step 3: Configure an Embedding Provider
PostgreSQL has no native embedding engine, so an embedding provider is required for platform mode. Tell Embedd which provider and model to use.
curl -X POST https://api.embedd.to/v1/embedding-providers \
-H "Authorization: Bearer sk_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"name": "openai-prod",
"provider": "openai",
"api_key": "sk-proj-your-openai-key",
"default_model": "text-embedding-3-small"
}'
Response:
{
"id": "emb_xyz789",
"name": "openai-prod",
"provider": "openai",
"default_model": "text-embedding-3-small",
"created_at": "2026-03-13T10:01:00Z"
}
Available providers and models
| Provider | Model | Dimensions |
|---|---|---|
| openai | text-embedding-3-small | 1536 |
| openai | text-embedding-3-large | 3072 |
| openai | text-embedding-ada-002 | 1536 |
| gemini | gemini-embedding-001 | 3072 |
| gemini | text-embedding-005 | 768 |
| gemini | text-embedding-004 | 768 |
| voyage | voyage-3.5 | 1024 |
| voyage | voyage-3.5-lite | 512 |
| voyage | voyage-code-3 | 1024 |
| voyage | voyage-3-large | 1024 |
| voyage | voyage-3-lite | 512 |
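If you build requests programmatically, it can help to validate the model/dimension pairing client-side before creating a vector table, since a mismatch would produce an unusable pgvector column. A minimal sketch (the mapping is transcribed from the table above; the helper name is ours):

```python
# Model -> output dimensions, transcribed from the provider table above.
MODEL_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
    "gemini-embedding-001": 3072,
    "text-embedding-005": 768,
    "text-embedding-004": 768,
    "voyage-3.5": 1024,
    "voyage-3.5-lite": 512,
    "voyage-code-3": 1024,
    "voyage-3-large": 1024,
    "voyage-3-lite": 512,
}


def check_dimensions(model: str, dimensions: int) -> bool:
    """Return True if `dimensions` matches the model's known output size."""
    return MODEL_DIMENSIONS.get(model) == dimensions
```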
Step 4: Create a Vector Table
A vector table maps source columns to embedding and metadata roles. In platform mode, Embedd creates a physical table in your PostgreSQL database to store the vectors.
curl -X POST https://api.embedd.to/v1/vector-tables \
-H "Authorization: Bearer sk_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"name": "products-search",
"connection_id": "conn_abc123",
"embedding_provider_id": "emb_xyz789",
"source_table": "public.products",
"primary_key_column": "id",
"embedding_model": "text-embedding-3-small",
"embedding_dimensions": 1536,
"mode": "platform",
"columns": [
{"name": "name", "role": "embedding", "ordinal": 1, "name_prefix": "Product: "},
{"name": "description", "role": "embedding", "ordinal": 2, "name_prefix": "Description: "},
{"name": "category", "role": "metadata", "filter_type": "keyword"},
{"name": "price", "role": "metadata", "filter_type": "float"},
{"name": "in_stock", "role": "metadata", "filter_type": "boolean"}
]
}'
Response:
{
"id": "vt_abc123",
"name": "products-search",
"connection_id": "conn_abc123",
"embedding_provider_id": "emb_xyz789",
"source_table": "public.products",
"mode": "platform",
"sync_status": "pending",
"embedding_model": "text-embedding-3-small",
"embedding_dimensions": 1536,
"platform_vector_ref": "embedd_vt_a1b2c3d4_products_search",
"columns": [
{"name": "name", "role": "embedding", "ordinal": 1, "name_prefix": "Product: "},
{"name": "description", "role": "embedding", "ordinal": 2, "name_prefix": "Description: "},
{"name": "category", "role": "metadata", "filter_type": "keyword"},
{"name": "price", "role": "metadata", "filter_type": "float"},
{"name": "in_stock", "role": "metadata", "filter_type": "boolean"}
],
"created_at": "2026-03-13T10:02:00Z"
}
Embedd creates a table in your database with this schema:
CREATE TABLE embedd_vt_a1b2c3d4_products_search (
pk_value TEXT PRIMARY KEY,
embedding vector(1536),
embedded_text TEXT,
metadata JSONB DEFAULT '{}',
row_hash TEXT
);
CREATE INDEX ON embedd_vt_a1b2c3d4_products_search USING hnsw (embedding vector_cosine_ops);
The embedding_provider_id is required for PostgreSQL platform mode. Without it, the request will fail.
Platform mode is not subject to max_tables or max_vectors tier limits. Vectors live in your database, so usage is bounded only by your own infrastructure.
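The `columns` config determines what text gets embedded: embedding-role columns are ordered by `ordinal`, and each value is prefixed with its `name_prefix` before being sent to the embedding provider. As an illustration of that mapping (the delimiter between column values is not documented here; this sketch assumes a newline):

```python
def compose_embedded_text(row: dict, columns: list) -> str:
    """Illustrative sketch: build the text embedded for one source row.
    Embedding-role columns are sorted by `ordinal` and prefixed with
    `name_prefix`. The newline delimiter is an assumption, not
    documented Embedd behavior."""
    parts = []
    for col in sorted(
        (c for c in columns if c["role"] == "embedding"),
        key=lambda c: c["ordinal"],
    ):
        parts.append(col.get("name_prefix", "") + str(row[col["name"]]))
    return "\n".join(parts)


columns = [
    {"name": "name", "role": "embedding", "ordinal": 1, "name_prefix": "Product: "},
    {"name": "description", "role": "embedding", "ordinal": 2, "name_prefix": "Description: "},
    {"name": "category", "role": "metadata", "filter_type": "keyword"},
]
row = {"name": "Alpine Fleece Jacket", "description": "Lightweight fleece", "category": "outerwear"}
print(compose_embedded_text(row, columns))
# Product: Alpine Fleece Jacket
# Description: Lightweight fleece
```

Metadata-role columns are not embedded; they are stored in the `metadata` JSONB column for filtering.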
Step 5: Trigger Backfill
Kick off the initial backfill to read all source rows, generate embeddings, and write vectors to your PostgreSQL database.
curl -X POST https://api.embedd.to/v1/vector-tables/vt_abc123/backfill \
-H "Authorization: Bearer sk_your_api_key"
Response:
{
"task_id": "task_def456",
"task_type": "backfill",
"target_id": "vt_abc123",
"status": "pending",
"created_at": "2026-03-13T10:03:00Z"
}
Check sync status to track progress:
curl https://api.embedd.to/v1/vector-tables/vt_abc123/sync/status \
-H "Authorization: Bearer sk_your_api_key"
Response (once complete):
{
"sync_status": "synced",
"synced_rows": 12450,
"total_rows": 12450,
"last_synced_at": "2026-03-13T10:08:00Z"
}
When synced_rows matches total_rows, all your data has been embedded and is ready to query.
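In a script, the status check above becomes a polling loop: poll until `synced_rows` equals `total_rows`. A minimal sketch (the fetch function is injected so the polling logic stays independent of any HTTP client; in practice it would issue the `GET .../sync/status` request with your API key):

```python
import time


def wait_for_sync(fetch_status, poll_interval: float = 5.0, timeout: float = 600.0) -> dict:
    """Poll until the vector table reports fully synced.
    `fetch_status` is a callable returning a dict shaped like the
    sync-status response above."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if (status.get("sync_status") == "synced"
                and status.get("synced_rows") == status.get("total_rows")):
            return status
        time.sleep(poll_interval)
    raise TimeoutError("backfill did not complete within timeout")
```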
Step 6: Query
Run a semantic search with optional metadata filters.
curl -X POST https://api.embedd.to/v1/vector-tables/vt_abc123/query \
-H "Authorization: Bearer sk_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"query": "something warm for hiking",
"limit": 5,
"filters": {
"in_stock": {"eq": true},
"price": {"lte": 150}
}
}'
Response:
{
"results": [
{
"id": "4821",
"score": 0.892,
"metadata": {
"name": "Alpine Fleece Jacket",
"description": "Lightweight fleece jacket with wind-resistant outer layer",
"category": "outerwear",
"price": 129.99,
"in_stock": true
}
},
{
"id": "7733",
"score": 0.871,
"metadata": {
"name": "Merino Wool Base Layer",
"description": "Moisture-wicking merino wool top for cold-weather hiking",
"category": "base-layers",
"price": 89.00,
"in_stock": true
}
}
]
}
Filter operators use plain names like eq, lte, gte, ne — no $ prefix. See Filters for the full list of supported operators and types.
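The operator semantics can be illustrated by evaluating a filter object against a result's metadata locally. A sketch covering only the four operators named above (the full set lives in Filters; the helper is ours, not part of the Embedd API):

```python
# Plain operator names, no $ prefix.
OPS = {
    "eq": lambda a, b: a == b,
    "ne": lambda a, b: a != b,
    "lte": lambda a, b: a <= b,
    "gte": lambda a, b: a >= b,
}


def matches(metadata: dict, filters: dict) -> bool:
    """Return True if `metadata` satisfies every condition in `filters`.
    Conditions across fields are ANDed, as in the query example above."""
    for field, conditions in filters.items():
        for op, expected in conditions.items():
            if not OPS[op](metadata.get(field), expected):
                return False
    return True


filters = {"in_stock": {"eq": True}, "price": {"lte": 150}}
print(matches({"in_stock": True, "price": 129.99}, filters))  # True
print(matches({"in_stock": True, "price": 189.00}, filters))  # False
```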
Embedd automatically sets hnsw.iterative_scan = on for PostgreSQL platform queries. This ensures consistent results when combining vector similarity with metadata filters, preventing cases where the HNSW index would otherwise return too few candidates before filtering.
Step 7: Monitor Sync
After the initial backfill, Embedd automatically keeps vectors in sync with your source table. Inserts, updates, and deletes in your source PostgreSQL table are detected and reflected in the vector table.
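The `row_hash` column in the vector table schema points at hash-based change detection: if a hash of the source row's current values differs from the stored hash, the row needs re-embedding. A sketch of that idea (the hashing scheme here, SHA-256 over delimiter-joined values, is our assumption for illustration, not Embedd's documented implementation):

```python
import hashlib


def row_hash(values: list) -> str:
    """Hash the values that feed the embedding, to detect source changes.
    Delimiter and algorithm are illustrative assumptions."""
    joined = "\x1f".join(str(v) for v in values)  # unit-separator delimiter
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


stored = row_hash(["Alpine Fleece Jacket", "Lightweight fleece jacket"])
# Unchanged row: hashes match, no re-embedding needed.
print(row_hash(["Alpine Fleece Jacket", "Lightweight fleece jacket"]) == stored)  # True
# Edited description: hash differs, so the row is re-embedded.
print(row_hash(["Alpine Fleece Jacket", "Updated description"]) == stored)  # False
```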
Check sync status:
curl https://api.embedd.to/v1/vector-tables/vt_abc123/sync/status \
-H "Authorization: Bearer sk_your_api_key"
Pause sync:
curl -X POST https://api.embedd.to/v1/vector-tables/vt_abc123/sync/pause \
-H "Authorization: Bearer sk_your_api_key"
Resume sync:
curl -X POST https://api.embedd.to/v1/vector-tables/vt_abc123/sync/resume \
-H "Authorization: Bearer sk_your_api_key"
Re-backfill and atomic swap
If you change the embedding model, dimensions, or column configuration, the vector table enters pending_rebackfill status. The next backfill performs an atomic table swap to replace vectors with zero downtime:
- Embedd creates a new swap table (e.g., `embedd_vt_a1b2c3d4_products_search_swap`)
- All rows are re-embedded and written to the swap table
- The live table is renamed to `_old`, and the swap table is renamed to the live name
- The `_old` table is dropped
Your queries continue to hit the live table throughout this process — there is no window where data is unavailable.
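The swap itself reduces to two renames and a drop, each of which PostgreSQL executes atomically. A sketch of the statement sequence (table names follow the example above; the exact statements Embedd issues, and whether they run in one transaction, may differ):

```python
def swap_statements(live: str) -> list:
    """Generate the rename/drop sequence for an atomic table swap.
    Assumes re-embedded rows are already in `{live}_swap`."""
    return [
        f"ALTER TABLE {live} RENAME TO {live}_old;",
        f"ALTER TABLE {live}_swap RENAME TO {live};",
        f"DROP TABLE {live}_old;",
    ]


for stmt in swap_statements("embedd_vt_a1b2c3d4_products_search"):
    print(stmt)
```

Wrapping the renames in a single transaction is what keeps queries from ever observing a missing live table.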
See Sync & Backfill for details on how sync works, polling intervals, and re-backfill behavior.
Key Differences from Managed Mode
| Aspect | Managed | Platform (PostgreSQL) |
|---|---|---|
| Vector storage | Qdrant (hosted by Embedd) | Your PostgreSQL database |
| Embedding provider | Required | Required |
| Tier limits | Enforced (max_tables, max_vectors) | Not enforced |
| SQL JOINs with vectors | No | Yes — vectors are a regular table in your database |
| pgvector required | No | Yes |
| Re-backfill strategy | New Qdrant collection | Atomic table swap |
Because vectors live in your PostgreSQL database, you can join the vector table directly with your application tables for hybrid queries — something that is not possible in managed mode.
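For example, a hybrid query might rank products by similarity to a query vector while joining back to the source table for live columns. A sketch of such a query, built as a string here (table and column names follow the examples above; `%s` stands for the query vector parameter, and `<=>` is pgvector's cosine distance operator, matching the `vector_cosine_ops` index):

```python
VECTOR_TABLE = "embedd_vt_a1b2c3d4_products_search"

# pk_value is TEXT, so the source primary key is cast for the join.
HYBRID_QUERY = f"""
SELECT p.id, p.name, p.price,
       v.embedding <=> %s::vector AS distance
FROM {VECTOR_TABLE} v
JOIN public.products p ON p.id::text = v.pk_value
WHERE p.in_stock
ORDER BY distance
LIMIT 5;
"""
print(HYBRID_QUERY)
```

You would pass the query vector (obtained from your embedding provider) as the parameter when executing this through your usual PostgreSQL driver.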
Related
- Managed + PostgreSQL — managed mode guide for comparison
- Filters — full filter operator reference
- Sync & Backfill — how automatic sync and re-backfill work
- Vector Tables API — complete vector table endpoints
- Query API — query parameters and response format