Platform + Snowflake
In platform mode with Snowflake, Embedd reads data from your Snowflake database and writes vectors back to your Snowflake account using the VECTOR(FLOAT, N) type. You can use Snowflake Cortex for embedding generation — no external embedding provider needed.
You'll need:
- A Snowflake account with a warehouse, database, and schema accessible to Embedd
- An Embedd API key
- (Optional) An embedding provider API key if you prefer OpenAI, Gemini, or Voyage over Cortex
Step 1: Create a Connection
Register your Snowflake account so Embedd can read source rows and write vector tables.
Password authentication:
curl -X POST https://api.embedd.to/v1/providers/snowflake/connections \
-H "Authorization: Bearer sk_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"name": "analytics-snowflake",
"mode": "platform",
"credentials": {
"account": "xy12345.us-east-1",
"user": "EMBEDD_USER",
"password": "your_password",
"warehouse": "COMPUTE_WH",
"database": "ANALYTICS",
"schema": "PUBLIC"
}
}'
Key pair authentication:
curl -X POST https://api.embedd.to/v1/providers/snowflake/connections \
-H "Authorization: Bearer sk_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"name": "analytics-snowflake",
"mode": "platform",
"credentials": {
"account": "xy12345.us-east-1",
"user": "EMBEDD_USER",
"private_key": "-----BEGIN PRIVATE KEY-----\nMIIE...\n-----END PRIVATE KEY-----",
"warehouse": "COMPUTE_WH",
"database": "ANALYTICS",
"schema": "PUBLIC"
}
}'
Response:
{
"id": "conn_abc123",
"name": "analytics-snowflake",
"provider": "snowflake",
"mode": "platform",
"status": "created",
"created_at": "2026-03-13T10:00:00Z"
}
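The two credential shapes above differ only in the auth fields. As a sketch (the helper name and validation rules below are illustrative, not part of the Embedd API), a small builder can catch a payload that mixes password and key-pair auth before the request is sent:

```python
import json

def build_snowflake_connection(name, account, user, warehouse, database,
                               schema, password=None, private_key=None):
    """Build the request body for POST /v1/providers/snowflake/connections.

    Exactly one of `password` or `private_key` must be supplied.
    """
    if (password is None) == (private_key is None):
        raise ValueError("supply exactly one of password or private_key")
    credentials = {
        "account": account,
        "user": user,
        "warehouse": warehouse,
        "database": database,
        "schema": schema,
    }
    credentials["password" if password else "private_key"] = password or private_key
    return {"name": name, "mode": "platform", "credentials": credentials}

body = build_snowflake_connection(
    "analytics-snowflake", "xy12345.us-east-1", "EMBEDD_USER",
    "COMPUTE_WH", "ANALYTICS", "PUBLIC", password="your_password",
)
print(json.dumps(body, indent=2))
```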
Because Embedd creates and manages vector tables in your Snowflake account, the role needs more than just read access:
GRANT USAGE ON DATABASE ANALYTICS TO ROLE EMBEDD_ROLE;
GRANT USAGE ON SCHEMA ANALYTICS.PUBLIC TO ROLE EMBEDD_ROLE;
GRANT SELECT ON TABLE ANALYTICS.PUBLIC.PRODUCTS TO ROLE EMBEDD_ROLE;
GRANT CREATE TABLE ON SCHEMA ANALYTICS.PUBLIC TO ROLE EMBEDD_ROLE;
CREATE TABLE is required because Embedd writes vector tables (and performs atomic table swaps during re-backfill) directly in your schema.
Step 2: Test the Connection
Verify that Embedd can reach your Snowflake account before proceeding.
curl -X POST https://api.embedd.to/v1/connections/conn_abc123/test \
-H "Authorization: Bearer sk_your_api_key"
Response:
{
"status": "ok",
"latency_ms": 185
}
If the test fails, check:
- Account identifier — ensure the format is correct (e.g., xy12345.us-east-1).
- Credentials — confirm the username and password (or private key) are correct.
- Warehouse — verify the warehouse exists and is not suspended.
- Network policy — if your Snowflake account uses network policies, allow Embedd's IPs.
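A frequent cause of test failures is a malformed account identifier, such as pasting the full Snowflake hostname instead of the bare identifier. A rough client-side sanity check (the accepted shapes here are a loose heuristic based on the example above, not Snowflake's authoritative grammar):

```python
import re

# Loosely matches locator-style identifiers like "xy12345.us-east-1"
# and org-account identifiers like "myorg-myaccount".
ACCOUNT_RE = re.compile(r"^[A-Za-z0-9]+([.-][A-Za-z0-9]+)*$")

def looks_like_account_identifier(value: str) -> bool:
    # Reject full URLs/hostnames: the API expects the bare identifier.
    if "://" in value or value.endswith(".snowflakecomputing.com"):
        return False
    return bool(ACCOUNT_RE.match(value))

print(looks_like_account_identifier("xy12345.us-east-1"))  # True
print(looks_like_account_identifier(
    "https://xy12345.us-east-1.snowflakecomputing.com"))   # False
```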
Step 3: Embedding Provider (Optional)
This is the key difference in Snowflake platform mode: you can use Snowflake Cortex for embedding generation, which means no external provider is needed.
Option A: Use Snowflake Cortex
Skip this step entirely. When you create a vector table in Step 4, omit the embedding_provider_id field and use a Cortex-compatible model name. Cortex handles embedding generation natively inside your Snowflake account.
Available Cortex embedding models:
| Model | Dimensions |
|---|---|
| snowflake-arctic-embed-m-v1.5 | 768 |
| snowflake-arctic-embed-l-v2.0 | 1024 |
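Dimension mismatches are easy to introduce when switching models, so it can help to derive embedding_dimensions from the model name rather than hard-coding both. A sketch using the table above (the lookup helper is illustrative, not part of the API):

```python
# Cortex embedding models and their output dimensions, per the table above.
CORTEX_MODELS = {
    "snowflake-arctic-embed-m-v1.5": 768,
    "snowflake-arctic-embed-l-v2.0": 1024,
}

def cortex_dimensions(model: str) -> int:
    """Return the embedding dimensions for a known Cortex model."""
    try:
        return CORTEX_MODELS[model]
    except KeyError:
        raise ValueError(f"not a known Cortex embedding model: {model}")

print(cortex_dimensions("snowflake-arctic-embed-m-v1.5"))  # 768
```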
Option B: Use an External Provider
If you prefer OpenAI, Gemini, or Voyage, create an embedding provider the same way as in managed mode:
curl -X POST https://api.embedd.to/v1/embedding-providers \
-H "Authorization: Bearer sk_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"name": "openai-prod",
"provider": "openai",
"api_key": "sk-proj-your-openai-key",
"default_model": "text-embedding-3-small"
}'
Response:
{
"id": "emb_xyz789",
"name": "openai-prod",
"provider": "openai",
"default_model": "text-embedding-3-small",
"created_at": "2026-03-13T10:01:00Z"
}
Use the returned id as the embedding_provider_id in Step 4.
Step 4: Create a Vector Table
A vector table maps source columns to embedding and metadata roles. In platform mode, Embedd creates the vector table directly in your Snowflake account.
Example A: Using Cortex (no external provider)
curl -X POST https://api.embedd.to/v1/vector-tables \
-H "Authorization: Bearer sk_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"name": "products-search",
"connection_id": "conn_abc123",
"source_table": "ANALYTICS.PUBLIC.PRODUCTS",
"primary_key_column": "ID",
"embedding_model": "snowflake-arctic-embed-m-v1.5",
"embedding_dimensions": 768,
"mode": "platform",
"columns": [
{"name": "NAME", "role": "embedding", "ordinal": 1},
{"name": "DESCRIPTION", "role": "embedding", "ordinal": 2},
{"name": "CATEGORY", "role": "metadata", "filter_type": "keyword"},
{"name": "PRICE", "role": "metadata", "filter_type": "float"},
{"name": "IN_STOCK", "role": "metadata", "filter_type": "boolean"}
]
}'
No embedding_provider_id — Cortex handles embeddings natively.
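One easy mistake is pairing a Cortex model with an embedding_provider_id, or an external model without one. An illustrative pre-flight check (the validation rules are inferred from the two examples in this step, not documented API behavior):

```python
CORTEX_MODELS = {"snowflake-arctic-embed-m-v1.5", "snowflake-arctic-embed-l-v2.0"}

def check_vector_table_payload(payload: dict) -> None:
    """Raise if the embedding model and provider fields look inconsistent."""
    is_cortex = payload["embedding_model"] in CORTEX_MODELS
    has_provider = "embedding_provider_id" in payload
    if is_cortex and has_provider:
        raise ValueError("Cortex models do not take an embedding_provider_id")
    if not is_cortex and not has_provider:
        raise ValueError("external models require an embedding_provider_id")

# Valid: Cortex model, no provider id (Example A).
check_vector_table_payload({"embedding_model": "snowflake-arctic-embed-m-v1.5"})
# Valid: external model with a provider id (Example B).
check_vector_table_payload({
    "embedding_model": "text-embedding-3-small",
    "embedding_provider_id": "emb_xyz789",
})
print("payloads ok")
```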
Example B: Using an External Provider
curl -X POST https://api.embedd.to/v1/vector-tables \
-H "Authorization: Bearer sk_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"name": "products-search",
"connection_id": "conn_abc123",
"embedding_provider_id": "emb_xyz789",
"source_table": "ANALYTICS.PUBLIC.PRODUCTS",
"primary_key_column": "ID",
"embedding_model": "text-embedding-3-small",
"embedding_dimensions": 1536,
"mode": "platform",
"columns": [
{"name": "NAME", "role": "embedding", "ordinal": 1},
{"name": "DESCRIPTION", "role": "embedding", "ordinal": 2},
{"name": "CATEGORY", "role": "metadata", "filter_type": "keyword"},
{"name": "PRICE", "role": "metadata", "filter_type": "float"},
{"name": "IN_STOCK", "role": "metadata", "filter_type": "boolean"}
]
}'
Response (both examples):
{
"id": "vt_abc123",
"name": "products-search",
"connection_id": "conn_abc123",
"source_table": "ANALYTICS.PUBLIC.PRODUCTS",
"mode": "platform",
"sync_status": "pending",
"embedding_model": "snowflake-arctic-embed-m-v1.5",
"embedding_dimensions": 768,
"columns": [
{"name": "NAME", "role": "embedding", "ordinal": 1},
{"name": "DESCRIPTION", "role": "embedding", "ordinal": 2},
{"name": "CATEGORY", "role": "metadata", "filter_type": "keyword"},
{"name": "PRICE", "role": "metadata", "filter_type": "float"},
{"name": "IN_STOCK", "role": "metadata", "filter_type": "boolean"}
],
"created_at": "2026-03-13T10:02:00Z"
}
Embedd creates a table in your Snowflake schema with this structure:
CREATE TABLE EMBEDD_VT_XXXXXXXX_NAME (
PK_VALUE VARCHAR,
EMBEDDING VECTOR(FLOAT, 768),
EMBEDDED_TEXT VARCHAR,
METADATA VARIANT,
ROW_HASH VARCHAR,
PRIMARY KEY (PK_VALUE)
);
Metadata is stored as Snowflake VARIANT, so you can JOIN this table with other Snowflake tables and query metadata using standard Snowflake SQL.
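The ROW_HASH column implies hash-based change detection: if a recomputed hash of a source row no longer matches the stored value, the row is re-embedded. The exact hashing scheme is not documented; a sketch of the general idea, assuming a canonical-JSON digest:

```python
import hashlib
import json

def row_hash(row: dict) -> str:
    """Stable digest of a source row. Sorting keys makes the hash
    independent of column order; any change in a value changes the hash."""
    canonical = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

before = {"ID": 4821, "NAME": "Alpine Fleece Jacket", "PRICE": 129.99}
after = {**before, "PRICE": 119.99}
print(row_hash(before) != row_hash(after))  # True: the row needs re-embedding
```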
Platform mode is not subject to tier limits. Since vectors are stored in your own Snowflake account, there are no max_tables or max_vectors restrictions.
Step 5: Trigger Backfill
Kick off the initial backfill to embed all existing rows from your source table.
curl -X POST https://api.embedd.to/v1/vector-tables/vt_abc123/backfill \
-H "Authorization: Bearer sk_your_api_key"
Response:
{
"task_id": "task_def456",
"task_type": "backfill",
"target_id": "vt_abc123",
"status": "pending",
"created_at": "2026-03-13T10:03:00Z"
}
Check sync status to track progress:
curl https://api.embedd.to/v1/vector-tables/vt_abc123/sync/status \
-H "Authorization: Bearer sk_your_api_key"
Response (once complete):
{
"sync_status": "synced",
"synced_rows": 12450,
"total_rows": 12450,
"last_synced_at": "2026-03-13T10:08:00Z"
}
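A client typically polls the sync status endpoint until sync_status reaches "synced". A minimal polling sketch with the HTTP call injected as a callable (the function names, and the "failed" state, are illustrative assumptions, not documented API values):

```python
import time

def wait_until_synced(fetch_status, poll_seconds=5.0, timeout=600.0):
    """Poll `fetch_status()` (a callable returning the sync-status JSON as a
    dict) until sync_status is "synced", or raise on failure/timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status["sync_status"] == "synced":
            return status
        if status["sync_status"] == "failed":
            raise RuntimeError(f"sync failed: {status}")
        time.sleep(poll_seconds)
    raise TimeoutError("backfill did not complete in time")

# Stub standing in for GET /v1/vector-tables/vt_abc123/sync/status:
responses = iter([
    {"sync_status": "syncing", "synced_rows": 8000, "total_rows": 12450},
    {"sync_status": "synced", "synced_rows": 12450, "total_rows": 12450},
])
print(wait_until_synced(lambda: next(responses), poll_seconds=0.0))
```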
Step 6: Query
Run a semantic search with optional metadata filters.
curl -X POST https://api.embedd.to/v1/vector-tables/vt_abc123/query \
-H "Authorization: Bearer sk_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"query": "something warm for hiking",
"limit": 5,
"filters": {
"in_stock": {"eq": true},
"price": {"lte": 150}
}
}'
Response:
{
"results": [
{
"id": "4821",
"score": 0.892,
"metadata": {
"name": "Alpine Fleece Jacket",
"description": "Lightweight fleece jacket with wind-resistant outer layer",
"category": "outerwear",
"price": 129.99,
"in_stock": true
}
},
{
"id": "7733",
"score": 0.871,
"metadata": {
"name": "Merino Wool Base Layer",
"description": "Moisture-wicking merino wool top for cold-weather hiking",
"category": "base-layers",
"price": 89.00,
"in_stock": true
}
}
]
}
Filter operators use plain names like eq, lte, gte, ne — no $ prefix. See Filters for the full list of supported operators and types.
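The filters block in the query above reads as a conjunction of per-field predicates. A client-side sketch of the semantics, for intuition only (the actual filtering happens server-side, and this covers just the four operators named above):

```python
# Plain-named operators (no $ prefix), matching the request above.
OPERATORS = {
    "eq": lambda v, arg: v == arg,
    "ne": lambda v, arg: v != arg,
    "lte": lambda v, arg: v <= arg,
    "gte": lambda v, arg: v >= arg,
}

def matches(metadata: dict, filters: dict) -> bool:
    """True if the row's metadata satisfies every filter clause."""
    return all(
        OPERATORS[op](metadata.get(field), arg)
        for field, clauses in filters.items()
        for op, arg in clauses.items()
    )

row = {"in_stock": True, "price": 129.99}
print(matches(row, {"in_stock": {"eq": True}, "price": {"lte": 150}}))  # True
```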
Step 7: Monitor Sync
After the initial backfill, Embedd automatically keeps vectors in sync with your source table. Inserts, updates, and deletes in Snowflake are detected and reflected in the vector table.
Check sync status:
curl https://api.embedd.to/v1/vector-tables/vt_abc123/sync/status \
-H "Authorization: Bearer sk_your_api_key"
Pause sync:
curl -X POST https://api.embedd.to/v1/vector-tables/vt_abc123/sync/pause \
-H "Authorization: Bearer sk_your_api_key"
Resume sync:
curl -X POST https://api.embedd.to/v1/vector-tables/vt_abc123/sync/resume \
-H "Authorization: Bearer sk_your_api_key"
Re-backfill and atomic swap
When a re-backfill is triggered (for example, after changing the embedding model), Embedd uses an atomic table swap to avoid downtime:
- A new swap table is created in your Snowflake schema.
- All rows are embedded and written to the swap table.
- The live table is renamed to _old, the swap table is renamed to the live name, and the _old table is dropped — all in a single transaction.
Queries continue hitting the live table throughout the process.
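The three swap steps can be simulated with an in-memory catalog of table names (a toy model of the rename sequence; the _swap/_old suffixes are illustrative, not Embedd's actual naming):

```python
def atomic_swap(catalog: dict, live: str) -> None:
    """Simulate the re-backfill swap: live -> _old, swap -> live, drop _old.
    `catalog` maps table names to contents; f"{live}_swap" must exist."""
    swap, old = f"{live}_swap", f"{live}_old"
    # In Snowflake these would be ALTER TABLE ... RENAME TO and DROP TABLE,
    # executed together so readers never observe a missing live table.
    catalog[old] = catalog.pop(live)   # live table renamed to _old
    catalog[live] = catalog.pop(swap)  # swap table takes the live name
    del catalog[old]                   # _old is dropped

tables = {
    "EMBEDD_VT_PRODUCTS": ["old vectors"],
    "EMBEDD_VT_PRODUCTS_swap": ["re-embedded vectors"],
}
atomic_swap(tables, "EMBEDD_VT_PRODUCTS")
print(tables)  # {'EMBEDD_VT_PRODUCTS': ['re-embedded vectors']}
```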
See Sync & Backfill for details on polling intervals and re-backfill behavior.
Key Differences from Managed Mode
| Aspect | Managed | Platform (Snowflake) |
|---|---|---|
| Vector storage | Qdrant (hosted by Embedd) | Your Snowflake account |
| Embedding provider | Required | Optional (Cortex available) |
| Tier limits | Enforced | Not enforced |
| SQL JOINs with vectors | No | Yes |
| Metadata type | JSON | VARIANT |
| Re-backfill strategy | New Qdrant collection | Atomic table swap |
Related
- Filters — full filter operator reference
- Sync & Backfill — how automatic sync and re-backfill work
- Vector Tables API — complete vector table endpoints
- Query API — query parameters and response format