Snowflake Provider Guide
Embedd.to supports Snowflake as both a source database and a platform mode target for vector storage.
Authentication Methods
Password Authentication
{
"auth_method": "password",
"account": "myorg-account",
"user": "EMBEDD_USER",
"password": "secure_password",
"warehouse": "COMPUTE_WH",
"database": "ANALYTICS",
"schema": "PUBLIC",
"role": "EMBEDD_ROLE"
}
Key Pair Authentication
{
"auth_method": "key_pair",
"account": "myorg-account",
"user": "EMBEDD_USER",
"private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----",
"warehouse": "COMPUTE_WH",
"database": "ANALYTICS",
"schema": "PUBLIC",
"role": "EMBEDD_ROLE"
}
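The two configurations above differ only in the secret they carry. The following Python sketch shows one way a client could assemble and sanity-check such a config before submitting it. The helper name `build_snowflake_config` is hypothetical, and treating every shared field as mandatory is an assumption, not documented Embedd.to behavior.

```python
# Hypothetical helper: builds a config dict shaped like the examples above.
# Assumption: all six shared fields are required for both auth methods.
REQUIRED_COMMON = ("account", "user", "warehouse", "database", "schema", "role")
SECRET_FIELD = {"password": "password", "key_pair": "private_key"}


def build_snowflake_config(auth_method, secret, **common):
    """Return a config dict for the given auth method, or raise ValueError."""
    if auth_method not in SECRET_FIELD:
        raise ValueError(f"unsupported auth_method: {auth_method!r}")
    missing = [key for key in REQUIRED_COMMON if not common.get(key)]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    # A PEM-encoded key starts with a "-----BEGIN ..." header line.
    if auth_method == "key_pair" and not secret.lstrip().startswith("-----BEGIN"):
        raise ValueError("private_key must be PEM-encoded")
    config = {"auth_method": auth_method}
    config.update({key: common[key] for key in REQUIRED_COMMON})
    config[SECRET_FIELD[auth_method]] = secret
    return config
```

In practice the secret would come from an environment variable or a secret manager, not from a literal in source code.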
Required Permissions
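The role assigned to the Snowflake user needs at least the following privileges:
-- Warehouse access to run queries
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE EMBEDD_ROLE;
-- Read access to source tables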
GRANT USAGE ON DATABASE ANALYTICS TO ROLE EMBEDD_ROLE;
GRANT USAGE ON SCHEMA ANALYTICS.PUBLIC TO ROLE EMBEDD_ROLE;
GRANT SELECT ON TABLE ANALYTICS.PUBLIC.PRODUCTS TO ROLE EMBEDD_ROLE;
-- For platform mode: write access to create vector tables
GRANT CREATE TABLE ON SCHEMA ANALYTICS.PUBLIC TO ROLE EMBEDD_ROLE;
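When several source tables are involved, the same grants can be generated instead of typed by hand. The sketch below is illustrative only: `grant_statements` is a hypothetical helper, and it accepts only plain unquoted identifiers so that names can be interpolated safely.

```python
import re

# Plain unquoted Snowflake identifiers only; quoted identifiers are out of scope here.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def _check(name):
    if not _IDENT.match(name):
        raise ValueError(f"not a plain unquoted identifier: {name!r}")
    return name


def grant_statements(database, schema, tables, role, warehouse=None, platform_mode=False):
    """Return the GRANT statements for read access, plus CREATE TABLE in platform mode."""
    db, sch, role = _check(database), _check(schema), _check(role)
    stmts = []
    if warehouse is not None:
        # Running queries requires USAGE on a warehouse.
        stmts.append(f"GRANT USAGE ON WAREHOUSE {_check(warehouse)} TO ROLE {role};")
    stmts.append(f"GRANT USAGE ON DATABASE {db} TO ROLE {role};")
    stmts.append(f"GRANT USAGE ON SCHEMA {db}.{sch} TO ROLE {role};")
    for table in tables:
        stmts.append(f"GRANT SELECT ON TABLE {db}.{sch}.{_check(table)} TO ROLE {role};")
    if platform_mode:
        stmts.append(f"GRANT CREATE TABLE ON SCHEMA {db}.{sch} TO ROLE {role};")
    return stmts
```

Calling `grant_statements("ANALYTICS", "PUBLIC", ["PRODUCTS"], "EMBEDD_ROLE", platform_mode=True)` yields the four statements listed above.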
Managed Mode
In managed mode, Embedd.to reads data from Snowflake and stores vectors in its built-in Qdrant instance. This mode requires an external embedding provider (OpenAI or Gemini).
Platform Mode
In platform mode, vectors are stored directly in your Snowflake database in columns of type VECTOR(FLOAT, N), where N is the embedding dimension.
Native Embeddings with Cortex
Snowflake platform mode can use Snowflake Cortex for embedding generation, eliminating the need for an external embedding provider. Supported Cortex models:
- snowflake-arctic-embed-m-v1.5 (768 dimensions)
- snowflake-arctic-embed-l-v2.0 (1024 dimensions)
When using Cortex, omit embedding_provider_id from the vector table creation request.
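As a rough illustration of how the model choice determines both the embedding call and the vector column width, the sketch below maps each supported model to its dimension and builds the corresponding SQL fragments. It assumes Snowflake's `SNOWFLAKE.CORTEX.EMBED_TEXT_768` and `SNOWFLAKE.CORTEX.EMBED_TEXT_1024` functions; whether Embedd.to issues exactly these calls is not stated in this guide, and the helper names are hypothetical.

```python
import re

# Dimensions as listed in this guide.
CORTEX_MODELS = {
    "snowflake-arctic-embed-m-v1.5": 768,
    "snowflake-arctic-embed-l-v2.0": 1024,
}
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def cortex_embed_expr(model, text_column):
    """Return a SQL expression that embeds `text_column` with the given Cortex model."""
    if model not in CORTEX_MODELS:
        raise ValueError(f"unsupported Cortex model: {model!r}")
    if not _IDENT.match(text_column):
        raise ValueError(f"not a plain column name: {text_column!r}")
    dims = CORTEX_MODELS[model]
    return f"SNOWFLAKE.CORTEX.EMBED_TEXT_{dims}('{model}', {text_column})"


def vector_column_type(model):
    """Return the VECTOR column type whose width matches the model's output."""
    if model not in CORTEX_MODELS:
        raise ValueError(f"unsupported Cortex model: {model!r}")
    return f"VECTOR(FLOAT, {CORTEX_MODELS[model]})"
```

The point of the mapping is that the column width and the embedding function must agree: a 768-dimension model cannot write into a `VECTOR(FLOAT, 1024)` column.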
Vector Table Structure
Platform mode creates a table in your Snowflake database with the following schema. The vector dimension (1536 in this example) depends on the embedding model:
CREATE TABLE EMBEDD_VT_xxxxxxxx_name (
PK_VALUE VARCHAR,
EMBEDDING VECTOR(FLOAT, 1536),
EMBEDDED_TEXT VARCHAR,
METADATA VARIANT,
ROW_HASH VARCHAR,
PRIMARY KEY (PK_VALUE)
);
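Because the vectors live in an ordinary Snowflake table, they can also be queried directly. The sketch below builds a cosine-similarity search against the schema above using Snowflake's `VECTOR_COSINE_SIMILARITY` function. The function name `similarity_query` and the table name in the usage note are hypothetical, and inlining the vector as a literal is a simplification for illustration.

```python
import math
import re

# Accepts NAME, SCHEMA.NAME, or DB.SCHEMA.NAME with plain unquoted identifiers.
_TABLE = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*){0,2}$")


def similarity_query(table, query_vector, top_k=5):
    """Return SQL that ranks rows of `table` by cosine similarity to `query_vector`."""
    if not _TABLE.match(table):
        raise ValueError(f"not a plain table name: {table!r}")
    values = [float(x) for x in query_vector]
    if not values or not all(math.isfinite(v) for v in values):
        raise ValueError("query_vector must be a non-empty list of finite numbers")
    # The literal's dimension must match the EMBEDDING column's dimension.
    literal = "[" + ",".join(repr(v) for v in values) + "]"
    return (
        "SELECT PK_VALUE, EMBEDDED_TEXT, METADATA,\n"
        f"       VECTOR_COSINE_SIMILARITY(EMBEDDING, {literal}::VECTOR(FLOAT, {len(values)})) AS SCORE\n"
        f"FROM {table}\n"
        "ORDER BY SCORE DESC\n"
        f"LIMIT {int(top_k)}"
    )
```

For example, `similarity_query("EMBEDD_VT_DEMO", embedding, top_k=10)` returns a statement that can be run with any Snowflake client, provided `embedding` was produced by the same model that populated the table.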
Sync Modes
Batch Sync
Full-table comparison using row hashes. Detects inserts, updates, and deletes by comparing hash values between the source table and the vector table.
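The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not Embedd.to's implementation: the hash function (SHA-256 over a canonical JSON dump) and the names `row_hash` and `diff_by_hash` are assumptions.

```python
import hashlib
import json


def row_hash(row):
    """Hash a row dict deterministically (assumed scheme: SHA-256 of sorted-key JSON)."""
    payload = json.dumps(row, sort_keys=True, default=str, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def diff_by_hash(source_hashes, vector_hashes):
    """Compare {primary_key: hash} maps from the source table and the vector table."""
    inserts = sorted(k for k in source_hashes if k not in vector_hashes)
    deletes = sorted(k for k in vector_hashes if k not in source_hashes)
    updates = sorted(
        k for k in source_hashes
        if k in vector_hashes and source_hashes[k] != vector_hashes[k]
    )
    return {"insert": inserts, "update": updates, "delete": deletes}
```

Only rows in the `insert` and `update` lists need to be re-embedded; rows whose hash is unchanged are skipped, which is what keeps repeated syncs cheap.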
CDC Sync
Polling-based change data capture. Periodically queries the source table for changes using row hash comparison. Lower latency than batch sync with reduced compute usage for large tables.
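A polling loop of this kind could look roughly like the following. The class `HashPoller` and its injected `fetch_hashes` callable are hypothetical; the sketch only shows the idea of remembering the hashes seen on the previous poll and reporting what changed since then.

```python
class HashPoller:
    """Tracks {primary_key: row_hash} between polls and reports the differences."""

    def __init__(self, fetch_hashes):
        # fetch_hashes: callable returning the current {primary_key: row_hash} mapping
        # (in a real deployment, a query against the source table).
        self._fetch = fetch_hashes
        self._seen = {}

    def poll_once(self):
        """Fetch current hashes, diff against the previous poll, and remember the result."""
        current = dict(self._fetch())
        changes = {
            "insert": sorted(current.keys() - self._seen.keys()),
            "update": sorted(
                k for k in current.keys() & self._seen.keys()
                if current[k] != self._seen[k]
            ),
            "delete": sorted(self._seen.keys() - current.keys()),
        }
        self._seen = current
        return changes
```

A scheduler would call `poll_once()` at a fixed interval and forward the returned keys to the embedding step. The first poll reports every row as an insert because nothing has been seen yet.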
Filter Support
Snowflake platform mode translates the unified filter syntax to Snowflake SQL:
| Filter | Snowflake SQL |
|---|---|
| $eq | metadata:field = 'value' |
| $ne | metadata:field != 'value' |
| $gt | metadata:field > value |
| $gte | metadata:field >= value |
| $lt | metadata:field < value |
| $lte | metadata:field <= value |
| $in | metadata:field IN (...) |
| $nin | metadata:field NOT IN (...) |
| $exists | metadata:field IS [NOT] NULL |
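To make the mapping concrete, here is a small Python sketch of a translator that follows the table above. It assumes filters are written as `{"field": {"$op": value}}` and that multiple conditions are combined with AND; neither detail is specified in this guide, and `to_snowflake_where` is a hypothetical name. A production implementation would normally use bind parameters instead of inlined literals.

```python
import math
import re

_COMPARISON = {"$eq": "=", "$ne": "!=", "$gt": ">", "$gte": ">=", "$lt": "<", "$lte": "<="}
_FIELD = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _literal(value):
    """Render a Python value as a Snowflake SQL literal."""
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        if not math.isfinite(value):
            raise ValueError("non-finite numbers are not supported")
        return repr(value)
    if isinstance(value, str):
        # Escape backslashes first, then double any single quotes.
        return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"
    raise TypeError(f"unsupported filter value: {value!r}")


def to_snowflake_where(filters):
    """Translate {"field": {"$op": value}} into a WHERE-clause body."""
    clauses = []
    for field, conditions in filters.items():
        if not _FIELD.match(field):
            raise ValueError(f"invalid field name: {field!r}")
        path = f"metadata:{field}"
        for op, operand in conditions.items():
            if op in _COMPARISON:
                clauses.append(f"{path} {_COMPARISON[op]} {_literal(operand)}")
            elif op in ("$in", "$nin"):
                if not operand:
                    raise ValueError(f"{op} requires a non-empty list")
                items = ", ".join(_literal(v) for v in operand)
                keyword = "IN" if op == "$in" else "NOT IN"
                clauses.append(f"{path} {keyword} ({items})")
            elif op == "$exists":
                clauses.append(f"{path} IS {'NOT ' if operand else ''}NULL")
            else:
                raise ValueError(f"unsupported operator: {op!r}")
    return " AND ".join(clauses)
```

For instance, `{"category": {"$eq": "shoes"}, "price": {"$lt": 50}}` becomes `metadata:category = 'shoes' AND metadata:price < 50`.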