Getting Started with Embedd.to

Embedd.to is a provider-agnostic API for managing vectorized tables. It connects to your existing databases, automatically generates and maintains vector embeddings, and provides a unified query interface for semantic search.

What Embedd.to Does

  • Connects to your data — Snowflake, PostgreSQL, with more providers coming
  • Generates embeddings — Automatically embeds your text data using OpenAI, Google Gemini, Voyage AI, or Snowflake Cortex
  • Keeps vectors in sync — CDC and batch sync modes detect changes and re-embed automatically
  • Unified query API — One API for semantic search across any provider, any embedding model

Key Concepts

Organizations

Organizations are the top-level container for all resources. Every API key, environment, connection, and vector table belongs to an organization. Organizations have a subscription tier that determines usage limits.

Environments

Environments isolate resources within your organization. A prod environment is created automatically when you create an organization. Use additional environments to separate dev, staging, and production.

Connections

A connection stores credentials to a source database. Connections are scoped to an environment and support Snowflake and PostgreSQL.

Embedding Providers

An embedding provider stores API credentials for an embedding service (OpenAI, Google Gemini, or Voyage AI). It is required for managed mode and for PostgreSQL platform mode. It is not required for Snowflake platform mode, which uses Snowflake Cortex to generate embeddings.
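The requirement rule above can be written as a small check. This is only a sketch: the function name and the labels "managed", "platform", "snowflake", and "postgresql" are hypothetical values chosen for illustration, not identifiers from the Embedd.to API.

```python
def needs_embedding_provider(mode: str, source: str) -> bool:
    """Return True when an embedding provider must be configured.

    Hypothetical helper: the mode and source labels are illustrative,
    not identifiers taken from the Embedd.to API.
    """
    mode, source = mode.lower(), source.lower()
    if mode not in {"managed", "platform"}:
        raise ValueError(f"unknown mode: {mode!r}")
    if source not in {"snowflake", "postgresql"}:
        raise ValueError(f"unknown source: {source!r}")
    # Snowflake platform mode embeds with Cortex, so no external
    # embedding provider is needed. Every other combination needs one.
    return not (mode == "platform" and source == "snowflake")
```

Under these assumptions, only the Snowflake platform combination returns False.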

Vector Tables (Search Tables)

A vector table links a source table to its vector representation. It defines which columns to embed for semantic search, which columns to keep as filterable metadata, and where to store the vectors. See Search Tables for a deep dive.
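The concepts above nest inside one another, which a few dataclasses can make concrete. This is a sketch for orientation, not the Embedd.to data model: every class and field name is hypothetical, and attaching vector tables to a connection is an assumption made here for simplicity.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class VectorTable:
    source_table: str
    embed_columns: list[str]     # columns embedded for semantic search
    metadata_columns: list[str]  # columns kept as filterable metadata


@dataclass
class Connection:
    name: str
    provider: str  # "snowflake" or "postgresql" (illustrative labels)
    # Assumption: vector tables are grouped under the connection they read from.
    vector_tables: list[VectorTable] = field(default_factory=list)


@dataclass
class Environment:
    name: str
    connections: list[Connection] = field(default_factory=list)


@dataclass
class Organization:
    name: str
    tier: str  # subscription tier; the value used below is a placeholder
    environments: dict[str, Environment] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Mirrors the documented behaviour: a prod environment exists
        # as soon as the organization is created.
        self.environments.setdefault("prod", Environment("prod"))


# Example: one organization with the default prod environment plus staging.
org = Organization(name="acme", tier="example-tier")
org.environments["staging"] = Environment("staging")
org.environments["prod"].connections.append(
    Connection(
        name="warehouse",
        provider="postgresql",
        vector_tables=[
            VectorTable("articles", embed_columns=["body"], metadata_columns=["author"])
        ],
    )
)
```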

Modes

Embedd.to supports two modes for vector storage: managed and platform. The mode is set per connection.

Managed Mode

Embedd.to stores vectors in its built-in Qdrant vector database. You provide a source database connection and an embedding provider — Embedd.to handles the rest.

Source DB → Embedd.to → Embedding Provider → Qdrant (managed)

Best for: Getting started quickly, multi-provider search, no infrastructure changes needed.

Platform Mode

Vectors are stored directly in your own database alongside your source data.

Source DB → Embedd.to → Embedding Provider → Your DB (vector table)

Best for: Data residency requirements, joining vectors with existing data, leveraging your existing infrastructure.

Which Mode Is Right for You?

Question                                              | Managed | Platform
Do you need vectors in your own infrastructure?       | No      | Yes
Do you want to JOIN vectors with source data via SQL? | No      | Yes
Do you want the fastest setup with no DB changes?     | Yes     | No
Do you use Snowflake Cortex for embeddings?           | N/A     | Yes (Snowflake only)
Do you want Embedd.to to handle vector storage?       | Yes     | No
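The decision questions can be reduced to a short helper. This is a deliberately simplified sketch: the function and parameter names are hypothetical, and it treats any single platform-only requirement as decisive, which is one reading of the questions rather than an official rule.

```python
def suggest_mode(
    vectors_in_own_infra: bool = False,
    join_vectors_in_sql: bool = False,
    use_snowflake_cortex: bool = False,
) -> str:
    """Suggest "platform" if any platform-only need applies, else "managed".

    Hypothetical helper: parameter names paraphrase the decision questions.
    """
    if vectors_in_own_infra or join_vectors_in_sql or use_snowflake_cortex:
        return "platform"
    # No platform-only requirement: managed mode is the faster setup.
    return "managed"
```

For example, suggest_mode(join_vectors_in_sql=True) suggests platform mode, while calling it with no arguments suggests managed mode.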

Supported Providers

Provider   | Managed Mode | Platform Mode           | Native Embeddings
PostgreSQL | Yes          | Yes (requires pgvector) | No
Snowflake  | Yes          | Yes                     | Yes (Cortex)

Sync Modes

  • Batch — Periodic full-table comparison using row hashes to detect changes
  • CDC — Polling-based change data capture for lower-latency sync

See Sync & Backfill for details.
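To illustrate how row hashes can detect changes in a batch comparison, here is a minimal sketch. How Embedd.to actually computes and stores its hashes is not stated here, so the use of SHA-256 over a JSON serialization, and the function names row_hash and diff_rows, are assumptions made for this example.

```python
import hashlib
import json


def row_hash(row: dict) -> str:
    """Hash a row's content deterministically (key order does not matter)."""
    # Assumption: JSON with sorted keys is used as the canonical form.
    payload = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def diff_rows(stored_hashes: dict, current_rows: dict) -> dict:
    """Compare current rows against hashes saved by the previous sync.

    stored_hashes: primary key -> hash recorded on the last run
    current_rows:  primary key -> row dict read from the source table
    """
    current_hashes = {pk: row_hash(row) for pk, row in current_rows.items()}
    return {
        # New keys need a first embedding.
        "added": sorted(pk for pk in current_hashes if pk not in stored_hashes),
        # Same key, different hash: the row must be re-embedded.
        "changed": sorted(
            pk
            for pk, digest in current_hashes.items()
            if pk in stored_hashes and stored_hashes[pk] != digest
        ),
        # Keys that disappeared: their vectors can be removed.
        "deleted": sorted(pk for pk in stored_hashes if pk not in current_hashes),
    }


# Example: row 2 was edited and row 3 is new since the previous run.
previous = {1: row_hash({"title": "a"}), 2: row_hash({"title": "b"})}
current = {1: {"title": "a"}, 2: {"title": "B"}, 3: {"title": "c"}}
changes = diff_rows(previous, current)
```

Only the rows listed under "added" and "changed" would need new embeddings, which is what makes a full-table comparison cheaper than re-embedding everything.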

Next Steps

  1. Read Search Tables to understand how vector tables work
  2. Set up your account — create an org, get an API key
  3. Follow a guide for your setup: