Switching Providers with Vectra

One of Vectra’s key benefits is the ability to switch between providers without changing your application code. This guide shows you how.

Why Switch Providers?

Cost optimization: Move from managed (Pinecone) to self-hosted (Qdrant/pgvector)
Performance: Switch to a provider better suited for your workload
Vendor lock-in: Reduce dependency on a single provider
Testing: Use the in-memory provider for fast tests

The Unified API

Vectra’s unified API means your code stays the same:

# Works with ANY provider
client.upsert(index: 'docs', vectors: [...])
client.query(index: 'docs', vector: emb, top_k: 10)
client.fetch(index: 'docs', ids: ['1', '2'])
client.delete(index: 'docs', ids: ['1', '2'])
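
Because every provider answers the same calls, application code can depend on that interface alone rather than on a concrete provider. The following is a minimal sketch of that idea, not Vectra code: `DocumentSearch` and `FakeClient` are hypothetical names, and `FakeClient` only mimics the keyword signature of `query` shown above.

```ruby
# Hypothetical stand-in for a Vectra client. It mimics the keyword
# signature of `query` shown above and returns canned IDs.
class FakeClient
  def initialize(ids)
    @ids = ids
  end

  def query(index:, vector:, top_k:)
    @ids.first(top_k)
  end
end

# Application code depends only on the interface (`query`),
# never on which provider sits behind the client.
class DocumentSearch
  def initialize(client, index: 'docs')
    @client = client
    @index = index
  end

  def nearest(embedding, limit: 10)
    @client.query(index: @index, vector: embedding, top_k: limit)
  end
end

search = DocumentSearch.new(FakeClient.new(%w[a b c]))
puts search.nearest([0.1, 0.2], limit: 2).inspect
```

Swapping providers then means passing a differently configured client to `DocumentSearch.new`; the class itself does not change.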

Step 1: Update Client Initialization

From Pinecone to Qdrant

Before:

client = Vectra::Client.new(
  provider: :pinecone,
  api_key: ENV['PINECONE_API_KEY'],
  environment: 'us-west-4'
)

After:

client = Vectra::Client.new(
  provider: :qdrant,
  host: 'http://localhost:6333',
  api_key: ENV['QDRANT_API_KEY']  # Optional for local
)

From Qdrant to pgvector

Before:

client = Vectra::Client.new(
  provider: :qdrant,
  host: 'http://localhost:6333'
)

After:

client = Vectra::Client.new(
  provider: :pgvector,
  connection_url: ENV['DATABASE_URL']
)

From Any Provider to Memory (Testing)

client = Vectra::Client.new(provider: :memory)
# No configuration needed - perfect for tests

Step 2: Migrate Your Data

Option 1: Export/Import Script

# Set up the source and target clients
source_client = Vectra::Client.new(provider: :pinecone, ...)
target_client = Vectra::Client.new(provider: :qdrant, ...)

# Copy every index from source to target
indexes = source_client.list_indexes
indexes.each do |index_info|
  index_name = index_info[:name]

  # Use stats to check how many vectors to expect
  stats = source_client.stats(index: index_name)

  # ID enumeration depends on the provider, so supply the IDs yourself,
  # for example from your application database
  all_ids = YourModel.pluck(:vector_id)

  # Fetch and re-insert
  vectors = source_client.fetch(index: index_name, ids: all_ids)
  target_client.upsert(index: index_name, vectors: vectors.values)
end
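
For large indexes, fetching every ID in one call may exceed request limits, so the copy is usually done in batches. Here is a minimal sketch, assuming (as in the script above) that `fetch` returns a Hash of id => vector and that `upsert` accepts an Array of vectors. `copy_vectors` and `StubStore` are hypothetical names used only for illustration.

```ruby
# Hypothetical helper, not part of Vectra: copies vectors in batches and
# returns how many were copied. IDs missing from the source are skipped.
def copy_vectors(source, target, index:, ids:, batch_size: 100)
  copied = 0
  ids.each_slice(batch_size) do |batch|
    vectors = source.fetch(index: index, ids: batch).values
    next if vectors.empty?

    target.upsert(index: index, vectors: vectors)
    copied += vectors.size
  end
  copied
end

# In-memory stand-in used only to exercise the helper.
class StubStore
  attr_reader :data

  def initialize(data = {})
    @data = data
  end

  def fetch(index:, ids:)
    @data.slice(*ids)
  end

  def upsert(index:, vectors:)
    vectors.each { |v| @data[v[:id]] = v }
  end
end

source = StubStore.new({
  '1' => { id: '1', values: [0.1] },
  '2' => { id: '2', values: [0.2] },
  '3' => { id: '3', values: [0.3] }
})
target = StubStore.new
copied = copy_vectors(source, target, index: 'docs',
                      ids: %w[1 2 3 missing], batch_size: 2)
puts "copied #{copied} vectors"
```

Comparing the returned count with the number of IDs you passed in gives a quick signal that some vectors were missing from the source.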

Option 2: Dual-Write Pattern

Write to both providers during migration:

def upsert_to_both(vectors)
  # The source is still the system of record, so let its errors propagate
  source_client.upsert(index: 'docs', vectors: vectors)

  begin
    target_client.upsert(index: 'docs', vectors: vectors)
  rescue StandardError => e
    # Log the error, but don't fail the request
    Rails.logger.error("Dual-write failed: #{e.message}")
  end
end
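
A logged failure still leaves the target missing data, so it can help to record which batches failed and replay them later. The sketch below is one possible shape for that, not a Vectra feature: `DualWriter`, `OkClient`, and `DownClient` are hypothetical, and the `upserted_count` key follows the result format used in the testing step of this guide.

```ruby
require 'logger'

# Hypothetical wrapper: the primary write must succeed, while a failure
# on the secondary (the migration target) is logged and recorded.
class DualWriter
  attr_reader :failed_batches

  def initialize(primary, secondary, logger: Logger.new($stderr))
    @primary = primary
    @secondary = secondary
    @logger = logger
    @failed_batches = []
  end

  def upsert(index:, vectors:)
    # Errors from the primary propagate to the caller.
    result = @primary.upsert(index: index, vectors: vectors)
    begin
      @secondary.upsert(index: index, vectors: vectors)
    rescue StandardError => e
      @logger.error("Dual-write to secondary failed: #{e.message}")
      @failed_batches << { index: index, ids: vectors.map { |v| v[:id] } }
    end
    result
  end
end

# Stand-ins used only to demonstrate the behaviour.
class OkClient
  def upsert(index:, vectors:)
    { upserted_count: vectors.size }
  end
end

class DownClient
  def upsert(index:, vectors:)
    raise IOError, 'connection refused'
  end
end

# Logger.new(nil) discards messages, which keeps the demo output quiet.
writer = DualWriter.new(OkClient.new, DownClient.new, logger: Logger.new(nil))
result = writer.upsert(index: 'docs', vectors: [{ id: '1', values: [0.1] }])
puts result[:upserted_count]
puts writer.failed_batches.inspect
```

In a real application the failed batches would need durable storage (a database table or a job queue) rather than an in-memory array.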

Option 3: Use the Migration Tool

For copying vectors between providers, use Vectra::Migration:

source = Vectra::Client.new(provider: :pinecone, ...)
target = Vectra::Client.new(provider: :qdrant, ...)
migration = Vectra::Migration.new(source, target)

result = migration.migrate(
  source_index: index_name,
  target_index: index_name,
  on_progress: ->(stats) { puts "#{stats[:percentage]}%" }
)
verification = migration.verify(source_index: index_name, target_index: index_name)

See the Migration Tool guide for full documentation.

Option 4: Background Job Migration (manual)

If you need custom logic, you can migrate manually:

class MigrateVectorsJob < ApplicationJob
  def perform(index_name, batch_size: 1000)
    source = Vectra::Client.new(provider: :pinecone, ...)
    target = Vectra::Client.new(provider: :qdrant, ...)

    # Collect the IDs to migrate (implement this based on your data structure).
    # Option A: IDs are stored elsewhere (e.g., in your database)
    all_ids = YourModel.pluck(:vector_id)

    # Option B: Query with a dummy vector to get some IDs.
    # Note: This returns only a sample (at most top_k), not ALL IDs.
    # Uncomment it in place of Option A; do not run both.
    # sample_results = source.query(index: index_name, vector: Array.new(1536, 0), top_k: 1000)
    # all_ids = sample_results.ids

    # Process in batches
    all_ids.each_slice(batch_size) do |id_batch|
      vectors = source.fetch(index: index_name, ids: id_batch)
      target.upsert(index: index_name, vectors: vectors.values)
    end
  end
end

Step 3: Feature Compatibility

Not all providers support all features. Check compatibility:

Feature	Pinecone	Qdrant	Weaviate	pgvector
Vector search	✅	✅	✅	✅
Hybrid search	⚠️	✅	✅	✅
Text search	❌	✅	✅	✅
Metadata filtering	✅	✅	✅	✅
Namespaces	✅	✅	✅	❌

Handling Missing Features

# Check if provider supports feature
if client.provider.respond_to?(:text_search)
  results = client.text_search(index: 'docs', text: 'query')
else
  # Fallback to vector search
  embedding = generate_embedding('query')
  results = client.query(index: 'docs', vector: embedding, top_k: 10)
end

# Or use validate!
client.validate!(features: [:text_search])

Step 4: Update Configuration

Environment Variables

# config/initializers/vectra.rb
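provider = ENV.fetch('VECTRA_PROVIDER', 'qdrant').to_sym

# Build provider-specific options; fail fast on an unconfigured provider
# instead of splatting nil into the constructor
options =
  case provider
  when :pinecone
    { api_key: ENV['PINECONE_API_KEY'], environment: ENV['PINECONE_ENV'] }
  when :qdrant
    { host: ENV['QDRANT_HOST'], api_key: ENV['QDRANT_API_KEY'] }
  when :pgvector
    { connection_url: ENV['DATABASE_URL'] }
  when :memory
    {}
  else
    raise ArgumentError, "No Vectra options configured for provider: #{provider}"
  end

client = Vectra::Client.new(provider: provider, **options)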

Rails config/vectra.yml

# Development: Qdrant local
development:
  provider: qdrant
  host: http://localhost:6333
  index: documents
  dimension: 1536

# Production: Pinecone
production:
  provider: pinecone
  api_key: <%= Rails.application.credentials.pinecone_api_key %>
  environment: us-west-4
  index: documents
  dimension: 1536

# Test: Memory
test:
  provider: memory
  index: documents
  dimension: 1536
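
How Vectra itself reads this file is not covered here. As an illustration of what per-environment loading involves, the sketch below renders the ERB, parses the YAML, and picks one environment using only Ruby's standard library. `load_vectra_config` is a hypothetical helper, not a Vectra API.

```ruby
require 'erb'
require 'yaml'

# Hypothetical loader: renders ERB, parses the YAML, and returns the
# settings for one environment with symbol keys.
def load_vectra_config(yaml_text, env)
  rendered = ERB.new(yaml_text).result
  config = YAML.safe_load(rendered, aliases: true) || {}
  settings = config.fetch(env) { raise KeyError, "no Vectra config for #{env}" }
  settings.transform_keys(&:to_sym)
end

yaml_text = <<~YAML
  development:
    provider: qdrant
    host: http://localhost:6333
    index: documents
    dimension: 1536
  test:
    provider: memory
    index: documents
    dimension: 1536
YAML

settings = load_vectra_config(yaml_text, 'test')
puts settings.inspect
```

In a Rails application, `Rails.application.config_for(:vectra)` performs equivalent loading for `config/vectra.yml`, including ERB evaluation and environment selection.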

Step 5: Testing the Switch

1. Validate Configuration

client.validate!
# Only if your application relies on hybrid search:
client.validate!(features: [:hybrid_search])

2. Test Basic Operations

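# A sample vector with a known ID (the dimension must match the index)
test_vector = { id: 'test-id', values: Array.new(1536) { rand } }

# Test upsert
result = client.upsert(index: 'test', vectors: [test_vector])
expect(result[:upserted_count]).to eq(1)

# Test query (pass the raw values, not the whole vector hash)
results = client.query(index: 'test', vector: test_vector[:values], top_k: 1)
expect(results.size).to eq(1)

# Test fetch
vectors = client.fetch(index: 'test', ids: ['test-id'])
expect(vectors['test-id']).to be_present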

3. Test Feature-Specific Code

# If using hybrid search
if client.provider.respond_to?(:hybrid_search)
  results = client.hybrid_search(
    index: 'test',
    vector: emb,
    text: 'query',
    alpha: 0.7
  )
  expect(results.size).to be > 0
end

Step 6: Zero-Downtime Migration

Phase 1: Dual-Write

# Write to both providers
def upsert_safe(vectors)
  old_client.upsert(index: 'docs', vectors: vectors)
  new_client.upsert(index: 'docs', vectors: vectors)
end

Phase 2: Verify

# Compare results from both providers
old_results = old_client.query(index: 'docs', vector: emb, top_k: 10)
new_results = new_client.query(index: 'docs', vector: emb, top_k: 10)

# Log differences
if old_results.ids != new_results.ids
  Rails.logger.warn("Result mismatch detected")
end
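
Strict equality can be noisy here: providers that use approximate nearest-neighbor indexes may return the same neighbors in a slightly different order, or differ in one or two results. A tolerance-based comparison is one alternative. The sketch below uses a hypothetical `overlap_ratio` helper, and the 0.9 threshold is an arbitrary example value.

```ruby
# Hypothetical helper: fraction of the old provider's top-k IDs that
# also appear in the new provider's top-k, ignoring order.
def overlap_ratio(old_ids, new_ids)
  return 1.0 if old_ids.empty?

  (old_ids & new_ids).size.to_f / old_ids.uniq.size
end

old_ids = %w[a b c d]
new_ids = %w[b a c e]
ratio = overlap_ratio(old_ids, new_ids)
puts format('overlap: %.2f', ratio)
warn 'Result mismatch above tolerance' if ratio < 0.9
```

Tracking this ratio across many sampled queries gives a better picture than any single comparison.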

Phase 3: Switch Reads

# Feature flag
client = Feature.enabled?(:new_provider) ? new_client : old_client

Phase 4: Complete Migration

# Once verified, switch fully
client = new_client
# Stop writing to old provider

Common Pitfalls

1. ID Format Differences

Some providers use integers, others strings. Vectra normalizes to strings:

# Always use string IDs
client.upsert(index: 'docs', vectors: [{ id: '1', values: [...] }])

2. Namespace Support

pgvector doesn’t support namespaces. Use separate indexes instead:

# Instead of namespaces
client.upsert(index: 'docs-tenant-1', vectors: [...])

# Or use for_tenant helper
client.for_tenant('tenant-1') do |c|
  c.upsert(index: 'docs', vectors: [...])
end
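
If you go the separate-index route, building index names in one place keeps them consistent across the codebase. Here is a small sketch following the 'docs-tenant-1' convention above. `tenant_index` is a hypothetical helper, and its character whitelist is an assumption, so check the index naming rules of your target provider.

```ruby
# Hypothetical helper: derives a per-tenant index name.
# Assumption: only lowercase letters, digits, and hyphens are safe.
def tenant_index(base, tenant_id)
  slug = tenant_id.to_s.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-+|-+\z/, '')
  raise ArgumentError, 'tenant_id must contain letters or digits' if slug.empty?

  "#{base}-#{slug}"
end

puts tenant_index('docs', 'tenant-1')
puts tenant_index('docs', 'Acme Corp')
```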

3. Filter Syntax

Filters are automatically converted, but complex filters may need adjustment:

# Simple filters work everywhere
client.query(index: 'docs', vector: emb, top_k: 10, filter: { category: 'docs' })

# Complex filters may need provider-specific syntax
# Check provider documentation
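
Before switching, it can help to find which filters in your codebase go beyond plain equality, since those are the ones to review. The sketch below is a hypothetical check, not a Vectra API: it treats a filter as simple when every value is a scalar, and the `$gt` operator is only an illustrative example of nested syntax.

```ruby
# Hypothetical check: a filter is "simple" when every value is a plain
# scalar (an equality match). Nested hashes or arrays are flagged so they
# can be reviewed against the target provider's filter syntax.
SCALAR_TYPES = [String, Symbol, Numeric, TrueClass, FalseClass].freeze

def simple_filter?(filter)
  filter.values.all? { |value| SCALAR_TYPES.any? { |type| value.is_a?(type) } }
end

puts simple_filter?({ category: 'docs' })
puts simple_filter?({ price: { '$gt' => 10 } })
```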