|
@@ -0,0 +1,378 @@
|
|
|
|
|
+# Qdrant Vector Database Integration
|
|
|
|
|
+
|
|
|
|
|
+## Overview
|
|
|
|
|
+
|
|
|
|
|
+This document describes the Qdrant vector database integration for the ShopCall.ai store synchronization system. The integration enables semantic search and AI-powered features across all e-commerce platforms.
|
|
|
|
|
+
|
|
|
|
|
+## Architecture
|
|
|
|
|
+
|
|
|
|
|
+### Vector Database Configuration
|
|
|
|
|
+
|
|
|
|
|
+- **Provider**: Qdrant
|
|
|
|
|
+- **Endpoint**: http://142.93.100.6:6333
|
|
|
|
|
+- **API Key**: stored in the Edge Function environment, not in this document (see Security)
|
|
|
|
|
+- **Vector Dimensions**: 3072 (OpenAI text-embedding-3-large compatible)
|
|
|
|
|
+- **Distance Metric**: Cosine (a natural fit for text embeddings; for unit-normalized vectors it ranks results the same as dot product)
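
As a rough sketch of how the dimension and distance settings above map onto Qdrant's create-collection request (`PUT /collections/{collection_name}`), the helper below builds the request body. The name `buildCreateCollectionBody` is hypothetical and is not taken from `qdrant-client.ts`; only the `vectors.size` and `vectors.distance` fields come from Qdrant's REST API.

```typescript
// Hypothetical helper (not from qdrant-client.ts): builds the JSON body
// for Qdrant's PUT /collections/{collection_name} request.
type Distance = "Cosine" | "Dot" | "Euclid" | "Manhattan";

interface CreateCollectionBody {
  vectors: { size: number; distance: Distance };
}

function buildCreateCollectionBody(
  size: number = 3072,
  distance: Distance = "Cosine",
): CreateCollectionBody {
  // Reject sizes Qdrant could not accept before any request is sent.
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error(`vector size must be a positive integer, got ${size}`);
  }
  return { vectors: { size, distance } };
}

// Defaults mirror the configuration listed above.
console.log(JSON.stringify(buildCreateCollectionBody()));
```

Keeping the body construction separate from the HTTP call makes the defaults easy to check without a running Qdrant instance.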
|
|
|
|
|
+
|
|
|
|
|
+### Collection Naming Convention
|
|
|
|
|
+
|
|
|
|
|
+Each store has separate collections for different entity types:
|
|
|
|
|
+- `{shopname}-products` - Product catalog
|
|
|
|
|
+- `{shopname}-orders` - Order history (if permitted)
|
|
|
|
|
+- `{shopname}-customers` - Customer data (if permitted)
|
|
|
|
|
+
|
|
|
|
|
+The `{shopname}` part is sanitized to lowercase alphanumeric characters and hyphens.
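
One way the naming rule could be implemented is sketched below. The function names `sanitizeShopName` and `collectionName` are hypothetical, and collapsing runs of disallowed characters into a single hyphen is an assumption; the source only states that the result is lowercase alphanumeric with hyphens.

```typescript
type EntityType = "products" | "orders" | "customers";

// Hypothetical sanitizer: lowercase, then turn every run of characters
// outside [a-z0-9] into one hyphen, then trim hyphens from both ends.
function sanitizeShopName(raw: string): string {
  return raw
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Builds a collection name following the `{shopname}-{entity}` convention.
function collectionName(shopName: string, entity: EntityType): string {
  return `${sanitizeShopName(shopName)}-${entity}`;
}

console.log(collectionName("My Shop!", "products"));
```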
|
|
|
|
|
+
|
|
|
|
|
+## Data Privacy & Permissions
|
|
|
|
|
+
|
|
|
|
|
+### Store-Level Control
|
|
|
|
|
+
|
|
|
|
|
+The `stores.data_access_permissions` JSONB field controls what data can be synced:
|
|
|
|
|
+
|
|
|
|
|
+```json
|
|
|
|
|
+{
|
|
|
|
|
+ "allow_product_access": true,
|
|
|
|
|
+ "allow_order_access": true,
|
|
|
|
|
+ "allow_customer_access": true
|
|
|
|
|
+}
|
|
|
|
|
+```
|
|
|
|
|
+
|
|
|
|
|
+### Privacy Compliance
|
|
|
|
|
+
|
|
|
|
|
+- Products are always synced (core catalog data)
|
|
|
|
|
+- Orders and customers respect store owner preferences
|
|
|
|
|
+- SQL cache and Qdrant sync both check permissions
|
|
|
|
|
+- Helper functions: `can_sync_products()`, `can_sync_orders()`, `can_sync_customers()`
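
The rules in this list can be mirrored on the TypeScript side roughly as follows. This is a sketch only: `syncableEntities` is a hypothetical name, the SQL helpers themselves are not shown in this document, and treating a missing flag as "not permitted" is an assumption.

```typescript
interface DataAccessPermissions {
  allow_product_access?: boolean;
  allow_order_access?: boolean;
  allow_customer_access?: boolean;
}

type SyncEntity = "products" | "orders" | "customers";

// Hypothetical helper: decides which entity types may be synced for a store.
function syncableEntities(
  permissions: DataAccessPermissions | null | undefined,
): SyncEntity[] {
  // Products are always synced, as stated in the list above.
  const entities: SyncEntity[] = ["products"];
  // Assumption: orders and customers require an explicit `true`.
  if (permissions?.allow_order_access === true) entities.push("orders");
  if (permissions?.allow_customer_access === true) entities.push("customers");
  return entities;
}

console.log(syncableEntities({ allow_order_access: true, allow_customer_access: false }));
```

Requiring an explicit `true` keeps the default conservative when the JSONB field is missing or only partially populated.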
|
|
|
|
|
+
|
|
|
|
|
+## Collection Schemas
|
|
|
|
|
+
|
|
|
|
|
+### Products Collection
|
|
|
|
|
+
|
|
|
|
|
+**Payload Structure:**
|
|
|
|
|
+```typescript
|
|
|
|
|
+{
|
|
|
|
|
+ store_id: string,
|
|
|
|
|
+ product_id: string,
|
|
|
|
|
+ platform: "shopify" | "woocommerce" | "shoprenter",
|
|
|
|
|
+ title: string, // `name` on some platforms
|
|
|
|
|
+ sku: string,
|
|
|
|
|
+ price: number,
|
|
|
|
|
+ status: string,
|
|
|
|
|
+ description: string,
|
|
|
|
|
+ tags: string[],
|
|
|
|
|
+ synced_at: string // ISO 8601 timestamp
|
|
|
|
|
+}
|
|
|
|
|
+```
|
|
|
|
|
+
|
|
|
|
|
+**Payload Indexes:**
|
|
|
|
|
+- `store_id` (keyword)
|
|
|
|
|
+- `product_id` (keyword)
|
|
|
|
|
+- `platform` (keyword)
|
|
|
|
|
+- `status` (keyword)
|
|
|
|
|
+- `price` (float)
|
|
|
|
|
+- `sku` (keyword)
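
Payload indexes like the ones listed above are created in Qdrant with one `PUT /collections/{collection_name}/index` request per field. The sketch below only builds the request bodies; `PRODUCT_INDEXES` and `buildIndexRequests` are hypothetical names, while `field_name` and `field_schema` are the fields Qdrant's REST API expects.

```typescript
type FieldSchema = "keyword" | "float";

interface CreateIndexBody {
  field_name: string;
  field_schema: FieldSchema;
}

// Index definitions copied from the products list above.
const PRODUCT_INDEXES: Record<string, FieldSchema> = {
  store_id: "keyword",
  product_id: "keyword",
  platform: "keyword",
  status: "keyword",
  price: "float",
  sku: "keyword",
};

// Hypothetical helper: one request body per indexed payload field.
function buildIndexRequests(
  indexes: Record<string, FieldSchema>,
): CreateIndexBody[] {
  return Object.entries(indexes).map(([field_name, field_schema]) => ({
    field_name,
    field_schema,
  }));
}

console.log(buildIndexRequests(PRODUCT_INDEXES));
```

The same helper could take a second map for the orders or customers collections, since only the field list differs.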
|
|
|
|
|
+
|
|
|
|
|
+### Orders Collection
|
|
|
|
|
+
|
|
|
|
|
+**Payload Structure:**
|
|
|
|
|
+```typescript
|
|
|
|
|
+{
|
|
|
|
|
+ store_id: string,
|
|
|
|
|
+ order_id: string,
|
|
|
|
|
+ platform: string,
|
|
|
|
|
+ order_number: string,
|
|
|
|
|
+ status: string, // `financial_status` on some platforms
|
|
|
|
|
+ total_price: number, // `total` on some platforms
|
|
|
|
|
+ currency: string,
|
|
|
|
|
+ customer_name: string,
|
|
|
|
|
+ customer_email: string,
|
|
|
|
|
+ synced_at: string
|
|
|
|
|
+}
|
|
|
|
|
+```
|
|
|
|
|
+
|
|
|
|
|
+**Payload Indexes:**
|
|
|
|
|
+- `store_id` (keyword)
|
|
|
|
|
+- `order_id` (keyword)
|
|
|
|
|
+- `platform` (keyword)
|
|
|
|
|
+- `status` (keyword)
|
|
|
|
|
+- `total_price` (float)
|
|
|
|
|
+- `customer_email` (keyword)
|
|
|
|
|
+
|
|
|
|
|
+### Customers Collection
|
|
|
|
|
+
|
|
|
|
|
+**Payload Structure:**
|
|
|
|
|
+```typescript
|
|
|
|
|
+{
|
|
|
|
|
+ store_id: string,
|
|
|
|
|
+ customer_id: string,
|
|
|
|
|
+ platform: string,
|
|
|
|
|
+ email: string,
|
|
|
|
|
+ first_name: string,
|
|
|
|
|
+ last_name: string,
|
|
|
|
|
+ phone: string,
|
|
|
|
|
+ orders_count: number,
|
|
|
|
|
+ total_spent: number,
|
|
|
|
|
+ synced_at: string
|
|
|
|
|
+}
|
|
|
|
|
+```
|
|
|
|
|
+
|
|
|
|
|
+**Payload Indexes:**
|
|
|
|
|
+- `store_id` (keyword)
|
|
|
|
|
+- `customer_id` (keyword)
|
|
|
|
|
+- `platform` (keyword)
|
|
|
|
|
+- `email` (keyword)
|
|
|
|
|
+
|
|
|
|
|
+## Implementation Status
|
|
|
|
|
+
|
|
|
|
|
+### ✅ Completed
|
|
|
|
|
+
|
|
|
|
|
+1. **Qdrant Client Library** (`supabase/functions/_shared/qdrant-client.ts`)
|
|
|
|
|
+ - Collection management (create, delete, exists)
|
|
|
|
|
+ - Point operations (upsert, delete, scroll, search)
|
|
|
|
|
+ - Change detection support
|
|
|
|
|
+ - Text embedding helpers
|
|
|
|
|
+
|
|
|
|
|
+2. **Database Schema** (Migration: `20251111_qdrant_integration.sql`)
|
|
|
|
|
+ - Extended `data_access_permissions`
|
|
|
|
|
+ - Added Qdrant-specific columns to `stores`
|
|
|
|
|
+ - Created `qdrant_sync_logs` table
|
|
|
|
|
+ - Helper functions for permission checks
|
|
|
|
|
+ - Automatic trigger for `qdrant_last_sync_at`
|
|
|
|
|
+
|
|
|
|
|
+3. **Shopify Sync** (`shopify-sync/index.ts`)
|
|
|
|
|
+ - Full Qdrant integration
|
|
|
|
|
+ - Change detection for deleted products
|
|
|
|
|
+ - Privacy-compliant
|
|
|
|
|
+ - Comprehensive logging
|
|
|
|
|
+
|
|
|
|
|
+4. **WooCommerce Sync** (`woocommerce-sync/index.ts`)
|
|
|
|
|
+ - Full Qdrant integration
|
|
|
|
|
+ - Pagination-aware collection
|
|
|
|
|
+ - Privacy-compliant
|
|
|
|
|
+ - Comprehensive logging
|
|
|
|
|
+
|
|
|
|
|
+### 🔄 Pending
|
|
|
|
|
+
|
|
|
|
|
+1. **ShopRenter Sync** (`shoprenter-sync/index.ts`)
|
|
|
|
|
+ - Needs same updates as Shopify/WooCommerce
|
|
|
|
|
+ - Follow established pattern
|
|
|
|
|
+ - Import Qdrant client functions
|
|
|
|
|
+ - Add Qdrant sync helper functions
|
|
|
|
|
+ - Update main sync functions
|
|
|
|
|
+
|
|
|
|
|
+2. **Production Embeddings**
|
|
|
|
|
+ - Replace `generateSimpleEmbedding()` with OpenAI API
|
|
|
|
|
+ - Use `text-embedding-3-large` model
|
|
|
|
|
+ - Implement batching for efficiency
|
|
|
|
|
+ - Add embedding cost tracking
|
|
|
|
|
+
|
|
|
|
|
+## Sync Flow
|
|
|
|
|
+
|
|
|
|
|
+### Initial Sync
|
|
|
|
|
+
|
|
|
|
|
+1. Check store permissions
|
|
|
|
|
+2. Initialize Qdrant collections (if not exist)
|
|
|
|
|
+3. Fetch data from e-commerce platform
|
|
|
|
|
+4. Sync to SQL cache (existing behavior)
|
|
|
|
|
+5. If Qdrant enabled:
|
|
|
|
|
+ - Generate embeddings for text content
|
|
|
|
|
+ - Upsert points to Qdrant
|
|
|
|
|
+ - Log operation to `qdrant_sync_logs`
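
Step 5 embeds "text content", but this document does not say which fields feed the embedding. One possible approach, sketched under the assumption that title, description, and tags are the relevant fields, is shown below; `ProductLike`, `toPlainText`, and `productEmbeddingText` are hypothetical names.

```typescript
interface ProductLike {
  title: string;
  description?: string;
  tags?: string[];
}

// Strip HTML tags (product descriptions are often HTML) and collapse whitespace.
function toPlainText(html: string): string {
  return html.replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim();
}

// Hypothetical helper: joins the non-empty text fields into one string
// that can be passed to the embedding function.
function productEmbeddingText(product: ProductLike): string {
  const parts: string[] = [
    product.title,
    product.description ? toPlainText(product.description) : "",
    product.tags && product.tags.length > 0
      ? `Tags: ${product.tags.join(", ")}`
      : "",
  ];
  return parts.filter((part) => part.length > 0).join("\n");
}

console.log(
  productEmbeddingText({
    title: "Blue Shoe",
    description: "<p>Light  runner</p>",
    tags: ["shoes", "running"],
  }),
);
```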
|
|
|
|
|
+
|
|
|
|
|
+### Change Detection
|
|
|
|
|
+
|
|
|
|
|
+1. Scroll through existing Qdrant points for store
|
|
|
|
|
+2. Compare with current product IDs from platform
|
|
|
|
|
+3. Identify deleted products (in Qdrant but not in platform)
|
|
|
|
|
+4. Delete stale points from Qdrant
|
|
|
|
|
+5. Log deletion operation
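
Steps 2 and 3 above amount to a set difference. A minimal sketch, assuming IDs are compared as strings and using the hypothetical name `findStaleIds`:

```typescript
// Returns the IDs that exist in Qdrant but are no longer present on the platform.
function findStaleIds(
  existingIds: Iterable<string>,
  currentIds: Iterable<string>,
): string[] {
  const current = new Set(currentIds);
  const stale: string[] = [];
  for (const id of existingIds) {
    if (!current.has(id)) stale.push(id);
  }
  return stale;
}

// "1" was deleted on the platform; "4" is new and is not a deletion.
console.log(findStaleIds(["1", "2", "3"], ["2", "3", "4"]));
```

Using a `Set` for the platform IDs keeps the comparison linear in the number of points, which matters when scrolling through large collections.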
|
|
|
|
|
+
|
|
|
|
|
+### Scheduled Sync
|
|
|
|
|
+
|
|
|
|
|
+The existing scheduled sync mechanisms (pg_cron) will automatically include Qdrant:
|
|
|
|
|
+- `shopify-scheduled-sync` → syncs Shopify stores
|
|
|
|
|
+- `woocommerce-scheduled-sync` → syncs WooCommerce stores
|
|
|
|
|
+- `shoprenter-scheduled-sync` → syncs ShopRenter stores
|
|
|
|
|
+
|
|
|
|
|
+All three respect the `qdrant_sync_enabled` flag.
|
|
|
|
|
+
|
|
|
|
|
+## Monitoring & Debugging
|
|
|
|
|
+
|
|
|
|
|
+### Qdrant Sync Logs
|
|
|
|
|
+
|
|
|
|
|
+Query recent sync operations:
|
|
|
|
|
+
|
|
|
|
|
+```sql
|
|
|
|
|
+SELECT
|
|
|
|
|
+ s.store_name,
|
|
|
|
|
+ qsl.sync_type,
|
|
|
|
|
+ qsl.collection_name,
|
|
|
|
|
+ qsl.operation,
|
|
|
|
|
+ qsl.items_processed,
|
|
|
|
|
+ qsl.items_succeeded,
|
|
|
|
|
+ qsl.items_failed,
|
|
|
|
|
+ qsl.error_message,
|
|
|
|
|
+ qsl.duration_ms,
|
|
|
|
|
+ qsl.created_at
|
|
|
|
|
+FROM qdrant_sync_logs qsl
|
|
|
|
|
+JOIN stores s ON s.id = qsl.store_id
|
|
|
|
|
+ORDER BY qsl.created_at DESC
|
|
|
|
|
+LIMIT 50;
|
|
|
|
|
+```
|
|
|
|
|
+
|
|
|
|
|
+### Check Store Qdrant Status
|
|
|
|
|
+
|
|
|
|
|
+```sql
|
|
|
|
|
+SELECT
|
|
|
|
|
+ store_name,
|
|
|
|
|
+ platform_name,
|
|
|
|
|
+ qdrant_sync_enabled,
|
|
|
|
|
+ qdrant_last_sync_at,
|
|
|
|
|
+ data_access_permissions
|
|
|
|
|
+FROM stores
|
|
|
|
|
+WHERE is_active = true;
|
|
|
|
|
+```
|
|
|
|
|
+
|
|
|
|
|
+### Collection Info
|
|
|
|
|
+
|
|
|
|
|
+Fetch collection details from the Qdrant REST API:
|
|
|
|
|
+
|
|
|
|
|
+```bash
|
|
|
|
|
+curl -X GET "http://142.93.100.6:6333/collections/{collection-name}" \
|
|
|
|
|
+ -H "api-key: {api-key}"
|
|
|
|
|
+```
|
|
|
|
|
+
|
|
|
|
|
+## Performance Considerations
|
|
|
|
|
+
|
|
|
|
|
+### Batching
|
|
|
|
|
+
|
|
|
|
|
+- Points are upserted in chunks of 100
|
|
|
|
|
+- Keeps each request under payload size limits
|
|
|
|
|
+- Improves network efficiency
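
The chunking described above can be sketched as a small generic helper. The name `chunk` is hypothetical; the default size of 100 matches the figure stated in this section.

```typescript
// Splits a list into consecutive chunks of at most `size` items.
function chunk<T>(items: readonly T[], size: number = 100): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error(`chunk size must be a positive integer, got ${size}`);
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// 250 points are split into batches of 100, 100, and 50.
const points = Array.from({ length: 250 }, (_, i) => i);
console.log(chunk(points).map((batch) => batch.length));
```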
|
|
|
|
|
+
|
|
|
|
|
+### Indexing
|
|
|
|
|
+
|
|
|
|
|
+- Payload indexes created on frequently filtered fields
|
|
|
|
|
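+- Indexing threshold: 10,000 (Qdrant's `indexing_threshold` optimizer setting is measured in kilobytes of vector data rather than points, so verify the unit against the collection config)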
|
|
|
|
|
+- On-disk storage enabled for large collections
|
|
|
|
|
+
|
|
|
|
|
+### Rate Limiting
|
|
|
|
|
+
|
|
|
|
|
+- Respects e-commerce platform rate limits
|
|
|
|
|
+- No additional rate limiting for Qdrant
|
|
|
|
|
+- Async operations don't block SQL sync
|
|
|
|
|
+
|
|
|
|
|
+## Future Enhancements
|
|
|
|
|
+
|
|
|
|
|
+### 1. Real-time Embeddings
|
|
|
|
|
+
|
|
|
|
|
+Replace the placeholder embeddings with calls to the OpenAI API:
|
|
|
|
|
+
|
|
|
|
|
+```typescript
|
|
|
|
|
+import OpenAI from 'npm:openai' // npm: specifier, since this code runs on Deno
|
|
|
|
|
+
|
|
|
|
|
+async function generateEmbedding(text: string): Promise<number[]> {
|
|
|
|
|
+ const openai = new OpenAI({ apiKey: Deno.env.get('OPENAI_API_KEY') })
|
|
|
|
|
+
|
|
|
|
|
+ const response = await openai.embeddings.create({
|
|
|
|
|
+ model: 'text-embedding-3-large',
|
|
|
|
|
+ input: text,
|
|
|
|
|
+ dimensions: 3072
|
|
|
|
|
+ })
|
|
|
|
|
+
|
|
|
|
|
+ return response.data[0].embedding
|
|
|
|
|
+}
|
|
|
|
|
+```
|
|
|
|
|
+
|
|
|
|
|
+### 2. Semantic Search API
|
|
|
|
|
+
|
|
|
|
|
+Add an endpoint for vector similarity search:
|
|
|
|
|
+
|
|
|
|
|
+```typescript
|
|
|
|
|
+// Search products across all stores
|
|
|
|
|
+POST /api/search
|
|
|
|
|
+{
|
|
|
|
|
+ "query": "blue running shoes",
|
|
|
|
|
+ "limit": 10,
|
|
|
|
|
+ "filter": {
|
|
|
|
|
+ "platform": "shopify",
|
|
|
|
|
+ "price": { "gte": 50, "lte": 150 }
|
|
|
|
|
+ }
|
|
|
|
|
+}
|
|
|
|
|
+```
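
A request filter like the one above would need to be translated into Qdrant's filter syntax before searching. The sketch below assumes the proposed API accepts only `platform` and `price`; `SearchFilter` and `toQdrantFilter` are hypothetical names, while the `must`, `match`, and `range` structures follow Qdrant's filtering API.

```typescript
interface SearchFilter {
  platform?: string;
  price?: { gte?: number; lte?: number };
}

type QdrantCondition =
  | { key: string; match: { value: string } }
  | { key: string; range: { gte?: number; lte?: number } };

// Hypothetical translator: every supplied field becomes a `must` condition,
// so all conditions have to hold for a point to match.
function toQdrantFilter(filter: SearchFilter): { must: QdrantCondition[] } {
  const must: QdrantCondition[] = [];
  if (filter.platform !== undefined) {
    must.push({ key: "platform", match: { value: filter.platform } });
  }
  if (filter.price !== undefined) {
    must.push({ key: "price", range: { ...filter.price } });
  }
  return { must };
}

console.log(
  JSON.stringify(
    toQdrantFilter({ platform: "shopify", price: { gte: 50, lte: 150 } }),
  ),
);
```

Both `platform` and `price` have payload indexes in the products collection, so conditions on them can use those indexes.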
|
|
|
|
|
+
|
|
|
|
|
+### 3. AI Features
|
|
|
|
|
+
|
|
|
|
|
+- Product recommendations
|
|
|
|
|
+- Customer segmentation
|
|
|
|
|
+- Semantic product search
|
|
|
|
|
+- Duplicate product detection
|
|
|
|
|
+- Auto-categorization
|
|
|
|
|
+
|
|
|
|
|
+### 4. Analytics
|
|
|
|
|
+
|
|
|
|
|
+- Vector space visualization
|
|
|
|
|
+- Cluster analysis
|
|
|
|
|
+- Trend detection
|
|
|
|
|
+- Anomaly detection
|
|
|
|
|
+
|
|
|
|
|
+## Security
|
|
|
|
|
+
|
|
|
|
|
+### API Key Management
|
|
|
|
|
+
|
|
|
|
|
+- Qdrant API key stored in Edge Function environment
|
|
|
|
|
+- Not exposed to frontend
|
|
|
|
|
+- Rotated periodically
|
|
|
|
|
+
|
|
|
|
|
+### Data Access
|
|
|
|
|
+
|
|
|
|
|
+- Row-level security on `qdrant_sync_logs`
|
|
|
|
|
+- Users only see their own store sync logs
|
|
|
|
|
+- Service role required for cross-store operations
|
|
|
|
|
+
|
|
|
|
|
+### Privacy
|
|
|
|
|
+
|
|
|
|
|
+- Store owners control what data is synced
|
|
|
|
|
+- Audit trail in `store_permission_audit`
|
|
|
|
|
+- GDPR-compliant data deletion
|
|
|
|
|
+
|
|
|
|
|
+## Troubleshooting
|
|
|
|
|
+
|
|
|
|
|
+### Common Issues
|
|
|
|
|
+
|
|
|
|
|
+**Issue**: Collection not found
|
|
|
|
|
+- **Cause**: First sync didn't create collection
|
|
|
|
|
+- **Fix**: Call `initializeStoreCollections()` manually
|
|
|
|
|
+
|
|
|
|
|
+**Issue**: Points not updating
|
|
|
|
|
+- **Cause**: Qdrant sync disabled or permissions denied
|
|
|
|
|
+- **Fix**: Check `qdrant_sync_enabled` and `data_access_permissions`
|
|
|
|
|
+
|
|
|
|
|
+**Issue**: High sync latency
|
|
|
|
|
+- **Cause**: Large dataset + embedding generation
|
|
|
|
|
+- **Fix**: Implement batching and async processing
|
|
|
|
|
+
|
|
|
|
|
+### Debug Mode
|
|
|
|
|
+
|
|
|
|
|
+Enable verbose logging:
|
|
|
|
|
+
|
|
|
|
|
+```typescript
|
|
|
|
|
+// In Qdrant client
|
|
|
|
|
+console.log('[Qdrant] Detailed operation log:', operation)
|
|
|
|
|
+```
|
|
|
|
|
+
|
|
|
|
|
+Check Edge Function logs:
|
|
|
|
|
+
|
|
|
|
|
+```bash
|
|
|
|
|
+# Via Supabase CLI
|
|
|
|
|
+supabase functions logs shopify-sync
|
|
|
|
|
+
|
|
|
|
|
+# Via MCP tool
|
|
|
|
|
+mcp__supabase__get_logs(service: "edge-function")
|
|
|
|
|
+```
|
|
|
|
|
+
|
|
|
|
|
+## References
|
|
|
|
|
+
|
|
|
|
|
+- [Qdrant Documentation](https://qdrant.tech/documentation/)
|
|
|
|
|
+- [OpenAI Embeddings](https://platform.openai.com/docs/guides/embeddings)
|
|
|
|
|
+- [Vector Databases Guide](https://www.pinecone.io/learn/vector-database/)
|