BREAKING CHANGE: Docker text2vec-transformers service removed

Changes:
- Removed text2vec-transformers service from docker-compose.yml
- Removed ENABLE_MODULES and DEFAULT_VECTORIZER_MODULE from Weaviate config
- Updated architecture comments to reflect Python GPU embedder only
- Simplified docker-compose to single Weaviate service

Architecture:
- Before: Weaviate + text2vec-transformers (2 services)
- After: Weaviate only (1 service)

Vectorization:
- Ingestion: Python GPU embedder (manual vectorization)
- Queries: Python GPU embedder (manual vectorization)
- No auto-vectorization modules needed

Benefits:
- RAM: -10 GB freed (no text2vec-transformers container)
- CPU: -3 cores freed
- Architecture: Simplified (one service instead of two)
- Maintenance: Easier (no Docker service dependencies)

Validation:
✅ Weaviate starts correctly without text2vec-transformers
✅ Existing data accessible (5355 chunks preserved)
✅ API endpoints respond correctly
✅ No errors in startup logs

Migration: GPU embedder already tested and validated
See: TESTS_COMPLETS_GPU_EMBEDDER.md

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
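The manual-vectorization flow described above can be sketched as follows. This is a minimal sketch, not the repository's actual implementation (which lives in memory/core/embedding_service.py): the collection name `LibraryChunk`, the use of `sentence-transformers` to load BGE-M3, and the weaviate-client v4 calls are all assumptions for illustration.

```python
# Sketch: manual vectorization at ingestion time.
# Assumptions: collection name "LibraryChunk"; BGE-M3 loaded via
# sentence-transformers; weaviate-client v4 API. Not the project's actual code.

def as_weaviate_vector(vec) -> list[float]:
    """Coerce one embedding row (e.g. a numpy array) to a plain float list."""
    return [float(x) for x in vec]

def embed_and_ingest(chunks: list[str]) -> None:
    # Imports kept local so the sketch can be read/imported without GPU deps.
    from sentence_transformers import SentenceTransformer
    import weaviate

    # BGE-M3: 1024-dim vectors, up to 8192 tokens of context, runs on CUDA.
    model = SentenceTransformer("BAAI/bge-m3", device="cuda")
    vectors = model.encode(chunks, normalize_embeddings=True)

    client = weaviate.connect_to_local(port=8080, grpc_port=50051)
    try:
        col = client.collections.get("LibraryChunk")
        for text, vec in zip(chunks, vectors):
            # Vector supplied explicitly: Weaviate only stores it,
            # no vectorizer module ever runs server-side.
            col.data.insert(properties={"text": text},
                            vector=as_weaviate_vector(vec))
    finally:
        client.close()
```

Queries follow the same pattern: embed the query string client-side with the same model, then search Weaviate with the resulting vector, so both paths stay on the GPU embedder.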
# Library RAG - Weaviate + Python GPU Embedder
# ==============================================
#
# This docker-compose runs Weaviate with manual vectorization via Python GPU embedder.
#
# BGE-M3 GPU Embedder (Python):
# - 1024 dimensions - Rich semantic representation
# - 8192 token context - Long document support
# - Superior multilingual support (Greek, Latin, French, English)
# - GPU acceleration (NVIDIA RTX 4070) - 30-70x faster than Docker text2vec
# - PyTorch CUDA + FP16 precision
#
# Architecture (Jan 2026):
# - Ingestion: Python GPU embedder (manual vectorization)
# - Queries: Python GPU embedder (manual vectorization)
# - Weaviate: Vector storage only (no auto-vectorization)
#
# Migration Notes:
# - Dec 2024: Migrated from MiniLM-L6 (384-dim) to BGE-M3 (1024-dim)
# - Jan 2026: Migrated from Docker text2vec-transformers to Python GPU embedder
# - See MIGRATION_GPU_EMBEDDER_SUCCESS.md for details

services:
  weaviate:
    image: cr.weaviate.io/semitechnologies/weaviate:1.34.4
    restart: on-failure:0
    ports:
      - "8080:8080"
      - "50051:50051"
    environment:
      QUERY_DEFAULTS_LIMIT: "25"
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true" # OK for dev/local only
      PERSISTENCE_DATA_PATH: "/var/lib/weaviate"
      CLUSTER_HOSTNAME: "node1"
      CLUSTER_GOSSIP_BIND_PORT: "7946"
      CLUSTER_DATA_BIND_PORT: "7947"
      # Fix for "No private IP address found" error
      CLUSTER_JOIN: ""
      # NOTE: Manual vectorization via Python GPU embedder - no modules needed
      # DEFAULT_VECTORIZER_MODULE and ENABLE_MODULES removed (Jan 2026)
      # Limits to prevent OOM crashes
      GOMEMLIMIT: "6GiB"
      GOGC: "100"
    volumes:
      - weaviate_data:/var/lib/weaviate
    mem_limit: 8g
    memswap_limit: 10g
    cpus: 4
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/v1/.well-known/ready"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s

# NOTE: text2vec-transformers service REMOVED (Jan 2026)
# Vectorization now handled by Python GPU embedder (memory/core/embedding_service.py)
# Benefits: 30-70x faster ingestion, -10 GB RAM, unified architecture
# See MIGRATION_GPU_EMBEDDER_SUCCESS.md for details

volumes:
  weaviate_data: