linear-coding-agent/generations/library_rag/docs_techniques/SEARCH_QUALITY_RESULTS.md
David Blanc Brioir d2f7165120 Add Library RAG project and cleanup root directory
2025-12-30 11:57:12 +01:00


BGE-M3 Search Quality Validation Results

Generated: (Run python test_bge_m3_quality.py --output SEARCH_QUALITY_RESULTS.md to populate)

Weaviate Version: TBD

Database Statistics

  • Total Documents: TBD
  • Total Chunks: TBD
  • Vector Dimensions: TBD (expected: 1024)

Vector Dimension Verification

Run the validation script to confirm BGE-M3 (1024-dim) vectors are properly configured.

Expected output: BGE-M3 (1024-dim) vectors confirmed.
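
The check itself comes down to comparing the length of a stored vector with the expected size. The following is a minimal sketch, assuming a vector has already been fetched from Weaviate as a plain list of floats. `describe_vector_dimension` is a hypothetical helper for illustration, not a function from `test_bge_m3_quality.py`.

```python
from typing import Sequence

EXPECTED_DIM = 1024  # BGE-M3 vector size, as stated in this document
LEGACY_DIM = 384     # MiniLM-L6 vector size, as stated in this document


def describe_vector_dimension(vector: Sequence[float]) -> str:
    """Classify a stored vector by its length (hypothetical helper)."""
    dim = len(vector)
    if dim == EXPECTED_DIM:
        return "BGE-M3 (1024-dim) vectors confirmed"
    if dim == LEGACY_DIM:
        return "MiniLM-L6 (384-dim) vectors found: migration incomplete"
    return f"Unexpected vector dimension: {dim}"
```

Any other length would suggest that the collection was created with a different vectorizer than either of the two models discussed here.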

Test Categories

1. Multilingual Queries

Tests the model's ability to understand philosophical terms in multiple languages:

| Language | Test Terms |
|----------|------------|
| French | justice, vertu, liberte, verite, connaissance |
| English | virtue, knowledge, ethics, wisdom, justice |
| Greek | arete, telos, psyche, logos, eudaimonia |
| Latin | virtus, sapientia, forma, anima, ratio |
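
One way to drive these terms through a search backend is sketched below. It assumes the search is exposed as a callable that takes a query string and returns a list of hits. Both that callable and `terms_without_results` are hypothetical, and the real script may organize this differently.

```python
from typing import Callable, Dict, List

# Test terms copied from the table above.
TEST_TERMS: Dict[str, List[str]] = {
    "French": ["justice", "vertu", "liberte", "verite", "connaissance"],
    "English": ["virtue", "knowledge", "ethics", "wisdom", "justice"],
    "Greek": ["arete", "telos", "psyche", "logos", "eudaimonia"],
    "Latin": ["virtus", "sapientia", "forma", "anima", "ratio"],
}


def terms_without_results(
    search: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Return, per language, the terms for which `search` found nothing."""
    misses: Dict[str, List[str]] = {}
    for language, terms in TEST_TERMS.items():
        empty = [term for term in terms if not search(term)]
        if empty:
            misses[language] = empty
    return misses
```

An empty dictionary means every term returned at least one hit. It says nothing about whether those hits are relevant.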

2. Semantic Understanding

Tests concept mapping for philosophical questions:

| Query | Expected Topics |
|-------|-----------------|
| "What is the nature of reality?" | ontology, metaphysics, being |
| "How should we live?" | ethics, virtue, good life |
| "What can we know?" | epistemology, knowledge, truth |
| "What is the meaning of life?" | purpose, existence, value |
| "What is beauty?" | aesthetics, art, form |

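A simple way to score a query against its expected topics is to count how many of them appear in the returned text. This is a rough sketch using case-insensitive substring matching, which is an assumption made for illustration. `topic_coverage` is a hypothetical helper, and the validation script may score results differently.

```python
from typing import Iterable, List


def topic_coverage(expected_topics: Iterable[str], result_texts: List[str]) -> float:
    """Fraction of expected topics mentioned in at least one result text."""
    topics = [topic.lower() for topic in expected_topics]
    if not topics:
        return 0.0
    # Join all results so a topic counts once, wherever it appears.
    combined = " ".join(result_texts).lower()
    hits = sum(1 for topic in topics if topic in combined)
    return hits / len(topics)


coverage = topic_coverage(
    ["ontology", "metaphysics", "being"],
    ["Aristotle's Metaphysics studies being qua being."],
)
```

Substring matching is crude: it misses synonyms and translations, so it is better read as a lower bound on semantic relevance than as a precise measure.
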
3. Long Query Handling

Tests the extended 8192-token context window (vs. MiniLM-L6's 512 tokens):

  • Uses a 100+ word query about Plato's Meno
  • Verifies no truncation occurs
  • Measures semantic accuracy of results
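
The truncation check can be expressed as a comparison between the tokenized query length and the model limit. The sketch below takes the tokenizer as a parameter. The usage example passes `str.split` as a stand-in, which is an assumption for illustration only, because a real subword tokenizer produces more tokens than whitespace splitting. `is_truncated` is a hypothetical helper.

```python
from typing import Callable, List

BGE_M3_MAX_TOKENS = 8192  # context window stated in this document
MINILM_MAX_TOKENS = 512   # context window stated in this document


def is_truncated(
    query: str,
    tokenize: Callable[[str], List[str]],
    max_tokens: int = BGE_M3_MAX_TOKENS,
) -> bool:
    """Return True if the tokenized query exceeds the model's limit."""
    return len(tokenize(query)) > max_tokens


# Stand-in tokenizer: whitespace splitting undercounts real subword tokens.
long_query = " ".join(["word"] * 600)
truncated_by_bge_m3 = is_truncated(long_query, str.split)
truncated_by_minilm = is_truncated(long_query, str.split, MINILM_MAX_TOKENS)
```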

4. Performance Metrics

Performance targets:

  • Query Latency: < 500ms average
  • Throughput: Measured across 10 iterations per query
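
These measurements can be sketched as a small timing loop. The code below assumes the search is available as a callable that takes a query string. That callable and the `benchmark` helper are hypothetical, while the 500 ms target and the 10 iterations come from the list above.

```python
import time
from statistics import mean
from typing import Callable, Dict, List

LATENCY_TARGET_MS = 500.0  # average latency target from this document
ITERATIONS = 10            # iterations per query from this document


def benchmark(
    search: Callable[[str], object],
    queries: List[str],
    iterations: int = ITERATIONS,
) -> Dict[str, object]:
    """Time each query `iterations` times and summarize the results."""
    if not queries or iterations < 1:
        raise ValueError("need at least one query and one iteration")
    latencies_ms: List[float] = []
    for query in queries:
        for _ in range(iterations):
            start = time.perf_counter()
            search(query)
            latencies_ms.append((time.perf_counter() - start) * 1000.0)
    avg_ms = mean(latencies_ms)
    total_s = sum(latencies_ms) / 1000.0
    return {
        "avg_latency_ms": avg_ms,
        # Sequential throughput: completed queries per second of search time.
        "queries_per_second": len(latencies_ms) / total_s if total_s > 0 else float("inf"),
        "meets_target": avg_ms < LATENCY_TARGET_MS,
    }
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock intended for measuring short durations.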

Running the Tests

# Run all tests with verbose output
python test_bge_m3_quality.py --verbose

# Generate markdown report
python test_bge_m3_quality.py --output SEARCH_QUALITY_RESULTS.md

# Output as JSON
python test_bge_m3_quality.py --json

Prerequisites

  1. Weaviate must be running:

    docker-compose up -d
    
  2. Documents must be ingested with the BGE-M3 vectorizer

  3. Schema must be created with 1024-dim vectors

Expected Improvements over MiniLM-L6

| Feature | MiniLM-L6 | BGE-M3 |
|---------|-----------|--------|
| Vector Dimensions | 384 | 1024 (2.7x richer) |
| Context Window | 512 tokens | 8192 tokens (16x larger) |
| Multilingual | Limited | Excellent (Greek, Latin, French, English) |
| Academic Texts | Good | Superior (trained on research papers) |

Troubleshooting

"Connection error: Failed to connect to Weaviate"

Ensure Weaviate is running:

docker-compose up -d
docker-compose ps  # Check status

"No vectors found in Chunk collection"

Ensure documents have been ingested:

python reingest_from_cache.py

Vector dimensions show 384 instead of 1024

The BGE-M3 migration is incomplete. Re-run:

python migrate_to_bge_m3.py