Add Library RAG project and cleanup root directory
- Add complete Library RAG application (Flask + MCP server)
- PDF processing pipeline with OCR and LLM extraction
- Weaviate vector database integration (BGE-M3 embeddings)
- Flask web interface with search and document management
- MCP server for Claude Desktop integration
- Comprehensive test suite (134 tests)
- Clean up root directory
- Remove obsolete documentation files
- Remove backup and temporary files
- Update autonomous agent configuration
- Update prompts
- Enhance initializer bis prompt with better instructions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
71
generations/library_rag/examples/KNOWN_ISSUES.md
Normal file
@@ -0,0 +1,71 @@
# Known Issues - MCP Client

## 1. Author/Work Filters Not Supported (Weaviate Limitation)

**Status:** Known limitation

**Affects:** `search_chunks` and `search_summaries` tools

**Error:** Using the `author_filter` or `work_filter` parameters results in a server error

**Root Cause:**

Weaviate v4 does not support filtering on nested object properties. The `work` field in the Chunk schema is defined as:

```python
wvc.Property(
    name="work",
    data_type=wvc.DataType.OBJECT,
    nested_properties=[
        wvc.Property(name="title", data_type=wvc.DataType.TEXT),
        wvc.Property(name="author", data_type=wvc.DataType.TEXT),
    ],
)
```

Attempts to filter on `work.author` or `work.title` result in:

```
data type "object" not supported in query
```

**Workaround:**

Use the `filter_by_author` tool instead:

```python
# Instead of:
search_chunks(
    query="nominalism",
    author_filter="Charles Sanders Peirce"  # ❌ Doesn't work
)

# Use:
filter_by_author(
    author="Charles Sanders Peirce"  # ✓ Works
)
```

Or search without filters and filter client-side:

```python
results = await client.call_tool("search_chunks", {
    "query": "nominalism",
    "limit": 50  # Fetch more
})

# Filter in Python
filtered = [
    r for r in results["results"]
    if r["work_author"] == "Charles Sanders Peirce"
]
```

**Future Fix:**

Option 1: Add flat properties `workAuthor` and `workTitle` to the Chunk schema (requires a migration)
Option 2: Implement post-filtering in Python on the server side
Option 3: Wait for Weaviate to support nested object filtering
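
Option 2 could be sketched as a small server-side helper: over-fetch from Weaviate, drop non-matching authors in Python, then trim to the requested limit. The helper name below is hypothetical, not part of the current codebase:

```python
def post_filter_chunks(raw_results, author=None, limit=10):
    """Post-filter search results on work_author after retrieval.

    raw_results: list of dicts shaped like search_chunks results,
    each carrying a "work_author" key. The caller should over-fetch
    (e.g. limit * 5) so enough matches survive the filter.
    """
    if author is not None:
        # Keep only chunks whose author matches exactly
        raw_results = [r for r in raw_results if r.get("work_author") == author]
    # Trim back down to the requested limit
    return raw_results[:limit]
```

This keeps the tool's public interface unchanged while sidestepping the nested-property limitation.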

**Tests Affected:**

- `test_mcp_client.py::test_search_chunks` - Works without filters
- Search with `author_filter` - Currently fails

**Last Updated:** 2025-12-25
165
generations/library_rag/examples/README.md
Normal file
@@ -0,0 +1,165 @@
# Library RAG - MCP Client Examples

This folder contains example MCP client implementations for using Library RAG from your own Python application.

## MCP Clients with an LLM

### 1. `mcp_client_claude.py` ⭐ RECOMMENDED

**MCP client with Claude (Anthropic)**

**Model:** Claude Sonnet 4.5 (`claude-sonnet-4-5-20250929`)

**Features:**
- Auto-loads API keys from `.env`
- Automatic tool calling
- Multi-turn conversation handling
- Natural-language synthesis of results

**Usage:**
```bash
# Make sure .env contains:
# ANTHROPIC_API_KEY=your_key
# MISTRAL_API_KEY=your_key

python examples/mcp_client_claude.py
```

**Example:**
```
User: "What did Peirce say about nominalism?"

Claude → search_chunks(query="Peirce nominalism")
       → Weaviate (BGE-M3 embeddings)
       → 10 chunks returned
Claude → "Peirce characterized nominalism as a 'tidal wave'..."
```

### 2. `mcp_client_reference.py`

**MCP client with Mistral AI**

**Model:** Mistral Large (`mistral-large-latest`)

Same features as the Claude client, but uses Mistral AI.

**Usage:**
```bash
python examples/mcp_client_reference.py
```

## Tests

### `test_mcp_quick.py`

Quick test (< 5 seconds) of the MCP features:
- ✅ search_chunks (semantic search)
- ✅ list_documents
- ✅ filter_by_author

```bash
python examples/test_mcp_quick.py
```

### `test_mcp_client.py`

Full test suite for the MCP client (unit tests for the 9 tools).

## Examples without MCP (direct pipeline)

### `example_python_usage.py`

Calls the MCP handlers directly (no subprocess):
```python
from mcp_tools import search_chunks_handler, SearchChunksInput

result = await search_chunks_handler(
    SearchChunksInput(query="nominalism", limit=10)
)
```

### `example_direct_pipeline.py`

Uses the PDF pipeline directly:
```python
from utils.pdf_pipeline import process_pdf

result = process_pdf(
    Path("document.pdf"),
    use_llm=True,
    ingest_to_weaviate=True
)
```

## Architecture

```
┌─────────────────────────────────────────┐
│  Your Application                       │
│                                         │
│  Claude/Mistral (conversational LLM)    │
│         ↓                               │
│  MCPClient (stdio JSON-RPC)             │
└────────────┬────────────────────────────┘
             ↓
┌─────────────────────────────────────────┐
│  MCP Server (subprocess)                │
│  - 9 tools available                    │
│  - search_chunks, parse_pdf, etc.       │
└────────────┬────────────────────────────┘
             ↓
┌─────────────────────────────────────────┐
│  Weaviate + BGE-M3 embeddings           │
│  - 5,180 Peirce chunks                  │
│  - Semantic search                      │
└─────────────────────────────────────────┘
```
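
The stdio JSON-RPC link in the diagram above carries newline-delimited messages. For illustration, with values taken from the example clients in this folder, the first two requests the client writes to the server's stdin look roughly like:

```json
{"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {"protocolVersion": "2024-11-05", "capabilities": {"tools": {}}, "clientInfo": {"name": "library-rag-client", "version": "1.0.0"}}}
{"jsonrpc": "2.0", "id": 2, "method": "tools/call", "params": {"name": "search_chunks", "arguments": {"query": "nominalism", "limit": 10}}}
```

Each response comes back as a single JSON line on the server's stdout.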

## Embeddings vs LLM

**Important:** Three distinct models are involved:

1. **BGE-M3** (text2vec-transformers in Weaviate)
   - Role: vectorization (1024-dim embeddings)
   - When: ingestion + search
   - Not changeable without a migration

2. **Claude/Mistral** (conversational agent)
   - Role: understand questions + synthesize answers
   - When: every user conversation
   - Changeable (your choice)

3. **Mistral OCR** (pixtral-12b)
   - Role: text extraction from PDFs
   - When: PDF ingestion (via the parse_pdf tool)
   - Fixed by the MCP server
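
To illustrate the vectorization role: semantic search ranks chunks by the similarity between embedding vectors, typically cosine similarity. A minimal sketch with toy 3-dim vectors (the real BGE-M3 vectors have 1024 dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for BGE-M3 embeddings
query_vec = [0.1, 0.9, 0.2]
chunk_vec = [0.1, 0.8, 0.3]
print(f"similarity: {cosine_similarity(query_vec, chunk_vec):.3f}")
```

Weaviate performs this ranking internally; the sketch only shows what "semantic search" means at the vector level.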

## Available MCP Tools

| Tool | Description |
|------|-------------|
| `search_chunks` | Semantic search (500 max) |
| `search_summaries` | Search within summaries |
| `list_documents` | List all documents |
| `get_document` | Fetch a specific document |
| `get_chunks_by_document` | Chunks of one document |
| `filter_by_author` | Filter by author |
| `parse_pdf` | Ingest a PDF/Markdown file |
| `delete_document` | Delete a document |
| `ping` | Health check |

## Known Limitations

See `KNOWN_ISSUES.md` for details:
- ⚠️ `author_filter` and `work_filter` do not work (Weaviate nested-objects limitation)
- ✅ Workaround: use the `filter_by_author` tool instead

## Requirements

```bash
pip install anthropic python-dotenv  # For Claude
# OR
pip install mistralai  # For Mistral
```

All dependencies are listed in the parent project's `requirements.txt`.
91
generations/library_rag/examples/example_direct_pipeline.py
Normal file
@@ -0,0 +1,91 @@
#!/usr/bin/env python3
"""
Example of DIRECT use of the PDF pipeline (without MCP).

Simpler, and gives more control over the parameters!
"""

from pathlib import Path
from utils.pdf_pipeline import process_pdf, process_pdf_bytes
import weaviate
from weaviate.classes.query import Filter


def example_process_local_file():
    """Process a local file (PDF or Markdown)."""

    result = process_pdf(
        pdf_path=Path("md/peirce_collected_papers_fixed.md"),
        output_dir=Path("output"),

        # Customizable parameters
        skip_ocr=True,                # Already in Markdown
        use_llm=False,                # No LLM needed for Peirce
        use_semantic_chunking=False,  # Basic chunking (fast)
        ingest_to_weaviate=True,      # Ingest into Weaviate
    )

    if result.get("success"):
        print(f"✓ {result['document_name']}: {result['chunks_count']} chunks")
        print(f"  Total cost: {result['cost_total']:.4f}€")
    else:
        print(f"✗ Error: {result.get('error')}")


def example_process_from_url():
    """Download and process a document from a URL."""

    import httpx

    url = "https://example.com/document.pdf"

    # Download
    response = httpx.get(url, follow_redirects=True)
    pdf_bytes = response.content

    # Process
    result = process_pdf_bytes(
        file_bytes=pdf_bytes,
        filename="document.pdf",
        output_dir=Path("output"),

        # Optimal parameters
        use_llm=True,
        llm_provider="mistral",  # Or "ollama"
        use_semantic_chunking=True,
        ingest_to_weaviate=True,
    )

    return result


def example_search():
    """Search Weaviate directly."""

    client = weaviate.connect_to_local()

    try:
        collection = client.collections.get('Chunk')

        # Semantic search
        response = collection.query.near_text(
            query="nominalism and realism",
            limit=10,
        )

        print(f"Found {len(response.objects)} results:")
        for obj in response.objects[:3]:
            props = obj.properties
            print(f"\n- {props.get('sectionPath', 'N/A')}")
            print(f"  {props.get('text', '')[:150]}...")

    finally:
        client.close()


if __name__ == "__main__":
    # Pick an example

    # example_process_local_file()
    # example_process_from_url()
    example_search()
78
generations/library_rag/examples/example_python_usage.py
Normal file
@@ -0,0 +1,78 @@
#!/usr/bin/env python3
"""
Example of using Library RAG from a Python application.

The MCP server is only for Claude Desktop.
From Python, call the handlers directly!
"""

import asyncio
from pathlib import Path

# Import the handlers directly
from mcp_tools import (
    parse_pdf_handler,
    ParsePdfInput,
    search_chunks_handler,
    SearchChunksInput,
)


async def example_parse_pdf():
    """Example: process a PDF or Markdown file."""

    # From a local path
    input_data = ParsePdfInput(
        pdf_path="C:/Users/david/Documents/platon.pdf"
    )

    # OR from a URL
    # input_data = ParsePdfInput(
    #     pdf_path="https://example.com/aristotle.pdf"
    # )

    # OR a Markdown file
    # input_data = ParsePdfInput(
    #     pdf_path="/path/to/peirce.md"
    # )

    result = await parse_pdf_handler(input_data)

    if result.success:
        print(f"✓ Document processed: {result.document_name}")
        print(f"  Pages: {result.pages}")
        print(f"  Chunks: {result.chunks_count}")
        print(f"  Cost: {result.cost_total:.4f}€")
    else:
        print(f"✗ Error: {result.error}")


async def example_search():
    """Example: search the chunks."""

    input_data = SearchChunksInput(
        query="nominalism and realism",
        limit=10,
        author_filter="Charles Sanders Peirce",  # Optional; currently fails (see KNOWN_ISSUES.md)
    )

    result = await search_chunks_handler(input_data)

    print(f"Found {result.total_count} results:")
    for i, chunk in enumerate(result.results[:5], 1):
        print(f"\n[{i}] Similarity: {chunk.similarity:.3f}")
        print(f"  {chunk.text[:200]}...")


async def main():
    """Main entry point."""

    # Example 1: process a PDF
    # await example_parse_pdf()

    # Example 2: search
    await example_search()


if __name__ == "__main__":
    asyncio.run(main())
359
generations/library_rag/examples/mcp_client_claude.py
Normal file
@@ -0,0 +1,359 @@
#!/usr/bin/env python3
"""
MCP client for Library RAG using Claude (Anthropic).

Implements an MCP client that lets Claude use the
Library RAG tools via tool calling.

Usage:
    python mcp_client_claude.py

Requirements:
    pip install anthropic python-dotenv
"""

import asyncio
import json
import os
import sys
from dataclasses import dataclass
from pathlib import Path
from typing import Any

# Load environment variables from .env
try:
    from dotenv import load_dotenv
    # Load from the parent project's .env
    env_path = Path(__file__).parent.parent / ".env"
    load_dotenv(env_path)
    print(f"[ENV] Loaded environment from {env_path}")
except ImportError:
    print("[ENV] python-dotenv not installed, using system environment variables")
    print("      Install with: pip install python-dotenv")


@dataclass
class ToolDefinition:
    """Definition of an MCP tool."""

    name: str
    description: str
    input_schema: dict[str, Any]


class MCPClient:
    """Client for talking to the Library RAG MCP server."""

    def __init__(self, server_path: str, env: dict[str, str] | None = None):
        """
        Args:
            server_path: Path to mcp_server.py
            env: Additional environment variables
        """
        self.server_path = server_path
        self.env = env or {}
        self.process = None
        self.request_id = 0

    async def start(self) -> None:
        """Start the MCP server subprocess."""
        print(f"[MCP] Starting server: {self.server_path}")

        # Prepare the environment
        full_env = {**os.environ, **self.env}

        # Start the subprocess
        self.process = await asyncio.create_subprocess_exec(
            sys.executable,
            self.server_path,
            stdin=asyncio.subprocess.PIPE,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
            env=full_env,
        )

        # Phase 1: Initialize
        init_result = await self._send_request(
            "initialize",
            {
                "protocolVersion": "2024-11-05",
                "capabilities": {"tools": {}},
                "clientInfo": {"name": "library-rag-client-claude", "version": "1.0.0"},
            },
        )

        print(f"[MCP] Server initialized: {init_result.get('serverInfo', {}).get('name')}")

        # Phase 2: Initialized notification
        await self._send_notification("notifications/initialized", {})

        print("[MCP] Client ready")

    async def _send_request(self, method: str, params: dict) -> dict:
        """Send a JSON-RPC request and wait for the response."""
        self.request_id += 1
        request = {
            "jsonrpc": "2.0",
            "id": self.request_id,
            "method": method,
            "params": params,
        }

        # Send
        request_json = json.dumps(request) + "\n"
        self.process.stdin.write(request_json.encode())
        await self.process.stdin.drain()

        # Receive
        response_line = await self.process.stdout.readline()
        if not response_line:
            raise RuntimeError("MCP server closed connection")

        response = json.loads(response_line.decode())

        # Check for errors
        if "error" in response:
            raise RuntimeError(f"MCP error: {response['error']}")

        return response.get("result", {})

    async def _send_notification(self, method: str, params: dict) -> None:
        """Send a notification (no response expected)."""
        notification = {"jsonrpc": "2.0", "method": method, "params": params}

        notification_json = json.dumps(notification) + "\n"
        self.process.stdin.write(notification_json.encode())
        await self.process.stdin.drain()

    async def list_tools(self) -> list[ToolDefinition]:
        """Fetch the list of available tools."""
        result = await self._send_request("tools/list", {})
        tools = result.get("tools", [])

        tool_defs = [
            ToolDefinition(
                name=tool["name"],
                description=tool["description"],
                input_schema=tool["inputSchema"],
            )
            for tool in tools
        ]

        print(f"[MCP] Found {len(tool_defs)} tools")
        return tool_defs

    async def call_tool(self, tool_name: str, arguments: dict) -> Any:
        """Call an MCP tool."""
        print(f"[MCP] Calling tool: {tool_name}")
        print(f"      Arguments: {json.dumps(arguments, indent=2)[:200]}...")

        result = await self._send_request(
            "tools/call", {"name": tool_name, "arguments": arguments}
        )

        # Extract the content
        content = result.get("content", [])
        if content and content[0].get("type") == "text":
            text_content = content[0]["text"]
            try:
                return json.loads(text_content)
            except json.JSONDecodeError:
                return text_content

        return result

    async def stop(self) -> None:
        """Stop the MCP server."""
        if self.process:
            print("[MCP] Stopping server...")
            self.process.terminate()
            await self.process.wait()
            print("[MCP] Server stopped")


class ClaudeWithMCP:
    """Claude with the ability to use the MCP tools."""

    def __init__(self, mcp_client: MCPClient, anthropic_api_key: str):
        """
        Args:
            mcp_client: Initialized MCP client
            anthropic_api_key: Anthropic API key
        """
        self.mcp_client = mcp_client
        self.anthropic_api_key = anthropic_api_key
        self.tools = None
        self.messages = []

        # Import Claude
        try:
            from anthropic import Anthropic

            self.client = Anthropic(api_key=anthropic_api_key)
        except ImportError:
            raise ImportError("Install anthropic: pip install anthropic")

    async def initialize(self) -> None:
        """Load the MCP tools and convert them for Claude."""
        mcp_tools = await self.mcp_client.list_tools()

        # Convert to Claude's format (identical to the MCP format)
        self.tools = [
            {
                "name": tool.name,
                "description": tool.description,
                "input_schema": tool.input_schema,
            }
            for tool in mcp_tools
        ]

        print(f"[Claude] Loaded {len(self.tools)} tools")

    async def chat(self, user_message: str, max_iterations: int = 10) -> str:
        """
        Converse with Claude, which can use the MCP tools.

        Args:
            user_message: The user's message
            max_iterations: Tool-call limit

        Returns:
            Claude's final response
        """
        print(f"\n[USER] {user_message}\n")

        self.messages.append({"role": "user", "content": user_message})

        for iteration in range(max_iterations):
            print(f"[Claude] Iteration {iteration + 1}/{max_iterations}")

            # Call Claude with tools
            response = self.client.messages.create(
                model="claude-sonnet-4-5-20250929",  # Claude Sonnet 4.5
                max_tokens=4096,
                messages=self.messages,
                tools=self.tools,
            )

            # Append Claude's response
            assistant_message = {
                "role": "assistant",
                "content": response.content,
            }
            self.messages.append(assistant_message)

            # Check whether Claude wants to use tools
            tool_uses = [block for block in response.content if block.type == "tool_use"]

            # No tool use → final response
            if not tool_uses:
                # Extract the text of the response
                text_blocks = [block for block in response.content if block.type == "text"]
                if text_blocks:
                    print("[Claude] Final response")
                    return text_blocks[0].text
                return ""

            # Execute the tool uses
            print(f"[Claude] Tool uses: {len(tool_uses)}")

            tool_results = []

            for tool_use in tool_uses:
                tool_name = tool_use.name
                arguments = tool_use.input

                # Call via MCP
                try:
                    result = await self.mcp_client.call_tool(tool_name, arguments)
                    result_str = json.dumps(result) if isinstance(result, dict) else str(result)
                    print(f"[MCP] Result: {result_str[:200]}...")

                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": tool_use.id,
                        "content": result_str,
                    })

                except Exception as e:
                    print(f"[MCP] Error: {e}")
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": tool_use.id,
                        "content": json.dumps({"error": str(e)}),
                        "is_error": True,
                    })

            # Append the tool results
            self.messages.append({
                "role": "user",
                "content": tool_results,
            })

        return "Max iterations reached"


async def main():
    """Example usage of the MCP client with Claude."""

    # Configuration
    library_rag_path = Path(__file__).parent.parent
    server_path = library_rag_path / "mcp_server.py"

    anthropic_api_key = os.getenv("ANTHROPIC_API_KEY")
    if not anthropic_api_key:
        print("ERROR: ANTHROPIC_API_KEY not found in .env file")
        print("Please add to .env: ANTHROPIC_API_KEY=your_key")
        return

    mistral_api_key = os.getenv("MISTRAL_API_KEY")
    if not mistral_api_key:
        print("ERROR: MISTRAL_API_KEY not found in .env file")
        print("The MCP server needs the Mistral API for OCR functionality")
        return

    # 1. Create and start the MCP client
    mcp_client = MCPClient(
        server_path=str(server_path),
        env={
            "MISTRAL_API_KEY": mistral_api_key,
        },
    )

    try:
        await mcp_client.start()

        # 2. Create the Claude agent
        agent = ClaudeWithMCP(mcp_client, anthropic_api_key)
        await agent.initialize()

        # 3. Example conversations
        print("\n" + "=" * 80)
        print("EXAMPLE 1: Search in Peirce")
        print("=" * 80)

        response = await agent.chat(
            "What did Charles Sanders Peirce say about the philosophical debate "
            "between nominalism and realism? Search the database and provide "
            "a detailed summary with specific quotes."
        )

        print(f"\n[CLAUDE]\n{response}\n")

        print("\n" + "=" * 80)
        print("EXAMPLE 2: Explore database")
        print("=" * 80)

        response = await agent.chat(
            "What documents are available in the database? "
            "Give me an overview of the authors and topics covered."
        )

        print(f"\n[CLAUDE]\n{response}\n")

    finally:
        await mcp_client.stop()


if __name__ == "__main__":
    asyncio.run(main())
347
generations/library_rag/examples/mcp_client_reference.py
Normal file
@@ -0,0 +1,347 @@
#!/usr/bin/env python3
"""
Reference MCP client for Library RAG.

Complete implementation of an MCP client that lets an LLM
use the Library RAG tools.

Usage:
    python mcp_client_reference.py

Requirements:
    pip install mistralai anyio
"""

import asyncio
import json
import os
import sys
from dataclasses import dataclass
from pathlib import Path
from typing import Any


@dataclass
class ToolDefinition:
    """Definition of an MCP tool."""

    name: str
    description: str
    input_schema: dict[str, Any]


class MCPClient:
    """Client for talking to the Library RAG MCP server."""

    def __init__(self, server_path: str, env: dict[str, str] | None = None):
        """
        Args:
            server_path: Path to mcp_server.py
            env: Additional environment variables
        """
        self.server_path = server_path
        self.env = env or {}
        self.process = None
        self.request_id = 0

    async def start(self) -> None:
        """Start the MCP server subprocess."""
        print(f"[MCP] Starting server: {self.server_path}")

        # Prepare the environment
        full_env = {**os.environ, **self.env}

        # Start the subprocess
        self.process = await asyncio.create_subprocess_exec(
            sys.executable,  # Python executable
            self.server_path,
            stdin=asyncio.subprocess.PIPE,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
            env=full_env,
        )

        # Phase 1: Initialize
        init_result = await self._send_request(
            "initialize",
            {
                "protocolVersion": "2024-11-05",
                "capabilities": {"tools": {}},
                "clientInfo": {"name": "library-rag-client", "version": "1.0.0"},
            },
        )

        print(f"[MCP] Server initialized: {init_result.get('serverInfo', {}).get('name')}")

        # Phase 2: Initialized notification
        await self._send_notification("notifications/initialized", {})

        print("[MCP] Client ready")

    async def _send_request(self, method: str, params: dict) -> dict:
        """Send a JSON-RPC request and wait for the response."""
        self.request_id += 1
        request = {
            "jsonrpc": "2.0",
            "id": self.request_id,
            "method": method,
            "params": params,
        }

        # Send
        request_json = json.dumps(request) + "\n"
        self.process.stdin.write(request_json.encode())
        await self.process.stdin.drain()

        # Receive
        response_line = await self.process.stdout.readline()
        if not response_line:
            raise RuntimeError("MCP server closed connection")

        response = json.loads(response_line.decode())

        # Check for errors
        if "error" in response:
            raise RuntimeError(f"MCP error: {response['error']}")

        return response.get("result", {})

    async def _send_notification(self, method: str, params: dict) -> None:
        """Send a notification (no response expected)."""
        notification = {"jsonrpc": "2.0", "method": method, "params": params}

        notification_json = json.dumps(notification) + "\n"
        self.process.stdin.write(notification_json.encode())
        await self.process.stdin.drain()

    async def list_tools(self) -> list[ToolDefinition]:
        """Fetch the list of available tools."""
        result = await self._send_request("tools/list", {})
        tools = result.get("tools", [])

        tool_defs = [
            ToolDefinition(
                name=tool["name"],
                description=tool["description"],
                input_schema=tool["inputSchema"],
            )
            for tool in tools
        ]

        print(f"[MCP] Found {len(tool_defs)} tools")
        return tool_defs

    async def call_tool(self, tool_name: str, arguments: dict) -> Any:
        """Call an MCP tool."""
        print(f"[MCP] Calling tool: {tool_name}")
        print(f"      Arguments: {json.dumps(arguments, indent=2)}")

        result = await self._send_request(
            "tools/call", {"name": tool_name, "arguments": arguments}
        )

        # Extract the content
        content = result.get("content", [])
        if content and content[0].get("type") == "text":
            text_content = content[0]["text"]
            try:
                return json.loads(text_content)
            except json.JSONDecodeError:
                return text_content

        return result

    async def stop(self) -> None:
        """Stop the MCP server."""
        if self.process:
            print("[MCP] Stopping server...")
            self.process.terminate()
            await self.process.wait()
            print("[MCP] Server stopped")


class LLMWithMCP:
    """LLM with the ability to use the MCP tools."""

    def __init__(self, mcp_client: MCPClient, mistral_api_key: str):
        """
        Args:
            mcp_client: Initialized MCP client
            mistral_api_key: Mistral API key
        """
        self.mcp_client = mcp_client
        self.mistral_api_key = mistral_api_key
        self.tools = None
        self.messages = []

        # Import Mistral
        try:
            from mistralai import Mistral

            self.mistral = Mistral(api_key=mistral_api_key)
        except ImportError:
            raise ImportError("Install mistralai: pip install mistralai")

    async def initialize(self) -> None:
        """Load the MCP tools and convert them for Mistral."""
        mcp_tools = await self.mcp_client.list_tools()

        # Convert to Mistral's format
        self.tools = [
            {
                "type": "function",
                "function": {
                    "name": tool.name,
                    "description": tool.description,
                    "parameters": tool.input_schema,
                },
            }
            for tool in mcp_tools
        ]

        print(f"[LLM] Loaded {len(self.tools)} tools for Mistral")

    async def chat(self, user_message: str, max_iterations: int = 10) -> str:
        """
        Converse with the LLM, which can use the MCP tools.

        Args:
            user_message: The user's message
            max_iterations: Tool-call limit

        Returns:
            The LLM's final response
        """
        print(f"\n[USER] {user_message}\n")

        self.messages.append({"role": "user", "content": user_message})

        for iteration in range(max_iterations):
            print(f"[LLM] Iteration {iteration + 1}/{max_iterations}")

            # Call the LLM with tools
            response = self.mistral.chat.complete(
                model="mistral-large-latest",
                messages=self.messages,
                tools=self.tools,
                tool_choice="auto",
            )

            assistant_message = response.choices[0].message

            # Append the assistant message
            self.messages.append(
                {
                    "role": "assistant",
                    "content": assistant_message.content or "",
                    "tool_calls": (
                        [
                            {
                                "id": tc.id,
                                "type": "function",
                                "function": {
                                    "name": tc.function.name,
                                    "arguments": tc.function.arguments,
                                },
                            }
                            for tc in assistant_message.tool_calls
                        ]
                        if assistant_message.tool_calls
                        else None
                    ),
                }
            )

            # No tool calls → final response
            if not assistant_message.tool_calls:
                print("[LLM] Final response")
                return assistant_message.content

            # Execute the tool calls
            print(f"[LLM] Tool calls: {len(assistant_message.tool_calls)}")

            for tool_call in assistant_message.tool_calls:
                tool_name = tool_call.function.name
                arguments = json.loads(tool_call.function.arguments)

                # Call via MCP
                try:
                    result = await self.mcp_client.call_tool(tool_name, arguments)
                    result_str = json.dumps(result)
                    print(f"[MCP] Result: {result_str[:200]}...")

                except Exception as e:
                    result_str = json.dumps({"error": str(e)})
                    print(f"[MCP] Error: {e}")

                # Append the result
                self.messages.append(
                    {
                        "role": "tool",
                        "name": tool_name,
                        "content": result_str,
                        "tool_call_id": tool_call.id,
                    }
                )

        return "Max iterations reached"


async def main():
    """Example usage of the MCP client."""

    # Configuration
    library_rag_path = Path(__file__).parent.parent
    server_path = library_rag_path / "mcp_server.py"

    mistral_api_key = os.getenv("MISTRAL_API_KEY")
    if not mistral_api_key:
        print("ERROR: MISTRAL_API_KEY not set")
        return

    # 1. Create and start the MCP client
    mcp_client = MCPClient(
        server_path=str(server_path),
        env={
            "MISTRAL_API_KEY": mistral_api_key,
            # Add other variables if needed
        },
)
|
||||
|
||||
try:
|
||||
await mcp_client.start()
|
||||
|
||||
# 2. Créer l'agent LLM
|
||||
agent = LLMWithMCP(mcp_client, mistral_api_key)
|
||||
await agent.initialize()
|
||||
|
||||
# 3. Exemples de conversations
|
||||
print("\n" + "=" * 80)
|
||||
print("EXAMPLE 1: Search")
|
||||
print("=" * 80)
|
||||
|
||||
response = await agent.chat(
|
||||
"What did Charles Sanders Peirce say about the debate between "
|
||||
"nominalism and realism? Search the database and give me a summary "
|
||||
"with specific quotes."
|
||||
)
|
||||
|
||||
print(f"\n[ASSISTANT]\n{response}\n")
|
||||
|
||||
print("\n" + "=" * 80)
|
||||
print("EXAMPLE 2: List documents")
|
||||
print("=" * 80)
|
||||
|
||||
response = await agent.chat(
|
||||
"List all the documents in the database. "
|
||||
"How many are there and who are the authors?"
|
||||
)
|
||||
|
||||
print(f"\n[ASSISTANT]\n{response}\n")
|
||||
|
||||
finally:
|
||||
await mcp_client.stop()
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
asyncio.run(main())
|
||||
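The tool-calling loop above hinges on the message shapes the Mistral chat API expects: an assistant turn carrying `tool_calls`, followed by one `tool` turn per call, linked back via `tool_call_id`. A minimal self-contained sketch of that bookkeeping (no API call; the tool call and its result are hard-coded stand-ins, and the document IDs are invented for illustration):

```python
import json

# Conversation history, built the same way as in mcp_client_reference.py.
messages = [{"role": "user", "content": "List the documents."}]

# Stand-in for a tool call the model might return.
tool_call = {
    "id": "call_0",
    "type": "function",
    "function": {"name": "list_documents", "arguments": json.dumps({"limit": 5})},
}

# Assistant turn: content may be empty when the model only calls tools.
messages.append({"role": "assistant", "content": "", "tool_calls": [tool_call]})

# Tool turn: the MCP result is serialized and linked back via tool_call_id.
result = {"total_count": 2, "documents": [{"source_id": "doc_a"}, {"source_id": "doc_b"}]}
messages.append(
    {
        "role": "tool",
        "name": tool_call["function"]["name"],
        "content": json.dumps(result),
        "tool_call_id": tool_call["id"],
    }
)

print(len(messages))         # 3
print(messages[-1]["role"])  # tool
```

On the next iteration the whole `messages` list goes back to the model, which can now answer from the tool output or issue further calls.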
192
generations/library_rag/examples/test_mcp_client.py
Normal file
@@ -0,0 +1,192 @@
#!/usr/bin/env python3
"""
Simple test of the MCP client (without an LLM).

Tests direct communication with the MCP server.

Usage:
    python test_mcp_client.py
"""

import asyncio
import json
import os
import sys
from pathlib import Path

# Add the parent directory to the path for imports
sys.path.insert(0, str(Path(__file__).parent))

from mcp_client_reference import MCPClient


async def test_basic_communication():
    """Test: basic communication with the server."""
    print("TEST 1: Basic Communication")
    print("-" * 80)

    library_rag_path = Path(__file__).parent.parent
    server_path = library_rag_path / "mcp_server.py"

    client = MCPClient(
        server_path=str(server_path),
        env={"MISTRAL_API_KEY": os.getenv("MISTRAL_API_KEY", "")},
    )

    try:
        await client.start()
        print("[OK] Server started\n")

        # List the tools
        tools = await client.list_tools()
        print(f"[OK] Found {len(tools)} tools:")
        for tool in tools:
            print(f"  - {tool.name}: {tool.description}")

        print("\n[OK] Test passed")

    finally:
        await client.stop()


async def test_search_chunks():
    """Test: semantic search."""
    print("\n\nTEST 2: Search Chunks")
    print("-" * 80)

    library_rag_path = Path(__file__).parent.parent
    server_path = library_rag_path / "mcp_server.py"

    client = MCPClient(
        server_path=str(server_path),
        env={"MISTRAL_API_KEY": os.getenv("MISTRAL_API_KEY", "")},
    )

    try:
        await client.start()

        # Search. Note: author_filter is deliberately not used here —
        # filtering on nested object properties is unsupported in
        # Weaviate v4 (see examples/KNOWN_ISSUES.md).
        result = await client.call_tool(
            "search_chunks",
            {
                "query": "nominalism and realism",
                "limit": 3,
            },
        )

        print("[OK] Query: nominalism and realism")
        print(f"[OK] Found {result['total_count']} results")

        for i, chunk in enumerate(result["results"][:3], 1):
            print(f"\n  [{i}] Similarity: {chunk['similarity']:.3f}")
            print(f"      Section: {chunk['section_path']}")
            print(f"      Preview: {chunk['text'][:150]}...")

        print("\n[OK] Test passed")

    finally:
        await client.stop()


async def test_list_documents():
    """Test: list the documents."""
    print("\n\nTEST 3: List Documents")
    print("-" * 80)

    library_rag_path = Path(__file__).parent.parent
    server_path = library_rag_path / "mcp_server.py"

    client = MCPClient(
        server_path=str(server_path),
        env={"MISTRAL_API_KEY": os.getenv("MISTRAL_API_KEY", "")},
    )

    try:
        await client.start()

        result = await client.call_tool("list_documents", {"limit": 10})

        print(f"[OK] Total documents: {result['total_count']}")

        for doc in result["documents"][:5]:
            print(f"\n  - {doc['source_id']}")
            print(f"    Author: {doc['author']}")
            print(f"    Chunks: {doc['chunks_count']}")

        print("\n[OK] Test passed")

    finally:
        await client.stop()


async def test_get_document():
    """Test: fetch a specific document."""
    print("\n\nTEST 4: Get Document")
    print("-" * 80)

    library_rag_path = Path(__file__).parent.parent
    server_path = library_rag_path / "mcp_server.py"

    client = MCPClient(
        server_path=str(server_path),
        env={"MISTRAL_API_KEY": os.getenv("MISTRAL_API_KEY", "")},
    )

    try:
        await client.start()

        # First list documents to find one
        list_result = await client.call_tool("list_documents", {"limit": 1})

        if list_result["documents"]:
            doc_id = list_result["documents"][0]["source_id"]

            # Fetch the document
            result = await client.call_tool(
                "get_document",
                {"source_id": doc_id, "include_chunks": True, "chunk_limit": 5},
            )

            print(f"[OK] Document: {result['source_id']}")
            print(f"     Author: {result['author']}")
            print(f"     Pages: {result['pages']}")
            print(f"     Chunks: {result['chunks_count']}")

            if result.get("chunks"):
                print("\n     First chunk preview:")
                print(f"     {result['chunks'][0]['text'][:200]}...")

            print("\n[OK] Test passed")
        else:
            print("[WARN] No documents in database")

    finally:
        await client.stop()


async def main():
    """Run all the tests."""
    print("=" * 80)
    print("MCP CLIENT TESTS")
    print("=" * 80)

    try:
        await test_basic_communication()
        await test_search_chunks()
        await test_list_documents()
        await test_get_document()

        print("\n" + "=" * 80)
        print("ALL TESTS PASSED [OK]")
        print("=" * 80)

    except Exception as e:
        print(f"\n[ERROR] Test failed: {e}")
        import traceback

        traceback.print_exc()


if __name__ == "__main__":
    asyncio.run(main())
62
generations/library_rag/examples/test_mcp_quick.py
Normal file
@@ -0,0 +1,62 @@
import asyncio
import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).parent))

from mcp_client_reference import MCPClient


async def main():
    client = MCPClient(server_path=str(Path(__file__).parent.parent / "mcp_server.py"), env={})

    await client.start()

    try:
        print("=" * 70)
        print("MCP CLIENT - FUNCTIONAL TESTS")
        print("=" * 70)

        # Test 1: Search chunks
        print("\n[TEST 1] Search chunks (semantic search)")
        result = await client.call_tool("search_chunks", {
            "query": "nominalism realism debate",
            "limit": 2
        })

        print(f"Results: {result['total_count']}")
        for i, chunk in enumerate(result['results'], 1):
            print(f"  [{i}] {chunk['work_author']} - Similarity: {chunk['similarity']:.3f}")
            print(f"      {chunk['text'][:80]}...")
        print("[OK]")

        # Test 2: List documents
        print("\n[TEST 2] List documents")
        result = await client.call_tool("list_documents", {"limit": 5})

        print(f"Total: {result['total_count']} documents")
        for doc in result['documents'][:3]:
            print(f"  - {doc['source_id']} ({doc['work_author']}): {doc['chunks_count']} chunks")
        print("[OK]")

        # Test 3: Filter by author
        print("\n[TEST 3] Filter by author")
        result = await client.call_tool("filter_by_author", {
            "author": "Charles Sanders Peirce"
        })

        print(f"Author: {result['author']}")
        print(f"Works: {result['total_works']}")
        print(f"Documents: {result['total_documents']}")
        if 'total_chunks' in result:
            print(f"Chunks: {result['total_chunks']}")
        print("[OK]")

        print("\n" + "=" * 70)
        print("ALL TESTS PASSED - MCP CLIENT IS WORKING!")
        print("=" * 70)
        print("\nNote: author_filter and work_filter parameters are not supported")
        print("      due to a Weaviate v4 limitation. See examples/KNOWN_ISSUES.md")

    finally:
        await client.stop()


asyncio.run(main())