From ef8cd32711956f5909b6577627c73244da09f0cd Mon Sep 17 00:00:00 2001 From: David Blanc Brioir Date: Tue, 30 Dec 2025 11:57:21 +0100 Subject: [PATCH] Remove obsolete documentation and backup files MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Remove REMOTE_WEAVIATE_ARCHITECTURE.md (moved to library_rag) - Remove navette.txt (obsolete notes) - Remove backup and obsolete app spec files 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 --- REMOTE_WEAVIATE_ARCHITECTURE.md | 431 ---- navette.txt | 2510 ------------------- prompts/app_spec_library_rag_types_docs.txt | 679 ----- prompts/app_spec_markdown_support.txt | 490 ---- prompts/app_spec_tavily_mcp.txt | 498 ---- prompts/app_spec_types_docs.backup.txt | 679 ----- prompts/coding_prompt_library.md | 290 --- 7 files changed, 5577 deletions(-) delete mode 100644 REMOTE_WEAVIATE_ARCHITECTURE.md delete mode 100644 navette.txt delete mode 100644 prompts/app_spec_library_rag_types_docs.txt delete mode 100644 prompts/app_spec_markdown_support.txt delete mode 100644 prompts/app_spec_tavily_mcp.txt delete mode 100644 prompts/app_spec_types_docs.backup.txt delete mode 100644 prompts/coding_prompt_library.md diff --git a/REMOTE_WEAVIATE_ARCHITECTURE.md b/REMOTE_WEAVIATE_ARCHITECTURE.md deleted file mode 100644 index cc4200c..0000000 --- a/REMOTE_WEAVIATE_ARCHITECTURE.md +++ /dev/null @@ -1,431 +0,0 @@ -# Architecture pour Weaviate distant (Synology/VPS) - -## Votre cas d'usage - -**Situation** : Application LLM (local ou cloud) → Weaviate (Synology ou VPS distant) - -**Besoins** : -- ✅ Fiabilité maximale -- ✅ Sécurité (données privées) -- ✅ Performance acceptable -- ✅ Maintenance simple - ---- - -## 🏆 Option recommandée : API REST + Tunnel sécurisé - -### Architecture globale - -``` -┌──────────────────────────────────────────────────────────────┐ -│ Application LLM │ -│ (Claude API, OpenAI, Ollama local, etc.) 
│ -└────────────────────┬─────────────────────────────────────────┘ - │ - ▼ -┌──────────────────────────────────────────────────────────────┐ -│ API REST Custom (Flask/FastAPI) │ -│ - Authentification JWT/API Key │ -│ - Rate limiting │ -│ - Logging │ -│ - HTTPS (Let's Encrypt) │ -└────────────────────┬─────────────────────────────────────────┘ - │ - ▼ (réseau privé ou VPN) -┌──────────────────────────────────────────────────────────────┐ -│ Synology NAS / VPS │ -│ ┌────────────────────────────────────────────────────┐ │ -│ │ Docker Compose │ │ -│ │ ┌──────────────────┐ ┌─────────────────────┐ │ │ -│ │ │ Weaviate :8080 │ │ text2vec-transformers│ │ │ -│ │ └──────────────────┘ └─────────────────────┘ │ │ -│ └────────────────────────────────────────────────────┘ │ -└──────────────────────────────────────────────────────────────┘ -``` - -### Pourquoi cette option ? - -✅ **Fiabilité maximale** (5/5) -- HTTP/REST = protocole standard, éprouvé -- Retry automatique facile -- Gestion d'erreur claire - -✅ **Sécurité** (5/5) -- HTTPS obligatoire -- Authentification par API key -- IP whitelisting possible -- Logs d'audit - -✅ **Performance** (4/5) -- Latence réseau inévitable -- Compression gzip possible -- Cache Redis optionnel - -✅ **Maintenance** (5/5) -- Code simple (Flask/FastAPI) -- Monitoring facile -- Déploiement standard - ---- - -## Comparaison des 4 options - -### Option 1 : API REST Custom (⭐ RECOMMANDÉ) - -**Architecture** : App → API REST → Weaviate - -**Code exemple** : - -```python -# api_server.py (déployé sur VPS/Synology) -from fastapi import FastAPI, HTTPException, Security -from fastapi.security import APIKeyHeader -import weaviate - -app = FastAPI() -api_key_header = APIKeyHeader(name="X-API-Key") - -# Connect to Weaviate (local on same machine) -client = weaviate.connect_to_local() - -def verify_api_key(api_key: str = Security(api_key_header)): - if api_key != os.getenv("API_KEY"): - raise HTTPException(status_code=403, detail="Invalid API key") - return 
api_key - -@app.post("/search") -async def search_chunks( - query: str, - limit: int = 10, - api_key: str = Security(verify_api_key) -): - collection = client.collections.get("Chunk") - result = collection.query.near_text( - query=query, - limit=limit - ) - return {"results": [obj.properties for obj in result.objects]} - -@app.post("/insert_pdf") -async def insert_pdf( - pdf_path: str, - api_key: str = Security(verify_api_key) -): - # Appeler le pipeline library_rag - from utils.pdf_pipeline import process_pdf - result = process_pdf(Path(pdf_path)) - return result -``` - -**Déploiement** : - -```bash -# Sur VPS/Synology -docker-compose up -d weaviate text2vec -uvicorn api_server:app --host 0.0.0.0 --port 8000 --ssl-keyfile key.pem --ssl-certfile cert.pem -``` - -**Avantages** : -- ✅ Contrôle total sur l'API -- ✅ Facile à sécuriser (HTTPS + API key) -- ✅ Peut wrapper tout le pipeline library_rag -- ✅ Monitoring et logging faciles - -**Inconvénients** : -- ⚠️ Code custom à maintenir -- ⚠️ Besoin d'un serveur web (nginx/uvicorn) - ---- - -### Option 2 : Accès direct Weaviate via VPN - -**Architecture** : App → VPN → Weaviate:8080 - -**Configuration** : - -```bash -# Sur Synology : activer VPN Server (OpenVPN/WireGuard) -# Sur client : se connecter au VPN -# Accès direct à http://192.168.x.x:8080 (IP privée Synology) -``` - -**Code client** : - -```python -# Dans votre app LLM -import weaviate - -# Via VPN, IP privée Synology -client = weaviate.connect_to_custom( - http_host="192.168.1.100", - http_port=8080, - http_secure=False, # En VPN, pas besoin HTTPS - grpc_host="192.168.1.100", - grpc_port=50051, -) - -# Utilisation directe -collection = client.collections.get("Chunk") -result = collection.query.near_text(query="justice") -``` - -**Avantages** : -- ✅ Très simple (pas de code custom) -- ✅ Sécurité via VPN -- ✅ Utilise Weaviate client Python directement - -**Inconvénients** : -- ⚠️ VPN doit être actif en permanence -- ⚠️ Latence VPN -- ⚠️ Pas de couche 
d'abstraction (app doit connaître Weaviate) - ---- - -### Option 3 : MCP Server HTTP sur VPS - -**Architecture** : App → MCP HTTP → Weaviate - -**Problème** : FastMCP SSE ne fonctionne pas bien en production (comme on l'a vu) - -**Solution** : Wrapper custom MCP over HTTP - -```python -# mcp_http_wrapper.py (sur VPS) -from fastapi import FastAPI -from mcp_tools import parse_pdf_handler, search_chunks_handler -from pydantic import BaseModel - -app = FastAPI() - -class SearchRequest(BaseModel): - query: str - limit: int = 10 - -@app.post("/mcp/search_chunks") -async def mcp_search(req: SearchRequest): - # Appeler directement le handler MCP - input_data = SearchChunksInput(query=req.query, limit=req.limit) - result = await search_chunks_handler(input_data) - return result.model_dump() -``` - -**Avantages** : -- ✅ Réutilise le code MCP existant -- ✅ HTTP standard - -**Inconvénients** : -- ⚠️ MCP stdio ne peut pas être utilisé -- ⚠️ Besoin d'un wrapper HTTP custom de toute façon -- ⚠️ Équivalent à l'option 1 en plus complexe - -**Verdict** : Option 1 (API REST pure) est meilleure - ---- - -### Option 4 : Tunnel SSH + Port forwarding - -**Architecture** : App → SSH tunnel → localhost:8080 (Weaviate distant) - -**Configuration** : - -```bash -# Sur votre machine locale -ssh -L 8080:localhost:8080 user@synology-ip - -# Weaviate distant est maintenant accessible sur localhost:8080 -``` - -**Code** : - -```python -# Dans votre app (pense que Weaviate est local) -client = weaviate.connect_to_local() # Va sur localhost:8080 = tunnel SSH -``` - -**Avantages** : -- ✅ Sécurité SSH -- ✅ Simple à configurer -- ✅ Pas de code custom - -**Inconvénients** : -- ⚠️ Tunnel doit rester ouvert -- ⚠️ Pas adapté pour une app cloud -- ⚠️ Latence SSH - ---- - -## 🎯 Recommandations selon votre cas - -### Cas 1 : Application locale (votre PC) → Weaviate Synology/VPS - -**Recommandation** : **VPN + Accès direct Weaviate** (Option 2) - -**Pourquoi** : -- Simple à configurer sur Synology (VPN Server 
intégré) -- Pas de code custom -- Sécurité via VPN -- Performance acceptable en réseau local/VPN - -**Setup** : - -1. Synology : Activer VPN Server (OpenVPN) -2. Client : Se connecter au VPN -3. Python : `weaviate.connect_to_custom(http_host="192.168.x.x", ...)` - ---- - -### Cas 2 : Application cloud (serveur distant) → Weaviate Synology/VPS - -**Recommandation** : **API REST Custom** (Option 1) - -**Pourquoi** : -- Pas de VPN nécessaire -- HTTPS public avec Let's Encrypt -- Contrôle d'accès par API key -- Rate limiting -- Monitoring - -**Setup** : - -1. VPS/Synology : Docker Compose (Weaviate + API REST) -2. Domaine : api.monrag.com → VPS IP -3. Let's Encrypt : HTTPS automatique -4. App cloud : Appelle `https://api.monrag.com/search?api_key=xxx` - ---- - -### Cas 3 : Développement local temporaire → Weaviate distant - -**Recommandation** : **Tunnel SSH** (Option 4) - -**Pourquoi** : -- Setup en 1 ligne -- Aucune config permanente -- Parfait pour le dev/debug - -**Setup** : - -```bash -ssh -L 8080:localhost:8080 user@vps -# Weaviate distant accessible sur localhost:8080 -``` - ---- - -## 🔧 Déploiement recommandé pour VPS - -### Stack complète - -```yaml -# docker-compose.yml (sur VPS) -version: '3.8' - -services: - # Weaviate + embeddings - weaviate: - image: cr.weaviate.io/semitechnologies/weaviate:1.34.4 - ports: - - "127.0.0.1:8080:8080" # Uniquement localhost (sécurité) - environment: - AUTHENTICATION_APIKEY_ENABLED: "true" - AUTHENTICATION_APIKEY_ALLOWED_KEYS: "my-secret-key" - # ... autres configs - volumes: - - weaviate_data:/var/lib/weaviate - - text2vec-transformers: - image: cr.weaviate.io/semitechnologies/transformers-inference:baai-bge-m3-onnx-latest - # ... 
config - - # API REST custom - api: - build: ./api - ports: - - "8000:8000" - environment: - WEAVIATE_URL: http://weaviate:8080 - API_KEY: ${API_KEY} - MISTRAL_API_KEY: ${MISTRAL_API_KEY} - depends_on: - - weaviate - restart: always - - # NGINX reverse proxy + HTTPS - nginx: - image: nginx:alpine - ports: - - "80:80" - - "443:443" - volumes: - - ./nginx.conf:/etc/nginx/nginx.conf - - /etc/letsencrypt:/etc/letsencrypt - depends_on: - - api - -volumes: - weaviate_data: -``` - -### NGINX config - -```nginx -# nginx.conf -server { - listen 443 ssl; - server_name api.monrag.com; - - ssl_certificate /etc/letsencrypt/live/api.monrag.com/fullchain.pem; - ssl_certificate_key /etc/letsencrypt/live/api.monrag.com/privkey.pem; - - location / { - proxy_pass http://api:8000; - proxy_set_header Host $host; - proxy_set_header X-Real-IP $remote_addr; - - # Rate limiting - limit_req zone=api_limit burst=10 nodelay; - } -} -``` - ---- - -## 📊 Comparaison finale - -| Critère | VPN + Direct | API REST | Tunnel SSH | MCP HTTP | -|---------|--------------|----------|------------|----------| -| **Fiabilité** | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | -| **Sécurité** | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | -| **Simplicité** | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ | -| **Performance** | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ | -| **Maintenance** | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ | -| **Production** | ✅ Oui | ✅ Oui | ❌ Non | ⚠️ Possible | - ---- - -## 💡 Ma recommandation finale - -### Pour Synology (usage personnel/équipe) -**VPN + Accès direct Weaviate** (Option 2) -- Synology a un excellent VPN Server intégré -- Sécurité maximale -- Simple à maintenir - -### Pour VPS (usage production/public) -**API REST Custom** (Option 1) -- Contrôle total -- HTTPS public -- Scalable -- Monitoring complet - ---- - -## 🚀 Prochaine étape recommandée - -Voulez-vous que je crée : - -1. **Le code de l'API REST** (Flask/FastAPI) avec auth + rate limiting ? -2. **Le docker-compose VPS complet** avec nginx + Let's Encrypt ? -3. 
**Le guide d'installation Synology VPN** + config client ? - -Dites-moi votre cas d'usage exact et je vous prépare la solution complète ! 🎯 diff --git a/navette.txt b/navette.txt deleted file mode 100644 index af78880..0000000 --- a/navette.txt +++ /dev/null @@ -1,2510 +0,0 @@ -================================================================================ -NAVETTE - COMMUNICATION CLAUDE <-> DAVID -================================================================================ -Date: 19 decembre 2025 -Derniere mise a jour: NOUVEAU SPEC CREE - -================================================================================ -NOUVEAU SPEC CREE ! -================================================================================ - -J'ai reecrit COMPLETEMENT le spec selon ta demande. - -NOUVEAU FICHIER: prompts/app_spec_ikario_rag_UI.txt - -================================================================================ -DIFFERENCES AVEC L'ANCIEN SPEC -================================================================================ - -ANCIEN SPEC (app_spec_ikario_rag_improvements.txt): -❌ Modifiait le code Python ikario_rag (mcp_ikario_memory.py, server.py) -❌ Developpait dans SynologyDrive -❌ Ajoutait des outils MCP au serveur Python -❌ A cause le probleme (agent a modifie ton code) - -NOUVEAU SPEC (app_spec_ikario_rag_UI.txt): -✓ Developpe UNIQUEMENT dans generations/ikario_body/ -✓ UTILISE les 7 outils MCP existants (via client) -✓ NE TOUCHE PAS au code ikario_rag -✓ Ajoute interface utilisateur pour exploiter la memoire - -================================================================================ -15 NOUVELLES FEATURES (FRONTEND + BACKEND) -================================================================================ - -BACKEND (server/): -1. Routes API Memory (POST /api/memory/thoughts, GET, etc.) -2. Memory Service Layer (wrapper MCP client) -3. Error Handling & Logging (robuste) -4. Memory Stats Endpoint (statistiques) - -FRONTEND (src/): -5. 
useMemory Hook (React hook centralise) -6. Memory Panel Component (sidebar memoire) -7. Add Thought Modal (ajouter pensees) -8. Memory Settings Panel (preferences) -9. Save to Memory Button (depuis chat) -10. Memory Context Panel (contexte pendant chat) -11. Memory Search Interface (recherche avancee) -12. Concepts Graph Visualization (graphe interactif) - -DOCUMENTATION & TESTS: -13. Memory API Guide (doc complete) -14. Integration Tests (tests backend) -15. Memory Tour (onboarding users) - -================================================================================ -OUTILS MCP EXISTANTS UTILISES -================================================================================ - -Le serveur ikario_rag expose deja 7 outils MCP: -1. add_thought - Ajouter une pensee -2. add_conversation - Ajouter une conversation -3. search_thoughts - Rechercher pensees -4. search_conversations - Rechercher conversations -5. search_memories - Recherche globale -6. trace_concept_evolution - Tracer evolution concept -7. check_consistency - Check coherence - -On utilise ces outils VIA le client MCP deja present dans ikario_body: -- server/services/mcpClient.js - -================================================================================ -ARCHITECTURE -================================================================================ - -User Interface (React) - ↓ -Backend API (Express routes) - ↓ -Memory Service (wrapper) - ↓ -MCP Client (mcpClient.js) - ↓ -MCP Protocol (stdio) - ↓ -Ikario RAG Server (Python, SynologyDrive) - ↓ -ChromaDB (embeddings) - -PAS DE MODIFICATION dans ikario_rag (SynologyDrive) ! - -================================================================================ -PROCHAINES ACTIONS -================================================================================ - -1. SUPPRIMER L'ANCIEN SPEC? 
- - Fichier: prompts/app_spec_ikario_rag_improvements.txt - - Options: - a) SUPPRIMER (recommande, cause confusion) - b) RENOMMER en .OLD (backup) - c) GARDER (mais risque relancer par erreur) - -2. SUPPRIMER LES 15 ISSUES LINEAR EXISTANTES? - - Issues TEAMPHI-305 a 319 (anciennes features) - - Ces issues parlent de modifier ikario_rag (on ne veut plus) - - Options: - a) SUPPRIMER toutes (clean slate) - b) GARDER comme doc (mais marquer Canceled) - -3. CREER 15 NOUVELLES ISSUES? - - Pour les 15 features du nouveau spec (UI) - - Issues qui developpent dans ikario_body - - Options: - a) OUI, creer maintenant avec initializer bis - b) OUI, mais manuellement dans Linear - c) NON, juste developper sans Linear - -================================================================================ -MES RECOMMANDATIONS -================================================================================ - -1. ANCIEN SPEC: SUPPRIMER - - Fichier app_spec_ikario_rag_improvements.txt - - Eviter confusion future - - Le nouveau spec est complet - -2. ANCIENNES ISSUES (305-319): SUPPRIMER TOUTES - - Elles parlent de modifier ikario_rag - - On ne veut plus faire ca - - Clean slate - -3. NOUVELLES ISSUES: CREER MAINTENANT - - 15 nouvelles issues pour features UI - - Lancer initializer bis avec nouveau spec - - Developper uniquement dans ikario_body - - Avec restrictions sandbox pour SynologyDrive - -================================================================================ -COMMANDES POUR NETTOYER -================================================================================ - -Si tu es d'accord avec mes recommandations: - -1. Supprimer ancien spec: - rm C:/GitHub/Linear_coding/prompts/app_spec_ikario_rag_improvements.txt - -2. Supprimer 15 anciennes issues: - (je peux le faire via Linear API) - -3. Creer 15 nouvelles issues: - python autonomous_agent_demo.py --project-dir ikario_body --new-spec app_spec_ikario_rag_UI.txt - -4. 
Ajouter restrictions sandbox (avant de lancer agent): - (je dois modifier autonomous_agent_demo.py pour bloquer SynologyDrive) - -================================================================================ -QUESTIONS POUR TOI -================================================================================ - -Reponds avec 3 choix: - -1. Ancien spec (app_spec_ikario_rag_improvements.txt): - a) SUPPRIMER - b) RENOMMER .OLD - c) GARDER - -2. Anciennes issues Linear (TEAMPHI-305 a 319): - a) SUPPRIMER toutes - b) GARDER comme doc (Canceled) - c) GARDER telles quelles - -3. Nouvelles issues pour nouveau spec: - a) CREER maintenant (agent initializer bis) - b) CREER manuellement dans Linear - c) PAS D'ISSUES (developper sans Linear) - -Exemple de reponse: -1. a -2. a -3. a - -================================================================================ -VERIFICATION NOUVEAU SPEC -================================================================================ - -Le nouveau spec est dans: prompts/app_spec_ikario_rag_UI.txt - -Tu peux le lire pour verifier que c'est bien ce que tu veux. - -Points importants: -- 15 features frontend/backend -- ZERO modification ikario_rag -- Developpe dans ikario_body uniquement -- Utilise outils MCP existants -- 5 phases implementation (7-10 jours total) - -Si tu veux des modifications au spec, dis-le maintenant AVANT de creer les issues. - -================================================================================ -SYNTHESE DES BESOINS FONCTIONNELS -================================================================================ -Date: 19 decembre 2025 -Derniere mise a jour: SYNTHESE AJOUTEE - -Tu as demande de clarifier les deux fonctionnalites principales. 
- -Voici ma comprehension et ma synthese: - -================================================================================ -BESOIN 1: PENSEES (THOUGHTS) -================================================================================ - -COMPORTEMENT SOUHAITE: -- Le LLM peut ECRIRE des pensees quand il le souhaite -- Le LLM peut LIRE des pensees existantes -- Le LLM peut RECHERCHER des pensees pertinentes - -OUTILS MCP UTILISES (deja exposes par ikario_rag): -1. add_thought - Pour ECRIRE une nouvelle pensee -2. search_thoughts - Pour RECHERCHER des pensees - -COMMENT CA MARCHE: -- Pendant une conversation, le LLM decide de sauvegarder une reflexion -- Exemple: "Je viens de comprendre que l'utilisateur prefere React a Vue" -- Le LLM appelle add_thought via le MCP client -- La pensee est stockee dans ChromaDB avec embeddings semantiques -- Plus tard, le LLM peut rechercher: "preferences frontend de l'utilisateur" -- search_thoughts retourne les pensees pertinentes - -MODE D'INVOCATION: -- MANUEL (LLM decide): Le LLM utilise l'outil quand il juge necessaire -- MANUEL (User decide): Bouton "Save to Memory" dans l'UI chat -- SEMI-AUTO: Suggestion automatique apres conversations importantes - -================================================================================ -BESOIN 2: CONVERSATIONS (AUTO-SAVE) -================================================================================ - -COMPORTEMENT SOUHAITE: -- Apres CHAQUE reponse du LLM, la conversation est sauvegardee -- Sauvegarde AUTOMATIQUE (pas besoin d'action manuelle) -- Meme conversation = tous les messages sont lies (conversation_id) - -OUTILS MCP UTILISES (deja exposes par ikario_rag): -1. add_conversation - Pour SAUVEGARDER la conversation - -COMMENT CA MARCHE: -- User: "Comment faire un fetch API en React?" 
-- LLM: [Reponse detaillee sur fetch API] -- AUTOMATIQUEMENT apres la reponse du LLM: - * Backend detecte fin de reponse LLM - * Backend appelle add_conversation avec: - - user_message: "Comment faire un fetch API en React?" - - assistant_message: [la reponse du LLM] - - conversation_id: ID unique pour cette session chat - * ChromaDB stocke avec embeddings semantiques -- Prochaine fois, recherche "React fetch API" retournera cette conversation - -ARCHITECTURE TECHNIQUE: -- Hook backend: onMessageComplete() -- Declenche: Apres chaque reponse LLM streamed completement -- Appelle: mcpClient.callTool('add_conversation', {...}) -- Parametres: - { - user_message: string, - assistant_message: string, - conversation_id: string (UUID session), - timestamp: ISO date, - metadata: { - model: "claude-sonnet-4.5", - tokens: number, - ... - } - } - -================================================================================ -MAPPING COMPLET DES 7 OUTILS MCP -================================================================================ - -POUR PENSEES (THOUGHTS): -1. add_thought -------> Ecrire une nouvelle pensee -2. search_thoughts ---> Rechercher des pensees -3. trace_concept_evolution -> Tracer evolution d'un concept dans les pensees -4. check_consistency -> Verifier coherence entre pensees - -POUR CONVERSATIONS: -1. add_conversation -----> Sauvegarder une conversation (AUTO) -2. search_conversations -> Rechercher dans l'historique -3. search_memories -------> Recherche globale (thoughts + conversations) - -AVANCEES (optionnel): -1. trace_concept_evolution -> Voir comment un concept evolue dans le temps -2. check_consistency --------> Detecter contradictions - -================================================================================ -ARCHITECTURE D'IMPLEMENTATION -================================================================================ - -BACKEND (Express API): ------------------- -1. 
POST /api/chat/message - - Recoit message user - - Envoie a Claude API - - Stream la reponse - - APRES streaming complete: - * Appelle add_conversation automatiquement - * Retourne success au frontend - -2. POST /api/memory/thoughts (manuel) - - User clique "Save to Memory" - - Backend appelle add_thought - - Retourne confirmation - -3. GET /api/memory/search?q=... - - User cherche dans sidebar - - Backend appelle search_memories - - Retourne resultats (thoughts + conversations) - -FRONTEND (React): --------------- -1. Chat Interface: - - Bouton "Save to Memory" sur chaque message - - Auto-save indicator (petit icon quand conversation sauvegardee) - -2. Memory Sidebar: - - Barre de recherche - - Liste de resultats (thoughts + conversations) - - Filtre: "Thoughts only" / "Conversations only" / "All" - -3. Memory Context Panel: - - Pendant qu'on tape, affiche pensees/conversations pertinentes - - Auto-recherche basee sur le contexte du message - -================================================================================ -EXEMPLE CONCRET D'UTILISATION -================================================================================ - -SCENARIO 1: CONVERSATION AUTO-SAUVEGARDEE ------------------------------------------ -User: "Comment implementer un dark mode en React?" 
-LLM: [Reponse detaillee avec code examples] -BACKEND (auto): Appelle add_conversation avec les deux messages -ChromaDB: Stocke avec embeddings - -2 semaines plus tard: -User: "dark mode" -Search: Retourne la conversation precedente -LLM: Peut relire et continuer la discussion - -SCENARIO 2: PENSEE MANUELLE ---------------------------- -User: "Je prefere utiliser TailwindCSS plutot que styled-components" -LLM: "D'accord, je note votre preference" -LLM (interne): Appelle add_thought("User prefers TailwindCSS over styled-components") -ChromaDB: Stocke la preference - -Plus tard: -User: "Aide-moi a styler ce composant" -LLM (interne): Recherche "styling preferences" -Result: Trouve la pensee sur TailwindCSS -LLM: "Je vais utiliser TailwindCSS pour le styling, comme vous preferez" - -SCENARIO 3: BOUTON SAVE TO MEMORY ---------------------------------- -User: "Voici nos conventions de nommage: components en PascalCase, utils en camelCase" -LLM: [Repond avec confirmation] -User: [Clique "Save to Memory"] -Frontend: POST /api/memory/thoughts -Backend: Appelle add_thought avec le message user -ChromaDB: Stocke les conventions - -Plus tard: -LLM cree un nouveau composant et respecte automatiquement les conventions -(car il peut rechercher "naming conventions" avant de generer du code) - -================================================================================ -DIFFERENCES CLES ENTRE THOUGHTS ET CONVERSATIONS -================================================================================ - -THOUGHTS: -- Contenu: Reflexions, preferences, conventions, apprentissages -- Taille: Generalement courts (1-3 phrases) -- Declenchement: Manuel (LLM decide ou User clique bouton) -- Granularite: Atomique (1 pensee = 1 concept) -- Exemple: "User prefers functional components over class components" - -CONVERSATIONS: -- Contenu: Echanges complets user-assistant -- Taille: Variable (peut etre long) -- Declenchement: AUTOMATIQUE apres chaque reponse LLM -- Granularite: Dialogue (1 
conversation = 1 echange Q&A) -- Exemple: Tout l'echange sur "Comment faire un fetch API en React?" - -LES DEUX ENSEMBLE: -- Complementaires: Thoughts = knowledge, Conversations = context -- Recherchables: search_memories cherche dans les deux -- Evolution: trace_concept_evolution fonctionne sur les deux - -================================================================================ -QUESTIONS DE CLARIFICATION -================================================================================ - -Avant de continuer, j'ai besoin de confirmer quelques details: - -1. AUTO-SAVE CONVERSATIONS: - - Faut-il sauvegarder TOUTES les conversations? - - Ou seulement certaines (ex: > 100 tokens, contient du code, etc.)? - - Mon avis: TOUTES, mais avec option user "Disable auto-save" dans settings - -2. CONVERSATION_ID: - - Un conversation_id = une session chat complete (plusieurs messages)? - - Ou un conversation_id = un echange unique (1 user msg + 1 assistant msg)? - - Mon avis: Session complete (comme tu as dit "meme conversation") - -3. DECLENCHEMENT AUTO-SAVE: - - Immediate (apres chaque reponse)? - - Ou batched (toutes les 5 minutes)? - - Mon avis: Immediate mais asynchrone (ne bloque pas le chat) - -4. PRIVACY: - - Les conversations auto-sauvegardees sont "private" par defaut? - - Ou "shared" (visible par d'autres users)? 
- - Mon avis: Private par defaut dans un contexte single-user - -================================================================================ -RECOMMANDATION FINALE -================================================================================ - -Je recommande cette implementation: - -PHASE 1 (Core): -- Auto-save conversations (add_conversation apres chaque reponse) -- Bouton manuel "Save to Memory" (add_thought) -- Search interface basique (search_memories) - -PHASE 2 (Enhanced): -- Memory sidebar avec resultats enrichis -- Filtres thoughts vs conversations -- Memory context panel (suggestions pendant typing) - -PHASE 3 (Advanced): -- Concepts graph visualization (trace_concept_evolution) -- Consistency checker (check_consistency) -- Memory settings (disable auto-save, privacy, etc.) - -TOTAL: 15 features comme dans le spec app_spec_ikario_rag_UI.txt - -================================================================================ -MODIFICATIONS VALIDEES -================================================================================ -Date: 19 decembre 2025 - 23h30 - -MODIFICATION 1: PENSEES = LLM SEULEMENT ---------------------------------------- -SUPPRIME: -- Bouton "Save to Memory" pour l'utilisateur -- Suggestions automatiques - -CONSERVE: -- Seulement le LLM decide quand ecrire/lire ses pensees -- Les pensees sont un outil INTERNE du LLM - -================================================================================ -ANALYSE CODE: add_conversation -================================================================================ - -J'ai lu mcp_ikario_memory.py (ligne 100-189). 
-
-ANSWER: NO, add_conversation CANNOT do incremental updates.
-
-PROBLEM IDENTIFIED:
-------------------
-Lines 160-164:
-```python
-self.conversations.add(
-    documents=[full_conversation_text],
-    metadatas=[main_metadata],
-    ids=[conversation_id]  # <-- PROBLEM HERE
-)
-```
-
-ChromaDB add() with an existing ID (behaviour is version-dependent):
-- Either the duplicate is rejected (an "ID already exists" warning/error) and the old record is kept
-- Or, with upsert semantics, the old document is completely replaced
-
-EITHER WAY:
-- Calling add_conversation twice with the same conversation_id means a full REPLACEMENT (or a silent no-op), never an incremental update
-- There is no "append" mechanism for adding messages to an existing conversation
-
-CURRENT BEHAVIOUR (assuming replacement):
-----------------------------------------
-First call:
-add_conversation(conversation_id="session_123", messages=[msg1, msg2])
--> Creates a conversation with 2 messages
-
-Second call:
-add_conversation(conversation_id="session_123", messages=[msg1, msg2, msg3, msg4])
--> OVERWRITES the previous conversation
--> Replaces it completely with 4 messages
-
-CONSEQUENCE FOR YOUR USE CASE:
------------------------------
-You want to save after EACH LLM reply, within the SAME conversation.
-
-Example:
-User: "Bonjour"
-LLM: "Salut!"
--> Save conversation_id="conv_20251219" with 2 messages
-
-User: "Comment vas-tu?"
-LLM: "Bien merci!"
--> Should add 2 new messages to "conv_20251219"
--> BUT add_conversation will OVERWRITE the first 2 messages!
-
-================================================================================
-SOLUTION: YOU NEED TO ADD A NEW TOOL
-================================================================================
-
-OPTION A (recommended): append_to_conversation
-----------------------------------------------
-A new tool that adds messages without overwriting:
-
-```python
-async def append_to_conversation(
-    self,
-    conversation_id: str,
-    new_messages: List[Dict[str, str]]
-) -> str:
-    """
-    Appends new messages to an existing conversation.
-    """
-    # 1. Fetch the existing conversation
-    existing = self.conversations.get(ids=[conversation_id])
-
-    # 2. Extract the old messages (or store them elsewhere)
-
-    # 3. Merge old_messages + new_messages
-
-    # 4. Re-create the main document with all messages
-
-    # 5. Add the new individual messages
-```
-
-OPTION B: update_conversation (full replacement)
------------------------------------------------
-Similar to add_conversation, but with upsert semantics:
-
-```python
-async def update_conversation(
-    self,
-    conversation_id: str,
-    all_messages: List[Dict[str, str]],
-    ...
-) -> str:
-    """
-    Completely replaces an existing conversation.
-    """
-    # Delete the old documents
-    self.conversations.delete(ids=[conversation_id])
-
-    # Add the new version
-    # (same code as add_conversation)
-```
-
-OPTION C: Modify add_conversation
----------------------------------
-Add detection logic:
-
-```python
-async def add_conversation(...):
-    # Check whether conversation_id already exists
-    try:
-        existing = self.conversations.get(ids=[conversation_id])
-        if existing:
-            # Append instead of creating
-    except:
-        # Create a new conversation
-```
-
-================================================================================
-MY RECOMMENDATION
-================================================================================
-
-USE OPTION A: append_to_conversation
-
-WHY:
-- Clear semantics: "append" = add without overwriting
-- Separation of responsibilities: add = creation, append = addition
-- Easier to debug
-- No "magic" (Option C would be too implicit)
-
-BACKEND ARCHITECTURE (ikario_body):
------------------------------------
-POST /api/chat/message
--> User sends a message
--> LLM replies
--> Once the reply is complete:
-   - If it is the first message of the session:
-     * Call add_conversation(conversation_id, [user_msg, assistant_msg])
-   - If the conversation already exists:
-     * Call append_to_conversation(conversation_id, [user_msg, assistant_msg])
-
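The routing above (first exchange of a session goes through add_conversation, every later exchange through append_to_conversation) can be sketched in a few lines of Python. This is a minimal simulation, not the real stack: `FakeMemoryClient`, `save_exchange`, and the `known_sessions` set are hypothetical names, and `append_to_conversation` does not exist yet in ikario_rag.

```python
class FakeMemoryClient:
    """In-memory stand-in for the real MCP client, keyed by conversation_id."""

    def __init__(self):
        self.store = {}

    def call_tool(self, name, args):
        cid = args["conversation_id"]
        if name == "add_conversation":
            # Creation: start a fresh message list for this conversation.
            self.store[cid] = list(args["messages"])
        elif name == "append_to_conversation":
            # Addition: extend the existing list instead of overwriting it.
            self.store[cid].extend(args["messages"])
        else:
            raise ValueError(f"unknown tool: {name}")


def save_exchange(client, known_sessions, conversation_id, user_msg, assistant_msg):
    """Called after each completed LLM reply: create on the first exchange, append afterwards."""
    messages = [
        {"role": "user", "content": user_msg},
        {"role": "assistant", "content": assistant_msg},
    ]
    if conversation_id not in known_sessions:
        client.call_tool("add_conversation",
                         {"conversation_id": conversation_id, "messages": messages})
        known_sessions.add(conversation_id)
    else:
        client.call_tool("append_to_conversation",
                         {"conversation_id": conversation_id, "messages": messages})


client = FakeMemoryClient()
sessions = set()
save_exchange(client, sessions, "conv_20251219", "Bonjour", "Salut!")
save_exchange(client, sessions, "conv_20251219", "Comment vas-tu?", "Bien merci!")
print(len(client.store["conv_20251219"]))  # → 4 (all four messages survive)
```

With plain add_conversation on both calls, the second exchange would wipe the first; the explicit add/append split is what makes per-reply saving safe.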

SIMPLE ALTERNATIVE (without append):
------------------------------------
If you do not want to modify ikario_rag:
- The backend keeps ALL of the session's messages in memory
- It calls add_conversation ONLY at the end of the session (when the user closes the chat)
- Parameters: conversation_id + ALL the accumulated messages

BUT:
- Risk of data loss if there is a crash before the end
- No real-time search during the conversation
- Less robust

================================================================================
DECISION REQUIRED
================================================================================

You have to choose:

1. ADD append_to_conversation to ikario_rag
   - I modify mcp_ikario_memory.py (in SynologyDrive)
   - I add the new tool to the MCP server
   - Then I update the UI spec

2. USE THE SIMPLE ALTERNATIVE (save at the end of the session)
   - No ikario_rag modification
   - The backend accumulates messages in memory
   - Full save at the end

3. MODIFY add_conversation (Option C)
   - Add auto-detect + append logic
   - Less explicit but simpler on the client side

Which option do you prefer?

================================================================================
CRITICAL QUESTION: OVERWRITING AND EMBEDDINGS
================================================================================
Date: December 19, 2025 - 11:35 PM

You ask: "Does overwriting also delete the old embeddings?"

SHORT ANSWER: NO, and it is even WORSE than I thought!

DETAILED ANALYSIS:
------------------

Recap of the add_conversation architecture:

1. MAIN DOCUMENT (lines 160-164):
   ID = conversation_id (e.g. "conv_20251219_1430")
   Content = the complete conversation (all messages concatenated)

2. INDIVIDUAL MESSAGES (lines 166-187):
   IDs = conversation_id + "_msg_001", "_msg_002", etc.
   Content = each message with its own embedding

PROBLEM SCENARIO:
-----------------

First call:
add_conversation(conversation_id="conv_123", messages=[msg1, msg2])

ChromaDB contains:
- conv_123 (main document, embedding of "msg1 + msg2")
- conv_123_msg_001 (msg1, individual embedding)
- conv_123_msg_002 (msg2, individual embedding)

Second call:
add_conversation(conversation_id="conv_123", messages=[msg1, msg2, msg3, msg4])

WHAT HAPPENS?

1. Main document conv_123:
   - OVERWRITTEN (new embedding for "msg1 + msg2 + msg3 + msg4")
   - The old embedding is lost

2. Individual messages:
   - conv_123_msg_001 already exists -> OVERWRITTEN (new embedding for msg1)
   - conv_123_msg_002 already exists -> OVERWRITTEN (new embedding for msg2)
   - conv_123_msg_003 is new -> CREATED
   - conv_123_msg_004 is new -> CREATED

RESULT:
-------
- Old embeddings are OVERWRITTEN (not deleted, but replaced)
- NO pollution as long as the messages are identical
- BUT if the messages change = incorrect embeddings

WORST-CASE SCENARIO:
--------------------
If the backend accumulates messages incorrectly:

First call: [msg1, msg2]
Second call: [msg3, msg4]  <-- FORGETS msg1 and msg2!

ChromaDB then contains:
- conv_123 (embedding of "msg3 + msg4")  <-- WRONG!
- conv_123_msg_001 (embedding of msg3)   <-- WRONG ID!
- conv_123_msg_002 (embedding of msg4)   <-- WRONG ID!

The old msg_001 and msg_002 (msg1 and msg2) are LOST.

CONCLUSION:
-----------
Overwriting:
- REPLACES embeddings (no clean deletion)
- REQUIRES the backend to send ALL the messages every time
- RISKS data loss if the backend gets it wrong

That is why append_to_conversation is NECESSARY!
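
The worst-case scenario above can be reproduced with a tiny in-memory stand-in for the collection. This mock only assumes overwrite-on-duplicate-ID semantics, as discussed above; FakeCollection and this add_conversation are illustrative, not the real ikario_rag code:

```python
# Minimal in-memory stand-in for a ChromaDB collection, assuming that adding
# an existing ID silently overwrites the stored document.
class FakeCollection:
    def __init__(self):
        self.docs = {}

    def add(self, documents, ids):
        for doc, doc_id in zip(documents, ids):
            self.docs[doc_id] = doc  # an existing ID is silently overwritten

def add_conversation(col, conversation_id, messages):
    # Main document plus one document per message, IDs numbered from 1 on EVERY call.
    col.add(documents=[" ".join(messages)], ids=[conversation_id])
    for i, msg in enumerate(messages):
        col.add(documents=[msg], ids=[f"{conversation_id}_msg_{str(i + 1).zfill(3)}"])

col = FakeCollection()
add_conversation(col, "conv_123", ["msg1", "msg2"])
# Buggy backend sends only the NEW messages on the second call:
add_conversation(col, "conv_123", ["msg3", "msg4"])

print(col.docs["conv_123"])          # "msg3 msg4"  -> msg1/msg2 gone from the main doc
print(col.docs["conv_123_msg_001"])  # "msg3"       -> wrong ID, msg1 is lost
```

There is no "conv_123_msg_003" at all: the per-call numbering collides with the existing IDs instead of extending them.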
- -================================================================================ -POURQUOI append_to_conversation EST INDISPENSABLE -================================================================================ - -Avec append_to_conversation: - -Premier appel: -add_conversation(conversation_id="conv_123", messages=[msg1, msg2]) - -ChromaDB: -- conv_123 (2 messages) -- conv_123_msg_001, conv_123_msg_002 - -Deuxieme appel: -append_to_conversation(conversation_id="conv_123", new_messages=[msg3, msg4]) - -Logic interne: -1. GET existing conversation "conv_123" -2. Extract metadata: message_count = 2 -3. Calculate next sequence = 3 -4. Update document principal: - - DELETE conv_123 - - ADD conv_123 (nouveau embedding "msg1 + msg2 + msg3 + msg4") -5. Add new individual messages: - - conv_123_msg_003 (msg3) - - conv_123_msg_004 (msg4) - -RESULTAT: -- Anciens embeddings individuels CONSERVES (msg_001, msg_002) -- Nouveau embedding principal CORRECT (4 messages) -- Pas de perte de donnees -- Sequence correcte - -================================================================================ -IMPLEMENTATION append_to_conversation (SKETCH) -================================================================================ - -```python -async def append_to_conversation( - self, - conversation_id: str, - new_messages: List[Dict[str, str]], - update_context: Optional[Dict[str, Any]] = None -) -> str: - """ - Ajoute de nouveaux messages a une conversation existante - - Args: - conversation_id: ID de la conversation existante - new_messages: Nouveaux messages a ajouter - update_context: Metadonnees a mettre a jour (optionnel) - - Returns: - Message de confirmation - """ - # 1. 
VERIFIER QUE LA CONVERSATION EXISTE - try: - existing = self.conversations.get(ids=[conversation_id]) - except Exception as e: - raise ValueError(f"Conversation {conversation_id} not found") - - if not existing['documents'] or len(existing['documents']) == 0: - raise ValueError(f"Conversation {conversation_id} not found") - - # 2. EXTRAIRE LES METADONNEES EXISTANTES - existing_metadata = existing['metadatas'][0] if existing['metadatas'] else {} - current_message_count = int(existing_metadata.get('message_count', 0)) - - # 3. CALCULER LA NOUVELLE SEQUENCE - next_sequence = current_message_count + 1 - - # 4. CONSTRUIRE LE NOUVEAU TEXTE COMPLET - # Recuperer l'ancien texte - old_full_text = existing['documents'][0] - - # Ajouter les nouveaux messages - new_text_parts = [] - for msg in new_messages: - author = msg.get('author', 'unknown') - content = msg.get('content', '') - new_text_parts.append(f"{author}: {content}") - - new_text = "\n".join(new_text_parts) - updated_full_text = old_full_text + "\n" + new_text - - # 5. METTRE A JOUR LES METADONNEES - updated_metadata = existing_metadata.copy() - updated_metadata['message_count'] = str(current_message_count + len(new_messages)) - - # Merger update_context si fourni - if update_context: - for key, value in update_context.items(): - if isinstance(value, list): - updated_metadata[key] = ", ".join(str(v) for v in value) - elif isinstance(value, dict): - updated_metadata[key] = json.dumps(value) - else: - updated_metadata[key] = str(value) - - # 6. SUPPRIMER L'ANCIEN DOCUMENT PRINCIPAL - self.conversations.delete(ids=[conversation_id]) - - # 7. AJOUTER LE NOUVEAU DOCUMENT PRINCIPAL - self.conversations.add( - documents=[updated_full_text], - metadatas=[updated_metadata], - ids=[conversation_id] - ) - - # 8. 
AJOUTER LES NOUVEAUX MESSAGES INDIVIDUELS - for i, msg in enumerate(new_messages): - msg_id = f"{conversation_id}_msg_{str(next_sequence + i).zfill(3)}" - msg_content = msg.get('content', '') - msg_author = msg.get('author', 'unknown') - msg_timestamp = msg.get('timestamp', '') - - msg_metadata = { - "conversation_id": conversation_id, - "message_type": "individual_message", - "author": msg_author, - "timestamp": msg_timestamp, - "sequence": str(next_sequence + i) - } - - self.conversations.add( - documents=[msg_content], - metadatas=[msg_metadata], - ids=[msg_id] - ) - - return f"Conversation {conversation_id} updated: added {len(new_messages)} messages (total: {updated_metadata['message_count']})" -``` - -AVANTAGES: -- Conserve les anciens embeddings individuels -- Met a jour correctement le document principal -- Gere la sequence automatiquement -- Pas de risque de perte de donnees - -================================================================================ -DECISION FINALE REQUISE -================================================================================ - -Maintenant que tu comprends les risques de l'ecrasement: - -OPTION 1: J'ajoute append_to_conversation dans ikario_rag - - Implementation robuste (comme ci-dessus) - - Pas de risque de perte d'embeddings - - Architecture propre - -OPTION 2: Alternative simple (accumulation backend) - - Backend garde tous les messages en memoire - - Appelle add_conversation une seule fois a la fin - - RISQUE: perte si crash - - RISQUE: pas de recherche en temps reel - -OPTION 3: Modifier add_conversation pour auto-append - - Ajouter la logique ci-dessus dans add_conversation - - Plus transparent pour le client - - Mais semantique moins claire - -Je recommande FORTEMENT Option 1. - -Quelle option choisis-tu? 
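
The sequence handling in steps 2, 3, and 8 of the sketch above can be isolated into a small helper for clarity; next_message_ids is an illustrative extraction, not part of the existing server:

```python
# Given the stored message_count of a conversation, compute the IDs of the
# next n_new individual-message documents (zero-padded, numbered from 1).
def next_message_ids(conversation_id: str, message_count: int, n_new: int) -> list[str]:
    next_sequence = message_count + 1
    return [
        f"{conversation_id}_msg_{str(next_sequence + i).zfill(3)}"
        for i in range(n_new)
    ]

print(next_message_ids("conv_123", 2, 2))
# ['conv_123_msg_003', 'conv_123_msg_004']
```

Because the IDs continue from message_count instead of restarting at 1, appending never collides with the existing msg_001 and msg_002 documents.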
- -================================================================================ -OPTION 1 CHOISIE: append_to_conversation -================================================================================ -Date: 19 decembre 2025 - 23h40 - -Tu choisis Option 1 avec une question cruciale: -"Si la conversation n'existe pas?" - -EXCELLENTE QUESTION! Il y a 2 approches: - -================================================================================ -APPROCHE A: append_to_conversation AVEC AUTO-CREATE (recommandé) -================================================================================ - -append_to_conversation détecte si la conversation existe: -- Si existe: fait un append -- Si n'existe pas: crée la conversation (comme add_conversation) - -AVANTAGES: -- Backend simplifié (1 seul appel, toujours le même) -- Pas besoin de tracker si c'est le premier message -- Robuste - -CODE: -```python -async def append_to_conversation( - self, - conversation_id: str, - new_messages: List[Dict[str, str]], - participants: Optional[List[str]] = None, - context: Optional[Dict[str, Any]] = None -) -> str: - """ - Ajoute des messages à une conversation (ou la crée si n'existe pas) - - Args: - conversation_id: ID de la conversation - new_messages: Messages à ajouter - participants: Liste participants (requis si création) - context: Métadonnées (requis si création) - """ - # 1. VÉRIFIER SI LA CONVERSATION EXISTE - try: - existing = self.conversations.get(ids=[conversation_id]) - conversation_exists = ( - existing and - existing['documents'] and - len(existing['documents']) > 0 - ) - except: - conversation_exists = False - - # 2. SI N'EXISTE PAS: CRÉER - if not conversation_exists: - if not participants or not context: - raise ValueError( - "participants and context required when creating new conversation" - ) - return await self.add_conversation( - participants=participants, - messages=new_messages, - context=context, - conversation_id=conversation_id - ) - - # 3. 
SI EXISTE: APPEND - # [Code d'append comme avant...] - existing_metadata = existing['metadatas'][0] - current_message_count = int(existing_metadata.get('message_count', 0)) - next_sequence = current_message_count + 1 - - old_full_text = existing['documents'][0] - - new_text_parts = [] - for msg in new_messages: - author = msg.get('author', 'unknown') - content = msg.get('content', '') - new_text_parts.append(f"{author}: {content}") - - new_text = "\n".join(new_text_parts) - updated_full_text = old_full_text + "\n" + new_text - - updated_metadata = existing_metadata.copy() - updated_metadata['message_count'] = str(current_message_count + len(new_messages)) - - if context: - for key, value in context.items(): - if isinstance(value, list): - updated_metadata[key] = ", ".join(str(v) for v in value) - elif isinstance(value, dict): - updated_metadata[key] = json.dumps(value) - else: - updated_metadata[key] = str(value) - - self.conversations.delete(ids=[conversation_id]) - - self.conversations.add( - documents=[updated_full_text], - metadatas=[updated_metadata], - ids=[conversation_id] - ) - - for i, msg in enumerate(new_messages): - msg_id = f"{conversation_id}_msg_{str(next_sequence + i).zfill(3)}" - msg_content = msg.get('content', '') - msg_author = msg.get('author', 'unknown') - msg_timestamp = msg.get('timestamp', '') - - msg_metadata = { - "conversation_id": conversation_id, - "message_type": "individual_message", - "author": msg_author, - "timestamp": msg_timestamp, - "sequence": str(next_sequence + i) - } - - self.conversations.add( - documents=[msg_content], - metadatas=[msg_metadata], - ids=[msg_id] - ) - - return f"Conversation {conversation_id} updated: added {len(new_messages)} messages (total: {updated_metadata['message_count']})" -``` - -UTILISATION BACKEND (ikario_body): -```javascript -// POST /api/chat/message -app.post('/api/chat/message', async (req, res) => { - const { message, conversationId } = req.body; - - // Generate conversation_id if first 
message - const convId = conversationId || `conv_${Date.now()}`; - - // Get LLM response - const llmResponse = await callClaudeAPI(message); - - // ALWAYS use append_to_conversation (handles creation automatically) - await mcpClient.callTool('append_to_conversation', { - conversation_id: convId, - new_messages: [ - { author: 'user', content: message, timestamp: new Date().toISOString() }, - { author: 'assistant', content: llmResponse, timestamp: new Date().toISOString() } - ], - participants: ['user', 'assistant'], // Requis pour première fois - context: { - category: 'chat', - date: new Date().toISOString() - } - }); - - res.json({ response: llmResponse, conversationId: convId }); -}); -``` - -SIMPLICITÉ BACKEND: -- Toujours le même appel (append_to_conversation) -- Pas de logique if/else -- MCP server gère la complexité - -================================================================================ -APPROCHE B: GARDER add_conversation ET append_to_conversation SÉPARÉS -================================================================================ - -append_to_conversation REJETTE si conversation n'existe pas: -- Backend doit tracker si c'est le premier message -- Appelle add_conversation pour création -- Appelle append_to_conversation pour ajouts - -CODE append_to_conversation (strict): -```python -async def append_to_conversation( - self, - conversation_id: str, - new_messages: List[Dict[str, str]] -) -> str: - """ - Ajoute des messages à une conversation EXISTANTE - Lève une erreur si la conversation n'existe pas - """ - # Vérifier existence - try: - existing = self.conversations.get(ids=[conversation_id]) - if not existing['documents'] or len(existing['documents']) == 0: - raise ValueError(f"Conversation {conversation_id} does not exist. Use add_conversation first.") - except Exception as e: - raise ValueError(f"Conversation {conversation_id} not found: {e}") - - # [Reste du code d'append...] 
-``` - -UTILISATION BACKEND (plus complexe): -```javascript -// POST /api/chat/message -app.post('/api/chat/message', async (req, res) => { - const { message, conversationId, isFirstMessage } = req.body; - - // Generate ID if new - const convId = conversationId || `conv_${Date.now()}`; - - const llmResponse = await callClaudeAPI(message); - - const messages = [ - { author: 'user', content: message, timestamp: new Date().toISOString() }, - { author: 'assistant', content: llmResponse, timestamp: new Date().toISOString() } - ]; - - // DIFFÉRENT selon si première fois ou non - if (isFirstMessage || !conversationId) { - // Première fois: créer - await mcpClient.callTool('add_conversation', { - conversation_id: convId, - participants: ['user', 'assistant'], - messages: messages, - context: { category: 'chat', date: new Date().toISOString() } - }); - } else { - // Fois suivantes: append - await mcpClient.callTool('append_to_conversation', { - conversation_id: convId, - new_messages: messages - }); - } - - res.json({ response: llmResponse, conversationId: convId }); -}); -``` - -DÉSAVANTAGES: -- Backend plus complexe (if/else) -- Doit tracker isFirstMessage -- Risque d'erreur si mauvaise détection - -================================================================================ -MA RECOMMANDATION FINALE -================================================================================ - -APPROCHE A: append_to_conversation AVEC AUTO-CREATE - -POURQUOI: -1. Backend simplifié (1 seul appel) -2. Robuste (pas de risque d'oublier add_conversation) -3. Sémantique acceptable (append = "ajouter à", que ça existe ou non) -4. Moins de surface d'erreur - -IMPLEMENTATION: -- J'ajoute append_to_conversation dans mcp_ikario_memory.py -- Avec détection + auto-create si n'existe pas -- J'expose l'outil dans server.py -- Je mets à jour le spec UI pour utiliser cet outil - -ALTERNATIVE: -Si tu préfères la sémantique stricte (Approche B), je peux faire ça aussi. 
- -================================================================================ -PROCHAINE ÉTAPE -================================================================================ - -Dis-moi: -1. APPROCHE A (auto-create) ou APPROCHE B (strict)? -2. Une fois choisi, je vais: - - Modifier mcp_ikario_memory.py - - Modifier server.py pour exposer l'outil - - Tester l'implémentation - - Mettre à jour le spec UI - - Supprimer ancien spec - - Supprimer 15 anciennes issues - - Créer 15 nouvelles issues - - Lancer agent initializer bis - -Quelle approche préfères-tu? - -================================================================================ -QUESTION: EST-CE QUE LA REFLEXION LLM EST ENREGISTREE? -================================================================================ -Date: 19 decembre 2025 - 23h50 - -Tu demandes si sont enregistrés: -1. Message utilisateur -2. Réflexion LLM (thinking) -3. Message LLM (réponse) - -RÉPONSE ACTUELLE: NON, la réflexion LLM n'est PAS enregistrée - -ANALYSE DU CODE ACTUEL: ----------------------- - -Structure des messages (ligne 113): -```python -messages: List[Dict[str, str]] -# [{"author": "david", "content": "...", "timestamp": "14:30:05"}, ...] -``` - -Champs actuels: -- author: "david" ou "ikario" -- content: Le contenu du message -- timestamp: Horodatage - -Il n'y a PAS de champ "thinking" ou "reflection". - -CE QUI EST ENREGISTRÉ ACTUELLEMENT: ------------------------------------ - -Message user: -{ - "author": "user", - "content": "Comment faire un fetch API?", - "timestamp": "14:30:00" -} - -Message LLM: -{ - "author": "assistant", - "content": "Voici comment faire un fetch API: ...", <-- SEULEMENT la réponse finale - "timestamp": "14:30:05" -} - -La réflexion interne (Extended Thinking) n'est PAS capturée. - -================================================================================ -QUESTION: VEUX-TU ENREGISTRER LA RÉFLEXION LLM? 
-================================================================================ - -Avec Extended Thinking, Claude génère: -1. Thinking (réflexion interne, raisonnement) -2. Response (réponse visible à l'utilisateur) - -OPTION 1: ENREGISTRER SEULEMENT LA RÉPONSE (actuel) ---------------------------------------------------- -Message LLM dans ChromaDB: -{ - "author": "assistant", - "content": "Voici comment faire un fetch API: ..." -} - -AVANTAGES: -- Plus simple -- Moins de données stockées -- Embeddings basés sur le contenu utile - -INCONVÉNIENTS: -- Perte du raisonnement interne -- Impossible de retrouver "comment le LLM a pensé" - -OPTION 2: ENREGISTRER THINKING + RÉPONSE (recommandé) ------------------------------------------------------ -Message LLM dans ChromaDB: -{ - "author": "assistant", - "content": "Voici comment faire un fetch API: ...", - "thinking": "L'utilisateur demande... je dois expliquer... [réflexion complète]" -} - -OU (séparé): -Message thinking: -{ - "author": "assistant", - "message_type": "thinking", - "content": "[réflexion interne]" -} - -Message response: -{ - "author": "assistant", - "message_type": "response", - "content": "Voici comment faire..." 
-} - -AVANTAGES: -- Capture le raisonnement complet -- Recherche sémantique sur la réflexion -- Comprendre l'évolution de la pensée -- Traçabilité totale - -INCONVÉNIENTS: -- Plus de données stockées -- Structure plus complexe - -OPTION 3: THINKING SÉPARÉ (dans thoughts, pas conversations) ------------------------------------------------------------- -Conversation: -- Message user -- Message LLM (réponse seulement) - -Thoughts (collection séparée): -- Thinking du LLM stocké comme une "pensée" - -AVANTAGES: -- Séparation claire: conversations = dialogue, thoughts = réflexions -- Cohérent avec l'architecture actuelle (2 collections) - -INCONVÉNIENTS: -- Perte du lien direct avec la conversation -- Plus complexe à récupérer - -================================================================================ -MA RECOMMANDATION -================================================================================ - -OPTION 2 (ENREGISTRER THINKING + RÉPONSE dans le même message) - -Structure proposée: -```python -messages: List[Dict[str, Any]] # Changement: Any au lieu de str - -# Message user (inchangé) -{ - "author": "user", - "content": "Comment faire un fetch API?", - "timestamp": "14:30:00" -} - -# Message LLM (nouveau format) -{ - "author": "assistant", - "content": "Voici comment faire un fetch API: ...", - "thinking": "[Réflexion interne du LLM...]", # NOUVEAU - "timestamp": "14:30:05" -} -``` - -IMPLÉMENTATION: -- Modifier add_conversation pour accepter champ "thinking" optionnel -- Stocker thinking dans les métadonnées du message individuel -- Document principal: inclure ou non le thinking? (à décider) - -POUR LE DOCUMENT PRINCIPAL: -OPTION A: Inclure thinking - "user: Comment faire...\nassistant (thinking): [réflexion]\nassistant: Voici comment..." - -OPTION B: Exclure thinking (seulement dialogue visible) - "user: Comment faire...\nassistant: Voici comment..." - -Je recommande OPTION A (inclure thinking dans document principal). 

WHY:
- Richer semantic search
- Retrieve "that time the LLM reasoned about X"
- Full traceability

================================================================================
DECISION REQUIRED
================================================================================

Before starting to develop append_to_conversation, you must decide:

1. RECORD THE LLM'S REASONING?
   a) YES - Add a "thinking" field to the messages
   b) NO - Keep only "content" (the final response)

2. IF YES, IN WHAT FORMAT?
   a) Thinking inside the same message (recommended)
   b) Thinking as a separate message
   c) Thinking in the thoughts collection (separate)

3. IF YES, IN THE MAIN DOCUMENT?
   a) Include thinking in the embedding
   b) Exclude thinking (dialogue only)

My recommendations:
1. a) YES
2. a) Same message
3. a) Include thinking

What do you think?

================================================================================
DECISION CONFIRMED: OPTION 2 (THINKING INSIDE THE MESSAGE)
================================================================================
Date: December 19, 2025 - 11:55 PM

You confirm:
- YES to recording the thinking
- Option 2: Thinking inside the same message (it is part of the conversation)
- NOT a separate thought in the thoughts collection

CORRECT! The thinking is the LLM's reasoning DURING the conversation.

================================================================================
DETAILED PLAN: INTEGRATING THINKING INTO CONVERSATIONS
================================================================================

PHASE 1: ANALYSIS OF THE REQUIRED CHANGES
-----------------------------------------

Files to modify:
1. mcp_ikario_memory.py (C:/Users/david/SynologyDrive/ikario/ikario_rag/)
   - Modify add_conversation
   - Add append_to_conversation

2. server.py (C:/Users/david/SynologyDrive/ikario/ikario_rag/)
   - Expose append_to_conversation as an MCP tool

3. prompts/app_spec_ikario_rag_UI.txt (C:/GitHub/Linear_coding/)
   - Update it to use append_to_conversation
   - Document the thinking field

PHASE 2: DATA STRUCTURE
-----------------------

NEW MESSAGE FORMAT:

User message (unchanged):
{
  "author": "user",
  "content": "How do I do a fetch API call?",
  "timestamp": "2025-12-19T14:30:00"
}

LLM message (NEW, with thinking):
{
  "author": "assistant",
  "content": "Here is how to do a fetch API call...",
  "thinking": "The user is asking for an explanation of the fetch API. I should explain...",  # OPTIONAL
  "timestamp": "2025-12-19T14:30:05"
}

STORAGE IN CHROMADB:

1. MAIN DOCUMENT (conversation_id):
   Documents: Full text with the thinking included
   Format:
   ```
   user: How do I do a fetch API call?
   assistant (thinking): The user is asking for an explanation...
   assistant: Here is how to do a fetch API call...
   ```

2. INDIVIDUAL MESSAGES (conversation_id_msg_001, etc.):
   Documents: Message content
   Metadata:
   - author: "user" or "assistant"
   - timestamp: "..."
   - sequence: "1", "2", etc.
   - thinking: "[thinking text]" (if present, optional)
   - message_type: "individual_message"

DECISION: INCLUDE THINKING IN THE MAIN DOCUMENT

WHY:
- Richer semantic search
- "Find the conversation where the LLM reasoned about React performance"
- Full traceability of the reasoning

PHASE 3: CHANGES TO add_conversation
------------------------------------

Required changes:

1. SIGNATURE (lines 100-106):
   BEFORE:
   ```python
   async def add_conversation(
       self,
       participants: List[str],
       messages: List[Dict[str, str]],  # <-- str
       context: Dict[str, Any],
       conversation_id: Optional[str] = None
   ) -> str:
   ```

   AFTER:
   ```python
   async def add_conversation(
       self,
       participants: List[str],
       messages: List[Dict[str, Any]],  # <-- Any, to support thinking
       context: Dict[str, Any],
       conversation_id: Optional[str] = None
   ) -> str:
   ```

2. MAIN DOCUMENT (lines 131-138):
   BEFORE:
   ```python
   full_text_parts = []
   for msg in messages:
       author = msg.get('author', 'unknown')
       content = msg.get('content', '')
       full_text_parts.append(f"{author}: {content}")
   ```

   AFTER:
   ```python
   full_text_parts = []
   for msg in messages:
       author = msg.get('author', 'unknown')
       content = msg.get('content', '')
       thinking = msg.get('thinking', None)

       # If thinking is present, include it in the main document
       if thinking:
           full_text_parts.append(f"{author} (thinking): {thinking}")

       full_text_parts.append(f"{author}: {content}")
   ```

3. INDIVIDUAL MESSAGES (lines 166-187):
   BEFORE:
   ```python
   for i, msg in enumerate(messages):
       msg_id = f"{conversation_id}_msg_{str(i+1).zfill(3)}"
       msg_content = msg.get('content', '')
       msg_author = msg.get('author', 'unknown')
       msg_timestamp = msg.get('timestamp', '')

       msg_metadata = {
           "conversation_id": conversation_id,
           "message_type": "individual_message",
           "author": msg_author,
           "timestamp": msg_timestamp,
           "sequence": str(i+1)
       }
   ```

   AFTER:
   ```python
   for i, msg in enumerate(messages):
       msg_id = f"{conversation_id}_msg_{str(i+1).zfill(3)}"
       msg_content = msg.get('content', '')
       msg_author = msg.get('author', 'unknown')
       msg_timestamp = msg.get('timestamp', '')
       msg_thinking = msg.get('thinking', None)  # NEW

       msg_metadata = {
           "conversation_id": conversation_id,
           "message_type": "individual_message",
           "author": msg_author,
           "timestamp": msg_timestamp,
           "sequence": str(i+1)
       }

       # Add thinking to the metadata if present
       if msg_thinking:
           msg_metadata["thinking"] = msg_thinking  # NEW
   ```

PHASE 4: IMPLEMENTING append_to_conversation
--------------------------------------------

The complete new function:

```python
async def append_to_conversation(
    self,
    conversation_id: str,
    new_messages: List[Dict[str, Any]],
    participants: Optional[List[str]] = None,
    context: Optional[Dict[str, Any]] = None
) -> str:
    """
    Appends messages to a conversation (or creates it if it does not exist)

    Supports an optional 'thinking' field on each message.

    Args:
        conversation_id: Conversation ID
        new_messages: Messages to append
            Format: [
                {"author": "user", "content": "...", "timestamp": "..."},
                {"author": "assistant", "content": "...", "thinking": "...", "timestamp": "..."}
            ]
        participants: List of participants (required on creation)
        context: Metadata (required on creation)

    Returns:
        Confirmation message
    """
    # 1. CHECK WHETHER THE CONVERSATION EXISTS
    try:
        existing = self.conversations.get(ids=[conversation_id])
        conversation_exists = (
            existing and
            existing['documents'] and
            len(existing['documents']) > 0
        )
    except Exception:
        conversation_exists = False

    # 2. IF IT DOES NOT EXIST: CREATE (delegate to add_conversation)
    if not conversation_exists:
        if not participants or not context:
            raise ValueError(
                "participants and context required when creating new conversation"
            )
        return await self.add_conversation(
            participants=participants,
            messages=new_messages,
            context=context,
            conversation_id=conversation_id
        )

    # 3. IF IT EXISTS: APPEND

    # 3a. Extract the existing metadata
    existing_metadata = existing['metadatas'][0]
    current_message_count = int(existing_metadata.get('message_count', 0))
    next_sequence = current_message_count + 1

    # 3b. Retrieve the old full text
    old_full_text = existing['documents'][0]

    # 3c. Build the new text, with thinking if present
    new_text_parts = []
    for msg in new_messages:
        author = msg.get('author', 'unknown')
        content = msg.get('content', '')
        thinking = msg.get('thinking', None)

        # Include thinking in the main document if present
        if thinking:
            new_text_parts.append(f"{author} (thinking): {thinking}")

        new_text_parts.append(f"{author}: {content}")

    new_text = "\n".join(new_text_parts)
    updated_full_text = old_full_text + "\n" + new_text

    # 3d. Update the metadata
    updated_metadata = existing_metadata.copy()
    updated_metadata['message_count'] = str(current_message_count + len(new_messages))

    # Merge context if provided
    if context:
        for key, value in context.items():
            if isinstance(value, list):
                updated_metadata[key] = ", ".join(str(v) for v in value)
            elif isinstance(value, dict):
                updated_metadata[key] = json.dumps(value)
            else:
                updated_metadata[key] = str(value)

    # 3e. Delete the old main document
    self.conversations.delete(ids=[conversation_id])

    # 3f. Add the new main document
    self.conversations.add(
        documents=[updated_full_text],
        metadatas=[updated_metadata],
        ids=[conversation_id]
    )

    # 3g. Add the new individual messages
    for i, msg in enumerate(new_messages):
        msg_id = f"{conversation_id}_msg_{str(next_sequence + i).zfill(3)}"
        msg_content = msg.get('content', '')
        msg_author = msg.get('author', 'unknown')
        msg_timestamp = msg.get('timestamp', '')
        msg_thinking = msg.get('thinking', None)

        msg_metadata = {
            "conversation_id": conversation_id,
            "message_type": "individual_message",
            "author": msg_author,
            "timestamp": msg_timestamp,
            "sequence": str(next_sequence + i)
        }

        # Add thinking to the metadata if present
        if msg_thinking:
            msg_metadata["thinking"] = msg_thinking

        # Generate the embedding for this message (content only, not thinking)
        self.conversations.add(
            documents=[msg_content],
            metadatas=[msg_metadata],
            ids=[msg_id]
        )

    return f"Conversation {conversation_id} updated: added {len(new_messages)} messages (total: {updated_metadata['message_count']})"
```

PHASE 5: EXPOSING IT IN server.py
---------------------------------

Add the MCP tool for append_to_conversation:

```python
@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
    """Handle tool calls"""

    # ... (existing tools: add_thought, add_conversation, etc.)
- - # NOUVEAU: append_to_conversation - elif name == "append_to_conversation": - result = await memory.append_to_conversation( - conversation_id=arguments["conversation_id"], - new_messages=arguments["new_messages"], - participants=arguments.get("participants"), - context=arguments.get("context") - ) - return [types.TextContent(type="text", text=result)] -``` - -Et ajouter la définition de l'outil: - -```python -@server.list_tools() -async def list_tools() -> list[types.Tool]: - """List available tools""" - return [ - # ... (outils existants) - - types.Tool( - name="append_to_conversation", - description=( - "Ajoute des messages à une conversation existante (ou la crée si nécessaire). " - "Support du champ 'thinking' optionnel pour capturer le raisonnement du LLM. " - "Si la conversation n'existe pas, elle sera créée automatiquement." - ), - inputSchema={ - "type": "object", - "properties": { - "conversation_id": { - "type": "string", - "description": "ID de la conversation" - }, - "new_messages": { - "type": "array", - "description": "Nouveaux messages à ajouter", - "items": { - "type": "object", - "properties": { - "author": {"type": "string"}, - "content": {"type": "string"}, - "thinking": {"type": "string", "description": "Réflexion interne du LLM (optionnel)"}, - "timestamp": {"type": "string"} - }, - "required": ["author", "content", "timestamp"] - } - }, - "participants": { - "type": "array", - "items": {"type": "string"}, - "description": "Liste des participants (requis si création)" - }, - "context": { - "type": "object", - "description": "Métadonnées de la conversation (requis si création)" - } - }, - "required": ["conversation_id", "new_messages"] - } - ) - ] -``` - -PHASE 6: TESTS A EFFECTUER --------------------------- - -Test 1: Création nouvelle conversation SANS thinking -```python -await append_to_conversation( - conversation_id="conv_test_1", - new_messages=[ - {"author": "user", "content": "Bonjour", "timestamp": "14:30:00"}, - {"author": 
"assistant", "content": "Salut!", "timestamp": "14:30:05"}
    ],
    participants=["user", "assistant"],
    context={"category": "test"}
)
```

Test 2: Create a new conversation WITH thinking
```python
await append_to_conversation(
    conversation_id="conv_test_2",
    new_messages=[
        {"author": "user", "content": "Comment faire un fetch?", "timestamp": "14:30:00"},
        {
            "author": "assistant",
            "content": "Voici comment...",
            "thinking": "L'utilisateur demande une explication sur fetch API...",
            "timestamp": "14:30:05"
        }
    ],
    participants=["user", "assistant"],
    context={"category": "test"}
)
```

Test 3: Append to an existing conversation WITHOUT thinking
```python
await append_to_conversation(
    conversation_id="conv_test_1",
    new_messages=[
        {"author": "user", "content": "Merci!", "timestamp": "14:31:00"},
        {"author": "assistant", "content": "De rien!", "timestamp": "14:31:02"}
    ]
)
```

Test 4: Append to an existing conversation WITH thinking
```python
await append_to_conversation(
    conversation_id="conv_test_2",
    new_messages=[
        {"author": "user", "content": "Et avec async/await?", "timestamp": "14:31:00"},
        {
            "author": "assistant",
            "content": "Avec async/await...",
            "thinking": "Il veut comprendre async/await avec fetch...",
            "timestamp": "14:31:05"
        }
    ]
)
```

Test 5: Verify embeddings and metadata
```python
# Retrieve the conversation
result = await search_conversations("fetch API", n_results=1)

# Verify:
# - The main document contains the thinking
# - Individual messages carry the "thinking" metadata
# - Embeddings are correct
```

PHASE 7: UI SPEC UPDATE
-----------------------

In prompts/app_spec_ikario_rag_UI.txt:

1. Replace add_conversation with append_to_conversation in the examples

2.
Document the thinking field:
```
MCP TOOL: append_to_conversation
- Parameters:
  * conversation_id: session ID
  * new_messages: array of messages
    - author: "user" or "assistant"
    - content: message content
    - thinking: LLM reasoning (OPTIONAL)
    - timestamp: ISO date
  * participants: ["user", "assistant"] (required when creating a new conversation)
  * context: {category, date, ...} (required when creating a new conversation)
```

3. Backend usage example:
```javascript
// POST /api/chat/message
const llmResponse = await callClaudeAPI(userMessage, { extended_thinking: true });

await mcpClient.callTool('append_to_conversation', {
  conversation_id: conversationId,
  new_messages: [
    { author: 'user', content: userMessage, timestamp: new Date().toISOString() },
    {
      author: 'assistant',
      content: llmResponse.content,
      thinking: llmResponse.thinking, // include the thinking when Extended Thinking is enabled
      timestamp: new Date().toISOString()
    }
  ],
  participants: ['user', 'assistant'],
  context: { category: 'chat', date: new Date().toISOString() }
});
```

================================================================================
PLAN SUMMARY
================================================================================

EXECUTION ORDER:

1. [IN PROGRESS] Create this detailed plan ✓
2. Make a backup commit in ikario_rag
3. Modify add_conversation (thinking support)
4. Implement append_to_conversation (with thinking)
5. Modify server.py (expose append_to_conversation)
6. Run the 5 test scenarios
7. Update the UI spec
8. Final commit
9. Delete old spec + old issues
10. Create 15 new issues
11.
Launch the initializer-bis agent

MODIFIED FILES:
- C:/Users/david/SynologyDrive/ikario/ikario_rag/mcp_ikario_memory.py
- C:/Users/david/SynologyDrive/ikario/ikario_rag/server.py
- C:/GitHub/Linear_coding/prompts/app_spec_ikario_rag_UI.txt

NEW MCP TOOLS:
- append_to_conversation (8th tool)

NEW FORMAT:
- Messages with an optional "thinking" field
- The main document includes the thinking
- Individual message metadata includes the thinking

================================================================================
NEXT STEP
================================================================================

Does this plan work for you?

If YES:
1. I make the backup commit
2. I start the modifications

If NO:
- Tell me what to change in the plan

================================================================================
IMPLEMENTATION COMPLETE!
================================================================================
Date: December 20, 2025 - 00:15

EVERYTHING IS DONE AND SUCCESSFULLY TESTED!

================================================================================
SUMMARY OF CHANGES
================================================================================

MODIFIED FILES:
1. mcp_ikario_memory.py (C:/Users/david/SynologyDrive/ikario/ikario_rag/)
   - Line 103: add_conversation signature changed (Dict[str, Any])
   - Lines 131-143: main document includes thinking
   - Lines 172-200: individual messages store thinking in their metadata
   - Lines 202-329: new append_to_conversation function (129 lines)

2. server.py (C:/Users/david/SynologyDrive/ikario/ikario_rag/)
   - Lines 173-272: append_to_conversation tool added (MCP definition)
   - Line 195: add_conversation tool updated (thinking in schema)
   - Lines 427-438: append_to_conversation handler added

3.
test_append_conversation.py (NEW - tests)
   - 6 automated tests
   - All pass successfully

================================================================================
COMMITS CREATED
================================================================================

Commit 1 (backup): 55d905b
"Backup before adding append_to_conversation with thinking support"

Commit 2 (implementation): cba84fe
"Add append_to_conversation with thinking support (8th MCP tool)"

================================================================================
TESTS PASSED (6/6)
================================================================================

Test 1: Create conversation WITHOUT thinking
[OK] Conversation added: test_conv_1 (2 messages)

Test 2: Create conversation WITH thinking
[OK] Conversation added: test_conv_2 (2 messages)

Test 3: Append to conversation WITHOUT thinking
[OK] Conversation test_conv_1 updated: added 2 messages (total: 4)

Test 4: Append to conversation WITH thinking
[OK] Conversation test_conv_2 updated: added 2 messages (total: 4)

Test 5: Semantic search including thinking
[OK] Found 1 conversation
     Relevance: 0.481
     Thinking visible in the main document!

Test 6: Metadata verification
[OK] Thinking metadata is present!
     Stored in the individual messages

================================================================================
NEW MESSAGE FORMAT
================================================================================

User message (unchanged):
{
  "author": "user",
  "content": "Comment faire un fetch API?",
  "timestamp": "2025-12-20T00:10:00"
}

LLM message (NEW, with optional thinking):
{
  "author": "assistant",
  "content": "Voici comment faire...",
  "thinking": "L'utilisateur demande une explication...",  # OPTIONAL
  "timestamp": "2025-12-20T00:10:05"
}

================================================================================
NEW MCP TOOL: append_to_conversation (8th)
================================================================================

DESCRIPTION:
Appends messages to an existing conversation (or creates it if necessary).
Supports an optional 'thinking' field to capture the LLM's reasoning.
If the conversation does not exist, it is created automatically.

PARAMETERS:
- conversation_id: string (required)
- new_messages: array (required)
  * author: string
  * content: string
  * thinking: string (OPTIONAL)
  * timestamp: string
- participants: array (required when creating)
- context: object (required when creating)

USAGE EXAMPLE:
await mcpClient.callTool('append_to_conversation', {
  conversation_id: 'conv_20251220_0010',
  new_messages: [
    { author: 'user', content: 'Bonjour', timestamp: '...' },
    {
      author: 'assistant',
      content: 'Salut!',
      thinking: "L'utilisateur me salue...",
      timestamp: '...'
    }
  ],
  participants: ['user', 'assistant'],
  context: { category: 'chat', date: '2025-12-20' }
});

================================================================================
BENEFITS
================================================================================

1.
THINKING CAPTURE:
   - The LLM's reasoning is preserved in memory
   - Richer semantic search
   - Full traceability of the model's reflections

2. AUTO-CREATE:
   - Simpler backend (a single call)
   - No need to track whether this is the first message
   - Robust

3. BACKWARD COMPATIBLE:
   - thinking is optional
   - Existing code keeps working
   - No breaking changes

4. SEMANTIC SEARCH:
   - thinking is included in the main embedding
   - "Find the conversation where the LLM reasoned about X"
   - More relevant results

================================================================================
NEXT STEPS
================================================================================

1. [DONE] Update the UI spec (app_spec_ikario_rag_UI.txt) ✓
2. [NEXT] Delete the old spec (app_spec_ikario_rag_improvements.txt)
3. Delete the 15 old Linear issues (TEAMPHI-305 to 319)
4. Create 15 new issues from the new spec
5. Launch the initializer-bis agent

================================================================================
UI SPEC UPDATED!
================================================================================
Date: December 20, 2025 - 00:30

File: prompts/app_spec_ikario_rag_UI.txt

CHANGES MADE:

1. Lines 9-13: overview updated
   - "8 MCP tools" (instead of 7)
   - append_to_conversation added to the list
   - Optional thinking support mentioned

2. Line 44: technology stack updated
   - "8 MCP tools available (with append_to_conversation + thinking support)"

3. Lines 103-124: API routes updated
   - New route: POST /api/memory/conversations/append
   - append_to_conversation documented (auto-create, thinking)
   - Message format with thinking documented

4. Lines 156-185: memory service layer updated
   - appendToConversation() function added with a complete example
   - Auto-create and optional thinking documented

5.
Lines 440-462: chat integration updated
   - append_to_conversation used for streaming chat
   - POST example with optional thinking
   - Extended Thinking support documented

6. Lines 777-790: tests updated
   - append_to_conversation test added
   - Optional thinking test
   - Auto-creation test

7. Lines 982-988: success criteria updated
   - "8 endpoints" (instead of 7)
   - append_to_conversation validation added
   - thinking support validation

8. Lines 1012-1014: constraints updated
   - "8 existing MCP tools"
   - Note: append_to_conversation already implemented (commit cba84fe)

SUMMARY OF CHANGES:
- 8 sections modified
- Complete documentation of the new tool
- Concrete usage examples with thinking
- Clear distinction: add_conversation (full) vs append_to_conversation (incremental)
- Guidelines for chat integration with thinking support

THE SPEC IS READY for issue creation!

GIT COMMIT CREATED:
Commit: 3a17744
Message: "Update UI spec for append_to_conversation and thinking support"

Committed files:
- prompts/app_spec_ikario_rag_UI.txt (updated spec)
- navette.txt (this file)

================================================================================
CURRENT STATE - FULL RECAP
================================================================================

WORK COMPLETED:
✓ Detailed plan created (7 phases)
✓ Backup commit (55d905b)
✓ mcp_ikario_memory.py modified (thinking support + append_to_conversation)
✓ server.py modified (8th MCP tool exposed)
✓ Automated tests (6/6 passing)
✓ Implementation commit (cba84fe)
✓ UI spec updated (8 sections modified)
✓ UI spec commit (3a17744)

COMMITS CREATED (3 total):
1. 55d905b - Backup before adding append_to_conversation
2. cba84fe - Add append_to_conversation with thinking support (ikario_rag)
3. 3a17744 - Update UI spec (Linear_coding)

AVAILABLE MCP TOOLS (8):
1. add_thought
2.
add_conversation (with optional thinking)
3. append_to_conversation (NEW - incremental + auto-create + thinking)
4. search_thoughts
5. search_conversations
6. search_memories
7. trace_concept_evolution
8. check_consistency

NEW MESSAGE FORMAT:
{
  "author": "assistant",
  "content": "Visible reply",
  "thinking": "Internal LLM reasoning",  // OPTIONAL
  "timestamp": "ISO date"
}

================================================================================
RECOMMENDED NEXT ACTIONS
================================================================================

1. DELETE the old spec (app_spec_ikario_rag_improvements.txt)
   - It causes confusion (it talks about modifying ikario_rag)
   - The new spec is complete

2. DELETE the 15 old Linear issues (TEAMPHI-305 to 319)
   - Those issues are about modifying ikario_rag (no longer wanted)
   - Clean slate for the new issues

3. CREATE 15 new issues from the new spec
   - Use: python autonomous_agent_demo.py --project-dir ikario_body --new-spec app_spec_ikario_rag_UI.txt
   - Mode: initializer bis
   - Issues for development in ikario_body only

4. LAUNCH the coding agent
   - After the issues are created
   - Mode: coding agent
   - Build the 15 UI features

DO YOU WANT ME TO CONTINUE?
Options:
a) YES - Delete old spec + old issues + create new issues
b) WAIT - You want to check something first
c) CHANGE - You want to adjust the plan

================================================================================
IMPORTANT CLARIFICATIONS - YOUR QUESTIONS
================================================================================
Date: December 20, 2025 - 00:45

QUESTION 1: What is the difference between search_thoughts and search_memories?
-------------------------------------------------------------------------------

I checked the code in mcp_ikario_memory.py:

search_thoughts (lines 191-224):
- Searches ONLY the "thoughts" collection
- Optional filter: filter_thought_type
- Returns: Ikario's internal thoughts

search_conversations (lines 226-282):
- Searches ONLY the "conversations" collection
- Optional filters: filter_category, search_level
- Returns: David-Ikario conversations

search_memories (lines 37-51):
- PROBLEM FOUND!
- Current code: searches ONLY self.conversations (line 43)
- This is NOT a true global search!
- It is essentially the same thing as search_conversations

CONCLUSION:
search_memories SHOULD perform a global search (thoughts + conversations),
but it currently searches ONLY conversations.
This is most likely a bug or an incomplete implementation.

QUESTION 2: Am I mixing up the two projects?
--------------------------------------------

YES, you are right! I conflated:

PROJECT 1: ikario_rag (C:/Users/david/SynologyDrive/ikario/ikario_rag/)
- Python MCP backend
- 8 MCP tools exposed
- ChromaDB with embeddings
- WHAT WE DID:
  * Added append_to_conversation (mcp_ikario_memory.py)
  * Added thinking support (mcp_ikario_memory.py)
  * Exposed the 8th tool (server.py)
  * Tests (test_append_conversation.py)
  * Commits: 55d905b, cba84fe

PROJECT 2: ikario_body (C:/GitHub/Linear_coding/generations/ikario_body/)
- React frontend + Express backend
- User interface for USING ikario_rag's MCP tools
- WHAT WE DID:
  * Created the UI spec (prompts/app_spec_ikario_rag_UI.txt)
  * Commit: 3a17744
  * BUT: nothing implemented in ikario_body yet!

The UI spec I created is for LATER, when we build the interface
in ikario_body that will USE ikario_rag.

You are right: WE MUST FINISH ikario_rag FIRST!

================================================================================
WHAT REMAINS TO BE DONE IN ikario_rag
================================================================================

1. FIX search_memories (bug identified)
   - Must search thoughts + conversations
   - Not just conversations

2. TEST the complete MCP server
   - Launch server.py
   - Test with a real MCP client
   - Verify all 8 tools work

3. TEST append_to_conversation over MCP
   - Via server.py (not only test_append_conversation.py)
   - With optional thinking
   - Auto-create

4. VERIFY backward compatibility
   - Existing code keeps working
   - No breaking changes

ONLY THEN will we move on to ikario_body.

================================================================================
DECISION REQUIRED
================================================================================

Do you want me to:

A) FIX search_memories first (global search over thoughts + conversations)
B) TEST the MCP server as-is (with the buggy search_memories)
C) SOMETHING else?

I recommend A: fix search_memories to get a true global search.

================================================================================
search_memories FIX COMPLETE!
================================================================================
Date: December 20, 2025 - 00:55

PROBLEM IDENTIFIED:
search_memories searched ONLY conversations (line 43).
It was NOT a true global search.

FIX APPLIED (mcp_ikario_memory.py lines 37-87):
1. Search thoughts
2. Search conversations
3. Merge the results
4. Sort by relevance_score (descending)
5. Add a 'source' field (thought vs conversation)
6.
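The global search described here could be sketched as a pure merge step over per-collection hits. This is a minimal sketch, not the project's actual code: the function name and the `relevance_score` / `source` result keys are assumptions modeled on the notes.

```python
from typing import Any


def merge_search_results(
    thought_hits: list[dict[str, Any]],
    conversation_hits: list[dict[str, Any]],
    n_results: int = 5,
) -> list[dict[str, Any]]:
    """Merge per-collection hits into one globally ranked list.

    Each hit is tagged with its origin collection so callers can tell
    thoughts from conversations, then everything is sorted by relevance
    and truncated to the requested size.
    """
    merged: list[dict[str, Any]] = []
    for hit in thought_hits:
        merged.append({**hit, "source": "thought"})
    for hit in conversation_hits:
        merged.append({**hit, "source": "conversation"})
    # Highest relevance first
    merged.sort(key=lambda h: h["relevance_score"], reverse=True)
    return merged[:n_results]
```

In a fixed `search_memories`, each collection would be queried separately and the two hit lists passed through a merge like this, so a single result list comes back globally ranked regardless of which collection each hit came from.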
Return the top n_results combined

TESTS CREATED (test_search_memories.py):
Test 1: Search "Python" - finds thoughts AND conversations ✓
Test 2: Verify sorting by relevance_score ✓
Test 3: Search "JavaScript" - finds a conversation ✓
Test 4: filter_category (only affects conversations) ✓

TEST RESULTS:
  Found 5 results:
  - Thoughts: 2
  - Conversations: 3

  [1] Source: thought      | Relevance: 0.513
  [2] Source: thought      | Relevance: 0.502
  [3] Source: conversation | Relevance: 0.479
  [4] Source: conversation | Relevance: 0.411
  [5] Source: conversation | Relevance: 0.165

  ✓ search_memories combines thoughts + conversations: OK
  ✓ Results sorted by relevance: OK
  ✓ Source field added: OK

COMMIT CREATED:
Commit: 05a4613
Message: "Fix search_memories to search in both thoughts and conversations"

Modified files:
- mcp_ikario_memory.py (search_memories function rewritten)
- test_search_memories.py (new test file)

================================================================================
CURRENT STATE OF ikario_rag - FULL SUMMARY
================================================================================

COMMITS CREATED (4 total in ikario_rag):
1. 55d905b - Backup before adding append_to_conversation
2. cba84fe - Add append_to_conversation with thinking support
3. 05a4613 - Fix search_memories (global search)
4. (Note: commit 3a17744 is in Linear_coding, not ikario_rag)

AVAILABLE MCP TOOLS (8):
1. add_thought - Add a thought ✓
2. add_conversation - Add a full conversation (optional thinking) ✓
3. append_to_conversation - Append messages incrementally (auto-create + thinking) ✓
4. search_thoughts - Search thoughts ✓
5. search_conversations - Search conversations ✓
6. search_memories - GLOBAL search (thoughts + conversations) ✓ FIXED!
7. trace_concept_evolution - Trace a concept's evolution ✓
8.
check_consistency - Check consistency ✓

TESTS RUN:
✓ test_append_conversation.py (6/6 tests) - append + thinking
✓ test_search_memories.py (4/4 tests) - global search

STILL TO DO in ikario_rag:
1. Test the complete MCP server (server.py)
2. Test append_to_conversation over the MCP protocol (not just Python)
3. Verify backward compatibility

================================================================================
NEXT STEP
================================================================================

Do you want to:

A) TEST the complete MCP server (launch server.py and test with an MCP client)
B) CREATE an MCP test for append_to_conversation
C) SOMETHING else?

I recommend A: test the complete MCP server to make sure everything works over the MCP protocol.

================================================================================

================================================================================
CRITICAL PROBLEM: EMBEDDINGS TRUNCATED FOR LONG CONVERSATIONS
================================================================================
Date: 2025-12-20 - 15:30

PROBLEM IDENTIFIED:
-------------------

1. MASSIVE TRUNCATION:
   - Current model: all-MiniLM-L6-v2
   - Limit: 256 tokens (~1,000 characters)
   - Founding Conversation #1: 23,574 words (~106,000 chars)
   - RESULT:
     - ChromaDB storage: ✅ full 106,000 chars
     - Embedding built on: ❌ only 1,280 chars (1.2%!)
     - Semantic search: ❌ 98.8% of the conversation INVISIBLE
   - If you search for something discussed after the first 256 tokens,
     search_memories will NEVER find it.

2.
QUALITY INSUFFICIENT FOR PHILOSOPHY:
   - all-MiniLM-L6-v2: 22M parameters (VERY small)
   - Optimized for: speed, not deep semantic understanding
   - Language: mainly English
   - Performance on abstract French concepts: POOR

REAL-WORLD IMPACT:
------------------

Test with different sizes:
- 250 chars (50 words): 100% kept ✅
- 1,000 chars (200 words): 100% kept ✅
- 2,500 chars (500 words): 51.2% kept ⚠️
- 10,000 chars (2,000 words): 12.8% kept ❌
- 106,000 chars (23,574 words): 1.2% kept ❌❌❌

For long philosophical conversations this is CATASTROPHIC.

PROPOSED SOLUTION:
==================

BENCHMARK OF 3 MODELS:

1. all-MiniLM-L6-v2 (CURRENT):
   - Parameters: 22M
   - Dimension: 384
   - Max tokens: 256
   - Language: English
   - Quality: basic
   - For Founding Conversation #1: 1.2% indexed
   - VERDICT: ❌ Inadequate

2. intfloat/multilingual-e5-large:
   - Parameters: 560M (25x more capacity)
   - Dimension: 1024 (2.7x richer)
   - Max tokens: 512 (2x longer)
   - Language: excellent French + multilingual
   - Quality: state-of-the-art semantics
   - For Founding Conversation #1: ~2.4% indexed
   - VERDICT: ⚠️ Better, but still insufficient

3. BAAI/bge-m3 (RECOMMENDED BY DAVID):
   - Parameters: 568M
   - Dimension: 1024
   - Max tokens: 8192 (32x longer!)
   - Language: excellent multilingual (French included)
   - Quality: state-of-the-art retrieval
   - Features: dense + sparse + multi-vector retrieval (hybrid)
   - For Founding Conversation #1: ~38-40% indexed
   - VERDICT: ✅✅✅ EXCELLENT CHOICE!

ADVANTAGES OF BAAI/bge-m3:
--------------------------
✅ Max tokens 8192 vs 256 today (32x improvement!)
✅ Hybrid retrieval (dense + sparse) for better precision
✅ Designed specifically for multilingual retrieval
✅ Excellent on MTEB benchmarks (top 3 worldwide)
✅ Native French support
✅ Deep semantic understanding of abstract concepts
✅ For a 23,574-word conversation: keeps ~9,000 words vs 256 tokens today

PROPOSED ACTION PLAN:
=====================

OPTION A - MODEL UPGRADE ONLY (FAST):
-------------------------------------
1. Replace all-MiniLM-L6-v2 with BAAI/bge-m3 in ikario_rag
2. Re-index all existing conversations
3. Test search performance

File to modify:
- C:/Users/david/SynologyDrive/ikario/ikario_rag/mcp_ikario_memory.py
  Line 31: self.embedder = SentenceTransformer('all-MiniLM-L6-v2')
  → Replace with: self.embedder = SentenceTransformer('BAAI/bge-m3')

Pros:
✅ Simple (a one-line change)
✅ Immediate, massive improvement
✅ No chunking needed

Cons:
⚠️ Model download ~2.3GB (one-time)
⚠️ 2-3x slower (acceptable for batch work)
⚠️ +4GB RAM required
⚠️ All existing conversations must be re-indexed

OPTION B - CHUNKING + MODEL UPGRADE (OPTIMAL):
----------------------------------------------
1. Implement intelligent chunking for conversations >8192 tokens
2. Use BAAI/bge-m3 for embeddings
3.
Metadata: conversation_id + chunk_position for reconstruction

Pros:
✅ 100% coverage even for conversations >40,000 words
✅ Better semantic quality
✅ Flexible for future evolutions

Cons:
⚠️ More complex to implement
⚠️ More documents in ChromaDB
⚠️ More sophisticated search logic

FINAL RECOMMENDATION:
=====================

PHASE 1 (NOW): Option A - upgrade to BAAI/bge-m3
- Immediate gain: 1.2% → 38-40% coverage
- Simple: one line of code
- Sufficient for 95% of your conversations

PHASE 2 (IF NEEDED): add chunking for exceptional conversations >40,000 words
- Only if you regularly have conversations >40,000 words
- Otherwise unnecessary

RATIONALE FOR YOUR USE CASE:
----------------------------
Philosophy, abstract concepts, complex ideas in French:

- all-MiniLM-L6-v2: basic textual similarity, English
  → MTEB score: ~58/100
  → French philosophy: ~40/100 (estimated)

- BAAI/bge-m3: deep semantic understanding, multilingual
  → MTEB score: ~72/100 (+24%)
  → French philosophy: ~70/100 (estimated, +75% gain!)

For philosophical conversations: estimated quality gain >50%.

MIGRATION COST:
---------------
- Time: ~30 min (model download + re-index)
- Compute: 2-3x slower (1 conversation = 2s vs 0.7s today)
- Memory: +4GB RAM (total ~5GB vs ~1GB today)
- Storage: +2.3GB for the model
- Code: minimal (one line to change + a re-index script)

NEXT STEP:
==========
Decide on and implement the upgrade to BAAI/bge-m3 in ikario_rag.

================================================================================
diff --git a/prompts/app_spec_library_rag_types_docs.txt b/prompts/app_spec_library_rag_types_docs.txt
deleted file mode 100644
index 0fe4fa6..0000000
--- a/prompts/app_spec_library_rag_types_docs.txt
+++ /dev/null
@@ -1,679 +0,0 @@

  Library RAG - Type Safety & Documentation Enhancement

  Enhance the Library RAG application (philosophical texts indexing and semantic search) by adding
  strict type annotations and comprehensive Google-style docstrings to all Python modules. This will
  improve code maintainability, enable static type checking with mypy, and provide clear documentation
  for all functions, classes, and modules.

  The application is a RAG pipeline that processes PDF documents through OCR, LLM-based extraction,
  semantic chunking, and ingestion into a Weaviate vector database. It includes a Flask web interface
  for document upload, processing, and semantic search.
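As a minimal illustration of the target style — fully annotated signature plus a Google-style docstring — here is a hypothetical function (not taken from the codebase):

```python
def chunk_text(text: str, max_tokens: int = 512) -> list[str]:
    """Split text into chunks of at most ``max_tokens`` whitespace tokens.

    Args:
        text: The raw text to split.
        max_tokens: Maximum number of tokens per chunk. Must be positive.

    Returns:
        A list of chunk strings, in document order. Empty input yields
        an empty list.

    Raises:
        ValueError: If ``max_tokens`` is not positive.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    tokens = text.split()
    # Rejoin fixed-size windows of tokens into chunk strings
    return [
        " ".join(tokens[i : i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]
```

Every function in the codebase is expected to reach this level: typed parameters and return value, plus Args/Returns/Raises sections that mypy and readers can both rely on.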
- - - - - Python 3.10+ - Flask 3.0 - Weaviate 1.34.4 with text2vec-transformers - Mistral OCR API - Ollama (local) or Mistral API - mypy with strict configuration - - - Docker Compose (Weaviate + transformers) - weaviate-client, flask, mistralai, python-dotenv - - - - - - - flask_app.py: Main Flask application (640 lines) - - schema.py: Weaviate schema definition (383 lines) - - utils/: 16+ modules for PDF processing pipeline - - pdf_pipeline.py: Main orchestration (879 lines) - - mistral_client.py: OCR API client - - ocr_processor.py: OCR processing - - markdown_builder.py: Markdown generation - - llm_metadata.py: Metadata extraction via LLM - - llm_toc.py: Table of contents extraction - - llm_classifier.py: Section classification - - llm_chunker.py: Semantic chunking - - llm_cleaner.py: Chunk cleaning - - llm_validator.py: Document validation - - weaviate_ingest.py: Database ingestion - - hierarchy_parser.py: Document hierarchy parsing - - image_extractor.py: Image extraction from PDFs - - toc_extractor*.py: Various TOC extraction methods - - templates/: Jinja2 templates for Flask UI - - tests/utils2/: Minimal test coverage (3 test files) - - - - - Inconsistent type annotations across modules (some have partial types, many have none) - - Missing or incomplete docstrings (no Google-style format) - - No mypy configuration for strict type checking - - Type hints missing on function parameters and return values - - Dict[str, Any] used extensively without proper typing - - No type stubs for complex nested structures - - - - - - - - Add complete type annotations to ALL functions and methods - - Use proper generic types (List, Dict, Optional, Union) from typing module - - Add TypedDict for complex dictionary structures - - Add Protocol types for duck-typed interfaces - - Use Literal types for string constants - - Add ParamSpec and TypeVar where appropriate - - Type all class attributes and instance variables - - Add type annotations to lambda functions where possible - 
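A sketch of what such shared type definitions might look like — all names below are illustrative placeholders, not the project's actual `utils/types.py`:

```python
from typing import Literal, NewType, Protocol, TypedDict

# Semantic aliases: distinct types for plain strings with different meanings
DocumentName = NewType("DocumentName", str)
ChunkId = NewType("ChunkId", str)

# Closed set of section labels instead of a bare str
SectionType = Literal["chapter", "section", "preface", "appendix"]


class TOCEntry(TypedDict):
    """One table-of-contents entry extracted from a document."""
    title: str
    level: int
    page: int


class ChunkData(TypedDict):
    """A chunk ready for ingestion, replacing an untyped Dict[str, Any]."""
    chunk_id: ChunkId
    document: DocumentName
    section_type: SectionType
    text: str


class ProgressCallback(Protocol):
    """Duck-typed callback reporting pipeline progress."""
    def __call__(self, step: str, percent: float) -> None: ...
```

With definitions like these, mypy can reject a chunk dict with a missing key or a misspelled section label at check time, and a `ProgressCallback` parameter documents exactly what callable a pipeline step expects.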
-
- - Create mypy.ini with strict configuration
- - Enable: check_untyped_defs, disallow_untyped_defs, disallow_incomplete_defs
- - Enable: disallow_untyped_calls, disallow_untyped_decorators
- - Enable: warn_return_any, warn_redundant_casts
- - Enable: strict_equality, strict_optional
- - Set python_version to 3.10
- - Configure per-module overrides if needed for gradual migration
-
- - Create TypedDict definitions for common data structures:
-   - OCR response structures
-   - Metadata dictionaries
-   - TOC entries
-   - Chunk objects
-   - Weaviate objects
-   - Pipeline results
- - Add NewType for semantic type safety (DocumentName, ChunkId, etc.)
- - Create Protocol types for callback functions
-
- - pdf_pipeline.py: Type all 10 pipeline steps, callbacks, result dictionaries
- - flask_app.py: Type all route handlers, request/response types
- - schema.py: Type Weaviate configuration objects
- - llm_*.py: Type LLM request/response structures
- - mistral_client.py: Type API client methods and responses
- - weaviate_ingest.py: Type ingestion functions and batch operations
-
- - Add comprehensive Google-style docstrings to ALL:
-   - Module-level docstrings explaining purpose and usage
-   - Class docstrings with Attributes section
-   - Function/method docstrings with Args, Returns, Raises sections
-   - Complex algorithm explanations with Examples section
- - Include code examples for public APIs
- - Document all exceptions that can be raised
- - Add Notes section for important implementation details
- - Add See Also section for related functions
-
- - pdf_pipeline.py: Document the 10-step pipeline, each step's purpose
- - mistral_client.py: Document OCR API usage, cost calculation
- - llm_metadata.py: Document metadata extraction logic
- - llm_toc.py: Document TOC extraction strategies
- - llm_classifier.py: Document section classification types
- - llm_chunker.py: Document semantic vs basic chunking
- - llm_cleaner.py: Document cleaning rules and validation
- - llm_validator.py: Document validation criteria
- - weaviate_ingest.py: Document ingestion process, nested objects
- - hierarchy_parser.py: Document hierarchy building algorithm
-
- - Document all routes with request/response examples
- - Document SSE (Server-Sent Events) implementation
- - Document Weaviate query patterns
- - Document upload processing workflow
- - Document background job management
-
- - Document Weaviate schema design decisions
- - Document each collection's purpose and relationships
- - Document nested object structure
- - Document vectorization strategy
-
- - Add inline comments for complex logic only (don't over-comment)
- - Explain WHY not WHAT (code should be self-documenting)
- - Document performance considerations
- - Document cost implications (OCR, LLM API calls)
- - Document error handling strategies
-
- - All modules must pass mypy --strict
- - No # type: ignore comments without justification
- - CI/CD should run mypy checks
- - Type coverage should be 100%
-
- - All public functions must have docstrings
- - All docstrings must follow Google style
- - Examples should be executable and tested
- - Documentation should be clear and concise
-
- Priority 1 (Most used, most complex):
- 1. utils/pdf_pipeline.py - Main orchestration
- 2. flask_app.py - Web application entry point
- 3. utils/weaviate_ingest.py - Database operations
- 4. schema.py - Schema definition
-
- Priority 2 (Core LLM modules):
- 5. utils/llm_metadata.py
- 6. utils/llm_toc.py
- 7. utils/llm_classifier.py
- 8. utils/llm_chunker.py
- 9. utils/llm_cleaner.py
- 10. utils/llm_validator.py
-
- Priority 3 (OCR and parsing):
- 11. utils/mistral_client.py
- 12. utils/ocr_processor.py
- 13. utils/markdown_builder.py
- 14. utils/hierarchy_parser.py
- 15. utils/image_extractor.py
-
- Priority 4 (Supporting modules):
- 16. utils/toc_extractor.py
- 17. utils/toc_extractor_markdown.py
- 18. utils/toc_extractor_visual.py
- 19. utils/llm_structurer.py (legacy)
-
- Setup Type Checking Infrastructure
-
- Configure mypy with strict settings and create foundational type definitions
-
- - Create mypy.ini configuration file with strict settings
- - Add mypy to requirements.txt or dev dependencies
- - Create utils/types.py module for common TypedDict definitions
- - Define core types: OCRResponse, Metadata, TOCEntry, ChunkData, PipelineResult
- - Add NewType definitions for semantic types: DocumentName, ChunkId, SectionPath
- - Create Protocol types for callbacks (ProgressCallback, etc.)
- - Document type definitions in utils/types.py module docstring
- - Test mypy configuration on a single module to verify settings
-
- - mypy.ini exists with strict configuration
- - utils/types.py contains all foundational types with docstrings
- - mypy runs without errors on utils/types.py
- - Type definitions are comprehensive and reusable
-
- Add Types to PDF Pipeline Orchestration
-
- Add complete type annotations to pdf_pipeline.py (879 lines, most complex module)
-
- - Add type annotations to all function signatures in pdf_pipeline.py
- - Type the 10-step pipeline: OCR, Markdown, Metadata, TOC, Classify, Chunk, Clean, Enrich, Validate, Weaviate
- - Type progress_callback parameter with Protocol or Callable
- - Add TypedDict for pipeline options dictionary
- - Add TypedDict for pipeline result dictionary structure
- - Type all helper functions (extract_document_metadata_legacy, etc.)
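As a starting point, the foundational types named above might be sketched like this. The names ProgressCallback, PipelineResult, DocumentName, ChunkId, and SectionPath come from the spec itself; the exact field list of PipelineResult is illustrative, not the project's actual definition:

```python
from typing import NewType, Protocol, TypedDict

# Semantic aliases: a plain str cannot silently stand in for a DocumentName
DocumentName = NewType("DocumentName", str)
ChunkId = NewType("ChunkId", str)
SectionPath = NewType("SectionPath", str)


class ProgressCallback(Protocol):
    """Callback invoked as (step_id, status, detail) for each pipeline step."""

    def __call__(self, step_id: str, status: str, detail: str) -> None: ...


class PipelineResult(TypedDict):
    """Result dictionary returned by process_pdf_v2 (illustrative subset of fields)."""

    success: bool
    document_name: str
    pages: int
    chunks_count: int
    cost_ocr: float
    cost_llm: float
    cost_total: float
```

With these in utils/types.py, mypy --strict can reject a callback with the wrong arity and flag any `Dict[str, Any]` result that should have been a PipelineResult.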
- - Add proper return types for process_pdf_v2, process_pdf, process_pdf_bytes
- - Fix any mypy errors that arise
- - Verify mypy --strict passes on pdf_pipeline.py
-
- - All functions in pdf_pipeline.py have complete type annotations
- - progress_callback is properly typed with Protocol
- - All Dict[str, Any] replaced with TypedDict where appropriate
- - mypy --strict pdf_pipeline.py passes with zero errors
- - No # type: ignore comments (or justified if absolutely necessary)
-
- Add Types to Flask Application
-
- Add complete type annotations to flask_app.py and type all routes
-
- - Add type annotations to all Flask route handlers
- - Type request.args, request.form, request.files usage
- - Type jsonify() return values
- - Type get_weaviate_client context manager
- - Type get_collection_stats, get_all_chunks, search_chunks functions
- - Add TypedDict for Weaviate query results
- - Type background job processing functions (run_processing_job)
- - Type SSE generator function (upload_progress)
- - Add type hints for template rendering
- - Verify mypy --strict passes on flask_app.py
-
- - All Flask routes have complete type annotations
- - Request/response types are clear and documented
- - Weaviate query functions are properly typed
- - SSE generator is correctly typed
- - mypy --strict flask_app.py passes with zero errors
-
- Add Types to Core LLM Modules
-
- Add complete type annotations to all LLM processing modules (metadata, TOC, classifier, chunker, cleaner, validator)
-
- - llm_metadata.py: Type extract_metadata function, return structure
- - llm_toc.py: Type extract_toc function, TOC hierarchy structure
- - llm_classifier.py: Type classify_sections, section types (Literal), validation functions
- - llm_chunker.py: Type chunk_section_with_llm, chunk objects
- - llm_cleaner.py: Type clean_chunk, is_chunk_valid functions
- - llm_validator.py: Type validate_document, validation result structure
- - Add TypedDict for LLM request/response structures
- - Type provider selection ("ollama" | "mistral" as Literal)
- - Type model names with Literal or constants
- - Verify mypy --strict passes on all llm_*.py modules
-
- - All LLM modules have complete type annotations
- - Section types use Literal for type safety
- - Provider and model parameters are strongly typed
- - LLM request/response structures use TypedDict
- - mypy --strict passes on all llm_*.py modules with zero errors
-
- Add Types to Weaviate and Database Modules
-
- Add complete type annotations to schema.py and weaviate_ingest.py
-
- - schema.py: Type Weaviate configuration objects
- - schema.py: Type collection property definitions
- - weaviate_ingest.py: Type ingest_document function signature
- - weaviate_ingest.py: Type delete_document_chunks function
- - weaviate_ingest.py: Add TypedDict for Weaviate object structure
- - Type batch insertion operations
- - Type nested object references (work, document)
- - Add proper error types for Weaviate exceptions
- - Verify mypy --strict passes on both modules
-
- - schema.py has complete type annotations for Weaviate config
- - weaviate_ingest.py functions are fully typed
- - Nested object structures use TypedDict
- - Weaviate client operations are properly typed
- - mypy --strict passes on both modules with zero errors
-
- Add Types to OCR and Parsing Modules
-
- Add complete type annotations to mistral_client.py, ocr_processor.py, markdown_builder.py, hierarchy_parser.py
-
- - mistral_client.py: Type create_client, run_ocr, estimate_ocr_cost
- - mistral_client.py: Add TypedDict for Mistral API response structures
- - ocr_processor.py: Type serialize_ocr_response, OCR object structures
- - markdown_builder.py: Type build_markdown, image_writer parameter
- - hierarchy_parser.py: Type build_hierarchy, flatten_hierarchy functions
- - hierarchy_parser.py: Add TypedDict for hierarchy node structure
- - image_extractor.py: Type create_image_writer, image handling
- - Verify mypy --strict passes on all modules
-
- - All OCR/parsing modules have complete type annotations
- - Mistral API structures use TypedDict
- - Hierarchy nodes are properly typed
- - Image handling functions are typed
- - mypy --strict passes on all modules with zero errors
-
- Add Google-Style Docstrings to Core Modules
-
- Add comprehensive Google-style docstrings to pdf_pipeline.py, flask_app.py, and weaviate modules
-
- - pdf_pipeline.py: Add module docstring explaining the V2 pipeline
- - pdf_pipeline.py: Add docstrings to process_pdf_v2 with Args, Returns, Raises sections
- - pdf_pipeline.py: Document each of the 10 pipeline steps in comments
- - pdf_pipeline.py: Add Examples section showing typical usage
- - flask_app.py: Add module docstring explaining Flask application
- - flask_app.py: Document all routes with request/response examples
- - flask_app.py: Document Weaviate connection management
- - schema.py: Add module docstring explaining schema design
- - schema.py: Document each collection's purpose and relationships
- - weaviate_ingest.py: Document ingestion process with examples
- - All docstrings must follow Google style format exactly
-
- - All core modules have comprehensive module-level docstrings
- - All public functions have Google-style docstrings
- - Args, Returns, Raises sections are complete and accurate
- - Examples are provided for complex functions
- - Docstrings explain WHY, not just WHAT
-
- Add Google-Style Docstrings to LLM Modules
-
- Add comprehensive Google-style docstrings to all LLM processing modules
-
- - llm_metadata.py: Document metadata extraction logic with examples
- - llm_toc.py: Document TOC extraction strategies and fallbacks
- - llm_classifier.py: Document section types and classification criteria
- - llm_chunker.py: Document semantic vs basic chunking approaches
- - llm_cleaner.py: Document cleaning rules and validation logic
- - llm_validator.py: Document validation criteria and corrections
- - Add Examples sections showing input/output for each function
- - Document LLM provider differences (Ollama vs Mistral)
- - Document cost implications in Notes sections
- - All docstrings must follow Google style format exactly
-
- - All LLM modules have comprehensive docstrings
- - Each function has Args, Returns, Raises sections
- - Examples show realistic input/output
- - Provider differences are documented
- - Cost implications are noted where relevant
-
- Add Google-Style Docstrings to OCR and Parsing Modules
-
- Add comprehensive Google-style docstrings to OCR, markdown, hierarchy, and extraction modules
-
- - mistral_client.py: Document OCR API usage, cost calculation
- - ocr_processor.py: Document OCR response processing
- - markdown_builder.py: Document markdown generation strategy
- - hierarchy_parser.py: Document hierarchy building algorithm
- - image_extractor.py: Document image extraction process
- - toc_extractor*.py: Document various TOC extraction methods
- - Add Examples sections for complex algorithms
- - Document edge cases and error handling
- - All docstrings must follow Google style format exactly
-
- - All OCR/parsing modules have comprehensive docstrings
- - Complex algorithms are well explained
- - Edge cases are documented
- - Error handling is documented
- - Examples demonstrate typical usage
-
- Final Validation and CI Integration
-
- Verify all type annotations and docstrings, integrate mypy into CI/CD
-
- - Run mypy --strict on entire codebase, verify 100% pass rate
- - Verify all public functions have docstrings
- - Check docstring formatting with pydocstyle or similar tool
- - Create GitHub Actions workflow to run mypy on every commit
- - Update README.md with type checking instructions
- - Update CLAUDE.md with documentation standards
- - Create CONTRIBUTING.md with type annotation and docstring guidelines
- - Generate API documentation with Sphinx or pdoc
- - Fix any remaining mypy errors or missing docstrings
-
- - mypy --strict passes on entire codebase with zero errors
- - All public functions have Google-style docstrings
- - CI/CD runs mypy checks automatically
- - Documentation is generated and accessible
- - Contributing guidelines document type/docstring requirements
-
- - 100% type coverage across all modules
- - mypy --strict passes with zero errors
- - No # type: ignore comments without justification
- - All Dict[str, Any] replaced with TypedDict where appropriate
- - Proper use of generics, protocols, and type variables
- - NewType used for semantic type safety
-
- - All modules have comprehensive module-level docstrings
- - All public functions/classes have Google-style docstrings
- - All docstrings include Args, Returns, Raises sections
- - Complex functions include Examples sections
- - Cost implications documented in Notes sections
- - Error handling clearly documented
- - Provider differences (Ollama vs Mistral) documented
-
- - Code is self-documenting with clear variable names
- - Inline comments explain WHY, not WHAT
- - Complex algorithms are well explained
- - Performance considerations documented
- - Security considerations documented
-
- - IDE autocomplete works perfectly with type hints
- - Type errors caught at development time, not runtime
- - Documentation is easily accessible in IDE
- - API examples are executable and tested
- - Contributing guidelines are clear and comprehensive
-
- - Refactoring is safer with type checking
- - Function signatures are self-documenting
- - API contracts are explicit and enforced
- - Breaking changes are caught by type checker
- - New developers can understand code quickly
-
- - Must maintain backward compatibility with existing code
- - Cannot break existing Flask routes or API contracts
- - Weaviate schema must remain unchanged
- - Existing tests must continue to pass
-
- - Can use per-module mypy configuration for gradual migration
- - Can temporarily disable strict checks on legacy modules
- - Priority modules must be completed first
- - Low-priority modules can be deferred
-
- - All type annotations must use Python 3.10+ syntax
- - Docstrings must follow Google style exactly (not NumPy or reStructuredText)
- - Use typing module forms (List, Dict, Optional) only where Python 3.9 compatibility is still required; prefer the 3.10+ built-in generics elsewhere
- - Use from __future__ import annotations if needed for forward references
-
- - Run mypy --strict on each module after adding types
- - Use mypy daemon (dmypy) for faster incremental checking
- - Add mypy to pre-commit hooks
- - CI/CD must run mypy and fail on type errors
-
- - Use pydocstyle to validate Google-style format
- - Use sphinx-build to generate docs and catch errors
- - Manual review of docstring examples
- - Verify examples are executable and correct
-
- - Verify existing tests still pass after type additions
- - Add new tests for complex typed structures
- - Test mypy configuration on sample code
- - Verify IDE autocomplete works correctly
-
- ```python
- """
- PDF Pipeline V2 - Intelligent document processing with LLM enhancement.
-
- This module orchestrates a 10-step pipeline for processing PDF documents:
- 1. OCR via Mistral API
- 2. Markdown construction with images
- 3. Metadata extraction via LLM
- 4. Table of contents (TOC) extraction
- 5. Section classification
- 6. Semantic chunking
- 7. Chunk cleaning and validation
- 8. Enrichment with concepts
- 9. Validation and corrections
- 10. Ingestion into Weaviate vector database
-
- The pipeline supports multiple LLM providers (Ollama local, Mistral API) and
- various processing modes (skip OCR, semantic chunking, OCR annotations).
-
- Typical usage:
-     >>> from pathlib import Path
-     >>> from utils.pdf_pipeline import process_pdf
-     >>>
-     >>> result = process_pdf(
-     ...     Path("document.pdf"),
-     ...     use_llm=True,
-     ...     llm_provider="ollama",
-     ...     ingest_to_weaviate=True,
-     ... )
-     >>> print(f"Processed {result['pages']} pages, {result['chunks_count']} chunks")
-
- See Also:
-     mistral_client: OCR API client
-     llm_metadata: Metadata extraction
-     weaviate_ingest: Database ingestion
- """
- ```
-
- ```python
- def process_pdf_v2(
-     pdf_path: Path,
-     output_dir: Path = Path("output"),
-     *,
-     use_llm: bool = True,
-     llm_provider: Literal["ollama", "mistral"] = "ollama",
-     llm_model: Optional[str] = None,
-     skip_ocr: bool = False,
-     ingest_to_weaviate: bool = True,
-     progress_callback: Optional[ProgressCallback] = None,
- ) -> PipelineResult:
-     """
-     Process a PDF through the complete V2 pipeline with LLM enhancement.
-
-     This function orchestrates all 10 steps of the intelligent document processing
-     pipeline, from OCR to Weaviate ingestion. It supports both local (Ollama) and
-     cloud (Mistral API) LLM providers, with optional caching via skip_ocr.
-
-     Args:
-         pdf_path: Absolute path to the PDF file to process.
-         output_dir: Base directory for output files. Defaults to "./output".
-         use_llm: Enable LLM-based processing (metadata, TOC, chunking).
-             If False, uses basic heuristic processing.
-         llm_provider: LLM provider to use. "ollama" for local (free but slow),
-             "mistral" for API (fast but paid).
-         llm_model: Specific model name. If None, auto-detects based on provider
-             (qwen2.5:7b for ollama, mistral-small-latest for mistral).
-         skip_ocr: If True, reuses existing markdown file to avoid OCR cost.
-             Requires output_dir//.md to exist.
-         ingest_to_weaviate: If True, ingests chunks into Weaviate after processing.
-         progress_callback: Optional callback for real-time progress updates.
-             Called with (step_id, status, detail) for each pipeline step.
-
-     Returns:
-         Dictionary containing processing results with the following keys:
-         - success (bool): True if processing completed without errors
-         - document_name (str): Name of the processed document
-         - pages (int): Number of pages in the PDF
-         - chunks_count (int): Number of chunks generated
-         - cost_ocr (float): OCR cost in euros (0 if skip_ocr=True)
-         - cost_llm (float): LLM API cost in euros (0 if provider=ollama)
-         - cost_total (float): Total cost (ocr + llm)
-         - metadata (dict): Extracted metadata (title, author, etc.)
-         - toc (list): Hierarchical table of contents
-         - files (dict): Paths to generated files (markdown, chunks, etc.)
-
-     Raises:
-         FileNotFoundError: If pdf_path does not exist.
-         ValueError: If skip_ocr=True but markdown file not found.
-         RuntimeError: If Weaviate connection fails during ingestion.
-
-     Examples:
-         Basic usage with Ollama (free):
-         >>> result = process_pdf_v2(
-         ...     Path("platon_menon.pdf"),
-         ...     llm_provider="ollama"
-         ... )
-         >>> print(f"Cost: {result['cost_total']:.4f}€")
-         Cost: 0.0270€ # OCR only
-
-         With Mistral API (faster):
-         >>> result = process_pdf_v2(
-         ...     Path("platon_menon.pdf"),
-         ...     llm_provider="mistral",
-         ...     llm_model="mistral-small-latest"
-         ... )
-
-         Skip OCR to avoid cost:
-         >>> result = process_pdf_v2(
-         ...     Path("platon_menon.pdf"),
-         ...     skip_ocr=True, # Reuses existing markdown
-         ...     ingest_to_weaviate=False
-         ... )
-
-     Notes:
-         - OCR cost: ~0.003€/page (standard), ~0.009€/page (with annotations)
-         - LLM cost: Free with Ollama, variable with Mistral API
-         - Processing time: ~30s/page with Ollama, ~5s/page with Mistral
-         - Weaviate must be running (docker-compose up -d) before ingestion
-     """
- ```
-
diff --git a/prompts/app_spec_markdown_support.txt b/prompts/app_spec_markdown_support.txt
deleted file mode 100644
index 5cae3aa..0000000
--- a/prompts/app_spec_markdown_support.txt
+++ /dev/null
@@ -1,490 +0,0 @@
-
- Library RAG - Native Markdown Support
-
- Add native support for Markdown (.md) files to the Library RAG application. Currently, the system only accepts PDF files
- and uses Mistral OCR for text extraction. This feature will allow users to upload pre-existing Markdown files directly,
- skipping the expensive OCR step while still benefiting from LLM-based metadata extraction, TOC generation, semantic
- chunking, and Weaviate vectorization.
-
- This enhancement reduces costs, improves processing speed for already-digitized texts, and makes the system more flexible
- for users who have philosophical texts in Markdown format.
-
- Flask 3.0
- utils/pdf_pipeline.py (to be extended)
- Werkzeug secure_filename
- Ollama (local) or Mistral API
- Weaviate with BAAI/bge-m3
-
- mypy strict mode
- Google-style docstrings required
-
- Update Flask File Validation
-
- Modify the Flask application to accept both PDF and Markdown files. Update the ALLOWED_EXTENSIONS
- configuration and file validation logic to support .md files while maintaining backward compatibility
- with existing PDF workflows.
-
- 1
- backend
-
- flask_app.py (line 99: ALLOWED_EXTENSIONS, line 427: allowed_file function)
-
- - Change ALLOWED_EXTENSIONS from {"pdf"} to {"pdf", "md"}
- - Update allowed_file() function to accept both extensions
- - Update upload.html template to accept .md files in file input
- - Update error messages to reflect both formats
-
- 1. Start Flask app
- 2. Navigate to /upload
- 3. Attempt to upload a .md file
- 4. Verify file is accepted (no "Format non supporté" error)
- 5. Verify PDF upload still works
-
- Add Markdown Detection in Pipeline
-
- Enhance pdf_pipeline.py to detect when a Markdown file is being processed instead of a PDF.
- Add logic to automatically skip OCR processing for .md files and copy the Markdown content
- directly to the output directory.
-
- 1
- backend
-
- utils/pdf_pipeline.py (process_pdf_v2 function, around line 250-450)
-
- - Add file extension detection: `file_ext = pdf_path.suffix.lower()`
- - If file_ext == ".md":
-   - Skip OCR step entirely (no Mistral API call)
-   - Read Markdown content directly: `md_content = pdf_path.read_text(encoding='utf-8')`
-   - Copy to output: `md_path.write_text(md_content, encoding='utf-8')`
-   - Set nb_pages = md_content.count('\n# ') or 1 (estimate from H1 headers)
-   - Set cost_ocr = 0.0
-   - Emit progress: "markdown_load" instead of "ocr"
- - If file_ext == ".pdf":
-   - Continue with existing OCR workflow
- - Both paths converge at LLM processing (metadata, TOC, chunking)
-
- 1. Create test Markdown file with philosophical content
- 2. Call process_pdf(Path("test.md"), use_llm=True)
- 3. Verify OCR is skipped (cost_ocr = 0.0)
- 4. Verify output/test/test.md is created
- 5. Verify no _ocr.json file is created
- 6. Verify LLM processing runs normally
-
- Markdown-Specific Progress Callback
-
- Update the progress callback system to emit appropriate events for Markdown file processing.
- Instead of "OCR Mistral en cours...", display "Chargement Markdown..." to provide accurate
- user feedback during Server-Sent Events streaming.
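The conditional OCR-skip described in the Markdown-detection task could be sketched as below. `load_source` is a hypothetical helper name, not the project's code (the real logic lives inside process_pdf_v2), and the page estimate also counts a header at the very start of the file, which a bare '\n# ' count would miss:

```python
from pathlib import Path


def load_source(src_path: Path, out_dir: Path) -> tuple[str, int, float]:
    """Return (markdown_text, estimated_pages, ocr_cost) for a .md or .pdf input."""
    file_ext = src_path.suffix.lower()
    if file_ext == ".md":
        # Markdown path: no Mistral OCR call, so OCR cost is zero
        md_content = src_path.read_text(encoding="utf-8")
        (out_dir / src_path.name).write_text(md_content, encoding="utf-8")
        # Estimate "pages" from H1 headers, including one at the very start
        nb_pages = md_content.count("\n# ") + (1 if md_content.startswith("# ") else 0)
        return md_content, nb_pages or 1, 0.0
    if file_ext == ".pdf":
        # Existing OCR workflow would run here
        raise NotImplementedError("PDF branch: Mistral OCR workflow")
    raise ValueError(f"Unsupported format: {file_ext}")
```

Both branches return the same shape, so the LLM steps (metadata, TOC, chunking) can stay format-agnostic.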
-
- 2
- backend
-
- utils/pdf_pipeline.py (emit_progress calls)
- flask_app.py (process_file_background function)
-
- - Add conditional progress messages based on file type
- - For .md files: emit_progress("markdown_load", "running", "Chargement du fichier Markdown...")
- - For .pdf files: emit_progress("ocr", "running", "OCR Mistral en cours...")
- - Update frontend to handle "markdown_load" event type
- - Ensure step numbering adjusts (9 steps for MD vs 10 for PDF)
-
- 1. Upload Markdown file via Flask interface
- 2. Monitor SSE progress stream at /upload/progress/<job_id>
- 3. Verify first step shows "Chargement du fichier Markdown..."
- 4. Verify no OCR-related messages appear
- 5. Verify subsequent steps (metadata, TOC, etc.) work normally
-
- Update process_pdf_bytes for Markdown
-
- Extend process_pdf_bytes() function to handle Markdown content uploaded via Flask.
- This function currently creates a temporary PDF file, but for Markdown uploads,
- it should create a temporary .md file instead.
-
- 1
- backend
-
- utils/pdf_pipeline.py (process_pdf_bytes function, line 1255)
-
- - Detect file type from filename parameter
- - If filename ends with .md:
-   - Create temp file with suffix=".md"
-   - Write file_bytes as UTF-8 text
- - If filename ends with .pdf:
-   - Existing behavior (suffix=".pdf", binary write)
- - Pass temp file path to process_pdf() which now handles both types
-
- 1. Create Flask test client
- 2. POST multipart form with .md file to /upload
- 3. Verify process_pdf_bytes creates .md temp file
- 4. Verify temp file contains correct Markdown content
- 5. Verify cleanup deletes temp file after processing
-
- Add Markdown File Validation
-
- Implement validation for uploaded Markdown files to ensure they contain valid UTF-8 text
- and basic Markdown structure. Reject files that are too large, contain binary data,
- or have no meaningful content.
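A minimal sketch of the proposed validate_markdown_file is below. The thresholds and the success/error/warnings result shape follow the spec; the real module may order or word the checks differently:

```python
from pathlib import Path
from typing import Any

MAX_SIZE = 10 * 1024 * 1024  # 10 MB upper bound from the spec
MIN_CHARS = 100  # minimum meaningful content


def validate_markdown_file(file_path: Path) -> dict[str, Any]:
    """Validate an uploaded .md file; returns success/error/warnings keys."""
    warnings: list[str] = []
    if file_path.stat().st_size > MAX_SIZE:
        return {"success": False, "error": "File too large", "warnings": warnings}
    raw = file_path.read_bytes()
    if b"\x00" in raw:
        return {"success": False, "error": "Binary content detected", "warnings": warnings}
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return {"success": False, "error": "Invalid UTF-8", "warnings": warnings}
    if len(text.strip()) < MIN_CHARS:
        return {"success": False, "error": "File too short", "warnings": warnings}
    if not any(line.lstrip().startswith("#") for line in text.splitlines()):
        # Plain text with no headers is allowed, but flagged
        warnings.append("No Markdown headers found; treating as plain text")
    return {"success": True, "error": None, "warnings": warnings}
```

Cheap checks (size, null bytes) run before the UTF-8 decode, which keeps validation fast even for rejected files.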
-
- 2
- backend
-
- utils/markdown_validator.py
-
- - Create validate_markdown_file(file_path: Path) -> dict[str, Any] function
- - Checks:
-   - File size < 10 MB
-   - Valid UTF-8 encoding
-   - Contains at least one header (#, ##, etc.)
-   - Not empty (at least 100 characters)
-   - No null bytes or excessive binary content
- - Return dict with success, error, and warnings keys
- - Call from process_pdf_v2 before processing
- - Type annotations and Google-style docstrings required
-
- 1. Test with valid Markdown file → passes validation
- 2. Test with empty file → fails with "File too short"
- 3. Test with binary file (.exe renamed to .md) → fails with "Invalid UTF-8"
- 4. Test with very large file (>10MB) → fails with "File too large"
- 5. Test with plain text (no headers) → warning, but processing continues
-
- Update Documentation
-
- Update README.md and .claude/CLAUDE.md to document the new Markdown support feature.
- Include usage examples, cost comparison (PDF vs MD), and troubleshooting tips.
-
- 3
- documentation
-
- README.md (add section under "Pipeline de Traitement")
- .claude/CLAUDE.md (update development guidelines)
- templates/upload.html (add help text)
-
- README.md:
- - Add "Support Markdown Natif" section
- - Document accepted formats: PDF, MD
- - Show cost comparison table (PDF: ~0.003€/page, MD: 0€)
- - Add example: process_pdf(Path("document.md"))
- CLAUDE.md:
- - Update "Pipeline de Traitement" section
- - Note conditional OCR step
- - Document markdown_validator.py module
- upload.html:
- - Update file input accept attribute: accept=".pdf,.md"
- - Add help text: "Formats acceptés : PDF, Markdown (.md)"
-
- 1. Read README.md markdown support section
- 2. Verify examples are clear and accurate
- 3. Check CLAUDE.md developer notes
- 4. Open /upload in browser
- 5. Verify help text displays correctly
-
- Add Unit Tests for Markdown Processing
-
- Create comprehensive unit tests for Markdown file handling to ensure reliability
- and prevent regressions. Cover file validation, pipeline processing, and edge cases.
-
- 2
- testing
-
- tests/utils/test_markdown_validator.py
- tests/utils/test_pdf_pipeline_markdown.py
- tests/fixtures/sample.md
-
- test_markdown_validator.py:
- - Test valid Markdown acceptance
- - Test invalid encoding rejection
- - Test file size limits
- - Test empty file rejection
- - Test binary data detection
- test_pdf_pipeline_markdown.py:
- - Test Markdown file processing end-to-end
- - Test OCR skip for .md files
- - Test cost_ocr = 0.0
- - Test LLM processing (metadata, TOC, chunking)
- - Mock Weaviate ingestion
- - Verify output files created correctly
- fixtures/sample.md:
- - Create realistic philosophical text in Markdown
- - Include headers, paragraphs, formatting
- - ~1000 words for realistic testing
-
- 1. Run: pytest tests/utils/test_markdown_validator.py -v
- 2. Verify all validation tests pass
- 3. Run: pytest tests/utils/test_pdf_pipeline_markdown.py -v
- 4. Verify end-to-end Markdown processing works
- 5. Check test coverage: pytest --cov=utils --cov-report=html
-
- Type Safety and Documentation
-
- Ensure all new code follows strict type safety requirements and includes comprehensive
- Google-style docstrings. Run mypy checks and update type definitions as needed.
-
- 2
- type_safety
-
- utils/types.py (add Markdown-specific types if needed)
- All modified modules (type annotations)
-
- - Add type annotations to all new functions
- - Update existing functions that handle both PDF and MD
- - Consider adding:
-   - FileFormat = Literal["pdf", "md"]
-   - MarkdownValidationResult = TypedDict(...)
- - Run mypy --strict on all modified files
- - Add Google-style docstrings with:
-   - Args section documenting all parameters
-   - Returns section with structure details
-   - Raises section for exceptions
-   - Examples section for complex functions
-
- 1. Run: mypy utils/pdf_pipeline.py --strict
- 2. Run: mypy utils/markdown_validator.py --strict
- 3. Verify no type errors
- 4. Run: pydocstyle utils/markdown_validator.py --convention=google
- 5. Verify all docstrings follow Google style
-
- Handle Markdown-Specific Edge Cases
-
- Address edge cases specific to Markdown processing: front matter (YAML/TOML),
- embedded code blocks, special characters, and non-standard Markdown extensions.
-
- 3
- backend
-
- utils/markdown_validator.py
- utils/llm_metadata.py (handle front matter)
-
- Front matter handling:
- - Detect YAML/TOML front matter (--- or +++)
- - Extract metadata if present (title, author, date)
- - Pass to LLM or use directly if valid
- - Strip front matter before content processing
- Code block handling:
- - Don't treat code blocks as actual content
- - Preserve them for chunking but don't analyze
- Special characters:
- - Handle Unicode properly (Greek, Latin, French accents)
- - Preserve LaTeX equations in $ or $$
- GitHub Flavored Markdown:
- - Support tables, task lists, strikethrough
- - Convert to standard format if needed
-
- 1. Upload Markdown with YAML front matter
- 2. Verify metadata extracted correctly
- 3. Upload Markdown with code blocks
- 4. Verify code not treated as philosophical content
- 5. Upload Markdown with Greek/Latin text
- 6. Verify Unicode handled correctly
-
- Update UI/UX for Markdown Upload
-
- Enhance the upload interface to clearly communicate Markdown support and provide
- visual feedback about the file type being processed. Show format-specific information
- (e.g., "No OCR cost for Markdown files").
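The YAML front-matter handling from the edge-cases feature above could be sketched with the standard library alone. `split_front_matter` is a hypothetical name with naive key: value parsing; a real implementation would likely use PyYAML or python-frontmatter for proper parsing:

```python
def split_front_matter(md: str) -> tuple[dict[str, str], str]:
    """Split leading --- front matter into (metadata, body)."""
    meta: dict[str, str] = {}
    if not md.startswith("---\n"):
        return meta, md  # no front matter: pass content through unchanged
    end = md.find("\n---", 4)  # first closing delimiter after the opener
    if end == -1:
        return meta, md  # unterminated block: treat as plain content
    for line in md[4:end].splitlines():
        # Naive "key: value" parsing; nested YAML is out of scope here
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    body = md[end + 4:].lstrip("\n")  # content after the closing ---
    return meta, body
```

Stripping the front matter before LLM processing keeps metadata such as title and author out of the chunked text while still making it available to llm_metadata.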
- - 3 - frontend - - - templates/upload.html - - templates/upload_progress.html - - - - upload.html: - - Add file type indicator icon (📄 PDF vs 📝 MD) - - Show format-specific help text on hover - - Display estimated cost: "PDF: ~0.003€/page, Markdown: 0€" - - Add example Markdown file download link - - upload_progress.html: - - Show different icon for Markdown processing - - Adjust progress bar (9 steps vs 10 steps) - - Display "No OCR cost" badge for Markdown - - Update step descriptions based on file type - - - 1. Open /upload page - 2. Verify help text mentions both PDF and MD - 3. Select a .md file - 4. Verify file type indicator shows 📝 - 5. Submit upload - 6. Verify progress shows "Chargement Markdown..." - 7. Verify "No OCR cost" badge displays - - - - - - - Setup and Configuration - - - Update ALLOWED_EXTENSIONS in flask_app.py - - Modify allowed_file() validation function - - Update upload.html file input accept attribute - - Add Markdown MIME type handling - - - - - Core Pipeline Extension - - - Add file extension detection in process_pdf_v2() - - Implement Markdown file reading logic - - Skip OCR for .md files - - Add conditional progress callbacks - - Update process_pdf_bytes() for Markdown - - - - - Validation and Error Handling - - - Create markdown_validator.py module - - Implement UTF-8 encoding validation - - Add file size limits - - Handle front matter extraction - - Add comprehensive error messages - - - - - Testing Infrastructure - - - Create test fixtures (sample.md) - - Write validation tests - - Write pipeline integration tests - - Add edge case tests - - Verify mypy strict compliance - - - - - Documentation and Polish - - - Update README.md with Markdown support - - Update .claude/CLAUDE.md developer docs - - Add Google-style docstrings - - Update UI templates with new messaging - - Create usage examples - - - - - - - - Markdown files upload successfully via Flask - - OCR is skipped for .md files (cost_ocr = 0.0) - - LLM processing works 
identically for PDF and MD - - Chunks are created and vectorized correctly - - Both file types can be searched in Weaviate - - Existing PDF workflow remains unchanged - - - - - All code passes mypy --strict - - All functions have type annotations - - Google-style docstrings on all modules - - No Any types without justification - - TypedDict definitions for new data structures - - - - - Unit tests cover Markdown validation - - Integration tests verify end-to-end processing - - Edge cases handled (front matter, Unicode, large files) - - Test coverage >80% for new code - - All tests pass in CI/CD pipeline - - - - - Upload interface clearly shows both formats supported - - Progress feedback accurate for both PDF and MD - - Cost savings clearly communicated ("0€ for Markdown") - - Error messages helpful and specific - - Documentation clear with examples - - - - - Markdown processing faster than PDF (no OCR) - - No regression in PDF processing speed - - Memory usage reasonable for large MD files - - Validation completes in <100ms - - Overall pipeline <30s for typical Markdown document - - - - - - - PDF processing: OCR ~0.003€/page + LLM variable - - Markdown processing: 0€ OCR + LLM variable - - Estimated savings: 50-70% for documents with Markdown source - - - - - Maintains backward compatibility with existing PDFs - - No breaking changes to API or database schema - - Existing chunks and documents unaffected - - Can process both formats in same session - - - - - Support for .txt plain text files - - Support for .docx Word documents (via pandoc) - - Support for .epub ebooks - - Batch upload of multiple Markdown files - - Markdown to PDF export for archival - - - diff --git a/prompts/app_spec_tavily_mcp.txt b/prompts/app_spec_tavily_mcp.txt deleted file mode 100644 index 349f9a6..0000000 --- a/prompts/app_spec_tavily_mcp.txt +++ /dev/null @@ -1,498 +0,0 @@ - - ikario - Tavily MCP Integration for Internet Access - - - This specification adds Tavily search capabilities via 
MCP (Model Context Protocol) to give Ikario - internet access for real-time web searches. Tavily provides high-quality search results optimized - for AI agents, making it ideal for research, fact-checking, and accessing current information. - - This integration adds a new MCP server connection to the existing architecture (alongside the - ikario-memory MCP server) and exposes Tavily search tools to Ikario during conversations. - - All changes are additive and backward-compatible. Existing functionality remains unchanged. - - - - - Tavily MCP Server Connection: - - Uses @modelcontextprotocol/sdk Client to connect to Tavily MCP server - - Connection can be stdio-based (local MCP server) or HTTP-based (remote) - - Tavily MCP server provides search tools that are exposed to Claude via Tool Use API - - Backend routes handle tool execution and return results to Claude - - - - - Real-time internet access for Ikario - - High-quality search results optimized for LLMs - - Fact-checking and verification capabilities - - Access to current events and news - - Research assistance with cited sources - - Seamless integration with existing memory tools - - - - - - Tavily MCP Server - Model Context Protocol (MCP) - stdio or HTTP transport - @modelcontextprotocol/sdk - Tavily API key (from https://tavily.com) - - - Node.js with Express (existing) - MCP Client for Tavily server connection - Existing toolExecutor service extended with Tavily tools - - - GET/POST /api/tavily/* for Tavily-specific operations - Existing /api/claude/chat routes support Tavily tools automatically - - - - - - - Tavily API key obtained from https://tavily.com (free tier available) - - API key stored in environment variable TAVILY_API_KEY or configuration file - - MCP SDK already installed (@modelcontextprotocol/sdk exists for ikario-memory) - - Tavily MCP server installed (npm package or Python package) - - - - Add Tavily MCP server config to server/.claude_settings.json or similar - - Configure connection 
parameters (stdio vs HTTP) - - Set API key securely - - - - - - Tavily MCP Client Setup - - Create MCP client connection to Tavily search server. This is similar to the existing - ikario-memory MCP client but connects to Tavily instead. - - Implementation: - - Create server/services/tavilyMcpClient.js - - Initialize MCP client with Tavily server connection - - Handle connection lifecycle (connect, disconnect, reconnect) - - Implement health checks and connection status - - Export client instance and helper functions - - Configuration: - - Read Tavily API key from environment or config file - - Configure transport (stdio or HTTP) - - Set connection timeout and retry logic - - Log connection status for debugging - - Error Handling: - - Graceful degradation if Tavily is unavailable - - Connection retry with exponential backoff - - Clear error messages for configuration issues - - 1 - backend - - 1. Verify MCP client can connect to Tavily server on startup - 2. Test connection health check endpoint returns correct status - 3. Verify graceful handling when Tavily API key is missing - 4. Test reconnection logic when connection drops - 5. Verify connection status is logged correctly - 6. Test that server starts even if Tavily is unavailable - - - - - Tavily Tool Configuration - - Configure Tavily search tools to be available to Claude during conversations. - This integrates with the existing tool system (like memory tools). 
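To make the tool-schema format concrete, here is a sketch of the two Tavily tool definitions in Claude Tool Use format, as the `server/config/tavilyTools.js` module might export them. The descriptions and defaults are illustrative assumptions, not Tavily's official schema:

```javascript
// Hypothetical sketch of server/config/tavilyTools.js: the two Tavily tool
// schemas in Claude Tool Use format (name / description / input_schema).
const tavilyTools = [
  {
    name: "tavily_search",
    description:
      "General web search optimized for AI. Returns results with title, url, content and relevance score.",
    input_schema: {
      type: "object",
      properties: {
        query: { type: "string", description: "The search query" },
        max_results: { type: "number", description: "Max results (default 5)" },
        search_depth: { type: "string", enum: ["basic", "advanced"] },
      },
      required: ["query"],
    },
  },
  {
    name: "tavily_search_news",
    description: "News-specific search for current events.",
    input_schema: {
      type: "object",
      properties: {
        query: { type: "string", description: "The search query" },
        max_results: { type: "number", description: "Max results (default 5)" },
        days: { type: "number", description: "How many days back to look (default 7)" },
      },
      required: ["query"],
    },
  },
];

module.exports = { tavilyTools };
```

These objects can simply be concatenated with the existing memory tool definitions when building the `tools` array passed to the Claude API, which is what lets the integration stay additive.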
- - Implementation: - - Create server/config/tavilyTools.js - - Define tool schemas for Tavily search capabilities - - Integrate with existing toolExecutor service - - Add Tavily tools to system prompt alongside memory tools - - Tavily Tools to Expose: - - tavily_search: General web search with AI-optimized results - - Parameters: query (string), max_results (number), search_depth (basic/advanced) - - Returns: Array of search results with title, url, content, score - - - tavily_search_news: News-specific search for current events - - Parameters: query (string), max_results (number), days (number) - - Returns: Recent news articles with metadata - - Tool Schema: - - Follow Claude Tool Use API format - - Clear descriptions for each tool - - Well-defined input schemas with validation - - Proper error handling in tool execution - - 1 - backend - - 1. Verify Tavily tools are listed in available tools - 2. Test tool schema validation with valid inputs - 3. Test tool schema validation rejects invalid inputs - 4. Verify tools appear in Claude's system prompt - 5. Test that tool descriptions are clear and accurate - 6. Verify tools can be called without errors - - - - - Tavily Tool Executor Integration - - Integrate Tavily tools into the existing toolExecutor service so Claude can - use them during conversations. - - Implementation: - - Extend server/services/toolExecutor.js to handle Tavily tools - - Add tool detection for tavily_search and tavily_search_news - - Implement tool execution logic using Tavily MCP client - - Format Tavily results for Claude consumption - - Handle errors and timeouts gracefully - - Tool Execution Flow: - 1. Claude requests tool use (e.g., tavily_search) - 2. toolExecutor detects Tavily tool request - 3. Call Tavily MCP client with tool parameters - 4. Receive and format search results - 5. Return formatted results to Claude - 6. 
Claude incorporates results into response - - Result Formatting: - - Convert Tavily results to Claude-friendly format - - Include source URLs for citation - - Add relevance scores - - Truncate content if too long - - Handle empty results gracefully - - 1 - backend - - 1. Test tavily_search tool execution with valid query - 2. Verify results are properly formatted - 3. Test tavily_search_news tool execution - 4. Verify error handling when Tavily API fails - 5. Test timeout handling for slow searches - 6. Verify results include proper citations and URLs - 7. Test with empty search results - 8. Test with very long search queries - - - - - System Prompt Enhancement for Internet Access - - Update the system prompt to inform Ikario about internet access capabilities. - This should be added alongside existing memory tools instructions. - - Implementation: - - Update MEMORY_SYSTEM_PROMPT in server/routes/messages.js and claude.js - - Add Tavily tools documentation - - Provide usage guidelines for when to search the internet - - Include examples of good search queries - - Prompt Addition: - "## Internet Access via Tavily - - Tu as accès à internet en temps réel via deux outils de recherche : - - 1. tavily_search : Recherche web générale optimisée pour l'IA - - Utilise pour : rechercher des informations actuelles, vérifier des faits, - trouver des sources fiables - - Paramètres : query (ta question), max_results (nombre de résultats, défaut: 5), - search_depth ('basic' ou 'advanced') - - Retourne : Résultats avec titre, URL, contenu et score de pertinence - - 2. 
tavily_search_news : Recherche d'actualités récentes - - Utilise pour : événements actuels, nouvelles, actualités - - Paramètres : query, max_results, days (nombre de jours en arrière, défaut: 7) - - Quand utiliser la recherche internet : - - Quand l'utilisateur demande des informations récentes ou actuelles - - Pour vérifier des faits ou données que tu n'es pas sûr de connaître - - Quand ta base de connaissances est trop ancienne (après janvier 2025) - - Pour trouver des sources et citations spécifiques - - Pour des requêtes nécessitant des données en temps réel - - N'utilise PAS la recherche pour : - - Des questions sur ta propre identité ou capacités - - Des concepts généraux que tu connais déjà bien - - Des questions purement créatives ou d'opinion - - Utilise ces outils de façon autonome selon les besoins de la conversation. - Cite toujours tes sources quand tu utilises des informations de Tavily." - - 2 - backend - - 1. Verify system prompt includes Tavily instructions - 2. Test that Claude understands when to use Tavily search - 3. Verify Claude cites sources from Tavily results - 4. Test that Claude uses appropriate search queries - 5. Verify Claude chooses between tavily_search and tavily_search_news correctly - 6. Test that Claude doesn't over-use search for simple questions - - - - - Tavily Status API Endpoint - - Create API endpoint to check Tavily MCP connection status and search capabilities. - Similar to /api/memory/status endpoint. - - Implementation: - - Create GET /api/tavily/status endpoint - - Return connection status, available tools, and configuration - - Create GET /api/tavily/health endpoint for health checks - - Add Tavily status to existing /api/memory/stats (rename to /api/tools/stats) - - Response Format: - { - "success": true, - "data": { - "connected": true, - "message": "Tavily MCP server is connected", - "tools": ["tavily_search", "tavily_search_news"], - "apiKeyConfigured": true, - "transport": "stdio" - } - } - - 2 - backend - - 1. 
Test GET /api/tavily/status returns correct status - 2. Verify status shows "connected" when Tavily is available - 3. Verify status shows "disconnected" when Tavily is unavailable - 4. Test health endpoint returns proper status code - 5. Verify tools list is accurate - 6. Test with missing API key shows proper error - - - - - Frontend UI Indicator for Internet Access - - Add visual indicator in the UI to show when Ikario has internet access via Tavily. - This can be displayed alongside the existing memory status indicator. - - Implementation: - - Add Tavily status indicator in header or sidebar - - Show online/offline status for Tavily connection - - Optional: Show when Tavily is being used during a conversation - - Optional: Add tooltip explaining internet access capabilities - - Visual Design: - - Globe or wifi icon to represent internet access - - Green when connected, gray when disconnected - - Subtle animation when search is in progress - - Tooltip: "Internet access via Tavily" or similar - - Integration: - - Use existing useMemory hook pattern or create useTavily hook - - Poll /api/tavily/status periodically (every 60s) - - Update status in real-time during searches - - 3 - frontend - - 1. Verify internet access indicator appears in UI - 2. Test status updates when Tavily connects/disconnects - 3. Verify tooltip shows correct information - 4. Test that indicator shows activity during searches - 5. Verify status polling doesn't impact performance - 6. Test with Tavily disabled shows offline status - - - - - Manual Search UI (Optional Enhancement) - - Optional: Add manual search interface to allow users to trigger Tavily searches directly, - similar to the memory search panel. 
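For the status endpoint and UI indicator described above, the payload assembly can be isolated in a small pure helper; this is a sketch (the function name and message wording are assumptions), with the Express handler and a frontend `useTavily`-style poller both consuming the same shape:

```javascript
// Hypothetical helper producing the /api/tavily/status payload shown above.
// Keeping it pure means no live Tavily connection is needed to test it.
function buildTavilyStatus({ connected, apiKeyConfigured, transport }) {
  return {
    success: true,
    data: {
      connected,
      message: connected
        ? "Tavily MCP server is connected"
        : "Tavily MCP server is disconnected",
      tools: connected ? ["tavily_search", "tavily_search_news"] : [],
      apiKeyConfigured,
      transport,
    },
  };
}

// Example: the connected case mirrors the documented response format.
const statusPayload = buildTavilyStatus({
  connected: true,
  apiKeyConfigured: true,
  transport: "stdio",
});
```

Isolating this logic makes the connected/disconnected status test scenarios trivial to unit-test, since both cases are exercised by calling the helper with different inputs rather than toggling a real MCP connection.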
- - Implementation: - - Add "Internet Search" panel in sidebar (alongside Memory panel) - - Search input for manual Tavily queries - - Display search results with title, snippet, URL - - Click to insert results into conversation - - Filter by search type (general vs news) - - This is OPTIONAL and lower priority. The primary use case is autonomous search by Claude. - - 4 - frontend - - 1. Verify search panel appears in sidebar - 2. Test manual search returns results - 3. Verify results display properly with links - 4. Test inserting results into conversation - 5. Test news search filter works correctly - 6. Verify search history is saved (optional) - - - - - Configuration and Settings - - Add Tavily configuration options to settings and environment. - - Implementation: - - Add TAVILY_API_KEY to environment variables - - Add Tavily settings to .claude_settings.json or similar config file - - Create server/config/tavilyConfig.js for configuration management - - Document configuration options in README - - Configuration Options: - - API key - - Max results per search (default: 5) - - Search depth (basic/advanced) - - Timeout duration - - Enable/disable Tavily globally - - Rate limiting settings - - Security: - - API key should NOT be exposed to frontend - - Use environment variable or secure config file - - Validate API key on startup - - Log warnings if API key is missing - - 2 - backend - - 1. Verify API key is read from environment variable - 2. Test fallback to config file if env var not set - 3. Verify API key validation on startup - 4. Test configuration options are applied correctly - 5. Verify API key is never exposed in API responses - 6. Test enabling/disabling Tavily via config - - - - - Error Handling and Rate Limiting - - Implement robust error handling and rate limiting for Tavily API calls. - - Implementation: - - Detect and handle Tavily API errors (rate limits, invalid API key, etc.) 
- - Implement client-side rate limiting to avoid hitting Tavily limits - - Cache search results for duplicate queries (optional) - - Provide clear error messages to Claude when searches fail - - Error Types: - - 401: Invalid API key - - 429: Rate limit exceeded - - 500: Tavily server error - - Timeout: Search took too long - - Network: Connection failed - - Rate Limiting: - - Track searches per minute/hour - - Queue requests if limit reached - - Return cached results for duplicate queries within 5 minutes - - Log rate limit warnings - - 2 - backend - - 1. Test error handling for invalid API key - 2. Verify rate limit detection and handling - 3. Test timeout handling for slow searches - 4. Verify error messages are clear to Claude - 5. Test rate limiting prevents API abuse - 6. Verify caching works for duplicate queries - - - - - Documentation and README Updates - - Update project documentation to explain Tavily integration. - - Implementation: - - Update main README.md with Tavily setup instructions - - Add TAVILY_SETUP.md with detailed configuration guide - - Document API endpoints in README - - Add examples of using Tavily with Ikario - - Document troubleshooting steps - - Documentation Sections: - - Prerequisites (Tavily API key) - - Installation steps - - Configuration options - - Testing Tavily connection - - Example conversations using internet search - - Troubleshooting common issues - - API reference for Tavily endpoints - - 3 - documentation - - 1. Verify README has Tavily setup section - 2. Test that setup instructions are clear and complete - 3. Verify all configuration options are documented - 4. Test examples work as described - 5. Verify troubleshooting section covers common issues - - - - - - - Recommended implementation order: - 1. Feature 1 (MCP Client Setup) - Foundation - 2. Feature 2 (Tool Configuration) - Core functionality - 3. Feature 3 (Tool Executor Integration) - Core functionality - 4. Feature 8 (Configuration) - Required for testing - 5. 
Feature 4 (System Prompt) - Makes tools accessible to Claude - 6. Feature 9 (Error Handling) - Production readiness - 7. Feature 5 (Status API) - Monitoring - 8. Feature 10 (Documentation) - User onboarding - 9. Feature 6 (UI Indicator) - Nice to have - 10. Feature 7 (Manual Search UI) - Optional enhancement - - - - After implementing features 1-5, you should be able to: - - Ask Ikario: "Quelle est l'actualité aujourd'hui ?" - - Ask Ikario: "Recherche des informations sur [topic actuel]" - - Ask Ikario: "Vérifie cette information : [claim]" - - Ikario should autonomously use Tavily search and cite sources. - - - - - This specification is fully compatible with existing ikario-memory MCP integration - - Ikario will have both memory tools AND internet search tools - - Tools can be used together in the same conversation - - No conflicts expected between tool systems - - - - - - - DO NOT expose Tavily API key to frontend or in API responses - - DO NOT modify existing MCP memory integration - - DO NOT break existing conversation functionality - - Tavily should gracefully degrade if unavailable (don't crash the app) - - Implement proper rate limiting to avoid API abuse - - Validate all user inputs before passing to Tavily - - Sanitize search results before displaying (XSS prevention) - - Log all Tavily API calls for monitoring and debugging - - - - - - Ikario can successfully perform internet searches when asked - - Search results are relevant and well-formatted - - Sources are properly cited - - Tavily integration doesn't slow down conversations - - Error handling is robust and user-friendly - - Configuration is straightforward - - Documentation is clear and complete - - diff --git a/prompts/app_spec_types_docs.backup.txt b/prompts/app_spec_types_docs.backup.txt deleted file mode 100644 index 0fe4fa6..0000000 --- a/prompts/app_spec_types_docs.backup.txt +++ /dev/null @@ -1,679 +0,0 @@ - - Library RAG - Type Safety & Documentation Enhancement - - - Enhance the Library RAG 
application (philosophical texts indexing and semantic search) by adding - strict type annotations and comprehensive Google-style docstrings to all Python modules. This will - improve code maintainability, enable static type checking with mypy, and provide clear documentation - for all functions, classes, and modules. - - The application is a RAG pipeline that processes PDF documents through OCR, LLM-based extraction, - semantic chunking, and ingestion into Weaviate vector database. It includes a Flask web interface - for document upload, processing, and semantic search. - - - - - Python 3.10+ - Flask 3.0 - Weaviate 1.34.4 with text2vec-transformers - Mistral OCR API - Ollama (local) or Mistral API - mypy with strict configuration - - - Docker Compose (Weaviate + transformers) - weaviate-client, flask, mistralai, python-dotenv - - - - - - - flask_app.py: Main Flask application (640 lines) - - schema.py: Weaviate schema definition (383 lines) - - utils/: 16+ modules for PDF processing pipeline - - pdf_pipeline.py: Main orchestration (879 lines) - - mistral_client.py: OCR API client - - ocr_processor.py: OCR processing - - markdown_builder.py: Markdown generation - - llm_metadata.py: Metadata extraction via LLM - - llm_toc.py: Table of contents extraction - - llm_classifier.py: Section classification - - llm_chunker.py: Semantic chunking - - llm_cleaner.py: Chunk cleaning - - llm_validator.py: Document validation - - weaviate_ingest.py: Database ingestion - - hierarchy_parser.py: Document hierarchy parsing - - image_extractor.py: Image extraction from PDFs - - toc_extractor*.py: Various TOC extraction methods - - templates/: Jinja2 templates for Flask UI - - tests/utils2/: Minimal test coverage (3 test files) - - - - - Inconsistent type annotations across modules (some have partial types, many have none) - - Missing or incomplete docstrings (no Google-style format) - - No mypy configuration for strict type checking - - Type hints missing on function parameters and 
return values - - Dict[str, Any] used extensively without proper typing - - No type stubs for complex nested structures - - - - - - - - Add complete type annotations to ALL functions and methods - - Use proper generic types (List, Dict, Optional, Union) from typing module - - Add TypedDict for complex dictionary structures - - Add Protocol types for duck-typed interfaces - - Use Literal types for string constants - - Add ParamSpec and TypeVar where appropriate - - Type all class attributes and instance variables - - Add type annotations to lambda functions where possible - - - - - Create mypy.ini with strict configuration - - Enable: check_untyped_defs, disallow_untyped_defs, disallow_incomplete_defs - - Enable: disallow_untyped_calls, disallow_untyped_decorators - - Enable: warn_return_any, warn_redundant_casts - - Enable: strict_equality, strict_optional - - Set python_version to 3.10 - - Configure per-module overrides if needed for gradual migration - - - - - Create TypedDict definitions for common data structures: - - OCR response structures - - Metadata dictionaries - - TOC entries - - Chunk objects - - Weaviate objects - - Pipeline results - - Add NewType for semantic type safety (DocumentName, ChunkId, etc.) 
- - Create Protocol types for callback functions - - - - - pdf_pipeline.py: Type all 10 pipeline steps, callbacks, result dictionaries - - flask_app.py: Type all route handlers, request/response types - - schema.py: Type Weaviate configuration objects - - llm_*.py: Type LLM request/response structures - - mistral_client.py: Type API client methods and responses - - weaviate_ingest.py: Type ingestion functions and batch operations - - - - - - - Add comprehensive Google-style docstrings to ALL: - - Module-level docstrings explaining purpose and usage - - Class docstrings with Attributes section - - Function/method docstrings with Args, Returns, Raises sections - - Complex algorithm explanations with Examples section - - Include code examples for public APIs - - Document all exceptions that can be raised - - Add Notes section for important implementation details - - Add See Also section for related functions - - - - - - pdf_pipeline.py: Document the 10-step pipeline, each step's purpose - - mistral_client.py: Document OCR API usage, cost calculation - - llm_metadata.py: Document metadata extraction logic - - llm_toc.py: Document TOC extraction strategies - - llm_classifier.py: Document section classification types - - llm_chunker.py: Document semantic vs basic chunking - - llm_cleaner.py: Document cleaning rules and validation - - llm_validator.py: Document validation criteria - - weaviate_ingest.py: Document ingestion process, nested objects - - hierarchy_parser.py: Document hierarchy building algorithm - - - - - Document all routes with request/response examples - - Document SSE (Server-Sent Events) implementation - - Document Weaviate query patterns - - Document upload processing workflow - - Document background job management - - - - - Document Weaviate schema design decisions - - Document each collection's purpose and relationships - - Document nested object structure - - Document vectorization strategy - - - - - - Add inline comments for complex logic only 
(don't over-comment) - - Explain WHY not WHAT (code should be self-documenting) - - Document performance considerations - - Document cost implications (OCR, LLM API calls) - - Document error handling strategies - - - - - - - All modules must pass mypy --strict - - No # type: ignore comments without justification - - CI/CD should run mypy checks - - Type coverage should be 100% - - - - - All public functions must have docstrings - - All docstrings must follow Google style - - Examples should be executable and tested - - Documentation should be clear and concise - - - - - - - Priority 1 (Most used, most complex): - 1. utils/pdf_pipeline.py - Main orchestration - 2. flask_app.py - Web application entry point - 3. utils/weaviate_ingest.py - Database operations - 4. schema.py - Schema definition - - Priority 2 (Core LLM modules): - 5. utils/llm_metadata.py - 6. utils/llm_toc.py - 7. utils/llm_classifier.py - 8. utils/llm_chunker.py - 9. utils/llm_cleaner.py - 10. utils/llm_validator.py - - Priority 3 (OCR and parsing): - 11. utils/mistral_client.py - 12. utils/ocr_processor.py - 13. utils/markdown_builder.py - 14. utils/hierarchy_parser.py - 15. utils/image_extractor.py - - Priority 4 (Supporting modules): - 16. utils/toc_extractor.py - 17. utils/toc_extractor_markdown.py - 18. utils/toc_extractor_visual.py - 19. utils/llm_structurer.py (legacy) - - - - - - Setup Type Checking Infrastructure - - Configure mypy with strict settings and create foundational type definitions - - - - Create mypy.ini configuration file with strict settings - - Add mypy to requirements.txt or dev dependencies - - Create utils/types.py module for common TypedDict definitions - - Define core types: OCRResponse, Metadata, TOCEntry, ChunkData, PipelineResult - - Add NewType definitions for semantic types: DocumentName, ChunkId, SectionPath - - Create Protocol types for callbacks (ProgressCallback, etc.) 
- - Document type definitions in utils/types.py module docstring - - Test mypy configuration on a single module to verify settings - - - - mypy.ini exists with strict configuration - - utils/types.py contains all foundational types with docstrings - - mypy runs without errors on utils/types.py - - Type definitions are comprehensive and reusable - - - - - Add Types to PDF Pipeline Orchestration - - Add complete type annotations to pdf_pipeline.py (879 lines, most complex module) - - - - Add type annotations to all function signatures in pdf_pipeline.py - - Type the 10-step pipeline: OCR, Markdown, Metadata, TOC, Classify, Chunk, Clean, Validate, Weaviate - - Type progress_callback parameter with Protocol or Callable - - Add TypedDict for pipeline options dictionary - - Add TypedDict for pipeline result dictionary structure - - Type all helper functions (extract_document_metadata_legacy, etc.) - - Add proper return types for process_pdf_v2, process_pdf, process_pdf_bytes - - Fix any mypy errors that arise - - Verify mypy --strict passes on pdf_pipeline.py - - - - All functions in pdf_pipeline.py have complete type annotations - - progress_callback is properly typed with Protocol - - All Dict[str, Any] replaced with TypedDict where appropriate - - mypy --strict pdf_pipeline.py passes with zero errors - - No # type: ignore comments (or justified if absolutely necessary) - - - - - Add Types to Flask Application - - Add complete type annotations to flask_app.py and type all routes - - - - Add type annotations to all Flask route handlers - - Type request.args, request.form, request.files usage - - Type jsonify() return values - - Type get_weaviate_client context manager - - Type get_collection_stats, get_all_chunks, search_chunks functions - - Add TypedDict for Weaviate query results - - Type background job processing functions (run_processing_job) - - Type SSE generator function (upload_progress) - - Add type hints for template rendering - - Verify mypy --strict passes 
on flask_app.py - - - - All Flask routes have complete type annotations - - Request/response types are clear and documented - - Weaviate query functions are properly typed - - SSE generator is correctly typed - - mypy --strict flask_app.py passes with zero errors - - - - - Add Types to Core LLM Modules - - Add complete type annotations to all LLM processing modules (metadata, TOC, classifier, chunker, cleaner, validator) - - - - llm_metadata.py: Type extract_metadata function, return structure - - llm_toc.py: Type extract_toc function, TOC hierarchy structure - - llm_classifier.py: Type classify_sections, section types (Literal), validation functions - - llm_chunker.py: Type chunk_section_with_llm, chunk objects - - llm_cleaner.py: Type clean_chunk, is_chunk_valid functions - - llm_validator.py: Type validate_document, validation result structure - - Add TypedDict for LLM request/response structures - - Type provider selection ("ollama" | "mistral" as Literal) - - Type model names with Literal or constants - - Verify mypy --strict passes on all llm_*.py modules - - - - All LLM modules have complete type annotations - - Section types use Literal for type safety - - Provider and model parameters are strongly typed - - LLM request/response structures use TypedDict - - mypy --strict passes on all llm_*.py modules with zero errors - - - - - Add Types to Weaviate and Database Modules - - Add complete type annotations to schema.py and weaviate_ingest.py - - - - schema.py: Type Weaviate configuration objects - - schema.py: Type collection property definitions - - weaviate_ingest.py: Type ingest_document function signature - - weaviate_ingest.py: Type delete_document_chunks function - - weaviate_ingest.py: Add TypedDict for Weaviate object structure - - Type batch insertion operations - - Type nested object references (work, document) - - Add proper error types for Weaviate exceptions - - Verify mypy --strict passes on both modules - - - - schema.py has complete type 
annotations for Weaviate config - - weaviate_ingest.py functions are fully typed - - Nested object structures use TypedDict - - Weaviate client operations are properly typed - - mypy --strict passes on both modules with zero errors - - - - - Add Types to OCR and Parsing Modules - - Add complete type annotations to mistral_client.py, ocr_processor.py, markdown_builder.py, hierarchy_parser.py - - - - mistral_client.py: Type create_client, run_ocr, estimate_ocr_cost - - mistral_client.py: Add TypedDict for Mistral API response structures - - ocr_processor.py: Type serialize_ocr_response, OCR object structures - - markdown_builder.py: Type build_markdown, image_writer parameter - - hierarchy_parser.py: Type build_hierarchy, flatten_hierarchy functions - - hierarchy_parser.py: Add TypedDict for hierarchy node structure - - image_extractor.py: Type create_image_writer, image handling - - Verify mypy --strict passes on all modules - - - - All OCR/parsing modules have complete type annotations - - Mistral API structures use TypedDict - - Hierarchy nodes are properly typed - - Image handling functions are typed - - mypy --strict passes on all modules with zero errors - - - - - Add Google-Style Docstrings to Core Modules - - Add comprehensive Google-style docstrings to pdf_pipeline.py, flask_app.py, and weaviate modules - - - - pdf_pipeline.py: Add module docstring explaining the V2 pipeline - - pdf_pipeline.py: Add docstrings to process_pdf_v2 with Args, Returns, Raises sections - - pdf_pipeline.py: Document each of the 10 pipeline steps in comments - - pdf_pipeline.py: Add Examples section showing typical usage - - flask_app.py: Add module docstring explaining Flask application - - flask_app.py: Document all routes with request/response examples - - flask_app.py: Document Weaviate connection management - - schema.py: Add module docstring explaining schema design - - schema.py: Document each collection's purpose and relationships - - weaviate_ingest.py: Document ingestion 
process with examples - - All docstrings must follow Google style format exactly - - - - All core modules have comprehensive module-level docstrings - - All public functions have Google-style docstrings - - Args, Returns, Raises sections are complete and accurate - - Examples are provided for complex functions - - Docstrings explain WHY, not just WHAT - - - - - Add Google-Style Docstrings to LLM Modules - - Add comprehensive Google-style docstrings to all LLM processing modules - - - - llm_metadata.py: Document metadata extraction logic with examples - - llm_toc.py: Document TOC extraction strategies and fallbacks - - llm_classifier.py: Document section types and classification criteria - - llm_chunker.py: Document semantic vs basic chunking approaches - - llm_cleaner.py: Document cleaning rules and validation logic - - llm_validator.py: Document validation criteria and corrections - - Add Examples sections showing input/output for each function - - Document LLM provider differences (Ollama vs Mistral) - - Document cost implications in Notes sections - - All docstrings must follow Google style format exactly - - - - All LLM modules have comprehensive docstrings - - Each function has Args, Returns, Raises sections - - Examples show realistic input/output - - Provider differences are documented - - Cost implications are noted where relevant - - - - - Add Google-Style Docstrings to OCR and Parsing Modules - - Add comprehensive Google-style docstrings to OCR, markdown, hierarchy, and extraction modules - - - - mistral_client.py: Document OCR API usage, cost calculation - - ocr_processor.py: Document OCR response processing - - markdown_builder.py: Document markdown generation strategy - - hierarchy_parser.py: Document hierarchy building algorithm - - image_extractor.py: Document image extraction process - - toc_extractor*.py: Document various TOC extraction methods - - Add Examples sections for complex algorithms - - Document edge cases and error handling - - All 
docstrings must follow Google style format exactly

Success criteria:
- All OCR/parsing modules have comprehensive docstrings
- Complex algorithms are well explained
- Edge cases are documented
- Error handling is documented
- Examples demonstrate typical usage

Task: Final Validation and CI Integration
Description: Verify all type annotations and docstrings, integrate mypy into CI/CD

Steps:
- Run mypy --strict on entire codebase, verify 100% pass rate
- Verify all public functions have docstrings
- Check docstring formatting with pydocstyle or similar tool
- Create GitHub Actions workflow to run mypy on every commit
- Update README.md with type checking instructions
- Update CLAUDE.md with documentation standards
- Create CONTRIBUTING.md with type annotation and docstring guidelines
- Generate API documentation with Sphinx or pdoc
- Fix any remaining mypy errors or missing docstrings

Success criteria:
- mypy --strict passes on entire codebase with zero errors
- All public functions have Google-style docstrings
- CI/CD runs mypy checks automatically
- Documentation is generated and accessible
- Contributing guidelines document type/docstring requirements

Overall success criteria

Type safety:
- 100% type coverage across all modules
- mypy --strict passes with zero errors
- No # type: ignore comments without justification
- All Dict[str, Any] replaced with TypedDict where appropriate
- Proper use of generics, protocols, and type variables
- NewType used for semantic type safety

Documentation:
- All modules have comprehensive module-level docstrings
- All public functions/classes have Google-style docstrings
- All docstrings include Args, Returns, Raises sections
- Complex functions include Examples sections
- Cost implications documented in Notes sections
- Error handling clearly documented
- Provider differences (Ollama vs Mistral) documented

Code quality:
- Code is self-documenting with clear variable names
- Inline comments explain WHY, not WHAT
- Complex algorithms are well explained
- Performance considerations documented
- Security considerations documented

Developer experience:
- IDE autocomplete works perfectly with type hints
- Type errors caught at development time, not runtime
- Documentation is easily accessible in IDE
- API examples are executable and tested
- Contributing guidelines are clear and comprehensive

Maintainability:
- Refactoring is safer with type checking
- Function signatures are self-documenting
- API contracts are explicit and enforced
- Breaking changes are caught by type checker
- New developers can understand code quickly

Constraints

Compatibility:
- Must maintain backward compatibility with existing code
- Cannot break existing Flask routes or API contracts
- Weaviate schema must remain unchanged
- Existing tests must continue to pass

Migration:
- Can use per-module mypy configuration for gradual migration
- Can temporarily disable strict checks on legacy modules
- Priority modules must be completed first
- Low-priority modules can be deferred

Style:
- All type annotations must use Python 3.10+ syntax
- Docstrings must follow Google style exactly (not NumPy or reStructuredText)
- Use typing module (List, Dict, Optional) until Python 3.9 support dropped
- Use from __future__ import annotations if needed for forward references

Testing strategy

Type checking:
- Run mypy --strict on each module after adding types
- Use mypy daemon (dmypy) for faster incremental checking
- Add mypy to pre-commit hooks
- CI/CD must run mypy and fail on type errors

Documentation:
- Use pydocstyle to validate Google-style format
- Use sphinx-build to generate docs and catch errors
- Manual review of docstring examples
- Verify examples are executable and correct

Integration:
- Verify existing tests still pass after type additions
- Add new tests for complex typed structures
- Test mypy configuration on sample code
- Verify IDE autocomplete works correctly

Example module docstring:

```python
"""
PDF Pipeline V2 - Intelligent document processing with LLM enhancement.

This module orchestrates a 10-step pipeline for processing PDF documents:
1. OCR via Mistral API
2. Markdown construction with images
3. Metadata extraction via LLM
4. Table of contents (TOC) extraction
5. Section classification
6. Semantic chunking
7. Chunk cleaning and validation
8. Enrichment with concepts
9. Validation and corrections
10. Ingestion into Weaviate vector database

The pipeline supports multiple LLM providers (Ollama local, Mistral API) and
various processing modes (skip OCR, semantic chunking, OCR annotations).

Typical usage:
    >>> from pathlib import Path
    >>> from utils.pdf_pipeline import process_pdf
    >>>
    >>> result = process_pdf(
    ...     Path("document.pdf"),
    ...     use_llm=True,
    ...     llm_provider="ollama",
    ...     ingest_to_weaviate=True,
    ... )
    >>> print(f"Processed {result['pages']} pages, {result['chunks_count']} chunks")

See Also:
    mistral_client: OCR API client
    llm_metadata: Metadata extraction
    weaviate_ingest: Database ingestion
"""
```

Example function docstring:

```python
def process_pdf_v2(
    pdf_path: Path,
    output_dir: Path = Path("output"),
    *,
    use_llm: bool = True,
    llm_provider: Literal["ollama", "mistral"] = "ollama",
    llm_model: Optional[str] = None,
    skip_ocr: bool = False,
    ingest_to_weaviate: bool = True,
    progress_callback: Optional[ProgressCallback] = None,
) -> PipelineResult:
    """
    Process a PDF through the complete V2 pipeline with LLM enhancement.

    This function orchestrates all 10 steps of the intelligent document processing
    pipeline, from OCR to Weaviate ingestion. It supports both local (Ollama) and
    cloud (Mistral API) LLM providers, with optional caching via skip_ocr.

    Args:
        pdf_path: Absolute path to the PDF file to process.
        output_dir: Base directory for output files. Defaults to "./output".
        use_llm: Enable LLM-based processing (metadata, TOC, chunking).
            If False, uses basic heuristic processing.
        llm_provider: LLM provider to use. "ollama" for local (free but slow),
            "mistral" for API (fast but paid).
        llm_model: Specific model name. If None, auto-detects based on provider
            (qwen2.5:7b for ollama, mistral-small-latest for mistral).
        skip_ocr: If True, reuses existing markdown file to avoid OCR cost.
            Requires output_dir//.md to exist.
        ingest_to_weaviate: If True, ingests chunks into Weaviate after processing.
        progress_callback: Optional callback for real-time progress updates.
            Called with (step_id, status, detail) for each pipeline step.

    Returns:
        Dictionary containing processing results with the following keys:
        - success (bool): True if processing completed without errors
        - document_name (str): Name of the processed document
        - pages (int): Number of pages in the PDF
        - chunks_count (int): Number of chunks generated
        - cost_ocr (float): OCR cost in euros (0 if skip_ocr=True)
        - cost_llm (float): LLM API cost in euros (0 if provider=ollama)
        - cost_total (float): Total cost (ocr + llm)
        - metadata (dict): Extracted metadata (title, author, etc.)
        - toc (list): Hierarchical table of contents
        - files (dict): Paths to generated files (markdown, chunks, etc.)

    Raises:
        FileNotFoundError: If pdf_path does not exist.
        ValueError: If skip_ocr=True but markdown file not found.
        RuntimeError: If Weaviate connection fails during ingestion.

    Examples:
        Basic usage with Ollama (free):
        >>> result = process_pdf_v2(
        ...     Path("platon_menon.pdf"),
        ...     llm_provider="ollama"
        ... )
        >>> print(f"Cost: {result['cost_total']:.4f}€")
        Cost: 0.0270€  # OCR only

        With Mistral API (faster):
        >>> result = process_pdf_v2(
        ...     Path("platon_menon.pdf"),
        ...     llm_provider="mistral",
        ...     llm_model="mistral-small-latest"
        ... )

        Skip OCR to avoid cost:
        >>> result = process_pdf_v2(
        ...     Path("platon_menon.pdf"),
        ...     skip_ocr=True,  # Reuses existing markdown
        ...     ingest_to_weaviate=False
        ... )

    Notes:
        - OCR cost: ~0.003€/page (standard), ~0.009€/page (with annotations)
        - LLM cost: Free with Ollama, variable with Mistral API
        - Processing time: ~30s/page with Ollama, ~5s/page with Mistral
        - Weaviate must be running (docker-compose up -d) before ingestion
    """
```

diff --git a/prompts/coding_prompt_library.md b/prompts/coding_prompt_library.md
deleted file mode 100644
index 0f628a3..0000000
--- a/prompts/coding_prompt_library.md
+++ /dev/null
@@ -1,290 +0,0 @@
## YOUR ROLE - CODING AGENT (Library RAG - Type Safety & Documentation)

You are working on adding strict type annotations and Google-style docstrings to a Python library project.
This is a FRESH context window - you have no memory of previous sessions.

You have access to Linear for project management via MCP tools. Linear is your single source of truth.

### STEP 1: GET YOUR BEARINGS (MANDATORY)

Start by orienting yourself:

```bash
# 1. See your working directory
pwd

# 2. List files to understand project structure
ls -la

# 3. Read the project specification
cat app_spec.txt

# 4. Read the Linear project state
cat .linear_project.json

# 5. Check recent git history
git log --oneline -20
```

### STEP 2: CHECK LINEAR STATUS

Query Linear to understand current project state using the project_id from `.linear_project.json`.

1. **Get all issues and count progress:**
   ```
   mcp__linear__list_issues with project_id
   ```
   Count:
   - Issues "Done" = completed
   - Issues "Todo" = remaining
   - Issues "In Progress" = currently being worked on

2. **Find META issue** (if exists) for session context

3. **Check for in-progress work** - complete it first if found

### STEP 3: SELECT NEXT ISSUE

Get Todo issues sorted by priority:
```
mcp__linear__list_issues with project_id, status="Todo", limit=5
```

Select ONE highest-priority issue to work on.
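STEP 2 reads the project id out of `.linear_project.json`. A minimal sketch of that lookup - the top-level `"project_id"` key is an assumption about the file's schema, not something the prompt specifies:

```python
import json
from pathlib import Path


def read_project_id(path: Path = Path(".linear_project.json")) -> str:
    """Return the Linear project_id recorded by project setup.

    Args:
        path: Location of the Linear state file.

    Returns:
        The project id as a string.

    Raises:
        KeyError: If the file has no "project_id" field.
    """
    data = json.loads(path.read_text(encoding="utf-8"))
    return str(data["project_id"])
```

The returned id is what the `mcp__linear__list_issues` calls above expect as their `project_id` argument.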
### STEP 4: CLAIM THE ISSUE

Use `mcp__linear__update_issue` to set status to "In Progress"

### STEP 5: IMPLEMENT THE ISSUE

Based on issue category:

**For Type Annotation Issues (e.g., "Types - Add type annotations to X.py"):**

1. Read the target Python file
2. Identify all functions, methods, and variables
3. Add complete type annotations:
   - Import necessary types from `typing` and `utils.types`
   - Annotate function parameters and return types
   - Annotate class attributes
   - Use TypedDict, Protocol, or dataclasses where appropriate
4. Save the file
5. Run mypy to verify (MANDATORY):
   ```bash
   cd generations/library_rag
   mypy --config-file=mypy.ini
   ```
6. Fix any mypy errors
7. Commit the changes

**For Documentation Issues (e.g., "Docs - Add docstrings to X.py"):**

1. Read the target Python file
2. Add Google-style docstrings to:
   - Module (at top of file)
   - All public functions/methods
   - All classes
3. Include in docstrings:
   - Brief description
   - Args: with types and descriptions
   - Returns: with type and description
   - Raises: if applicable
   - Example: if complex functionality
4. Save the file
5. Optionally run pydocstyle to verify (if installed)
6. Commit the changes

**For Setup/Infrastructure Issues:**

Follow the specific instructions in the issue description.
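The type-annotation work in STEP 5 can be illustrated with a small sketch. `ChunkData` and `split_chunks` here are hypothetical stand-ins for this illustration, not the project's real definitions in `utils/types.py`:

```python
from __future__ import annotations

from typing import Any, Optional, TypedDict


class ChunkData(TypedDict):
    """One chunk record (illustrative; the real ChunkData lives in utils/types.py)."""

    text: str
    section: str
    index: int


def split_chunks(
    raw_text: str,
    max_len: int = 500,
    metadata: Optional[dict[str, Any]] = None,
) -> list[ChunkData]:
    """Split raw text into fixed-size chunks.

    Args:
        raw_text: Full document text.
        max_len: Maximum characters per chunk. Defaults to 500.
        metadata: Optional extra fields, unused in this sketch.

    Returns:
        One ChunkData per slice of the input, in document order.
    """
    return [
        {"text": raw_text[i : i + max_len], "section": "body", "index": n}
        for n, i in enumerate(range(0, len(raw_text), max_len))
    ]
```

Replacing a loose `Dict[str, Any]` return with a `TypedDict` like this is exactly what lets `mypy --config-file=mypy.ini` catch a misspelled or missing key at check time.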
### STEP 6: VERIFICATION

**Type Annotation Issues:**
- Run mypy on the modified file(s)
- Ensure zero type errors
- If errors exist, fix them before proceeding

**Documentation Issues:**
- Review docstrings for completeness
- Ensure Args/Returns sections match function signatures
- Check that examples are accurate

**Functional Changes (rare):**
- If the issue changes behavior, test manually
- Start Flask server if needed: `python flask_app.py`
- Test the affected functionality

### STEP 7: GIT COMMIT

Make a descriptive commit:
```bash
git add [files]
git commit -m "[type]: [brief description]

- [what changed]
- Verified with mypy (for type issues)
- Linear issue: [issue-id]
"
```

### STEP 8: UPDATE LINEAR ISSUE

1. **Add implementation comment:**
   ```markdown
   ## Implementation Complete

   ### Changes Made
   - [List of files modified]
   - [Key changes]

   ### Verification
   - mypy passes with zero errors (for type issues)
   - All test steps from issue description verified

   ### Git Commit
   [commit hash and message]
   ```

2. **Update status to "Done"** using `mcp__linear__update_issue`

### STEP 9: DECIDE NEXT ACTION

After completing an issue, ask yourself:

1. Have I been working for a while? (Use judgment based on complexity of work done)
2. Is the code in a stable state?
3. Would this be a good handoff point?

**If YES to all three:**
- Proceed to STEP 10 (Session Summary)
- End cleanly

**If NO:**
- Continue to another issue (go back to STEP 3)
- But commit first!
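The STEP 6 check that Args sections match function signatures can be partially automated. The helper below is a rough, hypothetical sketch - pydocstyle validates docstring formatting, not signature agreement, so something custom like this fills the gap:

```python
import inspect
import re
from typing import Callable


def undocumented_params(func: Callable[..., object]) -> list[str]:
    """Return parameter names missing from the docstring's Args section."""
    doc = inspect.getdoc(func) or ""
    # Grab the indented block that follows "Args:" (blank line ends it).
    match = re.search(r"Args:\n((?:[ \t]+\S.*\n?)+)", doc + "\n")
    documented = (
        set(re.findall(r"^[ \t]+(\w+):", match.group(1), re.M)) if match else set()
    )
    params = [p for p in inspect.signature(func).parameters if p not in ("self", "cls")]
    return [p for p in params if p not in documented]
```

An empty return list means every parameter is documented; anything else names the gaps to fix before marking the issue Done.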
**Pacing Guidelines:**
- Early phase (< 20% done): Can complete multiple simple issues
- Mid/late phase (> 20% done): 1-2 issues per session for quality

### STEP 10: SESSION SUMMARY (When Ending)

If META issue exists, add a comment:

```markdown
## Session Complete

### Completed This Session
- [Issue ID]: [Title] - [Brief summary]

### Current Progress
- X issues Done
- Y issues In Progress
- Z issues Todo

### Notes for Next Session
- [Important context]
- [Recommendations]
- [Any concerns]
```

Ensure:
- All code committed
- No uncommitted changes
- App in working state

---

## LINEAR WORKFLOW RULES

**Status Transitions:**
- Todo → In Progress (when starting)
- In Progress → Done (when verified)

**NEVER:**
- Delete or modify issue descriptions
- Mark Done without verification
- Leave issues In Progress when switching

---

## TYPE ANNOTATION GUIDELINES

**Imports needed:**
```python
from typing import Optional, Dict, List, Any, Tuple, Callable
from pathlib import Path
from utils.types import [needed types]
```

**Common patterns:**
```python
# Functions
def process_data(data: str, options: Optional[Dict[str, Any]] = None) -> List[str]:
    """Process input data."""
    ...

# Methods with self
def save(self, path: Path) -> None:
    """Save to file."""
    ...

# Async functions
async def fetch_data(url: str) -> Dict[str, Any]:
    """Fetch from API."""
    ...
```

**Use project types from `utils/types.py`:**
- Metadata, OCRResponse, TOCEntry, ChunkData, PipelineResult, etc.

---

## DOCSTRING TEMPLATE (Google Style)

```python
def function_name(param1: str, param2: int = 0) -> List[str]:
    """
    Brief one-line description.

    More detailed description if needed. Explain what the function does,
    any important behavior, side effects, etc.

    Args:
        param1: Description of param1.
        param2: Description of param2. Defaults to 0.

    Returns:
        Description of return value.
    Raises:
        ValueError: When param1 is empty.
        IOError: When file cannot be read.

    Example:
        >>> result = function_name("test", 5)
        >>> print(result)
        ['test', 'test', 'test', 'test', 'test']
    """
```

---

## IMPORTANT REMINDERS

**Your Goal:** Add strict type annotations and comprehensive documentation to all Python modules

**This Session's Goal:** Complete 1-2 issues with quality work and clean handoff

**Quality Bar:**
- mypy --strict passes with zero errors
- All public functions have complete Google-style docstrings
- Code is clean and well-documented

**Context is finite.** End sessions early with good handoff notes. The next agent will continue.

---

Begin by running STEP 1 (Get Your Bearings).
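A filled-in, runnable instance of the docstring template above - the repeat-a-token behaviour is invented purely so the Example section is executable:

```python
from typing import List


def repeat_token(token: str, count: int = 0) -> List[str]:
    """
    Return `count` copies of `token`.

    Args:
        token: Non-empty string to repeat.
        count: Number of copies. Defaults to 0.

    Returns:
        A list containing `token` repeated `count` times.

    Raises:
        ValueError: When token is empty.

    Example:
        >>> repeat_token("test", 5)
        ['test', 'test', 'test', 'test', 'test']
    """
    if not token:
        raise ValueError("token must be non-empty")
    return [token] * count
```

Running `python -m doctest` against a module written this way keeps every Example section honest, which lines up with the spec's "API examples are executable and tested" criterion.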