diff --git a/REMOTE_WEAVIATE_ARCHITECTURE.md b/REMOTE_WEAVIATE_ARCHITECTURE.md
deleted file mode 100644
index cc4200c..0000000
--- a/REMOTE_WEAVIATE_ARCHITECTURE.md
+++ /dev/null
@@ -1,431 +0,0 @@
-# Architecture for a remote Weaviate (Synology/VPS)
-
-## Your use case
-
-**Situation**: LLM application (local or cloud) → Weaviate (remote Synology or VPS)
-
-**Requirements**:
-- ✅ Maximum reliability
-- ✅ Security (private data)
-- ✅ Acceptable performance
-- ✅ Simple maintenance
-
----
-
-## 🏆 Recommended option: REST API + secure tunnel
-
-### Overall architecture
-
-```
-┌──────────────────────────────────────────────────────────────┐
-│                       LLM Application                        │
-│         (Claude API, OpenAI, local Ollama, etc.)             │
-└────────────────────┬─────────────────────────────────────────┘
-                     │
-                     ▼
-┌──────────────────────────────────────────────────────────────┐
-│             Custom REST API (Flask/FastAPI)                  │
-│   - JWT / API key authentication                             │
-│   - Rate limiting                                            │
-│   - Logging                                                  │
-│   - HTTPS (Let's Encrypt)                                    │
-└────────────────────┬─────────────────────────────────────────┘
-                     │
-                     ▼  (private network or VPN)
-┌──────────────────────────────────────────────────────────────┐
-│                    Synology NAS / VPS                        │
-│  ┌────────────────────────────────────────────────────┐      │
-│  │                  Docker Compose                    │      │
-│  │  ┌──────────────────┐  ┌──────────────────────┐    │      │
-│  │  │  Weaviate :8080  │  │ text2vec-transformers│    │      │
-│  │  └──────────────────┘  └──────────────────────┘    │      │
-│  └────────────────────────────────────────────────────┘      │
-└──────────────────────────────────────────────────────────────┘
-```
-
-### Why this option?
-
-✅ **Maximum reliability** (5/5)
-- HTTP/REST is a standard, battle-tested protocol
-- Automatic retries are easy to add
-- Clear error handling
-
-✅ **Security** (5/5)
-- HTTPS enforced
-- API key authentication
-- Optional IP whitelisting
-- Audit logs
-
-✅ **Performance** (4/5)
-- Network latency is unavoidable
-- gzip compression available
-- Optional Redis cache
-
-✅ **Maintenance** (5/5)
-- Simple code (Flask/FastAPI)
-- Easy monitoring
-- Standard deployment
----
-
-## Comparing the 4 options
-
-### Option 1: Custom REST API (⭐ RECOMMENDED)
-
-**Architecture**: App → REST API → Weaviate
-
-**Example code**:
-
-```python
-# api_server.py (deployed on the VPS/Synology)
-import os
-from pathlib import Path
-
-from fastapi import FastAPI, HTTPException, Security
-from fastapi.security import APIKeyHeader
-import weaviate
-
-app = FastAPI()
-api_key_header = APIKeyHeader(name="X-API-Key")
-
-# Connect to Weaviate (local on the same machine)
-client = weaviate.connect_to_local()
-
-def verify_api_key(api_key: str = Security(api_key_header)):
-    if api_key != os.getenv("API_KEY"):
-        raise HTTPException(status_code=403, detail="Invalid API key")
-    return api_key
-
-@app.post("/search")
-async def search_chunks(
-    query: str,
-    limit: int = 10,
-    api_key: str = Security(verify_api_key)
-):
-    collection = client.collections.get("Chunk")
-    result = collection.query.near_text(
-        query=query,
-        limit=limit
-    )
-    return {"results": [obj.properties for obj in result.objects]}
-
-@app.post("/insert_pdf")
-async def insert_pdf(
-    pdf_path: str,
-    api_key: str = Security(verify_api_key)
-):
-    # Call the library_rag pipeline
-    from utils.pdf_pipeline import process_pdf
-    result = process_pdf(Path(pdf_path))
-    return result
-```
-
-**Deployment**:
-
-```bash
-# On the VPS/Synology
-docker-compose up -d weaviate text2vec
-uvicorn api_server:app --host 0.0.0.0 --port 8000 --ssl-keyfile key.pem --ssl-certfile cert.pem
-```
-
-**Pros**:
-- ✅ Full control over the API
-- ✅ Easy to secure (HTTPS + API key)
-- ✅ Can wrap the whole library_rag pipeline
-- ✅ Easy monitoring and logging
-
-**Cons**:
-- ⚠️ Custom code to maintain
-- ⚠️ Requires a web server (nginx/uvicorn)
----
-
-### Option 2: Direct Weaviate access over VPN
-
-**Architecture**: App → VPN → Weaviate:8080
-
-**Configuration**:
-
-```bash
-# On the Synology: enable VPN Server (OpenVPN/WireGuard)
-# On the client: connect to the VPN
-# Direct access to http://192.168.x.x:8080 (Synology private IP)
-```
-
-**Client code**:
-
-```python
-# In your LLM app
-import weaviate
-
-# Over the VPN, using the Synology private IP
-client = weaviate.connect_to_custom(
-    http_host="192.168.1.100",
-    http_port=8080,
-    http_secure=False,  # inside the VPN, HTTPS is not required
-    grpc_host="192.168.1.100",
-    grpc_port=50051,
-    grpc_secure=False,
-)
-
-# Direct usage
-collection = client.collections.get("Chunk")
-result = collection.query.near_text(query="justice")
-```
-
-**Pros**:
-- ✅ Very simple (no custom code)
-- ✅ Security via the VPN
-- ✅ Uses the Weaviate Python client directly
-
-**Cons**:
-- ⚠️ The VPN must stay up permanently
-- ⚠️ VPN latency
-- ⚠️ No abstraction layer (the app must know about Weaviate)
----
-
-### Option 3: MCP HTTP server on the VPS
-
-**Architecture**: App → MCP over HTTP → Weaviate
-
-**Problem**: FastMCP SSE does not work well in production (as we saw)
-
-**Solution**: a custom MCP-over-HTTP wrapper
-
-```python
-# mcp_http_wrapper.py (on the VPS)
-from fastapi import FastAPI
-from pydantic import BaseModel
-
-from mcp_tools import parse_pdf_handler, search_chunks_handler, SearchChunksInput
-
-app = FastAPI()
-
-class SearchRequest(BaseModel):
-    query: str
-    limit: int = 10
-
-@app.post("/mcp/search_chunks")
-async def mcp_search(req: SearchRequest):
-    # Call the MCP handler directly
-    input_data = SearchChunksInput(query=req.query, limit=req.limit)
-    result = await search_chunks_handler(input_data)
-    return result.model_dump()
-```
-
-**Pros**:
-- ✅ Reuses the existing MCP code
-- ✅ Standard HTTP
-
-**Cons**:
-- ⚠️ MCP over stdio cannot be used remotely
-- ⚠️ A custom HTTP wrapper is needed anyway
-- ⚠️ Equivalent to Option 1, but more complex
-
-**Verdict**: Option 1 (a plain REST API) is better
-
----
-
-### Option 4: SSH tunnel + port forwarding
-
-**Architecture**: App → SSH tunnel → localhost:8080 (remote Weaviate)
-
-**Configuration**:
-
-```bash
-# On your local machine
-ssh -L 8080:localhost:8080 user@synology-ip
-
-# The remote Weaviate is now reachable on localhost:8080
-```
-
-**Code**:
-
-```python
-# In your app (which thinks Weaviate is local)
-client = weaviate.connect_to_local()  # goes to localhost:8080 = the SSH tunnel
-```
-
-**Pros**:
-- ✅ SSH-level security
-- ✅ Simple to set up
-- ✅ No custom code
-
-**Cons**:
-- ⚠️ The tunnel must stay open
-- ⚠️ Not suitable for a cloud app
-- ⚠️ SSH latency
-
----
-
-## 🎯 Recommendations by scenario
-
-### Case 1: Local application (your PC) → Weaviate on Synology/VPS
-
-**Recommendation**: **VPN + direct Weaviate access** (Option 2)
-
-**Why**:
-- Easy to configure on a Synology (built-in VPN Server)
-- No custom code
-- Security via the VPN
-- Acceptable performance on a LAN/VPN
-
-**Setup**:
-
-1. Synology: enable VPN Server (OpenVPN)
-2. Client: connect to the VPN
-3. Python: `weaviate.connect_to_custom(http_host="192.168.x.x", ...)`
-
----
-
-### Case 2: Cloud application (remote server) → Weaviate on Synology/VPS
-
-**Recommendation**: **Custom REST API** (Option 1)
-
-**Why**:
-- No VPN required
-- Public HTTPS with Let's Encrypt
-- Access control via API key
-- Rate limiting
-- Monitoring
-
-**Setup**:
-
-1. VPS/Synology: Docker Compose (Weaviate + REST API)
-2. Domain: api.monrag.com → VPS IP
-3. Let's Encrypt: automatic HTTPS
-4. Cloud app: calls `https://api.monrag.com/search` with the `X-API-Key` header
-
----
-
-### Case 3: Temporary local development → remote Weaviate
-
-**Recommendation**: **SSH tunnel** (Option 4)
-
-**Why**:
-- One-line setup
-- No permanent configuration
-- Perfect for dev/debugging
-
-**Setup**:
-
-```bash
-ssh -L 8080:localhost:8080 user@vps
-# The remote Weaviate is reachable on localhost:8080
-```
-
----
-
-## 🔧 Recommended VPS deployment
-
-### Full stack
-
-```yaml
-# docker-compose.yml (on the VPS)
-version: '3.8'
-
-services:
-  # Weaviate + embeddings
-  weaviate:
-    image: cr.weaviate.io/semitechnologies/weaviate:1.34.4
-    ports:
-      - "127.0.0.1:8080:8080"  # bound to localhost only (security)
-    environment:
-      AUTHENTICATION_APIKEY_ENABLED: "true"
-      AUTHENTICATION_APIKEY_ALLOWED_KEYS: "my-secret-key"
-      # ... other settings
-    volumes:
-      - weaviate_data:/var/lib/weaviate
-
-  text2vec-transformers:
-    image: cr.weaviate.io/semitechnologies/transformers-inference:baai-bge-m3-onnx-latest
-    # ... config
-
-  # Custom REST API
-  api:
-    build: ./api
-    ports:
-      - "8000:8000"
-    environment:
-      WEAVIATE_URL: http://weaviate:8080
-      API_KEY: ${API_KEY}
-      MISTRAL_API_KEY: ${MISTRAL_API_KEY}
-    depends_on:
-      - weaviate
-    restart: always
-
-  # NGINX reverse proxy + HTTPS
-  nginx:
-    image: nginx:alpine
-    ports:
-      - "80:80"
-      - "443:443"
-    volumes:
-      - ./nginx.conf:/etc/nginx/nginx.conf
-      - /etc/letsencrypt:/etc/letsencrypt
-    depends_on:
-      - api
-
-volumes:
-  weaviate_data:
-```
-
-### NGINX config
-
-```nginx
-# nginx.conf
-# Note: the api_limit zone must be declared in the http{} block, e.g.:
-#   limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;
-server {
-    listen 443 ssl;
-    server_name api.monrag.com;
-
-    ssl_certificate /etc/letsencrypt/live/api.monrag.com/fullchain.pem;
-    ssl_certificate_key /etc/letsencrypt/live/api.monrag.com/privkey.pem;
-
-    location / {
-        proxy_pass http://api:8000;
-        proxy_set_header Host $host;
-        proxy_set_header X-Real-IP $remote_addr;
-
-        # Rate limiting
-        limit_req zone=api_limit burst=10 nodelay;
-    }
-}
-```
-
----
-
-## 📊 Final comparison
-
-| Criterion | VPN + direct | REST API | SSH tunnel | MCP HTTP |
-|-----------|--------------|----------|------------|----------|
-| **Reliability** | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ |
-| **Security** | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
-| **Simplicity** | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ |
-| **Performance** | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ |
-| **Maintenance** | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ |
-| **Production-ready** | ✅ Yes | ✅ Yes | ❌ No | ⚠️ Possible |
-
----
-
-## 💡 My final recommendation
-
-### For a Synology (personal/team use)
-**VPN + direct Weaviate access** (Option 2)
-- Synology ships an excellent built-in VPN Server
-- Maximum security
-- Easy to maintain
-
-### For a VPS (production/public use)
-**Custom REST API** (Option 1)
-- Full control
-- Public HTTPS
-- Scalable
-- Full monitoring
-
----
-
-## 🚀 Recommended next step
-
-Would you like me to create:
-
-1. **The REST API code** (Flask/FastAPI) with auth + rate limiting?
-2. **The complete VPS docker-compose** with nginx + Let's Encrypt?
-3. **The Synology VPN installation guide** + client config?
-
-Tell me your exact use case and I will prepare the complete solution! 🎯
diff --git a/navette.txt b/navette.txt
deleted file mode 100644
index af78880..0000000
--- a/navette.txt
+++ /dev/null
@@ -1,2510 +0,0 @@
-================================================================================
-NAVETTE - CLAUDE <-> DAVID COMMUNICATION
-================================================================================
-Date: December 19, 2025
-Last update: NEW SPEC CREATED
-
-================================================================================
-NEW SPEC CREATED!
-================================================================================
-
-I have COMPLETELY rewritten the spec as you requested.
-
-NEW FILE: prompts/app_spec_ikario_rag_UI.txt
-
-================================================================================
-DIFFERENCES FROM THE OLD SPEC
-================================================================================
-
-OLD SPEC (app_spec_ikario_rag_improvements.txt):
-❌ Modified the ikario_rag Python code (mcp_ikario_memory.py, server.py)
-❌ Developed inside SynologyDrive
-❌ Added MCP tools to the Python server
-❌ Caused the problem (the agent modified your code)
-
-NEW SPEC (app_spec_ikario_rag_UI.txt):
-✓ Develops ONLY inside generations/ikario_body/
-✓ USES the 7 existing MCP tools (via the client)
-✓ DOES NOT touch the ikario_rag code
-✓ Adds a user interface to exploit the memory
-================================================================================
-15 NOUVELLES FEATURES (FRONTEND + BACKEND)
-================================================================================
-
-BACKEND (server/):
-1. Routes API Memory (POST /api/memory/thoughts, GET, etc.)
-2. Memory Service Layer (wrapper MCP client)
-3. Error Handling & Logging (robuste)
-4. Memory Stats Endpoint (statistiques)
-
-FRONTEND (src/):
-5. useMemory Hook (React hook centralise)
-6. Memory Panel Component (sidebar memoire)
-7. Add Thought Modal (ajouter pensees)
-8. Memory Settings Panel (preferences)
-9. Save to Memory Button (depuis chat)
-10. Memory Context Panel (contexte pendant chat)
-11. Memory Search Interface (recherche avancee)
-12. Concepts Graph Visualization (graphe interactif)
-
-DOCUMENTATION & TESTS:
-13. Memory API Guide (doc complete)
-14. Integration Tests (tests backend)
-15. Memory Tour (onboarding users)
-
-================================================================================
-EXISTING MCP TOOLS USED
-================================================================================
-
-The ikario_rag server already exposes 7 MCP tools:
-1. add_thought - add a thought
-2. add_conversation - add a conversation
-3. search_thoughts - search thoughts
-4. search_conversations - search conversations
-5. search_memories - global search
-6. trace_concept_evolution - trace a concept's evolution
-7. check_consistency - consistency check
-
-We use these tools VIA the MCP client already present in ikario_body:
-- server/services/mcpClient.js
-
-================================================================================
-ARCHITECTURE
-================================================================================
-
-User Interface (React)
-    ↓
-Backend API (Express routes)
-    ↓
-Memory Service (wrapper)
-    ↓
-MCP Client (mcpClient.js)
-    ↓
-MCP Protocol (stdio)
-    ↓
-Ikario RAG Server (Python, SynologyDrive)
-    ↓
-ChromaDB (embeddings)
-
-NO MODIFICATION inside ikario_rag (SynologyDrive)!
-
-================================================================================
-NEXT ACTIONS
-================================================================================
-
-1. DELETE THE OLD SPEC?
-   - File: prompts/app_spec_ikario_rag_improvements.txt
-   - Options:
-     a) DELETE (recommended, it causes confusion)
-     b) RENAME to .OLD (backup)
-     c) KEEP (but risks being relaunched by mistake)
-
-2. DELETE THE 15 EXISTING LINEAR ISSUES?
-   - Issues TEAMPHI-305 to 319 (old features)
-   - These issues are about modifying ikario_rag (which we no longer want)
-   - Options:
-     a) DELETE them all (clean slate)
-     b) KEEP as documentation (but mark them Canceled)
-
-3. CREATE 15 NEW ISSUES?
-   - For the 15 features of the new spec (UI)
-   - Issues that develop inside ikario_body
-   - Options:
-     a) YES, create them now with initializer bis
-     b) YES, but manually in Linear
-     c) NO, just develop without Linear
-
-================================================================================
-MY RECOMMENDATIONS
-================================================================================
-
-1. OLD SPEC: DELETE
-   - File app_spec_ikario_rag_improvements.txt
-   - Avoids future confusion
-   - The new spec is complete
-
-2. OLD ISSUES (305-319): DELETE THEM ALL
-   - They are about modifying ikario_rag
-   - We no longer want to do that
-   - Clean slate
-
-3. NEW ISSUES: CREATE NOW
-   - 15 new issues for the UI features
-   - Launch initializer bis with the new spec
-   - Develop only inside ikario_body
-   - With sandbox restrictions for SynologyDrive
-
-================================================================================
-CLEANUP COMMANDS
-================================================================================
-
-If you agree with my recommendations:
-
-1. Delete the old spec:
-   rm C:/GitHub/Linear_coding/prompts/app_spec_ikario_rag_improvements.txt
-
-2. Delete the 15 old issues:
-   (I can do it via the Linear API)
-
-3. Create the 15 new issues:
-   python autonomous_agent_demo.py --project-dir ikario_body --new-spec app_spec_ikario_rag_UI.txt
-
-4. Add sandbox restrictions (before launching the agent):
-   (I need to modify autonomous_agent_demo.py to block SynologyDrive)
-================================================================================
-QUESTIONS POUR TOI
-================================================================================
-
-Reponds avec 3 choix:
-
-1. Ancien spec (app_spec_ikario_rag_improvements.txt):
- a) SUPPRIMER
- b) RENOMMER .OLD
- c) GARDER
-
-2. Anciennes issues Linear (TEAMPHI-305 a 319):
- a) SUPPRIMER toutes
- b) GARDER comme doc (Canceled)
- c) GARDER telles quelles
-
-3. Nouvelles issues pour nouveau spec:
- a) CREER maintenant (agent initializer bis)
- b) CREER manuellement dans Linear
- c) PAS D'ISSUES (developper sans Linear)
-
-Exemple de reponse:
-1. a
-2. a
-3. a
-
-================================================================================
-NEW SPEC VERIFICATION
-================================================================================
-
-The new spec is in: prompts/app_spec_ikario_rag_UI.txt
-
-You can read it to check that it is what you want.
-
-Key points:
-- 15 frontend/backend features
-- ZERO modification of ikario_rag
-- Developed in ikario_body only
-- Uses the existing MCP tools
-- 5 implementation phases (7-10 days total)
-
-If you want changes to the spec, say so now BEFORE the issues are created.
-
-================================================================================
-SUMMARY OF FUNCTIONAL REQUIREMENTS
-================================================================================
-Date: December 19, 2025
-Last update: SUMMARY ADDED
-
-You asked me to clarify the two main features.
-
-Here is my understanding and my summary:
-
-================================================================================
-REQUIREMENT 1: THOUGHTS
-================================================================================
-
-DESIRED BEHAVIOR:
-- The LLM can WRITE thoughts whenever it wants
-- The LLM can READ existing thoughts
-- The LLM can SEARCH for relevant thoughts
-
-MCP TOOLS USED (already exposed by ikario_rag):
-1. add_thought - to WRITE a new thought
-2. search_thoughts - to SEARCH thoughts
-
-HOW IT WORKS:
-- During a conversation, the LLM decides to save a reflection
-- Example: "I just understood that the user prefers React over Vue"
-- The LLM calls add_thought via the MCP client
-- The thought is stored in ChromaDB with semantic embeddings
-- Later, the LLM can search: "the user's frontend preferences"
-- search_thoughts returns the relevant thoughts
-
-INVOCATION MODES:
-- MANUAL (LLM decides): the LLM uses the tool when it deems it necessary
-- MANUAL (user decides): a "Save to Memory" button in the chat UI
-- SEMI-AUTO: automatic suggestion after important conversations
-
-================================================================================
-REQUIREMENT 2: CONVERSATIONS (AUTO-SAVE)
-================================================================================
-
-DESIRED BEHAVIOR:
-- After EACH LLM reply, the conversation is saved
-- AUTOMATIC save (no manual action needed)
-- Same conversation = all messages are linked (conversation_id)
-
-MCP TOOLS USED (already exposed by ikario_rag):
-1. add_conversation - to SAVE the conversation
-
-HOW IT WORKS:
-- User: "How do I make a fetch API call in React?"
-- LLM: [detailed answer about the fetch API]
-- AUTOMATICALLY after the LLM reply:
-  * The backend detects the end of the LLM reply
-  * The backend calls add_conversation with:
-    - user_message: "How do I make a fetch API call in React?"
-    - assistant_message: [the LLM reply]
-    - conversation_id: a unique ID for this chat session
-  * ChromaDB stores it with semantic embeddings
-- Next time, searching "React fetch API" will return this conversation
-
-TECHNICAL ARCHITECTURE:
-- Backend hook: onMessageComplete()
-- Triggered: after each LLM reply has fully streamed
-- Calls: mcpClient.callTool('add_conversation', {...})
-- Parameters:
-  {
-    user_message: string,
-    assistant_message: string,
-    conversation_id: string (session UUID),
-    timestamp: ISO date,
-    metadata: {
-      model: "claude-sonnet-4.5",
-      tokens: number,
-      ...
-    }
-  }
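
The parameter shape above can be sketched as a plain payload builder. This is an illustrative sketch, not the actual ikario_body code; the field names simply mirror the parameter list above:

```python
import uuid
from datetime import datetime, timezone

def build_conversation_payload(user_message, assistant_message,
                               conversation_id=None,
                               model="claude-sonnet-4.5", tokens=0):
    """Assemble the arguments passed to the add_conversation MCP tool."""
    return {
        "user_message": user_message,
        "assistant_message": assistant_message,
        # One UUID per chat session, reused for every exchange in that session
        "conversation_id": conversation_id or str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "metadata": {"model": model, "tokens": tokens},
    }

payload = build_conversation_payload(
    "How do I make a fetch API call in React?",
    "Use fetch() inside useEffect...",
)
```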
-
-================================================================================
-FULL MAPPING OF THE 7 MCP TOOLS
-================================================================================
-
-FOR THOUGHTS:
-1. add_thought -------> write a new thought
-2. search_thoughts ---> search thoughts
-3. trace_concept_evolution -> trace a concept's evolution across thoughts
-4. check_consistency -> check consistency between thoughts
-
-FOR CONVERSATIONS:
-1. add_conversation -----> save a conversation (AUTO)
-2. search_conversations -> search the history
-3. search_memories -------> global search (thoughts + conversations)
-
-ADVANCED (optional):
-1. trace_concept_evolution -> see how a concept evolves over time
-2. check_consistency --------> detect contradictions
-
-================================================================================
-IMPLEMENTATION ARCHITECTURE
-================================================================================
-
-BACKEND (Express API):
-------------------
-1. POST /api/chat/message
-   - Receives the user message
-   - Sends it to the Claude API
-   - Streams the reply
-   - AFTER streaming completes:
-     * Calls add_conversation automatically
-     * Returns success to the frontend
-
-2. POST /api/memory/thoughts (manual)
-   - The user clicks "Save to Memory"
-   - The backend calls add_thought
-   - Returns a confirmation
-
-3. GET /api/memory/search?q=...
-   - The user searches in the sidebar
-   - The backend calls search_memories
-   - Returns results (thoughts + conversations)
-
-FRONTEND (React):
---------------
-1. Chat interface:
-   - "Save to Memory" button on each message
-   - Auto-save indicator (small icon when a conversation is saved)
-
-2. Memory sidebar:
-   - Search bar
-   - Result list (thoughts + conversations)
-   - Filter: "Thoughts only" / "Conversations only" / "All"
-
-3. Memory context panel:
-   - While typing, shows relevant thoughts/conversations
-   - Auto-search based on the message context
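
The backend routes above reduce to a thin service layer over the MCP client. A minimal language-agnostic sketch in Python (the real backend is Express/JS; `mcp_call` is an illustrative stand-in for mcpClient.callTool):

```python
class MemoryService:
    """Thin wrapper that routes backend API calls to the MCP tools."""

    def __init__(self, mcp_call):
        # mcp_call(tool_name, args) -> dict; injected so it can be stubbed in tests
        self.mcp_call = mcp_call

    def save_conversation(self, user_message, assistant_message, conversation_id):
        return self.mcp_call("add_conversation", {
            "user_message": user_message,
            "assistant_message": assistant_message,
            "conversation_id": conversation_id,
        })

    def save_thought(self, content):
        return self.mcp_call("add_thought", {"content": content})

    def search(self, query, limit=10):
        return self.mcp_call("search_memories", {"query": query, "limit": limit})

# Stubbed MCP client that records which tool was invoked
calls = []
def fake_mcp(tool, args):
    calls.append(tool)
    return {"ok": True, "tool": tool}

svc = MemoryService(fake_mcp)
svc.save_thought("User prefers TailwindCSS")
svc.search("styling preferences")
```

Injecting `mcp_call` keeps the service testable without a running ikario_rag server.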
-
-================================================================================
-A CONCRETE USAGE EXAMPLE
-================================================================================
-
-SCENARIO 1: AUTO-SAVED CONVERSATION
------------------------------------------
-User: "How do I implement dark mode in React?"
-LLM: [detailed answer with code examples]
-BACKEND (auto): calls add_conversation with both messages
-ChromaDB: stores them with embeddings
-
-Two weeks later:
-User: "dark mode"
-Search: returns the earlier conversation
-LLM: can reread it and continue the discussion
-
-SCENARIO 2: MANUAL THOUGHT
----------------------------
-User: "I prefer TailwindCSS over styled-components"
-LLM: "Understood, I'll note your preference"
-LLM (internal): calls add_thought("User prefers TailwindCSS over styled-components")
-ChromaDB: stores the preference
-
-Later:
-User: "Help me style this component"
-LLM (internal): searches "styling preferences"
-Result: finds the TailwindCSS thought
-LLM: "I'll use TailwindCSS for the styling, as you prefer"
-
-SCENARIO 3: SAVE TO MEMORY BUTTON
----------------------------------
-User: "Here are our naming conventions: components in PascalCase, utils in camelCase"
-LLM: [replies with a confirmation]
-User: [clicks "Save to Memory"]
-Frontend: POST /api/memory/thoughts
-Backend: calls add_thought with the user message
-ChromaDB: stores the conventions
-
-Later:
-The LLM creates a new component and automatically follows the conventions
-(because it can search "naming conventions" before generating code)
-
-================================================================================
-KEY DIFFERENCES BETWEEN THOUGHTS AND CONVERSATIONS
-================================================================================
-
-THOUGHTS:
-- Content: reflections, preferences, conventions, learnings
-- Size: generally short (1-3 sentences)
-- Trigger: manual (the LLM decides, or the user clicks the button)
-- Granularity: atomic (1 thought = 1 concept)
-- Example: "User prefers functional components over class components"
-
-CONVERSATIONS:
-- Content: full user-assistant exchanges
-- Size: variable (can be long)
-- Trigger: AUTOMATIC after each LLM reply
-- Granularity: dialogue (1 conversation = 1 Q&A exchange)
-- Example: the whole exchange about "How do I make a fetch API call in React?"
-
-BOTH TOGETHER:
-- Complementary: thoughts = knowledge, conversations = context
-- Searchable: search_memories looks through both
-- Evolution: trace_concept_evolution works on both
-
-================================================================================
-CLARIFICATION QUESTIONS
-================================================================================
-
-Before continuing, I need to confirm a few details:
-
-1. CONVERSATION AUTO-SAVE:
-   - Should ALL conversations be saved?
-   - Or only some (e.g. > 100 tokens, contains code, etc.)?
-   - My take: ALL, but with a user option "Disable auto-save" in the settings
-
-2. CONVERSATION_ID:
-   - Is one conversation_id = one full chat session (several messages)?
-   - Or one conversation_id = a single exchange (1 user msg + 1 assistant msg)?
-   - My take: the full session (as you said "same conversation")
-
-3. AUTO-SAVE TRIGGER:
-   - Immediate (after each reply)?
-   - Or batched (every 5 minutes)?
-   - My take: immediate but asynchronous (must not block the chat)
-
-4. PRIVACY:
-   - Are auto-saved conversations "private" by default?
-   - Or "shared" (visible to other users)?
-   - My take: private by default in a single-user context
-
-================================================================================
-FINAL RECOMMENDATION
-================================================================================
-
-I recommend this implementation:
-
-PHASE 1 (Core):
-- Auto-save conversations (add_conversation after each reply)
-- Manual "Save to Memory" button (add_thought)
-- Basic search interface (search_memories)
-
-PHASE 2 (Enhanced):
-- Memory sidebar with enriched results
-- Thoughts vs conversations filters
-- Memory context panel (suggestions while typing)
-
-PHASE 3 (Advanced):
-- Concepts graph visualization (trace_concept_evolution)
-- Consistency checker (check_consistency)
-- Memory settings (disable auto-save, privacy, etc.)
-
-TOTAL: 15 features, as in the spec app_spec_ikario_rag_UI.txt
-
-================================================================================
-VALIDATED CHANGES
-================================================================================
-Date: December 19, 2025 - 23:30
-
-CHANGE 1: THOUGHTS = LLM ONLY
----------------------------------------
-REMOVED:
-- The "Save to Memory" button for the user
-- Automatic suggestions
-
-KEPT:
-- Only the LLM decides when to write/read its thoughts
-- Thoughts are an INTERNAL tool of the LLM
-
-================================================================================
-CODE ANALYSIS: add_conversation
-================================================================================
-
-I read mcp_ikario_memory.py (lines 100-189).
-
-ANSWER: NO, add_conversation CANNOT do an incremental update
-
-IDENTIFIED PROBLEM:
-------------------
-Lines 160-164:
-```python
-self.conversations.add(
-    documents=[full_conversation_text],
-    metadatas=[main_metadata],
-    ids=[conversation_id]  # <-- THE PROBLEM IS HERE
-)
-```
-
-ChromaDB.add() with an existing ID:
-- Option 1: an "ID already exists" error
-- Option 2: completely overwrites the old document
-
-THEREFORE:
-- Calling add_conversation twice with the same conversation_id = OVERWRITE
-- There is no "append" mechanism for adding messages
-- It is a complete REPLACEMENT, not an incremental update
-
-CURRENT BEHAVIOR:
--------------------
-First call:
-add_conversation(conversation_id="session_123", messages=[msg1, msg2])
--> creates a conversation with 2 messages
-
-Second call:
-add_conversation(conversation_id="session_123", messages=[msg1, msg2, msg3, msg4])
--> OVERWRITES the previous conversation
--> completely replaces it with 4 messages
-
-CONSEQUENCE FOR YOUR REQUIREMENT:
-----------------------------
-You want to save after EACH LLM reply into the SAME conversation.
-
-Example:
-User: "Hello"
-LLM: "Hi!"
--> saves conversation_id="conv_20251219" with 2 messages
-
-User: "How are you?"
-LLM: "Fine, thanks!"
--> must add 2 new messages to "conv_20251219"
--> BUT add_conversation will OVERWRITE the first 2 messages!
-
-================================================================================
-SOLUTION: YOU NEED TO ADD A NEW TOOL
-================================================================================
-
-OPTION A (recommended): append_to_conversation
-----------------------------------------------
-A new tool that adds messages without overwriting (a sketch; the merge and
-upsert details would need to match the rest of mcp_ikario_memory.py):
-
-```python
-async def append_to_conversation(
-    self,
-    conversation_id: str,
-    new_messages: List[Dict[str, str]]
-) -> str:
-    """
-    Append new messages to an existing conversation.
-    """
-    # 1. Fetch the existing conversation document
-    existing = self.conversations.get(ids=[conversation_id])
-
-    # 2. Extract the old text (or store the messages elsewhere)
-    old_text = existing["documents"][0] if existing["documents"] else ""
-
-    # 3. Merge old text + new_messages
-    appended = "\n".join(f"{m['role']}: {m['content']}" for m in new_messages)
-    full_text = f"{old_text}\n{appended}".strip()
-
-    # 4. Re-create the main document with all the messages (upsert replaces in place)
-    self.conversations.upsert(documents=[full_text], ids=[conversation_id])
-
-    # 5. Also add the new individual messages (ids conversation_id + "_msg_NNN")
-    return conversation_id
-```
-
-OPTION B: update_conversation (complete replacement)
----------------------------------------------------
-Similar to add_conversation but with an upsert:
-
-```python
-async def update_conversation(
-    self,
-    conversation_id: str,
-    all_messages: List[Dict[str, str]],
-    ...
-) -> str:
-    """
-    Completely replace an existing conversation.
-    """
-    # Delete the old documents
-    self.conversations.delete(ids=[conversation_id])
-
-    # Add the new version
-    # (same code as add_conversation)
-```
-
-OPTION C: modify add_conversation
------------------------------------
-Add detection logic:
-
-```python
-async def add_conversation(...):
-    # Check whether conversation_id already exists
-    try:
-        existing = self.conversations.get(ids=[conversation_id])
-        if existing["ids"]:
-            ...  # do an append
-    except Exception:
-        ...  # create a new conversation
-```
-
-================================================================================
-MY RECOMMENDATION
-================================================================================
-
-USE OPTION A: append_to_conversation
-
-WHY:
-- Clear semantics: "append" = add without overwriting
-- Separation of responsibilities: add = creation, append = addition
-- Easier to debug
-- No "magic" (Option C would be too implicit)
-
-ikario_body BACKEND ARCHITECTURE:
--------------------------------
-POST /api/chat/message
--> the user sends a message
--> the LLM replies
--> after the full reply:
-   - If it is the first message of the session:
-     * call add_conversation(conversation_id, [user_msg, assistant_msg])
-   - If the conversation already exists:
-     * call append_to_conversation(conversation_id, [user_msg, assistant_msg])
-
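The add-vs-append dispatch described above can be sketched in Python (illustrative only: the real backend is JS, `append_to_conversation` is the proposed tool, and `mcp_call` stands in for mcpClient.callTool):

```python
def save_exchange(mcp_call, known_sessions, conversation_id, user_msg, assistant_msg):
    """First exchange of a session -> add_conversation; later ones -> append_to_conversation."""
    messages = [{"role": "user", "content": user_msg},
                {"role": "assistant", "content": assistant_msg}]
    if conversation_id not in known_sessions:
        known_sessions.add(conversation_id)
        return mcp_call("add_conversation",
                        {"conversation_id": conversation_id, "messages": messages})
    return mcp_call("append_to_conversation",
                    {"conversation_id": conversation_id, "new_messages": messages})

# Stub MCP client that records the tools invoked
log = []
def fake_mcp(tool, args):
    log.append(tool)
    return {"ok": True}

sessions = set()
save_exchange(fake_mcp, sessions, "conv_20251219", "Hello", "Hi!")
save_exchange(fake_mcp, sessions, "conv_20251219", "How are you?", "Fine, thanks!")
```

The second exchange routes to the append tool, so the first two messages are never overwritten.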
-ALTERNATIVE SIMPLE (sans append):
----------------------------------
-Si tu ne veux pas modifier ikario_rag:
-- Backend garde TOUS les messages de la session en memoire
-- Appelle add_conversation SEULEMENT a la fin de la session (quand user ferme le chat)
-- Parametres: conversation_id + TOUS les messages accumules
-
-MAIS:
-- Risque de perte si crash avant la fin
-- Pas de recherche en temps reel pendant la conversation
-- Moins robuste
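The simple alternative can be sketched as a minimal in-memory accumulator (illustrative only: `SessionStore` and the `on_close` callback are hypothetical names, standing in for the single add_conversation call at session end):

```python
from typing import Any, Callable, Dict, List

class SessionStore:
    """Accumulates a session's messages in memory and saves once at the end.

    A crash before close() loses everything, which is exactly the risk
    described above.
    """

    def __init__(self, on_close: Callable[[str, List[Dict[str, Any]]], None]):
        self._messages: Dict[str, List[Dict[str, Any]]] = {}
        self._on_close = on_close

    def record(self, conversation_id: str, message: Dict[str, Any]) -> None:
        # Nothing is persisted here; messages only live in memory.
        self._messages.setdefault(conversation_id, []).append(message)

    def close(self, conversation_id: str) -> List[Dict[str, Any]]:
        # The ONLY persistence point: everything accumulated is flushed at once.
        messages = self._messages.pop(conversation_id, [])
        self._on_close(conversation_id, messages)
        return messages
```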
-
-================================================================================
-DECISION REQUIRED
-================================================================================
-
-You must choose:
-
-1. ADD append_to_conversation to ikario_rag
-   - I modify mcp_ikario_memory.py (in SynologyDrive)
-   - I add the new tool to the MCP server
-   - Then I update the UI spec
-
-2. USE THE SIMPLE ALTERNATIVE (save at the end of the session)
-   - No modification to ikario_rag
-   - The backend accumulates messages in memory
-   - Full save at the end
-
-3. MODIFY add_conversation (Option C)
-   - Add auto-detect + append logic
-   - Less explicit, but simpler on the client side
-
-Which option do you prefer?
-
-================================================================================
-CRITICAL QUESTION: OVERWRITING AND EMBEDDINGS
-================================================================================
-Date: December 19, 2025 - 23:35
-
-You ask: "Does overwriting also delete the old embeddings?"
-
-SHORT ANSWER: NO, and it is even WORSE than I thought!
-
-DETAILED ANALYSIS:
------------------
-
-Reminder of the add_conversation architecture:
-
-1. MAIN DOCUMENT (lines 160-164):
-   ID = conversation_id (e.g. "conv_20251219_1430")
-   Content = the complete conversation (all messages concatenated)
-
-2. INDIVIDUAL MESSAGES (lines 166-187):
-   IDs = conversation_id + "_msg_001", "_msg_002", etc.
-   Content = each message, with its own embedding
-
-PROBLEMATIC SCENARIO:
----------------------
-
-First call:
-add_conversation(conversation_id="conv_123", messages=[msg1, msg2])
-
-ChromaDB contains:
-- conv_123 (main document, embedding of "msg1 + msg2")
-- conv_123_msg_001 (msg1, individual embedding)
-- conv_123_msg_002 (msg2, individual embedding)
-
-Second call:
-add_conversation(conversation_id="conv_123", messages=[msg1, msg2, msg3, msg4])
-
-WHAT HAPPENS?
-
-1. Main document conv_123:
-   - OVERWRITTEN (new embedding for "msg1 + msg2 + msg3 + msg4")
-   - Old embedding lost
-
-2. Individual messages:
-   - conv_123_msg_001 already exists -> OVERWRITTEN (new embedding for msg1)
-   - conv_123_msg_002 already exists -> OVERWRITTEN (new embedding for msg2)
-   - conv_123_msg_003 new -> CREATED
-   - conv_123_msg_004 new -> CREATED
-
-RESULT:
--------
-- Old embeddings are OVERWRITTEN (not deleted, but replaced)
-- NO pollution if the messages are identical
-- BUT if the messages change = incorrect embeddings
-
-WORST-CASE SCENARIO:
--------------
-If the backend accumulates messages incorrectly:
-
-First call: [msg1, msg2]
-Second call: [msg3, msg4] <-- FORGETS msg1 and msg2!
-
-ChromaDB contains:
-- conv_123 (embedding of "msg3 + msg4") <-- WRONG!
-- conv_123_msg_001 (embedding of msg3) <-- WRONG ID!
-- conv_123_msg_002 (embedding of msg4) <-- WRONG ID!
-
-The old msg_001 and msg_002 (msg1 and msg2) are LOST.
-
-CONCLUSION:
-----------
-Overwriting:
-- REPLACES the embeddings (no clean deletion)
-- REQUIRES the backend to send ALL messages every time
-- RISKS data loss if the backend gets it wrong
-
-That is why append_to_conversation is NECESSARY!
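The worst-case scenario can be reproduced with a tiny dict-backed stand-in for the collection (illustrative only; a real ChromaDB collection also stores embeddings, but the ID-collision behavior is the same):

```python
# Minimal stand-in for an upsert-style collection keyed by document ID.
# It shows why re-sending only the NEW messages under the same IDs
# silently replaces the old content.
store: dict = {}

def add_conversation(conversation_id: str, messages: list) -> None:
    # Main document: same ID on every call -> silently replaced.
    store[conversation_id] = " + ".join(messages)
    # Individual messages: the sequence restarts at 1 on every call.
    for i, msg in enumerate(messages):
        store[f"{conversation_id}_msg_{str(i + 1).zfill(3)}"] = msg

add_conversation("conv_123", ["msg1", "msg2"])
# Buggy backend: forgets msg1/msg2 and sends only the new messages.
add_conversation("conv_123", ["msg3", "msg4"])
```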
-
-================================================================================
-WHY append_to_conversation IS INDISPENSABLE
-================================================================================
-
-With append_to_conversation:
-
-First call:
-add_conversation(conversation_id="conv_123", messages=[msg1, msg2])
-
-ChromaDB:
-- conv_123 (2 messages)
-- conv_123_msg_001, conv_123_msg_002
-
-Second call:
-append_to_conversation(conversation_id="conv_123", new_messages=[msg3, msg4])
-
-Internal logic:
-1. GET the existing conversation "conv_123"
-2. Extract metadata: message_count = 2
-3. Calculate next sequence = 3
-4. Update the main document:
-   - DELETE conv_123
-   - ADD conv_123 (new embedding "msg1 + msg2 + msg3 + msg4")
-5. Add the new individual messages:
-   - conv_123_msg_003 (msg3)
-   - conv_123_msg_004 (msg4)
-
-RESULT:
-- Old individual embeddings PRESERVED (msg_001, msg_002)
-- New main embedding CORRECT (4 messages)
-- No data loss
-- Correct sequence
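Steps 2, 3 and 5 boil down to sequence bookkeeping; the ID scheme can be sketched as follows (the helper name is illustrative):

```python
def next_message_ids(conversation_id: str, current_count: int, n_new: int) -> list:
    """Zero-padded IDs for the next n_new messages, continuing the sequence.

    With current_count=2 and n_new=2, the next sequence starts at 3, so the
    existing _msg_001/_msg_002 documents are never touched.
    """
    next_sequence = current_count + 1
    return [
        f"{conversation_id}_msg_{str(next_sequence + i).zfill(3)}"
        for i in range(n_new)
    ]
```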
-
-================================================================================
-IMPLEMENTATION OF append_to_conversation (SKETCH)
-================================================================================
-
-```python
-async def append_to_conversation(
-    self,
-    conversation_id: str,
-    new_messages: List[Dict[str, str]],
-    update_context: Optional[Dict[str, Any]] = None
-) -> str:
-    """
-    Adds new messages to an existing conversation
-
-    Args:
-        conversation_id: ID of the existing conversation
-        new_messages: New messages to add
-        update_context: Metadata to update (optional)
-
-    Returns:
-        Confirmation message
-    """
-    # 1. CHECK THAT THE CONVERSATION EXISTS
-    try:
-        existing = self.conversations.get(ids=[conversation_id])
-    except Exception as e:
-        raise ValueError(f"Conversation {conversation_id} not found") from e
-
-    if not existing['documents'] or len(existing['documents']) == 0:
-        raise ValueError(f"Conversation {conversation_id} not found")
-
-    # 2. EXTRACT THE EXISTING METADATA
-    existing_metadata = existing['metadatas'][0] if existing['metadatas'] else {}
-    current_message_count = int(existing_metadata.get('message_count', 0))
-
-    # 3. COMPUTE THE NEXT SEQUENCE NUMBER
-    next_sequence = current_message_count + 1
-
-    # 4. BUILD THE NEW FULL TEXT
-    # Retrieve the old text
-    old_full_text = existing['documents'][0]
-
-    # Append the new messages
-    new_text_parts = []
-    for msg in new_messages:
-        author = msg.get('author', 'unknown')
-        content = msg.get('content', '')
-        new_text_parts.append(f"{author}: {content}")
-
-    new_text = "\n".join(new_text_parts)
-    updated_full_text = old_full_text + "\n" + new_text
-
-    # 5. UPDATE THE METADATA
-    updated_metadata = existing_metadata.copy()
-    updated_metadata['message_count'] = str(current_message_count + len(new_messages))
-
-    # Merge update_context if provided
-    if update_context:
-        for key, value in update_context.items():
-            if isinstance(value, list):
-                updated_metadata[key] = ", ".join(str(v) for v in value)
-            elif isinstance(value, dict):
-                updated_metadata[key] = json.dumps(value)
-            else:
-                updated_metadata[key] = str(value)
-
-    # 6. DELETE THE OLD MAIN DOCUMENT
-    self.conversations.delete(ids=[conversation_id])
-
-    # 7. ADD THE NEW MAIN DOCUMENT
-    self.conversations.add(
-        documents=[updated_full_text],
-        metadatas=[updated_metadata],
-        ids=[conversation_id]
-    )
-
-    # 8. ADD THE NEW INDIVIDUAL MESSAGES
-    for i, msg in enumerate(new_messages):
-        msg_id = f"{conversation_id}_msg_{str(next_sequence + i).zfill(3)}"
-        msg_content = msg.get('content', '')
-        msg_author = msg.get('author', 'unknown')
-        msg_timestamp = msg.get('timestamp', '')
-
-        msg_metadata = {
-            "conversation_id": conversation_id,
-            "message_type": "individual_message",
-            "author": msg_author,
-            "timestamp": msg_timestamp,
-            "sequence": str(next_sequence + i)
-        }
-
-        self.conversations.add(
-            documents=[msg_content],
-            metadatas=[msg_metadata],
-            ids=[msg_id]
-        )
-
-    return f"Conversation {conversation_id} updated: added {len(new_messages)} messages (total: {updated_metadata['message_count']})"
-```
-
-ADVANTAGES:
-- Preserves the old individual embeddings
-- Updates the main document correctly
-- Handles the sequence automatically
-- No risk of data loss
-
-================================================================================
-FINAL DECISION REQUIRED
-================================================================================
-
-Now that you understand the risks of overwriting:
-
-OPTION 1: I add append_to_conversation to ikario_rag
-   - Robust implementation (as above)
-   - No risk of losing embeddings
-   - Clean architecture
-
-OPTION 2: Simple alternative (backend accumulation)
-   - The backend keeps all messages in memory
-   - Calls add_conversation a single time at the end
-   - RISK: data loss on crash
-   - RISK: no real-time search
-
-OPTION 3: Modify add_conversation to auto-append
-   - Add the logic above to add_conversation
-   - More transparent for the client
-   - But the semantics are less clear
-
-I STRONGLY recommend Option 1.
-
-Which option do you choose?
-
-================================================================================
-OPTION 1 CHOSEN: append_to_conversation
-================================================================================
-Date: December 19, 2025 - 23:40
-
-You choose Option 1, with a crucial question:
-"What if the conversation does not exist?"
-
-EXCELLENT QUESTION! There are 2 approaches:
-
-================================================================================
-APPROACH A: append_to_conversation WITH AUTO-CREATE (recommended)
-================================================================================
-
-append_to_conversation detects whether the conversation exists:
-- If it exists: appends
-- If it does not exist: creates the conversation (like add_conversation)
-
-ADVANTAGES:
-- Simplified backend (a single call, always the same)
-- No need to track whether this is the first message
-- Robust
-CODE:
-```python
-async def append_to_conversation(
-    self,
-    conversation_id: str,
-    new_messages: List[Dict[str, str]],
-    participants: Optional[List[str]] = None,
-    context: Optional[Dict[str, Any]] = None
-) -> str:
-    """
-    Adds messages to a conversation (or creates it if it does not exist)
-
-    Args:
-        conversation_id: Conversation ID
-        new_messages: Messages to add
-        participants: List of participants (required on creation)
-        context: Metadata (required on creation)
-    """
-    # 1. CHECK WHETHER THE CONVERSATION EXISTS
-    try:
-        existing = self.conversations.get(ids=[conversation_id])
-        conversation_exists = (
-            existing and
-            existing['documents'] and
-            len(existing['documents']) > 0
-        )
-    except Exception:
-        conversation_exists = False
-
-    # 2. IF IT DOES NOT EXIST: CREATE
-    if not conversation_exists:
-        if not participants or not context:
-            raise ValueError(
-                "participants and context required when creating new conversation"
-            )
-        return await self.add_conversation(
-            participants=participants,
-            messages=new_messages,
-            context=context,
-            conversation_id=conversation_id
-        )
-
-    # 3. IF IT EXISTS: APPEND
-    # [Append code as before...]
-    existing_metadata = existing['metadatas'][0]
-    current_message_count = int(existing_metadata.get('message_count', 0))
-    next_sequence = current_message_count + 1
-
-    old_full_text = existing['documents'][0]
-
-    new_text_parts = []
-    for msg in new_messages:
-        author = msg.get('author', 'unknown')
-        content = msg.get('content', '')
-        new_text_parts.append(f"{author}: {content}")
-
-    new_text = "\n".join(new_text_parts)
-    updated_full_text = old_full_text + "\n" + new_text
-
-    updated_metadata = existing_metadata.copy()
-    updated_metadata['message_count'] = str(current_message_count + len(new_messages))
-
-    if context:
-        for key, value in context.items():
-            if isinstance(value, list):
-                updated_metadata[key] = ", ".join(str(v) for v in value)
-            elif isinstance(value, dict):
-                updated_metadata[key] = json.dumps(value)
-            else:
-                updated_metadata[key] = str(value)
-
-    self.conversations.delete(ids=[conversation_id])
-
-    self.conversations.add(
-        documents=[updated_full_text],
-        metadatas=[updated_metadata],
-        ids=[conversation_id]
-    )
-
-    for i, msg in enumerate(new_messages):
-        msg_id = f"{conversation_id}_msg_{str(next_sequence + i).zfill(3)}"
-        msg_content = msg.get('content', '')
-        msg_author = msg.get('author', 'unknown')
-        msg_timestamp = msg.get('timestamp', '')
-
-        msg_metadata = {
-            "conversation_id": conversation_id,
-            "message_type": "individual_message",
-            "author": msg_author,
-            "timestamp": msg_timestamp,
-            "sequence": str(next_sequence + i)
-        }
-
-        self.conversations.add(
-            documents=[msg_content],
-            metadatas=[msg_metadata],
-            ids=[msg_id]
-        )
-
-    return f"Conversation {conversation_id} updated: added {len(new_messages)} messages (total: {updated_metadata['message_count']})"
-```
-
-BACKEND USAGE (ikario_body):
-```javascript
-// POST /api/chat/message
-app.post('/api/chat/message', async (req, res) => {
-  const { message, conversationId } = req.body;
-
-  // Generate conversation_id if first message
-  const convId = conversationId || `conv_${Date.now()}`;
-
-  // Get LLM response
-  const llmResponse = await callClaudeAPI(message);
-
-  // ALWAYS use append_to_conversation (handles creation automatically)
-  await mcpClient.callTool('append_to_conversation', {
-    conversation_id: convId,
-    new_messages: [
-      { author: 'user', content: message, timestamp: new Date().toISOString() },
-      { author: 'assistant', content: llmResponse, timestamp: new Date().toISOString() }
-    ],
-    participants: ['user', 'assistant'], // Required on the first call
-    context: {
-      category: 'chat',
-      date: new Date().toISOString()
-    }
-  });
-
-  res.json({ response: llmResponse, conversationId: convId });
-});
-```
-
-BACKEND SIMPLICITY:
-- Always the same call (append_to_conversation)
-- No if/else logic
-- The MCP server handles the complexity
-
-================================================================================
-APPROACH B: KEEP add_conversation AND append_to_conversation SEPARATE
-================================================================================
-
-append_to_conversation REJECTS if the conversation does not exist:
-- The backend must track whether this is the first message
-- Call add_conversation for creation
-- Call append_to_conversation for additions
-
-CODE for append_to_conversation (strict):
-```python
-async def append_to_conversation(
-    self,
-    conversation_id: str,
-    new_messages: List[Dict[str, str]]
-) -> str:
-    """
-    Adds messages to an EXISTING conversation
-    Raises an error if the conversation does not exist
-    """
-    # Check existence (the check is outside the try so its error is not re-wrapped)
-    try:
-        existing = self.conversations.get(ids=[conversation_id])
-    except Exception as e:
-        raise ValueError(f"Conversation {conversation_id} not found: {e}") from e
-
-    if not existing['documents'] or len(existing['documents']) == 0:
-        raise ValueError(f"Conversation {conversation_id} does not exist. Use add_conversation first.")
-
-    # [Rest of the append code...]
-```
-
-BACKEND USAGE (more complex):
-```javascript
-// POST /api/chat/message
-app.post('/api/chat/message', async (req, res) => {
-  const { message, conversationId, isFirstMessage } = req.body;
-
-  // Generate ID if new
-  const convId = conversationId || `conv_${Date.now()}`;
-
-  const llmResponse = await callClaudeAPI(message);
-
-  const messages = [
-    { author: 'user', content: message, timestamp: new Date().toISOString() },
-    { author: 'assistant', content: llmResponse, timestamp: new Date().toISOString() }
-  ];
-
-  // DIFFERENT depending on whether this is the first call
-  if (isFirstMessage || !conversationId) {
-    // First time: create
-    await mcpClient.callTool('add_conversation', {
-      conversation_id: convId,
-      participants: ['user', 'assistant'],
-      messages: messages,
-      context: { category: 'chat', date: new Date().toISOString() }
-    });
-  } else {
-    // Subsequent times: append
-    await mcpClient.callTool('append_to_conversation', {
-      conversation_id: convId,
-      new_messages: messages
-    });
-  }
-
-  res.json({ response: llmResponse, conversationId: convId });
-});
-```
-
-DISADVANTAGES:
-- More complex backend (if/else)
-- Must track isFirstMessage
-- Risk of errors if the detection is wrong
-
-================================================================================
-MY FINAL RECOMMENDATION
-================================================================================
-
-APPROACH A: append_to_conversation WITH AUTO-CREATE
-
-WHY:
-1. Simplified backend (a single call)
-2. Robust (no risk of forgetting add_conversation)
-3. Acceptable semantics (append = "add to", whether it exists or not)
-4. Smaller error surface
-
-IMPLEMENTATION:
-- I add append_to_conversation to mcp_ikario_memory.py
-- With detection + auto-create if it does not exist
-- I expose the tool in server.py
-- I update the UI spec to use this tool
-
-ALTERNATIVE:
-If you prefer strict semantics (Approach B), I can do that too.
-
-================================================================================
-NEXT STEP
-================================================================================
-
-Tell me:
-1. APPROACH A (auto-create) or APPROACH B (strict)?
-2. Once chosen, I will:
-   - Modify mcp_ikario_memory.py
-   - Modify server.py to expose the tool
-   - Test the implementation
-   - Update the UI spec
-   - Delete the old spec
-   - Delete the 15 old issues
-   - Create 15 new issues
-   - Launch the initializer agent again
-
-Which approach do you prefer?
-
-================================================================================
-QUESTION: IS THE LLM'S REASONING RECORDED?
-================================================================================
-Date: December 19, 2025 - 23:50
-
-You ask whether the following are recorded:
-1. User message
-2. LLM reasoning (thinking)
-3. LLM message (response)
-
-CURRENT ANSWER: NO, the LLM's reasoning is NOT recorded
-
-ANALYSIS OF THE CURRENT CODE:
-----------------------
-
-Message structure (line 113):
-```python
-messages: List[Dict[str, str]]
-# [{"author": "david", "content": "...", "timestamp": "14:30:05"}, ...]
-```
-
-Current fields:
-- author: "david" or "ikario"
-- content: The message content
-- timestamp: Timestamp
-
-There is NO "thinking" or "reflection" field.
-
-WHAT IS CURRENTLY RECORDED:
------------------------------------
-
-User message:
-{
-    "author": "user",
-    "content": "How do I do an API fetch?",
-    "timestamp": "14:30:00"
-}
-
-LLM message:
-{
-    "author": "assistant",
-    "content": "Here is how to do an API fetch: ...", <-- ONLY the final response
-    "timestamp": "14:30:05"
-}
-
-The internal reasoning (Extended Thinking) is NOT captured.
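Capturing the reasoning would mean separating it from the visible answer. With Extended Thinking, the Anthropic Messages API returns the response as a list of content blocks; a sketch of that separation, over plain dicts shaped like the documented `thinking`/`text` block types (treat the exact field names as an assumption to check against the API reference):

```python
def split_thinking(content_blocks: list) -> dict:
    """Separates Extended Thinking blocks from the visible response text.

    Assumes blocks shaped like {"type": "thinking", "thinking": "..."} and
    {"type": "text", "text": "..."}.
    """
    thinking_parts = [b["thinking"] for b in content_blocks if b.get("type") == "thinking"]
    text_parts = [b["text"] for b in content_blocks if b.get("type") == "text"]
    return {
        # None when the model produced no thinking blocks at all
        "thinking": "\n".join(thinking_parts) or None,
        "content": "\n".join(text_parts),
    }
```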
-
-================================================================================
-QUESTION: DO YOU WANT TO RECORD THE LLM'S REASONING?
-================================================================================
-
-With Extended Thinking, Claude generates:
-1. Thinking (internal reasoning)
-2. Response (the answer visible to the user)
-
-OPTION 1: RECORD ONLY THE RESPONSE (current behavior)
----------------------------------------------------
-LLM message in ChromaDB:
-{
-    "author": "assistant",
-    "content": "Here is how to do an API fetch: ..."
-}
-
-ADVANTAGES:
-- Simpler
-- Less data stored
-- Embeddings based on the useful content
-
-DISADVANTAGES:
-- Loss of the internal reasoning
-- Impossible to retrieve "how the LLM thought"
-
-OPTION 2: RECORD THINKING + RESPONSE (recommended)
------------------------------------------------------
-LLM message in ChromaDB:
-{
-    "author": "assistant",
-    "content": "Here is how to do an API fetch: ...",
-    "thinking": "The user asks... I need to explain... [full reasoning]"
-}
-
-OR (separate):
-Thinking message:
-{
-    "author": "assistant",
-    "message_type": "thinking",
-    "content": "[internal reasoning]"
-}
-
-Response message:
-{
-    "author": "assistant",
-    "message_type": "response",
-    "content": "Here is how to..."
-}
-
-ADVANTAGES:
-- Captures the complete reasoning
-- Semantic search over the reasoning
-- Understand how the thinking evolved
-- Full traceability
-
-DISADVANTAGES:
-- More data stored
-- More complex structure
-
-OPTION 3: THINKING SEPARATE (in thoughts, not conversations)
-------------------------------------------------------------
-Conversation:
-- User message
-- LLM message (response only)
-
-Thoughts (separate collection):
-- The LLM's thinking stored as a "thought"
-
-ADVANTAGES:
-- Clear separation: conversations = dialogue, thoughts = reflections
-- Consistent with the current architecture (2 collections)
-
-DISADVANTAGES:
-- Loss of the direct link to the conversation
-- More complex to retrieve
-
-================================================================================
-MY RECOMMENDATION
-================================================================================
-
-OPTION 2 (RECORD THINKING + RESPONSE in the same message)
-
-Proposed structure:
-```python
-messages: List[Dict[str, Any]]  # Change: Any instead of str
-
-# User message (unchanged)
-{
-    "author": "user",
-    "content": "How do I do an API fetch?",
-    "timestamp": "14:30:00"
-}
-
-# LLM message (new format)
-{
-    "author": "assistant",
-    "content": "Here is how to do an API fetch: ...",
-    "thinking": "[The LLM's internal reasoning...]",  # NEW
-    "timestamp": "14:30:05"
-}
-```
-
-IMPLEMENTATION:
-- Modify add_conversation to accept an optional "thinking" field
-- Store thinking in the individual message's metadata
-- Main document: include the thinking or not? (to be decided)
-
-FOR THE MAIN DOCUMENT:
-OPTION A: Include thinking
-   "user: How do I...\nassistant (thinking): [reasoning]\nassistant: Here is how..."
-
-OPTION B: Exclude thinking (visible dialogue only)
-   "user: How do I...\nassistant: Here is how..."
-
-I recommend OPTION A (include thinking in the main document).
-
-WHY:
-- Richer semantic search
-- Retrieve "that time the LLM reasoned about X"
-- Full traceability
-
-================================================================================
-DECISION REQUIRED
-================================================================================
-
-Before starting to develop append_to_conversation, you must decide:
-
-1. RECORD THE LLM'S REASONING?
-   a) YES - Add a "thinking" field to the messages
-   b) NO - Keep only "content" (final response)
-
-2. IF YES, WHAT FORMAT?
-   a) Thinking in the same message (recommended)
-   b) Thinking as a separate message
-   c) Thinking in the thoughts collection (separate)
-
-3. IF YES, WHAT ABOUT THE MAIN DOCUMENT?
-   a) Include thinking in the embedding
-   b) Exclude thinking (dialogue only)
-
-My recommendations:
-1. a) YES
-2. a) Same message
-3. a) Include thinking
-
-What do you think?
-
-================================================================================
-DECISION CONFIRMED: OPTION 2 (THINKING IN THE MESSAGE)
-================================================================================
-Date: December 19, 2025 - 23:55
-
-You confirm:
-- YES to recording the thinking
-- Option 2: Thinking in the same message (part of the conversation)
-- NOT a separate thought in thoughts
-
-CORRECT! The thinking is the LLM's reasoning DURING the conversation.
-
-================================================================================
-DETAILED PLAN: INTEGRATING THINKING INTO CONVERSATIONS
-================================================================================
-
-PHASE 1: ANALYSIS OF THE REQUIRED CHANGES
-----------------------------------------------
-
-Files to modify:
-1. mcp_ikario_memory.py (C:/Users/david/SynologyDrive/ikario/ikario_rag/)
-   - Modify add_conversation
-   - Add append_to_conversation
-
-2. server.py (C:/Users/david/SynologyDrive/ikario/ikario_rag/)
-   - Expose append_to_conversation as an MCP tool
-
-3. prompts/app_spec_ikario_rag_UI.txt (C:/GitHub/Linear_coding/)
-   - Update it to use append_to_conversation
-   - Document the thinking field
-
-PHASE 2: DATA STRUCTURES
-------------------------------
-
-NEW MESSAGE FORMAT:
-
-User message (unchanged):
-{
-    "author": "user",
-    "content": "How do I do an API fetch?",
-    "timestamp": "2025-12-19T14:30:00"
-}
-
-LLM message (NEW, with thinking):
-{
-    "author": "assistant",
-    "content": "Here is how to do an API fetch...",
-    "thinking": "The user is asking for an explanation of the fetch API. I need to explain...",  # OPTIONAL
-    "timestamp": "2025-12-19T14:30:05"
-}
-
-STORAGE IN CHROMADB:
-
-1. MAIN DOCUMENT (conversation_id):
-   Documents: Full text with thinking included
-   Format:
-   ```
-   user: How do I do an API fetch?
-   assistant (thinking): The user is asking for an explanation...
-   assistant: Here is how to do an API fetch...
-   ```
-
-2. INDIVIDUAL MESSAGES (conversation_id_msg_001, etc.):
-   Documents: Message content
-   Metadata:
-   - author: "user" or "assistant"
-   - timestamp: "..."
-   - sequence: "1", "2", etc.
-   - thinking: "[thinking text]" (if present, optional)
-   - message_type: "individual_message"
-
-DECISION: INCLUDE THINKING IN THE MAIN DOCUMENT
-
-WHY:
-- Richer semantic search
-- "Find the conversation where the LLM reasoned about React performance"
-- Full traceability of the reasoning
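The main-document format described in Phase 2 can be produced by a small helper (a sketch; the function name is illustrative, the line format matches the plan):

```python
def build_full_text(messages: list) -> str:
    """Builds the main document text, interleaving optional thinking lines.

    Each message is a dict with 'author', 'content' and an optional
    'thinking' field, as in the new message format.
    """
    parts = []
    for msg in messages:
        author = msg.get("author", "unknown")
        thinking = msg.get("thinking")
        # The thinking line, when present, precedes the visible message.
        if thinking:
            parts.append(f"{author} (thinking): {thinking}")
        parts.append(f"{author}: {msg.get('content', '')}")
    return "\n".join(parts)
```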
-
-PHASE 3: CHANGES TO add_conversation
---------------------------------------------
-
-Required changes:
-
-1. SIGNATURE (lines 100-106):
-   BEFORE:
-   ```python
-   async def add_conversation(
-       self,
-       participants: List[str],
-       messages: List[Dict[str, str]],  # <-- str
-       context: Dict[str, Any],
-       conversation_id: Optional[str] = None
-   ) -> str:
-   ```
-
-   AFTER:
-   ```python
-   async def add_conversation(
-       self,
-       participants: List[str],
-       messages: List[Dict[str, Any]],  # <-- Any, to support thinking
-       context: Dict[str, Any],
-       conversation_id: Optional[str] = None
-   ) -> str:
-   ```
-
-2. MAIN DOCUMENT (lines 131-138):
-   BEFORE:
-   ```python
-   full_text_parts = []
-   for msg in messages:
-       author = msg.get('author', 'unknown')
-       content = msg.get('content', '')
-       full_text_parts.append(f"{author}: {content}")
-   ```
-
-   AFTER:
-   ```python
-   full_text_parts = []
-   for msg in messages:
-       author = msg.get('author', 'unknown')
-       content = msg.get('content', '')
-       thinking = msg.get('thinking', None)
-
-       # If thinking is present, include it in the main document
-       if thinking:
-           full_text_parts.append(f"{author} (thinking): {thinking}")
-
-       full_text_parts.append(f"{author}: {content}")
-   ```
-
-3. INDIVIDUAL MESSAGES (lines 166-187):
-   BEFORE:
-   ```python
-   for i, msg in enumerate(messages):
-       msg_id = f"{conversation_id}_msg_{str(i+1).zfill(3)}"
-       msg_content = msg.get('content', '')
-       msg_author = msg.get('author', 'unknown')
-       msg_timestamp = msg.get('timestamp', '')
-
-       msg_metadata = {
-           "conversation_id": conversation_id,
-           "message_type": "individual_message",
-           "author": msg_author,
-           "timestamp": msg_timestamp,
-           "sequence": str(i+1)
-       }
-   ```
-
-   AFTER:
-   ```python
-   for i, msg in enumerate(messages):
-       msg_id = f"{conversation_id}_msg_{str(i+1).zfill(3)}"
-       msg_content = msg.get('content', '')
-       msg_author = msg.get('author', 'unknown')
-       msg_timestamp = msg.get('timestamp', '')
-       msg_thinking = msg.get('thinking', None)  # NEW
-
-       msg_metadata = {
-           "conversation_id": conversation_id,
-           "message_type": "individual_message",
-           "author": msg_author,
-           "timestamp": msg_timestamp,
-           "sequence": str(i+1)
-       }
-
-       # Add thinking to the metadata if present
-       if msg_thinking:
-           msg_metadata["thinking"] = msg_thinking  # NEW
-   ```
-
-PHASE 4: IMPLEMENTATION OF append_to_conversation
-----------------------------------------------
-
-Complete new function:
-
-```python
-async def append_to_conversation(
-    self,
-    conversation_id: str,
-    new_messages: List[Dict[str, Any]],
-    participants: Optional[List[str]] = None,
-    context: Optional[Dict[str, Any]] = None
-) -> str:
-    """
-    Adds messages to a conversation (or creates it if it does not exist)
-
-    Supports the optional 'thinking' field in messages.
-
-    Args:
-        conversation_id: Conversation ID
-        new_messages: Messages to add
-            Format: [
-                {"author": "user", "content": "...", "timestamp": "..."},
-                {"author": "assistant", "content": "...", "thinking": "...", "timestamp": "..."}
-            ]
-        participants: List of participants (required on creation)
-        context: Metadata (required on creation)
-
-    Returns:
-        Confirmation message
-    """
-    # 1. CHECK WHETHER THE CONVERSATION EXISTS
-    try:
-        existing = self.conversations.get(ids=[conversation_id])
-        conversation_exists = (
-            existing and
-            existing['documents'] and
-            len(existing['documents']) > 0
-        )
-    except Exception:
-        conversation_exists = False
-
-    # 2. IF IT DOES NOT EXIST: CREATE (delegate to add_conversation)
-    if not conversation_exists:
-        if not participants or not context:
-            raise ValueError(
-                "participants and context required when creating new conversation"
-            )
-        return await self.add_conversation(
-            participants=participants,
-            messages=new_messages,
-            context=context,
-            conversation_id=conversation_id
-        )
-
-    # 3. IF IT EXISTS: APPEND
-
-    # 3a. Extract the existing metadata
-    existing_metadata = existing['metadatas'][0]
-    current_message_count = int(existing_metadata.get('message_count', 0))
-    next_sequence = current_message_count + 1
-
-    # 3b. Retrieve the old full text
-    old_full_text = existing['documents'][0]
-
-    # 3c. Build the new text, with thinking if present
-    new_text_parts = []
-    for msg in new_messages:
-        author = msg.get('author', 'unknown')
-        content = msg.get('content', '')
-        thinking = msg.get('thinking', None)
-
-        # Include thinking in the main document if present
-        if thinking:
-            new_text_parts.append(f"{author} (thinking): {thinking}")
-
-        new_text_parts.append(f"{author}: {content}")
-
-    new_text = "\n".join(new_text_parts)
-    updated_full_text = old_full_text + "\n" + new_text
-
-    # 3d. Update the metadata
-    updated_metadata = existing_metadata.copy()
-    updated_metadata['message_count'] = str(current_message_count + len(new_messages))
-
-    # Merge context if provided
-    if context:
-        for key, value in context.items():
-            if isinstance(value, list):
-                updated_metadata[key] = ", ".join(str(v) for v in value)
-            elif isinstance(value, dict):
-                updated_metadata[key] = json.dumps(value)
-            else:
-                updated_metadata[key] = str(value)
-
-    # 3e. Delete the old main document
-    self.conversations.delete(ids=[conversation_id])
-
-    # 3f. Add the new main document
-    self.conversations.add(
-        documents=[updated_full_text],
-        metadatas=[updated_metadata],
-        ids=[conversation_id]
-    )
-
-    # 3g. Add the new individual messages
-    for i, msg in enumerate(new_messages):
-        msg_id = f"{conversation_id}_msg_{str(next_sequence + i).zfill(3)}"
-        msg_content = msg.get('content', '')
-        msg_author = msg.get('author', 'unknown')
-        msg_timestamp = msg.get('timestamp', '')
-        msg_thinking = msg.get('thinking', None)
-
-        msg_metadata = {
-            "conversation_id": conversation_id,
-            "message_type": "individual_message",
-            "author": msg_author,
-            "timestamp": msg_timestamp,
-            "sequence": str(next_sequence + i)
-        }
-
-        # Add thinking to the metadata if present
-        if msg_thinking:
-            msg_metadata["thinking"] = msg_thinking
-
-        # Generate the embedding for this message (content only, not thinking)
-        self.conversations.add(
-            documents=[msg_content],
-            metadatas=[msg_metadata],
-            ids=[msg_id]
-        )
-
-    return f"Conversation {conversation_id} updated: added {len(new_messages)} messages (total: {updated_metadata['message_count']})"
-```
-
-PHASE 5: EXPOSITION DANS server.py
-----------------------------------
-
-Ajouter l'outil MCP pour append_to_conversation:
-
-```python
-@server.call_tool()
-async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
- """Handle tool calls"""
-
- # ... (outils existants: add_thought, add_conversation, etc.)
-
- # NOUVEAU: append_to_conversation
- elif name == "append_to_conversation":
- result = await memory.append_to_conversation(
- conversation_id=arguments["conversation_id"],
- new_messages=arguments["new_messages"],
- participants=arguments.get("participants"),
- context=arguments.get("context")
- )
- return [types.TextContent(type="text", text=result)]
-```
-
-Then add the tool definition:
-
-```python
-@server.list_tools()
-async def list_tools() -> list[types.Tool]:
- """List available tools"""
- return [
-        # ... (existing tools)
-
- types.Tool(
- name="append_to_conversation",
- description=(
-            "Appends messages to an existing conversation (or creates it if needed). "
-            "Supports an optional 'thinking' field to capture the LLM's reasoning. "
-            "If the conversation does not exist, it is created automatically."
- ),
- inputSchema={
- "type": "object",
- "properties": {
- "conversation_id": {
- "type": "string",
-                    "description": "Conversation ID"
- },
- "new_messages": {
- "type": "array",
-                    "description": "New messages to append",
- "items": {
- "type": "object",
- "properties": {
- "author": {"type": "string"},
- "content": {"type": "string"},
-                            "thinking": {"type": "string", "description": "Internal LLM reasoning (optional)"},
- "timestamp": {"type": "string"}
- },
- "required": ["author", "content", "timestamp"]
- }
- },
- "participants": {
- "type": "array",
- "items": {"type": "string"},
-                    "description": "List of participants (required when creating)"
- },
- "context": {
- "type": "object",
-                    "description": "Conversation metadata (required when creating)"
- }
- },
- "required": ["conversation_id", "new_messages"]
- }
- )
- ]
-```
-
-PHASE 6: TESTS TO RUN
----------------------
-
-Test 1: Create a new conversation WITHOUT thinking
-```python
-await append_to_conversation(
- conversation_id="conv_test_1",
- new_messages=[
- {"author": "user", "content": "Bonjour", "timestamp": "14:30:00"},
- {"author": "assistant", "content": "Salut!", "timestamp": "14:30:05"}
- ],
- participants=["user", "assistant"],
- context={"category": "test"}
-)
-```
-
-Test 2: Création nouvelle conversation AVEC thinking
-```python
-await append_to_conversation(
- conversation_id="conv_test_2",
- new_messages=[
- {"author": "user", "content": "Comment faire un fetch?", "timestamp": "14:30:00"},
- {
- "author": "assistant",
- "content": "Voici comment...",
- "thinking": "L'utilisateur demande une explication sur fetch API...",
- "timestamp": "14:30:05"
- }
- ],
- participants=["user", "assistant"],
- context={"category": "test"}
-)
-```
-
-Test 3: Append to an existing conversation WITHOUT thinking
-```python
-await append_to_conversation(
- conversation_id="conv_test_1",
- new_messages=[
- {"author": "user", "content": "Merci!", "timestamp": "14:31:00"},
- {"author": "assistant", "content": "De rien!", "timestamp": "14:31:02"}
- ]
-)
-```
-
-Test 4: Append to an existing conversation WITH thinking
-```python
-await append_to_conversation(
- conversation_id="conv_test_2",
- new_messages=[
- {"author": "user", "content": "Et avec async/await?", "timestamp": "14:31:00"},
- {
- "author": "assistant",
- "content": "Avec async/await...",
- "thinking": "Il veut comprendre async/await avec fetch...",
- "timestamp": "14:31:05"
- }
- ]
-)
-```
-
-Test 5: Vérifier embeddings et métadonnées
-```python
-# Récupérer la conversation
-result = await search_conversations("fetch API", n_results=1)
-
-# Vérifier:
-# - Document principal contient thinking
-# - Messages individuels ont métadonnée "thinking"
-# - Embeddings corrects
-```
-
-PHASE 7: UPDATE THE UI SPEC
----------------------------
-
-In prompts/app_spec_ikario_rag_UI.txt:
-
-1. Replace add_conversation with append_to_conversation in the examples
-
-2. Document the thinking field:
-```
-MCP TOOL: append_to_conversation
-- Parameters:
-  * conversation_id: session ID
-  * new_messages: array of messages
-    - author: "user" or "assistant"
-    - content: message content
-    - thinking: LLM reasoning (OPTIONAL)
-    - timestamp: ISO date
-  * participants: ["user", "assistant"] (required if new conversation)
-  * context: {category, date, ...} (required if new conversation)
-```
-
-3. Backend usage example:
-```javascript
-// POST /api/chat/message
-const llmResponse = await callClaudeAPI(userMessage, { extended_thinking: true });
-
-await mcpClient.callTool('append_to_conversation', {
- conversation_id: conversationId,
- new_messages: [
- { author: 'user', content: userMessage, timestamp: new Date().toISOString() },
- {
- author: 'assistant',
- content: llmResponse.content,
-      thinking: llmResponse.thinking, // include the thinking when Extended Thinking is enabled
- timestamp: new Date().toISOString()
- }
- ],
- participants: ['user', 'assistant'],
- context: { category: 'chat', date: new Date().toISOString() }
-});
-```
-
-================================================================================
-PLAN SUMMARY
-================================================================================
-
-EXECUTION ORDER:
-
-1. [IN PROGRESS] Write this detailed plan ✓
-2. Create a backup commit in ikario_rag
-3. Modify add_conversation (thinking support)
-4. Implement append_to_conversation (with thinking)
-5. Modify server.py (expose append_to_conversation)
-6. Run the 5 test scenarios
-7. Update the UI spec
-8. Final commit
-9. Delete the old spec + old issues
-10. Create 15 new issues
-11. Launch the initializer-bis agent
-
-MODIFIED FILES:
-- C:/Users/david/SynologyDrive/ikario/ikario_rag/mcp_ikario_memory.py
-- C:/Users/david/SynologyDrive/ikario/ikario_rag/server.py
-- C:/GitHub/Linear_coding/prompts/app_spec_ikario_rag_UI.txt
-
-NEW MCP TOOLS:
-- append_to_conversation (8th tool)
-
-NEW FORMAT:
-- Messages with an optional "thinking" field
-- The main document includes the thinking text
-- Per-message metadata includes thinking
-
-================================================================================
-NEXT STEP
-================================================================================
-
-Does this plan work for you?
-
-If YES:
-1. I create the backup commit
-2. I start the modifications
-
-If NO:
-- Tell me what to change in the plan
-
-================================================================================
-IMPLEMENTATION COMPLETE!
-================================================================================
-Date: December 20, 2025 - 00:15
-
-EVERYTHING IS DONE AND TESTED SUCCESSFULLY!
-
-================================================================================
-SUMMARY OF CHANGES
-================================================================================
-
-MODIFIED FILES:
-1. mcp_ikario_memory.py (C:/Users/david/SynologyDrive/ikario/ikario_rag/)
-   - Line 103: add_conversation signature changed (Dict[str, Any])
-   - Lines 131-143: main document now includes thinking
-   - Lines 172-200: individual messages store thinking in metadata
-   - Lines 202-329: new append_to_conversation function (129 lines)
-
-2. server.py (C:/Users/david/SynologyDrive/ikario/ikario_rag/)
-   - Lines 173-272: append_to_conversation tool added (MCP definition)
-   - Line 195: add_conversation tool updated (thinking in schema)
-   - Lines 427-438: append_to_conversation handler added
-
-3. test_append_conversation.py (NEW - tests)
-   - 6 automated tests
-   - All pass successfully
-
-================================================================================
-COMMITS CREATED
-================================================================================
-
-Commit 1 (backup): 55d905b
-"Backup before adding append_to_conversation with thinking support"
-
-Commit 2 (implementation): cba84fe
-"Add append_to_conversation with thinking support (8th MCP tool)"
-
-================================================================================
-TESTS PASSED (6/6)
-================================================================================
-
-Test 1: Create conversation WITHOUT thinking
-[OK] Conversation added: test_conv_1 (2 messages)
-
-Test 2: Create conversation WITH thinking
-[OK] Conversation added: test_conv_2 (2 messages)
-
-Test 3: Append to conversation WITHOUT thinking
-[OK] Conversation test_conv_1 updated: added 2 messages (total: 4)
-
-Test 4: Append to conversation WITH thinking
-[OK] Conversation test_conv_2 updated: added 2 messages (total: 4)
-
-Test 5: Semantic search including thinking
-[OK] Found 1 conversation
-     Relevance: 0.481
-     Thinking visible in the main document!
-
-Test 6: Metadata verification
-[OK] Thinking metadata is present!
-     Stored in the individual messages
-
-================================================================================
-NEW MESSAGE FORMAT
-================================================================================
-
-User message (unchanged):
-{
-  "author": "user",
-  "content": "Comment faire un fetch API?",
-  "timestamp": "2025-12-20T00:10:00"
-}
-
-LLM message (NEW, with optional thinking):
-{
-  "author": "assistant",
-  "content": "Voici comment faire...",
-  "thinking": "L'utilisateur demande une explication...",  # OPTIONAL
-  "timestamp": "2025-12-20T00:10:05"
-}
-
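The message contract above can be enforced with a small validator. This is an illustrative sketch, not code from ikario_rag; the helper name is hypothetical, but the field names match the format documented in this section:

```python
# Hypothetical validator for the message format: author, content and
# timestamp are required strings; thinking is optional but, when present,
# must also be a string.
REQUIRED_FIELDS = ("author", "content", "timestamp")

def validate_message(msg: dict) -> bool:
    if not all(isinstance(msg.get(f), str) for f in REQUIRED_FIELDS):
        return False
    # thinking defaults to "" so an absent field passes the check
    return isinstance(msg.get("thinking", ""), str)
```

A backend could run this on each element of new_messages before calling the MCP tool, failing fast instead of storing malformed entries.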
-================================================================================
-NEW MCP TOOL: append_to_conversation (8th)
-================================================================================
-
-DESCRIPTION:
-Appends messages to an existing conversation (or creates it if needed).
-Supports an optional 'thinking' field to capture the LLM's reasoning.
-If the conversation does not exist, it is created automatically.
-
-PARAMETERS:
-- conversation_id: string (required)
-- new_messages: array (required)
-  * author: string
-  * content: string
-  * thinking: string (OPTIONAL)
-  * timestamp: string
-- participants: array (required on creation)
-- context: object (required on creation)
-
-USAGE EXAMPLE:
-await mcpClient.callTool('append_to_conversation', {
-  conversation_id: 'conv_20251220_0010',
-  new_messages: [
-    { author: 'user', content: 'Bonjour', timestamp: '...' },
-    {
-      author: 'assistant',
-      content: 'Salut!',
-      thinking: 'L\'utilisateur me salue...',
-      timestamp: '...'
-    }
-  ],
-  participants: ['user', 'assistant'],
-  context: { category: 'chat', date: '2025-12-20' }
-});
-
-================================================================================
-BENEFITS
-================================================================================
-
-1. THINKING CAPTURE:
-   - The LLM's reasoning is preserved in memory
-   - Richer semantic search
-   - Full traceability of the model's reflections
-
-2. AUTO-CREATE:
-   - Simpler backend (a single call)
-   - No need to track whether this is the first message
-   - Robust
-
-3. BACKWARD COMPATIBLE:
-   - thinking is optional
-   - Existing code keeps working
-   - No breaking changes
-
-4. SEMANTIC SEARCH:
-   - thinking is included in the main embedding
-   - "Find the conversation where the LLM reasoned about X"
-   - More relevant results
-
-================================================================================
-NEXT STEPS
-================================================================================
-
-1. [DONE] Update the UI spec (app_spec_ikario_rag_UI.txt) ✓
-2. [NEXT] Delete the old spec (app_spec_ikario_rag_improvements.txt)
-3. Delete the 15 old Linear issues (TEAMPHI-305 to 319)
-4. Create 15 new issues from the new spec
-5. Launch the initializer-bis agent
-
-================================================================================
-UI SPEC UPDATED!
-================================================================================
-Date: December 20, 2025 - 00:30
-
-File: prompts/app_spec_ikario_rag_UI.txt
-
-CHANGES MADE:
-
-1. Lines 9-13: Overview updated
-   - "8 MCP tools" (instead of 7)
-   - append_to_conversation added to the list
-   - Optional thinking support mentioned
-
-2. Line 44: Technology stack updated
-   - "8 MCP tools available (with append_to_conversation + thinking support)"
-
-3. Lines 103-124: API routes updated
-   - New route: POST /api/memory/conversations/append
-   - append_to_conversation documented (auto-create, thinking)
-   - Message format with thinking documented
-
-4. Lines 156-185: Memory Service Layer updated
-   - appendToConversation() function added with a complete example
-   - Auto-create and optional thinking documented
-
-5. Lines 440-462: Chat Integration updated
-   - append_to_conversation used for streaming chat
-   - POST example with optional thinking
-   - Extended Thinking support documented
-
-6. Lines 777-790: Tests updated
-   - append_to_conversation test added
-   - Optional thinking test
-   - Auto-creation test
-
-7. Lines 982-988: Success criteria updated
-   - "8 endpoints" (instead of 7)
-   - append_to_conversation validation added
-   - thinking support validation
-
-8. Lines 1012-1014: Constraints updated
-   - "8 existing MCP tools"
-   - Note: append_to_conversation already implemented (commit cba84fe)
-
-SUMMARY OF CHANGES:
-- 8 sections modified
-- Complete documentation of the new tool
-- Concrete usage examples with thinking
-- Clear distinction: add_conversation (full) vs append_to_conversation (incremental)
-- Guidelines for chat integration with thinking support
-
-THE SPEC IS READY for issue creation!
-
-GIT COMMIT CREATED:
-Commit: 3a17744
-Message: "Update UI spec for append_to_conversation and thinking support"
-
-Committed files:
-- prompts/app_spec_ikario_rag_UI.txt (updated spec)
-- navette.txt (this file)
-
-================================================================================
-CURRENT STATE - FULL RECAP
-================================================================================
-
-WORK COMPLETED:
-✓ Detailed plan written (7 phases)
-✓ Backup commit (55d905b)
-✓ mcp_ikario_memory.py modified (thinking support + append_to_conversation)
-✓ server.py modified (8th MCP tool exposed)
-✓ Automated tests (6/6 passing)
-✓ Implementation commit (cba84fe)
-✓ UI spec updated (8 sections modified)
-✓ UI spec commit (3a17744)
-
-COMMITS CREATED (3 in total):
-1. 55d905b - Backup before adding append_to_conversation
-2. cba84fe - Add append_to_conversation with thinking support (ikario_rag)
-3. 3a17744 - Update UI spec (Linear_coding)
-
-AVAILABLE MCP TOOLS (8):
-1. add_thought
-2. add_conversation (with optional thinking)
-3. append_to_conversation (NEW - incremental + auto-create + thinking)
-4. search_thoughts
-5. search_conversations
-6. search_memories
-7. trace_concept_evolution
-8. check_consistency
-
-NEW MESSAGE FORMAT:
-{
-  "author": "assistant",
-  "content": "Visible reply",
-  "thinking": "Internal LLM reasoning",  // OPTIONAL
-  "timestamp": "ISO date"
-}
-
-================================================================================
-RECOMMENDED NEXT ACTIONS
-================================================================================
-
-1. DELETE the old spec (app_spec_ikario_rag_improvements.txt)
-   - It causes confusion (it talks about modifying ikario_rag)
-   - The new spec is complete
-
-2. DELETE the 15 old Linear issues (TEAMPHI-305 to 319)
-   - Those issues are about modifying ikario_rag (no longer wanted)
-   - Clean slate for the new issues
-
-3. CREATE 15 new issues from the new spec
-   - Use: python autonomous_agent_demo.py --project-dir ikario_body --new-spec app_spec_ikario_rag_UI.txt
-   - Mode: initializer bis
-   - Issues target development in ikario_body only
-
-4. LAUNCH the coding agent
-   - After the issues are created
-   - Mode: coding agent
-   - Build the 15 UI features
-
-DO YOU WANT ME TO CONTINUE?
-Options:
-a) YES - Delete the old spec + old issues + create the new issues
-b) WAIT - You want to check something first
-c) CHANGE - You want to adjust the plan
-
-================================================================================
-IMPORTANT CLARIFICATIONS - YOUR QUESTIONS
-================================================================================
-Date: December 20, 2025 - 00:45
-
-QUESTION 1: What is the difference between search_thoughts and search_memories?
--------------------------------------------------------------------------------
-
-I checked the code in mcp_ikario_memory.py:
-
-search_thoughts (lines 191-224):
-- Searches ONLY the "thoughts" collection
-- Optional filter: filter_thought_type
-- Returns: Ikario's internal thoughts
-
-search_conversations (lines 226-282):
-- Searches ONLY the "conversations" collection
-- Optional filters: filter_category, search_level
-- Returns: David-Ikario conversations
-
-search_memories (lines 37-51):
-- PROBLEM FOUND!
-- Current code: searches ONLY self.conversations (line 43)
-- This is NOT a real global search!
-- It is essentially the same thing as search_conversations
-
-CONCLUSION:
-search_memories SHOULD perform a global search (thoughts + conversations),
-but it currently searches ONLY conversations.
-
-This is most likely a bug or an incomplete implementation.
-
-QUESTION 2: Am I mixing up the two projects?
---------------------------------------------
-
-YES, you are right! I conflated:
-
-PROJECT 1: ikario_rag (C:/Users/david/SynologyDrive/ikario/ikario_rag/)
-- Python MCP backend
-- 8 MCP tools exposed
-- ChromaDB with embeddings
-- WHAT WE DID:
-  * Added append_to_conversation (mcp_ikario_memory.py)
-  * Added thinking support (mcp_ikario_memory.py)
-  * Exposed the 8th tool (server.py)
-  * Tests (test_append_conversation.py)
-  * Commits: 55d905b, cba84fe
-
-PROJECT 2: ikario_body (C:/GitHub/Linear_coding/generations/ikario_body/)
-- React frontend + Express backend
-- User interface that CONSUMES ikario_rag's MCP tools
-- WHAT WE DID:
-  * Wrote the UI spec (prompts/app_spec_ikario_rag_UI.txt)
-  * Commit: 3a17744
-  * BUT: nothing implemented in ikario_body yet!
-
-The UI spec I wrote is for LATER, when we build the interface
-in ikario_body that will USE ikario_rag.
-
-You are right: WE MUST FINISH ikario_rag FIRST!
-
-================================================================================
-WHAT REMAINS TO DO IN ikario_rag
-================================================================================
-
-1. FIX search_memories (bug identified)
-   - It must search thoughts + conversations
-   - Not just conversations
-
-2. TEST the full MCP server
-   - Start server.py
-   - Test with a real MCP client
-   - Verify all 8 tools work
-
-3. TEST append_to_conversation over MCP
-   - Via server.py (not just test_append_conversation.py)
-   - With optional thinking
-   - Auto-create
-
-4. VERIFY backward compatibility
-   - Existing code keeps working
-   - No breaking changes
-
-ONLY THEN will we move on to ikario_body.
-
-================================================================================
-DECISION NEEDED
-================================================================================
-
-Do you want me to:
-
-A) FIX search_memories first (global search over thoughts + conversations)
-B) TEST the MCP server as-is (with the buggy search_memories)
-C) SOMETHING else?
-
-I recommend A: fix search_memories so we get a real global search.
-
-================================================================================
-search_memories FIX COMPLETE!
-================================================================================
-Date: December 20, 2025 - 00:55
-
-PROBLEM IDENTIFIED:
-search_memories searched ONLY conversations (line 43).
-It was NOT a real global search.
-
-FIX APPLIED (mcp_ikario_memory.py, lines 37-87):
-1. Search thoughts
-2. Search conversations
-3. Merge the results
-4. Sort by relevance_score (descending)
-5. Add a 'source' field (thought vs conversation)
-6. Return the combined top n_results
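The merge step of this fix can be sketched as follows. This is an illustrative, self-contained version of the ranking logic only: the function name and the (document, relevance_score) pair shape are assumptions, and the real code queries the two ChromaDB collections first:

```python
# Sketch of the global-search merge: tag each hit with its source
# collection, combine, sort by relevance, and keep the top n_results.
def merge_search_results(thought_hits, conversation_hits, n_results=5):
    merged = [
        {"content": doc, "relevance_score": score, "source": "thought"}
        for doc, score in thought_hits
    ] + [
        {"content": doc, "relevance_score": score, "source": "conversation"}
        for doc, score in conversation_hits
    ]
    # Highest relevance first, regardless of which collection it came from
    merged.sort(key=lambda r: r["relevance_score"], reverse=True)
    return merged[:n_results]

results = merge_search_results(
    [("thought about Python", 0.513), ("another thought", 0.502)],
    [("conv about Python", 0.479), ("conv 2", 0.411), ("conv 3", 0.165)],
)
```

With the sample scores above (taken from the test output below), the two thoughts rank ahead of the three conversations, matching the observed ordering.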
-
-TESTS CREATED (test_search_memories.py):
-Test 1: Search "Python" - finds thoughts AND conversations ✓
-Test 2: Verify sorting by relevance_score ✓
-Test 3: Search "JavaScript" - finds a conversation ✓
-Test 4: filter_category (affects conversations only) ✓
-
-TEST RESULTS:
-  Found 5 results:
-  - Thoughts: 2
-  - Conversations: 3
-
-  [1] Source: thought | Relevance: 0.513
-  [2] Source: thought | Relevance: 0.502
-  [3] Source: conversation | Relevance: 0.479
-  [4] Source: conversation | Relevance: 0.411
-  [5] Source: conversation | Relevance: 0.165
-
-  ✓ search_memories combines thoughts + conversations: OK
-  ✓ Results sorted by relevance: OK
-  ✓ Source field added: OK
-
-COMMIT CREATED:
-Commit: 05a4613
-Message: "Fix search_memories to search in both thoughts and conversations"
-
-Modified files:
-- mcp_ikario_memory.py (search_memories rewritten)
-- test_search_memories.py (new test file)
-
-================================================================================
-CURRENT STATE OF ikario_rag - FULL SUMMARY
-================================================================================
-
-COMMITS CREATED (3 in ikario_rag, plus 1 in Linear_coding):
-1. 55d905b - Backup before adding append_to_conversation
-2. cba84fe - Add append_to_conversation with thinking support
-3. 05a4613 - Fix search_memories (global search)
-4. (Note: commit 3a17744 lives in Linear_coding, not ikario_rag)
-
-AVAILABLE MCP TOOLS (8):
-1. add_thought - Add a thought ✓
-2. add_conversation - Add a full conversation (with optional thinking) ✓
-3. append_to_conversation - Append messages incrementally (auto-create + thinking) ✓
-4. search_thoughts - Search thoughts ✓
-5. search_conversations - Search conversations ✓
-6. search_memories - GLOBAL search (thoughts + conversations) ✓ FIXED!
-7. trace_concept_evolution - Trace a concept's evolution ✓
-8. check_consistency - Check consistency ✓
-
-TESTS RUN:
-✓ test_append_conversation.py (6/6 tests) - append + thinking
-✓ test_search_memories.py (4/4 tests) - global search
-
-STILL TO DO in ikario_rag:
-1. Test the full MCP server (server.py)
-2. Test append_to_conversation via the MCP protocol (not just Python)
-3. Verify backward compatibility
-
-================================================================================
-NEXT STEP
-================================================================================
-
-Do you want to:
-
-A) TEST the full MCP server (start server.py and test with an MCP client)
-B) WRITE an MCP-level test for append_to_conversation
-C) SOMETHING else?
-
-I recommend A: test the full MCP server to make sure everything works over the MCP protocol.
-
-================================================================================
-
-================================================================================
-CRITICAL ISSUE: EMBEDDINGS TRUNCATED FOR LONG CONVERSATIONS
-================================================================================
-Date: 2025-12-20 - 15:30
-
-PROBLEM IDENTIFIED:
--------------------
-
-1. MASSIVE TRUNCATION:
-   - Current model: all-MiniLM-L6-v2
-   - Limit: 256 tokens (~1,000 characters)
-   - Founding Conversation #1: 23,574 words (~106,000 chars)
-
-   RESULT:
-   - ChromaDB storage: ✅ the full 106,000 chars
-   - Embedding computed on: ❌ only 1,280 chars (1.2%!)
-   - Semantic search: ❌ 98.8% of the conversation is INVISIBLE
-
-   If you search for something discussed after the first 256 tokens,
-   search_memories will NEVER find it.
-
-2. QUALITY TOO LOW FOR PHILOSOPHY:
-   - all-MiniLM-L6-v2: 22M parameters (VERY small)
-   - Optimized for: speed, not deep semantic understanding
-   - Language: mainly English
-   - Performance on abstract French concepts: POOR
-
-REAL-WORLD IMPACT:
-------------------
-
-Test with different sizes:
-- 250 chars (50 words): 100% retained ✅
-- 1,000 chars (200 words): 100% retained ✅
-- 2,500 chars (500 words): 51.2% retained ⚠️
-- 10,000 chars (2,000 words): 12.8% retained ❌
-- 106,000 chars (23,574 words): 1.2% retained ❌❌❌
-
-Long philosophical conversations = CATASTROPHIC
-
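The retained fractions above can be approximated with a crude word-count proxy for tokens. This is a sketch under the assumption of roughly one token per word; the real measurement would use the model's own tokenizer (e.g. via sentence-transformers), which is not used here to keep the snippet standalone:

```python
# Estimate how much of a text survives a fixed token window, using
# whitespace words as a rough stand-in for tokens.
def retained_fraction(text: str, max_tokens: int = 256) -> float:
    words = text.split()
    return min(1.0, max_tokens / max(1, len(words)))

# Founding Conversation #1 (23,574 words) under a 256-token window:
print(f"{retained_fraction('word ' * 23574):.1%}")  # ~1.1%, matching the ~1.2% measured above
```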
-PROPOSED SOLUTION:
-==================
-
-BENCHMARK OF 3 MODELS:
-
-1. all-MiniLM-L6-v2 (CURRENT):
-   - Parameters: 22M
-   - Dimension: 384
-   - Max tokens: 256
-   - Language: English
-   - Quality: basic
-   - For Founding Conversation #1: 1.2% indexed
-   - VERDICT: ❌ Inadequate
-
-2. intfloat/multilingual-e5-large:
-   - Parameters: 560M (25x more capacity)
-   - Dimension: 1024 (2.7x richer)
-   - Max tokens: 512 (2x longer)
-   - Language: excellent French + multilingual
-   - Quality: state-of-the-art semantics
-   - For Founding Conversation #1: ~2.4% indexed
-   - VERDICT: ⚠️ Better, but still insufficient
-
-3. BAAI/bge-m3 (RECOMMENDED BY DAVID):
-   - Parameters: 568M
-   - Dimension: 1024
-   - Max tokens: 8192 (32x longer!)
-   - Language: excellent multilingual (French included)
-   - Quality: state-of-the-art retrieval
-   - Features: dense + sparse + multi-vector retrieval (hybrid)
-   - For Founding Conversation #1: ~38-40% indexed
-   - VERDICT: ✅✅✅ EXCELLENT CHOICE!
-
-ADVANTAGES OF BAAI/bge-m3:
---------------------------
-✅ Max tokens 8192 vs current 256 (a 32x improvement!)
-✅ Hybrid retrieval (dense + sparse) for better precision
-✅ Purpose-built for multilingual retrieval
-✅ Excellent on MTEB benchmarks (top 3 worldwide)
-✅ Native French support
-✅ Deep semantic understanding of abstract concepts
-✅ For the 23,574-word conversation: keeps ~9,000 words vs ~250 today
-
-PROPOSED ACTION PLAN:
-=====================
-
-OPTION A - MODEL UPGRADE ONLY (FAST):
--------------------------------------
-1. Replace all-MiniLM-L6-v2 with BAAI/bge-m3 in ikario_rag
-2. Re-index all existing conversations
-3. Test search performance
-
-File to modify:
-- C:/Users/david/SynologyDrive/ikario/ikario_rag/mcp_ikario_memory.py
-  Line 31: self.embedder = SentenceTransformer('all-MiniLM-L6-v2')
-  → Replace with: self.embedder = SentenceTransformer('BAAI/bge-m3')
-
-Pros:
-✅ Simple (one line to change)
-✅ Immediate, massive improvement
-✅ No chunking needed
-
-Cons:
-⚠️ ~2.3GB model download (one-time)
-⚠️ 2-3x slower (acceptable for batch use)
-⚠️ +4GB RAM required
-⚠️ All existing conversations must be re-indexed
-
-OPTION B - CHUNKING + MODEL UPGRADE (OPTIMAL):
-----------------------------------------------
-1. Implement smart chunking for conversations >8192 tokens
-2. Use BAAI/bge-m3 for the embeddings
-3. Metadata: conversation_id + chunk_position for reconstruction
-
-Pros:
-✅ 100% coverage, even for conversations >40,000 words
-✅ Better semantic quality
-✅ Flexible for future evolution
-
-Cons:
-⚠️ More complex to implement
-⚠️ More documents in ChromaDB
-⚠️ More sophisticated search logic
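The chunking step of Option B could look like the sketch below. It is hypothetical (the function name, the word-count proxy for tokens, and the overlap value are assumptions), but the chunk metadata mirrors the plan: conversation_id plus chunk_position for later reconstruction:

```python
# Split a long conversation into overlapping chunks that each fit the
# embedding window, tagging every chunk for reassembly. Words stand in
# for tokens here; the real pipeline would count tokens with the
# bge-m3 tokenizer.
def chunk_conversation(conversation_id, text, max_words=6000, overlap=200):
    words = text.split()
    chunks, start, position = [], 0, 0
    while start < len(words):
        piece = " ".join(words[start:start + max_words])
        chunks.append({
            "id": f"{conversation_id}_chunk_{position:03d}",
            "document": piece,
            "metadata": {"conversation_id": conversation_id,
                         "chunk_position": position},
        })
        start += max_words - overlap  # overlap preserves context across cuts
        position += 1
    return chunks
```

Each chunk would then be embedded and ingested as its own ChromaDB document; a search hit's conversation_id and chunk_position are enough to fetch the surrounding chunks.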
-
-FINAL RECOMMENDATION:
-=====================
-
-PHASE 1 (NOW): Option A - upgrade to BAAI/bge-m3
-- Immediate gain: 1.2% → 38-40% coverage
-- Simple: one line of code
-- Sufficient for 95% of your conversations
-
-PHASE 2 (IF NEEDED): add chunking for exceptional conversations >40,000 words
-- Only if you regularly have conversations >40,000 words
-- Otherwise, not necessary
-
-WHY THIS FITS YOUR USE CASE:
-----------------------------
-Philosophy, abstract concepts, complex ideas in French:
-
-- all-MiniLM-L6-v2: basic textual similarity, English-centric
-  → MTEB score: ~58/100
-  → French philosophy: ~40/100 (estimated)
-
-- BAAI/bge-m3: deep semantic understanding, multilingual
-  → MTEB score: ~72/100 (+24%)
-  → French philosophy: ~70/100 (estimated, a +75% gain!)
-
-For philosophical conversations: estimated quality gain >50%
-
-MIGRATION COST:
----------------
-- Time: ~30 min (model download + re-index)
-- Compute: 2-3x slower (1 conversation = 2s vs 0.7s today)
-- Memory: +4GB RAM (total ~5GB vs ~1GB today)
-- Storage: +2.3GB for the model
-- Code: minimal (one line to change + a re-index script)
-
-NEXT STEP:
-==========
-Decide on and implement the upgrade to BAAI/bge-m3 in ikario_rag
-
-================================================================================
diff --git a/prompts/app_spec_library_rag_types_docs.txt b/prompts/app_spec_library_rag_types_docs.txt
deleted file mode 100644
index 0fe4fa6..0000000
--- a/prompts/app_spec_library_rag_types_docs.txt
+++ /dev/null
@@ -1,679 +0,0 @@
-
- Library RAG - Type Safety & Documentation Enhancement
-
-
- Enhance the Library RAG application (philosophical texts indexing and semantic search) by adding
- strict type annotations and comprehensive Google-style docstrings to all Python modules. This will
- improve code maintainability, enable static type checking with mypy, and provide clear documentation
- for all functions, classes, and modules.
-
- The application is a RAG pipeline that processes PDF documents through OCR, LLM-based extraction,
- semantic chunking, and ingestion into Weaviate vector database. It includes a Flask web interface
- for document upload, processing, and semantic search.
-
-
-
-
- Python 3.10+
- Flask 3.0
- Weaviate 1.34.4 with text2vec-transformers
- Mistral OCR API
- Ollama (local) or Mistral API
- mypy with strict configuration
-
-
- Docker Compose (Weaviate + transformers)
- weaviate-client, flask, mistralai, python-dotenv
-
-
-
-
-
- - flask_app.py: Main Flask application (640 lines)
- - schema.py: Weaviate schema definition (383 lines)
- - utils/: 16+ modules for PDF processing pipeline
- - pdf_pipeline.py: Main orchestration (879 lines)
- - mistral_client.py: OCR API client
- - ocr_processor.py: OCR processing
- - markdown_builder.py: Markdown generation
- - llm_metadata.py: Metadata extraction via LLM
- - llm_toc.py: Table of contents extraction
- - llm_classifier.py: Section classification
- - llm_chunker.py: Semantic chunking
- - llm_cleaner.py: Chunk cleaning
- - llm_validator.py: Document validation
- - weaviate_ingest.py: Database ingestion
- - hierarchy_parser.py: Document hierarchy parsing
- - image_extractor.py: Image extraction from PDFs
- - toc_extractor*.py: Various TOC extraction methods
- - templates/: Jinja2 templates for Flask UI
- - tests/utils2/: Minimal test coverage (3 test files)
-
-
-
- - Inconsistent type annotations across modules (some have partial types, many have none)
- - Missing or incomplete docstrings (no Google-style format)
- - No mypy configuration for strict type checking
- - Type hints missing on function parameters and return values
- - Dict[str, Any] used extensively without proper typing
- - No type stubs for complex nested structures
-
-
-
-
-
-
- - Add complete type annotations to ALL functions and methods
- - Use proper generic types (List, Dict, Optional, Union) from typing module
- - Add TypedDict for complex dictionary structures
- - Add Protocol types for duck-typed interfaces
- - Use Literal types for string constants
- - Add ParamSpec and TypeVar where appropriate
- - Type all class attributes and instance variables
- - Add type annotations to lambda functions where possible
-
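The annotation requirements above can be illustrated with a short sketch. The function and its logic are hypothetical, not taken from the codebase; it only demonstrates the target style (full parameter and return annotations, Literal for string constants, Optional for nullable values):

```python
from typing import Literal, Optional

# Hypothetical pipeline helper showing the target annotation style.
def classify_section(
    title: str,
    body: str,
    hint: Optional[str] = None,
) -> Literal["chapter", "preface", "appendix", "unknown"]:
    # Prefer an explicit hint over the title when one is supplied
    lowered = (hint or title).lower()
    if "chapitre" in lowered or "chapter" in lowered:
        return "chapter"
    if "préface" in lowered or "preface" in lowered:
        return "preface"
    if "appendix" in lowered or "annexe" in lowered:
        return "appendix"
    return "unknown"
```

Under mypy --strict, a caller that compares the result against a string outside the Literal set is flagged at type-check time.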
-
-
- - Create mypy.ini with strict configuration
- - Enable: check_untyped_defs, disallow_untyped_defs, disallow_incomplete_defs
- - Enable: disallow_untyped_calls, disallow_untyped_decorators
- - Enable: warn_return_any, warn_redundant_casts
- - Enable: strict_equality, strict_optional
- - Set python_version to 3.10
- - Configure per-module overrides if needed for gradual migration
-
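A possible mypy.ini matching these requirements is sketched below. The flag names are real mypy options; the per-module override section is illustrative and would be adjusted to the project's actual package layout:

```ini
[mypy]
python_version = 3.10
check_untyped_defs = True
disallow_untyped_defs = True
disallow_incomplete_defs = True
disallow_untyped_calls = True
disallow_untyped_decorators = True
warn_return_any = True
warn_redundant_casts = True
strict_equality = True
strict_optional = True

; Example gradual-migration override (module name is hypothetical)
[mypy-utils.toc_extractor_legacy]
disallow_untyped_defs = False
```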
-
-
- - Create TypedDict definitions for common data structures:
- - OCR response structures
- - Metadata dictionaries
- - TOC entries
- - Chunk objects
- - Weaviate objects
- - Pipeline results
- - Add NewType for semantic type safety (DocumentName, ChunkId, etc.)
- - Create Protocol types for callback functions
-
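As a sketch of the requested type definitions, the shapes below are hypothetical and would be aligned with the actual dictionaries used in utils/; they only show the intended TypedDict + NewType pattern:

```python
from typing import NewType, TypedDict

# Semantic alias: a ChunkId is a str, but mypy treats it as a distinct type,
# so a plain document name cannot be passed where a chunk ID is expected.
ChunkId = NewType("ChunkId", str)

class TocEntry(TypedDict):
    title: str
    level: int
    page: int

class Chunk(TypedDict):
    chunk_id: ChunkId
    text: str
    section: str

entry: TocEntry = {"title": "Introduction", "level": 1, "page": 9}
chunk: Chunk = {"chunk_id": ChunkId("doc1-0001"), "text": "…", "section": "Introduction"}
```

With these in place, `Dict[str, Any]` in function signatures can be replaced by the precise shape, and mypy catches missing or misspelled keys.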
-
-
- - pdf_pipeline.py: Type all 10 pipeline steps, callbacks, result dictionaries
- - flask_app.py: Type all route handlers, request/response types
- - schema.py: Type Weaviate configuration objects
- - llm_*.py: Type LLM request/response structures
- - mistral_client.py: Type API client methods and responses
- - weaviate_ingest.py: Type ingestion functions and batch operations
-
-
-
-
-
- - Add comprehensive Google-style docstrings to ALL:
- - Module-level docstrings explaining purpose and usage
- - Class docstrings with Attributes section
- - Function/method docstrings with Args, Returns, Raises sections
- - Complex algorithm explanations with Examples section
- - Include code examples for public APIs
- - Document all exceptions that can be raised
- - Add Notes section for important implementation details
- - Add See Also section for related functions
-
-
-
-
- - pdf_pipeline.py: Document the 10-step pipeline, each step's purpose
- - mistral_client.py: Document OCR API usage, cost calculation
- - llm_metadata.py: Document metadata extraction logic
- - llm_toc.py: Document TOC extraction strategies
- - llm_classifier.py: Document section classification types
- - llm_chunker.py: Document semantic vs basic chunking
- - llm_cleaner.py: Document cleaning rules and validation
- - llm_validator.py: Document validation criteria
- - weaviate_ingest.py: Document ingestion process, nested objects
- - hierarchy_parser.py: Document hierarchy building algorithm
-
-
-
- - Document all routes with request/response examples
- - Document SSE (Server-Sent Events) implementation
- - Document Weaviate query patterns
- - Document upload processing workflow
- - Document background job management
-
-
-
- - Document Weaviate schema design decisions
- - Document each collection's purpose and relationships
- - Document nested object structure
- - Document vectorization strategy
-
-
-
-
- - Add inline comments for complex logic only (don't over-comment)
- - Explain WHY not WHAT (code should be self-documenting)
- - Document performance considerations
- - Document cost implications (OCR, LLM API calls)
- - Document error handling strategies
-
-
-
-
-
- - All modules must pass mypy --strict
- - No # type: ignore comments without justification
- - CI/CD should run mypy checks
- - Type coverage should be 100%
-
-
-
- - All public functions must have docstrings
- - All docstrings must follow Google style
- - Examples should be executable and tested
- - Documentation should be clear and concise
-
-
-
-
-
-
- Priority 1 (Most used, most complex):
- 1. utils/pdf_pipeline.py - Main orchestration
- 2. flask_app.py - Web application entry point
- 3. utils/weaviate_ingest.py - Database operations
- 4. schema.py - Schema definition
-
- Priority 2 (Core LLM modules):
- 5. utils/llm_metadata.py
- 6. utils/llm_toc.py
- 7. utils/llm_classifier.py
- 8. utils/llm_chunker.py
- 9. utils/llm_cleaner.py
- 10. utils/llm_validator.py
-
- Priority 3 (OCR and parsing):
- 11. utils/mistral_client.py
- 12. utils/ocr_processor.py
- 13. utils/markdown_builder.py
- 14. utils/hierarchy_parser.py
- 15. utils/image_extractor.py
-
- Priority 4 (Supporting modules):
- 16. utils/toc_extractor.py
- 17. utils/toc_extractor_markdown.py
- 18. utils/toc_extractor_visual.py
- 19. utils/llm_structurer.py (legacy)
-
-
-
-
-
- Setup Type Checking Infrastructure
-
- Configure mypy with strict settings and create foundational type definitions
-
-
- - Create mypy.ini configuration file with strict settings
- - Add mypy to requirements.txt or dev dependencies
- - Create utils/types.py module for common TypedDict definitions
- - Define core types: OCRResponse, Metadata, TOCEntry, ChunkData, PipelineResult
- - Add NewType definitions for semantic types: DocumentName, ChunkId, SectionPath
- - Create Protocol types for callbacks (ProgressCallback, etc.)
- - Document type definitions in utils/types.py module docstring
- - Test mypy configuration on a single module to verify settings
-
-
- - mypy.ini exists with strict configuration
- - utils/types.py contains all foundational types with docstrings
- - mypy runs without errors on utils/types.py
- - Type definitions are comprehensive and reusable
-
-
-
-
- Add Types to PDF Pipeline Orchestration
-
- Add complete type annotations to pdf_pipeline.py (879 lines, most complex module)
-
-
- - Add type annotations to all function signatures in pdf_pipeline.py
- - Type the 10-step pipeline: OCR, Markdown, Metadata, TOC, Classify, Chunk, Clean, Enrich, Validate, Weaviate
- - Type progress_callback parameter with Protocol or Callable
- - Add TypedDict for pipeline options dictionary
- - Add TypedDict for pipeline result dictionary structure
- - Type all helper functions (extract_document_metadata_legacy, etc.)
- - Add proper return types for process_pdf_v2, process_pdf, process_pdf_bytes
- - Fix any mypy errors that arise
- - Verify mypy --strict passes on pdf_pipeline.py
-
-
- - All functions in pdf_pipeline.py have complete type annotations
- - progress_callback is properly typed with Protocol
- - All Dict[str, Any] replaced with TypedDict where appropriate
- - mypy --strict pdf_pipeline.py passes with zero errors
- - No # type: ignore comments (or justified if absolutely necessary)
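
The callback plumbing could look like this minimal sketch (run_step and StepResult are illustrative stand-ins for the real pipeline step functions, not the existing pdf_pipeline.py API):

```python
from typing import Optional, Protocol, TypedDict

class ProgressCallback(Protocol):
    """Callable invoked as (step_id, status, detail) around each step."""
    def __call__(self, step_id: str, status: str, detail: str) -> None: ...

class StepResult(TypedDict):
    success: bool
    detail: str

def run_step(
    step_id: str,
    progress_callback: Optional[ProgressCallback] = None,
) -> StepResult:
    """Run one (dummy) pipeline step, emitting progress events around it."""
    if progress_callback is not None:
        progress_callback(step_id, "running", f"Starting {step_id}")
    # ... the real step work (OCR, metadata extraction, etc.) would go here ...
    result: StepResult = {"success": True, "detail": f"{step_id} done"}
    if progress_callback is not None:
        progress_callback(step_id, "done", result["detail"])
    return result

events: list[tuple[str, str, str]] = []
run_step("ocr", lambda s, st, d: events.append((s, st, d)))
print(events)
```

A Protocol (rather than a bare Callable) keeps the parameter names part of the contract, so call sites can use keywords safely.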
-
-
-
-
- Add Types to Flask Application
-
- Add complete type annotations to flask_app.py and type all routes
-
-
- - Add type annotations to all Flask route handlers
- - Type request.args, request.form, request.files usage
- - Type jsonify() return values
- - Type get_weaviate_client context manager
- - Type get_collection_stats, get_all_chunks, search_chunks functions
- - Add TypedDict for Weaviate query results
- - Type background job processing functions (run_processing_job)
- - Type SSE generator function (upload_progress)
- - Add type hints for template rendering
- - Verify mypy --strict passes on flask_app.py
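
A stdlib-only sketch of what the typed query results might look like (SearchHit, SearchResponse, and to_response are proposals; the real helpers call the Weaviate client and pass the result to jsonify()):

```python
from typing import Any, TypedDict

class SearchHit(TypedDict):
    document: str
    section_path: str
    text: str
    score: float

class SearchResponse(TypedDict):
    query: str
    hits: list[SearchHit]

def to_response(query: str, raw_hits: list[dict[str, Any]]) -> SearchResponse:
    """Convert raw, loosely-typed result dicts into the typed response shape."""
    hits: list[SearchHit] = [
        {
            "document": h.get("document", ""),
            "section_path": h.get("section_path", ""),
            "text": h.get("text", ""),
            "score": float(h.get("score", 0.0)),
        }
        for h in raw_hits
    ]
    return {"query": query, "hits": hits}

resp = to_response("vertu", [{"document": "menon", "text": "extrait", "score": 0.87}])
print(resp["hits"][0]["score"])
```

Typing the conversion boundary once means route handlers only ever see SearchResponse, not Dict[str, Any].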
-
-
- - All Flask routes have complete type annotations
- - Request/response types are clear and documented
- - Weaviate query functions are properly typed
- - SSE generator is correctly typed
- - mypy --strict flask_app.py passes with zero errors
-
-
-
-
- Add Types to Core LLM Modules
-
- Add complete type annotations to all LLM processing modules (metadata, TOC, classifier, chunker, cleaner, validator)
-
-
- - llm_metadata.py: Type extract_metadata function, return structure
- - llm_toc.py: Type extract_toc function, TOC hierarchy structure
- - llm_classifier.py: Type classify_sections, section types (Literal), validation functions
- - llm_chunker.py: Type chunk_section_with_llm, chunk objects
- - llm_cleaner.py: Type clean_chunk, is_chunk_valid functions
- - llm_validator.py: Type validate_document, validation result structure
- - Add TypedDict for LLM request/response structures
- - Type provider selection ("ollama" | "mistral" as Literal)
- - Type model names with Literal or constants
- - Verify mypy --strict passes on all llm_*.py modules
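
Provider and model typing could be sketched like this (resolve_model is a proposed helper; the default model names match those documented for process_pdf_v2):

```python
from typing import Literal, Optional

# Literal lets mypy reject typos like "olama" at call sites.
LLMProvider = Literal["ollama", "mistral"]

DEFAULT_MODELS: dict[LLMProvider, str] = {
    "ollama": "qwen2.5:7b",
    "mistral": "mistral-small-latest",
}

def resolve_model(provider: LLMProvider, model: Optional[str] = None) -> str:
    """Return the explicit model name, falling back to the provider default."""
    return model if model is not None else DEFAULT_MODELS[provider]

print(resolve_model("ollama"))
print(resolve_model("mistral", "mistral-large-latest"))
```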
-
-
- - All LLM modules have complete type annotations
- - Section types use Literal for type safety
- - Provider and model parameters are strongly typed
- - LLM request/response structures use TypedDict
- - mypy --strict passes on all llm_*.py modules with zero errors
-
-
-
-
- Add Types to Weaviate and Database Modules
-
- Add complete type annotations to schema.py and weaviate_ingest.py
-
-
- - schema.py: Type Weaviate configuration objects
- - schema.py: Type collection property definitions
- - weaviate_ingest.py: Type ingest_document function signature
- - weaviate_ingest.py: Type delete_document_chunks function
- - weaviate_ingest.py: Add TypedDict for Weaviate object structure
- - Type batch insertion operations
- - Type nested object references (work, document)
- - Add proper error types for Weaviate exceptions
- - Verify mypy --strict passes on both modules
-
-
- - schema.py has complete type annotations for Weaviate config
- - weaviate_ingest.py functions are fully typed
- - Nested object structures use TypedDict
- - Weaviate client operations are properly typed
- - mypy --strict passes on both modules with zero errors
-
-
-
-
- Add Types to OCR and Parsing Modules
-
- Add complete type annotations to mistral_client.py, ocr_processor.py, markdown_builder.py, hierarchy_parser.py
-
-
- - mistral_client.py: Type create_client, run_ocr, estimate_ocr_cost
- - mistral_client.py: Add TypedDict for Mistral API response structures
- - ocr_processor.py: Type serialize_ocr_response, OCR object structures
- - markdown_builder.py: Type build_markdown, image_writer parameter
- - hierarchy_parser.py: Type build_hierarchy, flatten_hierarchy functions
- - hierarchy_parser.py: Add TypedDict for hierarchy node structure
- - image_extractor.py: Type create_image_writer, image handling
- - Verify mypy --strict passes on all modules
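
The hierarchy node could be expressed as a recursive TypedDict, for example (HierarchyNode and flatten are proposed shapes, not the existing hierarchy_parser API):

```python
from typing import TypedDict

class HierarchyNode(TypedDict):
    """One node of the document hierarchy (proposed shape)."""
    title: str
    level: int
    children: list["HierarchyNode"]  # forward reference makes it recursive

def flatten(node: HierarchyNode) -> list[str]:
    """Depth-first list of section titles."""
    titles = [node["title"]]
    for child in node["children"]:
        titles.extend(flatten(child))
    return titles

root: HierarchyNode = {
    "title": "Menon",
    "level": 1,
    "children": [
        {"title": "La vertu", "level": 2, "children": []},
    ],
}
print(flatten(root))
```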
-
-
- - All OCR/parsing modules have complete type annotations
- - Mistral API structures use TypedDict
- - Hierarchy nodes are properly typed
- - Image handling functions are typed
- - mypy --strict passes on all modules with zero errors
-
-
-
-
- Add Google-Style Docstrings to Core Modules
-
- Add comprehensive Google-style docstrings to pdf_pipeline.py, flask_app.py, and weaviate modules
-
-
- - pdf_pipeline.py: Add module docstring explaining the V2 pipeline
- - pdf_pipeline.py: Add docstrings to process_pdf_v2 with Args, Returns, Raises sections
- - pdf_pipeline.py: Document each of the 10 pipeline steps in comments
- - pdf_pipeline.py: Add Examples section showing typical usage
- - flask_app.py: Add module docstring explaining Flask application
- - flask_app.py: Document all routes with request/response examples
- - flask_app.py: Document Weaviate connection management
- - schema.py: Add module docstring explaining schema design
- - schema.py: Document each collection's purpose and relationships
- - weaviate_ingest.py: Document ingestion process with examples
- - All docstrings must follow Google style format exactly
-
-
- - All core modules have comprehensive module-level docstrings
- - All public functions have Google-style docstrings
- - Args, Returns, Raises sections are complete and accurate
- - Examples are provided for complex functions
- - Docstrings explain WHY, not just WHAT
-
-
-
-
- Add Google-Style Docstrings to LLM Modules
-
- Add comprehensive Google-style docstrings to all LLM processing modules
-
-
- - llm_metadata.py: Document metadata extraction logic with examples
- - llm_toc.py: Document TOC extraction strategies and fallbacks
- - llm_classifier.py: Document section types and classification criteria
- - llm_chunker.py: Document semantic vs basic chunking approaches
- - llm_cleaner.py: Document cleaning rules and validation logic
- - llm_validator.py: Document validation criteria and corrections
- - Add Examples sections showing input/output for each function
- - Document LLM provider differences (Ollama vs Mistral)
- - Document cost implications in Notes sections
- - All docstrings must follow Google style format exactly
-
-
- - All LLM modules have comprehensive docstrings
- - Each function has Args, Returns, Raises sections
- - Examples show realistic input/output
- - Provider differences are documented
- - Cost implications are noted where relevant
-
-
-
-
- Add Google-Style Docstrings to OCR and Parsing Modules
-
- Add comprehensive Google-style docstrings to OCR, markdown, hierarchy, and extraction modules
-
-
- - mistral_client.py: Document OCR API usage, cost calculation
- - ocr_processor.py: Document OCR response processing
- - markdown_builder.py: Document markdown generation strategy
- - hierarchy_parser.py: Document hierarchy building algorithm
- - image_extractor.py: Document image extraction process
- - toc_extractor*.py: Document various TOC extraction methods
- - Add Examples sections for complex algorithms
- - Document edge cases and error handling
- - All docstrings must follow Google style format exactly
-
-
- - All OCR/parsing modules have comprehensive docstrings
- - Complex algorithms are well explained
- - Edge cases are documented
- - Error handling is documented
- - Examples demonstrate typical usage
-
-
-
-
- Final Validation and CI Integration
-
- Verify all type annotations and docstrings, integrate mypy into CI/CD
-
-
- - Run mypy --strict on entire codebase, verify 100% pass rate
- - Verify all public functions have docstrings
- - Check docstring formatting with pydocstyle or similar tool
- - Create GitHub Actions workflow to run mypy on every commit
- - Update README.md with type checking instructions
- - Update CLAUDE.md with documentation standards
- - Create CONTRIBUTING.md with type annotation and docstring guidelines
- - Generate API documentation with Sphinx or pdoc
- - Fix any remaining mypy errors or missing docstrings
-
-
- - mypy --strict passes on entire codebase with zero errors
- - All public functions have Google-style docstrings
- - CI/CD runs mypy checks automatically
- - Documentation is generated and accessible
- - Contributing guidelines document type/docstring requirements
-
-
-
-
-
-
- - 100% type coverage across all modules
- - mypy --strict passes with zero errors
- - No # type: ignore comments without justification
- - All Dict[str, Any] replaced with TypedDict where appropriate
- - Proper use of generics, protocols, and type variables
- - NewType used for semantic type safety
-
-
-
- - All modules have comprehensive module-level docstrings
- - All public functions/classes have Google-style docstrings
- - All docstrings include Args, Returns, Raises sections
- - Complex functions include Examples sections
- - Cost implications documented in Notes sections
- - Error handling clearly documented
- - Provider differences (Ollama vs Mistral) documented
-
-
-
- - Code is self-documenting with clear variable names
- - Inline comments explain WHY, not WHAT
- - Complex algorithms are well explained
- - Performance considerations documented
- - Security considerations documented
-
-
-
- - IDE autocomplete works perfectly with type hints
- - Type errors caught at development time, not runtime
- - Documentation is easily accessible in IDE
- - API examples are executable and tested
- - Contributing guidelines are clear and comprehensive
-
-
-
- - Refactoring is safer with type checking
- - Function signatures are self-documenting
- - API contracts are explicit and enforced
- - Breaking changes are caught by type checker
- - New developers can understand code quickly
-
-
-
-
-
- - Must maintain backward compatibility with existing code
- - Cannot break existing Flask routes or API contracts
- - Weaviate schema must remain unchanged
- - Existing tests must continue to pass
-
-
-
- - Can use per-module mypy configuration for gradual migration
- - Can temporarily disable strict checks on legacy modules
- - Priority modules must be completed first
- - Low-priority modules can be deferred
-
-
-
- - All type annotations must use Python 3.10+ syntax
- - Docstrings must follow Google style exactly (not NumPy or reStructuredText)
- - Use typing-module forms (List, Dict, Optional) only while Python 3.9 support is required; otherwise prefer the 3.10+ built-in generics and X | None syntax
- - Use from __future__ import annotations if needed for forward references
-
-
-
-
-
- - Run mypy --strict on each module after adding types
- - Use mypy daemon (dmypy) for faster incremental checking
- - Add mypy to pre-commit hooks
- - CI/CD must run mypy and fail on type errors
-
-
-
- - Use pydocstyle to validate Google-style format
- - Use sphinx-build to generate docs and catch errors
- - Manual review of docstring examples
- - Verify examples are executable and correct
-
-
-
- - Verify existing tests still pass after type additions
- - Add new tests for complex typed structures
- - Test mypy configuration on sample code
- - Verify IDE autocomplete works correctly
-
-
-
-
-
- ```python
- """
- PDF Pipeline V2 - Intelligent document processing with LLM enhancement.
-
- This module orchestrates a 10-step pipeline for processing PDF documents:
- 1. OCR via Mistral API
- 2. Markdown construction with images
- 3. Metadata extraction via LLM
- 4. Table of contents (TOC) extraction
- 5. Section classification
- 6. Semantic chunking
- 7. Chunk cleaning and validation
- 8. Enrichment with concepts
- 9. Validation and corrections
- 10. Ingestion into Weaviate vector database
-
- The pipeline supports multiple LLM providers (Ollama local, Mistral API) and
- various processing modes (skip OCR, semantic chunking, OCR annotations).
-
- Typical usage:
- >>> from pathlib import Path
- >>> from utils.pdf_pipeline import process_pdf
- >>>
- >>> result = process_pdf(
- ... Path("document.pdf"),
- ... use_llm=True,
- ... llm_provider="ollama",
- ... ingest_to_weaviate=True,
- ... )
- >>> print(f"Processed {result['pages']} pages, {result['chunks_count']} chunks")
-
- See Also:
- mistral_client: OCR API client
- llm_metadata: Metadata extraction
- weaviate_ingest: Database ingestion
- """
- ```
-
-
-
- ```python
- def process_pdf_v2(
- pdf_path: Path,
- output_dir: Path = Path("output"),
- *,
- use_llm: bool = True,
- llm_provider: Literal["ollama", "mistral"] = "ollama",
- llm_model: Optional[str] = None,
- skip_ocr: bool = False,
- ingest_to_weaviate: bool = True,
- progress_callback: Optional[ProgressCallback] = None,
- ) -> PipelineResult:
- """
- Process a PDF through the complete V2 pipeline with LLM enhancement.
-
- This function orchestrates all 10 steps of the intelligent document processing
- pipeline, from OCR to Weaviate ingestion. It supports both local (Ollama) and
- cloud (Mistral API) LLM providers, with optional caching via skip_ocr.
-
- Args:
- pdf_path: Absolute path to the PDF file to process.
- output_dir: Base directory for output files. Defaults to "./output".
- use_llm: Enable LLM-based processing (metadata, TOC, chunking).
- If False, uses basic heuristic processing.
- llm_provider: LLM provider to use. "ollama" for local (free but slow),
- "mistral" for API (fast but paid).
- llm_model: Specific model name. If None, auto-detects based on provider
- (qwen2.5:7b for ollama, mistral-small-latest for mistral).
- skip_ocr: If True, reuses existing markdown file to avoid OCR cost.
- Requires output_dir//.md to exist.
- ingest_to_weaviate: If True, ingests chunks into Weaviate after processing.
- progress_callback: Optional callback for real-time progress updates.
- Called with (step_id, status, detail) for each pipeline step.
-
- Returns:
- Dictionary containing processing results with the following keys:
- - success (bool): True if processing completed without errors
- - document_name (str): Name of the processed document
- - pages (int): Number of pages in the PDF
- - chunks_count (int): Number of chunks generated
- - cost_ocr (float): OCR cost in euros (0 if skip_ocr=True)
- - cost_llm (float): LLM API cost in euros (0 if provider=ollama)
- - cost_total (float): Total cost (ocr + llm)
- - metadata (dict): Extracted metadata (title, author, etc.)
- - toc (list): Hierarchical table of contents
- - files (dict): Paths to generated files (markdown, chunks, etc.)
-
- Raises:
- FileNotFoundError: If pdf_path does not exist.
- ValueError: If skip_ocr=True but markdown file not found.
- RuntimeError: If Weaviate connection fails during ingestion.
-
- Examples:
- Basic usage with Ollama (free):
- >>> result = process_pdf_v2(
- ... Path("platon_menon.pdf"),
- ... llm_provider="ollama"
- ... )
- >>> print(f"Cost: {result['cost_total']:.4f}€")  # OCR only
- Cost: 0.0270€
-
- With Mistral API (faster):
- >>> result = process_pdf_v2(
- ... Path("platon_menon.pdf"),
- ... llm_provider="mistral",
- ... llm_model="mistral-small-latest"
- ... )
-
- Skip OCR to avoid cost:
- >>> result = process_pdf_v2(
- ... Path("platon_menon.pdf"),
- ... skip_ocr=True, # Reuses existing markdown
- ... ingest_to_weaviate=False
- ... )
-
- Notes:
- - OCR cost: ~0.003€/page (standard), ~0.009€/page (with annotations)
- - LLM cost: Free with Ollama, variable with Mistral API
- - Processing time: ~30s/page with Ollama, ~5s/page with Mistral
- - Weaviate must be running (docker-compose up -d) before ingestion
- """
- ```
-
-
-
diff --git a/prompts/app_spec_markdown_support.txt b/prompts/app_spec_markdown_support.txt
deleted file mode 100644
index 5cae3aa..0000000
--- a/prompts/app_spec_markdown_support.txt
+++ /dev/null
@@ -1,490 +0,0 @@
-
- Library RAG - Native Markdown Support
-
-
- Add native support for Markdown (.md) files to the Library RAG application. Currently, the system only accepts PDF files
- and uses Mistral OCR for text extraction. This feature will allow users to upload pre-existing Markdown files directly,
- skipping the expensive OCR step while still benefiting from LLM-based metadata extraction, TOC generation, semantic
- chunking, and Weaviate vectorization.
-
- This enhancement reduces costs, improves processing speed for already-digitized texts, and makes the system more flexible
- for users who have philosophical texts in Markdown format.
-
-
-
-
- Flask 3.0
- utils/pdf_pipeline.py (to be extended)
- Werkzeug secure_filename
- Ollama (local) or Mistral API
- Weaviate with BAAI/bge-m3
-
-
- mypy strict mode
- Google-style docstrings required
-
-
-
-
-
- Update Flask File Validation
-
- Modify the Flask application to accept both PDF and Markdown files. Update the ALLOWED_EXTENSIONS
- configuration and file validation logic to support .md files while maintaining backward compatibility
- with existing PDF workflows.
-
- 1
- backend
-
- - flask_app.py (line 99: ALLOWED_EXTENSIONS, line 427: allowed_file function)
-
-
- - Change ALLOWED_EXTENSIONS from {"pdf"} to {"pdf", "md"}
- - Update allowed_file() function to accept both extensions
- - Update upload.html template to accept .md files in file input
- - Update error messages to reflect both formats
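
The validation change might follow the standard Flask pattern (a sketch — the existing allowed_file() in flask_app.py may differ in detail):

```python
ALLOWED_EXTENSIONS = {"pdf", "md"}

def allowed_file(filename: str) -> bool:
    """Accept a filename whose final extension is in ALLOWED_EXTENSIONS."""
    return (
        "." in filename
        and filename.rsplit(".", 1)[1].lower() in ALLOWED_EXTENSIONS
    )

print(allowed_file("menon.md"))    # accepted
print(allowed_file("menon.PDF"))   # accepted (case-insensitive)
print(allowed_file("menon.docx"))  # rejected
```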
-
-
- 1. Start Flask app
- 2. Navigate to /upload
- 3. Attempt to upload a .md file
- 4. Verify file is accepted (no "Format non supporté" error)
- 5. Verify PDF upload still works
-
-
-
-
- Add Markdown Detection in Pipeline
-
- Enhance pdf_pipeline.py to detect when a Markdown file is being processed instead of a PDF.
- Add logic to automatically skip OCR processing for .md files and copy the Markdown content
- directly to the output directory.
-
- 1
- backend
-
- - utils/pdf_pipeline.py (process_pdf_v2 function, around line 250-450)
-
-
- - Add file extension detection: `file_ext = pdf_path.suffix.lower()`
- - If file_ext == ".md":
- - Skip OCR step entirely (no Mistral API call)
- - Read Markdown content directly: `md_content = pdf_path.read_text(encoding='utf-8')`
- - Copy to output: `md_path.write_text(md_content, encoding='utf-8')`
- - Set nb_pages = max(md_content.count('\n# ') + int(md_content.startswith('# ')), 1) (estimate from H1 headers, counting a leading H1 too)
- - Set cost_ocr = 0.0
- - Emit progress: "markdown_load" instead of "ocr"
- - If file_ext == ".pdf":
- - Continue with existing OCR workflow
- - Both paths converge at LLM processing (metadata, TOC, chunking)
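
The branch above could be sketched as follows (load_source_text is a hypothetical helper; the real process_pdf_v2 inlines this logic, emits progress events, and runs the Mistral OCR call in the .pdf branch):

```python
import tempfile
from pathlib import Path

def load_source_text(input_path: Path, output_dir: Path) -> tuple[str, int, float]:
    """Return (markdown, estimated_pages, ocr_cost), skipping OCR for .md input."""
    if input_path.suffix.lower() == ".md":
        md_content = input_path.read_text(encoding="utf-8")
        # Copy the Markdown into the output directory unchanged.
        (output_dir / input_path.name).write_text(md_content, encoding="utf-8")
        # Estimate "pages" from H1 headers, counting a leading H1 as well.
        nb_pages = max(md_content.count("\n# ") + int(md_content.startswith("# ")), 1)
        return md_content, nb_pages, 0.0
    # .pdf branch: the Mistral OCR client would be called here (placeholder).
    raise NotImplementedError("PDF branch requires the Mistral OCR client")

with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "output"
    out.mkdir()
    src = Path(tmp) / "test.md"
    src.write_text("# Titre\n\nTexte.\n# Suite\n", encoding="utf-8")
    text, pages, cost = load_source_text(src, out)
    print(pages, cost)
```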
-
-
- 1. Create test Markdown file with philosophical content
- 2. Call process_pdf(Path("test.md"), use_llm=True)
- 3. Verify OCR is skipped (cost_ocr = 0.0)
- 4. Verify output/test/test.md is created
- 5. Verify no _ocr.json file is created
- 6. Verify LLM processing runs normally
-
-
-
-
- Markdown-Specific Progress Callback
-
- Update the progress callback system to emit appropriate events for Markdown file processing.
- Instead of "OCR Mistral en cours...", display "Chargement Markdown..." to provide accurate
- user feedback during Server-Sent Events streaming.
-
- 2
- backend
-
- - utils/pdf_pipeline.py (emit_progress calls)
- - flask_app.py (process_file_background function)
-
-
- - Add conditional progress messages based on file type
- - For .md files: emit_progress("markdown_load", "running", "Chargement du fichier Markdown...")
- - For .pdf files: emit_progress("ocr", "running", "OCR Mistral en cours...")
- - Update frontend to handle "markdown_load" event type
- - Ensure step numbering adjusts (9 steps for MD vs 10 for PDF)
-
-
- 1. Upload Markdown file via Flask interface
- 2. Monitor SSE progress stream at /upload/progress/<job_id>
- 3. Verify first step shows "Chargement du fichier Markdown..."
- 4. Verify no OCR-related messages appear
- 5. Verify subsequent steps (metadata, TOC, etc.) work normally
-
-
-
-
- Update process_pdf_bytes for Markdown
-
- Extend process_pdf_bytes() function to handle Markdown content uploaded via Flask.
- This function currently creates a temporary PDF file, but for Markdown uploads,
- it should create a temporary .md file instead.
-
- 1
- backend
-
- - utils/pdf_pipeline.py (process_pdf_bytes function, line 1255)
-
-
- - Detect file type from filename parameter
- - If filename ends with .md:
- - Create temp file with suffix=".md"
- - Write file_bytes as UTF-8 text
- - If filename ends with .pdf:
- - Existing behavior (suffix=".pdf", binary write)
- - Pass temp file path to process_pdf() which now handles both types
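
A minimal sketch of the suffix handling (write_temp_upload is a hypothetical helper; process_pdf_bytes would then call process_pdf on the returned path and delete the file afterwards):

```python
import tempfile
from pathlib import Path

def write_temp_upload(file_bytes: bytes, filename: str) -> Path:
    """Persist an upload to a temp file with the matching suffix.

    Flask hands the upload over as bytes either way; for .md the bytes are
    assumed to already be UTF-8 text, so writing them directly preserves them.
    The caller is responsible for deleting the file after processing.
    """
    suffix = ".md" if filename.lower().endswith(".md") else ".pdf"
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(file_bytes)
        return Path(tmp.name)

p = write_temp_upload("# Titre\n".encode("utf-8"), "menon.md")
print(p.suffix)
p.unlink()
```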
-
-
- 1. Create Flask test client
- 2. POST multipart form with .md file to /upload
- 3. Verify process_pdf_bytes creates .md temp file
- 4. Verify temp file contains correct Markdown content
- 5. Verify cleanup deletes temp file after processing
-
-
-
-
- Add Markdown File Validation
-
- Implement validation for uploaded Markdown files to ensure they contain valid UTF-8 text
- and basic Markdown structure. Reject files that are too large, contain binary data,
- or have no meaningful content.
-
- 2
- backend
-
- - utils/markdown_validator.py
-
-
- - Create validate_markdown_file(file_path: Path) -> dict[str, Any] function
- - Checks:
- - File size < 10 MB
- - Valid UTF-8 encoding
- - Contains at least one header (#, ##, etc.)
- - Not empty (at least 100 characters)
- - No null bytes or excessive binary content
- - Return dict with success, error, and warnings keys
- - Call from process_pdf_v2 before processing
- - Type annotations and Google-style docstrings required
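
A sketch of the validator under these assumptions (error strings mirror the test plan below; a production version might read only the head of very large files before the full decode):

```python
import tempfile
from pathlib import Path
from typing import Any

MAX_SIZE = 10 * 1024 * 1024  # 10 MB

def validate_markdown_file(file_path: Path) -> dict[str, Any]:
    """Validate an uploaded Markdown file; returns success/error/warnings keys."""
    result: dict[str, Any] = {"success": True, "error": None, "warnings": []}
    raw = file_path.read_bytes()
    if len(raw) > MAX_SIZE:
        return {"success": False, "error": "File too large", "warnings": []}
    if b"\x00" in raw:  # null bytes signal binary content
        return {"success": False, "error": "Invalid UTF-8", "warnings": []}
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return {"success": False, "error": "Invalid UTF-8", "warnings": []}
    if len(text) < 100:
        return {"success": False, "error": "File too short", "warnings": []}
    if not any(line.lstrip().startswith("#") for line in text.splitlines()):
        result["warnings"].append("No Markdown headers found")  # warn, continue
    return result

_tmp = Path(tempfile.mkdtemp())
(_tmp / "good.md").write_text("# Titre\n\n" + "Texte. " * 20, encoding="utf-8")
(_tmp / "short.md").write_text("# x", encoding="utf-8")
(_tmp / "bin.md").write_bytes(b"\x00" * 200)
good_result = validate_markdown_file(_tmp / "good.md")
short_result = validate_markdown_file(_tmp / "short.md")
bin_result = validate_markdown_file(_tmp / "bin.md")
print(good_result, short_result, bin_result)
```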
-
-
- 1. Test with valid Markdown file → passes validation
- 2. Test with empty file → fails with "File too short"
- 3. Test with binary file (.exe renamed to .md) → fails with "Invalid UTF-8"
- 4. Test with very large file (>10MB) → fails with "File too large"
- 5. Test with plain text no headers → warning but continues
-
-
-
-
- Update Documentation
-
- Update README.md and .claude/CLAUDE.md to document the new Markdown support feature.
- Include usage examples, cost comparison (PDF vs MD), and troubleshooting tips.
-
- 3
- documentation
-
- - README.md (add section under "Pipeline de Traitement")
- - .claude/CLAUDE.md (update development guidelines)
- - templates/upload.html (add help text)
-
-
- - README.md:
- - Add "Support Markdown Natif" section
- - Document accepted formats: PDF, MD
- - Show cost comparison table (PDF: ~0.003€/page, MD: 0€)
- - Add example: process_pdf(Path("document.md"))
- - CLAUDE.md:
- - Update "Pipeline de Traitement" section
- - Note conditional OCR step
- - Document markdown_validator.py module
- - upload.html:
- - Update file input accept attribute: accept=".pdf,.md"
- - Add help text: "Formats acceptés : PDF, Markdown (.md)"
-
-
- 1. Read README.md markdown support section
- 2. Verify examples are clear and accurate
- 3. Check CLAUDE.md developer notes
- 4. Open /upload in browser
- 5. Verify help text displays correctly
-
-
-
-
- Add Unit Tests for Markdown Processing
-
- Create comprehensive unit tests for Markdown file handling to ensure reliability
- and prevent regressions. Cover file validation, pipeline processing, and edge cases.
-
- 2
- testing
-
- - tests/utils/test_markdown_validator.py
- - tests/utils/test_pdf_pipeline_markdown.py
- - tests/fixtures/sample.md
-
-
- - test_markdown_validator.py:
- - Test valid Markdown acceptance
- - Test invalid encoding rejection
- - Test file size limits
- - Test empty file rejection
- - Test binary data detection
- - test_pdf_pipeline_markdown.py:
- - Test Markdown file processing end-to-end
- - Test OCR skip for .md files
- - Test cost_ocr = 0.0
- - Test LLM processing (metadata, TOC, chunking)
- - Mock Weaviate ingestion
- - Verify output files created correctly
- - fixtures/sample.md:
- - Create realistic philosophical text in Markdown
- - Include headers, paragraphs, formatting
- - ~1000 words for realistic testing
-
-
- 1. Run: pytest tests/utils/test_markdown_validator.py -v
- 2. Verify all validation tests pass
- 3. Run: pytest tests/utils/test_pdf_pipeline_markdown.py -v
- 4. Verify end-to-end Markdown processing works
- 5. Check test coverage: pytest --cov=utils --cov-report=html
-
-
-
-
- Type Safety and Documentation
-
- Ensure all new code follows strict type safety requirements and includes comprehensive
- Google-style docstrings. Run mypy checks and update type definitions as needed.
-
- 2
- type_safety
-
- - utils/types.py (add Markdown-specific types if needed)
- - All modified modules (type annotations)
-
-
- - Add type annotations to all new functions
- - Update existing functions that handle both PDF and MD
- - Consider adding:
- - FileFormat = Literal["pdf", "md"]
- - MarkdownValidationResult = TypedDict(...)
- - Run mypy --strict on all modified files
- - Add Google-style docstrings with:
- - Args section documenting all parameters
- - Returns section with structure details
- - Raises section for exceptions
- - Examples section for complex functions
-
-
- 1. Run: mypy utils/pdf_pipeline.py --strict
- 2. Run: mypy utils/markdown_validator.py --strict
- 3. Verify no type errors
- 4. Run: pydocstyle utils/markdown_validator.py --convention=google
- 5. Verify all docstrings follow Google style
-
-
-
-
- Handle Markdown-Specific Edge Cases
-
- Address edge cases specific to Markdown processing: front matter (YAML/TOML),
- embedded code blocks, special characters, and non-standard Markdown extensions.
-
- 3
- backend
-
- - utils/markdown_validator.py
- - utils/llm_metadata.py (handle front matter)
-
-
- - Front matter handling:
- - Detect YAML/TOML front matter (--- or +++)
- - Extract metadata if present (title, author, date)
- - Pass to LLM or use directly if valid
- - Strip front matter before content processing
- - Code block handling:
- - Don't treat code blocks as actual content
- - Preserve them for chunking but don't analyze
- - Special characters:
- - Handle Unicode properly (Greek, Latin, French accents)
- - Preserve LaTeX equations in $ or $$
- - GitHub Flavored Markdown:
- - Support tables, task lists, strikethrough
- - Convert to standard format if needed
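
The front-matter split might be sketched like this (flat `key: value` parsing only; a real implementation would use a YAML/TOML parser, and the function and regex names are proposals):

```python
import re

# Matches a leading --- or +++ fenced front-matter block.
FRONT_MATTER_RE = re.compile(r"\A(?:---|\+\+\+)\n(.*?)\n(?:---|\+\+\+)\n", re.DOTALL)

def split_front_matter(md_content: str) -> tuple[dict[str, str], str]:
    """Split leading front matter from the Markdown body (minimal sketch)."""
    match = FRONT_MATTER_RE.match(md_content)
    if match is None:
        return {}, md_content
    meta: dict[str, str] = {}
    for line in match.group(1).splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip().strip('"')
    return meta, md_content[match.end():]

meta, body = split_front_matter('---\ntitle: "Menon"\nauthor: Platon\n---\n# Menon\n')
print(meta, body)
```

The extracted pairs (title, author, ...) can then be passed to the metadata step, and only the body is chunked.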
-
-
- 1. Upload Markdown with YAML front matter
- 2. Verify metadata extracted correctly
- 3. Upload Markdown with code blocks
- 4. Verify code not treated as philosophical content
- 5. Upload Markdown with Greek/Latin text
- 6. Verify Unicode handled correctly
-
-
-
-
- Update UI/UX for Markdown Upload
-
- Enhance the upload interface to clearly communicate Markdown support and provide
- visual feedback about the file type being processed. Show format-specific information
- (e.g., "No OCR cost for Markdown files").
-
- 3
- frontend
-
- - templates/upload.html
- - templates/upload_progress.html
-
-
- - upload.html:
- - Add file type indicator icon (📄 PDF vs 📝 MD)
- - Show format-specific help text on hover
- - Display estimated cost: "PDF: ~0.003€/page, Markdown: 0€"
- - Add example Markdown file download link
- - upload_progress.html:
- - Show different icon for Markdown processing
- - Adjust progress bar (9 steps vs 10 steps)
- - Display "No OCR cost" badge for Markdown
- - Update step descriptions based on file type
-
-
- 1. Open /upload page
- 2. Verify help text mentions both PDF and MD
- 3. Select a .md file
- 4. Verify file type indicator shows 📝
- 5. Submit upload
- 6. Verify progress shows "Chargement Markdown..."
- 7. Verify "No OCR cost" badge displays
-
-
-
-
-
-
- Setup and Configuration
-
- - Update ALLOWED_EXTENSIONS in flask_app.py
- - Modify allowed_file() validation function
- - Update upload.html file input accept attribute
- - Add Markdown MIME type handling
-
-
-
-
- Core Pipeline Extension
-
- - Add file extension detection in process_pdf_v2()
- - Implement Markdown file reading logic
- - Skip OCR for .md files
- - Add conditional progress callbacks
- - Update process_pdf_bytes() for Markdown
-
-
-
-
- Validation and Error Handling
-
- - Create markdown_validator.py module
- - Implement UTF-8 encoding validation
- - Add file size limits
- - Handle front matter extraction
- - Add comprehensive error messages
-
-
-
-
- Testing Infrastructure
-
- - Create test fixtures (sample.md)
- - Write validation tests
- - Write pipeline integration tests
- - Add edge case tests
- - Verify mypy strict compliance
-
-
-
-
- Documentation and Polish
-
- - Update README.md with Markdown support
- - Update .claude/CLAUDE.md developer docs
- - Add Google-style docstrings
- - Update UI templates with new messaging
- - Create usage examples
-
-
-
-
-
-
- - Markdown files upload successfully via Flask
- - OCR is skipped for .md files (cost_ocr = 0.0)
- - LLM processing works identically for PDF and MD
- - Chunks are created and vectorized correctly
- - Both file types can be searched in Weaviate
- - Existing PDF workflow remains unchanged
-
-
-
- - All code passes mypy --strict
- - All functions have type annotations
- - Google-style docstrings on all modules
- - No Any types without justification
- - TypedDict definitions for new data structures
-
-
-
- - Unit tests cover Markdown validation
- - Integration tests verify end-to-end processing
- - Edge cases handled (front matter, Unicode, large files)
- - Test coverage >80% for new code
- - All tests pass in CI/CD pipeline
-
-
-
- - Upload interface clearly shows both formats supported
- - Progress feedback accurate for both PDF and MD
- - Cost savings clearly communicated ("0€ for Markdown")
- - Error messages helpful and specific
- - Documentation clear with examples
-
-
-
- - Markdown processing faster than PDF (no OCR)
- - No regression in PDF processing speed
- - Memory usage reasonable for large MD files
- - Validation completes in <100ms
- - Overall pipeline <30s for typical Markdown document
-
-
-
-
-
- - PDF processing: OCR ~0.003€/page + LLM variable
- - Markdown processing: 0€ OCR + LLM variable
- - Estimated savings: 50-70% for documents with Markdown source
-
-
-
- - Maintains backward compatibility with existing PDFs
- - No breaking changes to API or database schema
- - Existing chunks and documents unaffected
- - Can process both formats in same session
-
-
-
- - Support for .txt plain text files
- - Support for .docx Word documents (via pandoc)
- - Support for .epub ebooks
- - Batch upload of multiple Markdown files
- - Markdown to PDF export for archival
-
-
-
diff --git a/prompts/app_spec_tavily_mcp.txt b/prompts/app_spec_tavily_mcp.txt
deleted file mode 100644
index 349f9a6..0000000
--- a/prompts/app_spec_tavily_mcp.txt
+++ /dev/null
@@ -1,498 +0,0 @@
-
- ikario - Tavily MCP Integration for Internet Access
-
-
- This specification adds Tavily search capabilities via MCP (Model Context Protocol) to give Ikario
- internet access for real-time web searches. Tavily provides high-quality search results optimized
- for AI agents, making it ideal for research, fact-checking, and accessing current information.
-
- This integration adds a new MCP server connection to the existing architecture (alongside the
- ikario-memory MCP server) and exposes Tavily search tools to Ikario during conversations.
-
- All changes are additive and backward-compatible. Existing functionality remains unchanged.
-
-
-
-
- Tavily MCP Server Connection:
- - Uses @modelcontextprotocol/sdk Client to connect to Tavily MCP server
- - Connection can be stdio-based (local MCP server) or HTTP-based (remote)
- - Tavily MCP server provides search tools that are exposed to Claude via Tool Use API
- - Backend routes handle tool execution and return results to Claude
-
-
-
- - Real-time internet access for Ikario
- - High-quality search results optimized for LLMs
- - Fact-checking and verification capabilities
- - Access to current events and news
- - Research assistance with cited sources
- - Seamless integration with existing memory tools
-
-
-
-
-
- Tavily MCP Server
- Model Context Protocol (MCP)
- stdio or HTTP transport
- @modelcontextprotocol/sdk
- Tavily API key (from https://tavily.com)
-
-
- Node.js with Express (existing)
- MCP Client for Tavily server connection
- Existing toolExecutor service extended with Tavily tools
-
-
- GET/POST /api/tavily/* for Tavily-specific operations
- Existing /api/claude/chat routes support Tavily tools automatically
-
-
-
-
-
- - Tavily API key obtained from https://tavily.com (free tier available)
- - API key stored in environment variable TAVILY_API_KEY or configuration file
- - MCP SDK already installed (@modelcontextprotocol/sdk exists for ikario-memory)
- - Tavily MCP server installed (npm package or Python package)
-
-
- - Add Tavily MCP server config to server/.claude_settings.json or similar
- - Configure connection parameters (stdio vs HTTP)
- - Set API key securely
-
-
-
-
-
- Tavily MCP Client Setup
-
- Create MCP client connection to Tavily search server. This is similar to the existing
- ikario-memory MCP client but connects to Tavily instead.
-
- Implementation:
- - Create server/services/tavilyMcpClient.js
- - Initialize MCP client with Tavily server connection
- - Handle connection lifecycle (connect, disconnect, reconnect)
- - Implement health checks and connection status
- - Export client instance and helper functions
-
- Configuration:
- - Read Tavily API key from environment or config file
- - Configure transport (stdio or HTTP)
- - Set connection timeout and retry logic
- - Log connection status for debugging
-
- Error Handling:
- - Graceful degradation if Tavily is unavailable
- - Connection retry with exponential backoff
- - Clear error messages for configuration issues
-
- Priority: 1
- Category: backend
-
- 1. Verify MCP client can connect to Tavily server on startup
- 2. Test connection health check endpoint returns correct status
- 3. Verify graceful handling when Tavily API key is missing
- 4. Test reconnection logic when connection drops
- 5. Verify connection status is logged correctly
- 6. Test that server starts even if Tavily is unavailable
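The reconnect logic could be sketched as below. `connectFn` stands in for whatever the @modelcontextprotocol/sdk client's connect call ends up being; the retry count and delays are assumptions:

```javascript
// Reconnect-with-exponential-backoff loop for the Tavily MCP client.
// The real module would wrap an @modelcontextprotocol/sdk Client.
async function connectWithBackoff(connectFn, { retries = 5, baseMs = 200 } = {}) {
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await connectFn();
    } catch (err) {
      if (attempt === retries) throw err; // give up, let caller degrade gracefully
      const delay = baseMs * 2 ** attempt; // exponential backoff
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

The server should catch the final rejection and mark Tavily as unavailable rather than crash.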
-
-
-
-
- Tavily Tool Configuration
-
- Configure Tavily search tools to be available to Claude during conversations.
- This integrates with the existing tool system (like memory tools).
-
- Implementation:
- - Create server/config/tavilyTools.js
- - Define tool schemas for Tavily search capabilities
- - Integrate with existing toolExecutor service
- - Add Tavily tools to system prompt alongside memory tools
-
- Tavily Tools to Expose:
- - tavily_search: General web search with AI-optimized results
- - Parameters: query (string), max_results (number), search_depth (basic/advanced)
- - Returns: Array of search results with title, url, content, score
-
- - tavily_search_news: News-specific search for current events
- - Parameters: query (string), max_results (number), days (number)
- - Returns: Recent news articles with metadata
-
- Tool Schema:
- - Follow Claude Tool Use API format
- - Clear descriptions for each tool
- - Well-defined input schemas with validation
- - Proper error handling in tool execution
-
- Priority: 1
- Category: backend
-
- 1. Verify Tavily tools are listed in available tools
- 2. Test tool schema validation with valid inputs
- 3. Test tool schema validation rejects invalid inputs
- 4. Verify tools appear in Claude's system prompt
- 5. Test that tool descriptions are clear and accurate
- 6. Verify tools can be called without errors
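The two tools above, expressed in the Claude Tool Use schema format; descriptions and defaults are placeholders to refine:

```javascript
// Tool definitions in the Claude Tool Use format (name, description,
// input_schema as JSON Schema). Parameter names follow the spec above.
const tavilyTools = [
  {
    name: "tavily_search",
    description: "General web search with AI-optimized results.",
    input_schema: {
      type: "object",
      properties: {
        query: { type: "string", description: "The search query" },
        max_results: { type: "number", description: "Max results (default 5)" },
        search_depth: { type: "string", enum: ["basic", "advanced"] },
      },
      required: ["query"],
    },
  },
  {
    name: "tavily_search_news",
    description: "News-specific search for current events.",
    input_schema: {
      type: "object",
      properties: {
        query: { type: "string" },
        max_results: { type: "number" },
        days: { type: "number", description: "Look-back window in days (default 7)" },
      },
      required: ["query"],
    },
  },
];
```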
-
-
-
-
- Tavily Tool Executor Integration
-
- Integrate Tavily tools into the existing toolExecutor service so Claude can
- use them during conversations.
-
- Implementation:
- - Extend server/services/toolExecutor.js to handle Tavily tools
- - Add tool detection for tavily_search and tavily_search_news
- - Implement tool execution logic using Tavily MCP client
- - Format Tavily results for Claude consumption
- - Handle errors and timeouts gracefully
-
- Tool Execution Flow:
- 1. Claude requests tool use (e.g., tavily_search)
- 2. toolExecutor detects Tavily tool request
- 3. Call Tavily MCP client with tool parameters
- 4. Receive and format search results
- 5. Return formatted results to Claude
- 6. Claude incorporates results into response
-
- Result Formatting:
- - Convert Tavily results to Claude-friendly format
- - Include source URLs for citation
- - Add relevance scores
- - Truncate content if too long
- - Handle empty results gracefully
-
- Priority: 1
- Category: backend
-
- 1. Test tavily_search tool execution with valid query
- 2. Verify results are properly formatted
- 3. Test tavily_search_news tool execution
- 4. Verify error handling when Tavily API fails
- 5. Test timeout handling for slow searches
- 6. Verify results include proper citations and URLs
- 7. Test with empty search results
- 8. Test with very long search queries
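The result formatting step might look like this; field names follow the spec above, and the truncation length is an assumption:

```javascript
// Formats raw Tavily results for Claude: numbered entries with the
// cited URL, relevance score, and truncated content.
function formatTavilyResults(results, { maxChars = 500 } = {}) {
  if (!results || results.length === 0) {
    return "No results found for this query.";
  }
  return results
    .map((r, i) => {
      const content =
        r.content.length > maxChars ? r.content.slice(0, maxChars) + "..." : r.content;
      return `[${i + 1}] ${r.title} (${r.url}, score: ${r.score})\n${content}`;
    })
    .join("\n\n");
}
```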
-
-
-
-
- System Prompt Enhancement for Internet Access
-
- Update the system prompt to inform Ikario about internet access capabilities.
- This should be added alongside existing memory tools instructions.
-
- Implementation:
- - Update MEMORY_SYSTEM_PROMPT in server/routes/messages.js and claude.js
- - Add Tavily tools documentation
- - Provide usage guidelines for when to search the internet
- - Include examples of good search queries
-
- Prompt Addition:
- "## Internet Access via Tavily
-
- Tu as accès à internet en temps réel via deux outils de recherche :
-
- 1. tavily_search : Recherche web générale optimisée pour l'IA
- - Utilise pour : rechercher des informations actuelles, vérifier des faits,
- trouver des sources fiables
- - Paramètres : query (ta question), max_results (nombre de résultats, défaut: 5),
- search_depth ('basic' ou 'advanced')
- - Retourne : Résultats avec titre, URL, contenu et score de pertinence
-
- 2. tavily_search_news : Recherche d'actualités récentes
- - Utilise pour : événements actuels, nouvelles, actualités
- - Paramètres : query, max_results, days (nombre de jours en arrière, défaut: 7)
-
- Quand utiliser la recherche internet :
- - Quand l'utilisateur demande des informations récentes ou actuelles
- - Pour vérifier des faits ou données que tu n'es pas sûr de connaître
- - Quand ta base de connaissances est trop ancienne (après janvier 2025)
- - Pour trouver des sources et citations spécifiques
- - Pour des requêtes nécessitant des données en temps réel
-
- N'utilise PAS la recherche pour :
- - Des questions sur ta propre identité ou capacités
- - Des concepts généraux que tu connais déjà bien
- - Des questions purement créatives ou d'opinion
-
- Utilise ces outils de façon autonome selon les besoins de la conversation.
- Cite toujours tes sources quand tu utilises des informations de Tavily."
-
- Priority: 2
- Category: backend
-
- 1. Verify system prompt includes Tavily instructions
- 2. Test that Claude understands when to use Tavily search
- 3. Verify Claude cites sources from Tavily results
- 4. Test that Claude uses appropriate search queries
- 5. Verify Claude chooses between tavily_search and tavily_search_news correctly
- 6. Test that Claude doesn't over-use search for simple questions
-
-
-
-
- Tavily Status API Endpoint
-
- Create API endpoint to check Tavily MCP connection status and search capabilities.
- Similar to /api/memory/status endpoint.
-
- Implementation:
- - Create GET /api/tavily/status endpoint
- - Return connection status, available tools, and configuration
- - Create GET /api/tavily/health endpoint for health checks
- - Add Tavily status to existing /api/memory/stats (rename to /api/tools/stats)
-
- Response Format:
- {
- "success": true,
- "data": {
- "connected": true,
- "message": "Tavily MCP server is connected",
- "tools": ["tavily_search", "tavily_search_news"],
- "apiKeyConfigured": true,
- "transport": "stdio"
- }
- }
-
- Priority: 2
- Category: backend
-
- 1. Test GET /api/tavily/status returns correct status
- 2. Verify status shows "connected" when Tavily is available
- 3. Verify status shows "disconnected" when Tavily is unavailable
- 4. Test health endpoint returns proper status code
- 5. Verify tools list is accurate
- 6. Test with missing API key shows proper error
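The status payload can be built from the client state like so; `client` here is a hypothetical wrapper around the MCP connection:

```javascript
// Builds the /api/tavily/status response body; the shape matches the
// response format above. An Express route would just res.json() this.
function buildTavilyStatus(client) {
  const connected = Boolean(client && client.isConnected);
  return {
    success: true,
    data: {
      connected,
      message: connected
        ? "Tavily MCP server is connected"
        : "Tavily MCP server is unavailable",
      tools: connected ? ["tavily_search", "tavily_search_news"] : [],
      apiKeyConfigured: Boolean(process.env.TAVILY_API_KEY),
      transport: "stdio",
    },
  };
}
```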
-
-
-
-
- Frontend UI Indicator for Internet Access
-
- Add visual indicator in the UI to show when Ikario has internet access via Tavily.
- This can be displayed alongside the existing memory status indicator.
-
- Implementation:
- - Add Tavily status indicator in header or sidebar
- - Show online/offline status for Tavily connection
- - Optional: Show when Tavily is being used during a conversation
- - Optional: Add tooltip explaining internet access capabilities
-
- Visual Design:
- - Globe or wifi icon to represent internet access
- - Green when connected, gray when disconnected
- - Subtle animation when search is in progress
- - Tooltip: "Internet access via Tavily" or similar
-
- Integration:
- - Use existing useMemory hook pattern or create useTavily hook
- - Poll /api/tavily/status periodically (every 60s)
- - Update status in real-time during searches
-
- Priority: 3
- Category: frontend
-
- 1. Verify internet access indicator appears in UI
- 2. Test status updates when Tavily connects/disconnects
- 3. Verify tooltip shows correct information
- 4. Test that indicator shows activity during searches
- 5. Verify status polling doesn't impact performance
- 6. Test with Tavily disabled shows offline status
-
-
-
-
- Manual Search UI (Optional Enhancement)
-
- Optional: Add manual search interface to allow users to trigger Tavily searches directly,
- similar to the memory search panel.
-
- Implementation:
- - Add "Internet Search" panel in sidebar (alongside Memory panel)
- - Search input for manual Tavily queries
- - Display search results with title, snippet, URL
- - Click to insert results into conversation
- - Filter by search type (general vs news)
-
- This is OPTIONAL and lower priority. The primary use case is autonomous search by Claude.
-
- Priority: 4
- Category: frontend
-
- 1. Verify search panel appears in sidebar
- 2. Test manual search returns results
- 3. Verify results display properly with links
- 4. Test inserting results into conversation
- 5. Test news search filter works correctly
- 6. Verify search history is saved (optional)
-
-
-
-
- Configuration and Settings
-
- Add Tavily configuration options to settings and environment.
-
- Implementation:
- - Add TAVILY_API_KEY to environment variables
- - Add Tavily settings to .claude_settings.json or similar config file
- - Create server/config/tavilyConfig.js for configuration management
- - Document configuration options in README
-
- Configuration Options:
- - API key
- - Max results per search (default: 5)
- - Search depth (basic/advanced)
- - Timeout duration
- - Enable/disable Tavily globally
- - Rate limiting settings
-
- Security:
- - API key should NOT be exposed to frontend
- - Use environment variable or secure config file
- - Validate API key on startup
- - Log warnings if API key is missing
-
- 2
- backend
-
- 1. Verify API key is read from environment variable
- 2. Test fallback to config file if env var not set
- 3. Verify API key validation on startup
- 4. Test configuration options are applied correctly
- 5. Verify API key is never exposed in API responses
- 6. Test enabling/disabling Tavily via config
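A minimal loader for these options; apart from TAVILY_API_KEY, the variable names and defaults are assumptions:

```javascript
// Reads Tavily settings from the environment. The returned object
// stays server-side: apiKey must never reach the frontend.
function loadTavilyConfig(env = process.env) {
  const apiKey = env.TAVILY_API_KEY || null;
  return {
    apiKey, // never expose in API responses
    enabled: env.TAVILY_ENABLED !== "false" && apiKey !== null,
    maxResults: Number(env.TAVILY_MAX_RESULTS || 5),
    searchDepth: env.TAVILY_SEARCH_DEPTH === "advanced" ? "advanced" : "basic",
    timeoutMs: Number(env.TAVILY_TIMEOUT_MS || 10000),
  };
}
```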
-
-
-
-
- Error Handling and Rate Limiting
-
- Implement robust error handling and rate limiting for Tavily API calls.
-
- Implementation:
- - Detect and handle Tavily API errors (rate limits, invalid API key, etc.)
- - Implement client-side rate limiting to avoid hitting Tavily limits
- - Cache search results for duplicate queries (optional)
- - Provide clear error messages to Claude when searches fail
-
- Error Types:
- - 401: Invalid API key
- - 429: Rate limit exceeded
- - 500: Tavily server error
- - Timeout: Search took too long
- - Network: Connection failed
-
- Rate Limiting:
- - Track searches per minute/hour
- - Queue requests if limit reached
- - Return cached results for duplicate queries within 5 minutes
- - Log rate limit warnings
-
- Priority: 2
- Category: backend
-
- 1. Test error handling for invalid API key
- 2. Verify rate limit detection and handling
- 3. Test timeout handling for slow searches
- 4. Verify error messages are clear to Claude
- 5. Test rate limiting prevents API abuse
- 6. Verify caching works for duplicate queries
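The cache and rate limiter can be combined in one small guard; the five-minute cache window follows the spec, the per-minute ceiling is an assumption:

```javascript
// In-memory guard: caches results per query for 5 minutes and caps
// the number of searches per rolling one-minute window.
function createSearchGuard({ maxPerMinute = 30, cacheTtlMs = 5 * 60 * 1000 } = {}) {
  const cache = new Map(); // query -> { at, results }
  let windowStart = Date.now();
  let count = 0;
  return {
    getCached(query, now = Date.now()) {
      const hit = cache.get(query);
      return hit && now - hit.at < cacheTtlMs ? hit.results : null;
    },
    store(query, results, now = Date.now()) {
      cache.set(query, { at: now, results });
    },
    allow(now = Date.now()) {
      if (now - windowStart >= 60000) {
        windowStart = now; // start a new window
        count = 0;
      }
      return ++count <= maxPerMinute;
    },
  };
}
```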
-
-
-
-
- Documentation and README Updates
-
- Update project documentation to explain Tavily integration.
-
- Implementation:
- - Update main README.md with Tavily setup instructions
- - Add TAVILY_SETUP.md with detailed configuration guide
- - Document API endpoints in README
- - Add examples of using Tavily with Ikario
- - Document troubleshooting steps
-
- Documentation Sections:
- - Prerequisites (Tavily API key)
- - Installation steps
- - Configuration options
- - Testing Tavily connection
- - Example conversations using internet search
- - Troubleshooting common issues
- - API reference for Tavily endpoints
-
- Priority: 3
- Category: documentation
-
- 1. Verify README has Tavily setup section
- 2. Test that setup instructions are clear and complete
- 3. Verify all configuration options are documented
- 4. Test examples work as described
- 5. Verify troubleshooting section covers common issues
-
-
-
-
-
-
- Recommended implementation order:
- 1. Feature 1 (MCP Client Setup) - Foundation
- 2. Feature 2 (Tool Configuration) - Core functionality
- 3. Feature 3 (Tool Executor Integration) - Core functionality
- 4. Feature 8 (Configuration) - Required for testing
- 5. Feature 4 (System Prompt) - Makes tools accessible to Claude
- 6. Feature 9 (Error Handling) - Production readiness
- 7. Feature 5 (Status API) - Monitoring
- 8. Feature 10 (Documentation) - User onboarding
- 9. Feature 6 (UI Indicator) - Nice to have
- 10. Feature 7 (Manual Search UI) - Optional enhancement
-
-
-
- After implementing features 1-5, you should be able to:
- - Ask Ikario: "Quelle est l'actualité aujourd'hui ?"
- - Ask Ikario: "Recherche des informations sur [topic actuel]"
- - Ask Ikario: "Vérifie cette information : [claim]"
-
- Ikario should autonomously use Tavily search and cite sources.
-
-
-
- - This specification is fully compatible with existing ikario-memory MCP integration
- - Ikario will have both memory tools AND internet search tools
- - Tools can be used together in the same conversation
- - No conflicts expected between tool systems
-
-
-
-
-
- - DO NOT expose Tavily API key to frontend or in API responses
- - DO NOT modify existing MCP memory integration
- - DO NOT break existing conversation functionality
- - Tavily should gracefully degrade if unavailable (don't crash the app)
- - Implement proper rate limiting to avoid API abuse
- - Validate all user inputs before passing to Tavily
- - Sanitize search results before displaying (XSS prevention)
- - Log all Tavily API calls for monitoring and debugging
-
-
-
-
- - Ikario can successfully perform internet searches when asked
- - Search results are relevant and well-formatted
- - Sources are properly cited
- - Tavily integration doesn't slow down conversations
- - Error handling is robust and user-friendly
- - Configuration is straightforward
- - Documentation is clear and complete
-
-
diff --git a/prompts/app_spec_types_docs.backup.txt b/prompts/app_spec_types_docs.backup.txt
deleted file mode 100644
index 0fe4fa6..0000000
--- a/prompts/app_spec_types_docs.backup.txt
+++ /dev/null
@@ -1,679 +0,0 @@
-
- Library RAG - Type Safety & Documentation Enhancement
-
-
- Enhance the Library RAG application (philosophical texts indexing and semantic search) by adding
- strict type annotations and comprehensive Google-style docstrings to all Python modules. This will
- improve code maintainability, enable static type checking with mypy, and provide clear documentation
- for all functions, classes, and modules.
-
- The application is a RAG pipeline that processes PDF documents through OCR, LLM-based extraction,
- semantic chunking, and ingestion into Weaviate vector database. It includes a Flask web interface
- for document upload, processing, and semantic search.
-
-
-
-
- Python 3.10+
- Flask 3.0
- Weaviate 1.34.4 with text2vec-transformers
- Mistral OCR API
- Ollama (local) or Mistral API
- mypy with strict configuration
-
-
- Docker Compose (Weaviate + transformers)
- weaviate-client, flask, mistralai, python-dotenv
-
-
-
-
-
- - flask_app.py: Main Flask application (640 lines)
- - schema.py: Weaviate schema definition (383 lines)
- - utils/: 16+ modules for PDF processing pipeline
- - pdf_pipeline.py: Main orchestration (879 lines)
- - mistral_client.py: OCR API client
- - ocr_processor.py: OCR processing
- - markdown_builder.py: Markdown generation
- - llm_metadata.py: Metadata extraction via LLM
- - llm_toc.py: Table of contents extraction
- - llm_classifier.py: Section classification
- - llm_chunker.py: Semantic chunking
- - llm_cleaner.py: Chunk cleaning
- - llm_validator.py: Document validation
- - weaviate_ingest.py: Database ingestion
- - hierarchy_parser.py: Document hierarchy parsing
- - image_extractor.py: Image extraction from PDFs
- - toc_extractor*.py: Various TOC extraction methods
- - templates/: Jinja2 templates for Flask UI
- - tests/utils2/: Minimal test coverage (3 test files)
-
-
-
- - Inconsistent type annotations across modules (some have partial types, many have none)
- - Missing or incomplete docstrings (no Google-style format)
- - No mypy configuration for strict type checking
- - Type hints missing on function parameters and return values
- - Dict[str, Any] used extensively without proper typing
- - No type stubs for complex nested structures
-
-
-
-
-
-
- - Add complete type annotations to ALL functions and methods
- - Use proper generic types (List, Dict, Optional, Union) from typing module
- - Add TypedDict for complex dictionary structures
- - Add Protocol types for duck-typed interfaces
- - Use Literal types for string constants
- - Add ParamSpec and TypeVar where appropriate
- - Type all class attributes and instance variables
- - Add type annotations to lambda functions where possible
-
-
-
- - Create mypy.ini with strict configuration
- - Enable: check_untyped_defs, disallow_untyped_defs, disallow_incomplete_defs
- - Enable: disallow_untyped_calls, disallow_untyped_decorators
- - Enable: warn_return_any, warn_redundant_casts
- - Enable: strict_equality, strict_optional
- - Set python_version to 3.10
- - Configure per-module overrides if needed for gradual migration
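The flags listed above translate directly into a mypy.ini along these lines:

```ini
[mypy]
python_version = 3.10
check_untyped_defs = True
disallow_untyped_defs = True
disallow_incomplete_defs = True
disallow_untyped_calls = True
disallow_untyped_decorators = True
warn_return_any = True
warn_redundant_casts = True
strict_equality = True
strict_optional = True
```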
-
-
-
- - Create TypedDict definitions for common data structures:
- - OCR response structures
- - Metadata dictionaries
- - TOC entries
- - Chunk objects
- - Weaviate objects
- - Pipeline results
- - Add NewType for semantic type safety (DocumentName, ChunkId, etc.)
- - Create Protocol types for callback functions
-
-
-
- - pdf_pipeline.py: Type all 10 pipeline steps, callbacks, result dictionaries
- - flask_app.py: Type all route handlers, request/response types
- - schema.py: Type Weaviate configuration objects
- - llm_*.py: Type LLM request/response structures
- - mistral_client.py: Type API client methods and responses
- - weaviate_ingest.py: Type ingestion functions and batch operations
-
-
-
-
-
- - Add comprehensive Google-style docstrings to ALL:
- - Module-level docstrings explaining purpose and usage
- - Class docstrings with Attributes section
- - Function/method docstrings with Args, Returns, Raises sections
- - Complex algorithm explanations with Examples section
- - Include code examples for public APIs
- - Document all exceptions that can be raised
- - Add Notes section for important implementation details
- - Add See Also section for related functions
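For reference, a function documented in the expected Google style; the signature itself is illustrative:

```python
def estimate_ocr_cost(page_count: int, price_per_page: float = 0.003) -> float:
    """Estimate the OCR cost of a document in euros.

    Args:
        page_count: Number of pages to send to the OCR API.
        price_per_page: Unit price in euros; defaults to the rate
            quoted elsewhere in this spec.

    Returns:
        The estimated cost, rounded to 6 decimal places.

    Raises:
        ValueError: If page_count is negative.

    Example:
        >>> estimate_ocr_cost(100)
        0.3
    """
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    return round(page_count * price_per_page, 6)
```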
-
-
-
-
- - pdf_pipeline.py: Document the 10-step pipeline, each step's purpose
- - mistral_client.py: Document OCR API usage, cost calculation
- - llm_metadata.py: Document metadata extraction logic
- - llm_toc.py: Document TOC extraction strategies
- - llm_classifier.py: Document section classification types
- - llm_chunker.py: Document semantic vs basic chunking
- - llm_cleaner.py: Document cleaning rules and validation
- - llm_validator.py: Document validation criteria
- - weaviate_ingest.py: Document ingestion process, nested objects
- - hierarchy_parser.py: Document hierarchy building algorithm
-
-
-
- - Document all routes with request/response examples
- - Document SSE (Server-Sent Events) implementation
- - Document Weaviate query patterns
- - Document upload processing workflow
- - Document background job management
-
-
-
- - Document Weaviate schema design decisions
- - Document each collection's purpose and relationships
- - Document nested object structure
- - Document vectorization strategy
-
-
-
-
- - Add inline comments for complex logic only (don't over-comment)
- - Explain WHY not WHAT (code should be self-documenting)
- - Document performance considerations
- - Document cost implications (OCR, LLM API calls)
- - Document error handling strategies
-
-
-
-
-
- - All modules must pass mypy --strict
- - No # type: ignore comments without justification
- - CI/CD should run mypy checks
- - Type coverage should be 100%
-
-
-
- - All public functions must have docstrings
- - All docstrings must follow Google style
- - Examples should be executable and tested
- - Documentation should be clear and concise
-
-
-
-
-
-
- Priority 1 (Most used, most complex):
- 1. utils/pdf_pipeline.py - Main orchestration
- 2. flask_app.py - Web application entry point
- 3. utils/weaviate_ingest.py - Database operations
- 4. schema.py - Schema definition
-
- Priority 2 (Core LLM modules):
- 5. utils/llm_metadata.py
- 6. utils/llm_toc.py
- 7. utils/llm_classifier.py
- 8. utils/llm_chunker.py
- 9. utils/llm_cleaner.py
- 10. utils/llm_validator.py
-
- Priority 3 (OCR and parsing):
- 11. utils/mistral_client.py
- 12. utils/ocr_processor.py
- 13. utils/markdown_builder.py
- 14. utils/hierarchy_parser.py
- 15. utils/image_extractor.py
-
- Priority 4 (Supporting modules):
- 16. utils/toc_extractor.py
- 17. utils/toc_extractor_markdown.py
- 18. utils/toc_extractor_visual.py
- 19. utils/llm_structurer.py (legacy)
-
-
-
-
-
- Setup Type Checking Infrastructure
-
- Configure mypy with strict settings and create foundational type definitions
-
-
- - Create mypy.ini configuration file with strict settings
- - Add mypy to requirements.txt or dev dependencies
- - Create utils/types.py module for common TypedDict definitions
- - Define core types: OCRResponse, Metadata, TOCEntry, ChunkData, PipelineResult
- - Add NewType definitions for semantic types: DocumentName, ChunkId, SectionPath
- - Create Protocol types for callbacks (ProgressCallback, etc.)
- - Document type definitions in utils/types.py module docstring
- - Test mypy configuration on a single module to verify settings
-
-
- - mypy.ini exists with strict configuration
- - utils/types.py contains all foundational types with docstrings
- - mypy runs without errors on utils/types.py
- - Type definitions are comprehensive and reusable
-
-
-
-
- Add Types to PDF Pipeline Orchestration
-
- Add complete type annotations to pdf_pipeline.py (879 lines, most complex module)
-
-
- - Add type annotations to all function signatures in pdf_pipeline.py
- - Type the 10-step pipeline: OCR, Markdown, Metadata, TOC, Classify, Chunk, Clean, Validate, Weaviate
- - Type progress_callback parameter with Protocol or Callable
- - Add TypedDict for pipeline options dictionary
- - Add TypedDict for pipeline result dictionary structure
- - Type all helper functions (extract_document_metadata_legacy, etc.)
- - Add proper return types for process_pdf_v2, process_pdf, process_pdf_bytes
- - Fix any mypy errors that arise
- - Verify mypy --strict passes on pdf_pipeline.py
-
-
- - All functions in pdf_pipeline.py have complete type annotations
- - progress_callback is properly typed with Protocol
- - All Dict[str, Any] replaced with TypedDict where appropriate
- - mypy --strict pdf_pipeline.py passes with zero errors
- - No # type: ignore comments (or justified if absolutely necessary)
-
-
-
-
- Add Types to Flask Application
-
- Add complete type annotations to flask_app.py and type all routes
-
-
- - Add type annotations to all Flask route handlers
- - Type request.args, request.form, request.files usage
- - Type jsonify() return values
- - Type get_weaviate_client context manager
- - Type get_collection_stats, get_all_chunks, search_chunks functions
- - Add TypedDict for Weaviate query results
- - Type background job processing functions (run_processing_job)
- - Type SSE generator function (upload_progress)
- - Add type hints for template rendering
- - Verify mypy --strict passes on flask_app.py
-
-
- - All Flask routes have complete type annotations
- - Request/response types are clear and documented
- - Weaviate query functions are properly typed
- - SSE generator is correctly typed
- - mypy --strict flask_app.py passes with zero errors
-
-
-
-
- Add Types to Core LLM Modules
-
- Add complete type annotations to all LLM processing modules (metadata, TOC, classifier, chunker, cleaner, validator)
-
-
- - llm_metadata.py: Type extract_metadata function, return structure
- - llm_toc.py: Type extract_toc function, TOC hierarchy structure
- - llm_classifier.py: Type classify_sections, section types (Literal), validation functions
- - llm_chunker.py: Type chunk_section_with_llm, chunk objects
- - llm_cleaner.py: Type clean_chunk, is_chunk_valid functions
- - llm_validator.py: Type validate_document, validation result structure
- - Add TypedDict for LLM request/response structures
- - Type provider selection ("ollama" | "mistral" as Literal)
- - Type model names with Literal or constants
- - Verify mypy --strict passes on all llm_*.py modules
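The Literal provider typing can also be enforced at the runtime boundary, for example:

```python
from typing import Literal, cast, get_args

LLMProvider = Literal["ollama", "mistral"]

def check_provider(name: str) -> LLMProvider:
    """Narrow an untrusted string to the LLMProvider Literal at runtime."""
    if name not in get_args(LLMProvider):
        raise ValueError(f"Unknown LLM provider: {name!r}")
    return cast(LLMProvider, name)
```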
-
-
- - All LLM modules have complete type annotations
- - Section types use Literal for type safety
- - Provider and model parameters are strongly typed
- - LLM request/response structures use TypedDict
- - mypy --strict passes on all llm_*.py modules with zero errors
-
-
-
-
- Add Types to Weaviate and Database Modules
-
- Add complete type annotations to schema.py and weaviate_ingest.py
-
-
- - schema.py: Type Weaviate configuration objects
- - schema.py: Type collection property definitions
- - weaviate_ingest.py: Type ingest_document function signature
- - weaviate_ingest.py: Type delete_document_chunks function
- - weaviate_ingest.py: Add TypedDict for Weaviate object structure
- - Type batch insertion operations
- - Type nested object references (work, document)
- - Add proper error types for Weaviate exceptions
- - Verify mypy --strict passes on both modules
-
-
- - schema.py has complete type annotations for Weaviate config
- - weaviate_ingest.py functions are fully typed
- - Nested object structures use TypedDict
- - Weaviate client operations are properly typed
- - mypy --strict passes on both modules with zero errors
-
-
-
-
- Add Types to OCR and Parsing Modules
-
- Add complete type annotations to mistral_client.py, ocr_processor.py, markdown_builder.py, hierarchy_parser.py
-
-
- - mistral_client.py: Type create_client, run_ocr, estimate_ocr_cost
- - mistral_client.py: Add TypedDict for Mistral API response structures
- - ocr_processor.py: Type serialize_ocr_response, OCR object structures
- - markdown_builder.py: Type build_markdown, image_writer parameter
- - hierarchy_parser.py: Type build_hierarchy, flatten_hierarchy functions
- - hierarchy_parser.py: Add TypedDict for hierarchy node structure
- - image_extractor.py: Type create_image_writer, image handling
- - Verify mypy --strict passes on all modules
-
-
- - All OCR/parsing modules have complete type annotations
- - Mistral API structures use TypedDict
- - Hierarchy nodes are properly typed
- - Image handling functions are typed
- - mypy --strict passes on all modules with zero errors
-
-
-
-
- Add Google-Style Docstrings to Core Modules
-
- Add comprehensive Google-style docstrings to pdf_pipeline.py, flask_app.py, and weaviate modules
-
-
- - pdf_pipeline.py: Add module docstring explaining the V2 pipeline
- - pdf_pipeline.py: Add docstrings to process_pdf_v2 with Args, Returns, Raises sections
- - pdf_pipeline.py: Document each of the 10 pipeline steps in comments
- - pdf_pipeline.py: Add Examples section showing typical usage
- - flask_app.py: Add module docstring explaining Flask application
- - flask_app.py: Document all routes with request/response examples
- - flask_app.py: Document Weaviate connection management
- - schema.py: Add module docstring explaining schema design
- - schema.py: Document each collection's purpose and relationships
- - weaviate_ingest.py: Document ingestion process with examples
- - All docstrings must follow Google style format exactly
-
-
- - All core modules have comprehensive module-level docstrings
- - All public functions have Google-style docstrings
- - Args, Returns, Raises sections are complete and accurate
- - Examples are provided for complex functions
- - Docstrings explain WHY, not just WHAT
-
-
-
-
- Add Google-Style Docstrings to LLM Modules
-
- Add comprehensive Google-style docstrings to all LLM processing modules
-
-
- - llm_metadata.py: Document metadata extraction logic with examples
- - llm_toc.py: Document TOC extraction strategies and fallbacks
- - llm_classifier.py: Document section types and classification criteria
- - llm_chunker.py: Document semantic vs basic chunking approaches
- - llm_cleaner.py: Document cleaning rules and validation logic
- - llm_validator.py: Document validation criteria and corrections
- - Add Examples sections showing input/output for each function
- - Document LLM provider differences (Ollama vs Mistral)
- - Document cost implications in Notes sections
- - All docstrings must follow Google style format exactly
-
-
- - All LLM modules have comprehensive docstrings
- - Each function has Args, Returns, Raises sections
- - Examples show realistic input/output
- - Provider differences are documented
- - Cost implications are noted where relevant
-
-
-
-
- Add Google-Style Docstrings to OCR and Parsing Modules
-
- Add comprehensive Google-style docstrings to OCR, markdown, hierarchy, and extraction modules
-
-
- - mistral_client.py: Document OCR API usage, cost calculation
- - ocr_processor.py: Document OCR response processing
- - markdown_builder.py: Document markdown generation strategy
- - hierarchy_parser.py: Document hierarchy building algorithm
- - image_extractor.py: Document image extraction process
- - toc_extractor*.py: Document various TOC extraction methods
- - Add Examples sections for complex algorithms
- - Document edge cases and error handling
- - All docstrings must follow Google style format exactly
-
-
- - All OCR/parsing modules have comprehensive docstrings
- - Complex algorithms are well explained
- - Edge cases are documented
- - Error handling is documented
- - Examples demonstrate typical usage
-
-
-
-
- Final Validation and CI Integration
-
- Verify all type annotations and docstrings, integrate mypy into CI/CD
-
-
- - Run mypy --strict on entire codebase, verify 100% pass rate
- - Verify all public functions have docstrings
- - Check docstring formatting with pydocstyle or similar tool
- - Create GitHub Actions workflow to run mypy on every commit
- - Update README.md with type checking instructions
- - Update CLAUDE.md with documentation standards
- - Create CONTRIBUTING.md with type annotation and docstring guidelines
- - Generate API documentation with Sphinx or pdoc
- - Fix any remaining mypy errors or missing docstrings
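The GitHub Actions step above can be sketched as a minimal workflow; the file path, Python version, and install step are assumptions, not taken from the project:

```yaml
# .github/workflows/typecheck.yml — run mypy on every push and pull request
name: typecheck
on: [push, pull_request]
jobs:
  mypy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      - run: pip install mypy
      - run: mypy --strict .
```

A real project would likely pin the mypy version and install the package's own dependencies so third-party stubs resolve.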
-
-
- - mypy --strict passes on entire codebase with zero errors
- - All public functions have Google-style docstrings
- - CI/CD runs mypy checks automatically
- - Documentation is generated and accessible
- - Contributing guidelines document type/docstring requirements
-
-
-
-
-
-
- - 100% type coverage across all modules
- - mypy --strict passes with zero errors
- - No # type: ignore comments without justification
- - All Dict[str, Any] replaced with TypedDict where appropriate
- - Proper use of generics, protocols, and type variables
- - NewType used for semantic type safety
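The checklist above can be illustrated with a short sketch; the names (`DocumentId`, `ChunkRecord`, `SupportsEmbed`) are hypothetical, not taken from the project:

```python
from typing import NewType, Protocol, TypedDict

# NewType gives semantic safety: a DocumentId is not interchangeable with any str
DocumentId = NewType("DocumentId", str)

# TypedDict replaces Dict[str, Any] with an explicit, checkable shape
class ChunkRecord(TypedDict):
    chunk_id: str
    text: str
    page: int

# Protocol types a dependency structurally, without requiring inheritance
class SupportsEmbed(Protocol):
    def embed(self, text: str) -> list[float]: ...

def make_record(doc: DocumentId, text: str, page: int) -> ChunkRecord:
    return {"chunk_id": f"{doc}-{page}", "text": text, "page": page}

record = make_record(DocumentId("menon"), "Virtue is knowledge.", 3)
```

Under `mypy --strict`, passing a plain `str` where a `DocumentId` is expected, or omitting a `ChunkRecord` key, is flagged at check time.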
-
-
-
- - All modules have comprehensive module-level docstrings
- - All public functions/classes have Google-style docstrings
- - All docstrings include Args, Returns, Raises sections
- - Complex functions include Examples sections
- - Cost implications documented in Notes sections
- - Error handling clearly documented
- - Provider differences (Ollama vs Mistral) documented
-
-
-
- - Code is self-documenting with clear variable names
- - Inline comments explain WHY, not WHAT
- - Complex algorithms are well explained
- - Performance considerations documented
- - Security considerations documented
-
-
-
- - IDE autocomplete works perfectly with type hints
- - Type errors caught at development time, not runtime
- - Documentation is easily accessible in IDE
- - API examples are executable and tested
- - Contributing guidelines are clear and comprehensive
-
-
-
- - Refactoring is safer with type checking
- - Function signatures are self-documenting
- - API contracts are explicit and enforced
- - Breaking changes are caught by type checker
- - New developers can understand code quickly
-
-
-
-
-
- - Must maintain backward compatibility with existing code
- - Cannot break existing Flask routes or API contracts
- - Weaviate schema must remain unchanged
- - Existing tests must continue to pass
-
-
-
- - Can use per-module mypy configuration for gradual migration
- - Can temporarily disable strict checks on legacy modules
- - Priority modules must be completed first
- - Low-priority modules can be deferred
-
-
-
- - Prefer Python 3.10+ syntax (str | None, list[str]) where the target runtime supports it
- - Docstrings must follow Google style exactly (not NumPy or reStructuredText)
- - Fall back to the typing module (List, Dict, Optional) until Python 3.9 support is dropped
- - Use from __future__ import annotations for forward references or to write new syntax on older runtimes
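With the future import, annotations are stored as unevaluated strings (PEP 563), so 3.10-style union syntax parses even on interpreters that cannot evaluate it at runtime. A minimal illustration:

```python
from __future__ import annotations  # annotations kept as strings, never evaluated

def first_or_none(items: list[str]) -> str | None:
    """Return the first item, or None for an empty list."""
    return items[0] if items else None
```

Because evaluation is deferred, `first_or_none.__annotations__["return"]` is the literal string `"str | None"` rather than a type object.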
-
-
-
-
-
- - Run mypy --strict on each module after adding types
- - Use mypy daemon (dmypy) for faster incremental checking
- - Add mypy to pre-commit hooks
- - CI/CD must run mypy and fail on type errors
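A minimal configuration supporting this workflow (strict by default, per-module relaxation for gradual migration) might look like the following; the file name and module path are assumptions:

```ini
# mypy.ini — strict globally, with temporary escape hatches for legacy modules
[mypy]
strict = True
python_version = 3.10

# Remove these overrides as each legacy module is migrated
[mypy-utils.legacy_module]
disallow_untyped_defs = False
```

For pre-commit integration, the official `pre-commit/mirrors-mypy` hook is the usual choice; pin its revision so CI and local checks agree.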
-
-
-
- - Use pydocstyle to validate Google-style format
- - Use sphinx-build to generate docs and catch errors
- - Manual review of docstring examples
- - Verify examples are executable and correct
-
-
-
- - Verify existing tests still pass after type additions
- - Add new tests for complex typed structures
- - Test mypy configuration on sample code
- - Verify IDE autocomplete works correctly
-
-
-
-
-
- ```python
- """
- PDF Pipeline V2 - Intelligent document processing with LLM enhancement.
-
- This module orchestrates a 10-step pipeline for processing PDF documents:
- 1. OCR via Mistral API
- 2. Markdown construction with images
- 3. Metadata extraction via LLM
- 4. Table of contents (TOC) extraction
- 5. Section classification
- 6. Semantic chunking
- 7. Chunk cleaning and validation
- 8. Enrichment with concepts
- 9. Validation and corrections
- 10. Ingestion into Weaviate vector database
-
- The pipeline supports multiple LLM providers (Ollama local, Mistral API) and
- various processing modes (skip OCR, semantic chunking, OCR annotations).
-
- Typical usage:
- >>> from pathlib import Path
- >>> from utils.pdf_pipeline import process_pdf
- >>>
- >>> result = process_pdf(
- ... Path("document.pdf"),
- ... use_llm=True,
- ... llm_provider="ollama",
- ... ingest_to_weaviate=True,
- ... )
- >>> print(f"Processed {result['pages']} pages, {result['chunks_count']} chunks")
-
- See Also:
- mistral_client: OCR API client
- llm_metadata: Metadata extraction
- weaviate_ingest: Database ingestion
- """
- ```
-
-
-
- ```python
- def process_pdf_v2(
- pdf_path: Path,
- output_dir: Path = Path("output"),
- *,
- use_llm: bool = True,
- llm_provider: Literal["ollama", "mistral"] = "ollama",
- llm_model: Optional[str] = None,
- skip_ocr: bool = False,
- ingest_to_weaviate: bool = True,
- progress_callback: Optional[ProgressCallback] = None,
- ) -> PipelineResult:
- """
- Process a PDF through the complete V2 pipeline with LLM enhancement.
-
- This function orchestrates all 10 steps of the intelligent document processing
- pipeline, from OCR to Weaviate ingestion. It supports both local (Ollama) and
- cloud (Mistral API) LLM providers, with optional caching via skip_ocr.
-
- Args:
- pdf_path: Absolute path to the PDF file to process.
- output_dir: Base directory for output files. Defaults to "./output".
- use_llm: Enable LLM-based processing (metadata, TOC, chunking).
- If False, uses basic heuristic processing.
- llm_provider: LLM provider to use. "ollama" for local (free but slow),
- "mistral" for API (fast but paid).
- llm_model: Specific model name. If None, auto-detects based on provider
- (qwen2.5:7b for ollama, mistral-small-latest for mistral).
- skip_ocr: If True, reuses existing markdown file to avoid OCR cost.
- Requires output_dir/<name>/<name>.md to exist.
- ingest_to_weaviate: If True, ingests chunks into Weaviate after processing.
- progress_callback: Optional callback for real-time progress updates.
- Called with (step_id, status, detail) for each pipeline step.
-
- Returns:
- Dictionary containing processing results with the following keys:
- - success (bool): True if processing completed without errors
- - document_name (str): Name of the processed document
- - pages (int): Number of pages in the PDF
- - chunks_count (int): Number of chunks generated
- - cost_ocr (float): OCR cost in euros (0 if skip_ocr=True)
- - cost_llm (float): LLM API cost in euros (0 if provider=ollama)
- - cost_total (float): Total cost (ocr + llm)
- - metadata (dict): Extracted metadata (title, author, etc.)
- - toc (list): Hierarchical table of contents
- - files (dict): Paths to generated files (markdown, chunks, etc.)
-
- Raises:
- FileNotFoundError: If pdf_path does not exist.
- ValueError: If skip_ocr=True but markdown file not found.
- RuntimeError: If Weaviate connection fails during ingestion.
-
- Examples:
- Basic usage with Ollama (free):
- >>> result = process_pdf_v2(
- ... Path("platon_menon.pdf"),
- ... llm_provider="ollama"
- ... )
- >>> print(f"Cost: {result['cost_total']:.4f}€")  # OCR cost only; Ollama is free
- Cost: 0.0270€
-
- With Mistral API (faster):
- >>> result = process_pdf_v2(
- ... Path("platon_menon.pdf"),
- ... llm_provider="mistral",
- ... llm_model="mistral-small-latest"
- ... )
-
- Skip OCR to avoid cost:
- >>> result = process_pdf_v2(
- ... Path("platon_menon.pdf"),
- ... skip_ocr=True, # Reuses existing markdown
- ... ingest_to_weaviate=False
- ... )
-
- Notes:
- - OCR cost: ~0.003€/page (standard), ~0.009€/page (with annotations)
- - LLM cost: Free with Ollama, variable with Mistral API
- - Processing time: ~30s/page with Ollama, ~5s/page with Mistral
- - Weaviate must be running (docker-compose up -d) before ingestion
- """
- ```
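The `ProgressCallback` and `PipelineResult` names used in the signature above need definitions somewhere (the spec points at `utils/types.py`); a plausible sketch, with the callback shape and result keys taken from the docstring and everything else assumed:

```python
from typing import Callable, TypedDict

# Callback signature (step_id, status, detail), as described in the Args section
ProgressCallback = Callable[[str, str, str], None]

class PipelineResult(TypedDict):
    """Keys documented in process_pdf_v2's Returns section."""
    success: bool
    document_name: str
    pages: int
    chunks_count: int
    cost_ocr: float
    cost_llm: float
    cost_total: float
    metadata: dict
    toc: list
    files: dict

# Example value matching the documented shape (numbers are illustrative)
result: PipelineResult = {
    "success": True, "document_name": "platon_menon", "pages": 9,
    "chunks_count": 42, "cost_ocr": 0.027, "cost_llm": 0.0,
    "cost_total": 0.027, "metadata": {}, "toc": [], "files": {},
}
```

Typing the return as a `TypedDict` rather than `Dict[str, Any]` lets mypy catch a misspelled result key at every call site.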
-
-
-
diff --git a/prompts/coding_prompt_library.md b/prompts/coding_prompt_library.md
deleted file mode 100644
index 0f628a3..0000000
--- a/prompts/coding_prompt_library.md
+++ /dev/null
@@ -1,290 +0,0 @@
-## YOUR ROLE - CODING AGENT (Library RAG - Type Safety & Documentation)
-
-You are working on adding strict type annotations and Google-style docstrings to a Python library project.
-This is a FRESH context window - you have no memory of previous sessions.
-
-You have access to Linear for project management via MCP tools. Linear is your single source of truth.
-
-### STEP 1: GET YOUR BEARINGS (MANDATORY)
-
-Start by orienting yourself:
-
-```bash
-# 1. See your working directory
-pwd
-
-# 2. List files to understand project structure
-ls -la
-
-# 3. Read the project specification
-cat app_spec.txt
-
-# 4. Read the Linear project state
-cat .linear_project.json
-
-# 5. Check recent git history
-git log --oneline -20
-```
-
-### STEP 2: CHECK LINEAR STATUS
-
-Query Linear to understand current project state using the project_id from `.linear_project.json`.
-
-1. **Get all issues and count progress:**
- ```
- mcp__linear__list_issues with project_id
- ```
- Count:
- - Issues "Done" = completed
- - Issues "Todo" = remaining
- - Issues "In Progress" = currently being worked on
-
-2. **Find META issue** (if exists) for session context
-
-3. **Check for in-progress work** - complete it first if found
-
-### STEP 3: SELECT NEXT ISSUE
-
-Get Todo issues sorted by priority:
-```
-mcp__linear__list_issues with project_id, status="Todo", limit=5
-```
-
-Select ONE highest-priority issue to work on.
-
-### STEP 4: CLAIM THE ISSUE
-
-Use `mcp__linear__update_issue` to set status to "In Progress"
-
-### STEP 5: IMPLEMENT THE ISSUE
-
-Based on issue category:
-
-**For Type Annotation Issues (e.g., "Types - Add type annotations to X.py"):**
-
-1. Read the target Python file
-2. Identify all functions, methods, and variables
-3. Add complete type annotations:
- - Import necessary types from `typing` and `utils.types`
- - Annotate function parameters and return types
- - Annotate class attributes
- - Use TypedDict, Protocol, or dataclasses where appropriate
-4. Save the file
-5. Run mypy to verify (MANDATORY):
- ```bash
- cd generations/library_rag
- mypy --config-file=mypy.ini
- ```
-6. Fix any mypy errors
-7. Commit the changes
-
-**For Documentation Issues (e.g., "Docs - Add docstrings to X.py"):**
-
-1. Read the target Python file
-2. Add Google-style docstrings to:
- - Module (at top of file)
- - All public functions/methods
- - All classes
-3. Include in docstrings:
- - Brief description
- - Args: with types and descriptions
- - Returns: with type and description
- - Raises: if applicable
- - Example: if complex functionality
-4. Save the file
-5. Optionally run pydocstyle to verify (if installed)
-6. Commit the changes
-
-**For Setup/Infrastructure Issues:**
-
-Follow the specific instructions in the issue description.
-
-### STEP 6: VERIFICATION
-
-**Type Annotation Issues:**
-- Run mypy on the modified file(s)
-- Ensure zero type errors
-- If errors exist, fix them before proceeding
-
-**Documentation Issues:**
-- Review docstrings for completeness
-- Ensure Args/Returns sections match function signatures
-- Check that examples are accurate
-
-**Functional Changes (rare):**
-- If the issue changes behavior, test manually
-- Start Flask server if needed: `python flask_app.py`
-- Test the affected functionality
-
-### STEP 7: GIT COMMIT
-
-Make a descriptive commit:
-```bash
-git add <files>
-git commit -m "<type>: <short description>
-
-- <key change>
-- Verified with mypy (for type issues)
-- Linear issue: <issue ID>
-"
-```
-
-### STEP 8: UPDATE LINEAR ISSUE
-
-1. **Add implementation comment:**
- ```markdown
- ## Implementation Complete
-
- ### Changes Made
- - [List of files modified]
- - [Key changes]
-
- ### Verification
- - mypy passes with zero errors (for type issues)
- - All test steps from issue description verified
-
- ### Git Commit
- [commit hash and message]
- ```
-
-2. **Update status to "Done"** using `mcp__linear__update_issue`
-
-### STEP 9: DECIDE NEXT ACTION
-
-After completing an issue, ask yourself:
-
-1. Have I been working for a while? (Use judgment based on complexity of work done)
-2. Is the code in a stable state?
-3. Would this be a good handoff point?
-
-**If YES to all three:**
-- Proceed to STEP 10 (Session Summary)
-- End cleanly
-
-**If NO:**
-- Continue to another issue (go back to STEP 3)
-- But commit first!
-
-**Pacing Guidelines:**
-- Early phase (< 20% done): Can complete multiple simple issues
-- Mid/late phase (> 20% done): 1-2 issues per session for quality
-
-### STEP 10: SESSION SUMMARY (When Ending)
-
-If META issue exists, add a comment:
-
-```markdown
-## Session Complete
-
-### Completed This Session
-- [Issue ID]: [Title] - [Brief summary]
-
-### Current Progress
-- X issues Done
-- Y issues In Progress
-- Z issues Todo
-
-### Notes for Next Session
-- [Important context]
-- [Recommendations]
-- [Any concerns]
-```
-
-Ensure:
-- All code committed
-- No uncommitted changes
-- App in working state
-
----
-
-## LINEAR WORKFLOW RULES
-
-**Status Transitions:**
-- Todo → In Progress (when starting)
-- In Progress → Done (when verified)
-
-**NEVER:**
-- Delete or modify issue descriptions
-- Mark Done without verification
-- Leave issues In Progress when switching
-
----
-
-## TYPE ANNOTATION GUIDELINES
-
-**Imports needed:**
-```python
-from typing import Optional, Dict, List, Any, Tuple, Callable
-from pathlib import Path
-from utils.types import Metadata, TOCEntry, ChunkData, PipelineResult  # whichever project types the module needs
-```
-
-**Common patterns:**
-```python
-# Functions
-def process_data(data: str, options: Optional[Dict[str, Any]] = None) -> List[str]:
- """Process input data."""
- ...
-
-# Methods with self
-def save(self, path: Path) -> None:
- """Save to file."""
- ...
-
-# Async functions
-async def fetch_data(url: str) -> Dict[str, Any]:
- """Fetch from API."""
- ...
-```
-
-**Use project types from `utils/types.py`:**
-- Metadata, OCRResponse, TOCEntry, ChunkData, PipelineResult, etc.
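As an illustration, one of these project types might be defined like this; the actual fields in `utils/types.py` may differ:

```python
from typing import TypedDict

class TOCEntry(TypedDict):
    """One table-of-contents node; 'children' makes the structure recursive."""
    title: str
    level: int
    page: int
    children: list["TOCEntry"]  # forward reference allows self-nesting

entry: TOCEntry = {
    "title": "Introduction", "level": 1, "page": 1,
    "children": [{"title": "Context", "level": 2, "page": 2, "children": []}],
}
```

Recursive `TypedDict`s like this require a reasonably recent mypy (0.990+), which is worth pinning in the project's dev dependencies.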
-
----
-
-## DOCSTRING TEMPLATE (Google Style)
-
-```python
-def function_name(param1: str, param2: int = 0) -> List[str]:
- """
- Brief one-line description.
-
- More detailed description if needed. Explain what the function does,
- any important behavior, side effects, etc.
-
- Args:
- param1: Description of param1.
- param2: Description of param2. Defaults to 0.
-
- Returns:
- Description of return value.
-
- Raises:
- ValueError: When param1 is empty.
- IOError: When file cannot be read.
-
- Example:
- >>> result = function_name("test", 5)
- >>> print(result)
- ['test', 'test', 'test', 'test', 'test']
- """
-```
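Because the Example section uses doctest format, it can be verified mechanically rather than by eye. A sketch of that check, using a hypothetical function in the same shape as the template:

```python
import doctest

def repeat_word(word: str, times: int = 0) -> list[str]:
    """
    Return the word repeated in a list.

    Example:
        >>> repeat_word("test", 3)
        ['test', 'test', 'test']
    """
    return [word] * times

# Parse the docstring's Example section and execute it against the function
parser = doctest.DocTestParser()
test = parser.get_doctest(
    repeat_word.__doc__, {"repeat_word": repeat_word}, "repeat_word", None, 0
)
outcome = doctest.DocTestRunner().run(test)
```

In practice `python -m doctest module.py` does the same thing file-wide, and catches docstring examples that have drifted from the code.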
-
----
-
-## IMPORTANT REMINDERS
-
-**Your Goal:** Add strict type annotations and comprehensive documentation to all Python modules
-
-**This Session's Goal:** Complete 1-2 issues with quality work and clean handoff
-
-**Quality Bar:**
-- mypy --strict passes with zero errors
-- All public functions have complete Google-style docstrings
-- Code is clean and well-documented
-
-**Context is finite.** End sessions early with good handoff notes. The next agent will continue.
-
----
-
-Begin by running STEP 1 (Get Your Bearings).