diff --git a/REMOTE_WEAVIATE_ARCHITECTURE.md b/REMOTE_WEAVIATE_ARCHITECTURE.md
deleted file mode 100644
index cc4200c..0000000
--- a/REMOTE_WEAVIATE_ARCHITECTURE.md
+++ /dev/null
@@ -1,431 +0,0 @@
-# Architecture for a remote Weaviate (Synology/VPS)
-
-## Your use case
-
-**Situation**: LLM application (local or cloud) → Weaviate (remote Synology or VPS)
-
-**Requirements**:
-- ✅ Maximum reliability
-- ✅ Security (private data)
-- ✅ Acceptable performance
-- ✅ Simple maintenance
-
----
-
-## 🏆 Recommended option: REST API + secure tunnel
-
-### Overall architecture
-
-```
-┌──────────────────────────────────────────────────────────────┐
-│                       LLM Application                        │
-│         (Claude API, OpenAI, local Ollama, etc.)             │
-└────────────────────┬─────────────────────────────────────────┘
-                     │
-                     ▼
-┌──────────────────────────────────────────────────────────────┐
-│             Custom REST API (Flask/FastAPI)                  │
-│   - JWT / API key authentication                             │
-│   - Rate limiting                                            │
-│   - Logging                                                  │
-│   - HTTPS (Let's Encrypt)                                    │
-└────────────────────┬─────────────────────────────────────────┘
-                     │
-                     ▼  (private network or VPN)
-┌──────────────────────────────────────────────────────────────┐
-│                    Synology NAS / VPS                        │
-│  ┌────────────────────────────────────────────────────┐      │
-│  │                  Docker Compose                    │      │
-│  │  ┌──────────────────┐  ┌──────────────────────┐    │      │
-│  │  │  Weaviate :8080  │  │ text2vec-transformers│    │      │
-│  │  └──────────────────┘  └──────────────────────┘    │      │
-│  └────────────────────────────────────────────────────┘      │
-└──────────────────────────────────────────────────────────────┘
-```
-
-### Why this option?
-
-✅ **Maximum reliability** (5/5)
-- HTTP/REST is a standard, battle-tested protocol
-- Automatic retries are easy to add
-- Clear error handling
-
-✅ **Security** (5/5)
-- HTTPS enforced
-- API key authentication
-- Optional IP whitelisting
-- Audit logs
-
-✅ **Performance** (4/5)
-- Network latency is unavoidable
-- gzip compression available
-- Optional Redis cache
-
-✅ **Maintenance** (5/5)
-- Simple code (Flask/FastAPI)
-- Easy monitoring
-- Standard deployment
----
-
-## Comparing the 4 options
-
-### Option 1: Custom REST API (⭐ RECOMMENDED)
-
-**Architecture**: App → REST API → Weaviate
-
-**Example code**:
-
-```python
-# api_server.py (deployed on the VPS/Synology)
-import os
-from pathlib import Path
-
-from fastapi import FastAPI, HTTPException, Security
-from fastapi.security import APIKeyHeader
-import weaviate
-
-app = FastAPI()
-api_key_header = APIKeyHeader(name="X-API-Key")
-
-# Connect to Weaviate (local on the same machine)
-client = weaviate.connect_to_local()
-
-def verify_api_key(api_key: str = Security(api_key_header)):
-    if api_key != os.getenv("API_KEY"):
-        raise HTTPException(status_code=403, detail="Invalid API key")
-    return api_key
-
-@app.post("/search")
-async def search_chunks(
-    query: str,
-    limit: int = 10,
-    api_key: str = Security(verify_api_key)
-):
-    collection = client.collections.get("Chunk")
-    result = collection.query.near_text(
-        query=query,
-        limit=limit
-    )
-    return {"results": [obj.properties for obj in result.objects]}
-
-@app.post("/insert_pdf")
-async def insert_pdf(
-    pdf_path: str,
-    api_key: str = Security(verify_api_key)
-):
-    # Call the library_rag pipeline
-    from utils.pdf_pipeline import process_pdf
-    result = process_pdf(Path(pdf_path))
-    return result
-```
-
-**Deployment**:
-
-```bash
-# On the VPS/Synology
-docker-compose up -d weaviate text2vec
-uvicorn api_server:app --host 0.0.0.0 --port 8000 --ssl-keyfile key.pem --ssl-certfile cert.pem
-```
-
-**Pros**:
-- ✅ Full control over the API
-- ✅ Easy to secure (HTTPS + API key)
-- ✅ Can wrap the whole library_rag pipeline
-- ✅ Easy monitoring and logging
-
-**Cons**:
-- ⚠️ Custom code to maintain
-- ⚠️ Requires a web server (nginx/uvicorn)
----
-
-### Option 2: Direct Weaviate access over VPN
-
-**Architecture**: App → VPN → Weaviate:8080
-
-**Configuration**:
-
-```bash
-# On the Synology: enable VPN Server (OpenVPN/WireGuard)
-# On the client: connect to the VPN
-# Direct access to http://192.168.x.x:8080 (Synology private IP)
-```
-
-**Client code**:
-
-```python
-# In your LLM app
-import weaviate
-
-# Over the VPN, using the Synology private IP
-client = weaviate.connect_to_custom(
-    http_host="192.168.1.100",
-    http_port=8080,
-    http_secure=False,  # inside the VPN, HTTPS is not required
-    grpc_host="192.168.1.100",
-    grpc_port=50051,
-    grpc_secure=False,
-)
-
-# Direct usage
-collection = client.collections.get("Chunk")
-result = collection.query.near_text(query="justice")
-```
-
-**Pros**:
-- ✅ Very simple (no custom code)
-- ✅ Security via the VPN
-- ✅ Uses the Weaviate Python client directly
-
-**Cons**:
-- ⚠️ The VPN must stay up permanently
-- ⚠️ VPN latency
-- ⚠️ No abstraction layer (the app must know about Weaviate)
----
-
-### Option 3: MCP HTTP server on the VPS
-
-**Architecture**: App → MCP over HTTP → Weaviate
-
-**Problem**: FastMCP SSE does not work well in production (as we saw)
-
-**Solution**: a custom MCP-over-HTTP wrapper
-
-```python
-# mcp_http_wrapper.py (on the VPS)
-from fastapi import FastAPI
-from pydantic import BaseModel
-
-from mcp_tools import parse_pdf_handler, search_chunks_handler, SearchChunksInput
-
-app = FastAPI()
-
-class SearchRequest(BaseModel):
-    query: str
-    limit: int = 10
-
-@app.post("/mcp/search_chunks")
-async def mcp_search(req: SearchRequest):
-    # Call the MCP handler directly
-    input_data = SearchChunksInput(query=req.query, limit=req.limit)
-    result = await search_chunks_handler(input_data)
-    return result.model_dump()
-```
-
-**Pros**:
-- ✅ Reuses the existing MCP code
-- ✅ Standard HTTP
-
-**Cons**:
-- ⚠️ MCP over stdio cannot be used remotely
-- ⚠️ A custom HTTP wrapper is needed anyway
-- ⚠️ Equivalent to Option 1, but more complex
-
-**Verdict**: Option 1 (a plain REST API) is better
-
----
-
-### Option 4: SSH tunnel + port forwarding
-
-**Architecture**: App → SSH tunnel → localhost:8080 (remote Weaviate)
-
-**Configuration**:
-
-```bash
-# On your local machine
-ssh -L 8080:localhost:8080 user@synology-ip
-
-# The remote Weaviate is now reachable on localhost:8080
-```
-
-**Code**:
-
-```python
-# In your app (which thinks Weaviate is local)
-client = weaviate.connect_to_local()  # goes to localhost:8080 = the SSH tunnel
-```
-
-**Pros**:
-- ✅ SSH-level security
-- ✅ Simple to set up
-- ✅ No custom code
-
-**Cons**:
-- ⚠️ The tunnel must stay open
-- ⚠️ Not suitable for a cloud app
-- ⚠️ SSH latency
-
----
-
-## 🎯 Recommendations by scenario
-
-### Case 1: Local application (your PC) → Weaviate on Synology/VPS
-
-**Recommendation**: **VPN + direct Weaviate access** (Option 2)
-
-**Why**:
-- Easy to configure on a Synology (built-in VPN Server)
-- No custom code
-- Security via the VPN
-- Acceptable performance on a LAN/VPN
-
-**Setup**:
-
-1. Synology: enable VPN Server (OpenVPN)
-2. Client: connect to the VPN
-3. Python: `weaviate.connect_to_custom(http_host="192.168.x.x", ...)`
-
----
-
-### Case 2: Cloud application (remote server) → Weaviate on Synology/VPS
-
-**Recommendation**: **Custom REST API** (Option 1)
-
-**Why**:
-- No VPN required
-- Public HTTPS with Let's Encrypt
-- Access control via API key
-- Rate limiting
-- Monitoring
-
-**Setup**:
-
-1. VPS/Synology: Docker Compose (Weaviate + REST API)
-2. Domain: api.monrag.com → VPS IP
-3. Let's Encrypt: automatic HTTPS
-4. Cloud app: calls `https://api.monrag.com/search` with the `X-API-Key` header
-
----
-
-### Case 3: Temporary local development → remote Weaviate
-
-**Recommendation**: **SSH tunnel** (Option 4)
-
-**Why**:
-- One-line setup
-- No permanent configuration
-- Perfect for dev/debugging
-
-**Setup**:
-
-```bash
-ssh -L 8080:localhost:8080 user@vps
-# The remote Weaviate is reachable on localhost:8080
-```
-
----
-
-## 🔧 Recommended VPS deployment
-
-### Full stack
-
-```yaml
-# docker-compose.yml (on the VPS)
-version: '3.8'
-
-services:
-  # Weaviate + embeddings
-  weaviate:
-    image: cr.weaviate.io/semitechnologies/weaviate:1.34.4
-    ports:
-      - "127.0.0.1:8080:8080"  # bound to localhost only (security)
-    environment:
-      AUTHENTICATION_APIKEY_ENABLED: "true"
-      AUTHENTICATION_APIKEY_ALLOWED_KEYS: "my-secret-key"
-      # ... other settings
-    volumes:
-      - weaviate_data:/var/lib/weaviate
-
-  text2vec-transformers:
-    image: cr.weaviate.io/semitechnologies/transformers-inference:baai-bge-m3-onnx-latest
-    # ... config
-
-  # Custom REST API
-  api:
-    build: ./api
-    ports:
-      - "8000:8000"
-    environment:
-      WEAVIATE_URL: http://weaviate:8080
-      API_KEY: ${API_KEY}
-      MISTRAL_API_KEY: ${MISTRAL_API_KEY}
-    depends_on:
-      - weaviate
-    restart: always
-
-  # NGINX reverse proxy + HTTPS
-  nginx:
-    image: nginx:alpine
-    ports:
-      - "80:80"
-      - "443:443"
-    volumes:
-      - ./nginx.conf:/etc/nginx/nginx.conf
-      - /etc/letsencrypt:/etc/letsencrypt
-    depends_on:
-      - api
-
-volumes:
-  weaviate_data:
-```
-
-### NGINX config
-
-```nginx
-# nginx.conf
-# Note: the api_limit zone must be declared in the http{} block, e.g.:
-#   limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;
-server {
-    listen 443 ssl;
-    server_name api.monrag.com;
-
-    ssl_certificate /etc/letsencrypt/live/api.monrag.com/fullchain.pem;
-    ssl_certificate_key /etc/letsencrypt/live/api.monrag.com/privkey.pem;
-
-    location / {
-        proxy_pass http://api:8000;
-        proxy_set_header Host $host;
-        proxy_set_header X-Real-IP $remote_addr;
-
-        # Rate limiting
-        limit_req zone=api_limit burst=10 nodelay;
-    }
-}
-```
-
----
-
-## 📊 Final comparison
-
-| Criterion | VPN + direct | REST API | SSH tunnel | MCP HTTP |
-|-----------|--------------|----------|------------|----------|
-| **Reliability** | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ |
-| **Security** | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
-| **Simplicity** | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ |
-| **Performance** | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ |
-| **Maintenance** | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ |
-| **Production-ready** | ✅ Yes | ✅ Yes | ❌ No | ⚠️ Possible |
-
----
-
-## 💡 My final recommendation
-
-### For a Synology (personal/team use)
-**VPN + direct Weaviate access** (Option 2)
-- Synology ships an excellent built-in VPN Server
-- Maximum security
-- Easy to maintain
-
-### For a VPS (production/public use)
-**Custom REST API** (Option 1)
-- Full control
-- Public HTTPS
-- Scalable
-- Full monitoring
-
----
-
-## 🚀 Recommended next step
-
-Would you like me to create:
-
-1. **The REST API code** (Flask/FastAPI) with auth + rate limiting?
-2. **The complete VPS docker-compose** with nginx + Let's Encrypt?
-3. **The Synology VPN installation guide** + client config?
-
-Tell me your exact use case and I will prepare the complete solution! 🎯
diff --git a/navette.txt b/navette.txt
deleted file mode 100644
index af78880..0000000
--- a/navette.txt
+++ /dev/null
@@ -1,2510 +0,0 @@
-================================================================================
-NAVETTE - CLAUDE <-> DAVID COMMUNICATION
-================================================================================
-Date: December 19, 2025
-Last update: NEW SPEC CREATED
-
-================================================================================
-NEW SPEC CREATED!
-================================================================================
-
-I have COMPLETELY rewritten the spec as you requested.
-
-NEW FILE: prompts/app_spec_ikario_rag_UI.txt
-
-================================================================================
-DIFFERENCES FROM THE OLD SPEC
-================================================================================
-
-OLD SPEC (app_spec_ikario_rag_improvements.txt):
-❌ Modified the ikario_rag Python code (mcp_ikario_memory.py, server.py)
-❌ Developed inside SynologyDrive
-❌ Added MCP tools to the Python server
-❌ Caused the problem (the agent modified your code)
-
-NEW SPEC (app_spec_ikario_rag_UI.txt):
-✓ Develops ONLY inside generations/ikario_body/
-✓ USES the 7 existing MCP tools (via the client)
-✓ DOES NOT touch the ikario_rag code
-✓ Adds a user interface to exploit the memory
-================================================================================
-15 NOUVELLES FEATURES (FRONTEND + BACKEND)
-================================================================================
-
-BACKEND (server/):
-1. Routes API Memory (POST /api/memory/thoughts, GET, etc.)
-2. Memory Service Layer (wrapper MCP client)
-3. Error Handling & Logging (robuste)
-4. Memory Stats Endpoint (statistiques)
-
-FRONTEND (src/):
-5. useMemory Hook (React hook centralise)
-6. Memory Panel Component (sidebar memoire)
-7. Add Thought Modal (ajouter pensees)
-8. Memory Settings Panel (preferences)
-9. Save to Memory Button (depuis chat)
-10. Memory Context Panel (contexte pendant chat)
-11. Memory Search Interface (recherche avancee)
-12. Concepts Graph Visualization (graphe interactif)
-
-DOCUMENTATION & TESTS:
-13. Memory API Guide (doc complete)
-14. Integration Tests (tests backend)
-15. Memory Tour (onboarding users)
-
-================================================================================
-EXISTING MCP TOOLS USED
-================================================================================
-
-The ikario_rag server already exposes 7 MCP tools:
-1. add_thought - add a thought
-2. add_conversation - add a conversation
-3. search_thoughts - search thoughts
-4. search_conversations - search conversations
-5. search_memories - global search
-6. trace_concept_evolution - trace a concept's evolution
-7. check_consistency - consistency check
-
-We use these tools VIA the MCP client already present in ikario_body:
-- server/services/mcpClient.js
-
-================================================================================
-ARCHITECTURE
-================================================================================
-
-User Interface (React)
-    ↓
-Backend API (Express routes)
-    ↓
-Memory Service (wrapper)
-    ↓
-MCP Client (mcpClient.js)
-    ↓
-MCP Protocol (stdio)
-    ↓
-Ikario RAG Server (Python, SynologyDrive)
-    ↓
-ChromaDB (embeddings)
-
-NO MODIFICATION inside ikario_rag (SynologyDrive)!
-
-================================================================================
-NEXT ACTIONS
-================================================================================
-
-1. DELETE THE OLD SPEC?
-   - File: prompts/app_spec_ikario_rag_improvements.txt
-   - Options:
-     a) DELETE (recommended, it causes confusion)
-     b) RENAME to .OLD (backup)
-     c) KEEP (but risks being relaunched by mistake)
-
-2. DELETE THE 15 EXISTING LINEAR ISSUES?
-   - Issues TEAMPHI-305 to 319 (old features)
-   - These issues are about modifying ikario_rag (which we no longer want)
-   - Options:
-     a) DELETE them all (clean slate)
-     b) KEEP as documentation (but mark them Canceled)
-
-3. CREATE 15 NEW ISSUES?
-   - For the 15 features of the new spec (UI)
-   - Issues that develop inside ikario_body
-   - Options:
-     a) YES, create them now with initializer bis
-     b) YES, but manually in Linear
-     c) NO, just develop without Linear
-
-================================================================================
-MY RECOMMENDATIONS
-================================================================================
-
-1. OLD SPEC: DELETE
-   - File app_spec_ikario_rag_improvements.txt
-   - Avoids future confusion
-   - The new spec is complete
-
-2. OLD ISSUES (305-319): DELETE THEM ALL
-   - They are about modifying ikario_rag
-   - We no longer want to do that
-   - Clean slate
-
-3. NEW ISSUES: CREATE NOW
-   - 15 new issues for the UI features
-   - Launch initializer bis with the new spec
-   - Develop only inside ikario_body
-   - With sandbox restrictions for SynologyDrive
-
-================================================================================
-CLEANUP COMMANDS
-================================================================================
-
-If you agree with my recommendations:
-
-1. Delete the old spec:
-   rm C:/GitHub/Linear_coding/prompts/app_spec_ikario_rag_improvements.txt
-
-2. Delete the 15 old issues:
-   (I can do it via the Linear API)
-
-3. Create the 15 new issues:
-   python autonomous_agent_demo.py --project-dir ikario_body --new-spec app_spec_ikario_rag_UI.txt
-
-4. Add sandbox restrictions (before launching the agent):
-   (I need to modify autonomous_agent_demo.py to block SynologyDrive)
-================================================================================
-QUESTIONS POUR TOI
-================================================================================
-
-Reponds avec 3 choix:
-
-1. Ancien spec (app_spec_ikario_rag_improvements.txt):
- a) SUPPRIMER
- b) RENOMMER .OLD
- c) GARDER
-
-2. Anciennes issues Linear (TEAMPHI-305 a 319):
- a) SUPPRIMER toutes
- b) GARDER comme doc (Canceled)
- c) GARDER telles quelles
-
-3. Nouvelles issues pour nouveau spec:
- a) CREER maintenant (agent initializer bis)
- b) CREER manuellement dans Linear
- c) PAS D'ISSUES (developper sans Linear)
-
-Exemple de reponse:
-1. a
-2. a
-3. a
-
-================================================================================
-NEW SPEC VERIFICATION
-================================================================================
-
-The new spec is in: prompts/app_spec_ikario_rag_UI.txt
-
-You can read it to check that it is what you want.
-
-Key points:
-- 15 frontend/backend features
-- ZERO modification of ikario_rag
-- Developed in ikario_body only
-- Uses the existing MCP tools
-- 5 implementation phases (7-10 days total)
-
-If you want changes to the spec, say so now BEFORE the issues are created.
-
-================================================================================
-SUMMARY OF FUNCTIONAL REQUIREMENTS
-================================================================================
-Date: December 19, 2025
-Last update: SUMMARY ADDED
-
-You asked me to clarify the two main features.
-
-Here is my understanding and my summary:
-
-================================================================================
-REQUIREMENT 1: THOUGHTS
-================================================================================
-
-DESIRED BEHAVIOR:
-- The LLM can WRITE thoughts whenever it wants
-- The LLM can READ existing thoughts
-- The LLM can SEARCH for relevant thoughts
-
-MCP TOOLS USED (already exposed by ikario_rag):
-1. add_thought - to WRITE a new thought
-2. search_thoughts - to SEARCH thoughts
-
-HOW IT WORKS:
-- During a conversation, the LLM decides to save a reflection
-- Example: "I just understood that the user prefers React over Vue"
-- The LLM calls add_thought via the MCP client
-- The thought is stored in ChromaDB with semantic embeddings
-- Later, the LLM can search: "the user's frontend preferences"
-- search_thoughts returns the relevant thoughts
-
-INVOCATION MODES:
-- MANUAL (LLM decides): the LLM uses the tool when it deems it necessary
-- MANUAL (user decides): a "Save to Memory" button in the chat UI
-- SEMI-AUTO: automatic suggestion after important conversations
-
-================================================================================
-REQUIREMENT 2: CONVERSATIONS (AUTO-SAVE)
-================================================================================
-
-DESIRED BEHAVIOR:
-- After EACH LLM reply, the conversation is saved
-- AUTOMATIC save (no manual action needed)
-- Same conversation = all messages are linked (conversation_id)
-
-MCP TOOLS USED (already exposed by ikario_rag):
-1. add_conversation - to SAVE the conversation
-
-HOW IT WORKS:
-- User: "How do I make a fetch API call in React?"
-- LLM: [detailed answer about the fetch API]
-- AUTOMATICALLY after the LLM reply:
-  * The backend detects the end of the LLM reply
-  * The backend calls add_conversation with:
-    - user_message: "How do I make a fetch API call in React?"
-    - assistant_message: [the LLM reply]
-    - conversation_id: a unique ID for this chat session
-  * ChromaDB stores it with semantic embeddings
-- Next time, searching "React fetch API" will return this conversation
-
-TECHNICAL ARCHITECTURE:
-- Backend hook: onMessageComplete()
-- Triggered: after each LLM reply has fully streamed
-- Calls: mcpClient.callTool('add_conversation', {...})
-- Parameters:
-  {
-    user_message: string,
-    assistant_message: string,
-    conversation_id: string (session UUID),
-    timestamp: ISO date,
-    metadata: {
-      model: "claude-sonnet-4.5",
-      tokens: number,
-      ...
-    }
-  }
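
The parameter shape above can be sketched as a plain payload builder. This is an illustrative sketch, not the actual ikario_body code; the field names simply mirror the parameter list above:

```python
import uuid
from datetime import datetime, timezone

def build_conversation_payload(user_message, assistant_message,
                               conversation_id=None,
                               model="claude-sonnet-4.5", tokens=0):
    """Assemble the arguments passed to the add_conversation MCP tool."""
    return {
        "user_message": user_message,
        "assistant_message": assistant_message,
        # One UUID per chat session, reused for every exchange in that session
        "conversation_id": conversation_id or str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "metadata": {"model": model, "tokens": tokens},
    }

payload = build_conversation_payload(
    "How do I make a fetch API call in React?",
    "Use fetch() inside useEffect...",
)
```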
-
-================================================================================
-FULL MAPPING OF THE 7 MCP TOOLS
-================================================================================
-
-FOR THOUGHTS:
-1. add_thought -------> write a new thought
-2. search_thoughts ---> search thoughts
-3. trace_concept_evolution -> trace a concept's evolution across thoughts
-4. check_consistency -> check consistency between thoughts
-
-FOR CONVERSATIONS:
-1. add_conversation -----> save a conversation (AUTO)
-2. search_conversations -> search the history
-3. search_memories -------> global search (thoughts + conversations)
-
-ADVANCED (optional):
-1. trace_concept_evolution -> see how a concept evolves over time
-2. check_consistency --------> detect contradictions
-
-================================================================================
-IMPLEMENTATION ARCHITECTURE
-================================================================================
-
-BACKEND (Express API):
-------------------
-1. POST /api/chat/message
-   - Receives the user message
-   - Sends it to the Claude API
-   - Streams the reply
-   - AFTER streaming completes:
-     * Calls add_conversation automatically
-     * Returns success to the frontend
-
-2. POST /api/memory/thoughts (manual)
-   - The user clicks "Save to Memory"
-   - The backend calls add_thought
-   - Returns a confirmation
-
-3. GET /api/memory/search?q=...
-   - The user searches in the sidebar
-   - The backend calls search_memories
-   - Returns results (thoughts + conversations)
-
-FRONTEND (React):
---------------
-1. Chat interface:
-   - "Save to Memory" button on each message
-   - Auto-save indicator (small icon when a conversation is saved)
-
-2. Memory sidebar:
-   - Search bar
-   - Result list (thoughts + conversations)
-   - Filter: "Thoughts only" / "Conversations only" / "All"
-
-3. Memory context panel:
-   - While typing, shows relevant thoughts/conversations
-   - Auto-search based on the message context
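
The backend routes above reduce to a thin service layer over the MCP client. A minimal language-agnostic sketch in Python (the real backend is Express/JS; `mcp_call` is an illustrative stand-in for mcpClient.callTool):

```python
class MemoryService:
    """Thin wrapper that routes backend API calls to the MCP tools."""

    def __init__(self, mcp_call):
        # mcp_call(tool_name, args) -> dict; injected so it can be stubbed in tests
        self.mcp_call = mcp_call

    def save_conversation(self, user_message, assistant_message, conversation_id):
        return self.mcp_call("add_conversation", {
            "user_message": user_message,
            "assistant_message": assistant_message,
            "conversation_id": conversation_id,
        })

    def save_thought(self, content):
        return self.mcp_call("add_thought", {"content": content})

    def search(self, query, limit=10):
        return self.mcp_call("search_memories", {"query": query, "limit": limit})

# Stubbed MCP client that records which tool was invoked
calls = []
def fake_mcp(tool, args):
    calls.append(tool)
    return {"ok": True, "tool": tool}

svc = MemoryService(fake_mcp)
svc.save_thought("User prefers TailwindCSS")
svc.search("styling preferences")
```

Injecting `mcp_call` keeps the service testable without a running ikario_rag server.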
-
-================================================================================
-A CONCRETE USAGE EXAMPLE
-================================================================================
-
-SCENARIO 1: AUTO-SAVED CONVERSATION
------------------------------------------
-User: "How do I implement dark mode in React?"
-LLM: [detailed answer with code examples]
-BACKEND (auto): calls add_conversation with both messages
-ChromaDB: stores them with embeddings
-
-Two weeks later:
-User: "dark mode"
-Search: returns the earlier conversation
-LLM: can reread it and continue the discussion
-
-SCENARIO 2: MANUAL THOUGHT
----------------------------
-User: "I prefer TailwindCSS over styled-components"
-LLM: "Understood, I'll note your preference"
-LLM (internal): calls add_thought("User prefers TailwindCSS over styled-components")
-ChromaDB: stores the preference
-
-Later:
-User: "Help me style this component"
-LLM (internal): searches "styling preferences"
-Result: finds the TailwindCSS thought
-LLM: "I'll use TailwindCSS for the styling, as you prefer"
-
-SCENARIO 3: SAVE TO MEMORY BUTTON
----------------------------------
-User: "Here are our naming conventions: components in PascalCase, utils in camelCase"
-LLM: [replies with a confirmation]
-User: [clicks "Save to Memory"]
-Frontend: POST /api/memory/thoughts
-Backend: calls add_thought with the user message
-ChromaDB: stores the conventions
-
-Later:
-The LLM creates a new component and automatically follows the conventions
-(because it can search "naming conventions" before generating code)
-
-================================================================================
-KEY DIFFERENCES BETWEEN THOUGHTS AND CONVERSATIONS
-================================================================================
-
-THOUGHTS:
-- Content: reflections, preferences, conventions, learnings
-- Size: generally short (1-3 sentences)
-- Trigger: manual (the LLM decides, or the user clicks the button)
-- Granularity: atomic (1 thought = 1 concept)
-- Example: "User prefers functional components over class components"
-
-CONVERSATIONS:
-- Content: full user-assistant exchanges
-- Size: variable (can be long)
-- Trigger: AUTOMATIC after each LLM reply
-- Granularity: dialogue (1 conversation = 1 Q&A exchange)
-- Example: the whole exchange about "How do I make a fetch API call in React?"
-
-BOTH TOGETHER:
-- Complementary: thoughts = knowledge, conversations = context
-- Searchable: search_memories looks through both
-- Evolution: trace_concept_evolution works on both
-
-================================================================================
-CLARIFICATION QUESTIONS
-================================================================================
-
-Before continuing, I need to confirm a few details:
-
-1. CONVERSATION AUTO-SAVE:
-   - Should ALL conversations be saved?
-   - Or only some (e.g. > 100 tokens, contains code, etc.)?
-   - My take: ALL, but with a user option "Disable auto-save" in the settings
-
-2. CONVERSATION_ID:
-   - Is one conversation_id = one full chat session (several messages)?
-   - Or one conversation_id = a single exchange (1 user msg + 1 assistant msg)?
-   - My take: the full session (as you said "same conversation")
-
-3. AUTO-SAVE TRIGGER:
-   - Immediate (after each reply)?
-   - Or batched (every 5 minutes)?
-   - My take: immediate but asynchronous (must not block the chat)
-
-4. PRIVACY:
-   - Are auto-saved conversations "private" by default?
-   - Or "shared" (visible to other users)?
-   - My take: private by default in a single-user context
-
-================================================================================
-FINAL RECOMMENDATION
-================================================================================
-
-I recommend this implementation:
-
-PHASE 1 (Core):
-- Auto-save conversations (add_conversation after each reply)
-- Manual "Save to Memory" button (add_thought)
-- Basic search interface (search_memories)
-
-PHASE 2 (Enhanced):
-- Memory sidebar with enriched results
-- Thoughts vs conversations filters
-- Memory context panel (suggestions while typing)
-
-PHASE 3 (Advanced):
-- Concepts graph visualization (trace_concept_evolution)
-- Consistency checker (check_consistency)
-- Memory settings (disable auto-save, privacy, etc.)
-
-TOTAL: 15 features, as in the spec app_spec_ikario_rag_UI.txt
-
-================================================================================
-VALIDATED CHANGES
-================================================================================
-Date: December 19, 2025 - 23:30
-
-CHANGE 1: THOUGHTS = LLM ONLY
----------------------------------------
-REMOVED:
-- The "Save to Memory" button for the user
-- Automatic suggestions
-
-KEPT:
-- Only the LLM decides when to write/read its thoughts
-- Thoughts are an INTERNAL tool of the LLM
-
-================================================================================
-CODE ANALYSIS: add_conversation
-================================================================================
-
-I read mcp_ikario_memory.py (lines 100-189).
-
-ANSWER: NO, add_conversation CANNOT do an incremental update
-
-IDENTIFIED PROBLEM:
-------------------
-Lines 160-164:
-```python
-self.conversations.add(
-    documents=[full_conversation_text],
-    metadatas=[main_metadata],
-    ids=[conversation_id]  # <-- THE PROBLEM IS HERE
-)
-```
-
-ChromaDB.add() with an existing ID:
-- Option 1: an "ID already exists" error
-- Option 2: completely overwrites the old document
-
-THEREFORE:
-- Calling add_conversation twice with the same conversation_id = OVERWRITE
-- There is no "append" mechanism for adding messages
-- It is a complete REPLACEMENT, not an incremental update
-
-CURRENT BEHAVIOR:
--------------------
-First call:
-add_conversation(conversation_id="session_123", messages=[msg1, msg2])
--> creates a conversation with 2 messages
-
-Second call:
-add_conversation(conversation_id="session_123", messages=[msg1, msg2, msg3, msg4])
--> OVERWRITES the previous conversation
--> completely replaces it with 4 messages
-
-CONSEQUENCE FOR YOUR REQUIREMENT:
-----------------------------
-You want to save after EACH LLM reply into the SAME conversation.
-
-Example:
-User: "Hello"
-LLM: "Hi!"
--> saves conversation_id="conv_20251219" with 2 messages
-
-User: "How are you?"
-LLM: "Fine, thanks!"
--> must add 2 new messages to "conv_20251219"
--> BUT add_conversation will OVERWRITE the first 2 messages!
-
-================================================================================
-SOLUTION: YOU NEED TO ADD A NEW TOOL
-================================================================================
-
-OPTION A (recommended): append_to_conversation
-----------------------------------------------
-A new tool that adds messages without overwriting (a sketch; the merge and
-upsert details would need to match the rest of mcp_ikario_memory.py):
-
-```python
-async def append_to_conversation(
-    self,
-    conversation_id: str,
-    new_messages: List[Dict[str, str]]
-) -> str:
-    """
-    Append new messages to an existing conversation.
-    """
-    # 1. Fetch the existing conversation document
-    existing = self.conversations.get(ids=[conversation_id])
-
-    # 2. Extract the old text (or store the messages elsewhere)
-    old_text = existing["documents"][0] if existing["documents"] else ""
-
-    # 3. Merge old text + new_messages
-    appended = "\n".join(f"{m['role']}: {m['content']}" for m in new_messages)
-    full_text = f"{old_text}\n{appended}".strip()
-
-    # 4. Re-create the main document with all the messages (upsert replaces in place)
-    self.conversations.upsert(documents=[full_text], ids=[conversation_id])
-
-    # 5. Also add the new individual messages (ids conversation_id + "_msg_NNN")
-    return conversation_id
-```
-
-OPTION B: update_conversation (complete replacement)
----------------------------------------------------
-Similar to add_conversation but with an upsert:
-
-```python
-async def update_conversation(
-    self,
-    conversation_id: str,
-    all_messages: List[Dict[str, str]],
-    ...
-) -> str:
-    """
-    Completely replace an existing conversation.
-    """
-    # Delete the old documents
-    self.conversations.delete(ids=[conversation_id])
-
-    # Add the new version
-    # (same code as add_conversation)
-```
-
-OPTION C: modify add_conversation
------------------------------------
-Add detection logic:
-
-```python
-async def add_conversation(...):
-    # Check whether conversation_id already exists
-    try:
-        existing = self.conversations.get(ids=[conversation_id])
-        if existing["ids"]:
-            ...  # do an append
-    except Exception:
-        ...  # create a new conversation
-```
-
-================================================================================
-MY RECOMMENDATION
-================================================================================
-
-USE OPTION A: append_to_conversation
-
-WHY:
-- Clear semantics: "append" = add without overwriting
-- Separation of responsibilities: add = creation, append = addition
-- Easier to debug
-- No "magic" (Option C would be too implicit)
-
-ikario_body BACKEND ARCHITECTURE:
--------------------------------
-POST /api/chat/message
--> the user sends a message
--> the LLM replies
--> after the full reply:
-   - If it is the first message of the session:
-     * call add_conversation(conversation_id, [user_msg, assistant_msg])
-   - If the conversation already exists:
-     * call append_to_conversation(conversation_id, [user_msg, assistant_msg])
-
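The add-vs-append dispatch described above can be sketched in Python (illustrative only: the real backend is JS, `append_to_conversation` is the proposed tool, and `mcp_call` stands in for mcpClient.callTool):

```python
def save_exchange(mcp_call, known_sessions, conversation_id, user_msg, assistant_msg):
    """First exchange of a session -> add_conversation; later ones -> append_to_conversation."""
    messages = [{"role": "user", "content": user_msg},
                {"role": "assistant", "content": assistant_msg}]
    if conversation_id not in known_sessions:
        known_sessions.add(conversation_id)
        return mcp_call("add_conversation",
                        {"conversation_id": conversation_id, "messages": messages})
    return mcp_call("append_to_conversation",
                    {"conversation_id": conversation_id, "new_messages": messages})

# Stub MCP client that records the tools invoked
log = []
def fake_mcp(tool, args):
    log.append(tool)
    return {"ok": True}

sessions = set()
save_exchange(fake_mcp, sessions, "conv_20251219", "Hello", "Hi!")
save_exchange(fake_mcp, sessions, "conv_20251219", "How are you?", "Fine, thanks!")
```

The second exchange routes to the append tool, so the first two messages are never overwritten.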
-ALTERNATIVE SIMPLE (sans append):
----------------------------------
-Si tu ne veux pas modifier ikario_rag:
-- Backend garde TOUS les messages de la session en memoire
-- Appelle add_conversation SEULEMENT a la fin de la session (quand user ferme le chat)
-- Parametres: conversation_id + TOUS les messages accumules
-
-MAIS:
-- Risque de perte si crash avant la fin
-- Pas de recherche en temps reel pendant la conversation
-- Moins robuste
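The simple alternative can be sketched as a minimal in-memory accumulator (illustrative only: `SessionStore` and the `on_close` callback are hypothetical names, standing in for the single add_conversation call at session end):

```python
from typing import Any, Callable, Dict, List

class SessionStore:
    """Accumulates a session's messages in memory and saves once at the end.

    A crash before close() loses everything, which is exactly the risk
    described above.
    """

    def __init__(self, on_close: Callable[[str, List[Dict[str, Any]]], None]):
        self._messages: Dict[str, List[Dict[str, Any]]] = {}
        self._on_close = on_close

    def record(self, conversation_id: str, message: Dict[str, Any]) -> None:
        # Nothing is persisted here; messages only live in memory.
        self._messages.setdefault(conversation_id, []).append(message)

    def close(self, conversation_id: str) -> List[Dict[str, Any]]:
        # The ONLY persistence point: everything accumulated is flushed at once.
        messages = self._messages.pop(conversation_id, [])
        self._on_close(conversation_id, messages)
        return messages
```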
-
-================================================================================
-DECISION REQUIRED
-================================================================================
-
-You must choose:
-
-1. ADD append_to_conversation to ikario_rag
-   - I modify mcp_ikario_memory.py (in SynologyDrive)
-   - I add the new tool to the MCP server
-   - Then I update the UI spec
-
-2. USE THE SIMPLE ALTERNATIVE (save at the end of the session)
-   - No modification to ikario_rag
-   - The backend accumulates messages in memory
-   - Full save at the end
-
-3. MODIFY add_conversation (Option C)
-   - Add auto-detect + append logic
-   - Less explicit, but simpler on the client side
-
-Which option do you prefer?
-
-================================================================================
-CRITICAL QUESTION: OVERWRITING AND EMBEDDINGS
-================================================================================
-Date: December 19, 2025 - 23:35
-
-You ask: "Does overwriting also delete the old embeddings?"
-
-SHORT ANSWER: NO, and it is even WORSE than I thought!
-
-DETAILED ANALYSIS:
------------------
-
-Reminder of the add_conversation architecture:
-
-1. MAIN DOCUMENT (lines 160-164):
-   ID = conversation_id (e.g. "conv_20251219_1430")
-   Content = the complete conversation (all messages concatenated)
-
-2. INDIVIDUAL MESSAGES (lines 166-187):
-   IDs = conversation_id + "_msg_001", "_msg_002", etc.
-   Content = each message, with its own embedding
-
-PROBLEMATIC SCENARIO:
----------------------
-
-First call:
-add_conversation(conversation_id="conv_123", messages=[msg1, msg2])
-
-ChromaDB contains:
-- conv_123 (main document, embedding of "msg1 + msg2")
-- conv_123_msg_001 (msg1, individual embedding)
-- conv_123_msg_002 (msg2, individual embedding)
-
-Second call:
-add_conversation(conversation_id="conv_123", messages=[msg1, msg2, msg3, msg4])
-
-WHAT HAPPENS?
-
-1. Main document conv_123:
-   - OVERWRITTEN (new embedding for "msg1 + msg2 + msg3 + msg4")
-   - Old embedding lost
-
-2. Individual messages:
-   - conv_123_msg_001 already exists -> OVERWRITTEN (new embedding for msg1)
-   - conv_123_msg_002 already exists -> OVERWRITTEN (new embedding for msg2)
-   - conv_123_msg_003 new -> CREATED
-   - conv_123_msg_004 new -> CREATED
-
-RESULT:
--------
-- Old embeddings are OVERWRITTEN (not deleted, but replaced)
-- NO pollution if the messages are identical
-- BUT if the messages change = incorrect embeddings
-
-WORST-CASE SCENARIO:
--------------
-If the backend accumulates messages incorrectly:
-
-First call: [msg1, msg2]
-Second call: [msg3, msg4] <-- FORGETS msg1 and msg2!
-
-ChromaDB contains:
-- conv_123 (embedding of "msg3 + msg4") <-- WRONG!
-- conv_123_msg_001 (embedding of msg3) <-- WRONG ID!
-- conv_123_msg_002 (embedding of msg4) <-- WRONG ID!
-
-The old msg_001 and msg_002 (msg1 and msg2) are LOST.
-
-CONCLUSION:
-----------
-Overwriting:
-- REPLACES the embeddings (no clean deletion)
-- REQUIRES the backend to send ALL messages every time
-- RISKS data loss if the backend gets it wrong
-
-That is why append_to_conversation is NECESSARY!
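The worst-case scenario can be reproduced with a tiny dict-backed stand-in for the collection (illustrative only; a real ChromaDB collection also stores embeddings, but the ID-collision behavior is the same):

```python
# Minimal stand-in for an upsert-style collection keyed by document ID.
# It shows why re-sending only the NEW messages under the same IDs
# silently replaces the old content.
store: dict = {}

def add_conversation(conversation_id: str, messages: list) -> None:
    # Main document: same ID on every call -> silently replaced.
    store[conversation_id] = " + ".join(messages)
    # Individual messages: the sequence restarts at 1 on every call.
    for i, msg in enumerate(messages):
        store[f"{conversation_id}_msg_{str(i + 1).zfill(3)}"] = msg

add_conversation("conv_123", ["msg1", "msg2"])
# Buggy backend: forgets msg1/msg2 and sends only the new messages.
add_conversation("conv_123", ["msg3", "msg4"])
```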
-
-================================================================================
-WHY append_to_conversation IS INDISPENSABLE
-================================================================================
-
-With append_to_conversation:
-
-First call:
-add_conversation(conversation_id="conv_123", messages=[msg1, msg2])
-
-ChromaDB:
-- conv_123 (2 messages)
-- conv_123_msg_001, conv_123_msg_002
-
-Second call:
-append_to_conversation(conversation_id="conv_123", new_messages=[msg3, msg4])
-
-Internal logic:
-1. GET the existing conversation "conv_123"
-2. Extract metadata: message_count = 2
-3. Calculate next sequence = 3
-4. Update the main document:
-   - DELETE conv_123
-   - ADD conv_123 (new embedding "msg1 + msg2 + msg3 + msg4")
-5. Add the new individual messages:
-   - conv_123_msg_003 (msg3)
-   - conv_123_msg_004 (msg4)
-
-RESULT:
-- Old individual embeddings PRESERVED (msg_001, msg_002)
-- New main embedding CORRECT (4 messages)
-- No data loss
-- Correct sequence
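Steps 2, 3 and 5 boil down to sequence bookkeeping; the ID scheme can be sketched as follows (the helper name is illustrative):

```python
def next_message_ids(conversation_id: str, current_count: int, n_new: int) -> list:
    """Zero-padded IDs for the next n_new messages, continuing the sequence.

    With current_count=2 and n_new=2, the next sequence starts at 3, so the
    existing _msg_001/_msg_002 documents are never touched.
    """
    next_sequence = current_count + 1
    return [
        f"{conversation_id}_msg_{str(next_sequence + i).zfill(3)}"
        for i in range(n_new)
    ]
```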
-
-================================================================================
-IMPLEMENTATION OF append_to_conversation (SKETCH)
-================================================================================
-
-```python
-async def append_to_conversation(
-    self,
-    conversation_id: str,
-    new_messages: List[Dict[str, str]],
-    update_context: Optional[Dict[str, Any]] = None
-) -> str:
-    """
-    Adds new messages to an existing conversation
-
-    Args:
-        conversation_id: ID of the existing conversation
-        new_messages: New messages to add
-        update_context: Metadata to update (optional)
-
-    Returns:
-        Confirmation message
-    """
-    # 1. CHECK THAT THE CONVERSATION EXISTS
-    try:
-        existing = self.conversations.get(ids=[conversation_id])
-    except Exception as e:
-        raise ValueError(f"Conversation {conversation_id} not found") from e
-
-    if not existing['documents'] or len(existing['documents']) == 0:
-        raise ValueError(f"Conversation {conversation_id} not found")
-
-    # 2. EXTRACT THE EXISTING METADATA
-    existing_metadata = existing['metadatas'][0] if existing['metadatas'] else {}
-    current_message_count = int(existing_metadata.get('message_count', 0))
-
-    # 3. COMPUTE THE NEXT SEQUENCE NUMBER
-    next_sequence = current_message_count + 1
-
-    # 4. BUILD THE NEW FULL TEXT
-    # Retrieve the old text
-    old_full_text = existing['documents'][0]
-
-    # Append the new messages
-    new_text_parts = []
-    for msg in new_messages:
-        author = msg.get('author', 'unknown')
-        content = msg.get('content', '')
-        new_text_parts.append(f"{author}: {content}")
-
-    new_text = "\n".join(new_text_parts)
-    updated_full_text = old_full_text + "\n" + new_text
-
-    # 5. UPDATE THE METADATA
-    updated_metadata = existing_metadata.copy()
-    updated_metadata['message_count'] = str(current_message_count + len(new_messages))
-
-    # Merge update_context if provided
-    if update_context:
-        for key, value in update_context.items():
-            if isinstance(value, list):
-                updated_metadata[key] = ", ".join(str(v) for v in value)
-            elif isinstance(value, dict):
-                updated_metadata[key] = json.dumps(value)
-            else:
-                updated_metadata[key] = str(value)
-
-    # 6. DELETE THE OLD MAIN DOCUMENT
-    self.conversations.delete(ids=[conversation_id])
-
-    # 7. ADD THE NEW MAIN DOCUMENT
-    self.conversations.add(
-        documents=[updated_full_text],
-        metadatas=[updated_metadata],
-        ids=[conversation_id]
-    )
-
-    # 8. ADD THE NEW INDIVIDUAL MESSAGES
-    for i, msg in enumerate(new_messages):
-        msg_id = f"{conversation_id}_msg_{str(next_sequence + i).zfill(3)}"
-        msg_content = msg.get('content', '')
-        msg_author = msg.get('author', 'unknown')
-        msg_timestamp = msg.get('timestamp', '')
-
-        msg_metadata = {
-            "conversation_id": conversation_id,
-            "message_type": "individual_message",
-            "author": msg_author,
-            "timestamp": msg_timestamp,
-            "sequence": str(next_sequence + i)
-        }
-
-        self.conversations.add(
-            documents=[msg_content],
-            metadatas=[msg_metadata],
-            ids=[msg_id]
-        )
-
-    return f"Conversation {conversation_id} updated: added {len(new_messages)} messages (total: {updated_metadata['message_count']})"
-```
-
-ADVANTAGES:
-- Preserves the old individual embeddings
-- Updates the main document correctly
-- Handles the sequence automatically
-- No risk of data loss
-
-================================================================================
-FINAL DECISION REQUIRED
-================================================================================
-
-Now that you understand the risks of overwriting:
-
-OPTION 1: I add append_to_conversation to ikario_rag
-   - Robust implementation (as above)
-   - No risk of losing embeddings
-   - Clean architecture
-
-OPTION 2: Simple alternative (backend accumulation)
-   - The backend keeps all messages in memory
-   - Calls add_conversation a single time at the end
-   - RISK: data loss on crash
-   - RISK: no real-time search
-
-OPTION 3: Modify add_conversation to auto-append
-   - Add the logic above to add_conversation
-   - More transparent for the client
-   - But the semantics are less clear
-
-I STRONGLY recommend Option 1.
-
-Which option do you choose?
-
-================================================================================
-OPTION 1 CHOSEN: append_to_conversation
-================================================================================
-Date: December 19, 2025 - 23:40
-
-You choose Option 1, with a crucial question:
-"What if the conversation does not exist?"
-
-EXCELLENT QUESTION! There are 2 approaches:
-
-================================================================================
-APPROACH A: append_to_conversation WITH AUTO-CREATE (recommended)
-================================================================================
-
-append_to_conversation detects whether the conversation exists:
-- If it exists: appends
-- If it does not exist: creates the conversation (like add_conversation)
-
-ADVANTAGES:
-- Simplified backend (a single call, always the same)
-- No need to track whether this is the first message
-- Robust
-CODE:
-```python
-async def append_to_conversation(
-    self,
-    conversation_id: str,
-    new_messages: List[Dict[str, str]],
-    participants: Optional[List[str]] = None,
-    context: Optional[Dict[str, Any]] = None
-) -> str:
-    """
-    Adds messages to a conversation (or creates it if it does not exist)
-
-    Args:
-        conversation_id: Conversation ID
-        new_messages: Messages to add
-        participants: List of participants (required on creation)
-        context: Metadata (required on creation)
-    """
-    # 1. CHECK WHETHER THE CONVERSATION EXISTS
-    try:
-        existing = self.conversations.get(ids=[conversation_id])
-        conversation_exists = (
-            existing and
-            existing['documents'] and
-            len(existing['documents']) > 0
-        )
-    except Exception:
-        conversation_exists = False
-
-    # 2. IF IT DOES NOT EXIST: CREATE
-    if not conversation_exists:
-        if not participants or not context:
-            raise ValueError(
-                "participants and context required when creating new conversation"
-            )
-        return await self.add_conversation(
-            participants=participants,
-            messages=new_messages,
-            context=context,
-            conversation_id=conversation_id
-        )
-
-    # 3. IF IT EXISTS: APPEND
-    # [Append code as before...]
-    existing_metadata = existing['metadatas'][0]
-    current_message_count = int(existing_metadata.get('message_count', 0))
-    next_sequence = current_message_count + 1
-
-    old_full_text = existing['documents'][0]
-
-    new_text_parts = []
-    for msg in new_messages:
-        author = msg.get('author', 'unknown')
-        content = msg.get('content', '')
-        new_text_parts.append(f"{author}: {content}")
-
-    new_text = "\n".join(new_text_parts)
-    updated_full_text = old_full_text + "\n" + new_text
-
-    updated_metadata = existing_metadata.copy()
-    updated_metadata['message_count'] = str(current_message_count + len(new_messages))
-
-    if context:
-        for key, value in context.items():
-            if isinstance(value, list):
-                updated_metadata[key] = ", ".join(str(v) for v in value)
-            elif isinstance(value, dict):
-                updated_metadata[key] = json.dumps(value)
-            else:
-                updated_metadata[key] = str(value)
-
-    self.conversations.delete(ids=[conversation_id])
-
-    self.conversations.add(
-        documents=[updated_full_text],
-        metadatas=[updated_metadata],
-        ids=[conversation_id]
-    )
-
-    for i, msg in enumerate(new_messages):
-        msg_id = f"{conversation_id}_msg_{str(next_sequence + i).zfill(3)}"
-        msg_content = msg.get('content', '')
-        msg_author = msg.get('author', 'unknown')
-        msg_timestamp = msg.get('timestamp', '')
-
-        msg_metadata = {
-            "conversation_id": conversation_id,
-            "message_type": "individual_message",
-            "author": msg_author,
-            "timestamp": msg_timestamp,
-            "sequence": str(next_sequence + i)
-        }
-
-        self.conversations.add(
-            documents=[msg_content],
-            metadatas=[msg_metadata],
-            ids=[msg_id]
-        )
-
-    return f"Conversation {conversation_id} updated: added {len(new_messages)} messages (total: {updated_metadata['message_count']})"
-```
-
-BACKEND USAGE (ikario_body):
-```javascript
-// POST /api/chat/message
-app.post('/api/chat/message', async (req, res) => {
-  const { message, conversationId } = req.body;
-
-  // Generate conversation_id if first message
-  const convId = conversationId || `conv_${Date.now()}`;
-
-  // Get LLM response
-  const llmResponse = await callClaudeAPI(message);
-
-  // ALWAYS use append_to_conversation (handles creation automatically)
-  await mcpClient.callTool('append_to_conversation', {
-    conversation_id: convId,
-    new_messages: [
-      { author: 'user', content: message, timestamp: new Date().toISOString() },
-      { author: 'assistant', content: llmResponse, timestamp: new Date().toISOString() }
-    ],
-    participants: ['user', 'assistant'], // Required on the first call
-    context: {
-      category: 'chat',
-      date: new Date().toISOString()
-    }
-  });
-
-  res.json({ response: llmResponse, conversationId: convId });
-});
-```
-
-BACKEND SIMPLICITY:
-- Always the same call (append_to_conversation)
-- No if/else logic
-- The MCP server handles the complexity
-
-================================================================================
-APPROACH B: KEEP add_conversation AND append_to_conversation SEPARATE
-================================================================================
-
-append_to_conversation REJECTS if the conversation does not exist:
-- The backend must track whether this is the first message
-- Call add_conversation for creation
-- Call append_to_conversation for additions
-
-CODE for append_to_conversation (strict):
-```python
-async def append_to_conversation(
-    self,
-    conversation_id: str,
-    new_messages: List[Dict[str, str]]
-) -> str:
-    """
-    Adds messages to an EXISTING conversation
-    Raises an error if the conversation does not exist
-    """
-    # Check existence (the check is outside the try so its error is not re-wrapped)
-    try:
-        existing = self.conversations.get(ids=[conversation_id])
-    except Exception as e:
-        raise ValueError(f"Conversation {conversation_id} not found: {e}") from e
-
-    if not existing['documents'] or len(existing['documents']) == 0:
-        raise ValueError(f"Conversation {conversation_id} does not exist. Use add_conversation first.")
-
-    # [Rest of the append code...]
-```
-
-BACKEND USAGE (more complex):
-```javascript
-// POST /api/chat/message
-app.post('/api/chat/message', async (req, res) => {
-  const { message, conversationId, isFirstMessage } = req.body;
-
-  // Generate ID if new
-  const convId = conversationId || `conv_${Date.now()}`;
-
-  const llmResponse = await callClaudeAPI(message);
-
-  const messages = [
-    { author: 'user', content: message, timestamp: new Date().toISOString() },
-    { author: 'assistant', content: llmResponse, timestamp: new Date().toISOString() }
-  ];
-
-  // DIFFERENT depending on whether this is the first call
-  if (isFirstMessage || !conversationId) {
-    // First time: create
-    await mcpClient.callTool('add_conversation', {
-      conversation_id: convId,
-      participants: ['user', 'assistant'],
-      messages: messages,
-      context: { category: 'chat', date: new Date().toISOString() }
-    });
-  } else {
-    // Subsequent times: append
-    await mcpClient.callTool('append_to_conversation', {
-      conversation_id: convId,
-      new_messages: messages
-    });
-  }
-
-  res.json({ response: llmResponse, conversationId: convId });
-});
-```
-
-DISADVANTAGES:
-- More complex backend (if/else)
-- Must track isFirstMessage
-- Risk of errors if the detection is wrong
-
-================================================================================
-MY FINAL RECOMMENDATION
-================================================================================
-
-APPROACH A: append_to_conversation WITH AUTO-CREATE
-
-WHY:
-1. Simplified backend (a single call)
-2. Robust (no risk of forgetting add_conversation)
-3. Acceptable semantics (append = "add to", whether it exists or not)
-4. Smaller error surface
-
-IMPLEMENTATION:
-- I add append_to_conversation to mcp_ikario_memory.py
-- With detection + auto-create if it does not exist
-- I expose the tool in server.py
-- I update the UI spec to use this tool
-
-ALTERNATIVE:
-If you prefer strict semantics (Approach B), I can do that too.
-
-================================================================================
-NEXT STEP
-================================================================================
-
-Tell me:
-1. APPROACH A (auto-create) or APPROACH B (strict)?
-2. Once chosen, I will:
-   - Modify mcp_ikario_memory.py
-   - Modify server.py to expose the tool
-   - Test the implementation
-   - Update the UI spec
-   - Delete the old spec
-   - Delete the 15 old issues
-   - Create 15 new issues
-   - Launch the initializer agent again
-
-Which approach do you prefer?
-
-================================================================================
-QUESTION: IS THE LLM'S REASONING RECORDED?
-================================================================================
-Date: December 19, 2025 - 23:50
-
-You ask whether the following are recorded:
-1. User message
-2. LLM reasoning (thinking)
-3. LLM message (response)
-
-CURRENT ANSWER: NO, the LLM's reasoning is NOT recorded
-
-ANALYSIS OF THE CURRENT CODE:
-----------------------
-
-Message structure (line 113):
-```python
-messages: List[Dict[str, str]]
-# [{"author": "david", "content": "...", "timestamp": "14:30:05"}, ...]
-```
-
-Current fields:
-- author: "david" or "ikario"
-- content: The message content
-- timestamp: Timestamp
-
-There is NO "thinking" or "reflection" field.
-
-WHAT IS CURRENTLY RECORDED:
------------------------------------
-
-User message:
-{
-    "author": "user",
-    "content": "How do I do an API fetch?",
-    "timestamp": "14:30:00"
-}
-
-LLM message:
-{
-    "author": "assistant",
-    "content": "Here is how to do an API fetch: ...", <-- ONLY the final response
-    "timestamp": "14:30:05"
-}
-
-The internal reasoning (Extended Thinking) is NOT captured.
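Capturing the reasoning would mean separating it from the visible answer. With Extended Thinking, the Anthropic Messages API returns the response as a list of content blocks; a sketch of that separation, over plain dicts shaped like the documented `thinking`/`text` block types (treat the exact field names as an assumption to check against the API reference):

```python
def split_thinking(content_blocks: list) -> dict:
    """Separates Extended Thinking blocks from the visible response text.

    Assumes blocks shaped like {"type": "thinking", "thinking": "..."} and
    {"type": "text", "text": "..."}.
    """
    thinking_parts = [b["thinking"] for b in content_blocks if b.get("type") == "thinking"]
    text_parts = [b["text"] for b in content_blocks if b.get("type") == "text"]
    return {
        # None when the model produced no thinking blocks at all
        "thinking": "\n".join(thinking_parts) or None,
        "content": "\n".join(text_parts),
    }
```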
-
-================================================================================
-QUESTION: DO YOU WANT TO RECORD THE LLM'S REASONING?
-================================================================================
-
-With Extended Thinking, Claude generates:
-1. Thinking (internal reasoning)
-2. Response (the answer visible to the user)
-
-OPTION 1: RECORD ONLY THE RESPONSE (current behavior)
----------------------------------------------------
-LLM message in ChromaDB:
-{
-    "author": "assistant",
-    "content": "Here is how to do an API fetch: ..."
-}
-
-ADVANTAGES:
-- Simpler
-- Less data stored
-- Embeddings based on the useful content
-
-DISADVANTAGES:
-- Loss of the internal reasoning
-- Impossible to retrieve "how the LLM thought"
-
-OPTION 2: RECORD THINKING + RESPONSE (recommended)
------------------------------------------------------
-LLM message in ChromaDB:
-{
-    "author": "assistant",
-    "content": "Here is how to do an API fetch: ...",
-    "thinking": "The user asks... I need to explain... [full reasoning]"
-}
-
-OR (separate):
-Thinking message:
-{
-    "author": "assistant",
-    "message_type": "thinking",
-    "content": "[internal reasoning]"
-}
-
-Response message:
-{
-    "author": "assistant",
-    "message_type": "response",
-    "content": "Here is how to..."
-}
-
-ADVANTAGES:
-- Captures the complete reasoning
-- Semantic search over the reasoning
-- Understand how the thinking evolved
-- Full traceability
-
-DISADVANTAGES:
-- More data stored
-- More complex structure
-
-OPTION 3: THINKING SEPARATE (in thoughts, not conversations)
-------------------------------------------------------------
-Conversation:
-- User message
-- LLM message (response only)
-
-Thoughts (separate collection):
-- The LLM's thinking stored as a "thought"
-
-ADVANTAGES:
-- Clear separation: conversations = dialogue, thoughts = reflections
-- Consistent with the current architecture (2 collections)
-
-DISADVANTAGES:
-- Loss of the direct link to the conversation
-- More complex to retrieve
-
-================================================================================
-MY RECOMMENDATION
-================================================================================
-
-OPTION 2 (RECORD THINKING + RESPONSE in the same message)
-
-Proposed structure:
-```python
-messages: List[Dict[str, Any]]  # Change: Any instead of str
-
-# User message (unchanged)
-{
-    "author": "user",
-    "content": "How do I do an API fetch?",
-    "timestamp": "14:30:00"
-}
-
-# LLM message (new format)
-{
-    "author": "assistant",
-    "content": "Here is how to do an API fetch: ...",
-    "thinking": "[The LLM's internal reasoning...]",  # NEW
-    "timestamp": "14:30:05"
-}
-```
-
-IMPLEMENTATION:
-- Modify add_conversation to accept an optional "thinking" field
-- Store thinking in the individual message's metadata
-- Main document: include the thinking or not? (to be decided)
-
-FOR THE MAIN DOCUMENT:
-OPTION A: Include thinking
-   "user: How do I...\nassistant (thinking): [reasoning]\nassistant: Here is how..."
-
-OPTION B: Exclude thinking (visible dialogue only)
-   "user: How do I...\nassistant: Here is how..."
-
-I recommend OPTION A (include thinking in the main document).
-
-WHY:
-- Richer semantic search
-- Retrieve "that time the LLM reasoned about X"
-- Full traceability
-
-================================================================================
-DECISION REQUIRED
-================================================================================
-
-Before starting to develop append_to_conversation, you must decide:
-
-1. RECORD THE LLM'S REASONING?
-   a) YES - Add a "thinking" field to the messages
-   b) NO - Keep only "content" (final response)
-
-2. IF YES, WHAT FORMAT?
-   a) Thinking in the same message (recommended)
-   b) Thinking as a separate message
-   c) Thinking in the thoughts collection (separate)
-
-3. IF YES, WHAT ABOUT THE MAIN DOCUMENT?
-   a) Include thinking in the embedding
-   b) Exclude thinking (dialogue only)
-
-My recommendations:
-1. a) YES
-2. a) Same message
-3. a) Include thinking
-
-What do you think?
-
-================================================================================
-DECISION CONFIRMED: OPTION 2 (THINKING IN THE MESSAGE)
-================================================================================
-Date: December 19, 2025 - 23:55
-
-You confirm:
-- YES to recording the thinking
-- Option 2: Thinking in the same message (part of the conversation)
-- NOT a separate thought in thoughts
-
-CORRECT! The thinking is the LLM's reasoning DURING the conversation.
-
-================================================================================
-DETAILED PLAN: INTEGRATING THINKING INTO CONVERSATIONS
-================================================================================
-
-PHASE 1: ANALYSIS OF THE REQUIRED CHANGES
-----------------------------------------------
-
-Files to modify:
-1. mcp_ikario_memory.py (C:/Users/david/SynologyDrive/ikario/ikario_rag/)
-   - Modify add_conversation
-   - Add append_to_conversation
-
-2. server.py (C:/Users/david/SynologyDrive/ikario/ikario_rag/)
-   - Expose append_to_conversation as an MCP tool
-
-3. prompts/app_spec_ikario_rag_UI.txt (C:/GitHub/Linear_coding/)
-   - Update it to use append_to_conversation
-   - Document the thinking field
-
-PHASE 2: DATA STRUCTURES
-------------------------------
-
-NEW MESSAGE FORMAT:
-
-User message (unchanged):
-{
-    "author": "user",
-    "content": "How do I do an API fetch?",
-    "timestamp": "2025-12-19T14:30:00"
-}
-
-LLM message (NEW, with thinking):
-{
-    "author": "assistant",
-    "content": "Here is how to do an API fetch...",
-    "thinking": "The user is asking for an explanation of the fetch API. I need to explain...",  # OPTIONAL
-    "timestamp": "2025-12-19T14:30:05"
-}
-
-STORAGE IN CHROMADB:
-
-1. MAIN DOCUMENT (conversation_id):
-   Documents: Full text with thinking included
-   Format:
-   ```
-   user: How do I do an API fetch?
-   assistant (thinking): The user is asking for an explanation...
-   assistant: Here is how to do an API fetch...
-   ```
-
-2. INDIVIDUAL MESSAGES (conversation_id_msg_001, etc.):
-   Documents: Message content
-   Metadata:
-   - author: "user" or "assistant"
-   - timestamp: "..."
-   - sequence: "1", "2", etc.
-   - thinking: "[thinking text]" (if present, optional)
-   - message_type: "individual_message"
-
-DECISION: INCLUDE THINKING IN THE MAIN DOCUMENT
-
-WHY:
-- Richer semantic search
-- "Find the conversation where the LLM reasoned about React performance"
-- Full traceability of the reasoning
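The main-document format described in Phase 2 can be produced by a small helper (a sketch; the function name is illustrative, the line format matches the plan):

```python
def build_full_text(messages: list) -> str:
    """Builds the main document text, interleaving optional thinking lines.

    Each message is a dict with 'author', 'content' and an optional
    'thinking' field, as in the new message format.
    """
    parts = []
    for msg in messages:
        author = msg.get("author", "unknown")
        thinking = msg.get("thinking")
        # The thinking line, when present, precedes the visible message.
        if thinking:
            parts.append(f"{author} (thinking): {thinking}")
        parts.append(f"{author}: {msg.get('content', '')}")
    return "\n".join(parts)
```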
-
-PHASE 3: CHANGES TO add_conversation
---------------------------------------------
-
-Required changes:
-
-1. SIGNATURE (lines 100-106):
-   BEFORE:
-   ```python
-   async def add_conversation(
-       self,
-       participants: List[str],
-       messages: List[Dict[str, str]],  # <-- str
-       context: Dict[str, Any],
-       conversation_id: Optional[str] = None
-   ) -> str:
-   ```
-
-   AFTER:
-   ```python
-   async def add_conversation(
-       self,
-       participants: List[str],
-       messages: List[Dict[str, Any]],  # <-- Any, to support thinking
-       context: Dict[str, Any],
-       conversation_id: Optional[str] = None
-   ) -> str:
-   ```
-
-2. MAIN DOCUMENT (lines 131-138):
-   BEFORE:
-   ```python
-   full_text_parts = []
-   for msg in messages:
-       author = msg.get('author', 'unknown')
-       content = msg.get('content', '')
-       full_text_parts.append(f"{author}: {content}")
-   ```
-
-   AFTER:
-   ```python
-   full_text_parts = []
-   for msg in messages:
-       author = msg.get('author', 'unknown')
-       content = msg.get('content', '')
-       thinking = msg.get('thinking', None)
-
-       # If thinking is present, include it in the main document
-       if thinking:
-           full_text_parts.append(f"{author} (thinking): {thinking}")
-
-       full_text_parts.append(f"{author}: {content}")
-   ```
-
-3. INDIVIDUAL MESSAGES (lines 166-187):
-   BEFORE:
-   ```python
-   for i, msg in enumerate(messages):
-       msg_id = f"{conversation_id}_msg_{str(i+1).zfill(3)}"
-       msg_content = msg.get('content', '')
-       msg_author = msg.get('author', 'unknown')
-       msg_timestamp = msg.get('timestamp', '')
-
-       msg_metadata = {
-           "conversation_id": conversation_id,
-           "message_type": "individual_message",
-           "author": msg_author,
-           "timestamp": msg_timestamp,
-           "sequence": str(i+1)
-       }
-   ```
-
-   AFTER:
-   ```python
-   for i, msg in enumerate(messages):
-       msg_id = f"{conversation_id}_msg_{str(i+1).zfill(3)}"
-       msg_content = msg.get('content', '')
-       msg_author = msg.get('author', 'unknown')
-       msg_timestamp = msg.get('timestamp', '')
-       msg_thinking = msg.get('thinking', None)  # NEW
-
-       msg_metadata = {
-           "conversation_id": conversation_id,
-           "message_type": "individual_message",
-           "author": msg_author,
-           "timestamp": msg_timestamp,
-           "sequence": str(i+1)
-       }
-
-       # Add thinking to the metadata if present
-       if msg_thinking:
-           msg_metadata["thinking"] = msg_thinking  # NEW
-   ```
-
-PHASE 4: IMPLEMENTATION OF append_to_conversation
-----------------------------------------------
-
-Complete new function:
-
-```python
-async def append_to_conversation(
-    self,
-    conversation_id: str,
-    new_messages: List[Dict[str, Any]],
-    participants: Optional[List[str]] = None,
-    context: Optional[Dict[str, Any]] = None
-) -> str:
-    """
-    Adds messages to a conversation (or creates it if it does not exist)
-
-    Supports the optional 'thinking' field in messages.
-
-    Args:
-        conversation_id: Conversation ID
-        new_messages: Messages to add
-            Format: [
-                {"author": "user", "content": "...", "timestamp": "..."},
-                {"author": "assistant", "content": "...", "thinking": "...", "timestamp": "..."}
-            ]
-        participants: List of participants (required on creation)
-        context: Metadata (required on creation)
-
-    Returns:
-        Confirmation message
-    """
-    # 1. CHECK WHETHER THE CONVERSATION EXISTS
-    try:
-        existing = self.conversations.get(ids=[conversation_id])
-        conversation_exists = (
-            existing and
-            existing['documents'] and
-            len(existing['documents']) > 0
-        )
-    except Exception:
-        conversation_exists = False
-
-    # 2. IF IT DOES NOT EXIST: CREATE (delegate to add_conversation)
-    if not conversation_exists:
-        if not participants or not context:
-            raise ValueError(
-                "participants and context required when creating new conversation"
-            )
-        return await self.add_conversation(
-            participants=participants,
-            messages=new_messages,
-            context=context,
-            conversation_id=conversation_id
-        )
-
-    # 3. IF IT EXISTS: APPEND
-
-    # 3a. Extract the existing metadata
-    existing_metadata = existing['metadatas'][0]
-    current_message_count = int(existing_metadata.get('message_count', 0))
-    next_sequence = current_message_count + 1
-
-    # 3b. Retrieve the old full text
-    old_full_text = existing['documents'][0]
-
-    # 3c. Build the new text, with thinking if present
-    new_text_parts = []
-    for msg in new_messages:
-        author = msg.get('author', 'unknown')
-        content = msg.get('content', '')
-        thinking = msg.get('thinking', None)
-
-        # Include thinking in the main document if present
-        if thinking:
-            new_text_parts.append(f"{author} (thinking): {thinking}")
-
-        new_text_parts.append(f"{author}: {content}")
-
-    new_text = "\n".join(new_text_parts)
-    updated_full_text = old_full_text + "\n" + new_text
-
-    # 3d. Update the metadata
-    updated_metadata = existing_metadata.copy()
-    updated_metadata['message_count'] = str(current_message_count + len(new_messages))
-
-    # Merge context if provided
-    if context:
-        for key, value in context.items():
-            if isinstance(value, list):
-                updated_metadata[key] = ", ".join(str(v) for v in value)
-            elif isinstance(value, dict):
-                updated_metadata[key] = json.dumps(value)
-            else:
-                updated_metadata[key] = str(value)
-
-    # 3e. Delete the old main document
-    self.conversations.delete(ids=[conversation_id])
-
-    # 3f. Add the new main document
-    self.conversations.add(
-        documents=[updated_full_text],
-        metadatas=[updated_metadata],
-        ids=[conversation_id]
-    )
-
-    # 3g. Add the new individual messages
-    for i, msg in enumerate(new_messages):
-        msg_id = f"{conversation_id}_msg_{str(next_sequence + i).zfill(3)}"
-        msg_content = msg.get('content', '')
-        msg_author = msg.get('author', 'unknown')
-        msg_timestamp = msg.get('timestamp', '')
-        msg_thinking = msg.get('thinking', None)
-
-        msg_metadata = {
-            "conversation_id": conversation_id,
-            "message_type": "individual_message",
-            "author": msg_author,
-            "timestamp": msg_timestamp,
-            "sequence": str(next_sequence + i)
-        }
-
-        # Add thinking to the metadata if present
-        if msg_thinking:
-            msg_metadata["thinking"] = msg_thinking
-
-        # Generate the embedding for this message (content only, not thinking)
-        self.conversations.add(
-            documents=[msg_content],
-            metadatas=[msg_metadata],
-            ids=[msg_id]
-        )
-
-    return f"Conversation {conversation_id} updated: added {len(new_messages)} messages (total: {updated_metadata['message_count']})"
-```
-
-PHASE 5: EXPOSITION DANS server.py
-----------------------------------
-
-Ajouter l'outil MCP pour append_to_conversation:
-
-```python
-@server.call_tool()
-async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
- """Handle tool calls"""
-
- # ... (outils existants: add_thought, add_conversation, etc.)
-
- # NOUVEAU: append_to_conversation
- elif name == "append_to_conversation":
- result = await memory.append_to_conversation(
- conversation_id=arguments["conversation_id"],
- new_messages=arguments["new_messages"],
- participants=arguments.get("participants"),
- context=arguments.get("context")
- )
- return [types.TextContent(type="text", text=result)]
-```
-
-Then add the tool definition:
-
-```python
-@server.list_tools()
-async def list_tools() -> list[types.Tool]:
- """List available tools"""
- return [
-        # ... (existing tools)
-
- types.Tool(
- name="append_to_conversation",
- description=(
-            "Appends messages to an existing conversation (or creates it if needed). "
-            "Supports an optional 'thinking' field to capture the LLM's reasoning. "
-            "If the conversation does not exist, it is created automatically."
- ),
- inputSchema={
- "type": "object",
- "properties": {
- "conversation_id": {
- "type": "string",
-                    "description": "Conversation ID"
- },
- "new_messages": {
- "type": "array",
-                    "description": "New messages to append",
- "items": {
- "type": "object",
- "properties": {
- "author": {"type": "string"},
- "content": {"type": "string"},
-                            "thinking": {"type": "string", "description": "Internal LLM reasoning (optional)"},
- "timestamp": {"type": "string"}
- },
- "required": ["author", "content", "timestamp"]
- }
- },
- "participants": {
- "type": "array",
- "items": {"type": "string"},
-                    "description": "List of participants (required when creating)"
- },
- "context": {
- "type": "object",
-                    "description": "Conversation metadata (required when creating)"
- }
- },
- "required": ["conversation_id", "new_messages"]
- }
- )
- ]
-```
-
-PHASE 6: TESTS TO RUN
----------------------
-
-Test 1: Create a new conversation WITHOUT thinking
-```python
-await append_to_conversation(
- conversation_id="conv_test_1",
- new_messages=[
- {"author": "user", "content": "Bonjour", "timestamp": "14:30:00"},
- {"author": "assistant", "content": "Salut!", "timestamp": "14:30:05"}
- ],
- participants=["user", "assistant"],
- context={"category": "test"}
-)
-```
-
-Test 2: Création nouvelle conversation AVEC thinking
-```python
-await append_to_conversation(
- conversation_id="conv_test_2",
- new_messages=[
- {"author": "user", "content": "Comment faire un fetch?", "timestamp": "14:30:00"},
- {
- "author": "assistant",
- "content": "Voici comment...",
- "thinking": "L'utilisateur demande une explication sur fetch API...",
- "timestamp": "14:30:05"
- }
- ],
- participants=["user", "assistant"],
- context={"category": "test"}
-)
-```
-
-Test 3: Append to an existing conversation WITHOUT thinking
-```python
-await append_to_conversation(
- conversation_id="conv_test_1",
- new_messages=[
- {"author": "user", "content": "Merci!", "timestamp": "14:31:00"},
- {"author": "assistant", "content": "De rien!", "timestamp": "14:31:02"}
- ]
-)
-```
-
-Test 4: Append to an existing conversation WITH thinking
-```python
-await append_to_conversation(
- conversation_id="conv_test_2",
- new_messages=[
- {"author": "user", "content": "Et avec async/await?", "timestamp": "14:31:00"},
- {
- "author": "assistant",
- "content": "Avec async/await...",
- "thinking": "Il veut comprendre async/await avec fetch...",
- "timestamp": "14:31:05"
- }
- ]
-)
-```
-
-Test 5: Vérifier embeddings et métadonnées
-```python
-# Récupérer la conversation
-result = await search_conversations("fetch API", n_results=1)
-
-# Vérifier:
-# - Document principal contient thinking
-# - Messages individuels ont métadonnée "thinking"
-# - Embeddings corrects
-```
-
-PHASE 7: UPDATE THE UI SPEC
----------------------------
-
-In prompts/app_spec_ikario_rag_UI.txt:
-
-1. Replace add_conversation with append_to_conversation in the examples
-
-2. Document the thinking field:
-```
-MCP TOOL: append_to_conversation
-- Parameters:
-  * conversation_id: session ID
-  * new_messages: array of messages
-    - author: "user" or "assistant"
-    - content: message content
-    - thinking: LLM reasoning (OPTIONAL)
-    - timestamp: ISO date
-  * participants: ["user", "assistant"] (required if new conversation)
-  * context: {category, date, ...} (required if new conversation)
-```
-
-3. Backend usage example:
-```javascript
-// POST /api/chat/message
-const llmResponse = await callClaudeAPI(userMessage, { extended_thinking: true });
-
-await mcpClient.callTool('append_to_conversation', {
- conversation_id: conversationId,
- new_messages: [
- { author: 'user', content: userMessage, timestamp: new Date().toISOString() },
- {
- author: 'assistant',
- content: llmResponse.content,
-      thinking: llmResponse.thinking, // include the thinking when Extended Thinking is enabled
- timestamp: new Date().toISOString()
- }
- ],
- participants: ['user', 'assistant'],
- context: { category: 'chat', date: new Date().toISOString() }
-});
-```
-
-================================================================================
-PLAN SUMMARY
-================================================================================
-
-EXECUTION ORDER:
-
-1. [IN PROGRESS] Write this detailed plan ✓
-2. Create a backup commit in ikario_rag
-3. Modify add_conversation (thinking support)
-4. Implement append_to_conversation (with thinking)
-5. Modify server.py (expose append_to_conversation)
-6. Run the 5 test scenarios
-7. Update the UI spec
-8. Final commit
-9. Delete the old spec + old issues
-10. Create 15 new issues
-11. Launch the initializer-bis agent
-
-MODIFIED FILES:
-- C:/Users/david/SynologyDrive/ikario/ikario_rag/mcp_ikario_memory.py
-- C:/Users/david/SynologyDrive/ikario/ikario_rag/server.py
-- C:/GitHub/Linear_coding/prompts/app_spec_ikario_rag_UI.txt
-
-NEW MCP TOOLS:
-- append_to_conversation (8th tool)
-
-NEW FORMAT:
-- Messages with an optional "thinking" field
-- The main document includes the thinking text
-- Per-message metadata includes thinking
-
-================================================================================
-NEXT STEP
-================================================================================
-
-Does this plan work for you?
-
-If YES:
-1. I create the backup commit
-2. I start the modifications
-
-If NO:
-- Tell me what to change in the plan
-
-================================================================================
-IMPLEMENTATION COMPLETE!
-================================================================================
-Date: December 20, 2025 - 00:15
-
-EVERYTHING IS DONE AND TESTED SUCCESSFULLY!
-
-================================================================================
-SUMMARY OF CHANGES
-================================================================================
-
-MODIFIED FILES:
-1. mcp_ikario_memory.py (C:/Users/david/SynologyDrive/ikario/ikario_rag/)
-   - Line 103: add_conversation signature changed (Dict[str, Any])
-   - Lines 131-143: main document now includes thinking
-   - Lines 172-200: individual messages store thinking in metadata
-   - Lines 202-329: new append_to_conversation function (129 lines)
-
-2. server.py (C:/Users/david/SynologyDrive/ikario/ikario_rag/)
-   - Lines 173-272: append_to_conversation tool added (MCP definition)
-   - Line 195: add_conversation tool updated (thinking in schema)
-   - Lines 427-438: append_to_conversation handler added
-
-3. test_append_conversation.py (NEW - tests)
-   - 6 automated tests
-   - All pass successfully
-
-================================================================================
-COMMITS CREATED
-================================================================================
-
-Commit 1 (backup): 55d905b
-"Backup before adding append_to_conversation with thinking support"
-
-Commit 2 (implementation): cba84fe
-"Add append_to_conversation with thinking support (8th MCP tool)"
-
-================================================================================
-TESTS PASSED (6/6)
-================================================================================
-
-Test 1: Create conversation WITHOUT thinking
-[OK] Conversation added: test_conv_1 (2 messages)
-
-Test 2: Create conversation WITH thinking
-[OK] Conversation added: test_conv_2 (2 messages)
-
-Test 3: Append to conversation WITHOUT thinking
-[OK] Conversation test_conv_1 updated: added 2 messages (total: 4)
-
-Test 4: Append to conversation WITH thinking
-[OK] Conversation test_conv_2 updated: added 2 messages (total: 4)
-
-Test 5: Semantic search including thinking
-[OK] Found 1 conversation
-     Relevance: 0.481
-     Thinking visible in the main document!
-
-Test 6: Metadata verification
-[OK] Thinking metadata is present!
-     Stored in the individual messages
-
-================================================================================
-NEW MESSAGE FORMAT
-================================================================================
-
-User message (unchanged):
-{
-  "author": "user",
-  "content": "Comment faire un fetch API?",
-  "timestamp": "2025-12-20T00:10:00"
-}
-
-LLM message (NEW, with optional thinking):
-{
-  "author": "assistant",
-  "content": "Voici comment faire...",
-  "thinking": "L'utilisateur demande une explication...",  # OPTIONAL
-  "timestamp": "2025-12-20T00:10:05"
-}
-
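The message contract above can be enforced with a small validator. This is an illustrative sketch, not code from ikario_rag; the helper name is hypothetical, but the field names match the format documented in this section:

```python
# Hypothetical validator for the message format: author, content and
# timestamp are required strings; thinking is optional but, when present,
# must also be a string.
REQUIRED_FIELDS = ("author", "content", "timestamp")

def validate_message(msg: dict) -> bool:
    if not all(isinstance(msg.get(f), str) for f in REQUIRED_FIELDS):
        return False
    # thinking defaults to "" so an absent field passes the check
    return isinstance(msg.get("thinking", ""), str)
```

A backend could run this on each element of new_messages before calling the MCP tool, failing fast instead of storing malformed entries.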
-================================================================================
-NEW MCP TOOL: append_to_conversation (8th)
-================================================================================
-
-DESCRIPTION:
-Appends messages to an existing conversation (or creates it if needed).
-Supports an optional 'thinking' field to capture the LLM's reasoning.
-If the conversation does not exist, it is created automatically.
-
-PARAMETERS:
-- conversation_id: string (required)
-- new_messages: array (required)
-  * author: string
-  * content: string
-  * thinking: string (OPTIONAL)
-  * timestamp: string
-- participants: array (required on creation)
-- context: object (required on creation)
-
-USAGE EXAMPLE:
-await mcpClient.callTool('append_to_conversation', {
-  conversation_id: 'conv_20251220_0010',
-  new_messages: [
-    { author: 'user', content: 'Bonjour', timestamp: '...' },
-    {
-      author: 'assistant',
-      content: 'Salut!',
-      thinking: 'L\'utilisateur me salue...',
-      timestamp: '...'
-    }
-  ],
-  participants: ['user', 'assistant'],
-  context: { category: 'chat', date: '2025-12-20' }
-});
-
-================================================================================
-BENEFITS
-================================================================================
-
-1. THINKING CAPTURE:
-   - The LLM's reasoning is preserved in memory
-   - Richer semantic search
-   - Full traceability of the model's reflections
-
-2. AUTO-CREATE:
-   - Simpler backend (a single call)
-   - No need to track whether this is the first message
-   - Robust
-
-3. BACKWARD COMPATIBLE:
-   - thinking is optional
-   - Existing code keeps working
-   - No breaking changes
-
-4. SEMANTIC SEARCH:
-   - thinking is included in the main embedding
-   - "Find the conversation where the LLM reasoned about X"
-   - More relevant results
-
-================================================================================
-NEXT STEPS
-================================================================================
-
-1. [DONE] Update the UI spec (app_spec_ikario_rag_UI.txt) ✓
-2. [NEXT] Delete the old spec (app_spec_ikario_rag_improvements.txt)
-3. Delete the 15 old Linear issues (TEAMPHI-305 to 319)
-4. Create 15 new issues from the new spec
-5. Launch the initializer-bis agent
-
-================================================================================
-UI SPEC UPDATED!
-================================================================================
-Date: December 20, 2025 - 00:30
-
-File: prompts/app_spec_ikario_rag_UI.txt
-
-CHANGES MADE:
-
-1. Lines 9-13: Overview updated
-   - "8 MCP tools" (instead of 7)
-   - append_to_conversation added to the list
-   - Optional thinking support mentioned
-
-2. Line 44: Technology stack updated
-   - "8 MCP tools available (with append_to_conversation + thinking support)"
-
-3. Lines 103-124: API routes updated
-   - New route: POST /api/memory/conversations/append
-   - append_to_conversation documented (auto-create, thinking)
-   - Message format with thinking documented
-
-4. Lines 156-185: Memory Service Layer updated
-   - appendToConversation() function added with a complete example
-   - Auto-create and optional thinking documented
-
-5. Lines 440-462: Chat Integration updated
-   - append_to_conversation used for streaming chat
-   - POST example with optional thinking
-   - Extended Thinking support documented
-
-6. Lines 777-790: Tests updated
-   - append_to_conversation test added
-   - Optional thinking test
-   - Auto-creation test
-
-7. Lines 982-988: Success criteria updated
-   - "8 endpoints" (instead of 7)
-   - append_to_conversation validation added
-   - thinking support validation
-
-8. Lines 1012-1014: Constraints updated
-   - "8 existing MCP tools"
-   - Note: append_to_conversation already implemented (commit cba84fe)
-
-SUMMARY OF CHANGES:
-- 8 sections modified
-- Complete documentation of the new tool
-- Concrete usage examples with thinking
-- Clear distinction: add_conversation (full) vs append_to_conversation (incremental)
-- Guidelines for chat integration with thinking support
-
-THE SPEC IS READY for issue creation!
-
-GIT COMMIT CREATED:
-Commit: 3a17744
-Message: "Update UI spec for append_to_conversation and thinking support"
-
-Committed files:
-- prompts/app_spec_ikario_rag_UI.txt (updated spec)
-- navette.txt (this file)
-
-================================================================================
-CURRENT STATE - FULL RECAP
-================================================================================
-
-WORK COMPLETED:
-✓ Detailed plan written (7 phases)
-✓ Backup commit (55d905b)
-✓ mcp_ikario_memory.py modified (thinking support + append_to_conversation)
-✓ server.py modified (8th MCP tool exposed)
-✓ Automated tests (6/6 passing)
-✓ Implementation commit (cba84fe)
-✓ UI spec updated (8 sections modified)
-✓ UI spec commit (3a17744)
-
-COMMITS CREATED (3 in total):
-1. 55d905b - Backup before adding append_to_conversation
-2. cba84fe - Add append_to_conversation with thinking support (ikario_rag)
-3. 3a17744 - Update UI spec (Linear_coding)
-
-AVAILABLE MCP TOOLS (8):
-1. add_thought
-2. add_conversation (with optional thinking)
-3. append_to_conversation (NEW - incremental + auto-create + thinking)
-4. search_thoughts
-5. search_conversations
-6. search_memories
-7. trace_concept_evolution
-8. check_consistency
-
-NEW MESSAGE FORMAT:
-{
-  "author": "assistant",
-  "content": "Visible reply",
-  "thinking": "Internal LLM reasoning",  // OPTIONAL
-  "timestamp": "ISO date"
-}
-
-================================================================================
-RECOMMENDED NEXT ACTIONS
-================================================================================
-
-1. DELETE the old spec (app_spec_ikario_rag_improvements.txt)
-   - It causes confusion (it talks about modifying ikario_rag)
-   - The new spec is complete
-
-2. DELETE the 15 old Linear issues (TEAMPHI-305 to 319)
-   - Those issues are about modifying ikario_rag (no longer wanted)
-   - Clean slate for the new issues
-
-3. CREATE 15 new issues from the new spec
-   - Use: python autonomous_agent_demo.py --project-dir ikario_body --new-spec app_spec_ikario_rag_UI.txt
-   - Mode: initializer bis
-   - Issues target development in ikario_body only
-
-4. LAUNCH the coding agent
-   - After the issues are created
-   - Mode: coding agent
-   - Build the 15 UI features
-
-DO YOU WANT ME TO CONTINUE?
-Options:
-a) YES - Delete the old spec + old issues + create the new issues
-b) WAIT - You want to check something first
-c) CHANGE - You want to adjust the plan
-
-================================================================================
-IMPORTANT CLARIFICATIONS - YOUR QUESTIONS
-================================================================================
-Date: December 20, 2025 - 00:45
-
-QUESTION 1: What is the difference between search_thoughts and search_memories?
--------------------------------------------------------------------------------
-
-I checked the code in mcp_ikario_memory.py:
-
-search_thoughts (lines 191-224):
-- Searches ONLY the "thoughts" collection
-- Optional filter: filter_thought_type
-- Returns: Ikario's internal thoughts
-
-search_conversations (lines 226-282):
-- Searches ONLY the "conversations" collection
-- Optional filters: filter_category, search_level
-- Returns: David-Ikario conversations
-
-search_memories (lines 37-51):
-- PROBLEM FOUND!
-- Current code: searches ONLY self.conversations (line 43)
-- This is NOT a real global search!
-- It is essentially the same thing as search_conversations
-
-CONCLUSION:
-search_memories SHOULD perform a global search (thoughts + conversations),
-but it currently searches ONLY conversations.
-
-This is most likely a bug or an incomplete implementation.
-
-QUESTION 2: Am I mixing up the two projects?
---------------------------------------------
-
-YES, you are right! I conflated:
-
-PROJECT 1: ikario_rag (C:/Users/david/SynologyDrive/ikario/ikario_rag/)
-- Python MCP backend
-- 8 MCP tools exposed
-- ChromaDB with embeddings
-- WHAT WE DID:
-  * Added append_to_conversation (mcp_ikario_memory.py)
-  * Added thinking support (mcp_ikario_memory.py)
-  * Exposed the 8th tool (server.py)
-  * Tests (test_append_conversation.py)
-  * Commits: 55d905b, cba84fe
-
-PROJECT 2: ikario_body (C:/GitHub/Linear_coding/generations/ikario_body/)
-- React frontend + Express backend
-- User interface that CONSUMES ikario_rag's MCP tools
-- WHAT WE DID:
-  * Wrote the UI spec (prompts/app_spec_ikario_rag_UI.txt)
-  * Commit: 3a17744
-  * BUT: nothing implemented in ikario_body yet!
-
-The UI spec I wrote is for LATER, when we build the interface
-in ikario_body that will USE ikario_rag.
-
-You are right: WE MUST FINISH ikario_rag FIRST!
-
-================================================================================
-WHAT REMAINS TO DO IN ikario_rag
-================================================================================
-
-1. FIX search_memories (bug identified)
-   - It must search thoughts + conversations
-   - Not just conversations
-
-2. TEST the full MCP server
-   - Start server.py
-   - Test with a real MCP client
-   - Verify all 8 tools work
-
-3. TEST append_to_conversation over MCP
-   - Via server.py (not just test_append_conversation.py)
-   - With optional thinking
-   - Auto-create
-
-4. VERIFY backward compatibility
-   - Existing code keeps working
-   - No breaking changes
-
-ONLY THEN will we move on to ikario_body.
-
-================================================================================
-DECISION NEEDED
-================================================================================
-
-Do you want me to:
-
-A) FIX search_memories first (global search over thoughts + conversations)
-B) TEST the MCP server as-is (with the buggy search_memories)
-C) SOMETHING else?
-
-I recommend A: fix search_memories so we get a real global search.
-
-================================================================================
-search_memories FIX COMPLETE!
-================================================================================
-Date: December 20, 2025 - 00:55
-
-PROBLEM IDENTIFIED:
-search_memories searched ONLY conversations (line 43).
-It was NOT a real global search.
-
-FIX APPLIED (mcp_ikario_memory.py, lines 37-87):
-1. Search thoughts
-2. Search conversations
-3. Merge the results
-4. Sort by relevance_score (descending)
-5. Add a 'source' field (thought vs conversation)
-6. Return the combined top n_results
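The merge step of this fix can be sketched as follows. This is an illustrative, self-contained version of the ranking logic only: the function name and the (document, relevance_score) pair shape are assumptions, and the real code queries the two ChromaDB collections first:

```python
# Sketch of the global-search merge: tag each hit with its source
# collection, combine, sort by relevance, and keep the top n_results.
def merge_search_results(thought_hits, conversation_hits, n_results=5):
    merged = [
        {"content": doc, "relevance_score": score, "source": "thought"}
        for doc, score in thought_hits
    ] + [
        {"content": doc, "relevance_score": score, "source": "conversation"}
        for doc, score in conversation_hits
    ]
    # Highest relevance first, regardless of which collection it came from
    merged.sort(key=lambda r: r["relevance_score"], reverse=True)
    return merged[:n_results]

results = merge_search_results(
    [("thought about Python", 0.513), ("another thought", 0.502)],
    [("conv about Python", 0.479), ("conv 2", 0.411), ("conv 3", 0.165)],
)
```

With the sample scores above (taken from the test output below), the two thoughts rank ahead of the three conversations, matching the observed ordering.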
-
-TESTS CREATED (test_search_memories.py):
-Test 1: Search "Python" - finds thoughts AND conversations ✓
-Test 2: Verify sorting by relevance_score ✓
-Test 3: Search "JavaScript" - finds a conversation ✓
-Test 4: filter_category (affects conversations only) ✓
-
-TEST RESULTS:
-  Found 5 results:
-  - Thoughts: 2
-  - Conversations: 3
-
-  [1] Source: thought | Relevance: 0.513
-  [2] Source: thought | Relevance: 0.502
-  [3] Source: conversation | Relevance: 0.479
-  [4] Source: conversation | Relevance: 0.411
-  [5] Source: conversation | Relevance: 0.165
-
-  ✓ search_memories combines thoughts + conversations: OK
-  ✓ Results sorted by relevance: OK
-  ✓ Source field added: OK
-
-COMMIT CREATED:
-Commit: 05a4613
-Message: "Fix search_memories to search in both thoughts and conversations"
-
-Modified files:
-- mcp_ikario_memory.py (search_memories rewritten)
-- test_search_memories.py (new test file)
-
-================================================================================
-CURRENT STATE OF ikario_rag - FULL SUMMARY
-================================================================================
-
-COMMITS CREATED (3 in ikario_rag, plus 1 in Linear_coding):
-1. 55d905b - Backup before adding append_to_conversation
-2. cba84fe - Add append_to_conversation with thinking support
-3. 05a4613 - Fix search_memories (global search)
-4. (Note: commit 3a17744 lives in Linear_coding, not ikario_rag)
-
-AVAILABLE MCP TOOLS (8):
-1. add_thought - Add a thought ✓
-2. add_conversation - Add a full conversation (with optional thinking) ✓
-3. append_to_conversation - Append messages incrementally (auto-create + thinking) ✓
-4. search_thoughts - Search thoughts ✓
-5. search_conversations - Search conversations ✓
-6. search_memories - GLOBAL search (thoughts + conversations) ✓ FIXED!
-7. trace_concept_evolution - Trace a concept's evolution ✓
-8. check_consistency - Check consistency ✓
-
-TESTS RUN:
-✓ test_append_conversation.py (6/6 tests) - append + thinking
-✓ test_search_memories.py (4/4 tests) - global search
-
-STILL TO DO in ikario_rag:
-1. Test the full MCP server (server.py)
-2. Test append_to_conversation via the MCP protocol (not just Python)
-3. Verify backward compatibility
-
-================================================================================
-NEXT STEP
-================================================================================
-
-Do you want to:
-
-A) TEST the full MCP server (start server.py and test with an MCP client)
-B) WRITE an MCP-level test for append_to_conversation
-C) SOMETHING else?
-
-I recommend A: test the full MCP server to make sure everything works over the MCP protocol.
-
-================================================================================
-
-================================================================================
-CRITICAL ISSUE: EMBEDDINGS TRUNCATED FOR LONG CONVERSATIONS
-================================================================================
-Date: 2025-12-20 - 15:30
-
-PROBLEM IDENTIFIED:
--------------------
-
-1. MASSIVE TRUNCATION:
-   - Current model: all-MiniLM-L6-v2
-   - Limit: 256 tokens (~1,000 characters)
-   - Founding Conversation #1: 23,574 words (~106,000 chars)
-
-   RESULT:
-   - ChromaDB storage: ✅ the full 106,000 chars
-   - Embedding computed on: ❌ only 1,280 chars (1.2%!)
-   - Semantic search: ❌ 98.8% of the conversation is INVISIBLE
-
-   If you search for something discussed after the first 256 tokens,
-   search_memories will NEVER find it.
-
-2. QUALITY TOO LOW FOR PHILOSOPHY:
-   - all-MiniLM-L6-v2: 22M parameters (VERY small)
-   - Optimized for: speed, not deep semantic understanding
-   - Language: mainly English
-   - Performance on abstract French concepts: POOR
-
-REAL-WORLD IMPACT:
-------------------
-
-Test with different sizes:
-- 250 chars (50 words): 100% retained ✅
-- 1,000 chars (200 words): 100% retained ✅
-- 2,500 chars (500 words): 51.2% retained ⚠️
-- 10,000 chars (2,000 words): 12.8% retained ❌
-- 106,000 chars (23,574 words): 1.2% retained ❌❌❌
-
-Long philosophical conversations = CATASTROPHIC
-
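The retained fractions above can be approximated with a crude word-count proxy for tokens. This is a sketch under the assumption of roughly one token per word; the real measurement would use the model's own tokenizer (e.g. via sentence-transformers), which is not used here to keep the snippet standalone:

```python
# Estimate how much of a text survives a fixed token window, using
# whitespace words as a rough stand-in for tokens.
def retained_fraction(text: str, max_tokens: int = 256) -> float:
    words = text.split()
    return min(1.0, max_tokens / max(1, len(words)))

# Founding Conversation #1 (23,574 words) under a 256-token window:
print(f"{retained_fraction('word ' * 23574):.1%}")  # ~1.1%, matching the ~1.2% measured above
```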
-PROPOSED SOLUTION:
-==================
-
-BENCHMARK OF 3 MODELS:
-
-1. all-MiniLM-L6-v2 (CURRENT):
-   - Parameters: 22M
-   - Dimension: 384
-   - Max tokens: 256
-   - Language: English
-   - Quality: basic
-   - For Founding Conversation #1: 1.2% indexed
-   - VERDICT: ❌ Inadequate
-
-2. intfloat/multilingual-e5-large:
-   - Parameters: 560M (25x more capacity)
-   - Dimension: 1024 (2.7x richer)
-   - Max tokens: 512 (2x longer)
-   - Language: excellent French + multilingual
-   - Quality: state-of-the-art semantics
-   - For Founding Conversation #1: ~2.4% indexed
-   - VERDICT: ⚠️ Better, but still insufficient
-
-3. BAAI/bge-m3 (RECOMMENDED BY DAVID):
-   - Parameters: 568M
-   - Dimension: 1024
-   - Max tokens: 8192 (32x longer!)
-   - Language: excellent multilingual (French included)
-   - Quality: state-of-the-art retrieval
-   - Features: dense + sparse + multi-vector retrieval (hybrid)
-   - For Founding Conversation #1: ~38-40% indexed
-   - VERDICT: ✅✅✅ EXCELLENT CHOICE!
-
-ADVANTAGES OF BAAI/bge-m3:
---------------------------
-✅ Max tokens 8192 vs current 256 (a 32x improvement!)
-✅ Hybrid retrieval (dense + sparse) for better precision
-✅ Purpose-built for multilingual retrieval
-✅ Excellent on MTEB benchmarks (top 3 worldwide)
-✅ Native French support
-✅ Deep semantic understanding of abstract concepts
-✅ For the 23,574-word conversation: keeps ~9,000 words vs ~250 today
-
-PROPOSED ACTION PLAN:
-=====================
-
-OPTION A - MODEL UPGRADE ONLY (FAST):
--------------------------------------
-1. Replace all-MiniLM-L6-v2 with BAAI/bge-m3 in ikario_rag
-2. Re-index all existing conversations
-3. Test search performance
-
-File to modify:
-- C:/Users/david/SynologyDrive/ikario/ikario_rag/mcp_ikario_memory.py
-  Line 31: self.embedder = SentenceTransformer('all-MiniLM-L6-v2')
-  → Replace with: self.embedder = SentenceTransformer('BAAI/bge-m3')
-
-Pros:
-✅ Simple (one line to change)
-✅ Immediate, massive improvement
-✅ No chunking needed
-
-Cons:
-⚠️ ~2.3GB model download (one-time)
-⚠️ 2-3x slower (acceptable for batch use)
-⚠️ +4GB RAM required
-⚠️ All existing conversations must be re-indexed
-
-OPTION B - CHUNKING + MODEL UPGRADE (OPTIMAL):
-----------------------------------------------
-1. Implement smart chunking for conversations >8192 tokens
-2. Use BAAI/bge-m3 for the embeddings
-3. Metadata: conversation_id + chunk_position for reconstruction
-
-Pros:
-✅ 100% coverage, even for conversations >40,000 words
-✅ Better semantic quality
-✅ Flexible for future evolution
-
-Cons:
-⚠️ More complex to implement
-⚠️ More documents in ChromaDB
-⚠️ More sophisticated search logic
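The chunking step of Option B could look like the sketch below. It is hypothetical (the function name, the word-count proxy for tokens, and the overlap value are assumptions), but the chunk metadata mirrors the plan: conversation_id plus chunk_position for later reconstruction:

```python
# Split a long conversation into overlapping chunks that each fit the
# embedding window, tagging every chunk for reassembly. Words stand in
# for tokens here; the real pipeline would count tokens with the
# bge-m3 tokenizer.
def chunk_conversation(conversation_id, text, max_words=6000, overlap=200):
    words = text.split()
    chunks, start, position = [], 0, 0
    while start < len(words):
        piece = " ".join(words[start:start + max_words])
        chunks.append({
            "id": f"{conversation_id}_chunk_{position:03d}",
            "document": piece,
            "metadata": {"conversation_id": conversation_id,
                         "chunk_position": position},
        })
        start += max_words - overlap  # overlap preserves context across cuts
        position += 1
    return chunks
```

Each chunk would then be embedded and ingested as its own ChromaDB document; a search hit's conversation_id and chunk_position are enough to fetch the surrounding chunks.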
-
-FINAL RECOMMENDATION:
-=====================
-
-PHASE 1 (NOW): Option A - upgrade to BAAI/bge-m3
-- Immediate gain: 1.2% → 38-40% coverage
-- Simple: one line of code
-- Sufficient for 95% of your conversations
-
-PHASE 2 (IF NEEDED): add chunking for exceptional conversations >40,000 words
-- Only if you regularly have conversations >40,000 words
-- Otherwise, not necessary
-
-WHY THIS FITS YOUR USE CASE:
-----------------------------
-Philosophy, abstract concepts, complex ideas in French:
-
-- all-MiniLM-L6-v2: basic textual similarity, English-centric
-  → MTEB score: ~58/100
-  → French philosophy: ~40/100 (estimated)
-
-- BAAI/bge-m3: deep semantic understanding, multilingual
-  → MTEB score: ~72/100 (+24%)
-  → French philosophy: ~70/100 (estimated, a +75% gain!)
-
-For philosophical conversations: estimated quality gain >50%
-
-MIGRATION COST:
----------------
-- Time: ~30 min (model download + re-index)
-- Compute: 2-3x slower (1 conversation = 2s vs 0.7s today)
-- Memory: +4GB RAM (total ~5GB vs ~1GB today)
-- Storage: +2.3GB for the model
-- Code: minimal (one line to change + a re-index script)
-
-NEXT STEP:
-==========
-Decide on and implement the upgrade to BAAI/bge-m3 in ikario_rag
-
-================================================================================
diff --git a/prompts/app_spec_library_rag_types_docs.txt b/prompts/app_spec_library_rag_types_docs.txt
deleted file mode 100644
index 0fe4fa6..0000000
--- a/prompts/app_spec_library_rag_types_docs.txt
+++ /dev/null
@@ -1,679 +0,0 @@
-
- Library RAG - Type Safety & Documentation Enhancement
-
-
- Enhance the Library RAG application (philosophical texts indexing and semantic search) by adding
- strict type annotations and comprehensive Google-style docstrings to all Python modules. This will
- improve code maintainability, enable static type checking with mypy, and provide clear documentation
- for all functions, classes, and modules.
-
- The application is a RAG pipeline that processes PDF documents through OCR, LLM-based extraction,
- semantic chunking, and ingestion into Weaviate vector database. It includes a Flask web interface
- for document upload, processing, and semantic search.
-
-
-
-
- Python 3.10+
- Flask 3.0
- Weaviate 1.34.4 with text2vec-transformers
- Mistral OCR API
- Ollama (local) or Mistral API
- mypy with strict configuration
-
-
- Docker Compose (Weaviate + transformers)
- weaviate-client, flask, mistralai, python-dotenv
-
-
-
-
-
- - flask_app.py: Main Flask application (640 lines)
- - schema.py: Weaviate schema definition (383 lines)
- - utils/: 16+ modules for PDF processing pipeline
- - pdf_pipeline.py: Main orchestration (879 lines)
- - mistral_client.py: OCR API client
- - ocr_processor.py: OCR processing
- - markdown_builder.py: Markdown generation
- - llm_metadata.py: Metadata extraction via LLM
- - llm_toc.py: Table of contents extraction
- - llm_classifier.py: Section classification
- - llm_chunker.py: Semantic chunking
- - llm_cleaner.py: Chunk cleaning
- - llm_validator.py: Document validation
- - weaviate_ingest.py: Database ingestion
- - hierarchy_parser.py: Document hierarchy parsing
- - image_extractor.py: Image extraction from PDFs
- - toc_extractor*.py: Various TOC extraction methods
- - templates/: Jinja2 templates for Flask UI
- - tests/utils2/: Minimal test coverage (3 test files)
-
-
-
- - Inconsistent type annotations across modules (some have partial types, many have none)
- - Missing or incomplete docstrings (no Google-style format)
- - No mypy configuration for strict type checking
- - Type hints missing on function parameters and return values
- - Dict[str, Any] used extensively without proper typing
- - No type stubs for complex nested structures
-
-
-
-
-
-
- - Add complete type annotations to ALL functions and methods
- - Use proper generic types (List, Dict, Optional, Union) from typing module
- - Add TypedDict for complex dictionary structures
- - Add Protocol types for duck-typed interfaces
- - Use Literal types for string constants
- - Add ParamSpec and TypeVar where appropriate
- - Type all class attributes and instance variables
- - Add type annotations to lambda functions where possible
-
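The annotation requirements above can be illustrated with a short sketch. The function and its logic are hypothetical, not taken from the codebase; it only demonstrates the target style (full parameter and return annotations, Literal for string constants, Optional for nullable values):

```python
from typing import Literal, Optional

# Hypothetical pipeline helper showing the target annotation style.
def classify_section(
    title: str,
    body: str,
    hint: Optional[str] = None,
) -> Literal["chapter", "preface", "appendix", "unknown"]:
    # Prefer an explicit hint over the title when one is supplied
    lowered = (hint or title).lower()
    if "chapitre" in lowered or "chapter" in lowered:
        return "chapter"
    if "préface" in lowered or "preface" in lowered:
        return "preface"
    if "appendix" in lowered or "annexe" in lowered:
        return "appendix"
    return "unknown"
```

Under mypy --strict, a caller that compares the result against a string outside the Literal set is flagged at type-check time.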
-
-
- - Create mypy.ini with strict configuration
- - Enable: check_untyped_defs, disallow_untyped_defs, disallow_incomplete_defs
- - Enable: disallow_untyped_calls, disallow_untyped_decorators
- - Enable: warn_return_any, warn_redundant_casts
- - Enable: strict_equality, strict_optional
- - Set python_version to 3.10
- - Configure per-module overrides if needed for gradual migration
-
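A possible mypy.ini matching these requirements is sketched below. The flag names are real mypy options; the per-module override section is illustrative and would be adjusted to the project's actual package layout:

```ini
[mypy]
python_version = 3.10
check_untyped_defs = True
disallow_untyped_defs = True
disallow_incomplete_defs = True
disallow_untyped_calls = True
disallow_untyped_decorators = True
warn_return_any = True
warn_redundant_casts = True
strict_equality = True
strict_optional = True

; Example gradual-migration override (module name is hypothetical)
[mypy-utils.toc_extractor_legacy]
disallow_untyped_defs = False
```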
-
-
- - Create TypedDict definitions for common data structures:
- - OCR response structures
- - Metadata dictionaries
- - TOC entries
- - Chunk objects
- - Weaviate objects
- - Pipeline results
- - Add NewType for semantic type safety (DocumentName, ChunkId, etc.)
- - Create Protocol types for callback functions
-
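As a sketch of the requested type definitions, the shapes below are hypothetical and would be aligned with the actual dictionaries used in utils/; they only show the intended TypedDict + NewType pattern:

```python
from typing import NewType, TypedDict

# Semantic alias: a ChunkId is a str, but mypy treats it as a distinct type,
# so a plain document name cannot be passed where a chunk ID is expected.
ChunkId = NewType("ChunkId", str)

class TocEntry(TypedDict):
    title: str
    level: int
    page: int

class Chunk(TypedDict):
    chunk_id: ChunkId
    text: str
    section: str

entry: TocEntry = {"title": "Introduction", "level": 1, "page": 9}
chunk: Chunk = {"chunk_id": ChunkId("doc1-0001"), "text": "…", "section": "Introduction"}
```

With these in place, `Dict[str, Any]` in function signatures can be replaced by the precise shape, and mypy catches missing or misspelled keys.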
-
-
- - pdf_pipeline.py: Type all 10 pipeline steps, callbacks, result dictionaries
- - flask_app.py: Type all route handlers, request/response types
- - schema.py: Type Weaviate configuration objects
- - llm_*.py: Type LLM request/response structures
- - mistral_client.py: Type API client methods and responses
- - weaviate_ingest.py: Type ingestion functions and batch operations
-
-
-
-
-
- - Add comprehensive Google-style docstrings to ALL:
- - Module-level docstrings explaining purpose and usage
- - Class docstrings with Attributes section
- - Function/method docstrings with Args, Returns, Raises sections
- - Complex algorithm explanations with Examples section
- - Include code examples for public APIs
- - Document all exceptions that can be raised
- - Add Notes section for important implementation details
- - Add See Also section for related functions
-
-
-
-
- - pdf_pipeline.py: Document the 10-step pipeline, each step's purpose
- - mistral_client.py: Document OCR API usage, cost calculation
- - llm_metadata.py: Document metadata extraction logic
- - llm_toc.py: Document TOC extraction strategies
- - llm_classifier.py: Document section classification types
- - llm_chunker.py: Document semantic vs basic chunking
- - llm_cleaner.py: Document cleaning rules and validation
- - llm_validator.py: Document validation criteria
- - weaviate_ingest.py: Document ingestion process, nested objects
- - hierarchy_parser.py: Document hierarchy building algorithm
-
-
-
- - Document all routes with request/response examples
- - Document SSE (Server-Sent Events) implementation
- - Document Weaviate query patterns
- - Document upload processing workflow
- - Document background job management
-
-
-
- - Document Weaviate schema design decisions
- - Document each collection's purpose and relationships
- - Document nested object structure
- - Document vectorization strategy
-
-
-
-
- - Add inline comments for complex logic only (don't over-comment)
- - Explain WHY not WHAT (code should be self-documenting)
- - Document performance considerations
- - Document cost implications (OCR, LLM API calls)
- - Document error handling strategies
-
-
-
-
-
- - All modules must pass mypy --strict
- - No # type: ignore comments without justification
- - CI/CD should run mypy checks
- - Type coverage should be 100%
-
-
-
- - All public functions must have docstrings
- - All docstrings must follow Google style
- - Examples should be executable and tested
- - Documentation should be clear and concise
-
-
-
-
-
-
- Priority 1 (Most used, most complex):
- 1. utils/pdf_pipeline.py - Main orchestration
- 2. flask_app.py - Web application entry point
- 3. utils/weaviate_ingest.py - Database operations
- 4. schema.py - Schema definition
-
- Priority 2 (Core LLM modules):
- 5. utils/llm_metadata.py
- 6. utils/llm_toc.py
- 7. utils/llm_classifier.py
- 8. utils/llm_chunker.py
- 9. utils/llm_cleaner.py
- 10. utils/llm_validator.py
-
- Priority 3 (OCR and parsing):
- 11. utils/mistral_client.py
- 12. utils/ocr_processor.py
- 13. utils/markdown_builder.py
- 14. utils/hierarchy_parser.py
- 15. utils/image_extractor.py
-
- Priority 4 (Supporting modules):
- 16. utils/toc_extractor.py
- 17. utils/toc_extractor_markdown.py
- 18. utils/toc_extractor_visual.py
- 19. utils/llm_structurer.py (legacy)
-
-
-
-
-
- Setup Type Checking Infrastructure
-
- Configure mypy with strict settings and create foundational type definitions
-
-
- - Create mypy.ini configuration file with strict settings
- - Add mypy to requirements.txt or dev dependencies
- - Create utils/types.py module for common TypedDict definitions
- - Define core types: OCRResponse, Metadata, TOCEntry, ChunkData, PipelineResult
- - Add NewType definitions for semantic types: DocumentName, ChunkId, SectionPath
- - Create Protocol types for callbacks (ProgressCallback, etc.)
- - Document type definitions in utils/types.py module docstring
- - Test mypy configuration on a single module to verify settings
-
-
- - mypy.ini exists with strict configuration
- - utils/types.py contains all foundational types with docstrings
- - mypy runs without errors on utils/types.py
- - Type definitions are comprehensive and reusable
-
-
-
-
- Add Types to PDF Pipeline Orchestration
-
- Add complete type annotations to pdf_pipeline.py (879 lines, most complex module)
-
-
- - Add type annotations to all function signatures in pdf_pipeline.py
- - Type the 10-step pipeline: OCR, Markdown, Metadata, TOC, Classify, Chunk, Clean, Enrich, Validate, Weaviate
- - Type progress_callback parameter with Protocol or Callable
- - Add TypedDict for pipeline options dictionary
- - Add TypedDict for pipeline result dictionary structure
- - Type all helper functions (extract_document_metadata_legacy, etc.)
- - Add proper return types for process_pdf_v2, process_pdf, process_pdf_bytes
- - Fix any mypy errors that arise
- - Verify mypy --strict passes on pdf_pipeline.py
-
-
- - All functions in pdf_pipeline.py have complete type annotations
- - progress_callback is properly typed with Protocol
- - All Dict[str, Any] replaced with TypedDict where appropriate
- - mypy --strict pdf_pipeline.py passes with zero errors
- - No # type: ignore comments (or justified if absolutely necessary)
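
The callback plumbing could look like this minimal sketch (run_step and StepResult are illustrative stand-ins for the real pipeline step functions, not the existing pdf_pipeline.py API):

```python
from typing import Optional, Protocol, TypedDict

class ProgressCallback(Protocol):
    """Callable invoked as (step_id, status, detail) around each step."""
    def __call__(self, step_id: str, status: str, detail: str) -> None: ...

class StepResult(TypedDict):
    success: bool
    detail: str

def run_step(
    step_id: str,
    progress_callback: Optional[ProgressCallback] = None,
) -> StepResult:
    """Run one (dummy) pipeline step, emitting progress events around it."""
    if progress_callback is not None:
        progress_callback(step_id, "running", f"Starting {step_id}")
    # ... the real step work (OCR, metadata extraction, etc.) would go here ...
    result: StepResult = {"success": True, "detail": f"{step_id} done"}
    if progress_callback is not None:
        progress_callback(step_id, "done", result["detail"])
    return result

events: list[tuple[str, str, str]] = []
run_step("ocr", lambda s, st, d: events.append((s, st, d)))
print(events)
```

A Protocol (rather than a bare Callable) keeps the parameter names part of the contract, so call sites can use keywords safely.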
-
-
-
-
- Add Types to Flask Application
-
- Add complete type annotations to flask_app.py and type all routes
-
-
- - Add type annotations to all Flask route handlers
- - Type request.args, request.form, request.files usage
- - Type jsonify() return values
- - Type get_weaviate_client context manager
- - Type get_collection_stats, get_all_chunks, search_chunks functions
- - Add TypedDict for Weaviate query results
- - Type background job processing functions (run_processing_job)
- - Type SSE generator function (upload_progress)
- - Add type hints for template rendering
- - Verify mypy --strict passes on flask_app.py
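
A stdlib-only sketch of what the typed query results might look like (SearchHit, SearchResponse, and to_response are proposals; the real helpers call the Weaviate client and pass the result to jsonify()):

```python
from typing import Any, TypedDict

class SearchHit(TypedDict):
    document: str
    section_path: str
    text: str
    score: float

class SearchResponse(TypedDict):
    query: str
    hits: list[SearchHit]

def to_response(query: str, raw_hits: list[dict[str, Any]]) -> SearchResponse:
    """Convert raw, loosely-typed result dicts into the typed response shape."""
    hits: list[SearchHit] = [
        {
            "document": h.get("document", ""),
            "section_path": h.get("section_path", ""),
            "text": h.get("text", ""),
            "score": float(h.get("score", 0.0)),
        }
        for h in raw_hits
    ]
    return {"query": query, "hits": hits}

resp = to_response("vertu", [{"document": "menon", "text": "extrait", "score": 0.87}])
print(resp["hits"][0]["score"])
```

Typing the conversion boundary once means route handlers only ever see SearchResponse, not Dict[str, Any].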
-
-
- - All Flask routes have complete type annotations
- - Request/response types are clear and documented
- - Weaviate query functions are properly typed
- - SSE generator is correctly typed
- - mypy --strict flask_app.py passes with zero errors
-
-
-
-
- Add Types to Core LLM Modules
-
- Add complete type annotations to all LLM processing modules (metadata, TOC, classifier, chunker, cleaner, validator)
-
-
- - llm_metadata.py: Type extract_metadata function, return structure
- - llm_toc.py: Type extract_toc function, TOC hierarchy structure
- - llm_classifier.py: Type classify_sections, section types (Literal), validation functions
- - llm_chunker.py: Type chunk_section_with_llm, chunk objects
- - llm_cleaner.py: Type clean_chunk, is_chunk_valid functions
- - llm_validator.py: Type validate_document, validation result structure
- - Add TypedDict for LLM request/response structures
- - Type provider selection ("ollama" | "mistral" as Literal)
- - Type model names with Literal or constants
- - Verify mypy --strict passes on all llm_*.py modules
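
Provider and model typing could be sketched like this (resolve_model is a proposed helper; the default model names match those documented for process_pdf_v2):

```python
from typing import Literal, Optional

# Literal lets mypy reject typos like "olama" at call sites.
LLMProvider = Literal["ollama", "mistral"]

DEFAULT_MODELS: dict[LLMProvider, str] = {
    "ollama": "qwen2.5:7b",
    "mistral": "mistral-small-latest",
}

def resolve_model(provider: LLMProvider, model: Optional[str] = None) -> str:
    """Return the explicit model name, falling back to the provider default."""
    return model if model is not None else DEFAULT_MODELS[provider]

print(resolve_model("ollama"))
print(resolve_model("mistral", "mistral-large-latest"))
```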
-
-
- - All LLM modules have complete type annotations
- - Section types use Literal for type safety
- - Provider and model parameters are strongly typed
- - LLM request/response structures use TypedDict
- - mypy --strict passes on all llm_*.py modules with zero errors
-
-
-
-
- Add Types to Weaviate and Database Modules
-
- Add complete type annotations to schema.py and weaviate_ingest.py
-
-
- - schema.py: Type Weaviate configuration objects
- - schema.py: Type collection property definitions
- - weaviate_ingest.py: Type ingest_document function signature
- - weaviate_ingest.py: Type delete_document_chunks function
- - weaviate_ingest.py: Add TypedDict for Weaviate object structure
- - Type batch insertion operations
- - Type nested object references (work, document)
- - Add proper error types for Weaviate exceptions
- - Verify mypy --strict passes on both modules
-
-
- - schema.py has complete type annotations for Weaviate config
- - weaviate_ingest.py functions are fully typed
- - Nested object structures use TypedDict
- - Weaviate client operations are properly typed
- - mypy --strict passes on both modules with zero errors
-
-
-
-
- Add Types to OCR and Parsing Modules
-
- Add complete type annotations to mistral_client.py, ocr_processor.py, markdown_builder.py, hierarchy_parser.py
-
-
- - mistral_client.py: Type create_client, run_ocr, estimate_ocr_cost
- - mistral_client.py: Add TypedDict for Mistral API response structures
- - ocr_processor.py: Type serialize_ocr_response, OCR object structures
- - markdown_builder.py: Type build_markdown, image_writer parameter
- - hierarchy_parser.py: Type build_hierarchy, flatten_hierarchy functions
- - hierarchy_parser.py: Add TypedDict for hierarchy node structure
- - image_extractor.py: Type create_image_writer, image handling
- - Verify mypy --strict passes on all modules
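
The hierarchy node could be expressed as a recursive TypedDict, for example (HierarchyNode and flatten are proposed shapes, not the existing hierarchy_parser API):

```python
from typing import TypedDict

class HierarchyNode(TypedDict):
    """One node of the document hierarchy (proposed shape)."""
    title: str
    level: int
    children: list["HierarchyNode"]  # forward reference makes it recursive

def flatten(node: HierarchyNode) -> list[str]:
    """Depth-first list of section titles."""
    titles = [node["title"]]
    for child in node["children"]:
        titles.extend(flatten(child))
    return titles

root: HierarchyNode = {
    "title": "Menon",
    "level": 1,
    "children": [
        {"title": "La vertu", "level": 2, "children": []},
    ],
}
print(flatten(root))
```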
-
-
- - All OCR/parsing modules have complete type annotations
- - Mistral API structures use TypedDict
- - Hierarchy nodes are properly typed
- - Image handling functions are typed
- - mypy --strict passes on all modules with zero errors
-
-
-
-
- Add Google-Style Docstrings to Core Modules
-
- Add comprehensive Google-style docstrings to pdf_pipeline.py, flask_app.py, and weaviate modules
-
-
- - pdf_pipeline.py: Add module docstring explaining the V2 pipeline
- - pdf_pipeline.py: Add docstrings to process_pdf_v2 with Args, Returns, Raises sections
- - pdf_pipeline.py: Document each of the 10 pipeline steps in comments
- - pdf_pipeline.py: Add Examples section showing typical usage
- - flask_app.py: Add module docstring explaining Flask application
- - flask_app.py: Document all routes with request/response examples
- - flask_app.py: Document Weaviate connection management
- - schema.py: Add module docstring explaining schema design
- - schema.py: Document each collection's purpose and relationships
- - weaviate_ingest.py: Document ingestion process with examples
- - All docstrings must follow Google style format exactly
-
-
- - All core modules have comprehensive module-level docstrings
- - All public functions have Google-style docstrings
- - Args, Returns, Raises sections are complete and accurate
- - Examples are provided for complex functions
- - Docstrings explain WHY, not just WHAT
-
-
-
-
- Add Google-Style Docstrings to LLM Modules
-
- Add comprehensive Google-style docstrings to all LLM processing modules
-
-
- - llm_metadata.py: Document metadata extraction logic with examples
- - llm_toc.py: Document TOC extraction strategies and fallbacks
- - llm_classifier.py: Document section types and classification criteria
- - llm_chunker.py: Document semantic vs basic chunking approaches
- - llm_cleaner.py: Document cleaning rules and validation logic
- - llm_validator.py: Document validation criteria and corrections
- - Add Examples sections showing input/output for each function
- - Document LLM provider differences (Ollama vs Mistral)
- - Document cost implications in Notes sections
- - All docstrings must follow Google style format exactly
-
-
- - All LLM modules have comprehensive docstrings
- - Each function has Args, Returns, Raises sections
- - Examples show realistic input/output
- - Provider differences are documented
- - Cost implications are noted where relevant
-
-
-
-
- Add Google-Style Docstrings to OCR and Parsing Modules
-
- Add comprehensive Google-style docstrings to OCR, markdown, hierarchy, and extraction modules
-
-
- - mistral_client.py: Document OCR API usage, cost calculation
- - ocr_processor.py: Document OCR response processing
- - markdown_builder.py: Document markdown generation strategy
- - hierarchy_parser.py: Document hierarchy building algorithm
- - image_extractor.py: Document image extraction process
- - toc_extractor*.py: Document various TOC extraction methods
- - Add Examples sections for complex algorithms
- - Document edge cases and error handling
- - All docstrings must follow Google style format exactly
-
-
- - All OCR/parsing modules have comprehensive docstrings
- - Complex algorithms are well explained
- - Edge cases are documented
- - Error handling is documented
- - Examples demonstrate typical usage
-
-
-
-
- Final Validation and CI Integration
-
- Verify all type annotations and docstrings, integrate mypy into CI/CD
-
-
- - Run mypy --strict on entire codebase, verify 100% pass rate
- - Verify all public functions have docstrings
- - Check docstring formatting with pydocstyle or similar tool
- - Create GitHub Actions workflow to run mypy on every commit
- - Update README.md with type checking instructions
- - Update CLAUDE.md with documentation standards
- - Create CONTRIBUTING.md with type annotation and docstring guidelines
- - Generate API documentation with Sphinx or pdoc
- - Fix any remaining mypy errors or missing docstrings
-
-
- - mypy --strict passes on entire codebase with zero errors
- - All public functions have Google-style docstrings
- - CI/CD runs mypy checks automatically
- - Documentation is generated and accessible
- - Contributing guidelines document type/docstring requirements
-
-
-
-
-
-
- - 100% type coverage across all modules
- - mypy --strict passes with zero errors
- - No # type: ignore comments without justification
- - All Dict[str, Any] replaced with TypedDict where appropriate
- - Proper use of generics, protocols, and type variables
- - NewType used for semantic type safety
-
-
-
- - All modules have comprehensive module-level docstrings
- - All public functions/classes have Google-style docstrings
- - All docstrings include Args, Returns, Raises sections
- - Complex functions include Examples sections
- - Cost implications documented in Notes sections
- - Error handling clearly documented
- - Provider differences (Ollama vs Mistral) documented
-
-
-
- - Code is self-documenting with clear variable names
- - Inline comments explain WHY, not WHAT
- - Complex algorithms are well explained
- - Performance considerations documented
- - Security considerations documented
-
-
-
- - IDE autocomplete works perfectly with type hints
- - Type errors caught at development time, not runtime
- - Documentation is easily accessible in IDE
- - API examples are executable and tested
- - Contributing guidelines are clear and comprehensive
-
-
-
- - Refactoring is safer with type checking
- - Function signatures are self-documenting
- - API contracts are explicit and enforced
- - Breaking changes are caught by type checker
- - New developers can understand code quickly
-
-
-
-
-
- - Must maintain backward compatibility with existing code
- - Cannot break existing Flask routes or API contracts
- - Weaviate schema must remain unchanged
- - Existing tests must continue to pass
-
-
-
- - Can use per-module mypy configuration for gradual migration
- - Can temporarily disable strict checks on legacy modules
- - Priority modules must be completed first
- - Low-priority modules can be deferred
-
-
-
- - All type annotations must use Python 3.10+ syntax
- - Docstrings must follow Google style exactly (not NumPy or reStructuredText)
- - Use typing-module forms (List, Dict, Optional) only while Python 3.9 support is required; otherwise prefer the 3.10+ built-in generics and X | None syntax
- - Use from __future__ import annotations if needed for forward references
-
-
-
-
-
- - Run mypy --strict on each module after adding types
- - Use mypy daemon (dmypy) for faster incremental checking
- - Add mypy to pre-commit hooks
- - CI/CD must run mypy and fail on type errors
-
-
-
- - Use pydocstyle to validate Google-style format
- - Use sphinx-build to generate docs and catch errors
- - Manual review of docstring examples
- - Verify examples are executable and correct
-
-
-
- - Verify existing tests still pass after type additions
- - Add new tests for complex typed structures
- - Test mypy configuration on sample code
- - Verify IDE autocomplete works correctly
-
-
-
-
-
- ```python
- """
- PDF Pipeline V2 - Intelligent document processing with LLM enhancement.
-
- This module orchestrates a 10-step pipeline for processing PDF documents:
- 1. OCR via Mistral API
- 2. Markdown construction with images
- 3. Metadata extraction via LLM
- 4. Table of contents (TOC) extraction
- 5. Section classification
- 6. Semantic chunking
- 7. Chunk cleaning and validation
- 8. Enrichment with concepts
- 9. Validation and corrections
- 10. Ingestion into Weaviate vector database
-
- The pipeline supports multiple LLM providers (Ollama local, Mistral API) and
- various processing modes (skip OCR, semantic chunking, OCR annotations).
-
- Typical usage:
- >>> from pathlib import Path
- >>> from utils.pdf_pipeline import process_pdf
- >>>
- >>> result = process_pdf(
- ... Path("document.pdf"),
- ... use_llm=True,
- ... llm_provider="ollama",
- ... ingest_to_weaviate=True,
- ... )
- >>> print(f"Processed {result['pages']} pages, {result['chunks_count']} chunks")
-
- See Also:
- mistral_client: OCR API client
- llm_metadata: Metadata extraction
- weaviate_ingest: Database ingestion
- """
- ```
-
-
-
- ```python
- def process_pdf_v2(
- pdf_path: Path,
- output_dir: Path = Path("output"),
- *,
- use_llm: bool = True,
- llm_provider: Literal["ollama", "mistral"] = "ollama",
- llm_model: Optional[str] = None,
- skip_ocr: bool = False,
- ingest_to_weaviate: bool = True,
- progress_callback: Optional[ProgressCallback] = None,
- ) -> PipelineResult:
- """
- Process a PDF through the complete V2 pipeline with LLM enhancement.
-
- This function orchestrates all 10 steps of the intelligent document processing
- pipeline, from OCR to Weaviate ingestion. It supports both local (Ollama) and
- cloud (Mistral API) LLM providers, with optional caching via skip_ocr.
-
- Args:
- pdf_path: Absolute path to the PDF file to process.
- output_dir: Base directory for output files. Defaults to "./output".
- use_llm: Enable LLM-based processing (metadata, TOC, chunking).
- If False, uses basic heuristic processing.
- llm_provider: LLM provider to use. "ollama" for local (free but slow),
- "mistral" for API (fast but paid).
- llm_model: Specific model name. If None, auto-detects based on provider
- (qwen2.5:7b for ollama, mistral-small-latest for mistral).
- skip_ocr: If True, reuses existing markdown file to avoid OCR cost.
- Requires output_dir//.md to exist.
- ingest_to_weaviate: If True, ingests chunks into Weaviate after processing.
- progress_callback: Optional callback for real-time progress updates.
- Called with (step_id, status, detail) for each pipeline step.
-
- Returns:
- Dictionary containing processing results with the following keys:
- - success (bool): True if processing completed without errors
- - document_name (str): Name of the processed document
- - pages (int): Number of pages in the PDF
- - chunks_count (int): Number of chunks generated
- - cost_ocr (float): OCR cost in euros (0 if skip_ocr=True)
- - cost_llm (float): LLM API cost in euros (0 if provider=ollama)
- - cost_total (float): Total cost (ocr + llm)
- - metadata (dict): Extracted metadata (title, author, etc.)
- - toc (list): Hierarchical table of contents
- - files (dict): Paths to generated files (markdown, chunks, etc.)
-
- Raises:
- FileNotFoundError: If pdf_path does not exist.
- ValueError: If skip_ocr=True but markdown file not found.
- RuntimeError: If Weaviate connection fails during ingestion.
-
- Examples:
- Basic usage with Ollama (free):
- >>> result = process_pdf_v2(
- ... Path("platon_menon.pdf"),
- ... llm_provider="ollama"
- ... )
- >>> print(f"Cost: {result['cost_total']:.4f}€")  # OCR only
- Cost: 0.0270€
-
- With Mistral API (faster):
- >>> result = process_pdf_v2(
- ... Path("platon_menon.pdf"),
- ... llm_provider="mistral",
- ... llm_model="mistral-small-latest"
- ... )
-
- Skip OCR to avoid cost:
- >>> result = process_pdf_v2(
- ... Path("platon_menon.pdf"),
- ... skip_ocr=True, # Reuses existing markdown
- ... ingest_to_weaviate=False
- ... )
-
- Notes:
- - OCR cost: ~0.003€/page (standard), ~0.009€/page (with annotations)
- - LLM cost: Free with Ollama, variable with Mistral API
- - Processing time: ~30s/page with Ollama, ~5s/page with Mistral
- - Weaviate must be running (docker-compose up -d) before ingestion
- """
- ```
-
-
-
diff --git a/prompts/app_spec_markdown_support.txt b/prompts/app_spec_markdown_support.txt
deleted file mode 100644
index 5cae3aa..0000000
--- a/prompts/app_spec_markdown_support.txt
+++ /dev/null
@@ -1,490 +0,0 @@
-
- Library RAG - Native Markdown Support
-
-
- Add native support for Markdown (.md) files to the Library RAG application. Currently, the system only accepts PDF files
- and uses Mistral OCR for text extraction. This feature will allow users to upload pre-existing Markdown files directly,
- skipping the expensive OCR step while still benefiting from LLM-based metadata extraction, TOC generation, semantic
- chunking, and Weaviate vectorization.
-
- This enhancement reduces costs, improves processing speed for already-digitized texts, and makes the system more flexible
- for users who have philosophical texts in Markdown format.
-
-
-
-
- Flask 3.0
- utils/pdf_pipeline.py (to be extended)
- Werkzeug secure_filename
- Ollama (local) or Mistral API
- Weaviate with BAAI/bge-m3
-
-
- mypy strict mode
- Google-style docstrings required
-
-
-
-
-
- Update Flask File Validation
-
- Modify the Flask application to accept both PDF and Markdown files. Update the ALLOWED_EXTENSIONS
- configuration and file validation logic to support .md files while maintaining backward compatibility
- with existing PDF workflows.
-
- 1
- backend
-
- - flask_app.py (line 99: ALLOWED_EXTENSIONS, line 427: allowed_file function)
-
-
- - Change ALLOWED_EXTENSIONS from {"pdf"} to {"pdf", "md"}
- - Update allowed_file() function to accept both extensions
- - Update upload.html template to accept .md files in file input
- - Update error messages to reflect both formats
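
The validation change might follow the standard Flask pattern (a sketch — the existing allowed_file() in flask_app.py may differ in detail):

```python
ALLOWED_EXTENSIONS = {"pdf", "md"}

def allowed_file(filename: str) -> bool:
    """Accept a filename whose final extension is in ALLOWED_EXTENSIONS."""
    return (
        "." in filename
        and filename.rsplit(".", 1)[1].lower() in ALLOWED_EXTENSIONS
    )

print(allowed_file("menon.md"))    # accepted
print(allowed_file("menon.PDF"))   # accepted (case-insensitive)
print(allowed_file("menon.docx"))  # rejected
```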
-
-
- 1. Start Flask app
- 2. Navigate to /upload
- 3. Attempt to upload a .md file
- 4. Verify file is accepted (no "Format non supporté" error)
- 5. Verify PDF upload still works
-
-
-
-
- Add Markdown Detection in Pipeline
-
- Enhance pdf_pipeline.py to detect when a Markdown file is being processed instead of a PDF.
- Add logic to automatically skip OCR processing for .md files and copy the Markdown content
- directly to the output directory.
-
- 1
- backend
-
- - utils/pdf_pipeline.py (process_pdf_v2 function, around line 250-450)
-
-
- - Add file extension detection: `file_ext = pdf_path.suffix.lower()`
- - If file_ext == ".md":
- - Skip OCR step entirely (no Mistral API call)
- - Read Markdown content directly: `md_content = pdf_path.read_text(encoding='utf-8')`
- - Copy to output: `md_path.write_text(md_content, encoding='utf-8')`
- - Set nb_pages = max(md_content.count('\n# ') + int(md_content.startswith('# ')), 1) (estimate from H1 headers, counting a leading H1 too)
- - Set cost_ocr = 0.0
- - Emit progress: "markdown_load" instead of "ocr"
- - If file_ext == ".pdf":
- - Continue with existing OCR workflow
- - Both paths converge at LLM processing (metadata, TOC, chunking)
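
The branch above could be sketched as follows (load_source_text is a hypothetical helper; the real process_pdf_v2 inlines this logic, emits progress events, and runs the Mistral OCR call in the .pdf branch):

```python
import tempfile
from pathlib import Path

def load_source_text(input_path: Path, output_dir: Path) -> tuple[str, int, float]:
    """Return (markdown, estimated_pages, ocr_cost), skipping OCR for .md input."""
    if input_path.suffix.lower() == ".md":
        md_content = input_path.read_text(encoding="utf-8")
        # Copy the Markdown into the output directory unchanged.
        (output_dir / input_path.name).write_text(md_content, encoding="utf-8")
        # Estimate "pages" from H1 headers, counting a leading H1 as well.
        nb_pages = max(md_content.count("\n# ") + int(md_content.startswith("# ")), 1)
        return md_content, nb_pages, 0.0
    # .pdf branch: the Mistral OCR client would be called here (placeholder).
    raise NotImplementedError("PDF branch requires the Mistral OCR client")

with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "output"
    out.mkdir()
    src = Path(tmp) / "test.md"
    src.write_text("# Titre\n\nTexte.\n# Suite\n", encoding="utf-8")
    text, pages, cost = load_source_text(src, out)
    print(pages, cost)
```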
-
-
- 1. Create test Markdown file with philosophical content
- 2. Call process_pdf(Path("test.md"), use_llm=True)
- 3. Verify OCR is skipped (cost_ocr = 0.0)
- 4. Verify output/test/test.md is created
- 5. Verify no _ocr.json file is created
- 6. Verify LLM processing runs normally
-
-
-
-
- Markdown-Specific Progress Callback
-
- Update the progress callback system to emit appropriate events for Markdown file processing.
- Instead of "OCR Mistral en cours...", display "Chargement Markdown..." to provide accurate
- user feedback during Server-Sent Events streaming.
-
- 2
- backend
-
- - utils/pdf_pipeline.py (emit_progress calls)
- - flask_app.py (process_file_background function)
-
-
- - Add conditional progress messages based on file type
- - For .md files: emit_progress("markdown_load", "running", "Chargement du fichier Markdown...")
- - For .pdf files: emit_progress("ocr", "running", "OCR Mistral en cours...")
- - Update frontend to handle "markdown_load" event type
- - Ensure step numbering adjusts (9 steps for MD vs 10 for PDF)
-
-
- 1. Upload Markdown file via Flask interface
- 2. Monitor SSE progress stream at /upload/progress/<job_id>
- 3. Verify first step shows "Chargement du fichier Markdown..."
- 4. Verify no OCR-related messages appear
- 5. Verify subsequent steps (metadata, TOC, etc.) work normally
-
-
-
-
- Update process_pdf_bytes for Markdown
-
- Extend process_pdf_bytes() function to handle Markdown content uploaded via Flask.
- This function currently creates a temporary PDF file, but for Markdown uploads,
- it should create a temporary .md file instead.
-
- 1
- backend
-
- - utils/pdf_pipeline.py (process_pdf_bytes function, line 1255)
-
-
- - Detect file type from filename parameter
- - If filename ends with .md:
- - Create temp file with suffix=".md"
- - Write file_bytes as UTF-8 text
- - If filename ends with .pdf:
- - Existing behavior (suffix=".pdf", binary write)
- - Pass temp file path to process_pdf() which now handles both types
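
A minimal sketch of the suffix handling (write_temp_upload is a hypothetical helper; process_pdf_bytes would then call process_pdf on the returned path and delete the file afterwards):

```python
import tempfile
from pathlib import Path

def write_temp_upload(file_bytes: bytes, filename: str) -> Path:
    """Persist an upload to a temp file with the matching suffix.

    Flask hands the upload over as bytes either way; for .md the bytes are
    assumed to already be UTF-8 text, so writing them directly preserves them.
    The caller is responsible for deleting the file after processing.
    """
    suffix = ".md" if filename.lower().endswith(".md") else ".pdf"
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(file_bytes)
        return Path(tmp.name)

p = write_temp_upload("# Titre\n".encode("utf-8"), "menon.md")
print(p.suffix)
p.unlink()
```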
-
-
- 1. Create Flask test client
- 2. POST multipart form with .md file to /upload
- 3. Verify process_pdf_bytes creates .md temp file
- 4. Verify temp file contains correct Markdown content
- 5. Verify cleanup deletes temp file after processing
-
-
-
-
- Add Markdown File Validation
-
- Implement validation for uploaded Markdown files to ensure they contain valid UTF-8 text
- and basic Markdown structure. Reject files that are too large, contain binary data,
- or have no meaningful content.
-
- 2
- backend
-
- - utils/markdown_validator.py
-
-
- - Create validate_markdown_file(file_path: Path) -> dict[str, Any] function
- - Checks:
- - File size < 10 MB
- - Valid UTF-8 encoding
- - Contains at least one header (#, ##, etc.)
- - Not empty (at least 100 characters)
- - No null bytes or excessive binary content
- - Return dict with success, error, and warnings keys
- - Call from process_pdf_v2 before processing
- - Type annotations and Google-style docstrings required
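
A sketch of the validator under these assumptions (error strings mirror the test plan below; a production version might read only the head of very large files before the full decode):

```python
import tempfile
from pathlib import Path
from typing import Any

MAX_SIZE = 10 * 1024 * 1024  # 10 MB

def validate_markdown_file(file_path: Path) -> dict[str, Any]:
    """Validate an uploaded Markdown file; returns success/error/warnings keys."""
    result: dict[str, Any] = {"success": True, "error": None, "warnings": []}
    raw = file_path.read_bytes()
    if len(raw) > MAX_SIZE:
        return {"success": False, "error": "File too large", "warnings": []}
    if b"\x00" in raw:  # null bytes signal binary content
        return {"success": False, "error": "Invalid UTF-8", "warnings": []}
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return {"success": False, "error": "Invalid UTF-8", "warnings": []}
    if len(text) < 100:
        return {"success": False, "error": "File too short", "warnings": []}
    if not any(line.lstrip().startswith("#") for line in text.splitlines()):
        result["warnings"].append("No Markdown headers found")  # warn, continue
    return result

_tmp = Path(tempfile.mkdtemp())
(_tmp / "good.md").write_text("# Titre\n\n" + "Texte. " * 20, encoding="utf-8")
(_tmp / "short.md").write_text("# x", encoding="utf-8")
(_tmp / "bin.md").write_bytes(b"\x00" * 200)
good_result = validate_markdown_file(_tmp / "good.md")
short_result = validate_markdown_file(_tmp / "short.md")
bin_result = validate_markdown_file(_tmp / "bin.md")
print(good_result, short_result, bin_result)
```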
-
-
- 1. Test with valid Markdown file → passes validation
- 2. Test with empty file → fails with "File too short"
- 3. Test with binary file (.exe renamed to .md) → fails with "Invalid UTF-8"
- 4. Test with very large file (>10MB) → fails with "File too large"
- 5. Test with plain text no headers → warning but continues
-
-
-
-
- Update Documentation
-
- Update README.md and .claude/CLAUDE.md to document the new Markdown support feature.
- Include usage examples, cost comparison (PDF vs MD), and troubleshooting tips.
-
- 3
- documentation
-
- - README.md (add section under "Pipeline de Traitement")
- - .claude/CLAUDE.md (update development guidelines)
- - templates/upload.html (add help text)
-
-
- - README.md:
- - Add "Support Markdown Natif" section
- - Document accepted formats: PDF, MD
- - Show cost comparison table (PDF: ~0.003€/page, MD: 0€)
- - Add example: process_pdf(Path("document.md"))
- - CLAUDE.md:
- - Update "Pipeline de Traitement" section
- - Note conditional OCR step
- - Document markdown_validator.py module
- - upload.html:
- - Update file input accept attribute: accept=".pdf,.md"
- - Add help text: "Formats acceptés : PDF, Markdown (.md)"
-
-
- 1. Read README.md markdown support section
- 2. Verify examples are clear and accurate
- 3. Check CLAUDE.md developer notes
- 4. Open /upload in browser
- 5. Verify help text displays correctly
-
-
-
-
- Add Unit Tests for Markdown Processing
-
- Create comprehensive unit tests for Markdown file handling to ensure reliability
- and prevent regressions. Cover file validation, pipeline processing, and edge cases.
-
- 2
- testing
-
- - tests/utils/test_markdown_validator.py
- - tests/utils/test_pdf_pipeline_markdown.py
- - tests/fixtures/sample.md
-
-
- - test_markdown_validator.py:
- - Test valid Markdown acceptance
- - Test invalid encoding rejection
- - Test file size limits
- - Test empty file rejection
- - Test binary data detection
- - test_pdf_pipeline_markdown.py:
- - Test Markdown file processing end-to-end
- - Test OCR skip for .md files
- - Test cost_ocr = 0.0
- - Test LLM processing (metadata, TOC, chunking)
- - Mock Weaviate ingestion
- - Verify output files created correctly
- - fixtures/sample.md:
- - Create realistic philosophical text in Markdown
- - Include headers, paragraphs, formatting
- - ~1000 words for realistic testing
-
-
- 1. Run: pytest tests/utils/test_markdown_validator.py -v
- 2. Verify all validation tests pass
- 3. Run: pytest tests/utils/test_pdf_pipeline_markdown.py -v
- 4. Verify end-to-end Markdown processing works
- 5. Check test coverage: pytest --cov=utils --cov-report=html
-
-
-
-
- Type Safety and Documentation
-
- Ensure all new code follows strict type safety requirements and includes comprehensive
- Google-style docstrings. Run mypy checks and update type definitions as needed.
-
- 2
- type_safety
-
- - utils/types.py (add Markdown-specific types if needed)
- - All modified modules (type annotations)
-
-
- - Add type annotations to all new functions
- - Update existing functions that handle both PDF and MD
- - Consider adding:
- - FileFormat = Literal["pdf", "md"]
- - MarkdownValidationResult = TypedDict(...)
- - Run mypy --strict on all modified files
- - Add Google-style docstrings with:
- - Args section documenting all parameters
- - Returns section with structure details
- - Raises section for exceptions
- - Examples section for complex functions
-
-
- 1. Run: mypy utils/pdf_pipeline.py --strict
- 2. Run: mypy utils/markdown_validator.py --strict
- 3. Verify no type errors
- 4. Run: pydocstyle utils/markdown_validator.py --convention=google
- 5. Verify all docstrings follow Google style
-
-
-
-
- Handle Markdown-Specific Edge Cases
-
- Address edge cases specific to Markdown processing: front matter (YAML/TOML),
- embedded code blocks, special characters, and non-standard Markdown extensions.
-
- 3
- backend
-
- - utils/markdown_validator.py
- - utils/llm_metadata.py (handle front matter)
-
-
- - Front matter handling:
- - Detect YAML/TOML front matter (--- or +++)
- - Extract metadata if present (title, author, date)
- - Pass to LLM or use directly if valid
- - Strip front matter before content processing
- - Code block handling:
- - Don't treat code blocks as actual content
- - Preserve them for chunking but don't analyze
- - Special characters:
- - Handle Unicode properly (Greek, Latin, French accents)
- - Preserve LaTeX equations in $ or $$
- - GitHub Flavored Markdown:
- - Support tables, task lists, strikethrough
- - Convert to standard format if needed
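
The front-matter split might be sketched like this (flat `key: value` parsing only; a real implementation would use a YAML/TOML parser, and the function and regex names are proposals):

```python
import re

# Matches a leading --- or +++ fenced front-matter block.
FRONT_MATTER_RE = re.compile(r"\A(?:---|\+\+\+)\n(.*?)\n(?:---|\+\+\+)\n", re.DOTALL)

def split_front_matter(md_content: str) -> tuple[dict[str, str], str]:
    """Split leading front matter from the Markdown body (minimal sketch)."""
    match = FRONT_MATTER_RE.match(md_content)
    if match is None:
        return {}, md_content
    meta: dict[str, str] = {}
    for line in match.group(1).splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip().strip('"')
    return meta, md_content[match.end():]

meta, body = split_front_matter('---\ntitle: "Menon"\nauthor: Platon\n---\n# Menon\n')
print(meta, body)
```

The extracted pairs (title, author, ...) can then be passed to the metadata step, and only the body is chunked.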
-
-
- 1. Upload Markdown with YAML front matter
- 2. Verify metadata extracted correctly
- 3. Upload Markdown with code blocks
- 4. Verify code not treated as philosophical content
- 5. Upload Markdown with Greek/Latin text
- 6. Verify Unicode handled correctly
-
-
-
-
- Update UI/UX for Markdown Upload
-
- Enhance the upload interface to clearly communicate Markdown support and provide
- visual feedback about the file type being processed. Show format-specific information
- (e.g., "No OCR cost for Markdown files").
-
- 3
- frontend
-
- - templates/upload.html
- - templates/upload_progress.html
-
-
- - upload.html:
- - Add file type indicator icon (📄 PDF vs 📝 MD)
- - Show format-specific help text on hover
- - Display estimated cost: "PDF: ~0.003€/page, Markdown: 0€"
- - Add example Markdown file download link
- - upload_progress.html:
- - Show different icon for Markdown processing
- - Adjust progress bar (9 steps vs 10 steps)
- - Display "No OCR cost" badge for Markdown
- - Update step descriptions based on file type
-
-
- 1. Open /upload page
- 2. Verify help text mentions both PDF and MD
- 3. Select a .md file
- 4. Verify file type indicator shows 📝
- 5. Submit upload
- 6. Verify progress shows "Chargement Markdown..."
- 7. Verify "No OCR cost" badge displays
-
-
-
-
-
-
- Setup and Configuration
-
- - Update ALLOWED_EXTENSIONS in flask_app.py
- - Modify allowed_file() validation function
- - Update upload.html file input accept attribute
- - Add Markdown MIME type handling
-
-
-
-
- Core Pipeline Extension
-
- - Add file extension detection in process_pdf_v2()
- - Implement Markdown file reading logic
- - Skip OCR for .md files
- - Add conditional progress callbacks
- - Update process_pdf_bytes() for Markdown
-
-
-
-
- Validation and Error Handling
-
- - Create markdown_validator.py module
- - Implement UTF-8 encoding validation
- - Add file size limits
- - Handle front matter extraction
- - Add comprehensive error messages
-
-
-
-
- Testing Infrastructure
-
- - Create test fixtures (sample.md)
- - Write validation tests
- - Write pipeline integration tests
- - Add edge case tests
- - Verify mypy strict compliance
-
-
-
-
- Documentation and Polish
-
- - Update README.md with Markdown support
- - Update .claude/CLAUDE.md developer docs
- - Add Google-style docstrings
- - Update UI templates with new messaging
- - Create usage examples
-
-
-
-
-
-
- - Markdown files upload successfully via Flask
- - OCR is skipped for .md files (cost_ocr = 0.0)
- - LLM processing works identically for PDF and MD
- - Chunks are created and vectorized correctly
- - Both file types can be searched in Weaviate
- - Existing PDF workflow remains unchanged
-
-
-
- - All code passes mypy --strict
- - All functions have type annotations
- - Google-style docstrings on all modules
- - No Any types without justification
- - TypedDict definitions for new data structures
-
-
-
- - Unit tests cover Markdown validation
- - Integration tests verify end-to-end processing
- - Edge cases handled (front matter, Unicode, large files)
- - Test coverage >80% for new code
- - All tests pass in CI/CD pipeline
-
-
-
- - Upload interface clearly shows both formats supported
- - Progress feedback accurate for both PDF and MD
- - Cost savings clearly communicated ("0€ for Markdown")
- - Error messages helpful and specific
- - Documentation clear with examples
-
-
-
- - Markdown processing faster than PDF (no OCR)
- - No regression in PDF processing speed
- - Memory usage reasonable for large MD files
- - Validation completes in <100ms
- - Overall pipeline <30s for typical Markdown document
-
-
-
-
-
- - PDF processing: OCR ~0.003€/page + LLM variable
- - Markdown processing: 0€ OCR + LLM variable
- - Estimated savings: 50-70% for documents with Markdown source
-
-
-
- - Maintains backward compatibility with existing PDFs
- - No breaking changes to API or database schema
- - Existing chunks and documents unaffected
- - Can process both formats in same session
-
-
-
- - Support for .txt plain text files
- - Support for .docx Word documents (via pandoc)
- - Support for .epub ebooks
- - Batch upload of multiple Markdown files
- - Markdown to PDF export for archival
-
-
-
diff --git a/prompts/app_spec_tavily_mcp.txt b/prompts/app_spec_tavily_mcp.txt
deleted file mode 100644
index 349f9a6..0000000
--- a/prompts/app_spec_tavily_mcp.txt
+++ /dev/null
@@ -1,498 +0,0 @@
-
- ikario - Tavily MCP Integration for Internet Access
-
-
- This specification adds Tavily search capabilities via MCP (Model Context Protocol) to give Ikario
- internet access for real-time web searches. Tavily provides high-quality search results optimized
- for AI agents, making it ideal for research, fact-checking, and accessing current information.
-
- This integration adds a new MCP server connection to the existing architecture (alongside the
- ikario-memory MCP server) and exposes Tavily search tools to Ikario during conversations.
-
- All changes are additive and backward-compatible. Existing functionality remains unchanged.
-
-
-
-
- Tavily MCP Server Connection:
- - Uses @modelcontextprotocol/sdk Client to connect to Tavily MCP server
- - Connection can be stdio-based (local MCP server) or HTTP-based (remote)
- - Tavily MCP server provides search tools that are exposed to Claude via Tool Use API
- - Backend routes handle tool execution and return results to Claude
-
-
-
- - Real-time internet access for Ikario
- - High-quality search results optimized for LLMs
- - Fact-checking and verification capabilities
- - Access to current events and news
- - Research assistance with cited sources
- - Seamless integration with existing memory tools
-
-
-
-
-
- Tavily MCP Server
- Model Context Protocol (MCP)
- stdio or HTTP transport
- @modelcontextprotocol/sdk
- Tavily API key (from https://tavily.com)
-
-
- Node.js with Express (existing)
- MCP Client for Tavily server connection
- Existing toolExecutor service extended with Tavily tools
-
-
- GET/POST /api/tavily/* for Tavily-specific operations
- Existing /api/claude/chat routes support Tavily tools automatically
-
-
-
-
-
- - Tavily API key obtained from https://tavily.com (free tier available)
- - API key stored in environment variable TAVILY_API_KEY or configuration file
- - MCP SDK already installed (@modelcontextprotocol/sdk exists for ikario-memory)
- - Tavily MCP server installed (npm package or Python package)
-
-
- - Add Tavily MCP server config to server/.claude_settings.json or similar
- - Configure connection parameters (stdio vs HTTP)
- - Set API key securely
-
-
-
-
-
- Tavily MCP Client Setup
-
- Create MCP client connection to Tavily search server. This is similar to the existing
- ikario-memory MCP client but connects to Tavily instead.
-
- Implementation:
- - Create server/services/tavilyMcpClient.js
- - Initialize MCP client with Tavily server connection
- - Handle connection lifecycle (connect, disconnect, reconnect)
- - Implement health checks and connection status
- - Export client instance and helper functions
-
- Configuration:
- - Read Tavily API key from environment or config file
- - Configure transport (stdio or HTTP)
- - Set connection timeout and retry logic
- - Log connection status for debugging
-
- Error Handling:
- - Graceful degradation if Tavily is unavailable
- - Connection retry with exponential backoff
- - Clear error messages for configuration issues
-
- Priority: 1
- Category: backend
-
- 1. Verify MCP client can connect to Tavily server on startup
- 2. Test connection health check endpoint returns correct status
- 3. Verify graceful handling when Tavily API key is missing
- 4. Test reconnection logic when connection drops
- 5. Verify connection status is logged correctly
- 6. Test that server starts even if Tavily is unavailable
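The reconnect logic could be sketched as below. `connectFn` stands in for whatever the @modelcontextprotocol/sdk client's connect call ends up being; the retry count and delays are assumptions:

```javascript
// Reconnect-with-exponential-backoff loop for the Tavily MCP client.
// The real module would wrap an @modelcontextprotocol/sdk Client.
async function connectWithBackoff(connectFn, { retries = 5, baseMs = 200 } = {}) {
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await connectFn();
    } catch (err) {
      if (attempt === retries) throw err; // give up, let caller degrade gracefully
      const delay = baseMs * 2 ** attempt; // exponential backoff
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

The server should catch the final rejection and mark Tavily as unavailable rather than crash.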
-
-
-
-
- Tavily Tool Configuration
-
- Configure Tavily search tools to be available to Claude during conversations.
- This integrates with the existing tool system (like memory tools).
-
- Implementation:
- - Create server/config/tavilyTools.js
- - Define tool schemas for Tavily search capabilities
- - Integrate with existing toolExecutor service
- - Add Tavily tools to system prompt alongside memory tools
-
- Tavily Tools to Expose:
- - tavily_search: General web search with AI-optimized results
- - Parameters: query (string), max_results (number), search_depth (basic/advanced)
- - Returns: Array of search results with title, url, content, score
-
- - tavily_search_news: News-specific search for current events
- - Parameters: query (string), max_results (number), days (number)
- - Returns: Recent news articles with metadata
-
- Tool Schema:
- - Follow Claude Tool Use API format
- - Clear descriptions for each tool
- - Well-defined input schemas with validation
- - Proper error handling in tool execution
-
- Priority: 1
- Category: backend
-
- 1. Verify Tavily tools are listed in available tools
- 2. Test tool schema validation with valid inputs
- 3. Test tool schema validation rejects invalid inputs
- 4. Verify tools appear in Claude's system prompt
- 5. Test that tool descriptions are clear and accurate
- 6. Verify tools can be called without errors
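The two tools above, expressed in the Claude Tool Use schema format; descriptions and defaults are placeholders to refine:

```javascript
// Tool definitions in the Claude Tool Use format (name, description,
// input_schema as JSON Schema). Parameter names follow the spec above.
const tavilyTools = [
  {
    name: "tavily_search",
    description: "General web search with AI-optimized results.",
    input_schema: {
      type: "object",
      properties: {
        query: { type: "string", description: "The search query" },
        max_results: { type: "number", description: "Max results (default 5)" },
        search_depth: { type: "string", enum: ["basic", "advanced"] },
      },
      required: ["query"],
    },
  },
  {
    name: "tavily_search_news",
    description: "News-specific search for current events.",
    input_schema: {
      type: "object",
      properties: {
        query: { type: "string" },
        max_results: { type: "number" },
        days: { type: "number", description: "Look-back window in days (default 7)" },
      },
      required: ["query"],
    },
  },
];
```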
-
-
-
-
- Tavily Tool Executor Integration
-
- Integrate Tavily tools into the existing toolExecutor service so Claude can
- use them during conversations.
-
- Implementation:
- - Extend server/services/toolExecutor.js to handle Tavily tools
- - Add tool detection for tavily_search and tavily_search_news
- - Implement tool execution logic using Tavily MCP client
- - Format Tavily results for Claude consumption
- - Handle errors and timeouts gracefully
-
- Tool Execution Flow:
- 1. Claude requests tool use (e.g., tavily_search)
- 2. toolExecutor detects Tavily tool request
- 3. Call Tavily MCP client with tool parameters
- 4. Receive and format search results
- 5. Return formatted results to Claude
- 6. Claude incorporates results into response
-
- Result Formatting:
- - Convert Tavily results to Claude-friendly format
- - Include source URLs for citation
- - Add relevance scores
- - Truncate content if too long
- - Handle empty results gracefully
-
- Priority: 1
- Category: backend
-
- 1. Test tavily_search tool execution with valid query
- 2. Verify results are properly formatted
- 3. Test tavily_search_news tool execution
- 4. Verify error handling when Tavily API fails
- 5. Test timeout handling for slow searches
- 6. Verify results include proper citations and URLs
- 7. Test with empty search results
- 8. Test with very long search queries
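The result formatting step might look like this; field names follow the spec above, and the truncation length is an assumption:

```javascript
// Formats raw Tavily results for Claude: numbered entries with the
// cited URL, relevance score, and truncated content.
function formatTavilyResults(results, { maxChars = 500 } = {}) {
  if (!results || results.length === 0) {
    return "No results found for this query.";
  }
  return results
    .map((r, i) => {
      const content =
        r.content.length > maxChars ? r.content.slice(0, maxChars) + "..." : r.content;
      return `[${i + 1}] ${r.title} (${r.url}, score: ${r.score})\n${content}`;
    })
    .join("\n\n");
}
```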
-
-
-
-
- System Prompt Enhancement for Internet Access
-
- Update the system prompt to inform Ikario about internet access capabilities.
- This should be added alongside existing memory tools instructions.
-
- Implementation:
- - Update MEMORY_SYSTEM_PROMPT in server/routes/messages.js and claude.js
- - Add Tavily tools documentation
- - Provide usage guidelines for when to search the internet
- - Include examples of good search queries
-
- Prompt Addition:
- "## Internet Access via Tavily
-
- Tu as accès à internet en temps réel via deux outils de recherche :
-
- 1. tavily_search : Recherche web générale optimisée pour l'IA
- - Utilise pour : rechercher des informations actuelles, vérifier des faits,
- trouver des sources fiables
- - Paramètres : query (ta question), max_results (nombre de résultats, défaut: 5),
- search_depth ('basic' ou 'advanced')
- - Retourne : Résultats avec titre, URL, contenu et score de pertinence
-
- 2. tavily_search_news : Recherche d'actualités récentes
- - Utilise pour : événements actuels, nouvelles, actualités
- - Paramètres : query, max_results, days (nombre de jours en arrière, défaut: 7)
-
- Quand utiliser la recherche internet :
- - Quand l'utilisateur demande des informations récentes ou actuelles
- - Pour vérifier des faits ou données que tu n'es pas sûr de connaître
- - Quand ta base de connaissances est trop ancienne (après janvier 2025)
- - Pour trouver des sources et citations spécifiques
- - Pour des requêtes nécessitant des données en temps réel
-
- N'utilise PAS la recherche pour :
- - Des questions sur ta propre identité ou capacités
- - Des concepts généraux que tu connais déjà bien
- - Des questions purement créatives ou d'opinion
-
- Utilise ces outils de façon autonome selon les besoins de la conversation.
- Cite toujours tes sources quand tu utilises des informations de Tavily."
-
- Priority: 2
- Category: backend
-
- 1. Verify system prompt includes Tavily instructions
- 2. Test that Claude understands when to use Tavily search
- 3. Verify Claude cites sources from Tavily results
- 4. Test that Claude uses appropriate search queries
- 5. Verify Claude chooses between tavily_search and tavily_search_news correctly
- 6. Test that Claude doesn't over-use search for simple questions
-
-
-
-
- Tavily Status API Endpoint
-
- Create API endpoint to check Tavily MCP connection status and search capabilities.
- Similar to /api/memory/status endpoint.
-
- Implementation:
- - Create GET /api/tavily/status endpoint
- - Return connection status, available tools, and configuration
- - Create GET /api/tavily/health endpoint for health checks
- - Add Tavily status to existing /api/memory/stats (rename to /api/tools/stats)
-
- Response Format:
- {
- "success": true,
- "data": {
- "connected": true,
- "message": "Tavily MCP server is connected",
- "tools": ["tavily_search", "tavily_search_news"],
- "apiKeyConfigured": true,
- "transport": "stdio"
- }
- }
-
- Priority: 2
- Category: backend
-
- 1. Test GET /api/tavily/status returns correct status
- 2. Verify status shows "connected" when Tavily is available
- 3. Verify status shows "disconnected" when Tavily is unavailable
- 4. Test health endpoint returns proper status code
- 5. Verify tools list is accurate
- 6. Test with missing API key shows proper error
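The status payload can be built from the client state like so; `client` here is a hypothetical wrapper around the MCP connection:

```javascript
// Builds the /api/tavily/status response body; the shape matches the
// response format above. An Express route would just res.json() this.
function buildTavilyStatus(client) {
  const connected = Boolean(client && client.isConnected);
  return {
    success: true,
    data: {
      connected,
      message: connected
        ? "Tavily MCP server is connected"
        : "Tavily MCP server is unavailable",
      tools: connected ? ["tavily_search", "tavily_search_news"] : [],
      apiKeyConfigured: Boolean(process.env.TAVILY_API_KEY),
      transport: "stdio",
    },
  };
}
```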
-
-
-
-
- Frontend UI Indicator for Internet Access
-
- Add visual indicator in the UI to show when Ikario has internet access via Tavily.
- This can be displayed alongside the existing memory status indicator.
-
- Implementation:
- - Add Tavily status indicator in header or sidebar
- - Show online/offline status for Tavily connection
- - Optional: Show when Tavily is being used during a conversation
- - Optional: Add tooltip explaining internet access capabilities
-
- Visual Design:
- - Globe or wifi icon to represent internet access
- - Green when connected, gray when disconnected
- - Subtle animation when search is in progress
- - Tooltip: "Internet access via Tavily" or similar
-
- Integration:
- - Use existing useMemory hook pattern or create useTavily hook
- - Poll /api/tavily/status periodically (every 60s)
- - Update status in real-time during searches
-
- Priority: 3
- Category: frontend
-
- 1. Verify internet access indicator appears in UI
- 2. Test status updates when Tavily connects/disconnects
- 3. Verify tooltip shows correct information
- 4. Test that indicator shows activity during searches
- 5. Verify status polling doesn't impact performance
- 6. Test with Tavily disabled shows offline status
-
-
-
-
- Manual Search UI (Optional Enhancement)
-
- Optional: Add manual search interface to allow users to trigger Tavily searches directly,
- similar to the memory search panel.
-
- Implementation:
- - Add "Internet Search" panel in sidebar (alongside Memory panel)
- - Search input for manual Tavily queries
- - Display search results with title, snippet, URL
- - Click to insert results into conversation
- - Filter by search type (general vs news)
-
- This is OPTIONAL and lower priority. The primary use case is autonomous search by Claude.
-
- Priority: 4
- Category: frontend
-
- 1. Verify search panel appears in sidebar
- 2. Test manual search returns results
- 3. Verify results display properly with links
- 4. Test inserting results into conversation
- 5. Test news search filter works correctly
- 6. Verify search history is saved (optional)
-
-
-
-
- Configuration and Settings
-
- Add Tavily configuration options to settings and environment.
-
- Implementation:
- - Add TAVILY_API_KEY to environment variables
- - Add Tavily settings to .claude_settings.json or similar config file
- - Create server/config/tavilyConfig.js for configuration management
- - Document configuration options in README
-
- Configuration Options:
- - API key
- - Max results per search (default: 5)
- - Search depth (basic/advanced)
- - Timeout duration
- - Enable/disable Tavily globally
- - Rate limiting settings
-
- Security:
- - API key should NOT be exposed to frontend
- - Use environment variable or secure config file
- - Validate API key on startup
- - Log warnings if API key is missing
-
- 2
- backend
-
- 1. Verify API key is read from environment variable
- 2. Test fallback to config file if env var not set
- 3. Verify API key validation on startup
- 4. Test configuration options are applied correctly
- 5. Verify API key is never exposed in API responses
- 6. Test enabling/disabling Tavily via config
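A minimal loader for these options; apart from TAVILY_API_KEY, the variable names and defaults are assumptions:

```javascript
// Reads Tavily settings from the environment. The returned object
// stays server-side: apiKey must never reach the frontend.
function loadTavilyConfig(env = process.env) {
  const apiKey = env.TAVILY_API_KEY || null;
  return {
    apiKey, // never expose in API responses
    enabled: env.TAVILY_ENABLED !== "false" && apiKey !== null,
    maxResults: Number(env.TAVILY_MAX_RESULTS || 5),
    searchDepth: env.TAVILY_SEARCH_DEPTH === "advanced" ? "advanced" : "basic",
    timeoutMs: Number(env.TAVILY_TIMEOUT_MS || 10000),
  };
}
```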
-
-
-
-
- Error Handling and Rate Limiting
-
- Implement robust error handling and rate limiting for Tavily API calls.
-
- Implementation:
- - Detect and handle Tavily API errors (rate limits, invalid API key, etc.)
- - Implement client-side rate limiting to avoid hitting Tavily limits
- - Cache search results for duplicate queries (optional)
- - Provide clear error messages to Claude when searches fail
-
- Error Types:
- - 401: Invalid API key
- - 429: Rate limit exceeded
- - 500: Tavily server error
- - Timeout: Search took too long
- - Network: Connection failed
-
- Rate Limiting:
- - Track searches per minute/hour
- - Queue requests if limit reached
- - Return cached results for duplicate queries within 5 minutes
- - Log rate limit warnings
-
- Priority: 2
- Category: backend
-
- 1. Test error handling for invalid API key
- 2. Verify rate limit detection and handling
- 3. Test timeout handling for slow searches
- 4. Verify error messages are clear to Claude
- 5. Test rate limiting prevents API abuse
- 6. Verify caching works for duplicate queries
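The cache and rate limiter can be combined in one small guard; the five-minute cache window follows the spec, the per-minute ceiling is an assumption:

```javascript
// In-memory guard: caches results per query for 5 minutes and caps
// the number of searches per rolling one-minute window.
function createSearchGuard({ maxPerMinute = 30, cacheTtlMs = 5 * 60 * 1000 } = {}) {
  const cache = new Map(); // query -> { at, results }
  let windowStart = Date.now();
  let count = 0;
  return {
    getCached(query, now = Date.now()) {
      const hit = cache.get(query);
      return hit && now - hit.at < cacheTtlMs ? hit.results : null;
    },
    store(query, results, now = Date.now()) {
      cache.set(query, { at: now, results });
    },
    allow(now = Date.now()) {
      if (now - windowStart >= 60000) {
        windowStart = now; // start a new window
        count = 0;
      }
      return ++count <= maxPerMinute;
    },
  };
}
```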
-
-
-
-
- Documentation and README Updates
-
- Update project documentation to explain Tavily integration.
-
- Implementation:
- - Update main README.md with Tavily setup instructions
- - Add TAVILY_SETUP.md with detailed configuration guide
- - Document API endpoints in README
- - Add examples of using Tavily with Ikario
- - Document troubleshooting steps
-
- Documentation Sections:
- - Prerequisites (Tavily API key)
- - Installation steps
- - Configuration options
- - Testing Tavily connection
- - Example conversations using internet search
- - Troubleshooting common issues
- - API reference for Tavily endpoints
-
- Priority: 3
- Category: documentation
-
- 1. Verify README has Tavily setup section
- 2. Test that setup instructions are clear and complete
- 3. Verify all configuration options are documented
- 4. Test examples work as described
- 5. Verify troubleshooting section covers common issues
-
-
-
-
-
-
- Recommended implementation order:
- 1. Feature 1 (MCP Client Setup) - Foundation
- 2. Feature 2 (Tool Configuration) - Core functionality
- 3. Feature 3 (Tool Executor Integration) - Core functionality
- 4. Feature 8 (Configuration) - Required for testing
- 5. Feature 4 (System Prompt) - Makes tools accessible to Claude
- 6. Feature 9 (Error Handling) - Production readiness
- 7. Feature 5 (Status API) - Monitoring
- 8. Feature 10 (Documentation) - User onboarding
- 9. Feature 6 (UI Indicator) - Nice to have
- 10. Feature 7 (Manual Search UI) - Optional enhancement
-
-
-
- After implementing features 1-5, you should be able to:
- - Ask Ikario: "Quelle est l'actualité aujourd'hui ?"
- - Ask Ikario: "Recherche des informations sur [topic actuel]"
- - Ask Ikario: "Vérifie cette information : [claim]"
-
- Ikario should autonomously use Tavily search and cite sources.
-
-
-
- - This specification is fully compatible with existing ikario-memory MCP integration
- - Ikario will have both memory tools AND internet search tools
- - Tools can be used together in the same conversation
- - No conflicts expected between tool systems
-
-
-
-
-
- - DO NOT expose Tavily API key to frontend or in API responses
- - DO NOT modify existing MCP memory integration
- - DO NOT break existing conversation functionality
- - Tavily should gracefully degrade if unavailable (don't crash the app)
- - Implement proper rate limiting to avoid API abuse
- - Validate all user inputs before passing to Tavily
- - Sanitize search results before displaying (XSS prevention)
- - Log all Tavily API calls for monitoring and debugging
-
-
-
-
- - Ikario can successfully perform internet searches when asked
- - Search results are relevant and well-formatted
- - Sources are properly cited
- - Tavily integration doesn't slow down conversations
- - Error handling is robust and user-friendly
- - Configuration is straightforward
- - Documentation is clear and complete
-
-
diff --git a/prompts/app_spec_types_docs.backup.txt b/prompts/app_spec_types_docs.backup.txt
deleted file mode 100644
index 0fe4fa6..0000000
--- a/prompts/app_spec_types_docs.backup.txt
+++ /dev/null
@@ -1,679 +0,0 @@
-
- Library RAG - Type Safety & Documentation Enhancement
-
-
- Enhance the Library RAG application (philosophical texts indexing and semantic search) by adding
- strict type annotations and comprehensive Google-style docstrings to all Python modules. This will
- improve code maintainability, enable static type checking with mypy, and provide clear documentation
- for all functions, classes, and modules.
-
- The application is a RAG pipeline that processes PDF documents through OCR, LLM-based extraction,
- semantic chunking, and ingestion into Weaviate vector database. It includes a Flask web interface
- for document upload, processing, and semantic search.
-
-
-
-
- Python 3.10+
- Flask 3.0
- Weaviate 1.34.4 with text2vec-transformers
- Mistral OCR API
- Ollama (local) or Mistral API
- mypy with strict configuration
-
-
- Docker Compose (Weaviate + transformers)
- weaviate-client, flask, mistralai, python-dotenv
-
-
-
-
-
- - flask_app.py: Main Flask application (640 lines)
- - schema.py: Weaviate schema definition (383 lines)
- - utils/: 16+ modules for PDF processing pipeline
- - pdf_pipeline.py: Main orchestration (879 lines)
- - mistral_client.py: OCR API client
- - ocr_processor.py: OCR processing
- - markdown_builder.py: Markdown generation
- - llm_metadata.py: Metadata extraction via LLM
- - llm_toc.py: Table of contents extraction
- - llm_classifier.py: Section classification
- - llm_chunker.py: Semantic chunking
- - llm_cleaner.py: Chunk cleaning
- - llm_validator.py: Document validation
- - weaviate_ingest.py: Database ingestion
- - hierarchy_parser.py: Document hierarchy parsing
- - image_extractor.py: Image extraction from PDFs
- - toc_extractor*.py: Various TOC extraction methods
- - templates/: Jinja2 templates for Flask UI
- - tests/utils2/: Minimal test coverage (3 test files)
-
-
-
- - Inconsistent type annotations across modules (some have partial types, many have none)
- - Missing or incomplete docstrings (no Google-style format)
- - No mypy configuration for strict type checking
- - Type hints missing on function parameters and return values
- - Dict[str, Any] used extensively without proper typing
- - No type stubs for complex nested structures
-
-
-
-
-
-
- - Add complete type annotations to ALL functions and methods
- - Use proper generic types (List, Dict, Optional, Union) from typing module
- - Add TypedDict for complex dictionary structures
- - Add Protocol types for duck-typed interfaces
- - Use Literal types for string constants
- - Add ParamSpec and TypeVar where appropriate
- - Type all class attributes and instance variables
- - Add type annotations to lambda functions where possible
-
-
-
- - Create mypy.ini with strict configuration
- - Enable: check_untyped_defs, disallow_untyped_defs, disallow_incomplete_defs
- - Enable: disallow_untyped_calls, disallow_untyped_decorators
- - Enable: warn_return_any, warn_redundant_casts
- - Enable: strict_equality, strict_optional
- - Set python_version to 3.10
- - Configure per-module overrides if needed for gradual migration
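The flags listed above translate directly into a mypy.ini along these lines:

```ini
[mypy]
python_version = 3.10
check_untyped_defs = True
disallow_untyped_defs = True
disallow_incomplete_defs = True
disallow_untyped_calls = True
disallow_untyped_decorators = True
warn_return_any = True
warn_redundant_casts = True
strict_equality = True
strict_optional = True
```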
-
-
-
- - Create TypedDict definitions for common data structures:
- - OCR response structures
- - Metadata dictionaries
- - TOC entries
- - Chunk objects
- - Weaviate objects
- - Pipeline results
- - Add NewType for semantic type safety (DocumentName, ChunkId, etc.)
- - Create Protocol types for callback functions
-
-
-
- - pdf_pipeline.py: Type all 10 pipeline steps, callbacks, result dictionaries
- - flask_app.py: Type all route handlers, request/response types
- - schema.py: Type Weaviate configuration objects
- - llm_*.py: Type LLM request/response structures
- - mistral_client.py: Type API client methods and responses
- - weaviate_ingest.py: Type ingestion functions and batch operations
-
-
-
-
-
- - Add comprehensive Google-style docstrings to ALL:
- - Module-level docstrings explaining purpose and usage
- - Class docstrings with Attributes section
- - Function/method docstrings with Args, Returns, Raises sections
- - Complex algorithm explanations with Examples section
- - Include code examples for public APIs
- - Document all exceptions that can be raised
- - Add Notes section for important implementation details
- - Add See Also section for related functions
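For reference, a function documented in the expected Google style; the signature itself is illustrative:

```python
def estimate_ocr_cost(page_count: int, price_per_page: float = 0.003) -> float:
    """Estimate the OCR cost of a document in euros.

    Args:
        page_count: Number of pages to send to the OCR API.
        price_per_page: Unit price in euros; defaults to the rate
            quoted elsewhere in this spec.

    Returns:
        The estimated cost, rounded to 6 decimal places.

    Raises:
        ValueError: If page_count is negative.

    Example:
        >>> estimate_ocr_cost(100)
        0.3
    """
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    return round(page_count * price_per_page, 6)
```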
-
-
-
-
- - pdf_pipeline.py: Document the 10-step pipeline, each step's purpose
- - mistral_client.py: Document OCR API usage, cost calculation
- - llm_metadata.py: Document metadata extraction logic
- - llm_toc.py: Document TOC extraction strategies
- - llm_classifier.py: Document section classification types
- - llm_chunker.py: Document semantic vs basic chunking
- - llm_cleaner.py: Document cleaning rules and validation
- - llm_validator.py: Document validation criteria
- - weaviate_ingest.py: Document ingestion process, nested objects
- - hierarchy_parser.py: Document hierarchy building algorithm
-
-
-
- - Document all routes with request/response examples
- - Document SSE (Server-Sent Events) implementation
- - Document Weaviate query patterns
- - Document upload processing workflow
- - Document background job management
-
-
-
- - Document Weaviate schema design decisions
- - Document each collection's purpose and relationships
- - Document nested object structure
- - Document vectorization strategy
-
-
-
-
- - Add inline comments for complex logic only (don't over-comment)
- - Explain WHY not WHAT (code should be self-documenting)
- - Document performance considerations
- - Document cost implications (OCR, LLM API calls)
- - Document error handling strategies
-
-
-
-
-
- - All modules must pass mypy --strict
- - No # type: ignore comments without justification
- - CI/CD should run mypy checks
- - Type coverage should be 100%
-
-
-
- - All public functions must have docstrings
- - All docstrings must follow Google style
- - Examples should be executable and tested
- - Documentation should be clear and concise
-
-
-
-
-
-
- Priority 1 (Most used, most complex):
- 1. utils/pdf_pipeline.py - Main orchestration
- 2. flask_app.py - Web application entry point
- 3. utils/weaviate_ingest.py - Database operations
- 4. schema.py - Schema definition
-
- Priority 2 (Core LLM modules):
- 5. utils/llm_metadata.py
- 6. utils/llm_toc.py
- 7. utils/llm_classifier.py
- 8. utils/llm_chunker.py
- 9. utils/llm_cleaner.py
- 10. utils/llm_validator.py
-
- Priority 3 (OCR and parsing):
- 11. utils/mistral_client.py
- 12. utils/ocr_processor.py
- 13. utils/markdown_builder.py
- 14. utils/hierarchy_parser.py
- 15. utils/image_extractor.py
-
- Priority 4 (Supporting modules):
- 16. utils/toc_extractor.py
- 17. utils/toc_extractor_markdown.py
- 18. utils/toc_extractor_visual.py
- 19. utils/llm_structurer.py (legacy)
-
-
-
-
-
- Setup Type Checking Infrastructure
-
- Configure mypy with strict settings and create foundational type definitions
-
-
- - Create mypy.ini configuration file with strict settings
- - Add mypy to requirements.txt or dev dependencies
- - Create utils/types.py module for common TypedDict definitions
- - Define core types: OCRResponse, Metadata, TOCEntry, ChunkData, PipelineResult
- - Add NewType definitions for semantic types: DocumentName, ChunkId, SectionPath
- - Create Protocol types for callbacks (ProgressCallback, etc.)
- - Document type definitions in utils/types.py module docstring
- - Test mypy configuration on a single module to verify settings
-
-
- - mypy.ini exists with strict configuration
- - utils/types.py contains all foundational types with docstrings
- - mypy runs without errors on utils/types.py
- - Type definitions are comprehensive and reusable
-
-
-
-
- Add Types to PDF Pipeline Orchestration
-
- Add complete type annotations to pdf_pipeline.py (879 lines, most complex module)
-
-
- - Add type annotations to all function signatures in pdf_pipeline.py
- - Type the 10-step pipeline: OCR, Markdown, Metadata, TOC, Classify, Chunk, Clean, Validate, Weaviate
- - Type progress_callback parameter with Protocol or Callable
- - Add TypedDict for pipeline options dictionary
- - Add TypedDict for pipeline result dictionary structure
- - Type all helper functions (extract_document_metadata_legacy, etc.)
- - Add proper return types for process_pdf_v2, process_pdf, process_pdf_bytes
- - Fix any mypy errors that arise
- - Verify mypy --strict passes on pdf_pipeline.py
-
-
- - All functions in pdf_pipeline.py have complete type annotations
- - progress_callback is properly typed with Protocol
- - All Dict[str, Any] replaced with TypedDict where appropriate
- - mypy --strict pdf_pipeline.py passes with zero errors
- - No # type: ignore comments (or justified if absolutely necessary)
-
-
-
-
- Add Types to Flask Application
-
- Add complete type annotations to flask_app.py and type all routes
-
-
- - Add type annotations to all Flask route handlers
- - Type request.args, request.form, request.files usage
- - Type jsonify() return values
- - Type get_weaviate_client context manager
- - Type get_collection_stats, get_all_chunks, search_chunks functions
- - Add TypedDict for Weaviate query results
- - Type background job processing functions (run_processing_job)
- - Type SSE generator function (upload_progress)
- - Add type hints for template rendering
- - Verify mypy --strict passes on flask_app.py
-
-
- - All Flask routes have complete type annotations
- - Request/response types are clear and documented
- - Weaviate query functions are properly typed
- - SSE generator is correctly typed
- - mypy --strict flask_app.py passes with zero errors
-
-
-
-
- Add Types to Core LLM Modules
-
- Add complete type annotations to all LLM processing modules (metadata, TOC, classifier, chunker, cleaner, validator)
-
-
- - llm_metadata.py: Type extract_metadata function, return structure
- - llm_toc.py: Type extract_toc function, TOC hierarchy structure
- - llm_classifier.py: Type classify_sections, section types (Literal), validation functions
- - llm_chunker.py: Type chunk_section_with_llm, chunk objects
- - llm_cleaner.py: Type clean_chunk, is_chunk_valid functions
- - llm_validator.py: Type validate_document, validation result structure
- - Add TypedDict for LLM request/response structures
- - Type provider selection ("ollama" | "mistral" as Literal)
- - Type model names with Literal or constants
- - Verify mypy --strict passes on all llm_*.py modules
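The Literal provider typing can also be enforced at the runtime boundary, for example:

```python
from typing import Literal, cast, get_args

LLMProvider = Literal["ollama", "mistral"]

def check_provider(name: str) -> LLMProvider:
    """Narrow an untrusted string to the LLMProvider Literal at runtime."""
    if name not in get_args(LLMProvider):
        raise ValueError(f"Unknown LLM provider: {name!r}")
    return cast(LLMProvider, name)
```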
-
-
- - All LLM modules have complete type annotations
- - Section types use Literal for type safety
- - Provider and model parameters are strongly typed
- - LLM request/response structures use TypedDict
- - mypy --strict passes on all llm_*.py modules with zero errors
-
-
-
-
- Add Types to Weaviate and Database Modules
-
- Add complete type annotations to schema.py and weaviate_ingest.py
-
-
- - schema.py: Type Weaviate configuration objects
- - schema.py: Type collection property definitions
- - weaviate_ingest.py: Type ingest_document function signature
- - weaviate_ingest.py: Type delete_document_chunks function
- - weaviate_ingest.py: Add TypedDict for Weaviate object structure
- - Type batch insertion operations
- - Type nested object references (work, document)
- - Add proper error types for Weaviate exceptions
- - Verify mypy --strict passes on both modules
-
-
- - schema.py has complete type annotations for Weaviate config
- - weaviate_ingest.py functions are fully typed
- - Nested object structures use TypedDict
- - Weaviate client operations are properly typed
- - mypy --strict passes on both modules with zero errors
-
-
-
-
- Add Types to OCR and Parsing Modules
-
- Add complete type annotations to mistral_client.py, ocr_processor.py, markdown_builder.py, hierarchy_parser.py
-
-
- - mistral_client.py: Type create_client, run_ocr, estimate_ocr_cost
- - mistral_client.py: Add TypedDict for Mistral API response structures
- - ocr_processor.py: Type serialize_ocr_response, OCR object structures
- - markdown_builder.py: Type build_markdown, image_writer parameter
- - hierarchy_parser.py: Type build_hierarchy, flatten_hierarchy functions
- - hierarchy_parser.py: Add TypedDict for hierarchy node structure
- - image_extractor.py: Type create_image_writer, image handling
- - Verify mypy --strict passes on all modules
-
-
- - All OCR/parsing modules have complete type annotations
- - Mistral API structures use TypedDict
- - Hierarchy nodes are properly typed
- - Image handling functions are typed
- - mypy --strict passes on all modules with zero errors
-
-
-
-
- Add Google-Style Docstrings to Core Modules
-
- Add comprehensive Google-style docstrings to pdf_pipeline.py, flask_app.py, and weaviate modules
-
-
- - pdf_pipeline.py: Add module docstring explaining the V2 pipeline
- - pdf_pipeline.py: Add docstrings to process_pdf_v2 with Args, Returns, Raises sections
- - pdf_pipeline.py: Document each of the 10 pipeline steps in comments
- - pdf_pipeline.py: Add Examples section showing typical usage
- - flask_app.py: Add module docstring explaining Flask application
- - flask_app.py: Document all routes with request/response examples
- - flask_app.py: Document Weaviate connection management
- - schema.py: Add module docstring explaining schema design
- - schema.py: Document each collection's purpose and relationships
- - weaviate_ingest.py: Document ingestion process with examples
- - All docstrings must follow Google style format exactly
-
-
- - All core modules have comprehensive module-level docstrings
- - All public functions have Google-style docstrings
- - Args, Returns, Raises sections are complete and accurate
- - Examples are provided for complex functions
- - Docstrings explain WHY, not just WHAT
-
-
-
-
- Add Google-Style Docstrings to LLM Modules
-
- Add comprehensive Google-style docstrings to all LLM processing modules
-
-
- - llm_metadata.py: Document metadata extraction logic with examples
- - llm_toc.py: Document TOC extraction strategies and fallbacks
- - llm_classifier.py: Document section types and classification criteria
- - llm_chunker.py: Document semantic vs basic chunking approaches
- - llm_cleaner.py: Document cleaning rules and validation logic
- - llm_validator.py: Document validation criteria and corrections
- - Add Examples sections showing input/output for each function
- - Document LLM provider differences (Ollama vs Mistral)
- - Document cost implications in Notes sections
- - All docstrings must follow Google style format exactly
-
-
- - All LLM modules have comprehensive docstrings
- - Each function has Args, Returns, Raises sections
- - Examples show realistic input/output
- - Provider differences are documented
- - Cost implications are noted where relevant
-
-
-
-
- Add Google-Style Docstrings to OCR and Parsing Modules
-
- Add comprehensive Google-style docstrings to OCR, markdown, hierarchy, and extraction modules
-
-
- - mistral_client.py: Document OCR API usage, cost calculation
- - ocr_processor.py: Document OCR response processing
- - markdown_builder.py: Document markdown generation strategy
- - hierarchy_parser.py: Document hierarchy building algorithm
- - image_extractor.py: Document image extraction process
- - toc_extractor*.py: Document various TOC extraction methods
- - Add Examples sections for complex algorithms
- - Document edge cases and error handling
- - All docstrings must follow Google style format exactly
-
-
- - All OCR/parsing modules have comprehensive docstrings
- - Complex algorithms are well explained
- - Edge cases are documented
- - Error handling is documented
- - Examples demonstrate typical usage
-
-
-
-
- Final Validation and CI Integration
-
- Verify all type annotations and docstrings, integrate mypy into CI/CD
-
-
- - Run mypy --strict on entire codebase, verify 100% pass rate
- - Verify all public functions have docstrings
- - Check docstring formatting with pydocstyle or similar tool
- - Create GitHub Actions workflow to run mypy on every commit
- - Update README.md with type checking instructions
- - Update CLAUDE.md with documentation standards
- - Create CONTRIBUTING.md with type annotation and docstring guidelines
- - Generate API documentation with Sphinx or pdoc
- - Fix any remaining mypy errors or missing docstrings
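The GitHub Actions step above can be sketched as a minimal workflow; the file path, Python version, and install step are assumptions, not taken from the project:

```yaml
# .github/workflows/typecheck.yml — run mypy on every push and pull request
name: typecheck
on: [push, pull_request]
jobs:
  mypy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      - run: pip install mypy
      - run: mypy --strict .
```

A real project would likely pin the mypy version and install the package's own dependencies so third-party stubs resolve.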
-
-
- - mypy --strict passes on entire codebase with zero errors
- - All public functions have Google-style docstrings
- - CI/CD runs mypy checks automatically
- - Documentation is generated and accessible
- - Contributing guidelines document type/docstring requirements
-
-
-
-
-
-
- - 100% type coverage across all modules
- - mypy --strict passes with zero errors
- - No # type: ignore comments without justification
- - All Dict[str, Any] replaced with TypedDict where appropriate
- - Proper use of generics, protocols, and type variables
- - NewType used for semantic type safety
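The checklist above can be illustrated with a short sketch; the names (`DocumentId`, `ChunkRecord`, `SupportsEmbed`) are hypothetical, not taken from the project:

```python
from typing import NewType, Protocol, TypedDict

# NewType gives semantic safety: a DocumentId is not interchangeable with any str
DocumentId = NewType("DocumentId", str)

# TypedDict replaces Dict[str, Any] with an explicit, checkable shape
class ChunkRecord(TypedDict):
    chunk_id: str
    text: str
    page: int

# Protocol types a dependency structurally, without requiring inheritance
class SupportsEmbed(Protocol):
    def embed(self, text: str) -> list[float]: ...

def make_record(doc: DocumentId, text: str, page: int) -> ChunkRecord:
    return {"chunk_id": f"{doc}-{page}", "text": text, "page": page}

record = make_record(DocumentId("menon"), "Virtue is knowledge.", 3)
```

Under `mypy --strict`, passing a plain `str` where a `DocumentId` is expected, or omitting a `ChunkRecord` key, is flagged at check time.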
-
-
-
- - All modules have comprehensive module-level docstrings
- - All public functions/classes have Google-style docstrings
- - All docstrings include Args, Returns, Raises sections
- - Complex functions include Examples sections
- - Cost implications documented in Notes sections
- - Error handling clearly documented
- - Provider differences (Ollama vs Mistral) documented
-
-
-
- - Code is self-documenting with clear variable names
- - Inline comments explain WHY, not WHAT
- - Complex algorithms are well explained
- - Performance considerations documented
- - Security considerations documented
-
-
-
- - IDE autocomplete works perfectly with type hints
- - Type errors caught at development time, not runtime
- - Documentation is easily accessible in IDE
- - API examples are executable and tested
- - Contributing guidelines are clear and comprehensive
-
-
-
- - Refactoring is safer with type checking
- - Function signatures are self-documenting
- - API contracts are explicit and enforced
- - Breaking changes are caught by type checker
- - New developers can understand code quickly
-
-
-
-
-
- - Must maintain backward compatibility with existing code
- - Cannot break existing Flask routes or API contracts
- - Weaviate schema must remain unchanged
- - Existing tests must continue to pass
-
-
-
- - Can use per-module mypy configuration for gradual migration
- - Can temporarily disable strict checks on legacy modules
- - Priority modules must be completed first
- - Low-priority modules can be deferred
-
-
-
- - Prefer Python 3.10+ syntax (str | None, list[str]) where the target runtime supports it
- - Docstrings must follow Google style exactly (not NumPy or reStructuredText)
- - Fall back to the typing module (List, Dict, Optional) until Python 3.9 support is dropped
- - Use from __future__ import annotations for forward references or to write new syntax on older runtimes
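With the future import, annotations are stored as unevaluated strings (PEP 563), so 3.10-style union syntax parses even on interpreters that cannot evaluate it at runtime. A minimal illustration:

```python
from __future__ import annotations  # annotations kept as strings, never evaluated

def first_or_none(items: list[str]) -> str | None:
    """Return the first item, or None for an empty list."""
    return items[0] if items else None
```

Because evaluation is deferred, `first_or_none.__annotations__["return"]` is the literal string `"str | None"` rather than a type object.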
-
-
-
-
-
- - Run mypy --strict on each module after adding types
- - Use mypy daemon (dmypy) for faster incremental checking
- - Add mypy to pre-commit hooks
- - CI/CD must run mypy and fail on type errors
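A minimal configuration supporting this workflow (strict by default, per-module relaxation for gradual migration) might look like the following; the file name and module path are assumptions:

```ini
# mypy.ini — strict globally, with temporary escape hatches for legacy modules
[mypy]
strict = True
python_version = 3.10

# Remove these overrides as each legacy module is migrated
[mypy-utils.legacy_module]
disallow_untyped_defs = False
```

For pre-commit integration, the official `pre-commit/mirrors-mypy` hook is the usual choice; pin its revision so CI and local checks agree.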
-
-
-
- - Use pydocstyle to validate Google-style format
- - Use sphinx-build to generate docs and catch errors
- - Manual review of docstring examples
- - Verify examples are executable and correct
-
-
-
- - Verify existing tests still pass after type additions
- - Add new tests for complex typed structures
- - Test mypy configuration on sample code
- - Verify IDE autocomplete works correctly
-
-
-
-
-
- ```python
- """
- PDF Pipeline V2 - Intelligent document processing with LLM enhancement.
-
- This module orchestrates a 10-step pipeline for processing PDF documents:
- 1. OCR via Mistral API
- 2. Markdown construction with images
- 3. Metadata extraction via LLM
- 4. Table of contents (TOC) extraction
- 5. Section classification
- 6. Semantic chunking
- 7. Chunk cleaning and validation
- 8. Enrichment with concepts
- 9. Validation and corrections
- 10. Ingestion into Weaviate vector database
-
- The pipeline supports multiple LLM providers (Ollama local, Mistral API) and
- various processing modes (skip OCR, semantic chunking, OCR annotations).
-
- Typical usage:
- >>> from pathlib import Path
- >>> from utils.pdf_pipeline import process_pdf
- >>>
- >>> result = process_pdf(
- ... Path("document.pdf"),
- ... use_llm=True,
- ... llm_provider="ollama",
- ... ingest_to_weaviate=True,
- ... )
- >>> print(f"Processed {result['pages']} pages, {result['chunks_count']} chunks")
-
- See Also:
- mistral_client: OCR API client
- llm_metadata: Metadata extraction
- weaviate_ingest: Database ingestion
- """
- ```
-
-
-
- ```python
- def process_pdf_v2(
- pdf_path: Path,
- output_dir: Path = Path("output"),
- *,
- use_llm: bool = True,
- llm_provider: Literal["ollama", "mistral"] = "ollama",
- llm_model: Optional[str] = None,
- skip_ocr: bool = False,
- ingest_to_weaviate: bool = True,
- progress_callback: Optional[ProgressCallback] = None,
- ) -> PipelineResult:
- """
- Process a PDF through the complete V2 pipeline with LLM enhancement.
-
- This function orchestrates all 10 steps of the intelligent document processing
- pipeline, from OCR to Weaviate ingestion. It supports both local (Ollama) and
- cloud (Mistral API) LLM providers, with optional caching via skip_ocr.
-
- Args:
- pdf_path: Absolute path to the PDF file to process.
- output_dir: Base directory for output files. Defaults to "./output".
- use_llm: Enable LLM-based processing (metadata, TOC, chunking).
- If False, uses basic heuristic processing.
- llm_provider: LLM provider to use. "ollama" for local (free but slow),
- "mistral" for API (fast but paid).
- llm_model: Specific model name. If None, auto-detects based on provider
- (qwen2.5:7b for ollama, mistral-small-latest for mistral).
- skip_ocr: If True, reuses existing markdown file to avoid OCR cost.
- Requires output_dir/<name>/<name>.md to exist.
- ingest_to_weaviate: If True, ingests chunks into Weaviate after processing.
- progress_callback: Optional callback for real-time progress updates.
- Called with (step_id, status, detail) for each pipeline step.
-
- Returns:
- Dictionary containing processing results with the following keys:
- - success (bool): True if processing completed without errors
- - document_name (str): Name of the processed document
- - pages (int): Number of pages in the PDF
- - chunks_count (int): Number of chunks generated
- - cost_ocr (float): OCR cost in euros (0 if skip_ocr=True)
- - cost_llm (float): LLM API cost in euros (0 if provider=ollama)
- - cost_total (float): Total cost (ocr + llm)
- - metadata (dict): Extracted metadata (title, author, etc.)
- - toc (list): Hierarchical table of contents
- - files (dict): Paths to generated files (markdown, chunks, etc.)
-
- Raises:
- FileNotFoundError: If pdf_path does not exist.
- ValueError: If skip_ocr=True but markdown file not found.
- RuntimeError: If Weaviate connection fails during ingestion.
-
- Examples:
- Basic usage with Ollama (free):
- >>> result = process_pdf_v2(
- ... Path("platon_menon.pdf"),
- ... llm_provider="ollama"
- ... )
- >>> print(f"Cost: {result['cost_total']:.4f}€")  # OCR cost only; Ollama is free
- Cost: 0.0270€
-
- With Mistral API (faster):
- >>> result = process_pdf_v2(
- ... Path("platon_menon.pdf"),
- ... llm_provider="mistral",
- ... llm_model="mistral-small-latest"
- ... )
-
- Skip OCR to avoid cost:
- >>> result = process_pdf_v2(
- ... Path("platon_menon.pdf"),
- ... skip_ocr=True, # Reuses existing markdown
- ... ingest_to_weaviate=False
- ... )
-
- Notes:
- - OCR cost: ~0.003€/page (standard), ~0.009€/page (with annotations)
- - LLM cost: Free with Ollama, variable with Mistral API
- - Processing time: ~30s/page with Ollama, ~5s/page with Mistral
- - Weaviate must be running (docker-compose up -d) before ingestion
- """
- ```
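The `ProgressCallback` and `PipelineResult` names used in the signature above need definitions somewhere (the spec points at `utils/types.py`); a plausible sketch, with the callback shape and result keys taken from the docstring and everything else assumed:

```python
from typing import Callable, TypedDict

# Callback signature (step_id, status, detail), as described in the Args section
ProgressCallback = Callable[[str, str, str], None]

class PipelineResult(TypedDict):
    """Keys documented in process_pdf_v2's Returns section."""
    success: bool
    document_name: str
    pages: int
    chunks_count: int
    cost_ocr: float
    cost_llm: float
    cost_total: float
    metadata: dict
    toc: list
    files: dict

# Example value matching the documented shape (numbers are illustrative)
result: PipelineResult = {
    "success": True, "document_name": "platon_menon", "pages": 9,
    "chunks_count": 42, "cost_ocr": 0.027, "cost_llm": 0.0,
    "cost_total": 0.027, "metadata": {}, "toc": [], "files": {},
}
```

Typing the return as a `TypedDict` rather than `Dict[str, Any]` lets mypy catch a misspelled result key at every call site.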
-
-
-
diff --git a/prompts/coding_prompt_library.md b/prompts/coding_prompt_library.md
deleted file mode 100644
index 0f628a3..0000000
--- a/prompts/coding_prompt_library.md
+++ /dev/null
@@ -1,290 +0,0 @@
-## YOUR ROLE - CODING AGENT (Library RAG - Type Safety & Documentation)
-
-You are working on adding strict type annotations and Google-style docstrings to a Python library project.
-This is a FRESH context window - you have no memory of previous sessions.
-
-You have access to Linear for project management via MCP tools. Linear is your single source of truth.
-
-### STEP 1: GET YOUR BEARINGS (MANDATORY)
-
-Start by orienting yourself:
-
-```bash
-# 1. See your working directory
-pwd
-
-# 2. List files to understand project structure
-ls -la
-
-# 3. Read the project specification
-cat app_spec.txt
-
-# 4. Read the Linear project state
-cat .linear_project.json
-
-# 5. Check recent git history
-git log --oneline -20
-```
-
-### STEP 2: CHECK LINEAR STATUS
-
-Query Linear to understand current project state using the project_id from `.linear_project.json`.
-
-1. **Get all issues and count progress:**
- ```
- mcp__linear__list_issues with project_id
- ```
- Count:
- - Issues "Done" = completed
- - Issues "Todo" = remaining
- - Issues "In Progress" = currently being worked on
-
-2. **Find META issue** (if exists) for session context
-
-3. **Check for in-progress work** - complete it first if found
-
-### STEP 3: SELECT NEXT ISSUE
-
-Get Todo issues sorted by priority:
-```
-mcp__linear__list_issues with project_id, status="Todo", limit=5
-```
-
-Select ONE highest-priority issue to work on.
-
-### STEP 4: CLAIM THE ISSUE
-
-Use `mcp__linear__update_issue` to set status to "In Progress"
-
-### STEP 5: IMPLEMENT THE ISSUE
-
-Based on issue category:
-
-**For Type Annotation Issues (e.g., "Types - Add type annotations to X.py"):**
-
-1. Read the target Python file
-2. Identify all functions, methods, and variables
-3. Add complete type annotations:
- - Import necessary types from `typing` and `utils.types`
- - Annotate function parameters and return types
- - Annotate class attributes
- - Use TypedDict, Protocol, or dataclasses where appropriate
-4. Save the file
-5. Run mypy to verify (MANDATORY):
- ```bash
- cd generations/library_rag
- mypy --config-file=mypy.ini
- ```
-6. Fix any mypy errors
-7. Commit the changes
-
-**For Documentation Issues (e.g., "Docs - Add docstrings to X.py"):**
-
-1. Read the target Python file
-2. Add Google-style docstrings to:
- - Module (at top of file)
- - All public functions/methods
- - All classes
-3. Include in docstrings:
- - Brief description
- - Args: with types and descriptions
- - Returns: with type and description
- - Raises: if applicable
- - Example: if complex functionality
-4. Save the file
-5. Optionally run pydocstyle to verify (if installed)
-6. Commit the changes
-
-**For Setup/Infrastructure Issues:**
-
-Follow the specific instructions in the issue description.
-
-### STEP 6: VERIFICATION
-
-**Type Annotation Issues:**
-- Run mypy on the modified file(s)
-- Ensure zero type errors
-- If errors exist, fix them before proceeding
-
-**Documentation Issues:**
-- Review docstrings for completeness
-- Ensure Args/Returns sections match function signatures
-- Check that examples are accurate
-
-**Functional Changes (rare):**
-- If the issue changes behavior, test manually
-- Start Flask server if needed: `python flask_app.py`
-- Test the affected functionality
-
-### STEP 7: GIT COMMIT
-
-Make a descriptive commit:
-```bash
-git add <files>
-git commit -m "<type>: <short description>
-
-- <key change>
-- Verified with mypy (for type issues)
-- Linear issue: <issue ID>
-"
-```
-
-### STEP 8: UPDATE LINEAR ISSUE
-
-1. **Add implementation comment:**
- ```markdown
- ## Implementation Complete
-
- ### Changes Made
- - [List of files modified]
- - [Key changes]
-
- ### Verification
- - mypy passes with zero errors (for type issues)
- - All test steps from issue description verified
-
- ### Git Commit
- [commit hash and message]
- ```
-
-2. **Update status to "Done"** using `mcp__linear__update_issue`
-
-### STEP 9: DECIDE NEXT ACTION
-
-After completing an issue, ask yourself:
-
-1. Have I been working for a while? (Use judgment based on complexity of work done)
-2. Is the code in a stable state?
-3. Would this be a good handoff point?
-
-**If YES to all three:**
-- Proceed to STEP 10 (Session Summary)
-- End cleanly
-
-**If NO:**
-- Continue to another issue (go back to STEP 3)
-- But commit first!
-
-**Pacing Guidelines:**
-- Early phase (< 20% done): Can complete multiple simple issues
-- Mid/late phase (> 20% done): 1-2 issues per session for quality
-
-### STEP 10: SESSION SUMMARY (When Ending)
-
-If META issue exists, add a comment:
-
-```markdown
-## Session Complete
-
-### Completed This Session
-- [Issue ID]: [Title] - [Brief summary]
-
-### Current Progress
-- X issues Done
-- Y issues In Progress
-- Z issues Todo
-
-### Notes for Next Session
-- [Important context]
-- [Recommendations]
-- [Any concerns]
-```
-
-Ensure:
-- All code committed
-- No uncommitted changes
-- App in working state
-
----
-
-## LINEAR WORKFLOW RULES
-
-**Status Transitions:**
-- Todo → In Progress (when starting)
-- In Progress → Done (when verified)
-
-**NEVER:**
-- Delete or modify issue descriptions
-- Mark Done without verification
-- Leave issues In Progress when switching
-
----
-
-## TYPE ANNOTATION GUIDELINES
-
-**Imports needed:**
-```python
-from typing import Optional, Dict, List, Any, Tuple, Callable
-from pathlib import Path
-from utils.types import Metadata, TOCEntry, ChunkData, PipelineResult  # whichever project types the module needs
-```
-
-**Common patterns:**
-```python
-# Functions
-def process_data(data: str, options: Optional[Dict[str, Any]] = None) -> List[str]:
- """Process input data."""
- ...
-
-# Methods with self
-def save(self, path: Path) -> None:
- """Save to file."""
- ...
-
-# Async functions
-async def fetch_data(url: str) -> Dict[str, Any]:
- """Fetch from API."""
- ...
-```
-
-**Use project types from `utils/types.py`:**
-- Metadata, OCRResponse, TOCEntry, ChunkData, PipelineResult, etc.
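As an illustration, one of these project types might be defined like this; the actual fields in `utils/types.py` may differ:

```python
from typing import TypedDict

class TOCEntry(TypedDict):
    """One table-of-contents node; 'children' makes the structure recursive."""
    title: str
    level: int
    page: int
    children: list["TOCEntry"]  # forward reference allows self-nesting

entry: TOCEntry = {
    "title": "Introduction", "level": 1, "page": 1,
    "children": [{"title": "Context", "level": 2, "page": 2, "children": []}],
}
```

Recursive `TypedDict`s like this require a reasonably recent mypy (0.990+), which is worth pinning in the project's dev dependencies.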
-
----
-
-## DOCSTRING TEMPLATE (Google Style)
-
-```python
-def function_name(param1: str, param2: int = 0) -> List[str]:
- """
- Brief one-line description.
-
- More detailed description if needed. Explain what the function does,
- any important behavior, side effects, etc.
-
- Args:
- param1: Description of param1.
- param2: Description of param2. Defaults to 0.
-
- Returns:
- Description of return value.
-
- Raises:
- ValueError: When param1 is empty.
- IOError: When file cannot be read.
-
- Example:
- >>> result = function_name("test", 5)
- >>> print(result)
- ['test', 'test', 'test', 'test', 'test']
- """
-```
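Because the Example section uses doctest format, it can be verified mechanically rather than by eye. A sketch of that check, using a hypothetical function in the same shape as the template:

```python
import doctest

def repeat_word(word: str, times: int = 0) -> list[str]:
    """
    Return the word repeated in a list.

    Example:
        >>> repeat_word("test", 3)
        ['test', 'test', 'test']
    """
    return [word] * times

# Parse the docstring's Example section and execute it against the function
parser = doctest.DocTestParser()
test = parser.get_doctest(
    repeat_word.__doc__, {"repeat_word": repeat_word}, "repeat_word", None, 0
)
outcome = doctest.DocTestRunner().run(test)
```

In practice `python -m doctest module.py` does the same thing file-wide, and catches docstring examples that have drifted from the code.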
-
----
-
-## IMPORTANT REMINDERS
-
-**Your Goal:** Add strict type annotations and comprehensive documentation to all Python modules
-
-**This Session's Goal:** Complete 1-2 issues with quality work and clean handoff
-
-**Quality Bar:**
-- mypy --strict passes with zero errors
-- All public functions have complete Google-style docstrings
-- Code is clean and well-documented
-
-**Context is finite.** End sessions early with good handoff notes. The next agent will continue.
-
----
-
-Begin by running STEP 1 (Get Your Bearings).