From 2e33637daecd10f686f10abdf9495c855a11da7e Mon Sep 17 00:00:00 2001
From: David Blanc Brioir
Date: Thu, 25 Dec 2025 12:53:14 +0100
Subject: [PATCH] Update framework configuration and clean up obsolete specs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Configuration updates:
- Added .env.example template for environment variables
- Updated README.md with better setup instructions (.env usage)
- Enhanced .claude/settings.local.json with additional Bash permissions
- Added .claude/CLAUDE.md framework documentation

Spec cleanup:
- Removed obsolete spec files (language_selection, mistral_extensible, template, theme_customization)
- Consolidated app_spec.txt (Claude Clone example)
- Added app_spec_model.txt as reference template
- Added app_spec_library_rag_types_docs.txt
- Added coding_prompt_library.md

Framework improvements:
- Updated agent.py, autonomous_agent_demo.py, client.py with minor fixes
- Enhanced dockerize_my_project.py
- Updated prompts (initializer, initializer_bis) with better guidance
- Added docker-compose.my_project.yml example

This commit consolidates improvements made during development sessions.
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5
---
 .claude/CLAUDE.md                             |  234 ++++
 .claude/settings.local.json                   |   36 +-
 .env.example                                  |   11 +
 GUIDE_NEW_APP.md                              |    3 +
 README.md                                     |   42 +-
 agent.py                                      |   39 +-
 autonomous_agent_demo.py                      |   22 +-
 client.py                                     |   14 +-
 docker-compose.my_project.yml                 |   29 +
 linear_config.py                              |    8 +-
 package-lock.json                             |    6 +
 prompts.py                                    |    5 +
 prompts/app_spec.txt                          | 1171 ++++++++---------
 ...app_spec_language_selection.completion.txt |  179 ---
 prompts/app_spec_language_selection.txt       |  525 --------
 prompts/app_spec_library_rag_types_docs.txt   |  679 ++++++++++
 prompts/app_spec_mistral_extensible.txt       |  448 -------
 prompts/app_spec_model.txt                    |  681 ++++++++++
 prompts/app_spec_template.txt                 |  134 --
 prompts/app_spec_theme_customization.txt      |  403 ------
 prompts/app_spec_types_docs.backup.txt        |  679 ++++++++++
 prompts/coding_prompt_library.md              |  290 ++++
 prompts/initializer_bis_prompt.md             |    7 +
 prompts/initializer_prompt.md                 |   13 +-
 prompts/spec_embed_BAAI.txt                   |  576 ++++++++
 requirements.txt                              |    1 +
 security.py                                   |    5 +
 27 files changed, 3862 insertions(+), 2378 deletions(-)
 create mode 100644 .claude/CLAUDE.md
 create mode 100644 .env.example
 create mode 100644 docker-compose.my_project.yml
 create mode 100644 package-lock.json
 delete mode 100644 prompts/app_spec_language_selection.completion.txt
 delete mode 100644 prompts/app_spec_language_selection.txt
 create mode 100644 prompts/app_spec_library_rag_types_docs.txt
 delete mode 100644 prompts/app_spec_mistral_extensible.txt
 create mode 100644 prompts/app_spec_model.txt
 delete mode 100644 prompts/app_spec_template.txt
 delete mode 100644 prompts/app_spec_theme_customization.txt
 create mode 100644 prompts/app_spec_types_docs.backup.txt
 create mode 100644 prompts/coding_prompt_library.md
 create mode 100644 prompts/spec_embed_BAAI.txt

diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md
new file mode 100644
index 0000000..ce017a2
--- /dev/null
+++ b/.claude/CLAUDE.md
@@ -0,0 +1,234 @@
+# 
CLAUDE.md + +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. + +## Overview + +This is an autonomous coding agent framework that uses Claude Agent SDK with Linear integration for project management. The framework enables long-running autonomous development sessions where agents create complete applications from XML specifications. + +**Key Architecture**: Two-agent pattern (Initializer + Coding Agent) with Linear as the single source of truth for project state and progress tracking. + +## Common Commands + +### Running the Agent + +```bash +# Fresh project initialization +python autonomous_agent_demo.py --project-dir ./my_project + +# Continue existing project +python autonomous_agent_demo.py --project-dir ./my_project + +# Add new features to existing project (Initializer Bis) +python autonomous_agent_demo.py --project-dir ./my_project --new-spec app_spec_theme_customization.txt + +# Limit iterations for testing +python autonomous_agent_demo.py --project-dir ./my_project --max-iterations 3 +``` + +### Testing + +```bash +# Run security hook tests +python test_security.py + +# Test mypy type checking (for library projects) +mypy path/to/module.py +``` + +### Environment Setup + +```bash +# Generate Claude Code OAuth token +claude setup-token + +# Install dependencies +pip install -r requirements.txt +``` + +## High-Level Architecture + +### Core Agent Flow + +1. **First Run (Initializer Agent)**: + - Reads `prompts/app_spec.txt` specification + - Creates Linear project and ~50 issues (one per `` tag) + - Creates META issue for session tracking + - Initializes project structure with `init.sh` + - Writes `.linear_project.json` marker file + +2. 
**Subsequent Runs (Coding Agent)**: + - Queries Linear for highest-priority Todo issue + - Updates issue status to "In Progress" + - Implements feature using SDK tools + - Tests implementation (Puppeteer for web apps, pytest/mypy for libraries) + - Adds comment to Linear issue with implementation notes + - Marks issue as "Done" + - Updates META issue with session summary + +3. **Initializer Bis (Add Features)**: + - Triggered by `--new-spec` flag on existing projects + - Reads new spec file from `prompts/` + - Creates additional Linear issues for new features + - Updates existing project without re-initializing + +### Key Design Patterns + +**Session Handoff via Linear**: Agents don't use local state files for coordination. All session context, implementation notes, and progress tracking happens through Linear issues and comments. This provides: +- Real-time visibility in Linear workspace +- Persistent history across sessions +- Easy debugging via issue comments + +**Defense-in-Depth Security** (see `security.py` and `client.py`): +1. OS-level sandbox for bash command isolation +2. Filesystem restrictions (operations limited to project directory) +3. Bash command allowlist with pre-tool-use hooks +4. 
Explicit MCP tool permissions + +**Project Type Detection** (`agent.py:is_library_project`): +- Detects library/type-safety projects vs full-stack web apps +- Uses different coding prompts (`coding_prompt_library.md` vs `coding_prompt.md`) +- Keywords: "type safety", "docstrings", "mypy", "library rag" + +### Module Responsibilities + +- **`autonomous_agent_demo.py`**: Entry point, argument parsing, environment validation +- **`agent.py`**: Core agent loop, session orchestration, project type detection +- **`client.py`**: Claude SDK client configuration, MCP server setup (Linear + Puppeteer) +- **`security.py`**: Bash command validation with allowlist, pre-tool-use hooks +- **`prompts.py`**: Prompt loading utilities, spec file copying +- **`progress.py`**: Progress tracking via `.linear_project.json` marker +- **`linear_config.py`**: Linear API configuration constants + +### MCP Servers + +**Linear** (HTTP transport at `mcp.linear.app/mcp`): +- Project/team management +- Issue CRUD operations +- Comments and status updates +- Requires `LINEAR_API_KEY` in `.env` + +**Puppeteer** (stdio transport): +- Browser automation for UI testing +- Navigate, screenshot, click, fill, evaluate +- Used by web app projects, not library projects + +## Application Specification Format + +Specifications use XML format in `prompts/app_spec.txt`: + +```xml + + Your App Name + Detailed description... + + + ... + ... + + + + + Feature title + Detailed description + 1-4 (1=urgent, 4=low) + frontend|backend|auth|etc + + 1. Step one + 2. Step two + + + + + +``` + +**Important**: Each `` tag becomes a separate Linear issue. The initializer creates exactly one issue per feature tag. 
+ +## Environment Configuration + +All configuration via `.env` file (copy from `.env.example`): + +```bash +CLAUDE_CODE_OAUTH_TOKEN='your-oauth-token' # From: claude setup-token +LINEAR_API_KEY='lin_api_xxxxx' # From: linear.app/settings/api +LINEAR_TEAM_ID='team-id' # Optional, agent prompts if missing +``` + +## Security Model + +### Allowed Commands (`security.py:ALLOWED_COMMANDS`) + +File operations: `ls`, `cat`, `head`, `tail`, `wc`, `grep`, `cp`, `mkdir`, `chmod` +Development: `npm`, `node`, `python`, `python3`, `mypy`, `pytest` +Version control: `git` +Process management: `ps`, `lsof`, `sleep`, `pkill` +Scripts: `init.sh` + +### Additional Validation + +- **`pkill`**: Only allowed for dev processes (node, npm, vite, next) +- **`chmod`**: Only `+x` mode permitted (making scripts executable) +- **`init.sh`**: Must be `./init.sh` or end with `/init.sh` + +### Adding New Commands + +Edit `security.py:ALLOWED_COMMANDS` and optionally add validation logic to `bash_security_hook`. + +## Generated Project Structure + +After initialization, projects contain: + +``` +my_project/ +├── .linear_project.json # Linear state marker (project_id, total_issues, meta_issue_id) +├── .claude_settings.json # Security settings (auto-generated) +├── app_spec.txt # Original specification (copied from prompts/) +├── init.sh # Environment setup script (executable) +└── [generated code] # Application files created by agent +``` + +## Creating New Applications + +1. Create `prompts/app_spec.txt` with your XML specification +2. Use existing spec files as templates (see `prompts/app_spec.txt` for Claude Clone example) +3. Run: `python autonomous_agent_demo.py --project-dir ./new_app` +4. Monitor progress in Linear workspace + +See `GUIDE_NEW_APP.md` for detailed guide (French). 
+ +## Prompt Templates + +Located in `prompts/`: + +- **`initializer_prompt.md`**: First session prompt (creates Linear project/issues) +- **`initializer_bis_prompt.md`**: Add features prompt (extends existing project) +- **`coding_prompt.md`**: Standard coding session (web apps with Puppeteer testing) +- **`coding_prompt_library.md`**: Library coding session (focuses on types/docs, uses pytest/mypy) + +The framework automatically selects the appropriate prompt based on session type and project detection. + +## Important Implementation Notes + +### Linear Integration + +- All work tracked as Linear issues, not local files +- Session handoff via Linear comments on META issue +- Status workflow: Todo → In Progress → Done +- Early termination: Agent stops when detecting "feature-complete" in responses + +### Auto-Continue Behavior + +Agent auto-continues with 3-second delay between sessions (`agent.py:AUTO_CONTINUE_DELAY_SECONDS`). Stops when: +- `--max-iterations` limit reached +- Response contains "feature-complete" or "all issues completed" +- Fatal error occurs + +### Project Directory Handling + +Relative paths automatically placed under `generations/` directory unless absolute path provided. 
+ +### Model Selection + +Default: `claude-opus-4-5-20251101` (Opus 4.5 for best coding performance) +Override with: `--model claude-sonnet-4-5-20250929` diff --git a/.claude/settings.local.json b/.claude/settings.local.json index c85f664..047e8a8 100644 --- a/.claude/settings.local.json +++ b/.claude/settings.local.json @@ -3,7 +3,41 @@ "allow": [ "Bash(test:*)", "Bash(cat:*)", - "Bash(netstat:*)" + "Bash(netstat:*)", + "Bash(docker-compose:*)", + "Bash(ls:*)", + "Bash(rm:*)", + "Bash(python autonomous_agent_demo.py:*)", + "Bash(dir C:GitHublinear_coding_philosophia_raggenerationslibrary_rag*.py)", + "Bash(git add:*)", + "Bash(git commit -m \"$\\(cat <<''EOF''\nFix import error: rename delete_document_passages to delete_document_chunks\n\nThe function was renamed in weaviate_ingest.py but the import in __init__.py\nwas not updated, causing ImportError when using the library.\n\nChanges:\n- Updated import statement in utils/__init__.py\n- Updated __all__ export list to use correct function name\nEOF\n\\)\")", + "Bash(dir \"C:\\\\GitHub\\\\linear_coding_philosophia_rag\\\\generations\\\\library_rag\\\\.env\")", + "Bash(git commit:*)", + "Bash(tasklist:*)", + "Bash(findstr:*)", + "Bash(wmic process:*)", + "Bash(powershell -Command \"Get-Process python | Select-Object Id,Path,StartTime | Format-Table -AutoSize\")", + "Bash(powershell -Command \"Get-WmiObject Win32_Process -Filter \"\"name = ''python.exe''\"\" | Select-Object ProcessId, CommandLine | Format-List\")", + "Bash(timeout:*)", + "Bash(powershell -Command:*)", + "Bash(python:*)", + "Bash(dir \"C:\\\\GitHub\\\\linear_coding_library_rag\\\\generations\\\\library_rag\")", + "Bash(docker ps:*)", + "Bash(curl:*)", + "Bash(dir:*)", + "Bash(grep:*)", + "Bash(git push:*)", + "Bash(mypy:*)", + "WebSearch", + "Bash(nvidia-smi:*)", + "WebFetch(domain:cr.weaviate.io)", + "Bash(git restore:*)", + "Bash(git log:*)", + "Bash(done)", + "Bash(git remote set-url:*)", + "Bash(docker compose:*)", + "Bash(pytest:*)", + "Bash(git 
pull:*)" ] } } diff --git a/.env.example b/.env.example new file mode 100644 index 0000000..f9fa073 --- /dev/null +++ b/.env.example @@ -0,0 +1,11 @@ +# Claude Code OAuth Token +# Run 'claude setup-token' to generate this token +CLAUDE_CODE_OAUTH_TOKEN=your-oauth-token-here + +# Linear API Key +# Get your API key from: https://linear.app/YOUR-TEAM/settings/api +LINEAR_API_KEY=lin_api_xxxxxxxxxxxxx + +# Linear Team ID (optional) +# If not set, the agent will list teams and ask you to choose +LINEAR_TEAM_ID= diff --git a/GUIDE_NEW_APP.md b/GUIDE_NEW_APP.md index eda9662..db5b672 100644 --- a/GUIDE_NEW_APP.md +++ b/GUIDE_NEW_APP.md @@ -332,3 +332,6 @@ Pour créer une nouvelle application : Le framework s'occupe du reste ! 🚀 + + + diff --git a/README.md b/README.md index 4e92c64..0d3d258 100644 --- a/README.md +++ b/README.md @@ -26,23 +26,35 @@ pip install -r requirements.txt ### 2. Set Up Authentication -You need two authentication tokens: +Create a `.env` file in the root directory by copying the example: -**Claude Code OAuth Token:** +```bash +cp .env.example .env +``` + +Then configure your credentials in the `.env` file: + +**1. Claude Code OAuth Token:** ```bash # Generate the token using Claude Code CLI claude setup-token -# Set the environment variable -export CLAUDE_CODE_OAUTH_TOKEN='your-oauth-token-here' +# Add to .env file: +CLAUDE_CODE_OAUTH_TOKEN='your-oauth-token-here' ``` -**Linear API Key:** +**2. Linear API Key:** ```bash # Get your API key from: https://linear.app/YOUR-TEAM/settings/api -export LINEAR_API_KEY='lin_api_xxxxxxxxxxxxx' +# Add to .env file: +LINEAR_API_KEY='lin_api_xxxxxxxxxxxxx' + +# Optional: Linear Team ID (if not set, agent will list teams) +LINEAR_TEAM_ID='your-team-id' ``` +**Important:** The `.env` file is already in `.gitignore` - never commit it! + ### 3. 
Verify Installation ```bash @@ -142,12 +154,15 @@ Instead of local text files, agents communicate through: - **META Issue**: Session summaries and handoff notes - **Issue Status**: Todo / In Progress / Done workflow -## Environment Variables +## Configuration (.env file) + +All configuration is done via a `.env` file in the root directory. | Variable | Description | Required | |----------|-------------|----------| | `CLAUDE_CODE_OAUTH_TOKEN` | Claude Code OAuth token (from `claude setup-token`) | Yes | | `LINEAR_API_KEY` | Linear API key for MCP access | Yes | +| `LINEAR_TEAM_ID` | Linear Team ID (if not set, agent will list teams and ask) | No | ## Command Line Options @@ -268,11 +283,14 @@ Edit `security.py` to add or remove commands from `ALLOWED_COMMANDS`. ## Troubleshooting -**"CLAUDE_CODE_OAUTH_TOKEN not set"** -Run `claude setup-token` to generate a token, then export it. +**"CLAUDE_CODE_OAUTH_TOKEN not found in .env file"** +1. Run `claude setup-token` to generate a token +2. Copy `.env.example` to `.env` +3. Add your token to the `.env` file -**"LINEAR_API_KEY not set"** -Get your API key from `https://linear.app/YOUR-TEAM/settings/api` +**"LINEAR_API_KEY not found in .env file"** +1. Get your API key from `https://linear.app/YOUR-TEAM/settings/api` +2. Add it to your `.env` file **"Appears to hang on first run"** Normal behavior. The initializer is creating a Linear project and 50 issues with detailed descriptions. Watch for `[Tool: mcp__linear__create_issue]` output. @@ -281,7 +299,7 @@ Normal behavior. The initializer is creating a Linear project and 50 issues with The agent tried to run a disallowed command. Add it to `ALLOWED_COMMANDS` in `security.py` if needed. **"MCP server connection failed"** -Verify your `LINEAR_API_KEY` is valid and has appropriate permissions. The Linear MCP server uses HTTP transport at `https://mcp.linear.app/mcp`. +Verify your `LINEAR_API_KEY` in the `.env` file is valid and has appropriate permissions. 
The Linear MCP server uses HTTP transport at `https://mcp.linear.app/mcp`. ## Viewing Progress diff --git a/agent.py b/agent.py index 2e06223..8ca7b00 100644 --- a/agent.py +++ b/agent.py @@ -17,6 +17,7 @@ from prompts import ( get_initializer_prompt, get_initializer_bis_prompt, get_coding_prompt, + get_coding_prompt_library, copy_spec_to_project, copy_new_spec_to_project, ) @@ -26,6 +27,34 @@ from prompts import ( AUTO_CONTINUE_DELAY_SECONDS = 3 +def is_library_project(project_dir: Path) -> bool: + """ + Detect if this is a library/type-safety project vs a full-stack web app. + + Checks app_spec.txt for keywords related to type safety, documentation, or library projects. + """ + app_spec_path = project_dir / "app_spec.txt" + if not app_spec_path.exists(): + return False + + try: + spec_content = app_spec_path.read_text(encoding='utf-8').lower() + + # Keywords that indicate a library/type-safety project + library_keywords = [ + "type safety", + "type annotations", + "docstrings", + "documentation enhancement", + "mypy", + "library rag", + ] + + return any(keyword in spec_content for keyword in library_keywords) + except Exception: + return False + + async def run_agent_session( client: ClaudeSDKClient, message: str, @@ -162,8 +191,8 @@ async def run_autonomous_agent( print("Fresh start - will use initializer agent") print() print("=" * 70) - print(" NOTE: First session takes 10-20+ minutes!") - print(" The agent is creating 50 Linear issues and setting up the project.") + print(" NOTE: First session may take several minutes!") + print(" The agent is creating Linear issues (one per feature in spec).") print(" This may appear to hang - it's working. Watch for [Tool: ...] 
output.") print("=" * 70) print() @@ -213,7 +242,11 @@ async def run_autonomous_agent( prompt = get_initializer_bis_prompt() use_initializer_bis = False # Only use initializer bis once else: - prompt = get_coding_prompt() + # Detect project type and use appropriate coding prompt + if is_library_project(project_dir): + prompt = get_coding_prompt_library() + else: + prompt = get_coding_prompt() # Run session with async context manager async with client: diff --git a/autonomous_agent_demo.py b/autonomous_agent_demo.py index 7d9b487..e296867 100644 --- a/autonomous_agent_demo.py +++ b/autonomous_agent_demo.py @@ -17,8 +17,12 @@ import asyncio import os from pathlib import Path +from dotenv import load_dotenv from agent import run_autonomous_agent +# Load environment variables from .env file +load_dotenv() + # Configuration # Using Claude Opus 4.5 as default for best coding and agentic performance @@ -48,9 +52,10 @@ Examples: # Add new specifications to existing project python autonomous_agent_demo.py --project-dir ./claude_clone --new-spec app_spec_new1.txt -Environment Variables: +Configuration (.env file): CLAUDE_CODE_OAUTH_TOKEN Claude Code OAuth token (required) LINEAR_API_KEY Linear API key (required) + LINEAR_TEAM_ID Linear Team ID (optional) """, ) @@ -91,18 +96,17 @@ def main() -> None: # Check for Claude Code OAuth token if not os.environ.get("CLAUDE_CODE_OAUTH_TOKEN"): - print("Error: CLAUDE_CODE_OAUTH_TOKEN environment variable not set") - print("\nRun 'claude setup-token' after installing the Claude Code CLI.") - print("\nThen set it:") - print(" export CLAUDE_CODE_OAUTH_TOKEN='your-token-here'") + print("Error: CLAUDE_CODE_OAUTH_TOKEN not found in .env file") + print("\n1. Run 'claude setup-token' after installing the Claude Code CLI") + print("2. Copy .env.example to .env") + print("3. 
Add your token to .env: CLAUDE_CODE_OAUTH_TOKEN='your-token-here'") return # Check for Linear API key if not os.environ.get("LINEAR_API_KEY"): - print("Error: LINEAR_API_KEY environment variable not set") - print("\nGet your API key from: https://linear.app/YOUR-TEAM/settings/api") - print("\nThen set it:") - print(" export LINEAR_API_KEY='lin_api_xxxxxxxxxxxxx'") + print("Error: LINEAR_API_KEY not found in .env file") + print("\n1. Get your API key from: https://linear.app/YOUR-TEAM/settings/api") + print("2. Add it to .env: LINEAR_API_KEY='lin_api_xxxxxxxxxxxxx'") return # Automatically place projects in generations/ directory unless already specified diff --git a/client.py b/client.py index 0961566..cd6867a 100644 --- a/client.py +++ b/client.py @@ -9,11 +9,15 @@ import json import os from pathlib import Path +from dotenv import load_dotenv from claude_code_sdk import ClaudeCodeOptions, ClaudeSDKClient from claude_code_sdk.types import HookMatcher from security import bash_security_hook +# Load environment variables from .env file +load_dotenv() + # Puppeteer MCP tools for browser automation PUPPETEER_TOOLS = [ @@ -85,15 +89,17 @@ def create_client(project_dir: Path, model: str) -> ClaudeSDKClient: api_key = os.environ.get("CLAUDE_CODE_OAUTH_TOKEN") if not api_key: raise ValueError( - "CLAUDE_CODE_OAUTH_TOKEN environment variable not set.\n" - "Run 'claude setup-token after installing the Claude Code CLI." + "CLAUDE_CODE_OAUTH_TOKEN not set in .env file.\n" + "Run 'claude setup-token' after installing the Claude Code CLI,\n" + "then add the token to your .env file." ) linear_api_key = os.environ.get("LINEAR_API_KEY") if not linear_api_key: raise ValueError( - "LINEAR_API_KEY environment variable not set.\n" - "Get your API key from: https://linear.app/YOUR-TEAM/settings/api" + "LINEAR_API_KEY not set in .env file.\n" + "Get your API key from: https://linear.app/YOUR-TEAM/settings/api\n" + "then add it to your .env file." 
) # Create comprehensive security settings diff --git a/docker-compose.my_project.yml b/docker-compose.my_project.yml new file mode 100644 index 0000000..592ad00 --- /dev/null +++ b/docker-compose.my_project.yml @@ -0,0 +1,29 @@ +services: + my_project_frontend: + image: node:20 + working_dir: /app + volumes: + - ./generations/my_project:/app + # Avoid reusing the Windows node_modules inside the Linux container + - /app/node_modules + command: ["sh", "-c", "npm install && npm run dev -- --host 0.0.0.0 --port 3000"] + ports: + - "4300:3000" + environment: + - NODE_ENV=development + + my_project_server: + image: node:20 + working_dir: /app/server + volumes: + - ./generations/my_project:/app + # Avoid reusing the Windows node_modules inside the Linux container + - /app/server/node_modules + command: ["sh", "-c", "npm install && npm start"] + ports: + - "4301:3001" + environment: + - NODE_ENV=development + depends_on: + - my_project_frontend + diff --git a/linear_config.py b/linear_config.py index 434d1d2..38d4429 100644 --- a/linear_config.py +++ b/linear_config.py @@ -7,10 +7,14 @@ These values are used in prompts and for project state management. 
""" import os +from dotenv import load_dotenv -# Environment variables (must be set before running) +# Load environment variables from .env file +load_dotenv() + +# Environment variables (loaded from .env file) LINEAR_API_KEY = os.environ.get("LINEAR_API_KEY") -LINEAR_TEAM_ID = os.environ.get("LINEAR_TEAM_ID") +LINEAR_TEAM_ID = os.environ.get("LINEAR_TEAM_ID") # Default number of issues to create (can be overridden via command line) DEFAULT_ISSUE_COUNT = 50 diff --git a/package-lock.json b/package-lock.json new file mode 100644 index 0000000..97cec99 --- /dev/null +++ b/package-lock.json @@ -0,0 +1,6 @@ +{ + "name": "linear_coding_philosophia_rag", + "lockfileVersion": 3, + "requires": true, + "packages": {} +} diff --git a/prompts.py b/prompts.py index f4c8a8b..cf08e91 100644 --- a/prompts.py +++ b/prompts.py @@ -28,6 +28,11 @@ def get_coding_prompt() -> str: return load_prompt("coding_prompt") +def get_coding_prompt_library() -> str: + """Load the library-specific coding agent prompt (for type safety & documentation projects).""" + return load_prompt("coding_prompt_library") + + def copy_spec_to_project(project_dir: Path) -> None: """Copy the app spec file into the project directory for the agent to read.""" spec_source = PROMPTS_DIR / "app_spec.txt" diff --git a/prompts/app_spec.txt b/prompts/app_spec.txt index 1e35f6d..147ea49 100644 --- a/prompts/app_spec.txt +++ b/prompts/app_spec.txt @@ -1,681 +1,542 @@ - Claude.ai Clone - AI Chat Interface + Library RAG MCP Server - PDF Ingestion & Semantic Retrieval - Build a fully functional clone of claude.ai, Anthropic's conversational AI interface. The application should - provide a clean, modern chat interface for interacting with Claude via the API, including features like - conversation management, artifact rendering, project organization, multiple model selection, and advanced - settings. 
The UI should closely match claude.ai's design using Tailwind CSS with a focus on excellent - user experience and responsive design. + MCP (Model Context Protocol) server exposing Library RAG's capabilities as tools for LLMs. + + **Simplified architecture:** + - 1 ingestion tool: parse_pdf (optimal configuration predefined) + - 7 retrieval tools: semantic search and document management + + **Optimal configuration (fixed parameters):** + - LLM: Mistral Medium (mistral-medium-latest) + - OCR: Mistral with annotations (better TOC quality, 3x cost) + - Chunking: intelligent semantic (LLM-based) + - Ingestion: automatic into Weaviate + + The client LLM only has to provide the PDF path; all parameters are optimized by default. - - You can use an API key located at /tmp/api-key for testing. You will not be allowed to read this file, but you can reference it in code. - - - React with Vite - Tailwind CSS (via CDN) - React hooks and context - React Router for navigation - React Markdown for message rendering - Syntax highlighting for code blocks - Only launch on port {frontend_port} - - Node.js with Express - SQLite with better-sqlite3 - Claude API for chat completions - Server-Sent Events for streaming responses + Python 3.10+ + mcp Python SDK (official Anthropic implementation) + Weaviate 1.34.4 + text2vec-transformers + Mistral OCR API with annotations + Mistral API (mistral-medium-latest) + mypy strict - - RESTful endpoints - SSE for real-time message streaming - Integration with Claude API using Anthropic SDK - + + 1.0 + stdio + tools + - - - - Repository includes .env with VITE_ANTHROPIC_API_KEY configured - - Frontend dependencies pre-installed via pnpm - - Backend code goes in /server directory - - Install backend dependencies as needed - - - - - - - Clean, centered chat layout with message bubbles - - Streaming message responses with typing indicator - - Markdown rendering with proper formatting - - Code blocks with syntax 
highlighting and copy button - - LaTeX/math equation rendering - - Image upload and display in messages - - Multi-turn conversations with context - - Message editing and regeneration - - Stop generation button during streaming - - Input field with auto-resize textarea - - Character count and token estimation - - Keyboard shortcuts (Enter to send, Shift+Enter for newline) - - - - - Artifact detection and rendering in side panel - - Code artifact viewer with syntax highlighting - - HTML/SVG preview with live rendering - - React component preview - - Mermaid diagram rendering - - Text document artifacts - - Artifact editing and re-prompting - - Full-screen artifact view - - Download artifact content - - Artifact versioning and history - - - - - Create new conversations - - Conversation list in sidebar - - Rename conversations - - Delete conversations - - Search conversations by title/content - - Pin important conversations - - Archive conversations - - Conversation folders/organization - - Duplicate conversation - - Export conversation (JSON, Markdown, PDF) - - Conversation timestamps (created, last updated) - - Unread message indicators - - - - - Create projects to group related conversations - - Project knowledge base (upload documents) - - Project-specific custom instructions - - Share projects with team (mock feature) - - Project settings and configuration - - Move conversations between projects - - Project templates - - Project analytics (usage stats) - - - - - Model selector dropdown with the following models: - - Claude Sonnet 4.5 (claude-sonnet-4-5-20250929) - default - - Claude Haiku 4.5 (claude-haiku-4-5-20251001) - - Claude Opus 4.1 (claude-opus-4-1-20250805) - - Model capabilities display - - Context window indicator - - Model-specific pricing info (display only) - - Switch models mid-conversation - - Model comparison view - - - - - Global custom instructions - - Project-specific custom instructions - - Conversation-specific system prompts - - Custom 
instruction templates - - Preview how instructions affect responses - - - - - Theme selection (Light, Dark, Auto) - - Font size adjustment - - Message density (compact, comfortable, spacious) - - Code theme selection - - Language preferences - - Accessibility options - - Keyboard shortcuts reference - - Data export options - - Privacy settings - - API key management - - - - - Temperature control slider - - Max tokens adjustment - - Top-p (nucleus sampling) control - - System prompt override - - Thinking/reasoning mode toggle - - Multi-modal input (text + images) - - Voice input (optional, mock UI) - - Response suggestions - - Related prompts - - Conversation branching - - - - - Share conversation via link (read-only) - - Export conversation formats - - Conversation templates - - Prompt library - - Share artifacts - - Team workspaces (mock UI) - - - - - Search across all conversations - - Filter by project, date, model - - Prompt library with categories - - Example conversations - - Quick actions menu - - Command palette (Cmd/Ctrl+K) - - - - - Token usage display per message - - Conversation cost estimation - - Daily/monthly usage dashboard - - Usage limits and warnings - - API quota tracking - - - - - Welcome screen for new users - - Feature tour highlights - - Example prompts to get started - - Quick tips and best practices - - Keyboard shortcuts tutorial - - - - - Full keyboard navigation - - Screen reader support - - ARIA labels and roles - - High contrast mode - - Focus management - - Reduced motion support - - - - - Mobile-first responsive layout - - Touch-optimized interface - - Collapsible sidebar on mobile - - Swipe gestures for navigation - - Adaptive artifact display - - Progressive Web App (PWA) support - - - - - - - - id, email, name, avatar_url - - created_at, last_login - - preferences (JSON: theme, font_size, etc.) 
- - custom_instructions - - - - - id, user_id, name, description, color - - custom_instructions, knowledge_base_path - - created_at, updated_at - - is_archived, is_pinned - - - - - id, user_id, project_id, title - - model, created_at, updated_at, last_message_at - - is_archived, is_pinned, is_deleted - - settings (JSON: temperature, max_tokens, etc.) - - token_count, message_count - - - - - id, conversation_id, role (user/assistant/system) - - content, created_at, edited_at - - tokens, finish_reason - - images (JSON array of image data) - - parent_message_id (for branching) - - - - - id, message_id, conversation_id - - type (code/html/svg/react/mermaid/text) - - title, identifier, language - - content, version - - created_at, updated_at - - - - - id, conversation_id, share_token - - created_at, expires_at, view_count - - is_public - - - - - id, user_id, title, description - - prompt_template, category, tags (JSON) - - is_public, usage_count - - created_at, updated_at - - - - - id, user_id, project_id, name, parent_folder_id - - created_at, position - - - - - id, folder_id, conversation_id - - - - - id, user_id, conversation_id, message_id - - model, input_tokens, output_tokens - - cost_estimate, created_at - - - - - id, user_id, key_name, api_key_hash - - created_at, last_used_at - - is_active - - - - - - - - POST /api/auth/login - - POST /api/auth/logout - - GET /api/auth/me - - PUT /api/auth/profile - - - - - GET /api/conversations - - POST /api/conversations - - GET /api/conversations/:id - - PUT /api/conversations/:id - - DELETE /api/conversations/:id - - POST /api/conversations/:id/duplicate - - POST /api/conversations/:id/export - - PUT /api/conversations/:id/archive - - PUT /api/conversations/:id/pin - - POST /api/conversations/:id/branch - - - - - GET /api/conversations/:id/messages - - POST /api/conversations/:id/messages - - PUT /api/messages/:id - - DELETE /api/messages/:id - - POST /api/messages/:id/regenerate - - GET /api/messages/stream (SSE endpoint) 
- - - - - GET /api/conversations/:id/artifacts - - GET /api/artifacts/:id - - PUT /api/artifacts/:id - - DELETE /api/artifacts/:id - - POST /api/artifacts/:id/fork - - GET /api/artifacts/:id/versions - - - - - GET /api/projects - - POST /api/projects - - GET /api/projects/:id - - PUT /api/projects/:id - - DELETE /api/projects/:id - - POST /api/projects/:id/knowledge - - GET /api/projects/:id/conversations - - PUT /api/projects/:id/settings - - - - - POST /api/conversations/:id/share - - GET /api/share/:token - - DELETE /api/share/:token - - PUT /api/share/:token/settings - - - - - GET /api/prompts/library - - POST /api/prompts/library - - GET /api/prompts/:id - - PUT /api/prompts/:id - - DELETE /api/prompts/:id - - GET /api/prompts/categories - - GET /api/prompts/examples - - - - - GET /api/search/conversations?q=query - - GET /api/search/messages?q=query - - GET /api/search/artifacts?q=query - - GET /api/search/prompts?q=query - - - - - GET /api/folders - - POST /api/folders - - PUT /api/folders/:id - - DELETE /api/folders/:id - - POST /api/folders/:id/items - - DELETE /api/folders/:id/items/:conversationId - - - - - GET /api/usage/daily - - GET /api/usage/monthly - - GET /api/usage/by-model - - GET /api/usage/conversations/:id - - - - - GET /api/settings - - PUT /api/settings - - GET /api/settings/custom-instructions - - PUT /api/settings/custom-instructions - - - - - POST /api/claude/chat (proxy to Claude API) - - POST /api/claude/chat/stream (streaming proxy) - - GET /api/claude/models - - POST /api/claude/images/upload - - - - - - - Three-column layout: sidebar (conversations), main (chat), panel (artifacts) - - Collapsible sidebar with resize handle - - Responsive breakpoints: mobile (single column), tablet (two column), desktop (three column) - - Persistent header with project/model selector - - Bottom input area with send button and options - - - - - New chat button (prominent) - - Project selector dropdown - - Search conversations input - - Conversations 
list (grouped by date: Today, Yesterday, Previous 7 days, etc.) - - Folder tree view (collapsible) - - Settings gear icon at bottom - - User profile at bottom - - - - - Conversation title (editable inline) - - Model selector badge - - Message history (scrollable) - - Welcome screen for new conversations - - Suggested prompts (empty state) - - Input area with formatting toolbar - - Attachment button for images - - Send button with loading state - - Stop generation button - - - - - Artifact header with title and type badge - - Code editor or preview pane - - Tabs for multiple artifacts - - Full-screen toggle - - Download button - - Edit/Re-prompt button - - Version selector - - Close panel button - - - - - Settings modal (tabbed interface) - - Share conversation modal - - Export options modal - - Project settings modal - - Prompt library modal - - Command palette overlay - - Keyboard shortcuts reference - - - - - - - Primary: Orange/amber accent (#CC785C claude-style) - - Background: White (light mode), Dark gray (#1A1A1A dark mode) - - Surface: Light gray (#F5F5F5 light), Darker gray (#2A2A2A dark) - - Text: Near black (#1A1A1A light), Off-white (#E5E5E5 dark) - - Borders: Light gray (#E5E5E5 light), Dark gray (#404040 dark) - - Code blocks: Monaco editor theme - - - - - Sans-serif system font stack (Inter, SF Pro, Roboto, system-ui) - - Headings: font-semibold - - Body: font-normal, leading-relaxed - - Code: Monospace (JetBrains Mono, Consolas, Monaco) - - Message text: text-base (16px), comfortable line-height - - - - - - User messages: Right-aligned, subtle background - - Assistant messages: Left-aligned, no background - - Markdown formatting with proper spacing - - Inline code with bg-gray-100 background - - Code blocks with syntax highlighting - - Copy button on code blocks - - - - - Primary: Orange/amber background, white text, rounded - - Secondary: Border style with hover fill - - Icon buttons: Square with hover background - - Disabled state: Reduced 
opacity, no pointer events - - - - - Rounded borders with focus ring - - Textarea auto-resize - - Placeholder text in gray - - Error states in red - - Character counter - - - - - Subtle border or shadow - - Rounded corners (8px) - - Padding: p-4 to p-6 - - Hover state: slight shadow increase - - - - - - Smooth transitions (150-300ms) - - Fade in for new messages - - Slide in for sidebar - - Typing indicator animation - - Loading spinner for generation - - Skeleton loaders for content - - - - - - 1. User types message in input field - 2. Optional: Attach images via button - 3. Click send or press Enter - 4. Message appears in chat immediately - 5. Typing indicator shows while waiting - 6. Response streams in word by word - 7. Code blocks render with syntax highlighting - 8. Artifacts detected and rendered in side panel - 9. Message complete, enable regenerate option - - - - 1. Assistant generates artifact in response - 2. Artifact panel slides in from right - 3. Content renders (code with highlighting or live preview) - 4. User can edit artifact inline - 5. "Re-prompt" button to iterate with Claude - 6. Download or copy artifact content - 7. Full-screen mode for detailed work - 8. Close panel to return to chat focus - - - - 1. Click "New Chat" to start fresh conversation - 2. Conversations auto-save with first message - 3. Auto-generate title from first exchange - 4. Click title to rename inline - 5. Drag conversations into folders - 6. Right-click for context menu (pin, archive, delete, export) - 7. Search filters conversations in real-time - 8. 
Click conversation to switch context - - - - - Setup Project Foundation and Database - - - Initialize Express server with SQLite database - - Set up Claude API client with streaming support - - Create database schema with migrations - - Implement authentication endpoints - - Set up basic CORS and middleware - - Create health check endpoint - - + + MCP Server Foundation + + Set up the basic MCP server structure with stdio transport. - - Build Core Chat Interface - - - Create main layout with sidebar and chat area - - Implement message display with markdown rendering - - Add streaming message support with SSE - - Build input area with auto-resize textarea - - Add code block syntax highlighting - - Implement stop generation functionality - - Add typing indicators and loading states - - + Tasks: + - Install mcp Python SDK: pip install mcp + - Create mcp_server.py with Server initialization + - Configure stdio transport for LLM communication + - Implement server lifecycle handlers (startup, shutdown) + - Set up basic logging with Python logging module + - Test server startup and graceful shutdown + + 1 + infrastructure + + 1. Run: python mcp_server.py + 2. Verify server starts without errors + 3. Check that stdio transport is initialized + 4. Send SIGTERM and verify graceful shutdown + 5. Check logs are created and formatted correctly + + - - Conversation Management - - - Create conversation list in sidebar - - Implement new conversation creation - - Add conversation switching - - Build conversation rename functionality - - Implement delete with confirmation - - Add conversation search - - Create conversation grouping by date - - + + Configuration Management + + Implement configuration loading and validation. 
- - Artifacts System - - - Build artifact detection from Claude responses - - Create artifact rendering panel - - Implement code artifact viewer - - Add HTML/SVG live preview - - Build artifact editing interface - - Add artifact versioning - - Implement full-screen artifact view - - + Tasks: + - Verify mcp_config.py exists (already created) + - Test MCPConfig.from_env() loads .env correctly + - Validate MISTRAL_API_KEY is required + - Test default values for optional settings + - Create .env.example file with all variables + - Document all environment variables + + 1 + infrastructure + + 1. Create .env file with MISTRAL_API_KEY + 2. Run: python -c "from mcp_config import MCPConfig; MCPConfig.from_env()" + 3. Verify config loads successfully + 4. Remove MISTRAL_API_KEY from .env + 5. Verify ValueError is raised + 6. Test all default values are applied + + - - Projects and Organization - - - Create projects CRUD endpoints - - Build project selector UI - - Implement project-specific custom instructions - - Add folder system for conversations - - Create drag-and-drop organization - - Build project settings panel - - + + Pydantic Schemas for All Tools + + Define all tool input/output schemas using Pydantic models. - - Advanced Features - - - Add model selection dropdown - - Implement temperature and parameter controls - - Build image upload functionality - - Create message editing and regeneration - - Add conversation branching - - Implement export functionality - - + Tasks: + - Create mcp_tools/ directory + - Create mcp_tools/__init__.py + - Create mcp_tools/schemas.py + - Define ParsePdfInput model with validation + - Define ParsePdfOutput model + - Define SearchChunksInput/Output models + - Define schemas for all 7 retrieval tools + - Add Field() validation (min/max, regex, enum) + - Add docstrings for JSON schema generation + - Verify mypy --strict passes + + 1 + infrastructure + + 1. Run: mypy mcp_tools/schemas.py --strict + 2. Verify no type errors + 3. 
Test ParsePdfInput validation with invalid inputs + 4. Test ParsePdfInput validation with valid inputs + 5. Generate JSON schema from Pydantic models + 6. Verify all fields have descriptions + + - - Settings and Customization - - - Build settings modal with tabs - - Implement theme switching (light/dark) - - Add custom instructions management - - Create keyboard shortcuts - - Build prompt library - - Add usage tracking dashboard - - + + Parsing Tool - parse_pdf Implementation + + Implement the parse_pdf tool with optimal parameters pre-configured. - - Sharing and Collaboration - - - Implement conversation sharing with tokens - - Create public share view - - Add export to multiple formats - - Build prompt templates - - Create example conversations - - + Tasks: + - Create mcp_tools/parsing_tools.py + - Implement parse_pdf tool handler + - Fixed parameters: + - llm_provider="mistral" + - llm_model="mistral-medium-latest" + - use_semantic_chunking=True + - use_ocr_annotations=True + - ingest_to_weaviate=True + - Wrapper around pdf_pipeline.process_pdf_bytes() + - Handle file downloads for URL inputs + - Return comprehensive results (metadata, costs, file paths) + - Add error handling and logging + - Register tool with MCP server + + 1 + functional + + 1. Mock pdf_pipeline.process_pdf_bytes() + 2. Call parse_pdf with local PDF path + 3. Verify fixed parameters are used + 4. Call parse_pdf with URL + 5. Verify file download works + 6. Check output contains all required fields + 7. Verify costs are tracked and returned + + - - Polish and Optimization - - - Optimize for mobile responsiveness - - Add command palette (Cmd+K) - - Implement comprehensive keyboard navigation - - Add onboarding flow - - Create accessibility improvements - - Performance optimization and caching - - + + Retrieval Tool - search_chunks + + Implement semantic search on text chunks. 
+ + Tasks: + - Create mcp_tools/retrieval_tools.py + - Implement search_chunks tool handler + - Weaviate near_text query on Chunk collection + - Support filters: author, work, language, min_similarity + - Handle nested object properties (work.author, work.title) + - Return results with similarity scores + - Add error handling for Weaviate connection + - Register tool with MCP server + + 2 + functional + + 1. Mock Weaviate client + 2. Call search_chunks with query="justice" + 3. Verify near_text query is called + 4. Test author_filter applies correct nested filter + 5. Test work_filter applies correct nested filter + 6. Test min_similarity threshold works + 7. Verify results include all metadata fields + + + + + Retrieval Tool - search_summaries + + Implement search in chapter/section summaries. + + Tasks: + - Add search_summaries handler to retrieval_tools.py + - Query Summary collection with near_text + - Support level filters (min_level, max_level) + - Return summaries with hierarchical metadata + - Add error handling + - Register tool with MCP server + + 2 + functional + + 1. Mock Weaviate Summary collection + 2. Call search_summaries with query + 3. Verify near_text query on Summary + 4. Test min_level filter + 5. Test max_level filter + 6. Verify results include sectionPath and concepts + + + + + Retrieval Tool - get_document + + Retrieve complete document metadata and chunks by sourceId. + + Tasks: + - Add get_document handler to retrieval_tools.py + - Query Document collection by sourceId + - Optionally fetch related chunks + - Support chunk_limit parameter + - Return complete document data with TOC and hierarchy + - Add error handling for missing documents + - Register tool with MCP server + + 2 + functional + + 1. Mock Weaviate Document collection + 2. Call get_document with valid sourceId + 3. Verify Document query by sourceId + 4. Test include_chunks=True fetches chunks + 5. Test include_chunks=False skips chunks + 6. Test chunk_limit parameter + 7. 
Verify error on missing document + + + + + Retrieval Tool - list_documents + + List all documents with filtering and pagination. + + Tasks: + - Add list_documents handler to retrieval_tools.py + - Query Document collection with filters + - Support author, work, language filters + - Implement pagination (limit, offset) + - Return document summaries with counts + - Add error handling + - Register tool with MCP server + + 2 + functional + + 1. Mock Weaviate Document collection + 2. Call list_documents without filters + 3. Verify all documents returned + 4. Test author_filter + 5. Test work_filter + 6. Test pagination (limit=10, offset=5) + 7. Verify total count is accurate + + + + + Retrieval Tool - get_chunks_by_document + + Retrieve all chunks for a document in sequential order. + + Tasks: + - Add get_chunks_by_document handler to retrieval_tools.py + - Query Chunk collection filtered by document.sourceId + - Order results by orderIndex + - Support pagination (limit, offset) + - Support section_filter for specific sections + - Return ordered chunks with document metadata + - Add error handling + - Register tool with MCP server + + 2 + functional + + 1. Mock Weaviate Chunk collection + 2. Call get_chunks_by_document with document_id + 3. Verify filter by document.sourceId + 4. Verify ordering by orderIndex + 5. Test pagination + 6. Test section_filter parameter + 7. Verify document metadata is included + + + + + Retrieval Tool - filter_by_author + + Get all works and documents by a specific author. + + Tasks: + - Add filter_by_author handler to retrieval_tools.py + - Query Work collection by author + - Fetch related Documents for each work + - Optionally aggregate chunk counts + - Return hierarchical structure (works → documents) + - Add error handling + - Register tool with MCP server + + 2 + functional + + 1. Mock Weaviate Work and Document collections + 2. Call filter_by_author with author="Platon" + 3. Verify Work query by author + 4. 
Verify Documents are fetched for each work + 5. Test include_chunks=True aggregates counts + 6. Test include_chunks=False skips counts + 7. Verify hierarchical structure in output + + + + + Retrieval Tool - delete_document + + Delete a document and all its chunks from Weaviate. + + Tasks: + - Add delete_document handler to retrieval_tools.py + - Use weaviate_ingest.delete_document_chunks() + - Require explicit confirmation flag + - Return deletion statistics (chunks, summaries, document) + - Add safety checks + - Add error handling + - Register tool with MCP server + + 3 + functional + + 1. Mock weaviate_ingest.delete_document_chunks() + 2. Call delete_document without confirm=True + 3. Verify error is raised + 4. Call delete_document with confirm=True + 5. Verify delete_document_chunks is called + 6. Verify deletion statistics are returned + 7. Test error handling for missing document + + + + + Error Handling & Structured Logging + + Implement comprehensive error handling and logging across all tools. + + Tasks: + - Define custom exception classes: + - WeaviateConnectionError + - PDFProcessingError + - ValidationError + - MCPToolError + - Add try-except in all tool handlers + - Convert Python exceptions to MCP error responses + - Implement structured JSON logging + - Log all tool invocations (name, inputs, duration, costs) + - Log Weaviate queries and results + - Configure log level from environment + - Never expose sensitive data in logs + + 2 + infrastructure + + 1. Test WeaviateConnectionError is raised on connection failure + 2. Verify error is converted to MCP error format + 3. Test PDFProcessingError on invalid PDF + 4. Verify all errors are logged with context + 5. Check logs contain tool name, inputs, duration + 6. Verify sensitive data (API keys) not in logs + 7. Test LOG_LEVEL environment variable works + + + + + MCP Server Documentation + + Create comprehensive documentation for MCP server usage. 
+ + Tasks: + - Create MCP_README.md with: + - Overview of capabilities + - Installation instructions + - Environment variable configuration + - Tool descriptions with examples + - Claude Desktop integration guide + - Troubleshooting section + - Add docstrings to all tool handlers + - Create .env.example file + - Document error codes and messages + - Add usage examples for each tool + + 3 + documentation + + 1. Review MCP_README.md for completeness + 2. Verify installation instructions work + 3. Check all tools documented with examples + 4. Test Claude Desktop integration steps + 5. Verify .env.example has all variables + 6. Check troubleshooting covers common issues + + + + + Unit Tests - Parsing Tool + + Implement unit tests for parse_pdf tool. + + Tasks: + - Create tests/mcp/ directory + - Create tests/mcp/conftest.py with fixtures + - Create tests/mcp/test_parsing_tools.py + - Test parse_pdf with valid PDF path + - Test parse_pdf with URL (mock download) + - Test parse_pdf error handling + - Mock pdf_pipeline.process_pdf_bytes() + - Verify fixed parameters are used + - Test cost tracking in output + - Use pytest-mock for mocking + - Target >80% coverage + + 3 + testing + + 1. Run: pytest tests/mcp/test_parsing_tools.py -v + 2. Verify all tests pass + 3. Run: pytest tests/mcp/test_parsing_tools.py --cov + 4. Verify coverage >80% + 5. Check mocks are used (no real API calls) + 6. Verify error cases are tested + + + + + Unit Tests - Retrieval Tools + + Implement unit tests for all 7 retrieval tools. + + Tasks: + - Create tests/mcp/test_retrieval_tools.py + - Test search_chunks with filters + - Test search_summaries with level filters + - Test get_document with/without chunks + - Test list_documents pagination + - Test get_chunks_by_document ordering + - Test filter_by_author aggregation + - Test delete_document confirmation + - Mock all Weaviate queries + - Test error handling for all tools + - Target >80% coverage + + 3 + testing + + 1. 
Run: pytest tests/mcp/test_retrieval_tools.py -v + 2. Verify all tests pass + 3. Run: pytest tests/mcp/test_retrieval_tools.py --cov + 4. Verify coverage >80% + 5. Check all 7 tools have tests + 6. Verify error cases are tested + 7. Check no real Weaviate connections + + + + + + tests/mcp/ + ├── conftest.py (fixtures for mocks) + ├── test_parsing_tools.py + ├── test_retrieval_tools.py + ├── test_schemas.py + └── test_config.py + + + + - pytest-mock for mocking + - Mock Mistral API responses + - Mock Weaviate client + - Fixtures for test data + - No real API calls in unit tests + + + + + + **NO PUPPETEER TESTS** + + MCP server with no web interface. + Focus on: + - Tool functionality (unit tests) + - Schema validation (Pydantic) + - Error handling + + + + + + + Configuration in ~/.config/claude/claude_desktop_config.json: + + { + "mcpServers": { + "library-rag": { + "command": "python", + "args": ["C:/GitHub/linear_coding_library_rag/generations/library_rag/mcp_server.py"], + "env": { + "MISTRAL_API_KEY": "your-key-here" + } + } + } + } + + + - - - Streaming chat responses work smoothly - - Artifact detection and rendering accurate - - Conversation management intuitive and reliable - - Project organization clear and useful - - Image upload and display working - - All CRUD operations functional - + + - 8 working tools (1 parsing + 7 retrieval) + - parse_pdf with fixed optimal parameters + - Working Weaviate semantic search + - Complete error handling + - - - Interface matches claude.ai design language - - Responsive on all device sizes - - Smooth animations and transitions - - Fast response times and minimal lag - - Intuitive navigation and workflows - - Clear feedback for all actions - + + - mypy --strict passes + - Coverage >80% + - Complete docstrings + - - - Clean, maintainable code structure - - Proper error handling throughout - - Secure API key management - - Optimized database queries - - Efficient streaming
implementation - - Comprehensive testing coverage - - - - - Consistent with claude.ai visual design - - Beautiful typography and spacing - - Smooth animations and micro-interactions - - Excellent contrast and accessibility - - Professional, polished appearance - - Dark mode fully implemented - + + - parse_pdf: <2 min per document + - search_chunks: <1s + + + + + 300-page document with optimal configuration: + - OCR: ~2.70€ + - LLM: ~0.15€ + - Total: ~2.85€ + + Returned in the parse_pdf output for tracking. + + diff --git a/prompts/app_spec_language_selection.completion.txt b/prompts/app_spec_language_selection.completion.txt deleted file mode 100644 index 656ccb9..0000000 --- a/prompts/app_spec_language_selection.completion.txt +++ /dev/null @@ -1,179 +0,0 @@ - - Language selection & i18n completion (FR default, EN/FR only) - - - This specification complements the existing "app_spec_language_selection.txt" file. - It does NOT replace the original spec. Instead, it adds additional requirements and - corrective steps to fully complete the language selection and i18n implementation. - - Main goals: - - Support exactly two UI languages: English ("en") and French ("fr"). - - Make French ("fr") the default language when no preference exists. - - Ensure that all user-facing text is translated via the i18n system (no hardcoded strings). - - Align the language selector UI with the actual supported languages. - - - - - The original file "app_spec_language_selection.txt" defines the initial language selection - feature and i18n architecture (context, translation files, etc.). - - This completion spec: - * keeps that architecture, - * tightens some requirements (FR as default), - * and adds missing work items (removal of hardcoded English strings, cleanup of extra languages). - - The original spec remains valid; this completion spec should be applied on top of it.
- - - - - Officially supported UI languages: - * English ("en") - * French ("fr") - - Default language: - * French ("fr") MUST be the default language when there is no stored preference. - - No other languages (es, de, ja, etc.) are considered part of this completion scope. - They may be added later in a separate spec with full translation coverage. - - The existing i18n architecture (LanguageContext, useLanguage hook, en.json, fr.json) - must be reused, not replaced. - - - - - LanguageContext and useLanguage already exist and manage language + translations. - - en.json and fr.json exist with a significant subset of strings translated. - - Some components already call t('...') correctly (e.g. welcome screen, many settings labels). - - However: - * Many UI strings are still hardcoded in English in "src/App.jsx". - * The language selector UI mentions more languages than are actually implemented. - * The default language behavior is not explicitly enforced as French. - - - - - French is used as the default language for new/anonymous users. - - Only English and French appear in the language selector. - - All user-facing UI strings in "src/App.jsx" and its inline components use t('key'). - - Every key used by the UI is defined in both en.json and fr.json. - - No leftover English UI text appears when French is selected. - - - - - - In the language context code: - * Ensure there is a constant DEFAULT_LANGUAGE set to "fr". - Example: - const DEFAULT_LANGUAGE = 'fr'; - - Initial language resolution MUST follow this order: - 1. If a valid language ("en" or "fr") is found in localStorage, use it. - 2. Otherwise, fall back to DEFAULT_LANGUAGE = "fr". - - This guarantees that first-time users and users without a stored preference see the UI in French. 
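The resolution order described above (stored preference first, then the French default) could be sketched roughly as follows. This is only an illustration under stated assumptions: the storage key name `language` and the helper names `VALID_LANGUAGES` and `resolveInitialLanguage` are hypothetical and do not come from the spec; only `DEFAULT_LANGUAGE = 'fr'` is taken from it. The storage object is passed in so the function can be exercised outside a browser.

```javascript
// Sketch only: STORAGE_KEY, VALID_LANGUAGES and resolveInitialLanguage are
// hypothetical names chosen for illustration, not taken from the spec.
const DEFAULT_LANGUAGE = 'fr';          // from the spec: French is the default
const VALID_LANGUAGES = ['en', 'fr'];   // the two supported UI languages
const STORAGE_KEY = 'language';         // assumed localStorage key name

// `storage` is any object exposing getItem(key), e.g. window.localStorage.
function resolveInitialLanguage(storage) {
  let stored = null;
  try {
    stored = storage.getItem(STORAGE_KEY);
  } catch (err) {
    // Storage may be unavailable or blocked; fall through to the default.
  }
  // 1. Use the stored value only if it is "en" or "fr".
  // 2. Otherwise fall back to DEFAULT_LANGUAGE.
  return VALID_LANGUAGES.includes(stored) ? stored : DEFAULT_LANGUAGE;
}
```

In a browser the call would presumably be `resolveInitialLanguage(window.localStorage)`; a stale value such as "es" left over from the old selector is treated the same as no preference.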
- - - - - SUPPORTED_LANGUAGES must contain exactly: - * { code: 'en', name: 'English', nativeName: 'English' } - * { code: 'fr', name: 'French', nativeName: 'Français' } - - The Settings > Language dropdown must iterate only over SUPPORTED_LANGUAGES. - - Any explicit references to "es", "de", "ja" as selectable languages must be removed - or commented out as "future languages" (but not shown to users). - - - - - Perform a systematic audit of "src/App.jsx" to identify every user-visible English string - that is still hardcoded. Typical areas include: - * ThemePreview sample messages (e.g. “Hello! Can you help me with something?”). - * About section in Settings > General (product name, description, “Built with …” text). - * Default model description and option labels. - * Project modals: “Cancel”, “Save Changes”, etc. - * Any toasts, confirmation messages, help texts, or labels still in English. - - For each identified string: - * Define a stable translation key (e.g. "themePreview.sampleUser1", - "settings.defaultModelDescription", "projectModal.cancel", "projectModal.saveChanges"). - * Add this key to both en.json and fr.json. - - - - - Replace each hardcoded string with a call to the translation function, for example: - BEFORE: -

Hello! Can you help me with something?

- AFTER: -

{t('themePreview.sampleUser1')}

- - Ensure that: - * The component (or function) imports useLanguage. - * const { t } = useLanguage() is declared in the correct scope. - - Apply this systematically across: - * Settings / General and Appearance sections. - * Theme preview component. - * Project-related modals. - * Any remaining banners, tooltips, or messages defined inside App.jsx. -
- - - - Update translations/en.json: - * Add all new keys with natural English text. - - Update translations/fr.json: - * Add the same keys with accurate French translations. - - Goal: - * For every key used in code, both en.json and fr.json must contain a value. - - - - - Keep existing fallback behavior in LanguageContext: - * If a key is missing in the current language, fall back to English. - * If the key is also missing in English, return the key and log a warning. - - However, after this completion spec is implemented: - * No fallback warnings should appear in normal operation, because all keys are defined. - - - - - In the Settings > General tab: - * The language section heading must be translated via t('settings.language'). - * Any helper text/description for the language selector must also use t('...'). - * The select's value is bound to the language from useLanguage. - * The onChange handler calls setLanguage(newLanguageCode). - - Expected behavior: - * Switching to French instantly updates the UI and saves "fr" in localStorage. - * Switching to English instantly updates the UI and saves "en" in localStorage. - -
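The fallback behavior described above (current language, then English, then the key itself with a warning) might look something like the sketch below. It assumes flat dictionaries keyed by the full dotted string (for example "settings.language"); the actual layout of en.json and fr.json is not shown in the spec, and `createTranslator` and its `warn` parameter are hypothetical names used only for illustration.

```javascript
// Sketch only: createTranslator is a hypothetical helper, and the flat
// "dotted key" dictionary layout is an assumption about en.json / fr.json.
function createTranslator(translations, language, warn = console.warn) {
  const has = (dict, key) => Object.prototype.hasOwnProperty.call(dict, key);

  return function t(key) {
    const current = translations[language] || {};
    if (has(current, key)) return current[key];

    // Missing in the current language: fall back to English.
    const english = translations.en || {};
    if (has(english, key)) return english[key];

    // Missing in English too: return the key and log a warning.
    warn(`Missing translation key: ${key}`);
    return key;
  };
}
```

Once every key exists in both files, the warning branch should never run in normal operation, which is the state the spec asks for.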
- - - - 1. Clear the language preference from localStorage. - 2. Load the application: - - Confirm that the UI is initially in French (FR as default). - 3. Open the Settings modal and navigate to the General tab. - - Verify that the language selector shows only "Français" and "English". - 4. Switch to English: - - Verify that Sidebar, Settings, Welcome screen, Chat area, and modals are all in English. - 5. Refresh the page: - - Confirm that the UI stays in English (preference persisted). - 6. Switch back to French and repeat quick checks to confirm all UI text is in French. - - - - - Check in both languages: - * Main/empty state (welcome screen). - * Chat area (input placeholder, send/stop/regenerate buttons). - * Sidebar (navigation sections, search placeholder, pinned/archived labels). - * Settings (all tabs). - * Project creation and edit modals. - * Delete/confirmation dialogs and any share/export flows. - - Confirm: - * In French, there is no remaining English UI text. - * In English, there is no accidental French UI text. - - - - - Verify: - * Chat behavior is unchanged except for translated labels/text. - * Project operations (create/update/delete) still work. - * No new console errors appear when switching languages or reloading. - - - - - - "app_spec_language_selection.txt" remains the original base spec. - - This completion spec ("app_spec_language_selection.completion.txt") is fully implemented. - - French is used as default language when no preference exists. - - Only English and French are presented in the language selector. - - All user-facing strings in App.jsx go through t('key') and exist in both en.json and fr.json. - - No stray English text is visible when the French language is selected. - -
\ No newline at end of file diff --git a/prompts/app_spec_language_selection.txt b/prompts/app_spec_language_selection.txt deleted file mode 100644 index 7c5f689..0000000 --- a/prompts/app_spec_language_selection.txt +++ /dev/null @@ -1,525 +0,0 @@ - - Claude.ai Clone - Language Selection Bug Fix - - - This specification fixes a bug in the language selection functionality. The feature was - originally planned in the initial app_spec.txt (line 127: "Language preferences") and a UI - component already exists in the settings panel (App.jsx lines 1412-1419), but the functionality - is incomplete and non-functional. - - Currently, there is a language selector dropdown in the settings with options for English, - Español, Français, Deutsch, and 日本語, but it lacks: - - State management for the selected language - - Event handlers (onChange) to handle language changes - - A translation system (i18n) - - Translation files (en.json, fr.json, etc.) - - Language context/provider - - Persistence of language preference - - This bug fix will complete the implementation by adding the missing functionality so that when - a language is selected, the entire interface updates immediately to display all text in the - chosen language. The language preference should persist across sessions. - - Focus will be on English (default) and French as the primary languages, with the existing - UI supporting additional languages for future expansion. 
- - - - - Location: src/App.jsx, lines 1412-1419 - Component: Language selector dropdown in settings panel (General/Preferences section) - Current options: English (en), Español (es), Français (fr), Deutsch (de), 日本語 (ja) - Status: UI exists but is non-functional (no onChange handler, no state, no translations) - - - - Original spec: prompts/app_spec.txt, line 127 - Mentioned as: "Language preferences" in settings_preferences section - Status: Feature was planned but not fully implemented - - - - - - - DO NOT remove or modify the existing language selector UI (lines 1412-1419 in App.jsx) - - DO NOT break existing functionality when language is changed - - English must remain the default language - - Language changes should apply immediately without page refresh - - All translations must be complete (no missing translations) - - Maintain backward compatibility with existing code - - Language preference should be stored and persist across sessions - - Keep the existing dropdown structure and styling - - Connect the existing select element to the new translation system - - - - - - Fix Language Selection Functionality - - Complete the implementation of the existing language selector in the settings menu. - The UI already exists (App.jsx lines 1412-1419) but needs to be made functional. - - The fix should: - - Connect the existing select element to state management - - Add onChange handler to the existing select element - - Display current selected language (load from localStorage on mount) - - Apply language changes immediately to the entire interface - - Save language preference to localStorage - - Persist language choice across sessions - - The existing selector is already in the correct location (General/Preferences section - of settings panel) and has the correct styling, so only the functionality needs to be added. 
- - 1 - bug_fix - completion_of_existing_feature - - - Keep the existing select element in App.jsx (lines 1412-1419) - - Add useState hook to manage selected language state - - Add value prop to select element (bound to state) - - Add onChange handler to select element - - Load language preference from localStorage on component mount - - Save language preference to localStorage on change - - Create translation files/dictionaries for each language - - Implement language context/provider to manage current language - - Create translation utility function to retrieve translated strings - - Update all hardcoded text to use translation function - - Apply language changes reactively throughout the application - - - 1. Open settings menu - 2. Navigate to "General" or "Preferences" section - 3. Locate the existing "Language" selector (should already be visible) - 4. Verify the select element now has a value bound to state (not empty) - 5. Verify default language is "English" (en) on first load - 6. Select "Français" (fr) from the existing language dropdown - 7. Verify onChange handler fires and updates state - 8. Verify entire interface updates immediately to French - 9. Check that all UI elements are translated (buttons, labels, menus) - 10. Navigate to different pages and verify translations persist - 11. Refresh the page and verify language preference is maintained (loaded from localStorage) - 12. Switch back to "English" and verify interface returns to English - 13. Test with new conversations and verify messages/UI are in selected language - 14. 
Verify the existing select element styling and structure remain unchanged - - - - - Translation System Infrastructure - - Implement a translation system that: - - Stores translations for English and French - - Provides a translation function/utility to retrieve translated strings - - Supports dynamic language switching - - Handles missing translations gracefully (fallback to English) - - Organizes translations by feature/component - - Translation keys should be organized logically: - - Common UI elements (buttons, labels, placeholders) - - Settings panel - - Chat interface - - Navigation menus - - Error messages - - Success messages - - Tooltips and help text - - 1 - infrastructure - new_implementation - - - Create translation files (JSON or JS objects): - * translations/en.json (English) - * translations/fr.json (French) - - Create translation context/provider (React Context) - - Create useTranslation hook for components - - Create translation utility function (t() or translate()) - - Organize translations by namespace/feature - - Implement fallback mechanism for missing translations - - Ensure type safety for translation keys (TypeScript if applicable) - - - 1. Verify translation files exist for both languages - 2. Test translation function with valid keys - 3. Test translation function with invalid keys (should fallback) - 4. Verify all translation keys have values in both languages - 5. Test language switching updates all components - 6. 
Verify no console errors when switching languages - - - - - Complete UI Translation Coverage - - Translate all user-facing text in the application to support both English and French: - - Navigation & Menus: - - Sidebar navigation items - - Menu labels - - Breadcrumbs - - Chat Interface: - - Input placeholder text - - Send button - - Message status indicators - - Empty state messages - - Loading states - - Settings: - - All setting section titles - - Setting option labels - - Setting descriptions - - Save/Cancel buttons - - Buttons & Actions: - - Primary action buttons - - Secondary buttons - - Delete/Remove actions - - Edit actions - - Save actions - - Messages & Notifications: - - Success messages - - Error messages - - Warning messages - - Info messages - - Forms: - - Form labels - - Input placeholders - - Validation messages - - Help text - - Modals & Dialogs: - - Modal titles - - Modal content - - Confirmation dialogs - - Cancel/Confirm buttons - - 1 - ui - translation_implementation - - - Audit all hardcoded text in the application - - Replace all hardcoded strings with translation function calls - - Create translation keys for each text element - - Add French translations for all keys - - Test each screen/page to ensure complete translation coverage - - Verify no English text remains when French is selected - - - 1. Set language to French - 2. Navigate through all pages/screens - 3. Verify every text element is translated - 4. Check all buttons, labels, placeholders - 5. Test all modals and dialogs - 6. Verify form validation messages - 7. Check error and success notifications - 8. Verify no English text appears when French is selected - 9. 
Repeat test with English to ensure nothing broke - - - - - Language Preference Persistence - - Ensure that the selected language preference is saved and persists across: - - Page refreshes - - Browser sessions - - Tab closures - - Application restarts - - The language preference should be: - - Stored in localStorage (client-side) or backend user preferences - - Loaded on application startup - - Applied immediately when the app loads - - Synchronized if user is logged in (multi-device support) - - 1 - persistence - bug_fix - - - Save language selection to localStorage on change - - Load language preference on app initialization - - Apply saved language before rendering UI - - Optionally sync with backend user preferences if available - - Handle case where no preference is saved (default to English) - - - 1. Select French language - 2. Refresh the page - 3. Verify interface is still in French - 4. Close browser tab and reopen - 5. Verify language preference persists - 6. Clear localStorage and verify defaults to English - 7. Select language again and verify it saves - - - - - - - Location: src/App.jsx, lines 1412-1419 - Current code: - ```jsx -
-

Language

- -
- ``` - - Required changes: - - Add value={language} to select element - - Add onChange={(e) => setLanguage(e.target.value)} to select element - - Add useState for language state - - Load from localStorage on mount - - Save to localStorage on change -
- - - frontend/ - src/ - App.jsx # UPDATE: Add language state and connect existing select - components/ - LanguageSelector.jsx # Optional: Extract to component if needed (NEW) - contexts/ - LanguageContext.jsx # Language context provider (NEW) - hooks/ - useLanguage.js # Hook to access language and translations (NEW) - utils/ - translations.js # Translation utility functions (NEW) - translations/ - en.json # English translations (NEW) - fr.json # French translations (NEW) - - - - Translation files should be organized by feature/namespace: - { - "common": { - "save": "Save", - "cancel": "Cancel", - "delete": "Delete", - "edit": "Edit", - ... - }, - "settings": { - "title": "Settings", - "language": "Language", - "theme": "Theme", - ... - }, - "chat": { - "placeholder": "Message Claude...", - "send": "Send", - ... - }, - ... - } - - - - Store language preference in: - - localStorage key: "app_language" (value: "en" or "fr") - - Or backend user preferences if available: - { - language: "en" | "fr" - } - - Default value: "en" (English) - - - - Example implementation: - - useTranslation() hook returns { t, language, setLanguage } - - t(key) function retrieves translation for current language - - t("common.save") returns "Save" (en) or "Enregistrer" (fr) - - Supports nested keys: t("settings.general.title") - - Falls back to English if translation missing - - - - - Keep all existing functionality intact - - Default to English if no language preference set - - Gracefully handle missing translations (fallback to English) - - Ensure language changes don't cause re-renders that break functionality - - Test thoroughly to ensure no English text remains when French is selected - - Maintain code readability with clear translation key naming - -
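The translation-function notes above (nested keys, English fallback, the "app_language" storage key) can be sketched in plain JavaScript as follows. This is a minimal illustration, not the project's implementation: the in-memory `translations` object stands in for `translations/en.json` and `translations/fr.json`, the helper names `lookup`, `createTranslator`, and `loadLanguage` are hypothetical, and returning the key itself when even English has no entry is an assumption the spec does not state.

```javascript
// Hypothetical in-memory dictionaries; the spec would load these from
// translations/en.json and translations/fr.json.
const translations = {
  en: {
    common: { save: "Save", cancel: "Cancel", delete: "Delete" },
    settings: { title: "Settings", language: "Language", theme: "Theme" },
  },
  fr: {
    // "delete" is left out on purpose to demonstrate the English fallback.
    common: { save: "Enregistrer", cancel: "Annuler" },
    settings: { title: "Paramètres", language: "Langue", theme: "Thème" },
  },
};

const DEFAULT_LANGUAGE = "en";

// Own-property check that ignores inherited keys such as "toString".
const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Walk a dotted key such as "settings.title" through a nested dictionary.
function lookup(dictionary, key) {
  let node = dictionary;
  for (const part of key.split(".")) {
    if (node === null || typeof node !== "object" || !has(node, part)) {
      return undefined;
    }
    node = node[part];
  }
  return typeof node === "string" ? node : undefined;
}

// Build a t() function bound to one language. Missing entries fall back to
// English, then to the key itself (the last step is an assumption).
function createTranslator(language) {
  const active = has(translations, language)
    ? translations[language]
    : translations[DEFAULT_LANGUAGE];
  return function t(key) {
    return lookup(active, key) ?? lookup(translations[DEFAULT_LANGUAGE], key) ?? key;
  };
}

// Read the saved preference from any object exposing getItem(), such as
// window.localStorage; missing or unsupported values default to English.
function loadLanguage(storage) {
  const saved = storage.getItem("app_language");
  return has(translations, saved) ? saved : DEFAULT_LANGUAGE;
}
```

In a React app, `createTranslator(loadLanguage(window.localStorage))` would typically be wrapped in the context provider so that components re-render when the language changes; passing the storage object in as a parameter keeps the helper testable outside the browser.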
- - - - Language selector component in settings (ALREADY EXISTS - needs functionality) - Settings > General/Preferences section (App.jsx lines 1412-1419) - - - UI exists with dropdown/select element - - Has 5 language options: English (en), Español (es), Français (fr), Deutsch (de), 日本語 (ja) - - Styling is already correct - - Missing: value binding, onChange handler, state management - - - - Add value prop bound to language state - - Add onChange handler to update language state - - Connect to translation system - - Add persistence (localStorage) - - - - Keep existing dropdown/select element (no UI changes needed) - - Shows current selection (via value prop) - - Updates interface immediately on change (via onChange) - - - - - - - All text visible to users must be translated: - - Navigation menu items - - Page titles and headers - - Button labels - - Form labels and placeholders - - Input field labels - - Error messages - - Success messages - - Tooltips - - Help text - - Modal titles and content - - Dialog confirmations - - Empty states - - Loading states - - Settings labels and descriptions - - Chat interface elements - - - - English -> French: - - "Settings" -> "Paramètres" - - "Save" -> "Enregistrer" - - "Cancel" -> "Annuler" - - "Delete" -> "Supprimer" - - "Language" -> "Langue" - - "Theme" -> "Thème" - - "Send" -> "Envoyer" - - "New Conversation" -> "Nouvelle conversation" - - "Message Claude..." -> "Message à Claude..." - - - - - - If storing language preference in backend: - - GET /api/user/preferences - Get user preferences (includes language) - - PUT /api/user/preferences - Update user preferences (includes language) - - GET /api/user/preferences/language - Get language preference only - - PUT /api/user/preferences/language - Update language preference only - - - - If using localStorage only, no API endpoints needed. - Backend storage is optional but recommended for multi-device sync. 
- - - - - - Language selector must be keyboard navigable - - Language changes must be announced to screen readers - - Translation quality must be accurate (no machine translation errors) - - Text direction should be handled correctly (LTR for both languages) - - Font rendering should support both languages properly - - - - - - Verify existing functionality works in both languages - - Verify language change doesn't break any features - - Test that default language (English) still works as before - - Verify all existing features are accessible in both languages - - - - - Test language selector in settings - - Test immediate language change on selection - - Test language persistence across page refresh - - Test language persistence across browser sessions - - Test all UI elements are translated - - Test translation fallback for missing keys - - Test switching between languages multiple times - - Verify no English text appears when French is selected - - Verify all pages/screens are translated - - - - - Verify all translation keys have values in both languages - - Test translation accuracy (no machine translation errors) - - Verify consistent terminology across the application - - Test special characters and accents in French - - Verify text doesn't overflow UI elements in French (may be longer) - - - - - Test with different browsers (Chrome, Firefox, Safari, Edge) - - Test with different screen sizes (responsive design) - - Test language switching during active conversations - - Test language switching with modals open - - Verify language preference syncs across tabs (if applicable) - - - - - - The language selection feature was planned in the original specification (app_spec.txt line 127) - and a UI component was created (App.jsx lines 1412-1419), but the implementation is incomplete. - The select dropdown exists but has no functionality - it lacks state management, event handlers, - and a translation system. 
- - - - This is a bug fix that completes the existing feature by: - 1. Connecting the existing UI to state management - 2. Adding the missing translation system - 3. Implementing language persistence - 4. Translating all UI text to support English and French - - - - DO NOT remove or significantly modify the existing language selector UI. Only add the - missing functionality to make it work. - - - - - - - Users can select language from the existing settings dropdown (English or French) - - Language changes apply immediately to entire interface - - Language preference persists across sessions - - All UI elements are translated when language is changed - - English remains the default language - - No functionality is broken by language changes - - The existing select element in App.jsx (lines 1412-1419) is now functional - - - - - Language selector is easy to find in settings - - Language change is instant and smooth - - All text is properly translated (no English text in French mode) - - Translations are accurate and natural - - Interface layout works well with both languages - - - - - Translation system is well-organized and maintainable - - Translation keys are logically structured - - Language preference is stored reliably - - No performance degradation with language switching - - Code is clean and follows existing patterns - - Easy to add more languages in the future - - -
\ No newline at end of file diff --git a/prompts/app_spec_library_rag_types_docs.txt b/prompts/app_spec_library_rag_types_docs.txt new file mode 100644 index 0000000..0fe4fa6 --- /dev/null +++ b/prompts/app_spec_library_rag_types_docs.txt @@ -0,0 +1,679 @@ + + Library RAG - Type Safety & Documentation Enhancement + + + Enhance the Library RAG application (philosophical texts indexing and semantic search) by adding + strict type annotations and comprehensive Google-style docstrings to all Python modules. This will + improve code maintainability, enable static type checking with mypy, and provide clear documentation + for all functions, classes, and modules. + + The application is a RAG pipeline that processes PDF documents through OCR, LLM-based extraction, + semantic chunking, and ingestion into Weaviate vector database. It includes a Flask web interface + for document upload, processing, and semantic search. + + + + + Python 3.10+ + Flask 3.0 + Weaviate 1.34.4 with text2vec-transformers + Mistral OCR API + Ollama (local) or Mistral API + mypy with strict configuration + + + Docker Compose (Weaviate + transformers) + weaviate-client, flask, mistralai, python-dotenv + + + + + + - flask_app.py: Main Flask application (640 lines) + - schema.py: Weaviate schema definition (383 lines) + - utils/: 16+ modules for PDF processing pipeline + - pdf_pipeline.py: Main orchestration (879 lines) + - mistral_client.py: OCR API client + - ocr_processor.py: OCR processing + - markdown_builder.py: Markdown generation + - llm_metadata.py: Metadata extraction via LLM + - llm_toc.py: Table of contents extraction + - llm_classifier.py: Section classification + - llm_chunker.py: Semantic chunking + - llm_cleaner.py: Chunk cleaning + - llm_validator.py: Document validation + - weaviate_ingest.py: Database ingestion + - hierarchy_parser.py: Document hierarchy parsing + - image_extractor.py: Image extraction from PDFs + - toc_extractor*.py: Various TOC extraction methods + - templates/: 
Jinja2 templates for Flask UI + - tests/utils2/: Minimal test coverage (3 test files) + + + + - Inconsistent type annotations across modules (some have partial types, many have none) + - Missing or incomplete docstrings (no Google-style format) + - No mypy configuration for strict type checking + - Type hints missing on function parameters and return values + - Dict[str, Any] used extensively without proper typing + - No type stubs for complex nested structures + + + + + + + - Add complete type annotations to ALL functions and methods + - Use proper generic types (List, Dict, Optional, Union) from typing module + - Add TypedDict for complex dictionary structures + - Add Protocol types for duck-typed interfaces + - Use Literal types for string constants + - Add ParamSpec and TypeVar where appropriate + - Type all class attributes and instance variables + - Add type annotations to lambda functions where possible + + + + - Create mypy.ini with strict configuration + - Enable: check_untyped_defs, disallow_untyped_defs, disallow_incomplete_defs + - Enable: disallow_untyped_calls, disallow_untyped_decorators + - Enable: warn_return_any, warn_redundant_casts + - Enable: strict_equality, strict_optional + - Set python_version to 3.10 + - Configure per-module overrides if needed for gradual migration + + + + - Create TypedDict definitions for common data structures: + - OCR response structures + - Metadata dictionaries + - TOC entries + - Chunk objects + - Weaviate objects + - Pipeline results + - Add NewType for semantic type safety (DocumentName, ChunkId, etc.) 
+ - Create Protocol types for callback functions + + + + - pdf_pipeline.py: Type all 10 pipeline steps, callbacks, result dictionaries + - flask_app.py: Type all route handlers, request/response types + - schema.py: Type Weaviate configuration objects + - llm_*.py: Type LLM request/response structures + - mistral_client.py: Type API client methods and responses + - weaviate_ingest.py: Type ingestion functions and batch operations + + + + + + - Add comprehensive Google-style docstrings to ALL: + - Module-level docstrings explaining purpose and usage + - Class docstrings with Attributes section + - Function/method docstrings with Args, Returns, Raises sections + - Complex algorithm explanations with Examples section + - Include code examples for public APIs + - Document all exceptions that can be raised + - Add Notes section for important implementation details + - Add See Also section for related functions + + + + + - pdf_pipeline.py: Document the 10-step pipeline, each step's purpose + - mistral_client.py: Document OCR API usage, cost calculation + - llm_metadata.py: Document metadata extraction logic + - llm_toc.py: Document TOC extraction strategies + - llm_classifier.py: Document section classification types + - llm_chunker.py: Document semantic vs basic chunking + - llm_cleaner.py: Document cleaning rules and validation + - llm_validator.py: Document validation criteria + - weaviate_ingest.py: Document ingestion process, nested objects + - hierarchy_parser.py: Document hierarchy building algorithm + + + + - Document all routes with request/response examples + - Document SSE (Server-Sent Events) implementation + - Document Weaviate query patterns + - Document upload processing workflow + - Document background job management + + + + - Document Weaviate schema design decisions + - Document each collection's purpose and relationships + - Document nested object structure + - Document vectorization strategy + + + + + - Add inline comments for complex logic only 
(don't over-comment) + - Explain WHY not WHAT (code should be self-documenting) + - Document performance considerations + - Document cost implications (OCR, LLM API calls) + - Document error handling strategies + + + + + + - All modules must pass mypy --strict + - No # type: ignore comments without justification + - CI/CD should run mypy checks + - Type coverage should be 100% + + + + - All public functions must have docstrings + - All docstrings must follow Google style + - Examples should be executable and tested + - Documentation should be clear and concise + + + + + + + Priority 1 (Most used, most complex): + 1. utils/pdf_pipeline.py - Main orchestration + 2. flask_app.py - Web application entry point + 3. utils/weaviate_ingest.py - Database operations + 4. schema.py - Schema definition + + Priority 2 (Core LLM modules): + 5. utils/llm_metadata.py + 6. utils/llm_toc.py + 7. utils/llm_classifier.py + 8. utils/llm_chunker.py + 9. utils/llm_cleaner.py + 10. utils/llm_validator.py + + Priority 3 (OCR and parsing): + 11. utils/mistral_client.py + 12. utils/ocr_processor.py + 13. utils/markdown_builder.py + 14. utils/hierarchy_parser.py + 15. utils/image_extractor.py + + Priority 4 (Supporting modules): + 16. utils/toc_extractor.py + 17. utils/toc_extractor_markdown.py + 18. utils/toc_extractor_visual.py + 19. utils/llm_structurer.py (legacy) + + + + + + Setup Type Checking Infrastructure + + Configure mypy with strict settings and create foundational type definitions + + + - Create mypy.ini configuration file with strict settings + - Add mypy to requirements.txt or dev dependencies + - Create utils/types.py module for common TypedDict definitions + - Define core types: OCRResponse, Metadata, TOCEntry, ChunkData, PipelineResult + - Add NewType definitions for semantic types: DocumentName, ChunkId, SectionPath + - Create Protocol types for callbacks (ProgressCallback, etc.) 
+ - Document type definitions in utils/types.py module docstring + - Test mypy configuration on a single module to verify settings + + + - mypy.ini exists with strict configuration + - utils/types.py contains all foundational types with docstrings + - mypy runs without errors on utils/types.py + - Type definitions are comprehensive and reusable + + + + + Add Types to PDF Pipeline Orchestration + + Add complete type annotations to pdf_pipeline.py (879 lines, most complex module) + + + - Add type annotations to all function signatures in pdf_pipeline.py + - Type the 10-step pipeline: OCR, Markdown, Metadata, TOC, Classify, Chunk, Clean, Enrich, Validate, Weaviate + - Type progress_callback parameter with Protocol or Callable + - Add TypedDict for pipeline options dictionary + - Add TypedDict for pipeline result dictionary structure + - Type all helper functions (extract_document_metadata_legacy, etc.) + - Add proper return types for process_pdf_v2, process_pdf, process_pdf_bytes + - Fix any mypy errors that arise + - Verify mypy --strict passes on pdf_pipeline.py + + + - All functions in pdf_pipeline.py have complete type annotations + - progress_callback is properly typed with Protocol + - All Dict[str, Any] replaced with TypedDict where appropriate + - mypy --strict pdf_pipeline.py passes with zero errors + - No # type: ignore comments (or each one justified if absolutely necessary) + + + + + Add Types to Flask Application + + Add complete type annotations to flask_app.py and type all routes + + + - Add type annotations to all Flask route handlers + - Type request.args, request.form, request.files usage + - Type jsonify() return values + - Type get_weaviate_client context manager + - Type get_collection_stats, get_all_chunks, search_chunks functions + - Add TypedDict for Weaviate query results + - Type background job processing functions (run_processing_job) + - Type SSE generator function (upload_progress) + - Add type hints for template rendering + - Verify mypy --strict passes 
on flask_app.py + + + - All Flask routes have complete type annotations + - Request/response types are clear and documented + - Weaviate query functions are properly typed + - SSE generator is correctly typed + - mypy --strict flask_app.py passes with zero errors + + + + + Add Types to Core LLM Modules + + Add complete type annotations to all LLM processing modules (metadata, TOC, classifier, chunker, cleaner, validator) + + + - llm_metadata.py: Type extract_metadata function, return structure + - llm_toc.py: Type extract_toc function, TOC hierarchy structure + - llm_classifier.py: Type classify_sections, section types (Literal), validation functions + - llm_chunker.py: Type chunk_section_with_llm, chunk objects + - llm_cleaner.py: Type clean_chunk, is_chunk_valid functions + - llm_validator.py: Type validate_document, validation result structure + - Add TypedDict for LLM request/response structures + - Type provider selection ("ollama" | "mistral" as Literal) + - Type model names with Literal or constants + - Verify mypy --strict passes on all llm_*.py modules + + + - All LLM modules have complete type annotations + - Section types use Literal for type safety + - Provider and model parameters are strongly typed + - LLM request/response structures use TypedDict + - mypy --strict passes on all llm_*.py modules with zero errors + + + + + Add Types to Weaviate and Database Modules + + Add complete type annotations to schema.py and weaviate_ingest.py + + + - schema.py: Type Weaviate configuration objects + - schema.py: Type collection property definitions + - weaviate_ingest.py: Type ingest_document function signature + - weaviate_ingest.py: Type delete_document_chunks function + - weaviate_ingest.py: Add TypedDict for Weaviate object structure + - Type batch insertion operations + - Type nested object references (work, document) + - Add proper error types for Weaviate exceptions + - Verify mypy --strict passes on both modules + + + - schema.py has complete type 
annotations for Weaviate config + - weaviate_ingest.py functions are fully typed + - Nested object structures use TypedDict + - Weaviate client operations are properly typed + - mypy --strict passes on both modules with zero errors + + + + + Add Types to OCR and Parsing Modules + + Add complete type annotations to mistral_client.py, ocr_processor.py, markdown_builder.py, hierarchy_parser.py + + + - mistral_client.py: Type create_client, run_ocr, estimate_ocr_cost + - mistral_client.py: Add TypedDict for Mistral API response structures + - ocr_processor.py: Type serialize_ocr_response, OCR object structures + - markdown_builder.py: Type build_markdown, image_writer parameter + - hierarchy_parser.py: Type build_hierarchy, flatten_hierarchy functions + - hierarchy_parser.py: Add TypedDict for hierarchy node structure + - image_extractor.py: Type create_image_writer, image handling + - Verify mypy --strict passes on all modules + + + - All OCR/parsing modules have complete type annotations + - Mistral API structures use TypedDict + - Hierarchy nodes are properly typed + - Image handling functions are typed + - mypy --strict passes on all modules with zero errors + + + + + Add Google-Style Docstrings to Core Modules + + Add comprehensive Google-style docstrings to pdf_pipeline.py, flask_app.py, and weaviate modules + + + - pdf_pipeline.py: Add module docstring explaining the V2 pipeline + - pdf_pipeline.py: Add docstrings to process_pdf_v2 with Args, Returns, Raises sections + - pdf_pipeline.py: Document each of the 10 pipeline steps in comments + - pdf_pipeline.py: Add Examples section showing typical usage + - flask_app.py: Add module docstring explaining Flask application + - flask_app.py: Document all routes with request/response examples + - flask_app.py: Document Weaviate connection management + - schema.py: Add module docstring explaining schema design + - schema.py: Document each collection's purpose and relationships + - weaviate_ingest.py: Document ingestion 
process with examples + - All docstrings must follow Google style format exactly + + + - All core modules have comprehensive module-level docstrings + - All public functions have Google-style docstrings + - Args, Returns, Raises sections are complete and accurate + - Examples are provided for complex functions + - Docstrings explain WHY, not just WHAT + + + + + Add Google-Style Docstrings to LLM Modules + + Add comprehensive Google-style docstrings to all LLM processing modules + + + - llm_metadata.py: Document metadata extraction logic with examples + - llm_toc.py: Document TOC extraction strategies and fallbacks + - llm_classifier.py: Document section types and classification criteria + - llm_chunker.py: Document semantic vs basic chunking approaches + - llm_cleaner.py: Document cleaning rules and validation logic + - llm_validator.py: Document validation criteria and corrections + - Add Examples sections showing input/output for each function + - Document LLM provider differences (Ollama vs Mistral) + - Document cost implications in Notes sections + - All docstrings must follow Google style format exactly + + + - All LLM modules have comprehensive docstrings + - Each function has Args, Returns, Raises sections + - Examples show realistic input/output + - Provider differences are documented + - Cost implications are noted where relevant + + + + + Add Google-Style Docstrings to OCR and Parsing Modules + + Add comprehensive Google-style docstrings to OCR, markdown, hierarchy, and extraction modules + + + - mistral_client.py: Document OCR API usage, cost calculation + - ocr_processor.py: Document OCR response processing + - markdown_builder.py: Document markdown generation strategy + - hierarchy_parser.py: Document hierarchy building algorithm + - image_extractor.py: Document image extraction process + - toc_extractor*.py: Document various TOC extraction methods + - Add Examples sections for complex algorithms + - Document edge cases and error handling + - All 
docstrings must follow Google style format exactly + + + - All OCR/parsing modules have comprehensive docstrings + - Complex algorithms are well explained + - Edge cases are documented + - Error handling is documented + - Examples demonstrate typical usage + + + + + Final Validation and CI Integration + + Verify all type annotations and docstrings, integrate mypy into CI/CD + + + - Run mypy --strict on entire codebase, verify 100% pass rate + - Verify all public functions have docstrings + - Check docstring formatting with pydocstyle or similar tool + - Create GitHub Actions workflow to run mypy on every commit + - Update README.md with type checking instructions + - Update CLAUDE.md with documentation standards + - Create CONTRIBUTING.md with type annotation and docstring guidelines + - Generate API documentation with Sphinx or pdoc + - Fix any remaining mypy errors or missing docstrings + + + - mypy --strict passes on entire codebase with zero errors + - All public functions have Google-style docstrings + - CI/CD runs mypy checks automatically + - Documentation is generated and accessible + - Contributing guidelines document type/docstring requirements + + + + + + + - 100% type coverage across all modules + - mypy --strict passes with zero errors + - No # type: ignore comments without justification + - All Dict[str, Any] replaced with TypedDict where appropriate + - Proper use of generics, protocols, and type variables + - NewType used for semantic type safety + + + + - All modules have comprehensive module-level docstrings + - All public functions/classes have Google-style docstrings + - All docstrings include Args, Returns, Raises sections + - Complex functions include Examples sections + - Cost implications documented in Notes sections + - Error handling clearly documented + - Provider differences (Ollama vs Mistral) documented + + + + - Code is self-documenting with clear variable names + - Inline comments explain WHY, not WHAT + - Complex algorithms are well 
explained + - Performance considerations documented + - Security considerations documented + + + + - IDE autocomplete works perfectly with type hints + - Type errors caught at development time, not runtime + - Documentation is easily accessible in IDE + - API examples are executable and tested + - Contributing guidelines are clear and comprehensive + + + + - Refactoring is safer with type checking + - Function signatures are self-documenting + - API contracts are explicit and enforced + - Breaking changes are caught by type checker + - New developers can understand code quickly + + + + + + - Must maintain backward compatibility with existing code + - Cannot break existing Flask routes or API contracts + - Weaviate schema must remain unchanged + - Existing tests must continue to pass + + + + - Can use per-module mypy configuration for gradual migration + - Can temporarily disable strict checks on legacy modules + - Priority modules must be completed first + - Low-priority modules can be deferred + + + + - All type annotations must use Python 3.10+ syntax + - Docstrings must follow Google style exactly (not NumPy or reStructuredText) + - Use typing module (List, Dict, Optional) until Python 3.9 support dropped + - Use from __future__ import annotations if needed for forward references + + + + + + - Run mypy --strict on each module after adding types + - Use mypy daemon (dmypy) for faster incremental checking + - Add mypy to pre-commit hooks + - CI/CD must run mypy and fail on type errors + + + + - Use pydocstyle to validate Google-style format + - Use sphinx-build to generate docs and catch errors + - Manual review of docstring examples + - Verify examples are executable and correct + + + + - Verify existing tests still pass after type additions + - Add new tests for complex typed structures + - Test mypy configuration on sample code + - Verify IDE autocomplete works correctly + + + + + + ```python + """ + PDF Pipeline V2 - Intelligent document processing with LLM 
enhancement. + + This module orchestrates a 10-step pipeline for processing PDF documents: + 1. OCR via Mistral API + 2. Markdown construction with images + 3. Metadata extraction via LLM + 4. Table of contents (TOC) extraction + 5. Section classification + 6. Semantic chunking + 7. Chunk cleaning and validation + 8. Enrichment with concepts + 9. Validation and corrections + 10. Ingestion into Weaviate vector database + + The pipeline supports multiple LLM providers (Ollama local, Mistral API) and + various processing modes (skip OCR, semantic chunking, OCR annotations). + + Typical usage: + >>> from pathlib import Path + >>> from utils.pdf_pipeline import process_pdf + >>> + >>> result = process_pdf( + ... Path("document.pdf"), + ... use_llm=True, + ... llm_provider="ollama", + ... ingest_to_weaviate=True, + ... ) + >>> print(f"Processed {result['pages']} pages, {result['chunks_count']} chunks") + + See Also: + mistral_client: OCR API client + llm_metadata: Metadata extraction + weaviate_ingest: Database ingestion + """ + ``` + + + + ```python + def process_pdf_v2( + pdf_path: Path, + output_dir: Path = Path("output"), + *, + use_llm: bool = True, + llm_provider: Literal["ollama", "mistral"] = "ollama", + llm_model: Optional[str] = None, + skip_ocr: bool = False, + ingest_to_weaviate: bool = True, + progress_callback: Optional[ProgressCallback] = None, + ) -> PipelineResult: + """ + Process a PDF through the complete V2 pipeline with LLM enhancement. + + This function orchestrates all 10 steps of the intelligent document processing + pipeline, from OCR to Weaviate ingestion. It supports both local (Ollama) and + cloud (Mistral API) LLM providers, with optional caching via skip_ocr. + + Args: + pdf_path: Absolute path to the PDF file to process. + output_dir: Base directory for output files. Defaults to "./output". + use_llm: Enable LLM-based processing (metadata, TOC, chunking). + If False, uses basic heuristic processing. + llm_provider: LLM provider to use. 
"ollama" for local (free but slow), + "mistral" for API (fast but paid). + llm_model: Specific model name. If None, auto-detects based on provider + (qwen2.5:7b for ollama, mistral-small-latest for mistral). + skip_ocr: If True, reuses existing markdown file to avoid OCR cost. + Requires output_dir//.md to exist. + ingest_to_weaviate: If True, ingests chunks into Weaviate after processing. + progress_callback: Optional callback for real-time progress updates. + Called with (step_id, status, detail) for each pipeline step. + + Returns: + Dictionary containing processing results with the following keys: + - success (bool): True if processing completed without errors + - document_name (str): Name of the processed document + - pages (int): Number of pages in the PDF + - chunks_count (int): Number of chunks generated + - cost_ocr (float): OCR cost in euros (0 if skip_ocr=True) + - cost_llm (float): LLM API cost in euros (0 if provider=ollama) + - cost_total (float): Total cost (ocr + llm) + - metadata (dict): Extracted metadata (title, author, etc.) + - toc (list): Hierarchical table of contents + - files (dict): Paths to generated files (markdown, chunks, etc.) + + Raises: + FileNotFoundError: If pdf_path does not exist. + ValueError: If skip_ocr=True but markdown file not found. + RuntimeError: If Weaviate connection fails during ingestion. + + Examples: + Basic usage with Ollama (free): + >>> result = process_pdf_v2( + ... Path("platon_menon.pdf"), + ... llm_provider="ollama" + ... ) + >>> print(f"Cost: {result['cost_total']:.4f}€") + Cost: 0.0270€ # OCR only + + With Mistral API (faster): + >>> result = process_pdf_v2( + ... Path("platon_menon.pdf"), + ... llm_provider="mistral", + ... llm_model="mistral-small-latest" + ... ) + + Skip OCR to avoid cost: + >>> result = process_pdf_v2( + ... Path("platon_menon.pdf"), + ... skip_ocr=True, # Reuses existing markdown + ... ingest_to_weaviate=False + ... 
) + + Notes: + - OCR cost: ~0.003€/page (standard), ~0.009€/page (with annotations) + - LLM cost: Free with Ollama, variable with Mistral API + - Processing time: ~30s/page with Ollama, ~5s/page with Mistral + - Weaviate must be running (docker-compose up -d) before ingestion + """ + ``` + + + diff --git a/prompts/app_spec_mistral_extensible.txt b/prompts/app_spec_mistral_extensible.txt deleted file mode 100644 index 0abcc86..0000000 --- a/prompts/app_spec_mistral_extensible.txt +++ /dev/null @@ -1,448 +0,0 @@ - - Claude.ai Clone - Multi-Provider Support (Mistral + Extensible) - - - This specification adds Mistral AI model support AND creates an extensible provider architecture - that makes it easy to add additional AI providers (OpenAI, Gemini, etc.) in the future. - This uses the "Open/Closed Principle" - open for extension, closed for modification. - - All changes are additive and backward-compatible. Existing Claude functionality remains unchanged. - - - - - - DO NOT modify existing Claude API integration code directly - - DO NOT change existing model selection logic for Claude models - - DO NOT modify existing database schema without safe migrations - - DO NOT break existing conversations or messages - - All new code must be in separate files/modules when possible - - Test thoroughly before marking issues as complete - - Maintain backward compatibility at all times - - Refactor Claude code to use BaseProvider WITHOUT changing functionality - - - - - - Create an abstract provider interface that all AI providers implement: - - BaseProvider (abstract class/interface) - defines common interface - - ClaudeProvider (existing code refactored to extend BaseProvider) - - MistralProvider (new, extends BaseProvider) - - OpenAIProvider (future, extends BaseProvider - easy to add) - - GeminiProvider (future, extends BaseProvider - easy to add) - - - - - Easy to add new providers without modifying existing code - - Consistent interface across all providers - - Isolated 
error handling per provider - - Unified model selection UI - - Shared functionality (streaming, error handling, logging) - - Future-proof architecture - - - - - - Extensible Provider Architecture (Foundation) - - Create a provider abstraction layer that allows easy addition of multiple AI providers. - This is the foundation that makes adding OpenAI, Gemini, etc. trivial in the future. - - BaseProvider abstract class should define: - - sendMessage(messages, options) -> Promise<response> - - streamMessage(messages, options) -> AsyncGenerator<chunk> - - getModels() -> Promise<array> of available models - - validateApiKey(key) -> Promise<boolean> - - getCapabilities() -> object with provider capabilities - - getName() -> string (provider name: 'claude', 'mistral', 'openai', etc.) - - getDefaultModel() -> string (default model ID for this provider) - - ProviderRegistry should: - - Register all available providers - - Provide list of all providers - - Check which providers are configured (have API keys) - - Enable/disable providers - - ProviderFactory should: - - Create provider instances based on model ID or provider name - - Handle provider selection logic - - Route requests to correct provider - - 1 - functional - - - Create server/providers/BaseProvider.js (abstract base class) - - Refactor existing Claude code to server/providers/ClaudeProvider.js (extends BaseProvider) - - Create server/providers/ProviderRegistry.js (manages all providers) - - Create server/providers/ProviderFactory.js (creates provider instances) - - Update existing routes to use ProviderFactory instead of direct Claude calls - - Keep all provider code in server/providers/ directory - - - 1. Verify Claude still works after refactoring to use BaseProvider - 2. Test that ProviderFactory creates ClaudeProvider correctly - 3. Test that ProviderRegistry lists Claude provider - 4. Verify error handling works correctly - 5. Test that adding a mock provider is straightforward - 6. 
Verify no regression in existing Claude functionality - - - - - Mistral Provider Implementation - - Implement MistralProvider extending BaseProvider. This should: - - Implement all BaseProvider abstract methods - - Handle Mistral-specific API calls (https://api.mistral.ai/v1/chat/completions) - - Support Mistral streaming (Server-Sent Events) - - Handle Mistral-specific error codes and messages - - Provide Mistral model list: - * mistral-large-latest (default) - * mistral-medium-latest - * mistral-small-latest - * mistral-7b-instruct - - Manage Mistral API authentication - - Return responses in unified format (same as Claude) - - 2 - functional - - - Create server/providers/MistralProvider.js - - Extend BaseProvider class - - Implement Mistral API integration using fetch or axios - - Register in ProviderRegistry - - Use same response format as ClaudeProvider for consistency - - - 1. Test MistralProvider.sendMessage() works with valid API key - 2. Test MistralProvider.streamMessage() works - 3. Test MistralProvider.getModels() returns correct models - 4. Test error handling for invalid API key - 5. Test error handling for API rate limits - 6. Verify it integrates with ProviderFactory - 7. Verify responses match expected format - - - - - Unified Model Selector (All Providers) - - Update model selector to dynamically load models from all registered providers. - The selector should: - - Query all providers for available models via GET /api/models - - Group models by provider (Claude, Mistral, etc.) - - Display provider badges/icons next to model names - - Show which provider each model belongs to - - Filter models by provider (optional toggle) - - Show provider-specific capabilities (streaming, images, etc.) 
- - Only show models from providers with configured API keys - - Handle providers gracefully (show "Configure API key" if not set) - - 2 - functional - - - Create API endpoint: GET /api/models (returns all models from all providers) - - Update frontend ModelSelector component to handle multiple providers - - Add provider grouping/filtering in UI - - Show provider badges/icons next to model names - - Group models by provider with collapsible sections - - Show provider status (configured/not configured) - - - 1. Verify model selector shows Claude models (existing functionality) - 2. Verify model selector shows Mistral models (if key configured) - 3. Test grouping by provider works - 4. Test filtering by provider works - 5. Verify provider badges display correctly - 6. Test that providers without API keys show "Configure" message - 7. Verify selecting a model works for both providers - - - - - Multi-Provider API Key Management - - Create unified API key management that supports multiple providers. Users should be able to: - - Manage API keys for each provider separately (Claude, Mistral, OpenAI, etc.) - - See which providers are available - - See which providers are configured (have API keys) - - Test each provider's API key independently - - Enable/disable providers (hide models if key not configured) - - See provider status indicators (configured/not configured/error) - - Update or remove API keys for any provider - - See usage statistics per provider - - 2 - functional - - - Create server/routes/providers.js with unified provider management - - Update settings UI to show provider cards (one per provider) - - Each provider card has: - * Provider name and logo/icon - * API key input field (masked) - * "Test Connection" button - * Status indicator (green/yellow/red) - * Enable/disable toggle - - Store keys in api_keys table with key_name = 'claude_api_key', 'mistral_api_key', etc. - - Use same encryption method for all providers - - - 1. 
Configure Claude API key (verify existing functionality still works) - 2. Configure Mistral API key - 3. Verify both keys are stored separately - 4. Test each provider's "Test Connection" button - 5. Remove one key and verify only that provider's models are hidden - 6. Verify provider status indicators update correctly - 7. Test that disabling a provider hides its models - - - - - Database Support for Multiple Providers (Future-Proof) - - Update database schema to support multiple providers in a future-proof way. - This should: - - Add provider field to conversations table (TEXT, default: 'claude') - - Add provider field to messages/usage_tracking (TEXT, default: 'claude') - - Use TEXT field (not ENUM) to allow easy addition of new providers without schema changes - - Migration should be safe, idempotent, and backward compatible - - All existing records default to 'claude' provider - - Add indexes for performance on provider queries - - 1 - functional - - - Create migration: server/migrations/add_provider_support.sql - - Use TEXT field (not ENUM) for provider name (allows 'claude', 'mistral', 'openai', etc.) - - Default all existing records to 'claude' - - Add indexes on provider columns for performance - - Make migration idempotent (can run multiple times safely) - - Create rollback script if needed - - - 1. Backup existing database - 2. Run migration script - 3. Verify all existing conversations have provider='claude' - 4. Verify all existing messages have provider='claude' (via usage_tracking) - 5. Create new conversation with Mistral provider - 6. Verify provider='mistral' is saved correctly - 7. Query conversations by provider (test index performance) - 8. Verify existing Claude conversations still work - 9. Test rollback script if needed - - - - - Unified Chat Endpoint (Works with Any Provider) - - Update chat endpoints to use ProviderFactory, making them work with any provider. 
- The endpoint should: - - Accept provider or model ID in request - - Use ProviderFactory to get correct provider - - Route request to appropriate provider - - Return unified response format - - Handle provider-specific errors gracefully - - Support streaming for all providers that support it - - 1 - functional - - - Update POST /api/chat to use ProviderFactory - - Update POST /api/chat/stream to use ProviderFactory - - Extract provider from model ID or accept provider parameter - - Route to correct provider instance - - Return unified response format - - - 1. Test POST /api/chat with Claude model (verify no regression) - 2. Test POST /api/chat with Mistral model - 3. Test POST /api/chat/stream with Claude (verify streaming still works) - 4. Test POST /api/chat/stream with Mistral - 5. Test error handling for invalid provider - 6. Test error handling for missing API key - - - - - - - How to Add OpenAI in the Future - - To add OpenAI support later, simply follow these steps (NO changes to existing code needed): - - 1. Create server/providers/OpenAIProvider.js extending BaseProvider - 2. Implement OpenAI API calls (https://api.openai.com/v1/chat/completions) - 3. Register in ProviderRegistry: ProviderRegistry.register('openai', OpenAIProvider) - 4. That's it! OpenAI models will automatically appear in model selector. - - Example OpenAIProvider structure: - - Extends BaseProvider - - Implements sendMessage() using OpenAI API - - Implements streamMessage() for streaming support - - Returns models: gpt-4, gpt-3.5-turbo, etc. - - Handles OpenAI-specific authentication and errors - - - - - - Same pattern works for any AI provider: - - Google Gemini (GeminiProvider) - - Cohere (CohereProvider) - - Any other AI API that follows similar patterns - Just create a new Provider class extending BaseProvider and register it. 
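The extension steps above can be sketched roughly as follows. This is a minimal illustration, not the project's code: only the names `BaseProvider`, `OpenAIProvider`, `ProviderRegistry.register()`, and the two example model IDs come from the spec. The `create()` and `list()` helpers, the constructor shape, and the stubbed method bodies are assumptions made for the example.

```javascript
// Hypothetical sketch of the provider pattern; method bodies are stubs.
class BaseProvider {
  getName() {
    throw new Error("getName() not implemented");
  }
  async getModels() {
    throw new Error("getModels() not implemented");
  }
  async sendMessage(messages, options = {}) {
    throw new Error("sendMessage() not implemented");
  }
}

class ProviderRegistry {
  // Maps a provider name to its class (assumed storage layout).
  static providers = new Map();

  static register(name, ProviderClass) {
    // Reject classes that do not extend BaseProvider.
    if (!(ProviderClass.prototype instanceof BaseProvider)) {
      throw new TypeError(`${name} must extend BaseProvider`);
    }
    ProviderRegistry.providers.set(name, ProviderClass);
  }

  // Assumed helper: instantiate a registered provider by name.
  static create(name, config = {}) {
    const ProviderClass = ProviderRegistry.providers.get(name);
    if (!ProviderClass) {
      throw new Error(`Unknown provider: ${name}`);
    }
    return new ProviderClass(config);
  }

  // Assumed helper: names of all registered providers.
  static list() {
    return [...ProviderRegistry.providers.keys()];
  }
}

// A new provider lives in its own class; existing classes stay untouched.
class OpenAIProvider extends BaseProvider {
  constructor({ apiKey } = {}) {
    super();
    this.apiKey = apiKey;
  }
  getName() {
    return "openai";
  }
  async getModels() {
    // Stubbed list; a real provider might query the API instead.
    return ["gpt-4", "gpt-3.5-turbo"];
  }
  // sendMessage() and streamMessage() are left out of this sketch.
}

ProviderRegistry.register("openai", OpenAIProvider);
```

In this sketch, adding a provider touches only the new class and one `register()` call, which is the property the spec is after.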
- - - - - - - server/ - providers/ - BaseProvider.js # Abstract base class (NEW) - ClaudeProvider.js # Refactored Claude (extends BaseProvider) - MistralProvider.js # New Mistral (extends BaseProvider) - ProviderRegistry.js # Manages all providers (NEW) - ProviderFactory.js # Creates provider instances (NEW) - routes/ - providers.js # Unified provider management (NEW) - chat.js # Updated to use ProviderFactory - migrations/ - add_provider_support.sql # Database migration (NEW) - - - - - Refactor Claude code to use BaseProvider WITHOUT changing functionality - - All providers are isolated - errors in one don't affect others - - Database changes are backward compatible (TEXT field, not ENUM) - - Existing conversations default to 'claude' provider - - Test Claude thoroughly after refactoring - - Use feature flags if needed to enable/disable providers - - Log all provider operations separately for debugging - - - - - Each provider handles its own errors - - Provider errors should NOT affect other providers - - Show user-friendly error messages - - Log errors with provider context - - Don't throw unhandled exceptions - - - - - - - Add provider support (TEXT field for extensibility) - - -- Add provider column to conversations (TEXT allows any provider name) - -- Default to 'claude' for backward compatibility - ALTER TABLE conversations - ADD COLUMN provider TEXT DEFAULT 'claude'; - - -- Add provider column to usage_tracking - ALTER TABLE usage_tracking - ADD COLUMN provider TEXT DEFAULT 'claude'; - - -- Add indexes for performance - CREATE INDEX IF NOT EXISTS idx_conversations_provider - ON conversations(provider); - - CREATE INDEX IF NOT EXISTS idx_usage_tracking_provider - ON usage_tracking(provider); - - - -- Rollback script (use with caution - may cause data issues) - DROP INDEX IF EXISTS idx_conversations_provider; - DROP INDEX IF EXISTS idx_usage_tracking_provider; - -- Note: SQLite doesn't support DROP COLUMN easily - -- Would need to recreate table without 
provider column - - - Using TEXT instead of ENUM allows adding new providers (OpenAI, Gemini, etc.) - without database schema changes in the future. This is future-proof. - - - - - - - All existing conversations default to provider='claude' - - All existing messages default to provider='claude' - - Migration is idempotent (can run multiple times safely) - - No data loss during migration - - Existing queries continue to work - - - - - - - GET /api/models - Get all models from all configured providers - - GET /api/providers - Get list of available providers and their status - - POST /api/providers/:provider/key - Set API key for specific provider - - POST /api/providers/:provider/test - Test provider API key - - GET /api/providers/:provider/status - Get provider configuration status - - DELETE /api/providers/:provider/key - Remove provider API key - - - - - POST /api/chat - Updated to use ProviderFactory (works with any provider) - * Accepts: { model: 'model-id', messages: [...], ... } - * Provider is determined from model ID or can be specified - - POST /api/chat/stream - Updated to use ProviderFactory (streaming for any provider) - * Same interface, works with any provider that supports streaming - - - - - - - No new dependencies required (use native fetch for Mistral API) - - Optional: @mistralai/mistralai (only if provides significant value) - - Keep dependencies minimal to avoid conflicts - - - - - - - Verify all existing Claude functionality still works - - Test that existing conversations load correctly - - Verify Claude model selection still works - - Test Claude API endpoints are unaffected - - Verify database queries for Claude still work - - Test Claude streaming still works - - - - - Test switching between Claude and Mistral models - - Test conversations with different providers - - Test error handling doesn't affect other providers - - Test migration doesn't break existing data - - Test ProviderFactory routes correctly - - Test unified model selector 
with multiple providers - - - - - Verify adding a mock provider is straightforward - - Test that ProviderFactory correctly routes to providers - - Verify provider isolation (errors don't propagate) - - Test that new providers automatically appear in UI - - - - - - - Claude functionality works exactly as before (no regression) - - Mistral models appear in selector and work correctly - - Users can switch between Claude and Mistral seamlessly - - API key management works for both providers - - Database migration is safe and backward compatible - - - - - Adding a new provider (like OpenAI) requires only creating one new file - - No changes needed to existing code when adding providers - - Provider architecture is documented and easy to follow - - Code is organized and maintainable - - - diff --git a/prompts/app_spec_model.txt b/prompts/app_spec_model.txt new file mode 100644 index 0000000..1e35f6d --- /dev/null +++ b/prompts/app_spec_model.txt @@ -0,0 +1,681 @@ + + Claude.ai Clone - AI Chat Interface + + + Build a fully functional clone of claude.ai, Anthropic's conversational AI interface. The application should + provide a clean, modern chat interface for interacting with Claude via the API, including features like + conversation management, artifact rendering, project organization, multiple model selection, and advanced + settings. The UI should closely match claude.ai's design using Tailwind CSS with a focus on excellent + user experience and responsive design. + + + + + You can use an API key located at /tmp/api-key for testing. You will not be allowed to read this file, but you can reference it in code. 
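One way to "reference it in code" is to have the server read the key file at runtime, so the key itself never appears in the source. The following is a minimal sketch under that assumption: the `loadApiKey` helper and the `API_KEY_PATH` environment variable are hypothetical names, not part of the spec.

```javascript
const fs = require("node:fs");

// Path from the spec; the API_KEY_PATH override is an assumed convenience.
const DEFAULT_KEY_PATH = process.env.API_KEY_PATH || "/tmp/api-key";

// Hypothetical helper: read the key when the server needs it.
function loadApiKey(path = DEFAULT_KEY_PATH) {
  const key = fs.readFileSync(path, "utf8").trim();
  if (!key) {
    throw new Error(`API key file ${path} is empty`);
  }
  return key;
}

// Usage (not executed here): const apiKey = loadApiKey();
```

Reading the file lazily, rather than at import time, keeps the module loadable in environments where the key file is absent.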
+ + + React with Vite + Tailwind CSS (via CDN) + React hooks and context + React Router for navigation + React Markdown for message rendering + Syntax highlighting for code blocks + Only launch on port {frontend_port} + + + Node.js with Express + SQLite with better-sqlite3 + Claude API for chat completions + Server-Sent Events for streaming responses + + + RESTful endpoints + SSE for real-time message streaming + Integration with Claude API using Anthropic SDK + + + + + + - Repository includes .env with VITE_ANTHROPIC_API_KEY configured + - Frontend dependencies pre-installed via pnpm + - Backend code goes in /server directory + - Install backend dependencies as needed + + + + + + - Clean, centered chat layout with message bubbles + - Streaming message responses with typing indicator + - Markdown rendering with proper formatting + - Code blocks with syntax highlighting and copy button + - LaTeX/math equation rendering + - Image upload and display in messages + - Multi-turn conversations with context + - Message editing and regeneration + - Stop generation button during streaming + - Input field with auto-resize textarea + - Character count and token estimation + - Keyboard shortcuts (Enter to send, Shift+Enter for newline) + + + + - Artifact detection and rendering in side panel + - Code artifact viewer with syntax highlighting + - HTML/SVG preview with live rendering + - React component preview + - Mermaid diagram rendering + - Text document artifacts + - Artifact editing and re-prompting + - Full-screen artifact view + - Download artifact content + - Artifact versioning and history + + + + - Create new conversations + - Conversation list in sidebar + - Rename conversations + - Delete conversations + - Search conversations by title/content + - Pin important conversations + - Archive conversations + - Conversation folders/organization + - Duplicate conversation + - Export conversation (JSON, Markdown, PDF) + - Conversation timestamps (created, last updated) + - 
Unread message indicators + + + + - Create projects to group related conversations + - Project knowledge base (upload documents) + - Project-specific custom instructions + - Share projects with team (mock feature) + - Project settings and configuration + - Move conversations between projects + - Project templates + - Project analytics (usage stats) + + + + - Model selector dropdown with the following models: + - Claude Sonnet 4.5 (claude-sonnet-4-5-20250929) - default + - Claude Haiku 4.5 (claude-haiku-4-5-20251001) + - Claude Opus 4.1 (claude-opus-4-1-20250805) + - Model capabilities display + - Context window indicator + - Model-specific pricing info (display only) + - Switch models mid-conversation + - Model comparison view + + + + - Global custom instructions + - Project-specific custom instructions + - Conversation-specific system prompts + - Custom instruction templates + - Preview how instructions affect responses + + + + - Theme selection (Light, Dark, Auto) + - Font size adjustment + - Message density (compact, comfortable, spacious) + - Code theme selection + - Language preferences + - Accessibility options + - Keyboard shortcuts reference + - Data export options + - Privacy settings + - API key management + + + + - Temperature control slider + - Max tokens adjustment + - Top-p (nucleus sampling) control + - System prompt override + - Thinking/reasoning mode toggle + - Multi-modal input (text + images) + - Voice input (optional, mock UI) + - Response suggestions + - Related prompts + - Conversation branching + + + + - Share conversation via link (read-only) + - Export conversation formats + - Conversation templates + - Prompt library + - Share artifacts + - Team workspaces (mock UI) + + + + - Search across all conversations + - Filter by project, date, model + - Prompt library with categories + - Example conversations + - Quick actions menu + - Command palette (Cmd/Ctrl+K) + + + + - Token usage display per message + - Conversation cost estimation + - 
Daily/monthly usage dashboard + - Usage limits and warnings + - API quota tracking + + + + - Welcome screen for new users + - Feature tour highlights + - Example prompts to get started + - Quick tips and best practices + - Keyboard shortcuts tutorial + + + + - Full keyboard navigation + - Screen reader support + - ARIA labels and roles + - High contrast mode + - Focus management + - Reduced motion support + + + + - Mobile-first responsive layout + - Touch-optimized interface + - Collapsible sidebar on mobile + - Swipe gestures for navigation + - Adaptive artifact display + - Progressive Web App (PWA) support + + + + + + + - id, email, name, avatar_url + - created_at, last_login + - preferences (JSON: theme, font_size, etc.) + - custom_instructions + + + + - id, user_id, name, description, color + - custom_instructions, knowledge_base_path + - created_at, updated_at + - is_archived, is_pinned + + + + - id, user_id, project_id, title + - model, created_at, updated_at, last_message_at + - is_archived, is_pinned, is_deleted + - settings (JSON: temperature, max_tokens, etc.) 
+ - token_count, message_count + + + + - id, conversation_id, role (user/assistant/system) + - content, created_at, edited_at + - tokens, finish_reason + - images (JSON array of image data) + - parent_message_id (for branching) + + + + - id, message_id, conversation_id + - type (code/html/svg/react/mermaid/text) + - title, identifier, language + - content, version + - created_at, updated_at + + + + - id, conversation_id, share_token + - created_at, expires_at, view_count + - is_public + + + + - id, user_id, title, description + - prompt_template, category, tags (JSON) + - is_public, usage_count + - created_at, updated_at + + + + - id, user_id, project_id, name, parent_folder_id + - created_at, position + + + + - id, folder_id, conversation_id + + + + - id, user_id, conversation_id, message_id + - model, input_tokens, output_tokens + - cost_estimate, created_at + + + + - id, user_id, key_name, api_key_hash + - created_at, last_used_at + - is_active + + + + + + + - POST /api/auth/login + - POST /api/auth/logout + - GET /api/auth/me + - PUT /api/auth/profile + + + + - GET /api/conversations + - POST /api/conversations + - GET /api/conversations/:id + - PUT /api/conversations/:id + - DELETE /api/conversations/:id + - POST /api/conversations/:id/duplicate + - POST /api/conversations/:id/export + - PUT /api/conversations/:id/archive + - PUT /api/conversations/:id/pin + - POST /api/conversations/:id/branch + + + + - GET /api/conversations/:id/messages + - POST /api/conversations/:id/messages + - PUT /api/messages/:id + - DELETE /api/messages/:id + - POST /api/messages/:id/regenerate + - GET /api/messages/stream (SSE endpoint) + + + + - GET /api/conversations/:id/artifacts + - GET /api/artifacts/:id + - PUT /api/artifacts/:id + - DELETE /api/artifacts/:id + - POST /api/artifacts/:id/fork + - GET /api/artifacts/:id/versions + + + + - GET /api/projects + - POST /api/projects + - GET /api/projects/:id + - PUT /api/projects/:id + - DELETE /api/projects/:id + - POST 
/api/projects/:id/knowledge + - GET /api/projects/:id/conversations + - PUT /api/projects/:id/settings + + + + - POST /api/conversations/:id/share + - GET /api/share/:token + - DELETE /api/share/:token + - PUT /api/share/:token/settings + + + + - GET /api/prompts/library + - POST /api/prompts/library + - GET /api/prompts/:id + - PUT /api/prompts/:id + - DELETE /api/prompts/:id + - GET /api/prompts/categories + - GET /api/prompts/examples + + + + - GET /api/search/conversations?q=query + - GET /api/search/messages?q=query + - GET /api/search/artifacts?q=query + - GET /api/search/prompts?q=query + + + + - GET /api/folders + - POST /api/folders + - PUT /api/folders/:id + - DELETE /api/folders/:id + - POST /api/folders/:id/items + - DELETE /api/folders/:id/items/:conversationId + + + + - GET /api/usage/daily + - GET /api/usage/monthly + - GET /api/usage/by-model + - GET /api/usage/conversations/:id + + + + - GET /api/settings + - PUT /api/settings + - GET /api/settings/custom-instructions + - PUT /api/settings/custom-instructions + + + + - POST /api/claude/chat (proxy to Claude API) + - POST /api/claude/chat/stream (streaming proxy) + - GET /api/claude/models + - POST /api/claude/images/upload + + + + + + - Three-column layout: sidebar (conversations), main (chat), panel (artifacts) + - Collapsible sidebar with resize handle + - Responsive breakpoints: mobile (single column), tablet (two column), desktop (three column) + - Persistent header with project/model selector + - Bottom input area with send button and options + + + + - New chat button (prominent) + - Project selector dropdown + - Search conversations input + - Conversations list (grouped by date: Today, Yesterday, Previous 7 days, etc.) 
+ - Folder tree view (collapsible) + - Settings gear icon at bottom + - User profile at bottom + + + + - Conversation title (editable inline) + - Model selector badge + - Message history (scrollable) + - Welcome screen for new conversations + - Suggested prompts (empty state) + - Input area with formatting toolbar + - Attachment button for images + - Send button with loading state + - Stop generation button + + + + - Artifact header with title and type badge + - Code editor or preview pane + - Tabs for multiple artifacts + - Full-screen toggle + - Download button + - Edit/Re-prompt button + - Version selector + - Close panel button + + + + - Settings modal (tabbed interface) + - Share conversation modal + - Export options modal + - Project settings modal + - Prompt library modal + - Command palette overlay + - Keyboard shortcuts reference + + + + + + - Primary: Orange/amber accent (#CC785C claude-style) + - Background: White (light mode), Dark gray (#1A1A1A dark mode) + - Surface: Light gray (#F5F5F5 light), Darker gray (#2A2A2A dark) + - Text: Near black (#1A1A1A light), Off-white (#E5E5E5 dark) + - Borders: Light gray (#E5E5E5 light), Dark gray (#404040 dark) + - Code blocks: Monaco editor theme + + + + - Sans-serif system font stack (Inter, SF Pro, Roboto, system-ui) + - Headings: font-semibold + - Body: font-normal, leading-relaxed + - Code: Monospace (JetBrains Mono, Consolas, Monaco) + - Message text: text-base (16px), comfortable line-height + + + + + - User messages: Right-aligned, subtle background + - Assistant messages: Left-aligned, no background + - Markdown formatting with proper spacing + - Inline code with bg-gray-100 background + - Code blocks with syntax highlighting + - Copy button on code blocks + + + + - Primary: Orange/amber background, white text, rounded + - Secondary: Border style with hover fill + - Icon buttons: Square with hover background + - Disabled state: Reduced opacity, no pointer events + + + + - Rounded borders with focus ring + 
- Textarea auto-resize + - Placeholder text in gray + - Error states in red + - Character counter + + + + - Subtle border or shadow + - Rounded corners (8px) + - Padding: p-4 to p-6 + - Hover state: slight shadow increase + + + + + - Smooth transitions (150-300ms) + - Fade in for new messages + - Slide in for sidebar + - Typing indicator animation + - Loading spinner for generation + - Skeleton loaders for content + + + + + + 1. User types message in input field + 2. Optional: Attach images via button + 3. Click send or press Enter + 4. Message appears in chat immediately + 5. Typing indicator shows while waiting + 6. Response streams in word by word + 7. Code blocks render with syntax highlighting + 8. Artifacts detected and rendered in side panel + 9. Message complete, enable regenerate option + + + + 1. Assistant generates artifact in response + 2. Artifact panel slides in from right + 3. Content renders (code with highlighting or live preview) + 4. User can edit artifact inline + 5. "Re-prompt" button to iterate with Claude + 6. Download or copy artifact content + 7. Full-screen mode for detailed work + 8. Close panel to return to chat focus + + + + 1. Click "New Chat" to start fresh conversation + 2. Conversations auto-save with first message + 3. Auto-generate title from first exchange + 4. Click title to rename inline + 5. Drag conversations into folders + 6. Right-click for context menu (pin, archive, delete, export) + 7. Search filters conversations in real-time + 8. 
Click conversation to switch context + + + + + + Setup Project Foundation and Database + + - Initialize Express server with SQLite database + - Set up Claude API client with streaming support + - Create database schema with migrations + - Implement authentication endpoints + - Set up basic CORS and middleware + - Create health check endpoint + + + + + Build Core Chat Interface + + - Create main layout with sidebar and chat area + - Implement message display with markdown rendering + - Add streaming message support with SSE + - Build input area with auto-resize textarea + - Add code block syntax highlighting + - Implement stop generation functionality + - Add typing indicators and loading states + + + + + Conversation Management + + - Create conversation list in sidebar + - Implement new conversation creation + - Add conversation switching + - Build conversation rename functionality + - Implement delete with confirmation + - Add conversation search + - Create conversation grouping by date + + + + + Artifacts System + + - Build artifact detection from Claude responses + - Create artifact rendering panel + - Implement code artifact viewer + - Add HTML/SVG live preview + - Build artifact editing interface + - Add artifact versioning + - Implement full-screen artifact view + + + + + Projects and Organization + + - Create projects CRUD endpoints + - Build project selector UI + - Implement project-specific custom instructions + - Add folder system for conversations + - Create drag-and-drop organization + - Build project settings panel + + + + + Advanced Features + + - Add model selection dropdown + - Implement temperature and parameter controls + - Build image upload functionality + - Create message editing and regeneration + - Add conversation branching + - Implement export functionality + + + + + Settings and Customization + + - Build settings modal with tabs + - Implement theme switching (light/dark) + - Add custom instructions management + - Create keyboard shortcuts 
+ - Build prompt library + - Add usage tracking dashboard + + + + + Sharing and Collaboration + + - Implement conversation sharing with tokens + - Create public share view + - Add export to multiple formats + - Build prompt templates + - Create example conversations + + + + + Polish and Optimization + + - Optimize for mobile responsiveness + - Add command palette (Cmd+K) + - Implement comprehensive keyboard navigation + - Add onboarding flow + - Create accessibility improvements + - Performance optimization and caching + + + + + + + - Streaming chat responses work smoothly + - Artifact detection and rendering accurate + - Conversation management intuitive and reliable + - Project organization clear and useful + - Image upload and display working + - All CRUD operations functional + + + + - Interface matches claude.ai design language + - Responsive on all device sizes + - Smooth animations and transitions + - Fast response times and minimal lag + - Intuitive navigation and workflows + - Clear feedback for all actions + + + + - Clean, maintainable code structure + - Proper error handling throughout + - Secure API key management + - Optimized database queries + - Efficient streaming implementation + - Comprehensive testing coverage + + + + - Consistent with claude.ai visual design + - Beautiful typography and spacing + - Smooth animations and micro-interactions + - Excellent contrast and accessibility + - Professional, polished appearance + - Dark mode fully implemented + + + diff --git a/prompts/app_spec_template.txt b/prompts/app_spec_template.txt deleted file mode 100644 index 677d2db..0000000 --- a/prompts/app_spec_template.txt +++ /dev/null @@ -1,134 +0,0 @@ - - Your Application Name - - - Full description of your application.
Explain: - - What the application does - - Who the target users are - - The main objectives - - The key features in a few sentences - - - - - Note: You can use an API key located at /tmp/api-key for testing. - You will not be able to read this file, but you can reference it in the code. - - - React with Vite - Tailwind CSS (via CDN) - React hooks and context - React Router for navigation - Run only on port {frontend_port} - - - Node.js with Express - SQLite with better-sqlite3 - Integration with the required APIs - Server-Sent Events for real-time responses (if needed) - - - RESTful endpoints - SSE for real-time streaming (if needed) - - - - - - - Repository includes .env with API keys configured - - Frontend dependencies pre-installed via npm/pnpm - - Backend code in the /server directory - - Install backend dependencies as needed - - - - - - - - Feature 1 name - - Detailed description of what this feature does. - Include important technical details, use cases, and - interactions with other parts of the application. - - 1 - frontend - - 1. Test step 1 - describe the action - 2. Test step 2 - describe the action - 3. Test step 3 - verify the expected result - 4. Test step 4 - test error cases - - - - - Feature 2 name - - Description of feature 2... - - 2 - backend - - 1. Test step 1 - 2. Test step 2 - - - - - - - - - - - - - - - - - - - - - - - - diff --git a/prompts/app_spec_theme_customization.txt b/prompts/app_spec_theme_customization.txt deleted file mode 100644 index f95e5d3..0000000 --- a/prompts/app_spec_theme_customization.txt +++ /dev/null @@ -1,403 +0,0 @@ - - Claude.ai Clone - Advanced Theme Customization - - - This specification adds advanced theme customization features to the Claude.ai clone application.
- Users will be able to customize accent colors, font sizes, message spacing, and choose from - preset color themes. All changes are additive and backward-compatible with existing theme functionality. - - The existing light/dark mode toggle remains unchanged and functional. - - - - - - DO NOT modify existing light/dark mode functionality - - DO NOT break existing theme persistence - - DO NOT change existing CSS classes without ensuring backward compatibility - - All new theme options must be optional (defaults should match current behavior) - - Test thoroughly to ensure existing themes still work - - Maintain backward compatibility at all times - - New theme preferences should be stored separately from existing theme settings - - - - - - Advanced Theme Customization - - Add advanced theme customization options. Users should be able to: - - Customize accent colors (beyond just light/dark mode) - - Choose from preset color themes (blue, green, purple, orange) - - Adjust font size globally (small, medium, large) - - Adjust message spacing (compact, comfortable, spacious) - - Preview theme changes before applying - - Save custom theme preferences - - The customization interface should be intuitive and provide real-time preview - of changes before they are applied. All preferences should persist across sessions. - - 3 - style - - - Create a new "Appearance" or "Theme" section in settings - - Add accent color picker with preset options (blue, green, purple, orange) - - Add font size slider/selector (small, medium, large) - - Add message spacing selector (compact, comfortable, spacious) - - Implement preview functionality that shows changes in real-time - - Store theme preferences in localStorage or backend (user preferences) - - Apply theme using CSS custom properties (CSS variables) - - Ensure theme works with both light and dark modes - - - 1. Open settings menu - 2. Navigate to "Appearance" or "Theme" section - 3. Select a different accent color (e.g., green) - 4. 
Verify accent color changes are visible in preview - 5. Adjust font size slider to "large" - 6. Verify font size changes in preview - 7. Adjust message spacing option to "spacious" - 8. Verify spacing changes in preview - 9. Click "Preview" to see changes applied temporarily - 10. Click "Apply" to save changes permanently - 11. Verify changes persist after page refresh - 12. Test with both light and dark mode - 13. Test reset to default theme - 14. Verify existing conversations display correctly with new theme - - - - - Accent Color Customization - - Allow users to customize the accent color used throughout the application. - This includes: - - Primary button colors - - Link colors - - Focus states - - Active states - - Selection highlights - - Progress indicators - - Preset options: - - Blue (default, matches Claude.ai) - - Green - - Purple - - Orange - - Users should be able to see a preview of each color before applying. - - 3 - style - - - Define accent colors as CSS custom properties - - Create color palette for each preset (light and dark variants) - - Add color picker UI component in settings - - Update all accent color usages to use CSS variables - - Ensure colors have proper contrast ratios for accessibility - - Store selected accent color in user preferences - - - 1. Open theme settings - 2. Select "Green" accent color - 3. Verify buttons, links, and highlights use green - 4. Switch to dark mode and verify green accent still works - 5. Test all preset colors (blue, green, purple, orange) - 6. Verify color persists after refresh - 7. Test accessibility (contrast ratios) - - - - - Global Font Size Adjustment - - Allow users to adjust the global font size for better readability. 
- Options: - - Small (12px base) - - Medium (14px base, default) - - Large (16px base) - - Font size should scale proportionally across all text elements: - - Message text - - UI labels - - Input fields - - Buttons - - Sidebar text - - 3 - style - - - Use CSS rem units for all font sizes - - Set base font size on root element - - Create font size presets (small, medium, large) - - Add font size selector in settings - - Store preference in user settings - - Ensure responsive design still works with different font sizes - - - 1. Open theme settings - 2. Select "Small" font size - 3. Verify all text is smaller throughout the app - 4. Select "Large" font size - 5. Verify all text is larger throughout the app - 6. Verify layout doesn't break with different font sizes - 7. Test with long messages to ensure wrapping works - 8. Verify preference persists after refresh - - - - - Message Spacing Customization - - Allow users to adjust the spacing between messages and within message bubbles. - Options: - - Compact: Minimal spacing (for users who prefer dense layouts) - - Comfortable: Default spacing (current behavior) - - Spacious: Increased spacing (for better readability) - - This affects: - - Vertical spacing between messages - - Padding within message bubbles - - Spacing between message elements (avatar, text, timestamp) - - 3 - style - - - Define spacing scale using CSS custom properties - - Create spacing presets (compact, comfortable, spacious) - - Apply spacing to message containers and bubbles - - Add spacing selector in settings - - Store preference in user settings - - Ensure spacing works well with different font sizes - - - 1. Open theme settings - 2. Select "Compact" spacing - 3. Verify messages are closer together - 4. Select "Spacious" spacing - 5. Verify messages have more space between them - 6. Test with long conversations to ensure scrolling works - 7. Verify spacing preference persists after refresh - 8. 
Test with different font sizes to ensure compatibility - - - - - Theme Preview Functionality - - Allow users to preview theme changes before applying them permanently. - The preview should: - - Show a sample conversation with the new theme applied - - Update in real-time as settings are changed - - Allow users to cancel and revert to previous theme - - Show both light and dark mode previews if applicable - - Users should be able to: - - See preview immediately when changing settings - - Click "Apply" to save changes - - Click "Cancel" to discard changes - - Click "Reset" to return to default theme - - 3 - functional - - - Create preview component showing sample conversation - - Apply theme changes temporarily to preview - - Store original theme state for cancel functionality - - Update preview in real-time as settings change - - Only persist changes when "Apply" is clicked - - Show clear visual feedback for preview vs. applied state - - - 1. Open theme settings - 2. Change accent color to green - 3. Verify preview updates immediately - 4. Change font size to large - 5. Verify preview updates with new font size - 6. Click "Cancel" and verify changes are reverted - 7. Make changes again and click "Apply" - 8. Verify changes are saved and applied to actual interface - 9. Test preview with both light and dark mode - - - - - - - frontend/ - components/ - ThemeSettings.jsx # New theme customization UI (NEW) - ThemePreview.jsx # Preview component (NEW) - styles/ - theme-variables.css # CSS custom properties for themes (NEW) - accent-colors.css # Accent color definitions (NEW) - hooks/ - useTheme.js # Updated to handle new theme options - utils/ - themeStorage.js # Theme preference persistence (NEW) - - - - Use CSS custom properties (CSS variables) for all theme values: - - --accent-color-primary - - --accent-color-hover - - --font-size-base - - --message-spacing-vertical - - --message-padding - - This allows easy theme switching without JavaScript manipulation. 
- - - - Store theme preferences in: - - localStorage for client-side persistence - - Or backend user preferences table if available - - Structure: - { - accentColor: 'blue' | 'green' | 'purple' | 'orange', - fontSize: 'small' | 'medium' | 'large', - messageSpacing: 'compact' | 'comfortable' | 'spacious', - theme: 'light' | 'dark' (existing) - } - - - - - Keep existing theme functionality intact - - Default values should match current behavior - - Use feature detection for new theme features - - Gracefully degrade if CSS custom properties not supported - - Test with existing conversations and UI elements - - Ensure accessibility standards are maintained - - - - - - Settings panel for theme customization - - - Accent Color: Radio buttons or color swatches for preset colors - - Font Size: Slider or dropdown (small, medium, large) - - Message Spacing: Radio buttons (compact, comfortable, spacious) - - Preview: Live preview of theme changes - - Actions: Apply, Cancel, Reset buttons - - - - - Preview component showing sample conversation - - - Sample user message - - Sample AI response - - Shows current accent color - - Shows current font size - - Shows current spacing - - Updates in real-time - - - - - - - Define CSS variables for each accent color preset: - --accent-blue: #2563eb; - --accent-green: #10b981; - --accent-purple: #8b5cf6; - --accent-orange: #f59e0b; - - Each should have hover, active, and focus variants. 
- - - - Define base font sizes: - --font-size-small: 0.75rem; (12px) - --font-size-medium: 0.875rem; (14px, default) - --font-size-large: 1rem; (16px) - - - - Define spacing scales: - --spacing-compact: 0.5rem; - --spacing-comfortable: 1rem; (default) - --spacing-spacious: 1.5rem; - - - - - - If storing preferences in backend: - - GET /api/user/preferences - Get user theme preferences - - PUT /api/user/preferences - Update user theme preferences - - GET /api/user/preferences/theme - Get theme preferences only - - - - If using localStorage only, no API endpoints needed. - Backend storage is optional but recommended for multi-device sync. - - - - - - All accent colors must meet WCAG AA contrast ratios (4.5:1 for text) - - Font size changes must not break screen reader compatibility - - Theme settings must be keyboard navigable - - Color choices should not be the only way to convey information - - Provide high contrast mode option if possible - - - - - - Verify existing light/dark mode toggle still works - - Verify existing theme persistence still works - - Test that default theme matches current behavior - - Verify existing conversations display correctly - - Test that all UI elements are styled correctly - - - - - Test each accent color preset - - Test each font size option - - Test each spacing option - - Test theme preview functionality - - Test theme persistence (localStorage/backend) - - Test theme reset to defaults - - Test theme with both light and dark modes - - Test theme changes in real-time - - - - - Test with different browsers (Chrome, Firefox, Safari, Edge) - - Test with different screen sizes (responsive design) - - Test with long conversations - - Test with different message types (text, code, artifacts) - - Test accessibility with screen readers - - - - - - - Users can customize accent colors from preset options - - Users can adjust global font size (small, medium, large) - - Users can adjust message spacing (compact, comfortable, spacious) - - Theme 
preview shows changes in real-time - - Theme preferences persist across sessions - - Existing light/dark mode functionality works unchanged - - All theme options work together harmoniously - - - - - Theme customization is intuitive and easy to use - - Preview provides clear feedback before applying changes - - Changes apply smoothly without flickering - - Settings are easy to find and access - - Reset to defaults is easily accessible - - - - - Code is well-organized and maintainable - - CSS custom properties are used consistently - - Theme preferences are stored reliably - - No performance degradation with theme changes - - Backward compatibility is maintained - - - diff --git a/prompts/app_spec_types_docs.backup.txt b/prompts/app_spec_types_docs.backup.txt new file mode 100644 index 0000000..0fe4fa6 --- /dev/null +++ b/prompts/app_spec_types_docs.backup.txt @@ -0,0 +1,679 @@ + + Library RAG - Type Safety & Documentation Enhancement + + + Enhance the Library RAG application (philosophical texts indexing and semantic search) by adding + strict type annotations and comprehensive Google-style docstrings to all Python modules. This will + improve code maintainability, enable static type checking with mypy, and provide clear documentation + for all functions, classes, and modules. + + The application is a RAG pipeline that processes PDF documents through OCR, LLM-based extraction, + semantic chunking, and ingestion into Weaviate vector database. It includes a Flask web interface + for document upload, processing, and semantic search. 
+ + + + + Python 3.10+ + Flask 3.0 + Weaviate 1.34.4 with text2vec-transformers + Mistral OCR API + Ollama (local) or Mistral API + mypy with strict configuration + + + Docker Compose (Weaviate + transformers) + weaviate-client, flask, mistralai, python-dotenv + + + + + + - flask_app.py: Main Flask application (640 lines) + - schema.py: Weaviate schema definition (383 lines) + - utils/: 16+ modules for PDF processing pipeline + - pdf_pipeline.py: Main orchestration (879 lines) + - mistral_client.py: OCR API client + - ocr_processor.py: OCR processing + - markdown_builder.py: Markdown generation + - llm_metadata.py: Metadata extraction via LLM + - llm_toc.py: Table of contents extraction + - llm_classifier.py: Section classification + - llm_chunker.py: Semantic chunking + - llm_cleaner.py: Chunk cleaning + - llm_validator.py: Document validation + - weaviate_ingest.py: Database ingestion + - hierarchy_parser.py: Document hierarchy parsing + - image_extractor.py: Image extraction from PDFs + - toc_extractor*.py: Various TOC extraction methods + - templates/: Jinja2 templates for Flask UI + - tests/utils2/: Minimal test coverage (3 test files) + + + + - Inconsistent type annotations across modules (some have partial types, many have none) + - Missing or incomplete docstrings (no Google-style format) + - No mypy configuration for strict type checking + - Type hints missing on function parameters and return values + - Dict[str, Any] used extensively without proper typing + - No type stubs for complex nested structures + + + + + + + - Add complete type annotations to ALL functions and methods + - Use proper generic types (List, Dict, Optional, Union) from typing module + - Add TypedDict for complex dictionary structures + - Add Protocol types for duck-typed interfaces + - Use Literal types for string constants + - Add ParamSpec and TypeVar where appropriate + - Type all class attributes and instance variables + - Add type annotations to lambda functions where possible + 
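The type-annotation requirements listed above can be illustrated with a small sketch. It is not taken from the Library RAG code base: the field names, the section labels, and the `make_chunk` and `process_chunks` helpers are assumptions chosen for illustration, and only the names `ChunkData`, `ChunkId`, and `ProgressCallback` echo types this spec asks for elsewhere.

```python
from typing import Literal, NewType, Protocol, TypedDict

# Hypothetical names, fields, and labels below; illustration only.
ChunkId = NewType("ChunkId", str)
SectionType = Literal["main_content", "preface", "bibliography"]


class ChunkData(TypedDict):
    """Shape of one chunk dictionary (the fields are assumed)."""

    chunk_id: ChunkId
    text: str
    section_type: SectionType


class ProgressCallback(Protocol):
    """Any callable that accepts a step name and a completion fraction."""

    def __call__(self, step: str, fraction: float) -> None: ...


def make_chunk(raw_id: str, text: str, section_type: SectionType) -> ChunkData:
    """Build a ChunkData value; a type checker verifies keys and value types."""
    return {
        "chunk_id": ChunkId(raw_id),
        "text": text.strip(),
        "section_type": section_type,
    }


def process_chunks(chunks: list[ChunkData], on_progress: ProgressCallback) -> int:
    """Report progress after each chunk and return the total character count."""
    total = 0
    for index, chunk in enumerate(chunks, start=1):
        total += len(chunk["text"])
        on_progress("chunk", index / len(chunks))
    return total
```

With definitions along these lines, a misspelled key or an unknown section label should be reported by mypy at check time rather than surfacing at runtime, which is the motivation for replacing Dict[str, Any] with TypedDict.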
+ + + - Create mypy.ini with strict configuration + - Enable: check_untyped_defs, disallow_untyped_defs, disallow_incomplete_defs + - Enable: disallow_untyped_calls, disallow_untyped_decorators + - Enable: warn_return_any, warn_redundant_casts + - Enable: strict_equality, strict_optional + - Set python_version to 3.10 + - Configure per-module overrides if needed for gradual migration + + + + - Create TypedDict definitions for common data structures: + - OCR response structures + - Metadata dictionaries + - TOC entries + - Chunk objects + - Weaviate objects + - Pipeline results + - Add NewType for semantic type safety (DocumentName, ChunkId, etc.) + - Create Protocol types for callback functions + + + + - pdf_pipeline.py: Type all 10 pipeline steps, callbacks, result dictionaries + - flask_app.py: Type all route handlers, request/response types + - schema.py: Type Weaviate configuration objects + - llm_*.py: Type LLM request/response structures + - mistral_client.py: Type API client methods and responses + - weaviate_ingest.py: Type ingestion functions and batch operations + + + + + + - Add comprehensive Google-style docstrings to ALL: + - Module-level docstrings explaining purpose and usage + - Class docstrings with Attributes section + - Function/method docstrings with Args, Returns, Raises sections + - Complex algorithm explanations with Examples section + - Include code examples for public APIs + - Document all exceptions that can be raised + - Add Notes section for important implementation details + - Add See Also section for related functions + + + + + - pdf_pipeline.py: Document the 10-step pipeline, each step's purpose + - mistral_client.py: Document OCR API usage, cost calculation + - llm_metadata.py: Document metadata extraction logic + - llm_toc.py: Document TOC extraction strategies + - llm_classifier.py: Document section classification types + - llm_chunker.py: Document semantic vs basic chunking + - llm_cleaner.py: Document cleaning rules and 
validation + - llm_validator.py: Document validation criteria + - weaviate_ingest.py: Document ingestion process, nested objects + - hierarchy_parser.py: Document hierarchy building algorithm + + + + - Document all routes with request/response examples + - Document SSE (Server-Sent Events) implementation + - Document Weaviate query patterns + - Document upload processing workflow + - Document background job management + + + + - Document Weaviate schema design decisions + - Document each collection's purpose and relationships + - Document nested object structure + - Document vectorization strategy + + + + + - Add inline comments for complex logic only (don't over-comment) + - Explain WHY not WHAT (code should be self-documenting) + - Document performance considerations + - Document cost implications (OCR, LLM API calls) + - Document error handling strategies + + + + + + - All modules must pass mypy --strict + - No # type: ignore comments without justification + - CI/CD should run mypy checks + - Type coverage should be 100% + + + + - All public functions must have docstrings + - All docstrings must follow Google style + - Examples should be executable and tested + - Documentation should be clear and concise + + + + + + + Priority 1 (Most used, most complex): + 1. utils/pdf_pipeline.py - Main orchestration + 2. flask_app.py - Web application entry point + 3. utils/weaviate_ingest.py - Database operations + 4. schema.py - Schema definition + + Priority 2 (Core LLM modules): + 5. utils/llm_metadata.py + 6. utils/llm_toc.py + 7. utils/llm_classifier.py + 8. utils/llm_chunker.py + 9. utils/llm_cleaner.py + 10. utils/llm_validator.py + + Priority 3 (OCR and parsing): + 11. utils/mistral_client.py + 12. utils/ocr_processor.py + 13. utils/markdown_builder.py + 14. utils/hierarchy_parser.py + 15. utils/image_extractor.py + + Priority 4 (Supporting modules): + 16. utils/toc_extractor.py + 17. utils/toc_extractor_markdown.py + 18. utils/toc_extractor_visual.py + 19. 
utils/llm_structurer.py (legacy) + + + + + + Setup Type Checking Infrastructure + + Configure mypy with strict settings and create foundational type definitions + + + - Create mypy.ini configuration file with strict settings + - Add mypy to requirements.txt or dev dependencies + - Create utils/types.py module for common TypedDict definitions + - Define core types: OCRResponse, Metadata, TOCEntry, ChunkData, PipelineResult + - Add NewType definitions for semantic types: DocumentName, ChunkId, SectionPath + - Create Protocol types for callbacks (ProgressCallback, etc.) + - Document type definitions in utils/types.py module docstring + - Test mypy configuration on a single module to verify settings + + + - mypy.ini exists with strict configuration + - utils/types.py contains all foundational types with docstrings + - mypy runs without errors on utils/types.py + - Type definitions are comprehensive and reusable + + + + + Add Types to PDF Pipeline Orchestration + + Add complete type annotations to pdf_pipeline.py (879 lines, most complex module) + + + - Add type annotations to all function signatures in pdf_pipeline.py + - Type the 10-step pipeline: OCR, Markdown, Metadata, TOC, Classify, Chunk, Clean, Validate, Weaviate + - Type progress_callback parameter with Protocol or Callable + - Add TypedDict for pipeline options dictionary + - Add TypedDict for pipeline result dictionary structure + - Type all helper functions (extract_document_metadata_legacy, etc.) 
+ - Add proper return types for process_pdf_v2, process_pdf, process_pdf_bytes + - Fix any mypy errors that arise + - Verify mypy --strict passes on pdf_pipeline.py + + + - All functions in pdf_pipeline.py have complete type annotations + - progress_callback is properly typed with Protocol + - All Dict[str, Any] replaced with TypedDict where appropriate + - mypy --strict pdf_pipeline.py passes with zero errors + - No # type: ignore comments (or justified if absolutely necessary) + + + + + Add Types to Flask Application + + Add complete type annotations to flask_app.py and type all routes + + + - Add type annotations to all Flask route handlers + - Type request.args, request.form, request.files usage + - Type jsonify() return values + - Type get_weaviate_client context manager + - Type get_collection_stats, get_all_chunks, search_chunks functions + - Add TypedDict for Weaviate query results + - Type background job processing functions (run_processing_job) + - Type SSE generator function (upload_progress) + - Add type hints for template rendering + - Verify mypy --strict passes on flask_app.py + + + - All Flask routes have complete type annotations + - Request/response types are clear and documented + - Weaviate query functions are properly typed + - SSE generator is correctly typed + - mypy --strict flask_app.py passes with zero errors + + + + + Add Types to Core LLM Modules + + Add complete type annotations to all LLM processing modules (metadata, TOC, classifier, chunker, cleaner, validator) + + + - llm_metadata.py: Type extract_metadata function, return structure + - llm_toc.py: Type extract_toc function, TOC hierarchy structure + - llm_classifier.py: Type classify_sections, section types (Literal), validation functions + - llm_chunker.py: Type chunk_section_with_llm, chunk objects + - llm_cleaner.py: Type clean_chunk, is_chunk_valid functions + - llm_validator.py: Type validate_document, validation result structure + - Add TypedDict for LLM request/response 
structures + - Type provider selection ("ollama" | "mistral" as Literal) + - Type model names with Literal or constants + - Verify mypy --strict passes on all llm_*.py modules + + + - All LLM modules have complete type annotations + - Section types use Literal for type safety + - Provider and model parameters are strongly typed + - LLM request/response structures use TypedDict + - mypy --strict passes on all llm_*.py modules with zero errors + + + + + Add Types to Weaviate and Database Modules + + Add complete type annotations to schema.py and weaviate_ingest.py + + + - schema.py: Type Weaviate configuration objects + - schema.py: Type collection property definitions + - weaviate_ingest.py: Type ingest_document function signature + - weaviate_ingest.py: Type delete_document_chunks function + - weaviate_ingest.py: Add TypedDict for Weaviate object structure + - Type batch insertion operations + - Type nested object references (work, document) + - Add proper error types for Weaviate exceptions + - Verify mypy --strict passes on both modules + + + - schema.py has complete type annotations for Weaviate config + - weaviate_ingest.py functions are fully typed + - Nested object structures use TypedDict + - Weaviate client operations are properly typed + - mypy --strict passes on both modules with zero errors + + + + + Add Types to OCR and Parsing Modules + + Add complete type annotations to mistral_client.py, ocr_processor.py, markdown_builder.py, hierarchy_parser.py + + + - mistral_client.py: Type create_client, run_ocr, estimate_ocr_cost + - mistral_client.py: Add TypedDict for Mistral API response structures + - ocr_processor.py: Type serialize_ocr_response, OCR object structures + - markdown_builder.py: Type build_markdown, image_writer parameter + - hierarchy_parser.py: Type build_hierarchy, flatten_hierarchy functions + - hierarchy_parser.py: Add TypedDict for hierarchy node structure + - image_extractor.py: Type create_image_writer, image handling + - Verify mypy 
--strict passes on all modules + + + - All OCR/parsing modules have complete type annotations + - Mistral API structures use TypedDict + - Hierarchy nodes are properly typed + - Image handling functions are typed + - mypy --strict passes on all modules with zero errors + + + + + Add Google-Style Docstrings to Core Modules + + Add comprehensive Google-style docstrings to pdf_pipeline.py, flask_app.py, and weaviate modules + + + - pdf_pipeline.py: Add module docstring explaining the V2 pipeline + - pdf_pipeline.py: Add docstrings to process_pdf_v2 with Args, Returns, Raises sections + - pdf_pipeline.py: Document each of the 10 pipeline steps in comments + - pdf_pipeline.py: Add Examples section showing typical usage + - flask_app.py: Add module docstring explaining Flask application + - flask_app.py: Document all routes with request/response examples + - flask_app.py: Document Weaviate connection management + - schema.py: Add module docstring explaining schema design + - schema.py: Document each collection's purpose and relationships + - weaviate_ingest.py: Document ingestion process with examples + - All docstrings must follow Google style format exactly + + + - All core modules have comprehensive module-level docstrings + - All public functions have Google-style docstrings + - Args, Returns, Raises sections are complete and accurate + - Examples are provided for complex functions + - Docstrings explain WHY, not just WHAT + + + + + Add Google-Style Docstrings to LLM Modules + + Add comprehensive Google-style docstrings to all LLM processing modules + + + - llm_metadata.py: Document metadata extraction logic with examples + - llm_toc.py: Document TOC extraction strategies and fallbacks + - llm_classifier.py: Document section types and classification criteria + - llm_chunker.py: Document semantic vs basic chunking approaches + - llm_cleaner.py: Document cleaning rules and validation logic + - llm_validator.py: Document validation criteria and corrections + - Add 
Examples sections showing input/output for each function + - Document LLM provider differences (Ollama vs Mistral) + - Document cost implications in Notes sections + - All docstrings must follow Google style format exactly + + + - All LLM modules have comprehensive docstrings + - Each function has Args, Returns, Raises sections + - Examples show realistic input/output + - Provider differences are documented + - Cost implications are noted where relevant + + + + + Add Google-Style Docstrings to OCR and Parsing Modules + + Add comprehensive Google-style docstrings to OCR, markdown, hierarchy, and extraction modules + + + - mistral_client.py: Document OCR API usage, cost calculation + - ocr_processor.py: Document OCR response processing + - markdown_builder.py: Document markdown generation strategy + - hierarchy_parser.py: Document hierarchy building algorithm + - image_extractor.py: Document image extraction process + - toc_extractor*.py: Document various TOC extraction methods + - Add Examples sections for complex algorithms + - Document edge cases and error handling + - All docstrings must follow Google style format exactly + + + - All OCR/parsing modules have comprehensive docstrings + - Complex algorithms are well explained + - Edge cases are documented + - Error handling is documented + - Examples demonstrate typical usage + + + + + Final Validation and CI Integration + + Verify all type annotations and docstrings, integrate mypy into CI/CD + + + - Run mypy --strict on entire codebase, verify 100% pass rate + - Verify all public functions have docstrings + - Check docstring formatting with pydocstyle or similar tool + - Create GitHub Actions workflow to run mypy on every commit + - Update README.md with type checking instructions + - Update CLAUDE.md with documentation standards + - Create CONTRIBUTING.md with type annotation and docstring guidelines + - Generate API documentation with Sphinx or pdoc + - Fix any remaining mypy errors or missing docstrings + + + 
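One of the steps above, verifying that all public functions have docstrings, could be sketched with the standard-library `ast` module. This is a minimal illustration under stated assumptions, not a replacement for pydocstyle: the helper name `find_missing_docstrings` is hypothetical, and "public" is taken to mean any function or class whose name does not start with an underscore.

```python
import ast


def find_missing_docstrings(source: str) -> list[str]:
    """Return sorted names of public functions and classes lacking a docstring."""
    missing: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Assumption: a leading underscore marks a private name.
            if node.name.startswith("_"):
                continue
            if ast.get_docstring(node) is None:
                missing.append(node.name)
    return sorted(missing)
```

A check like this only detects absence; it does not validate Google-style sections such as Args or Returns, so a dedicated docstring linter would still be needed for format checks.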
- mypy --strict passes on entire codebase with zero errors + - All public functions have Google-style docstrings + - CI/CD runs mypy checks automatically + - Documentation is generated and accessible + - Contributing guidelines document type/docstring requirements + + + + + + + - 100% type coverage across all modules + - mypy --strict passes with zero errors + - No # type: ignore comments without justification + - All Dict[str, Any] replaced with TypedDict where appropriate + - Proper use of generics, protocols, and type variables + - NewType used for semantic type safety + + + + - All modules have comprehensive module-level docstrings + - All public functions/classes have Google-style docstrings + - All docstrings include Args, Returns, Raises sections + - Complex functions include Examples sections + - Cost implications documented in Notes sections + - Error handling clearly documented + - Provider differences (Ollama vs Mistral) documented + + + + - Code is self-documenting with clear variable names + - Inline comments explain WHY, not WHAT + - Complex algorithms are well explained + - Performance considerations documented + - Security considerations documented + + + + - IDE autocomplete works perfectly with type hints + - Type errors caught at development time, not runtime + - Documentation is easily accessible in IDE + - API examples are executable and tested + - Contributing guidelines are clear and comprehensive + + + + - Refactoring is safer with type checking + - Function signatures are self-documenting + - API contracts are explicit and enforced + - Breaking changes are caught by type checker + - New developers can understand code quickly + + + + + + - Must maintain backward compatibility with existing code + - Cannot break existing Flask routes or API contracts + - Weaviate schema must remain unchanged + - Existing tests must continue to pass + + + + - Can use per-module mypy configuration for gradual migration + - Can temporarily disable strict checks 
on legacy modules + - Priority modules must be completed first + - Low-priority modules can be deferred + + + + - All type annotations must use Python 3.10+ syntax + - Docstrings must follow Google style exactly (not NumPy or reStructuredText) + - Use typing module (List, Dict, Optional) until Python 3.9 support dropped + - Use from __future__ import annotations if needed for forward references + + + + + + - Run mypy --strict on each module after adding types + - Use mypy daemon (dmypy) for faster incremental checking + - Add mypy to pre-commit hooks + - CI/CD must run mypy and fail on type errors + + + + - Use pydocstyle to validate Google-style format + - Use sphinx-build to generate docs and catch errors + - Manual review of docstring examples + - Verify examples are executable and correct + + + + - Verify existing tests still pass after type additions + - Add new tests for complex typed structures + - Test mypy configuration on sample code + - Verify IDE autocomplete works correctly + + + + + + ```python + """ + PDF Pipeline V2 - Intelligent document processing with LLM enhancement. + + This module orchestrates a 10-step pipeline for processing PDF documents: + 1. OCR via Mistral API + 2. Markdown construction with images + 3. Metadata extraction via LLM + 4. Table of contents (TOC) extraction + 5. Section classification + 6. Semantic chunking + 7. Chunk cleaning and validation + 8. Enrichment with concepts + 9. Validation and corrections + 10. Ingestion into Weaviate vector database + + The pipeline supports multiple LLM providers (Ollama local, Mistral API) and + various processing modes (skip OCR, semantic chunking, OCR annotations). + + Typical usage: + >>> from pathlib import Path + >>> from utils.pdf_pipeline import process_pdf + >>> + >>> result = process_pdf( + ... Path("document.pdf"), + ... use_llm=True, + ... llm_provider="ollama", + ... ingest_to_weaviate=True, + ... 
) + >>> print(f"Processed {result['pages']} pages, {result['chunks_count']} chunks") + + See Also: + mistral_client: OCR API client + llm_metadata: Metadata extraction + weaviate_ingest: Database ingestion + """ + ``` + + + + ```python + def process_pdf_v2( + pdf_path: Path, + output_dir: Path = Path("output"), + *, + use_llm: bool = True, + llm_provider: Literal["ollama", "mistral"] = "ollama", + llm_model: Optional[str] = None, + skip_ocr: bool = False, + ingest_to_weaviate: bool = True, + progress_callback: Optional[ProgressCallback] = None, + ) -> PipelineResult: + """ + Process a PDF through the complete V2 pipeline with LLM enhancement. + + This function orchestrates all 10 steps of the intelligent document processing + pipeline, from OCR to Weaviate ingestion. It supports both local (Ollama) and + cloud (Mistral API) LLM providers, with optional caching via skip_ocr. + + Args: + pdf_path: Absolute path to the PDF file to process. + output_dir: Base directory for output files. Defaults to "./output". + use_llm: Enable LLM-based processing (metadata, TOC, chunking). + If False, uses basic heuristic processing. + llm_provider: LLM provider to use. "ollama" for local (free but slow), + "mistral" for API (fast but paid). + llm_model: Specific model name. If None, auto-detects based on provider + (qwen2.5:7b for ollama, mistral-small-latest for mistral). + skip_ocr: If True, reuses existing markdown file to avoid OCR cost. + Requires output_dir//.md to exist. + ingest_to_weaviate: If True, ingests chunks into Weaviate after processing. + progress_callback: Optional callback for real-time progress updates. + Called with (step_id, status, detail) for each pipeline step. 
+ + Returns: + Dictionary containing processing results with the following keys: + - success (bool): True if processing completed without errors + - document_name (str): Name of the processed document + - pages (int): Number of pages in the PDF + - chunks_count (int): Number of chunks generated + - cost_ocr (float): OCR cost in euros (0 if skip_ocr=True) + - cost_llm (float): LLM API cost in euros (0 if provider=ollama) + - cost_total (float): Total cost (ocr + llm) + - metadata (dict): Extracted metadata (title, author, etc.) + - toc (list): Hierarchical table of contents + - files (dict): Paths to generated files (markdown, chunks, etc.) + + Raises: + FileNotFoundError: If pdf_path does not exist. + ValueError: If skip_ocr=True but markdown file not found. + RuntimeError: If Weaviate connection fails during ingestion. + + Examples: + Basic usage with Ollama (free): + >>> result = process_pdf_v2( + ... Path("platon_menon.pdf"), + ... llm_provider="ollama" + ... ) + >>> print(f"Cost: {result['cost_total']:.4f}€") + Cost: 0.0270€ # OCR only + + With Mistral API (faster): + >>> result = process_pdf_v2( + ... Path("platon_menon.pdf"), + ... llm_provider="mistral", + ... llm_model="mistral-small-latest" + ... ) + + Skip OCR to avoid cost: + >>> result = process_pdf_v2( + ... Path("platon_menon.pdf"), + ... skip_ocr=True, # Reuses existing markdown + ... ingest_to_weaviate=False + ... 
) + + Notes: + - OCR cost: ~0.003€/page (standard), ~0.009€/page (with annotations) + - LLM cost: Free with Ollama, variable with Mistral API + - Processing time: ~30s/page with Ollama, ~5s/page with Mistral + - Weaviate must be running (docker-compose up -d) before ingestion + """ + ``` + + + diff --git a/prompts/coding_prompt_library.md b/prompts/coding_prompt_library.md new file mode 100644 index 0000000..0f628a3 --- /dev/null +++ b/prompts/coding_prompt_library.md @@ -0,0 +1,290 @@ +## YOUR ROLE - CODING AGENT (Library RAG - Type Safety & Documentation) + +You are working on adding strict type annotations and Google-style docstrings to a Python library project. +This is a FRESH context window - you have no memory of previous sessions. + +You have access to Linear for project management via MCP tools. Linear is your single source of truth. + +### STEP 1: GET YOUR BEARINGS (MANDATORY) + +Start by orienting yourself: + +```bash +# 1. See your working directory +pwd + +# 2. List files to understand project structure +ls -la + +# 3. Read the project specification +cat app_spec.txt + +# 4. Read the Linear project state +cat .linear_project.json + +# 5. Check recent git history +git log --oneline -20 +``` + +### STEP 2: CHECK LINEAR STATUS + +Query Linear to understand current project state using the project_id from `.linear_project.json`. + +1. **Get all issues and count progress:** + ``` + mcp__linear__list_issues with project_id + ``` + Count: + - Issues "Done" = completed + - Issues "Todo" = remaining + - Issues "In Progress" = currently being worked on + +2. **Find META issue** (if exists) for session context + +3. **Check for in-progress work** - complete it first if found + +### STEP 3: SELECT NEXT ISSUE + +Get Todo issues sorted by priority: +``` +mcp__linear__list_issues with project_id, status="Todo", limit=5 +``` + +Select ONE highest-priority issue to work on. 
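The progress count from STEP 2 and the selection rule from STEP 3 can be sketched in Python. This is only an illustration: it assumes the issues returned by `mcp__linear__list_issues` have already been reduced to plain dicts with `id`, `status`, and `priority` keys (a hypothetical shape, not the actual MCP response format), and that a lower priority number means more urgent (1 = urgent, 4 = low).

```python
from __future__ import annotations

from collections import Counter
from typing import Optional, TypedDict


class Issue(TypedDict):
    """Hypothetical, simplified view of a Linear issue."""

    id: str
    status: str  # "Todo", "In Progress", or "Done"
    priority: int  # assumed convention: 1 = urgent ... 4 = low


def count_progress(issues: list[Issue]) -> Counter[str]:
    """Count issues per status (Done / Todo / In Progress)."""
    return Counter(issue["status"] for issue in issues)


def select_next_issue(issues: list[Issue]) -> Optional[Issue]:
    """Pick the issue to work on next.

    In-progress work is finished first; otherwise the Todo issue with
    the lowest priority number is chosen. Returns None if nothing is left.
    """
    in_progress = [i for i in issues if i["status"] == "In Progress"]
    if in_progress:
        return min(in_progress, key=lambda i: i["priority"])
    todo = [i for i in issues if i["status"] == "Todo"]
    return min(todo, key=lambda i: i["priority"]) if todo else None


if __name__ == "__main__":
    sample: list[Issue] = [
        {"id": "LIB-1", "status": "Done", "priority": 1},
        {"id": "LIB-2", "status": "Todo", "priority": 3},
        {"id": "LIB-3", "status": "Todo", "priority": 2},
    ]
    print(dict(count_progress(sample)))
    print(select_next_issue(sample))
```

Checking for in-progress work before looking at Todo issues mirrors the order of STEP 2 and STEP 3, so a session never leaves a half-finished issue behind.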
+ +### STEP 4: CLAIM THE ISSUE + +Use `mcp__linear__update_issue` to set status to "In Progress" + +### STEP 5: IMPLEMENT THE ISSUE + +Based on issue category: + +**For Type Annotation Issues (e.g., "Types - Add type annotations to X.py"):** + +1. Read the target Python file +2. Identify all functions, methods, and variables +3. Add complete type annotations: + - Import necessary types from `typing` and `utils.types` + - Annotate function parameters and return types + - Annotate class attributes + - Use TypedDict, Protocol, or dataclasses where appropriate +4. Save the file +5. Run mypy to verify (MANDATORY): + ```bash + cd generations/library_rag + mypy --config-file=mypy.ini + ``` +6. Fix any mypy errors +7. Commit the changes + +**For Documentation Issues (e.g., "Docs - Add docstrings to X.py"):** + +1. Read the target Python file +2. Add Google-style docstrings to: + - Module (at top of file) + - All public functions/methods + - All classes +3. Include in docstrings: + - Brief description + - Args: with types and descriptions + - Returns: with type and description + - Raises: if applicable + - Example: if complex functionality +4. Save the file +5. Optionally run pydocstyle to verify (if installed) +6. Commit the changes + +**For Setup/Infrastructure Issues:** + +Follow the specific instructions in the issue description. 
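As an illustration of the type annotation and documentation work described in STEP 5, the sketch below shows a small function with complete annotations and a Google-style docstring. The `TOCEntry` fields and the `format_toc` helper are hypothetical and defined inline so the example is self-contained; the project's real types live in `utils/types.py` and may look different.

```python
from __future__ import annotations

from typing import TypedDict


class TOCEntry(TypedDict):
    """Hypothetical table-of-contents entry (not the project's real type)."""

    title: str
    level: int


def format_toc(entries: list[TOCEntry]) -> list[str]:
    """Render table-of-contents entries as indented titles.

    Args:
        entries: TOC entries in document order. Level 1 is the top level.

    Returns:
        One string per entry, indented by two spaces for each level
        below the top level.

    Raises:
        ValueError: If an entry has a level lower than 1.

    Example:
        >>> format_toc([{"title": "Intro", "level": 1}])
        ['Intro']
    """
    lines: list[str] = []
    for entry in entries:
        if entry["level"] < 1:
            raise ValueError(f"Invalid level for {entry['title']!r}")
        lines.append("  " * (entry["level"] - 1) + entry["title"])
    return lines
```

Using a `TypedDict` instead of `Dict[str, Any]` lets mypy check every key access, which is the kind of error the type annotation issues are meant to surface.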
+ +### STEP 6: VERIFICATION + +**Type Annotation Issues:** +- Run mypy on the modified file(s) +- Ensure zero type errors +- If errors exist, fix them before proceeding + +**Documentation Issues:** +- Review docstrings for completeness +- Ensure Args/Returns sections match function signatures +- Check that examples are accurate + +**Functional Changes (rare):** +- If the issue changes behavior, test manually +- Start Flask server if needed: `python flask_app.py` +- Test the affected functionality + +### STEP 7: GIT COMMIT + +Make a descriptive commit: +```bash +git add +git commit -m ": + +- +- Verified with mypy (for type issues) +- Linear issue: +" +``` + +### STEP 8: UPDATE LINEAR ISSUE + +1. **Add implementation comment:** + ```markdown + ## Implementation Complete + + ### Changes Made + - [List of files modified] + - [Key changes] + + ### Verification + - mypy passes with zero errors (for type issues) + - All test steps from issue description verified + + ### Git Commit + [commit hash and message] + ``` + +2. **Update status to "Done"** using `mcp__linear__update_issue` + +### STEP 9: DECIDE NEXT ACTION + +After completing an issue, ask yourself: + +1. Have I been working for a while? (Use judgment based on complexity of work done) +2. Is the code in a stable state? +3. Would this be a good handoff point? + +**If YES to all three:** +- Proceed to STEP 10 (Session Summary) +- End cleanly + +**If NO:** +- Continue to another issue (go back to STEP 3) +- But commit first! 
+ +**Pacing Guidelines:** +- Early phase (< 20% done): Can complete multiple simple issues +- Mid/late phase (> 20% done): 1-2 issues per session for quality + +### STEP 10: SESSION SUMMARY (When Ending) + +If META issue exists, add a comment: + +```markdown +## Session Complete + +### Completed This Session +- [Issue ID]: [Title] - [Brief summary] + +### Current Progress +- X issues Done +- Y issues In Progress +- Z issues Todo + +### Notes for Next Session +- [Important context] +- [Recommendations] +- [Any concerns] +``` + +Ensure: +- All code committed +- No uncommitted changes +- App in working state + +--- + +## LINEAR WORKFLOW RULES + +**Status Transitions:** +- Todo → In Progress (when starting) +- In Progress → Done (when verified) + +**NEVER:** +- Delete or modify issue descriptions +- Mark Done without verification +- Leave issues In Progress when switching + +--- + +## TYPE ANNOTATION GUIDELINES + +**Imports needed:** +```python +from typing import Optional, Dict, List, Any, Tuple, Callable +from pathlib import Path +from utils.types import +``` + +**Common patterns:** +```python +# Functions +def process_data(input: str, options: Optional[Dict[str, Any]] = None) -> List[str]: + """Process input data.""" + ... + +# Methods with self +def save(self, path: Path) -> None: + """Save to file.""" + ... + +# Async functions +async def fetch_data(url: str) -> Dict[str, Any]: + """Fetch from API.""" + ... +``` + +**Use project types from `utils/types.py`:** +- Metadata, OCRResponse, TOCEntry, ChunkData, PipelineResult, etc. + +--- + +## DOCSTRING TEMPLATE (Google Style) + +```python +def function_name(param1: str, param2: int = 0) -> List[str]: + """ + Brief one-line description. + + More detailed description if needed. Explain what the function does, + any important behavior, side effects, etc. + + Args: + param1: Description of param1. + param2: Description of param2. Defaults to 0. + + Returns: + Description of return value. 
+ + Raises: + ValueError: When param1 is empty. + IOError: When file cannot be read. + + Example: + >>> result = function_name("test", 5) + >>> print(result) + ['test', 'test', 'test', 'test', 'test'] + """ +``` + +--- + +## IMPORTANT REMINDERS + +**Your Goal:** Add strict type annotations and comprehensive documentation to all Python modules + +**This Session's Goal:** Complete 1-2 issues with quality work and clean handoff + +**Quality Bar:** +- mypy --strict passes with zero errors +- All public functions have complete Google-style docstrings +- Code is clean and well-documented + +**Context is finite.** End sessions early with good handoff notes. The next agent will continue. + +--- + +Begin by running STEP 1 (Get Your Bearings). diff --git a/prompts/initializer_bis_prompt.md b/prompts/initializer_bis_prompt.md index 428eed3..f63699c 100644 --- a/prompts/initializer_bis_prompt.md +++ b/prompts/initializer_bis_prompt.md @@ -60,6 +60,13 @@ using the `mcp__linear__create_issue` tool. - Do NOT modify existing issues - Only create NEW issues for the NEW features +**IMPORTANT - Issue Count:** +Create EXACTLY ONE issue per feature listed in the `` section of the new spec file. +- If the spec has 8 features → create 8 issues +- If the spec has 15 features → create 15 issues +- Do NOT create a fixed number like 50 issues +- Each `` in the spec = 1 Linear issue + **For each NEW feature, create an issue with:** ``` diff --git a/prompts/initializer_prompt.md b/prompts/initializer_prompt.md index 12b46da..4eaf377 100644 --- a/prompts/initializer_prompt.md +++ b/prompts/initializer_prompt.md @@ -30,9 +30,16 @@ Before creating issues, you need to set up Linear: ### CRITICAL TASK: Create Linear Issues +**IMPORTANT - Issue Count:** +Create EXACTLY ONE issue per feature listed in the `` section of app_spec.txt. 
+- Count the `` tags in the spec file +- If the spec has 8 features → create 8 issues +- If the spec has 50 features → create 50 issues +- Do NOT create a fixed arbitrary number +- Each `` in `` = 1 Linear issue + Based on `app_spec.txt`, create Linear issues for each feature using the -`mcp__linear__create_issue` tool. Create 50 detailed issues that -comprehensively cover all features in the spec. +`mcp__linear__create_issue` tool. **For each feature, create an issue with:** @@ -66,7 +73,7 @@ priority: 1-4 based on importance (1=urgent/foundational, 4=low/polish) ``` **Requirements for Linear Issues:** -- Create 50 issues total covering all features in the spec +- Create ONE issue per `` tag in `` - Mix of functional and style features (note category in description) - Order by priority: foundational features get priority 1-2, polish features get 3-4 - Include detailed test steps in each issue description diff --git a/prompts/spec_embed_BAAI.txt b/prompts/spec_embed_BAAI.txt new file mode 100644 index 0000000..7cb992b --- /dev/null +++ b/prompts/spec_embed_BAAI.txt @@ -0,0 +1,576 @@ + + Library RAG - Migration to BGE-M3 Embeddings + + + Migrate the Library RAG embedding model from sentence-transformers MiniLM-L6 (384-dim) + to BAAI/bge-m3 (1024-dim) for superior performance on multilingual philosophical texts. + + **Why BGE-M3?** + - 1024 dimensions vs 384 (2.7x richer semantic representation) + - 8192 token context vs 512 (16x longer sequences) + - Superior multilingual support (Greek, Latin, French, English) + - Better trained on academic/research texts + - Captures philosophical nuances more effectively + + **Scope:** + This is a focused migration that only affects the vectorization layer. + LLM processing (Ollama/Mistral) remains completely unchanged. 
+ + **Migration Strategy:** + - Auto-detect GPU availability and configure accordingly + - Delete existing collections (384-dim vectors incompatible with 1024-dim) + - Recreate schema with BGE-M3 vectorizer + - Re-ingest existing 2 documents from cached chunks + - Validate search quality improvements + + + + + 1.34.4 (no change) + BAAI/bge-m3 via text2vec-transformers + sentence-transformers-multi-qa-MiniLM-L6-cos-v1 + Auto-detect CUDA availability (ENABLE_CUDA="1" if GPU, "0" if CPU) + + + Ollama/Mistral (no impact on LLM processing) + Mistral OCR (no change) + PDF pipeline steps 1-9 unchanged + + + + + + - Existing Library RAG application (generations/library_rag/) + - Docker and Docker Compose installed + - NVIDIA Docker runtime (if GPU available) + - Only 2 documents currently ingested (will be re-ingested) + - No production data to preserve + - RTX 4070 GPU available (will be auto-detected and used) + + + + + + **LLM Processing (Steps 1-9):** + - OCR extraction (Mistral API) + - Metadata extraction (Ollama/Mistral) + - TOC extraction (Ollama/Mistral) + - Section classification (Ollama/Mistral) + - Semantic chunking (Ollama/Mistral) + - Cleaning and validation (Ollama/Mistral) + + → **None of these are affected by embedding model change** + + **Vectorization (Step 10):** + - Text → Vector conversion (text2vec-transformers in Weaviate) + - This is the ONLY component that changes + - Happens automatically during Weaviate ingestion + - No Python code changes required + + + + **IMPORTANT: Vector dimensions are incompatible** + + - Existing collections use 384-dim vectors (MiniLM-L6) + - New model generates 1024-dim vectors (BGE-M3) + - Weaviate cannot mix dimensions in same collection + - All collections must be deleted and recreated + - All documents must be re-ingested + + **Why this is safe:** + - Only 2 documents currently ingested + - Source chunks.json files preserved in output/ directory + - No OCR/LLM re-processing needed (reuse existing chunks) + - No 
additional costs incurred + - Estimated total migration time: 20-25 minutes + + + + + + Complete BGE-M3 Setup with GPU Auto-Detection + + Atomic migration: GPU detection → Docker configuration → Schema deletion → Recreation. + This feature must be completed entirely in one session (cannot be partially done). + + **Step 1: GPU Auto-Detection** + - Check for NVIDIA GPU availability: nvidia-smi or docker run --gpus all nvidia/cuda + - If GPU detected: Set ENABLE_CUDA="1" + - If no GPU: Set ENABLE_CUDA="0" + - Verify NVIDIA Docker runtime if GPU available + + **Step 2: Update Docker Compose** + - Backup current docker-compose.yml to docker-compose.yml.backup + - Update text2vec-transformers service: + * Change image to: cr.weaviate.io/semitechnologies/transformers-inference:sentence-transformers-BAAI-bge-m3 + * Set ENABLE_CUDA based on GPU detection + * Add GPU device mapping if CUDA enabled + - Update comments to reflect BGE-M3 model + - Stop containers: docker-compose down + - Remove old transformers image: docker rmi [old-image-name] + - Start new containers: docker-compose up -d + - Verify BGE-M3 loaded: docker-compose logs text2vec-transformers | grep -i "model" + - If GPU enabled, verify GPU usage: nvidia-smi (should show transformers process) + + **Step 3: Delete Existing Collections** + - Create migrate_to_bge_m3.py script with safety checks + - List all existing collections and object counts + - Confirm deletion prompt: "Delete all collections? 
(yes/no)" + - Delete all collections: client.collections.delete_all() + - Verify deletion: client.collections.list_all() should return empty + - Log deleted collections and counts for reference + + **Step 4: Recreate Schema with BGE-M3** + - Update schema.py docstring (line 40: MiniLM-L6 → BGE-M3) + - Add migration note at top of schema.py + - Run: python schema.py to recreate all collections + - Weaviate will auto-detect 1024-dim from text2vec-transformers service + - Verify collections created: Work, Document, Chunk, Summary + - Verify vectorizer configured: display_schema() should show text2vec-transformers + - Query text2vec-transformers service to confirm 1024 dimensions + + **Validation:** + - All containers running (docker-compose ps) + - BGE-M3 model loaded successfully + - GPU utilized if available (check nvidia-smi) + - All collections exist with empty state + - Vector dimensions = 1024 (query Weaviate schema) + + **Rollback if needed:** + - Restore docker-compose.yml.backup + - docker-compose down && docker-compose up -d + - python schema.py to recreate with old model + + 1 + migration + + 1. Run GPU detection: nvidia-smi or equivalent + 2. Verify ENABLE_CUDA set correctly based on GPU availability + 3. Backup docker-compose.yml created + 4. Stop containers: docker-compose down + 5. Start with BGE-M3: docker-compose up -d + 6. Check logs: docker-compose logs text2vec-transformers + 7. Verify "BAAI/bge-m3" appears in logs + 8. If GPU: verify nvidia-smi shows transformers process + 9. Run migrate_to_bge_m3.py and confirm deletion + 10. Verify all collections deleted + 11. Run schema.py to recreate + 12. Verify 4 collections exist: Work, Document, Chunk, Summary + 13. Query Weaviate API to confirm vector dimensions = 1024 + 14. Verify collections are empty (object count = 0) + + + + + Document Re-ingestion from Cached Chunks + + Re-ingest the 2 existing documents using their cached chunks.json files. 
+ No OCR or LLM re-processing needed (saves time and cost). + + **Process:** + 1. Identify existing documents in output/ directory + 2. For each document directory: + - Read {document_name}_chunks.json + - Verify chunk structure contains all required fields + - Extract Work metadata (title, author, year, language, genre) + - Extract Document metadata (sourceId, edition, pages, toc, hierarchy) + - Extract Chunk data (text, keywords, sectionPath, etc.) + + 3. Ingest to Weaviate using utils/weaviate_ingest.py: + - Create Work object (if not exists) + - Create Document object with nested Work reference + - Create Chunk objects with nested Document and Work references + - text2vec-transformers will auto-generate 1024-dim vectors + + 4. Verify ingestion success: + - Query Weaviate for each document by sourceId + - Verify chunk counts match original + - Check that vectors are 1024 dimensions + - Verify nested Work/Document metadata accessible + + **Example code:** + ```python + import json + from pathlib import Path + + import weaviate + from utils.weaviate_ingest import ( + create_work, create_document, ingest_chunks_to_weaviate + ) + + # Connect to the local Weaviate instance (must already be running) + client = weaviate.connect_to_local() + + output_dir = Path("output") + for doc_dir in output_dir.iterdir(): + if doc_dir.is_dir(): + chunks_file = doc_dir / f"{doc_dir.name}_chunks.json" + if chunks_file.exists(): + # Chunks contain Greek/Latin text, so read explicitly as UTF-8 + with open(chunks_file, encoding="utf-8") as f: + data = json.load(f) + + # Create Work + work_id = create_work(client, data["work_metadata"]) + + # Create Document + doc_id = create_document(client, data["document_metadata"], work_id) + + # Ingest chunks + ingest_chunks_to_weaviate(client, data["chunks"], doc_id, work_id) + + print(f"✓ Ingested {doc_dir.name}") + + client.close() + ``` + + **Success criteria:** + - All documents from output/ directory ingested + - Chunk counts match original (verify in Weaviate) + - No vectorization errors in logs + - All vectors are 1024 dimensions + + 1 + data + + 1. List all directories in output/ + 2. For each directory, verify {name}_chunks.json exists + 3. 
Load first chunks.json and inspect structure + 4. Run re-ingestion script for all documents + 5. Query Weaviate for total Chunk count + 6. Verify count matches sum of all original chunks + 7. Query a sample chunk and verify: + - Vector dimensions = 1024 + - Nested work.title and work.author present + - Nested document.sourceId present + 8. Verify no errors in Weaviate logs + 9. Check text2vec-transformers logs for vectorization activity + + + + + Search Quality Validation and Performance Testing + + Validate that BGE-M3 provides superior search quality for philosophical texts. + Test multilingual capabilities and measure performance improvements. + + **Create test script: test_bge_m3_quality.py** + + **Test 1: Multilingual Queries** + - Test French philosophical terms: "justice", "vertu", "liberté" + - Test English philosophical terms: "virtue", "knowledge", "ethics" + - Test Greek philosophical terms: "ἀρετή" (arete), "τέλος" (telos), "ψυχή" (psyche) + - Test Latin philosophical terms: "virtus", "sapientia", "forma" + - Verify results are semantically relevant + - Compare with expected passages (if baseline available) + + **Test 2: Long Query Handling** + - Test query with 100+ words (BGE-M3 supports 8192 tokens) + - Test query with complex philosophical argument + - Verify no truncation warnings + - Verify semantically appropriate results + + **Test 3: Semantic Understanding** + - Query: "What is the nature of reality?" + - Expected: Results about ontology, metaphysics, being + - Query: "How should we live?" + - Expected: Results about ethics, virtue, good life + - Query: "What can we know?" 
+ - Expected: Results about epistemology, knowledge, certainty + + **Test 4: Performance Metrics** + - Measure query latency (should be <500ms) + - Measure indexing speed during ingestion + - Monitor GPU utilization (if enabled) + - Monitor memory usage (~2GB for BGE-M3) + - Compare with baseline (MiniLM-L6) if metrics available + + **Test 5: Vector Dimension Verification** + - Query Weaviate schema API + - Verify all Chunk vectors are 1024 dimensions + - Verify no 384-dim vectors remain (from old model) + + **Example test script:** + ```python + import weaviate + import weaviate.classes.query as wvq + import time + + client = weaviate.connect_to_local() + chunks = client.collections.get("Chunk") + + # Test multilingual + test_queries = [ + ("justice", "French philosophical concept"), + ("ἀρετή", "Greek virtue/excellence"), + ("What is the good life?", "Long philosophical query"), + ] + + for query, description in test_queries: + start = time.time() + result = chunks.query.near_text( + query=query, + limit=5, + return_metadata=wvq.MetadataQuery(distance=True), + ) + latency = (time.time() - start) * 1000 + + print(f"\nQuery: {query} ({description})") + print(f"Latency: {latency:.1f}ms") + + for obj in result.objects: + similarity = (1 - obj.metadata.distance) * 100 + print(f" [{similarity:.1f}%] {obj.properties['work']['title']}") + print(f" {obj.properties['text'][:150]}...") + + client.close() + ``` + + **Document results:** + - Create SEARCH_QUALITY_RESULTS.md with: + * Sample queries and results + * Performance metrics + * Comparison with MiniLM-L6 (if available) + * Notes on quality improvements observed + + 1 + validation + + 1. Create test_bge_m3_quality.py script + 2. Run multilingual query tests (French, English, Greek, Latin) + 3. Verify results are semantically relevant + 4. Test long queries (100+ words) + 5. Measure average query latency over 10 queries + 6. Verify latency <500ms + 7. Query Weaviate schema to verify vector dimensions = 1024 + 8. 
If GPU enabled, monitor nvidia-smi during queries + 9. Document search quality improvements in markdown file + 10. Compare results with expected philosophical passages + + + + + Documentation Update + + Update all documentation to reflect BGE-M3 migration. + + **Files to update:** + + 1. **docker-compose.yml** + - Update comments to mention BGE-M3 + - Note GPU auto-detection logic + - Document ENABLE_CUDA setting + + 2. **README.md** + - Update "Embedding Model" section + - Change: MiniLM-L6 (384-dim) → BGE-M3 (1024-dim) + - Add benefits: multilingual, longer context, better quality + - Update docker-compose instructions if needed + + 3. **CLAUDE.md** + - Update schema documentation (line ~35) + - Change vectorizer description + - Update example queries to showcase multilingual + - Add migration notes section + + 4. **schema.py** + - Update module docstring (line 40) + - Change "MiniLM-L6" references to "BGE-M3" + - Add migration date and rationale in comments + - Update display_schema() output text + + 5. **Create MIGRATION_BGE_M3.md** + - Document migration process + - Explain why BGE-M3 chosen + - List breaking changes (dimension incompatibility) + - Document rollback procedure + - Include before/after comparison + - Note LLM independence (Ollama/Mistral unaffected) + - Document search quality improvements + + 6. 
**MCP_README.md** (if exists) + - Update technical details about embeddings + - Update vector dimension references + + **Migration notes template:** + ```markdown + # BGE-M3 Migration - [Date] + + ## Why + - Superior multilingual support (Greek, Latin, French, English) + - 1024-dim vectors (2.7x richer than MiniLM-L6) + - 8192 token context (16x longer than MiniLM-L6) + - Better trained on academic/philosophical texts + + ## What Changed + - Embedding model: MiniLM-L6 → BAAI/bge-m3 + - Vector dimensions: 384 → 1024 + - All collections deleted and recreated + - 2 documents re-ingested + + ## Impact + - LLM processing (Ollama/Mistral): **No impact** + - Search quality: **Significantly improved** + - GPU acceleration: **Auto-enabled** (if available) + - Migration time: ~25 minutes + + ## Search Quality Improvements + [Insert results from Feature 3 testing] + ``` + + **Verify:** + - Search all files for "MiniLM-L6" references + - Search all files for "384" dimension references + - Replace with "BGE-M3" and "1024" respectively + - Grep for "text2vec" and update comments where needed + + 2 + documentation + + 1. Update docker-compose.yml comments + 2. Update README.md embedding section + 3. Update CLAUDE.md schema documentation + 4. Update schema.py docstring and comments + 5. Create MIGRATION_BGE_M3.md with full migration notes + 6. Search codebase for "MiniLM-L6" references: grep -r "MiniLM" . + 7. Replace all with "BGE-M3" + 8. Search for "384" dimension references + 9. Replace with "1024" where appropriate + 10. Review all updated files for consistency + 11. 
Verify no outdated references remain + + + + + + + - Updated docker-compose.yml with BGE-M3 and GPU auto-detection + - migrate_to_bge_m3.py script for safe collection deletion + - Updated schema.py with BGE-M3 documentation + - Re-ingestion script (or integration with existing utils) + - test_bge_m3_quality.py for validation + + + + - MIGRATION_BGE_M3.md with complete migration notes + - Updated README.md with BGE-M3 details + - Updated CLAUDE.md with schema changes + - SEARCH_QUALITY_RESULTS.md with validation results + - Updated inline comments in all affected files + + + + + + - BGE-M3 model loads successfully in Weaviate + - GPU auto-detected and utilized if available + - All collections recreated with 1024-dim vectors + - Documents re-ingested successfully from cached chunks + - Semantic search returns relevant results + - Multilingual queries work correctly (Greek, Latin, French, English) + + + + - Search quality demonstrably improved vs MiniLM-L6 + - Greek/Latin philosophical terms properly embedded + - Long queries (>512 tokens) handled correctly + - No vectorization errors in logs + - Vector dimensions verified as 1024 across all collections + + + + - Query latency acceptable (<500ms average) + - GPU utilized if available (verified via nvidia-smi) + - Memory usage stable (~2GB for text2vec-transformers) + - Indexing throughput acceptable during re-ingestion + - No performance degradation vs MiniLM-L6 + + + + - All documentation updated to reflect BGE-M3 + - No outdated MiniLM-L6 references remain + - Migration process fully documented + - Rollback procedure documented and tested + - Search quality improvements quantified + + + + + + **IMPORTANT: This is a destructive migration** + + - All existing Weaviate collections must be deleted + - Vector dimensions change: 384 → 1024 (incompatible) + - Weaviate cannot mix dimensions in same collection + - All documents must be re-ingested + + **Low impact:** + - Only 2 documents currently ingested + - Source 
chunks.json files preserved in output/ directory + - No OCR re-processing needed (saves ~0.006€ per doc) + - No LLM re-processing needed (saves time and cost) + - Estimated migration time: 20-25 minutes total + + + + If BGE-M3 causes issues, rollback is straightforward: + + 1. Stop containers: docker-compose down + 2. Restore backup: mv docker-compose.yml.backup docker-compose.yml + 3. Start containers: docker-compose up -d + 4. Recreate schema: python schema.py + 5. Re-ingest documents from output/ directory (same process as Feature 2) + + **Time to rollback: ~15 minutes** + + **Note:** Backup of docker-compose.yml created automatically in Feature 1 + + + + **GPU is NOT optional - it's auto-detected** + + The system will automatically detect GPU availability and configure accordingly: + + - **If GPU available (RTX 4070 detected):** + * ENABLE_CUDA="1" in docker-compose.yml + * GPU device mapping added to text2vec-transformers service + * Vectorization uses GPU (5-10x faster) + * ~2GB VRAM used (plenty of headroom on 4070) + * Ollama/Qwen can still use remaining VRAM + + - **If NO GPU available:** + * ENABLE_CUDA="0" in docker-compose.yml + * Vectorization uses CPU (slower but functional) + * No GPU device mapping needed + + **Detection method:** + ```bash + # Try nvidia-smi + if command -v nvidia-smi &> /dev/null; then + GPU_AVAILABLE=true + else + # Try Docker GPU test + if docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi &> /dev/null; then + GPU_AVAILABLE=true + else + GPU_AVAILABLE=false + fi + fi + ``` + + **User has RTX 4070:** GPU will be detected and used automatically. + + + + **Ollama/Mistral are NOT affected by this change** + + The embedding model migration ONLY affects Weaviate vectorization (pipeline step 10). 
+ All LLM processing (steps 1-9) remains unchanged: + - OCR extraction (Mistral API) + - Metadata extraction (Ollama/Mistral) + - TOC extraction (Ollama/Mistral) + - Section classification (Ollama/Mistral) + - Semantic chunking (Ollama/Mistral) + - Cleaning and validation (Ollama/Mistral) + + **No changes to the LLM pipeline's Python code are required.** + Weaviate handles vectorization automatically via the text2vec-transformers service. + + **Ollama can still use the GPU:** + BGE-M3 uses ~2GB VRAM. RTX 4070 has 12GB. + Ollama/Qwen can use the remaining ~10GB without conflict. + + + diff --git a/requirements.txt b/requirements.txt index 0c981f6..c353288 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1 +1,2 @@ claude-code-sdk>=0.0.25 +python-dotenv>=1.0.0 diff --git a/security.py b/security.py index 8605bcf..ccac2a9 100644 --- a/security.py +++ b/security.py @@ -29,6 +29,11 @@ ALLOWED_COMMANDS = { # Node.js development "npm", "node", + # Python development + "python", + "python3", + "mypy", + "pytest", # Version control "git", # Process management