feat: Add Memory system with Weaviate integration and MCP tools

MEMORY SYSTEM ARCHITECTURE:
- Weaviate-based memory storage (Thought, Message, Conversation collections)
- GPU embeddings with BAAI/bge-m3 (1024-dim, RTX 4070)
- 9 MCP tools for Claude Desktop integration

CORE MODULES (memory/):
- core/embedding_service.py: GPU embedder singleton with PyTorch
- schemas/memory_schemas.py: Weaviate schema definitions
- mcp/thought_tools.py: add_thought, search_thoughts, get_thought
- mcp/message_tools.py: add_message, get_messages, search_messages
- mcp/conversation_tools.py: get_conversation, search_conversations, list_conversations

FLASK TEMPLATES:
- conversation_view.html: Display single conversation with messages
- conversations.html: List all conversations with search
- memories.html: Browse and search thoughts

FEATURES:
- Semantic search across thoughts, messages, conversations
- Privacy levels (private, shared, public)
- Thought types (reflection, question, intuition, observation)
- Conversation categories with filtering
- Message ordering and role-based display

DATA (as of 2026-01-08):
- 102 Thoughts
- 377 Messages
- 12 Conversations

DOCUMENTATION:
- memory/README_MCP_TOOLS.md: Complete API reference and usage examples

All MCP tools tested and validated (see test_memory_mcp_tools.py in archive).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
commit 2f34125ef6 (parent 187ba4854e), 2026-01-08 18:08:13 +01:00
13 changed files with 2145 additions and 0 deletions

memory/README_MCP_TOOLS.md (new file, 440 lines):
# Memory MCP Tools Documentation
## Overview
The Memory MCP tools provide a complete interface for managing thoughts, messages, and conversations in the unified Weaviate-based memory system. These tools are integrated into the Library RAG MCP server (`generations/library_rag/mcp_server.py`) and use GPU-accelerated embeddings for semantic search.
## Architecture
- **Backend**: Weaviate 1.34.4 (local instance)
- **Embeddings**: BAAI/bge-m3 model (1024 dimensions, FP16 precision)
- **GPU**: CUDA-enabled (RTX 4070) via PyTorch 2.6.0+cu124
- **Collections**: 3 Weaviate collections (Thought, Message, Conversation)
- **Integration**: FastMCP framework with async handlers
## Available Tools
### Thought Tools (3)
#### 1. add_thought
Add a new thought to the memory system.
**Parameters:**
- `content` (str, required): The thought content
- `thought_type` (str, default="reflection"): Type of thought (reflection, question, intuition, observation, etc.)
- `trigger` (str, default=""): What triggered this thought
- `concepts` (list[str], default=[]): Related concepts/tags
- `privacy_level` (str, default="private"): Privacy level (private, shared, public)
**Returns:**
```python
{
    "success": True,
    "uuid": "730c1a8e-b09f-4889-bbe9-4867d0ee7f1a",
    "content": "This is a test thought...",
    "thought_type": "observation"
}
```
**Example:**
```python
result = await add_thought(
    content="Exploring vector databases for semantic search",
    thought_type="observation",
    trigger="Research session",
    concepts=["weaviate", "embeddings", "gpu"],
    privacy_level="private"
)
```
#### 2. search_thoughts
Search thoughts using semantic similarity.
**Parameters:**
- `query` (str, required): Search query text
- `limit` (int, default=10, range=1-100): Maximum results to return
- `thought_type_filter` (str, optional): Filter by thought type
**Returns:**
```python
{
    "success": True,
    "query": "vector databases GPU",
    "results": [
        {
            "uuid": "...",
            "content": "...",
            "thought_type": "observation",
            "timestamp": "2025-01-08T...",
            "trigger": "...",
            "concepts": ["weaviate", "gpu"]
        }
    ],
    "count": 5
}
```
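**Example** (illustrative call; query and filter values are placeholders):
```python
result = await search_thoughts(
    query="vector databases GPU",
    limit=5,
    thought_type_filter="observation"
)
```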
#### 3. get_thought
Retrieve a specific thought by UUID.
**Parameters:**
- `uuid` (str, required): Thought UUID
**Returns:**
```python
{
    "success": True,
    "uuid": "730c1a8e-b09f-4889-bbe9-4867d0ee7f1a",
    "content": "...",
    "thought_type": "observation",
    "timestamp": "2025-01-08T...",
    "trigger": "...",
    "concepts": [...],
    "privacy_level": "private",
    "emotional_state": "",
    "context": ""
}
```
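**Example** (illustrative; reuses the sample UUID from the response above):
```python
result = await get_thought(uuid="730c1a8e-b09f-4889-bbe9-4867d0ee7f1a")
```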
---
### Message Tools (3)
#### 1. add_message
Add a new message to a conversation.
**Parameters:**
- `content` (str, required): Message content
- `role` (str, required): Role (user, assistant, system)
- `conversation_id` (str, required): Conversation identifier
- `order_index` (int, default=0): Position in conversation
**Returns:**
```python
{
    "success": True,
    "uuid": "...",
    "content": "Hello, this is a test...",
    "role": "user",
    "conversation_id": "test_conversation_001"
}
```
**Example:**
```python
result = await add_message(
    content="Explain transformers in AI",
    role="user",
    conversation_id="chat_2025_01_08",
    order_index=0
)
```
#### 2. get_messages
Get all messages from a conversation in order.
**Parameters:**
- `conversation_id` (str, required): Conversation identifier
- `limit` (int, default=50, range=1-500): Maximum messages to return
**Returns:**
```python
{
    "success": True,
    "conversation_id": "test_conversation_001",
    "messages": [
        {
            "uuid": "...",
            "content": "...",
            "role": "user",
            "timestamp": "2025-01-08T...",
            "order_index": 0
        },
        {
            "uuid": "...",
            "content": "...",
            "role": "assistant",
            "timestamp": "2025-01-08T...",
            "order_index": 1
        }
    ],
    "count": 2
}
```
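**Example** (illustrative; the conversation ID is a placeholder):
```python
result = await get_messages(
    conversation_id="test_conversation_001",
    limit=50
)
```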
#### 3. search_messages
Search messages using semantic similarity.
**Parameters:**
- `query` (str, required): Search query text
- `limit` (int, default=10, range=1-100): Maximum results
- `conversation_id_filter` (str, optional): Filter by conversation
**Returns:**
```python
{
"success": True,
"query": "transformers AI",
"results": [...],
"count": 5
}
```
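**Example** (illustrative; the filter reuses the conversation ID from the add_message example above):
```python
result = await search_messages(
    query="transformers AI",
    limit=5,
    conversation_id_filter="chat_2025_01_08"
)
```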
---
### Conversation Tools (3)
#### 1. get_conversation
Get a specific conversation by ID.
**Parameters:**
- `conversation_id` (str, required): Conversation identifier
**Returns:**
```python
{
    "success": True,
    "conversation_id": "ikario_derniere_pensee",
    "category": "testing",
    "summary": "Conversation with 2 participants...",
    "timestamp_start": "2025-01-06T...",
    "timestamp_end": "2025-01-06T...",
    "participants": ["assistant", "user"],
    "tags": [],
    "message_count": 19
}
```
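**Example** (illustrative; the ID is taken from the sample response above):
```python
result = await get_conversation(conversation_id="ikario_derniere_pensee")
```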
#### 2. search_conversations
Search conversations using semantic similarity on summaries.
**Parameters:**
- `query` (str, required): Search query text
- `limit` (int, default=10, range=1-50): Maximum results
- `category_filter` (str, optional): Filter by category
**Returns:**
```python
{
"success": True,
"query": "philosophical discussion",
"results": [
{
"conversation_id": "...",
"category": "philosophy",
"summary": "...",
"timestamp_start": "...",
"timestamp_end": "...",
"participants": [...],
"message_count": 25
}
],
"count": 5
}
```
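**Example** (illustrative; query and category values are placeholders):
```python
result = await search_conversations(
    query="philosophical discussion",
    limit=5,
    category_filter="philosophy"
)
```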
#### 3. list_conversations
List all conversations with optional filtering.
**Parameters:**
- `limit` (int, default=20, range=1-100): Maximum conversations to return
- `category_filter` (str, optional): Filter by category
**Returns:**
```python
{
    "success": True,
    "conversations": [
        {
            "conversation_id": "...",
            "category": "testing",
            "summary": "Conversation with 2 participants... (truncated)",
            "timestamp_start": "...",
            "message_count": 19,
            "participants": [...]
        }
    ],
    "count": 10
}
```
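**Example** (illustrative; the category value is a placeholder):
```python
result = await list_conversations(
    limit=20,
    category_filter="testing"
)
```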
---
## Implementation Details
### Handler Pattern
All tools follow a consistent async handler pattern:
```python
import weaviate
from typing import Any, Dict

from memory.core import get_embedder

async def tool_handler(input_data: InputModel) -> Dict[str, Any]:
    """Handler function."""
    try:
        # 1. Connect to Weaviate
        client = weaviate.connect_to_local()
        try:
            # 2. Get GPU embedder (for vectorization)
            embedder = get_embedder()
            # 3. Generate vector (if needed)
            vector = embedder.embed_batch([text])[0]
            # 4. Query/Insert data
            collection = client.collections.get("CollectionName")
            result = collection.data.insert(...)
            # 5. Return success response
            return {"success": True, ...}
        finally:
            client.close()
    except Exception as e:
        return {"success": False, "error": str(e)}
```
### GPU Vectorization
All text content is vectorized using the GPU-accelerated embedder:
```python
from memory.core import get_embedder
embedder = get_embedder() # Returns PyTorch GPU embedder
vector = embedder.embed_batch([content])[0] # Returns 1024-dim FP16 vector
```
### Weaviate Connection
Each tool handler creates a new connection and closes it after use:
```python
client = weaviate.connect_to_local()  # Connects to localhost:8080
try:
    # Perform operations
    collection = client.collections.get("Thought")
    # ...
finally:
    client.close()  # Always close connection
```
## Testing
A comprehensive test suite is available at `test_memory_mcp_tools.py`:
```bash
python test_memory_mcp_tools.py
```
**Test Results (2025-01-08):**
```
============================================================
TESTING THOUGHT TOOLS
============================================================
[OK] add_thought: Created thought with UUID
[OK] search_thoughts: Found 5 thoughts
[OK] get_thought: Retrieved thought successfully
============================================================
TESTING MESSAGE TOOLS
============================================================
[OK] add_message: Added 3 messages (user, assistant, user)
[OK] get_messages: Retrieved 3 messages in order
[OK] search_messages: Found 5 messages
============================================================
TESTING CONVERSATION TOOLS
============================================================
[OK] list_conversations: Found 10 conversations
[OK] get_conversation: Retrieved conversation metadata
[OK] search_conversations: Found 5 conversations
[OK] ALL TESTS COMPLETED
============================================================
```
## Integration with MCP Server
The Memory tools are integrated into `generations/library_rag/mcp_server.py` alongside the existing Library RAG tools:
**Total tools available: 17**
- Library RAG: 8 tools (search_documents, add_document, etc.)
- Memory: 9 tools (thought, message, conversation tools)
**Configuration:**
The MCP server is configured in Claude Desktop settings:
```json
{
  "mcpServers": {
    "library-rag": {
      "command": "python",
      "args": ["C:/GitHub/linear_coding_library_rag/generations/library_rag/mcp_server.py"]
    }
  }
}
```
## Error Handling
All tools return consistent error responses:
```python
{
    "success": False,
    "error": "Error message description"
}
```
Common errors:
- Connection errors: "Failed to connect to Weaviate"
- Not found: "Conversation {id} not found"
- Validation errors: "Invalid parameter: {details}"
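Callers can branch on the `success` flag; a minimal sketch (the missing conversation ID is a placeholder):
```python
result = await get_conversation(conversation_id="nonexistent_id")
if not result["success"]:
    print(f"Tool call failed: {result['error']}")
```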
## Performance
- **Vectorization**: ~50-100ms per text on RTX 4070 GPU
- **Search latency**: <100ms for near-vector queries
- **Batch operations**: Use `embedder.embed_batch()` for efficiency (see the sketch below)
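When vectorizing several texts, a single batched call avoids repeated per-text GPU calls; a minimal sketch using the accessor shown earlier (the text values are placeholders):
```python
from memory.core import get_embedder

embedder = get_embedder()
texts = ["first thought", "second thought", "third thought"]
# Batched on the GPU: returns one 1024-dim FP16 vector per input text
vectors = embedder.embed_batch(texts)
```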
## Next Steps
**Phase 5: Backend Integration** (Pending)
- Update Flask routes to use Weaviate Memory tools
- Replace ChromaDB calls with new MCP tool calls
- Connect flask-app frontend to new backend
## Module Structure
```
memory/
├── core/
│   ├── __init__.py             # GPU embedder initialization
│   └── config.py               # Weaviate connection config
├── mcp/
│   ├── __init__.py             # Tool exports
│   ├── thought_tools.py        # Thought handlers
│   ├── message_tools.py        # Message handlers
│   └── conversation_tools.py   # Conversation handlers
└── README_MCP_TOOLS.md         # This file
```
## Dependencies
- weaviate-client >= 4.0.0
- PyTorch 2.6.0+cu124
- transformers (for BAAI/bge-m3)
- pydantic (for input validation)
- FastMCP framework
## Related Documentation
- Weaviate Schema: `memory/schemas/` (Thought, Message, Conversation schemas)
- Migration Scripts: `memory/migration/` (ChromaDB → Weaviate migration)
- Library RAG README: `generations/library_rag/README.md`
---
**Last Updated**: 2025-01-08
**Status**: Phase 4 Complete ✓
**Next Phase**: Phase 5 - Backend Integration