feat: Add Memory system with Weaviate integration and MCP tools

MEMORY SYSTEM ARCHITECTURE:
- Weaviate-based memory storage (Thought, Message, Conversation collections)
- GPU embeddings with BAAI/bge-m3 (1024-dim, RTX 4070)
- 9 MCP tools for Claude Desktop integration

CORE MODULES (memory/):
- core/embedding_service.py: GPU embedder singleton with PyTorch
- schemas/memory_schemas.py: Weaviate schema definitions
- mcp/thought_tools.py: add_thought, search_thoughts, get_thought
- mcp/message_tools.py: add_message, get_messages, search_messages
- mcp/conversation_tools.py: get_conversation, search_conversations, list_conversations

FLASK TEMPLATES:
- conversation_view.html: Display single conversation with messages
- conversations.html: List all conversations with search
- memories.html: Browse and search thoughts

FEATURES:
- Semantic search across thoughts, messages, conversations
- Privacy levels (private, shared, public)
- Thought types (reflection, question, intuition, observation)
- Conversation categories with filtering
- Message ordering and role-based display

DATA (as of 2026-01-08):
- 102 Thoughts
- 377 Messages
- 12 Conversations

DOCUMENTATION:
- memory/README_MCP_TOOLS.md: Complete API reference and usage examples

All MCP tools tested and validated (see test_memory_mcp_tools.py in archive).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
commit 2f34125ef6 (parent 187ba4854e), 2026-01-08 18:08:13 +01:00
13 changed files with 2145 additions and 0 deletions

memory/README_MCP_TOOLS.md (new file, 440 lines):
# Memory MCP Tools Documentation
## Overview
The Memory MCP tools provide a complete interface for managing thoughts, messages, and conversations in the unified Weaviate-based memory system. These tools are integrated into the Library RAG MCP server (`generations/library_rag/mcp_server.py`) and use GPU-accelerated embeddings for semantic search.
## Architecture
- **Backend**: Weaviate 1.34.4 (local instance)
- **Embeddings**: BAAI/bge-m3 model (1024 dimensions, FP16 precision)
- **GPU**: CUDA-enabled (RTX 4070) via PyTorch 2.6.0+cu124
- **Collections**: 3 Weaviate collections (Thought, Message, Conversation)
- **Integration**: FastMCP framework with async handlers
## Available Tools
### Thought Tools (3)
#### 1. add_thought
Add a new thought to the memory system.
**Parameters:**
- `content` (str, required): The thought content
- `thought_type` (str, default="reflection"): Type of thought (reflection, question, intuition, observation, etc.)
- `trigger` (str, default=""): What triggered this thought
- `concepts` (list[str], default=[]): Related concepts/tags
- `privacy_level` (str, default="private"): Privacy level (private, shared, public)
**Returns:**
```python
{
    "success": True,
    "uuid": "730c1a8e-b09f-4889-bbe9-4867d0ee7f1a",
    "content": "This is a test thought...",
    "thought_type": "observation"
}
```
**Example:**
```python
result = await add_thought(
    content="Exploring vector databases for semantic search",
    thought_type="observation",
    trigger="Research session",
    concepts=["weaviate", "embeddings", "gpu"],
    privacy_level="private"
)
```
#### 2. search_thoughts
Search thoughts using semantic similarity.
**Parameters:**
- `query` (str, required): Search query text
- `limit` (int, default=10, range=1-100): Maximum results to return
- `thought_type_filter` (str, optional): Filter by thought type
**Returns:**
```python
{
    "success": True,
    "query": "vector databases GPU",
    "results": [
        {
            "uuid": "...",
            "content": "...",
            "thought_type": "observation",
            "timestamp": "2025-01-08T...",
            "trigger": "...",
            "concepts": ["weaviate", "gpu"]
        }
    ],
    "count": 5
}
```
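**Example** (illustrative call; query and filter values are placeholders):
```python
result = await search_thoughts(
    query="vector databases GPU",
    limit=5,
    thought_type_filter="observation"
)
```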
#### 3. get_thought
Retrieve a specific thought by UUID.
**Parameters:**
- `uuid` (str, required): Thought UUID
**Returns:**
```python
{
    "success": True,
    "uuid": "730c1a8e-b09f-4889-bbe9-4867d0ee7f1a",
    "content": "...",
    "thought_type": "observation",
    "timestamp": "2025-01-08T...",
    "trigger": "...",
    "concepts": [...],
    "privacy_level": "private",
    "emotional_state": "",
    "context": ""
}
```
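**Example** (illustrative; reuses the sample UUID from the response above):
```python
result = await get_thought(uuid="730c1a8e-b09f-4889-bbe9-4867d0ee7f1a")
```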
---
### Message Tools (3)
#### 1. add_message
Add a new message to a conversation.
**Parameters:**
- `content` (str, required): Message content
- `role` (str, required): Role (user, assistant, system)
- `conversation_id` (str, required): Conversation identifier
- `order_index` (int, default=0): Position in conversation
**Returns:**
```python
{
    "success": True,
    "uuid": "...",
    "content": "Hello, this is a test...",
    "role": "user",
    "conversation_id": "test_conversation_001"
}
```
**Example:**
```python
result = await add_message(
    content="Explain transformers in AI",
    role="user",
    conversation_id="chat_2025_01_08",
    order_index=0
)
```
#### 2. get_messages
Get all messages from a conversation in order.
**Parameters:**
- `conversation_id` (str, required): Conversation identifier
- `limit` (int, default=50, range=1-500): Maximum messages to return
**Returns:**
```python
{
    "success": True,
    "conversation_id": "test_conversation_001",
    "messages": [
        {
            "uuid": "...",
            "content": "...",
            "role": "user",
            "timestamp": "2025-01-08T...",
            "order_index": 0
        },
        {
            "uuid": "...",
            "content": "...",
            "role": "assistant",
            "timestamp": "2025-01-08T...",
            "order_index": 1
        }
    ],
    "count": 2
}
```
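**Example** (illustrative; the conversation ID is a placeholder):
```python
result = await get_messages(
    conversation_id="test_conversation_001",
    limit=50
)
```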
#### 3. search_messages
Search messages using semantic similarity.
**Parameters:**
- `query` (str, required): Search query text
- `limit` (int, default=10, range=1-100): Maximum results
- `conversation_id_filter` (str, optional): Filter by conversation
**Returns:**
```python
{
"success": True,
"query": "transformers AI",
"results": [...],
"count": 5
}
```
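**Example** (illustrative; the filter reuses the conversation ID from the add_message example above):
```python
result = await search_messages(
    query="transformers AI",
    limit=5,
    conversation_id_filter="chat_2025_01_08"
)
```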
---
### Conversation Tools (3)
#### 1. get_conversation
Get a specific conversation by ID.
**Parameters:**
- `conversation_id` (str, required): Conversation identifier
**Returns:**
```python
{
    "success": True,
    "conversation_id": "ikario_derniere_pensee",
    "category": "testing",
    "summary": "Conversation with 2 participants...",
    "timestamp_start": "2025-01-06T...",
    "timestamp_end": "2025-01-06T...",
    "participants": ["assistant", "user"],
    "tags": [],
    "message_count": 19
}
```
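**Example** (illustrative; the ID is taken from the sample response above):
```python
result = await get_conversation(conversation_id="ikario_derniere_pensee")
```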
#### 2. search_conversations
Search conversations using semantic similarity on summaries.
**Parameters:**
- `query` (str, required): Search query text
- `limit` (int, default=10, range=1-50): Maximum results
- `category_filter` (str, optional): Filter by category
**Returns:**
```python
{
"success": True,
"query": "philosophical discussion",
"results": [
{
"conversation_id": "...",
"category": "philosophy",
"summary": "...",
"timestamp_start": "...",
"timestamp_end": "...",
"participants": [...],
"message_count": 25
}
],
"count": 5
}
```
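**Example** (illustrative; query and category values are placeholders):
```python
result = await search_conversations(
    query="philosophical discussion",
    limit=5,
    category_filter="philosophy"
)
```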
#### 3. list_conversations
List all conversations with optional filtering.
**Parameters:**
- `limit` (int, default=20, range=1-100): Maximum conversations to return
- `category_filter` (str, optional): Filter by category
**Returns:**
```python
{
    "success": True,
    "conversations": [
        {
            "conversation_id": "...",
            "category": "testing",
            "summary": "Conversation with 2 participants... (truncated)",
            "timestamp_start": "...",
            "message_count": 19,
            "participants": [...]
        }
    ],
    "count": 10
}
```
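**Example** (illustrative; the category value is a placeholder):
```python
result = await list_conversations(
    limit=20,
    category_filter="testing"
)
```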
---
## Implementation Details
### Handler Pattern
All tools follow a consistent async handler pattern:
```python
import weaviate
from typing import Any, Dict

from memory.core import get_embedder

async def tool_handler(input_data: InputModel) -> Dict[str, Any]:
    """Handler function."""
    try:
        # 1. Connect to Weaviate
        client = weaviate.connect_to_local()
        try:
            # 2. Get GPU embedder (for vectorization)
            embedder = get_embedder()
            # 3. Generate vector (if needed)
            vector = embedder.embed_batch([text])[0]
            # 4. Query/Insert data
            collection = client.collections.get("CollectionName")
            result = collection.data.insert(...)
            # 5. Return success response
            return {"success": True, ...}
        finally:
            client.close()
    except Exception as e:
        return {"success": False, "error": str(e)}
```
### GPU Vectorization
All text content is vectorized using the GPU-accelerated embedder:
```python
from memory.core import get_embedder
embedder = get_embedder() # Returns PyTorch GPU embedder
vector = embedder.embed_batch([content])[0] # Returns 1024-dim FP16 vector
```
### Weaviate Connection
Each tool handler creates a new connection and closes it after use:
```python
client = weaviate.connect_to_local()  # Connects to localhost:8080
try:
    # Perform operations
    collection = client.collections.get("Thought")
    # ...
finally:
    client.close()  # Always close connection
```
## Testing
A comprehensive test suite is available at `test_memory_mcp_tools.py`:
```bash
python test_memory_mcp_tools.py
```
**Test Results (2025-01-08):**
```
============================================================
TESTING THOUGHT TOOLS
============================================================
[OK] add_thought: Created thought with UUID
[OK] search_thoughts: Found 5 thoughts
[OK] get_thought: Retrieved thought successfully
============================================================
TESTING MESSAGE TOOLS
============================================================
[OK] add_message: Added 3 messages (user, assistant, user)
[OK] get_messages: Retrieved 3 messages in order
[OK] search_messages: Found 5 messages
============================================================
TESTING CONVERSATION TOOLS
============================================================
[OK] list_conversations: Found 10 conversations
[OK] get_conversation: Retrieved conversation metadata
[OK] search_conversations: Found 5 conversations
[OK] ALL TESTS COMPLETED
============================================================
```
## Integration with MCP Server
The Memory tools are integrated into `generations/library_rag/mcp_server.py` alongside the existing Library RAG tools:
**Total tools available: 17**
- Library RAG: 8 tools (search_documents, add_document, etc.)
- Memory: 9 tools (thought, message, conversation tools)
**Configuration:**
The MCP server is configured in Claude Desktop settings:
```json
{
  "mcpServers": {
    "library-rag": {
      "command": "python",
      "args": ["C:/GitHub/linear_coding_library_rag/generations/library_rag/mcp_server.py"]
    }
  }
}
```
## Error Handling
All tools return consistent error responses:
```python
{
    "success": False,
    "error": "Error message description"
}
```
Common errors:
- Connection errors: "Failed to connect to Weaviate"
- Not found: "Conversation {id} not found"
- Validation errors: "Invalid parameter: {details}"
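Callers can branch on the `success` flag; a minimal sketch (the missing conversation ID is a placeholder):
```python
result = await get_conversation(conversation_id="nonexistent_id")
if not result["success"]:
    print(f"Tool call failed: {result['error']}")
```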
## Performance
- **Vectorization**: ~50-100ms per text on RTX 4070 GPU
- **Search latency**: <100ms for near-vector queries
- **Batch operations**: Use `embedder.embed_batch()` for efficiency (see the sketch below)
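When vectorizing several texts, a single batched call avoids repeated per-text GPU calls; a minimal sketch using the accessor shown earlier (the text values are placeholders):
```python
from memory.core import get_embedder

embedder = get_embedder()
texts = ["first thought", "second thought", "third thought"]
# Batched on the GPU: returns one 1024-dim FP16 vector per input text
vectors = embedder.embed_batch(texts)
```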
## Next Steps
**Phase 5: Backend Integration** (Pending)
- Update Flask routes to use Weaviate Memory tools
- Replace ChromaDB calls with new MCP tool calls
- Connect flask-app frontend to new backend
## Module Structure
```
memory/
├── core/
│   ├── __init__.py             # GPU embedder initialization
│   └── config.py               # Weaviate connection config
├── mcp/
│   ├── __init__.py             # Tool exports
│   ├── thought_tools.py        # Thought handlers
│   ├── message_tools.py        # Message handlers
│   └── conversation_tools.py   # Conversation handlers
└── README_MCP_TOOLS.md         # This file
```
## Dependencies
- weaviate-client >= 4.0.0
- PyTorch 2.6.0+cu124
- transformers (for BAAI/bge-m3)
- pydantic (for input validation)
- FastMCP framework
## Related Documentation
- Weaviate Schema: `memory/schemas/` (Thought, Message, Conversation schemas)
- Migration Scripts: `memory/migration/` (ChromaDB → Weaviate migration)
- Library RAG README: `generations/library_rag/README.md`
---
**Last Updated**: 2025-01-08
**Status**: Phase 4 Complete ✓
**Next Phase**: Phase 5 - Backend Integration