feat: Add Memory system with Weaviate integration and MCP tools
MEMORY SYSTEM ARCHITECTURE:
- Weaviate-based memory storage (Thought, Message, Conversation collections)
- GPU embeddings with BAAI/bge-m3 (1024-dim, RTX 4070)
- 9 MCP tools for Claude Desktop integration

CORE MODULES (memory/):
- core/embedding_service.py: GPU embedder singleton with PyTorch
- schemas/memory_schemas.py: Weaviate schema definitions
- mcp/thought_tools.py: add_thought, search_thoughts, get_thought
- mcp/message_tools.py: add_message, get_messages, search_messages
- mcp/conversation_tools.py: get_conversation, search_conversations, list_conversations

FLASK TEMPLATES:
- conversation_view.html: Display a single conversation with its messages
- conversations.html: List all conversations with search
- memories.html: Browse and search thoughts

FEATURES:
- Semantic search across thoughts, messages, and conversations
- Privacy levels (private, shared, public)
- Thought types (reflection, question, intuition, observation)
- Conversation categories with filtering
- Message ordering and role-based display

DATA (as of 2025-01-08):
- 102 Thoughts
- 377 Messages
- 12 Conversations

DOCUMENTATION:
- memory/README_MCP_TOOLS.md: Complete API reference and usage examples

All MCP tools tested and validated (see test_memory_mcp_tools.py in archive).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
New file: memory/README_MCP_TOOLS.md (440 lines)
# Memory MCP Tools Documentation

## Overview

The Memory MCP tools provide a complete interface for managing thoughts, messages, and conversations in the unified Weaviate-based memory system. These tools are integrated into the Library RAG MCP server (`generations/library_rag/mcp_server.py`) and use GPU-accelerated embeddings for semantic search.

## Architecture

- **Backend**: Weaviate 1.34.4 (local instance)
- **Embeddings**: BAAI/bge-m3 model (1024 dimensions, FP16 precision)
- **GPU**: CUDA-enabled (RTX 4070) via PyTorch 2.6.0+cu124
- **Collections**: 3 Weaviate collections (Thought, Message, Conversation)
- **Integration**: FastMCP framework with async handlers

## Available Tools

### Thought Tools (3)

#### 1. add_thought
Add a new thought to the memory system.

**Parameters:**
- `content` (str, required): The thought content
- `thought_type` (str, default="reflection"): Type of thought (reflection, question, intuition, observation, etc.)
- `trigger` (str, default=""): What triggered this thought
- `concepts` (list[str], default=[]): Related concepts/tags
- `privacy_level` (str, default="private"): Privacy level (private, shared, public)

**Returns:**
```python
{
    "success": True,
    "uuid": "730c1a8e-b09f-4889-bbe9-4867d0ee7f1a",
    "content": "This is a test thought...",
    "thought_type": "observation"
}
```

**Example:**
```python
result = await add_thought(
    content="Exploring vector databases for semantic search",
    thought_type="observation",
    trigger="Research session",
    concepts=["weaviate", "embeddings", "gpu"],
    privacy_level="private"
)
```

#### 2. search_thoughts
Search thoughts using semantic similarity.

**Parameters:**
- `query` (str, required): Search query text
- `limit` (int, default=10, range=1-100): Maximum results to return
- `thought_type_filter` (str, optional): Filter by thought type

**Returns:**
```python
{
    "success": True,
    "query": "vector databases GPU",
    "results": [
        {
            "uuid": "...",
            "content": "...",
            "thought_type": "observation",
            "timestamp": "2025-01-08T...",
            "trigger": "...",
            "concepts": ["weaviate", "gpu"]
        }
    ],
    "count": 5
}
```
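The `limit` range documented above is enforced before the query runs. A minimal sketch of that guard, assuming a hypothetical `clamp_limit` helper (the real handlers validate input via pydantic models):

```python
def clamp_limit(limit: int, lo: int = 1, hi: int = 100) -> int:
    """Force a requested result limit into the documented 1-100 range."""
    return max(lo, min(hi, int(limit)))
```

With this, an out-of-range request such as `limit=500` is served with 100 results instead of failing.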
#### 3. get_thought
Retrieve a specific thought by UUID.

**Parameters:**
- `uuid` (str, required): Thought UUID

**Returns:**
```python
{
    "success": True,
    "uuid": "730c1a8e-b09f-4889-bbe9-4867d0ee7f1a",
    "content": "...",
    "thought_type": "observation",
    "timestamp": "2025-01-08T...",
    "trigger": "...",
    "concepts": [...],
    "privacy_level": "private",
    "emotional_state": "",
    "context": ""
}
```
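One way to reject a malformed UUID string before querying Weaviate, sketched with the standard library (`is_valid_uuid` is an illustrative name, not part of the tool API):

```python
import uuid

def is_valid_uuid(value: str) -> bool:
    """Return True if value parses as a UUID (e.g. a Thought identifier)."""
    try:
        uuid.UUID(value)
        return True
    except (ValueError, AttributeError, TypeError):
        return False
```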
---

### Message Tools (3)

#### 1. add_message
Add a new message to a conversation.

**Parameters:**
- `content` (str, required): Message content
- `role` (str, required): Role (user, assistant, system)
- `conversation_id` (str, required): Conversation identifier
- `order_index` (int, default=0): Position in conversation

**Returns:**
```python
{
    "success": True,
    "uuid": "...",
    "content": "Hello, this is a test...",
    "role": "user",
    "conversation_id": "test_conversation_001"
}
```

**Example:**
```python
result = await add_message(
    content="Explain transformers in AI",
    role="user",
    conversation_id="chat_2025_01_08",
    order_index=0
)
```

#### 2. get_messages
Get all messages from a conversation, in order.

**Parameters:**
- `conversation_id` (str, required): Conversation identifier
- `limit` (int, default=50, range=1-500): Maximum messages to return

**Returns:**
```python
{
    "success": True,
    "conversation_id": "test_conversation_001",
    "messages": [
        {
            "uuid": "...",
            "content": "...",
            "role": "user",
            "timestamp": "2025-01-08T...",
            "order_index": 0
        },
        {
            "uuid": "...",
            "content": "...",
            "role": "assistant",
            "timestamp": "2025-01-08T...",
            "order_index": 1
        }
    ],
    "count": 2
}
```
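The in-order guarantee above comes from sorting on `order_index`. A minimal sketch of that step (the helper name is illustrative, not project code):

```python
def order_messages(messages: list[dict]) -> list[dict]:
    """Sort conversation messages by their order_index field."""
    return sorted(messages, key=lambda m: m["order_index"])
```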
#### 3. search_messages
Search messages using semantic similarity.

**Parameters:**
- `query` (str, required): Search query text
- `limit` (int, default=10, range=1-100): Maximum results
- `conversation_id_filter` (str, optional): Filter by conversation

**Returns:**
```python
{
    "success": True,
    "query": "transformers AI",
    "results": [...],
    "count": 5
}
```

---

### Conversation Tools (3)

#### 1. get_conversation
Get a specific conversation by ID.

**Parameters:**
- `conversation_id` (str, required): Conversation identifier

**Returns:**
```python
{
    "success": True,
    "conversation_id": "ikario_derniere_pensee",
    "category": "testing",
    "summary": "Conversation with 2 participants...",
    "timestamp_start": "2025-01-06T...",
    "timestamp_end": "2025-01-06T...",
    "participants": ["assistant", "user"],
    "tags": [],
    "message_count": 19
}
```

#### 2. search_conversations
Search conversations using semantic similarity on their summaries.

**Parameters:**
- `query` (str, required): Search query text
- `limit` (int, default=10, range=1-50): Maximum results
- `category_filter` (str, optional): Filter by category

**Returns:**
```python
{
    "success": True,
    "query": "philosophical discussion",
    "results": [
        {
            "conversation_id": "...",
            "category": "philosophy",
            "summary": "...",
            "timestamp_start": "...",
            "timestamp_end": "...",
            "participants": [...],
            "message_count": 25
        }
    ],
    "count": 5
}
```

#### 3. list_conversations
List all conversations, with optional filtering.

**Parameters:**
- `limit` (int, default=20, range=1-100): Maximum conversations to return
- `category_filter` (str, optional): Filter by category

**Returns:**
```python
{
    "success": True,
    "conversations": [
        {
            "conversation_id": "...",
            "category": "testing",
            "summary": "Conversation with 2 participants... (truncated)",
            "timestamp_start": "...",
            "message_count": 19,
            "participants": [...]
        }
    ],
    "count": 10
}
```
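Summaries in the listing are cut short, as the `(truncated)` marker above shows. A minimal sketch of that behavior, where both the helper name and the 100-character cutoff are assumptions, not the project's actual values:

```python
def truncate_summary(text: str, max_len: int = 100) -> str:
    """Shorten a long conversation summary for list output."""
    if len(text) <= max_len:
        return text
    return text[:max_len] + "... (truncated)"
```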
---

## Implementation Details

### Handler Pattern

All tools follow a consistent async handler pattern:

```python
async def tool_handler(input_data: InputModel) -> Dict[str, Any]:
    """Handler function."""
    try:
        # 1. Connect to Weaviate
        client = weaviate.connect_to_local()
        try:
            # 2. Get the GPU embedder (for vectorization)
            embedder = get_embedder()

            # 3. Generate a vector (if needed)
            vector = embedder.embed_batch([text])[0]

            # 4. Query/insert data
            collection = client.collections.get("CollectionName")
            result = collection.data.insert(...)

            # 5. Return a success response
            return {"success": True, ...}
        finally:
            client.close()
    except Exception as e:
        return {"success": False, "error": str(e)}
```
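The success/error envelope in the pattern above can be factored into a decorator. A minimal synchronous sketch (the `mcp_envelope` and `demo_handler` names are hypothetical; the real handlers are async):

```python
import functools

def mcp_envelope(func):
    """Wrap a handler so any exception becomes the documented error response."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            # Merge the handler's payload into the success envelope
            return {"success": True, **func(*args, **kwargs)}
        except Exception as e:
            return {"success": False, "error": str(e)}
    return wrapper

@mcp_envelope
def demo_handler(uuid: str):
    if not uuid:
        raise ValueError("Invalid parameter: uuid is required")
    return {"uuid": uuid}
```

Callers then always receive a dict with a `success` key, matching the response shapes documented above.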
### GPU Vectorization

All text content is vectorized with the GPU-accelerated embedder:

```python
from memory.core import get_embedder

embedder = get_embedder()                    # Returns the PyTorch GPU embedder
vector = embedder.embed_batch([content])[0]  # Returns a 1024-dim FP16 vector
```

### Weaviate Connection

Each tool handler creates a new connection and closes it after use:

```python
client = weaviate.connect_to_local()  # Connects to localhost:8080
try:
    # Perform operations
    collection = client.collections.get("Thought")
    # ...
finally:
    client.close()  # Always close the connection
```
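The same cleanup guarantee can be expressed with `contextlib.closing`, shown here against a stand-in client so the sketch runs without a live Weaviate instance:

```python
from contextlib import closing

class StubClient:
    """Stand-in for a Weaviate client; only mimics close()."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

client = StubClient()
with closing(client):
    pass  # perform operations here
assert client.closed  # connection is released even if the block raised
```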
## Testing

A comprehensive test suite is available in `test_memory_mcp_tools.py`:

```bash
python test_memory_mcp_tools.py
```

**Test Results (2025-01-08):**
```
============================================================
TESTING THOUGHT TOOLS
============================================================
[OK] add_thought: Created thought with UUID
[OK] search_thoughts: Found 5 thoughts
[OK] get_thought: Retrieved thought successfully

============================================================
TESTING MESSAGE TOOLS
============================================================
[OK] add_message: Added 3 messages (user, assistant, user)
[OK] get_messages: Retrieved 3 messages in order
[OK] search_messages: Found 5 messages

============================================================
TESTING CONVERSATION TOOLS
============================================================
[OK] list_conversations: Found 10 conversations
[OK] get_conversation: Retrieved conversation metadata
[OK] search_conversations: Found 5 conversations

[OK] ALL TESTS COMPLETED
============================================================
```

## Integration with MCP Server

The Memory tools are integrated into `generations/library_rag/mcp_server.py` alongside the existing Library RAG tools.

**Total tools available: 17**
- Library RAG: 8 tools (search_documents, add_document, etc.)
- Memory: 9 tools (thought, message, and conversation tools)

**Configuration:**
The MCP server is configured in the Claude Desktop settings:
```json
{
    "mcpServers": {
        "library-rag": {
            "command": "python",
            "args": ["C:/GitHub/linear_coding_library_rag/generations/library_rag/mcp_server.py"]
        }
    }
}
```

## Error Handling

All tools return a consistent error response:

```python
{
    "success": False,
    "error": "Error message description"
}
```

Common errors:
- Connection errors: "Failed to connect to Weaviate"
- Not found: "Conversation {id} not found"
- Validation errors: "Invalid parameter: {details}"

## Performance

- **Vectorization**: ~50-100 ms per text on the RTX 4070 GPU
- **Search latency**: <100 ms for near-vector queries
- **Batch operations**: Use `embedder.embed_batch()` for efficiency
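Per-text embedding calls pay the GPU launch overhead each time, which is why `embed_batch()` is preferred for bulk work. A minimal chunking helper for feeding it (the `batches` name and the batch size of 32 are assumptions, not project code):

```python
def batches(items: list, size: int = 32) -> list[list]:
    """Split a list of texts into fixed-size batches for embed_batch()."""
    return [items[i:i + size] for i in range(0, len(items), size)]
```

Each sub-list can then be passed to `embedder.embed_batch()` in one GPU call.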
## Next Steps

**Phase 5: Backend Integration** (pending)
- Update the Flask routes to use the Weaviate Memory tools
- Replace ChromaDB calls with the new MCP tool calls
- Connect the flask-app frontend to the new backend

## Module Structure

```
memory/
├── core/
│   ├── __init__.py            # GPU embedder initialization
│   └── config.py              # Weaviate connection config
├── mcp/
│   ├── __init__.py            # Tool exports
│   ├── thought_tools.py       # Thought handlers
│   ├── message_tools.py       # Message handlers
│   └── conversation_tools.py  # Conversation handlers
└── README_MCP_TOOLS.md        # This file
```

## Dependencies

- weaviate-client >= 4.0.0
- PyTorch 2.6.0+cu124
- transformers (for BAAI/bge-m3)
- pydantic (for input validation)
- FastMCP framework

## Related Documentation

- Weaviate schemas: `memory/schemas/` (Thought, Message, Conversation)
- Migration scripts: `memory/migration/` (ChromaDB → Weaviate)
- Library RAG README: `generations/library_rag/README.md`

---

**Last Updated**: 2025-01-08
**Status**: Phase 4 Complete ✓
**Next Phase**: Phase 5 - Backend Integration