MEMORY SYSTEM ARCHITECTURE:
- Weaviate-based memory storage (Thought, Message, Conversation collections)
- GPU embeddings with BAAI/bge-m3 (1024-dim, RTX 4070)
- 9 MCP tools for Claude Desktop integration

CORE MODULES (memory/):
- core/embedding_service.py: GPU embedder singleton with PyTorch
- schemas/memory_schemas.py: Weaviate schema definitions
- mcp/thought_tools.py: add_thought, search_thoughts, get_thought
- mcp/message_tools.py: add_message, get_messages, search_messages
- mcp/conversation_tools.py: get_conversation, search_conversations, list_conversations

FLASK TEMPLATES:
- conversation_view.html: Display single conversation with messages
- conversations.html: List all conversations with search
- memories.html: Browse and search thoughts

FEATURES:
- Semantic search across thoughts, messages, conversations
- Privacy levels (private, shared, public)
- Thought types (reflection, question, intuition, observation)
- Conversation categories with filtering
- Message ordering and role-based display

DATA (as of 2026-01-08):
- 102 Thoughts
- 377 Messages
- 12 Conversations

DOCUMENTATION:
- memory/README_MCP_TOOLS.md: Complete API reference and usage examples

All MCP tools tested and validated (see test_memory_mcp_tools.py in archive).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Memory MCP Tools Documentation
Overview
The Memory MCP tools provide a complete interface for managing thoughts, messages, and conversations in the unified Weaviate-based memory system. These tools are integrated into the Library RAG MCP server (generations/library_rag/mcp_server.py) and use GPU-accelerated embeddings for semantic search.
Architecture
- Backend: Weaviate 1.34.4 (local instance)
- Embeddings: BAAI/bge-m3 model (1024 dimensions, FP16 precision)
- GPU: CUDA-enabled (RTX 4070) via PyTorch 2.6.0+cu124
- Collections: 3 Weaviate collections (Thought, Message, Conversation)
- Integration: FastMCP framework with async handlers
Available Tools
Thought Tools (3)
1. add_thought
Add a new thought to the memory system.
Parameters:
- content (str, required): The thought content
- thought_type (str, default="reflection"): Type of thought (reflection, question, intuition, observation, etc.)
- trigger (str, default=""): What triggered this thought
- concepts (list[str], default=[]): Related concepts/tags
- privacy_level (str, default="private"): Privacy level (private, shared, public)
Returns:
{
    "success": True,
    "uuid": "730c1a8e-b09f-4889-bbe9-4867d0ee7f1a",
    "content": "This is a test thought...",
    "thought_type": "observation"
}
Example:
result = await add_thought(
    content="Exploring vector databases for semantic search",
    thought_type="observation",
    trigger="Research session",
    concepts=["weaviate", "embeddings", "gpu"],
    privacy_level="private"
)
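The parameter list above can be read as an input model. The following is a minimal sketch using a standard-library dataclass as a stand-in for the pydantic model the tools rely on. The class name `AddThoughtInput`, the empty-content check, and the strict privacy-level check are assumptions made for illustration, not the project's actual validation rules.

```python
from dataclasses import dataclass, field
from typing import List

# Values taken from the parameter list above; enforcing them strictly is an assumption.
ALLOWED_PRIVACY_LEVELS = {"private", "shared", "public"}


@dataclass
class AddThoughtInput:
    """Hypothetical stand-in for the pydantic input model of add_thought."""

    content: str
    thought_type: str = "reflection"
    trigger: str = ""
    concepts: List[str] = field(default_factory=list)
    privacy_level: str = "private"

    def __post_init__(self) -> None:
        # Reject empty content and unknown privacy levels before any storage call.
        if not self.content.strip():
            raise ValueError("content must not be empty")
        if self.privacy_level not in ALLOWED_PRIVACY_LEVELS:
            raise ValueError(f"unknown privacy_level: {self.privacy_level!r}")


thought = AddThoughtInput(content="Exploring vector databases for semantic search")
print(thought.thought_type, thought.privacy_level)
```

Only `content` is required; every other field falls back to the documented default.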
2. search_thoughts
Search thoughts using semantic similarity.
Parameters:
- query (str, required): Search query text
- limit (int, default=10, range=1-100): Maximum results to return
- thought_type_filter (str, optional): Filter by thought type
Returns:
{
    "success": True,
    "query": "vector databases GPU",
    "results": [
        {
            "uuid": "...",
            "content": "...",
            "thought_type": "observation",
            "timestamp": "2025-01-08T...",
            "trigger": "...",
            "concepts": ["weaviate", "gpu"]
        }
    ],
    "count": 5
}
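To make "semantic similarity" concrete, here is a small pure-Python sketch that ranks toy 3-dimensional vectors by cosine similarity to a query vector. In the real system the vectors have 1024 dimensions and the ranking happens inside Weaviate's vector index. The function names, the toy data, and the choice of cosine as the metric are assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query_vec: Sequence[float],
    items: Dict[str, Sequence[float]],
    limit: int = 10,
) -> List[str]:
    """Return item keys ordered from most to least similar, capped at limit."""
    ranked = sorted(
        items,
        key=lambda key: cosine_similarity(query_vec, items[key]),
        reverse=True,
    )
    return ranked[:limit]


# Toy vectors standing in for embedded thoughts.
toy_vectors = {
    "thought-a": [0.9, 0.1, 0.0],
    "thought-b": [0.0, 1.0, 0.0],
    "thought-c": [0.7, 0.7, 0.0],
}
top = rank_by_similarity([1.0, 0.0, 0.0], toy_vectors, limit=2)
print(top)
```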
3. get_thought
Retrieve a specific thought by UUID.
Parameters:
- uuid (str, required): Thought UUID
Returns:
{
    "success": True,
    "uuid": "730c1a8e-b09f-4889-bbe9-4867d0ee7f1a",
    "content": "...",
    "thought_type": "observation",
    "timestamp": "2025-01-08T...",
    "trigger": "...",
    "concepts": [...],
    "privacy_level": "private",
    "emotional_state": "",
    "context": ""
}
Message Tools (3)
1. add_message
Add a new message to a conversation.
Parameters:
- content (str, required): Message content
- role (str, required): Role (user, assistant, system)
- conversation_id (str, required): Conversation identifier
- order_index (int, default=0): Position in conversation
Returns:
{
    "success": True,
    "uuid": "...",
    "content": "Hello, this is a test...",
    "role": "user",
    "conversation_id": "test_conversation_001"
}
Example:
result = await add_message(
    content="Explain transformers in AI",
    role="user",
    conversation_id="chat_2025_01_08",
    order_index=0
)
2. get_messages
Get all messages from a conversation in order.
Parameters:
- conversation_id (str, required): Conversation identifier
- limit (int, default=50, range=1-500): Maximum messages to return
Returns:
{
    "success": True,
    "conversation_id": "test_conversation_001",
    "messages": [
        {
            "uuid": "...",
            "content": "...",
            "role": "user",
            "timestamp": "2025-01-08T...",
            "order_index": 0
        },
        {
            "uuid": "...",
            "content": "...",
            "role": "assistant",
            "timestamp": "2025-01-08T...",
            "order_index": 1
        }
    ],
    "count": 2
}
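The ordering contract of get_messages (messages come back sorted by order_index, capped at limit) can be sketched on plain dictionaries. The helper name `order_messages` and the sample data are hypothetical, and the real tool may sort inside the Weaviate query rather than in Python.

```python
from typing import Any, Dict, List


def order_messages(messages: List[Dict[str, Any]], limit: int = 50) -> List[Dict[str, Any]]:
    """Sort messages by their position in the conversation and cap the count."""
    ordered = sorted(messages, key=lambda message: message["order_index"])
    return ordered[:limit]


# Messages as they might come back from storage, in arbitrary order.
stored = [
    {"role": "assistant", "content": "Transformers use attention.", "order_index": 1},
    {"role": "user", "content": "Explain transformers in AI", "order_index": 0},
    {"role": "user", "content": "Thanks!", "order_index": 2},
]
for message in order_messages(stored, limit=2):
    print(message["order_index"], message["role"])
```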
3. search_messages
Search messages using semantic similarity.
Parameters:
- query (str, required): Search query text
- limit (int, default=10, range=1-100): Maximum results
- conversation_id_filter (str, optional): Filter by conversation
Returns:
{
    "success": True,
    "query": "transformers AI",
    "results": [...],
    "count": 5
}
Conversation Tools (3)
1. get_conversation
Get a specific conversation by ID.
Parameters:
- conversation_id (str, required): Conversation identifier
Returns:
{
    "success": True,
    "conversation_id": "ikario_derniere_pensee",
    "category": "testing",
    "summary": "Conversation with 2 participants...",
    "timestamp_start": "2025-01-06T...",
    "timestamp_end": "2025-01-06T...",
    "participants": ["assistant", "user"],
    "tags": [],
    "message_count": 19
}
2. search_conversations
Search conversations using semantic similarity on summaries.
Parameters:
- query (str, required): Search query text
- limit (int, default=10, range=1-50): Maximum results
- category_filter (str, optional): Filter by category
Returns:
{
    "success": True,
    "query": "philosophical discussion",
    "results": [
        {
            "conversation_id": "...",
            "category": "philosophy",
            "summary": "...",
            "timestamp_start": "...",
            "timestamp_end": "...",
            "participants": [...],
            "message_count": 25
        }
    ],
    "count": 5
}
3. list_conversations
List all conversations with optional filtering.
Parameters:
- limit (int, default=20, range=1-100): Maximum conversations to return
- category_filter (str, optional): Filter by category
Returns:
{
    "success": True,
    "conversations": [
        {
            "conversation_id": "...",
            "category": "testing",
            "summary": "Conversation with 2 participants... (truncated)",
            "timestamp_start": "...",
            "message_count": 19,
            "participants": [...]
        }
    ],
    "count": 10
}
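The listing behaviour described above (optional category filter, a cap on the result count, shortened summaries) might look roughly like this when applied to in-memory data. The function name and the preview length `SUMMARY_PREVIEW_CHARS` are hypothetical; the actual truncation limit is not documented here.

```python
from typing import Any, Dict, List, Optional

# Hypothetical preview length; the real truncation limit is not documented here.
SUMMARY_PREVIEW_CHARS = 30


def list_conversations_sketch(
    conversations: List[Dict[str, Any]],
    limit: int = 20,
    category_filter: Optional[str] = None,
) -> Dict[str, Any]:
    """Filter by category, cap the result count, and shorten long summaries."""
    selected = [
        conv for conv in conversations
        if category_filter is None or conv["category"] == category_filter
    ]
    previews = []
    for conv in selected[:limit]:
        summary = conv["summary"]
        if len(summary) > SUMMARY_PREVIEW_CHARS:
            summary = summary[:SUMMARY_PREVIEW_CHARS] + "... (truncated)"
        previews.append({**conv, "summary": summary})
    return {"success": True, "conversations": previews, "count": len(previews)}


sample = [
    {"conversation_id": "c1", "category": "testing", "summary": "Short summary"},
    {"conversation_id": "c2", "category": "philosophy",
     "summary": "A long discussion about memory and identity"},
    {"conversation_id": "c3", "category": "testing", "summary": "Another short one"},
]
result = list_conversations_sketch(sample, category_filter="testing")
print(result["count"], [conv["conversation_id"] for conv in result["conversations"]])
```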
Implementation Details
Handler Pattern
All tools follow a consistent async handler pattern:
from typing import Any, Dict

import weaviate

from memory.core import get_embedder


# InputModel and text are placeholders for the tool-specific input model and field.
async def tool_handler(input_data: InputModel) -> Dict[str, Any]:
    """Handler function."""
    try:
        # 1. Connect to Weaviate
        client = weaviate.connect_to_local()
        try:
            # 2. Get GPU embedder (for vectorization)
            embedder = get_embedder()
            # 3. Generate vector (if needed)
            vector = embedder.embed_batch([text])[0]
            # 4. Query/Insert data
            collection = client.collections.get("CollectionName")
            result = collection.data.insert(...)
            # 5. Return success response (plus tool-specific fields)
            return {"success": True}
        finally:
            # Runs on success and on failure, so the connection is always released
            client.close()
    except Exception as e:
        return {"success": False, "error": str(e)}
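The control flow of this pattern can be exercised without a running Weaviate instance. The sketch below replaces the client and the embedder with hypothetical stand-ins (`FakeClient`, `fake_embed`) to show that the connection is closed on both the success and the failure path, and that failures surface as an error dictionary rather than an exception.

```python
import asyncio
from typing import Any, Dict, List


class FakeClient:
    """Hypothetical stand-in for the Weaviate client; records whether close() ran."""

    def __init__(self, fail: bool = False) -> None:
        self.fail = fail
        self.closed = False
        self.rows: List[Dict[str, Any]] = []

    def insert(self, properties: Dict[str, Any], vector: List[float]) -> str:
        if self.fail:
            raise RuntimeError("insert failed")
        self.rows.append({"properties": properties, "vector": vector})
        return f"fake-uuid-{len(self.rows)}"

    def close(self) -> None:
        self.closed = True


def fake_embed(text: str) -> List[float]:
    """Toy embedding; the real system uses the GPU embedder instead."""
    return [float(len(text)), 0.0]


async def add_thought_sketch(content: str, client: FakeClient) -> Dict[str, Any]:
    try:
        try:
            vector = fake_embed(content)
            uuid = client.insert({"content": content}, vector)
            return {"success": True, "uuid": uuid, "content": content}
        finally:
            # Runs whether insert() returned or raised.
            client.close()
    except Exception as exc:
        return {"success": False, "error": str(exc)}


ok_client, bad_client = FakeClient(), FakeClient(fail=True)
ok = asyncio.run(add_thought_sketch("hello", ok_client))
failed = asyncio.run(add_thought_sketch("hello", bad_client))
print(ok["success"], failed["success"], ok_client.closed, bad_client.closed)
```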
GPU Vectorization
All text content is vectorized using the GPU-accelerated embedder:
from memory.core import get_embedder
embedder = get_embedder() # Returns PyTorch GPU embedder
vector = embedder.embed_batch([content])[0] # Returns 1024-dim FP16 vector
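Because loading the model is expensive, the embedder is shared as a singleton. How `get_embedder()` caches its instance is not shown in this document; one common approach, sketched here with a hypothetical `StubEmbedder` that returns zero vectors instead of loading BAAI/bge-m3, is to memoize the factory function.

```python
from functools import lru_cache
from typing import List


class StubEmbedder:
    """Hypothetical stand-in for the GPU embedder; counts how often it is built."""

    load_count = 0

    def __init__(self) -> None:
        # In the real embedder this is where the model would be loaded onto the GPU.
        StubEmbedder.load_count += 1

    def embed_batch(self, texts: List[str]) -> List[List[float]]:
        # One 1024-dimensional placeholder vector per input text.
        return [[0.0] * 1024 for _ in texts]


@lru_cache(maxsize=1)
def get_embedder_sketch() -> StubEmbedder:
    """Build the embedder on first call and reuse it afterwards."""
    return StubEmbedder()


first = get_embedder_sketch()
second = get_embedder_sketch()
vector = first.embed_batch(["some text"])[0]
print(first is second, StubEmbedder.load_count, len(vector))
```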
Weaviate Connection
Each tool handler creates a new connection and closes it after use:
client = weaviate.connect_to_local()  # Connects to localhost:8080
try:
    # Perform operations
    collection = client.collections.get("Thought")
    # ...
finally:
    client.close()  # Always close connection
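The same guarantee can be written more compactly with `contextlib.closing`, which calls `close()` when the `with` block exits. This sketch uses a hypothetical `StubClient` and `connect_stub()` in place of the real client so it runs without Weaviate. It illustrates the idiom and does not describe how the tools are currently written.

```python
from contextlib import closing
from typing import List


class StubClient:
    """Hypothetical stand-in for a client object that exposes close()."""

    def __init__(self) -> None:
        self.closed = False

    def list_collections(self) -> List[str]:
        return ["Thought", "Message", "Conversation"]

    def close(self) -> None:
        self.closed = True


def connect_stub() -> StubClient:
    return StubClient()


# closing() calls client.close() on exit, even if the body raises.
with closing(connect_stub()) as client:
    names = client.list_collections()

print(names, client.closed)
```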
Testing
A comprehensive test suite is available at test_memory_mcp_tools.py:
python test_memory_mcp_tools.py
Test Results (2025-01-08):
============================================================
TESTING THOUGHT TOOLS
============================================================
[OK] add_thought: Created thought with UUID
[OK] search_thoughts: Found 5 thoughts
[OK] get_thought: Retrieved thought successfully
============================================================
TESTING MESSAGE TOOLS
============================================================
[OK] add_message: Added 3 messages (user, assistant, user)
[OK] get_messages: Retrieved 3 messages in order
[OK] search_messages: Found 5 messages
============================================================
TESTING CONVERSATION TOOLS
============================================================
[OK] list_conversations: Found 10 conversations
[OK] get_conversation: Retrieved conversation metadata
[OK] search_conversations: Found 5 conversations
[OK] ALL TESTS COMPLETED
============================================================
Integration with MCP Server
The Memory tools are integrated into generations/library_rag/mcp_server.py alongside the existing Library RAG tools:
Total tools available: 17
- Library RAG: 8 tools (search_documents, add_document, etc.)
- Memory: 9 tools (thought, message, conversation tools)
Configuration: The MCP server is configured in Claude Desktop settings:
{
    "mcpServers": {
        "library-rag": {
            "command": "python",
            "args": ["C:/GitHub/linear_coding_library_rag/generations/library_rag/mcp_server.py"]
        }
    }
}
Error Handling
All tools return consistent error responses:
{
    "success": False,
    "error": "Error message description"
}
Common errors:
- Connection errors: "Failed to connect to Weaviate"
- Not found: "Conversation {id} not found"
- Validation errors: "Invalid parameter: {details}"
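Since every tool reports failure through the same two keys, callers can branch on `success` instead of catching exceptions. A minimal sketch, assuming a hypothetical in-memory store and hypothetical helper names (`error_response`, `get_conversation_sketch`):

```python
from typing import Any, Dict


def error_response(message: str) -> Dict[str, Any]:
    """Build the error shape shared by all tools."""
    return {"success": False, "error": message}


def get_conversation_sketch(
    conversation_id: str, store: Dict[str, Dict[str, Any]]
) -> Dict[str, Any]:
    """Look up a conversation in an in-memory dict standing in for Weaviate."""
    conversation = store.get(conversation_id)
    if conversation is None:
        return error_response(f"Conversation {conversation_id} not found")
    return {"success": True, "conversation_id": conversation_id, **conversation}


store = {"chat_2025_01_08": {"category": "testing", "message_count": 2}}
for conversation_id in ("chat_2025_01_08", "missing_id"):
    result = get_conversation_sketch(conversation_id, store)
    if result["success"]:
        print("found", result["message_count"])
    else:
        print("error:", result["error"])
```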
Performance
- Vectorization: ~50-100ms per text on RTX 4070 GPU
- Search latency: <100ms for near-vector queries
- Batch operations: Use embedder.embed_batch() for efficiency
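The batching advice can be illustrated by splitting a list of texts into chunks and calling `embed_batch()` once per chunk rather than once per text. The `CountingEmbedder` stub, the helper names, and the batch size of 32 are assumptions made for illustration. A suitable batch size depends on text length and available GPU memory.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most size items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


class CountingEmbedder:
    """Hypothetical stand-in that counts embed_batch() calls."""

    def __init__(self) -> None:
        self.calls = 0

    def embed_batch(self, texts: List[str]) -> List[List[float]]:
        self.calls += 1
        return [[float(len(text))] for text in texts]


def embed_all(
    texts: Sequence[str], embedder: CountingEmbedder, batch_size: int = 32
) -> List[List[float]]:
    """Embed all texts with one embed_batch() call per chunk."""
    vectors: List[List[float]] = []
    for batch in chunked(texts, batch_size):
        vectors.extend(embedder.embed_batch(batch))
    return vectors


embedder = CountingEmbedder()
vectors = embed_all([f"text {i}" for i in range(70)], embedder, batch_size=32)
print(len(vectors), embedder.calls)
```

Here 70 texts are embedded in 3 calls instead of 70.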
Next Steps
Phase 5: Backend Integration (Pending)
- Update Flask routes to use Weaviate Memory tools
- Replace ChromaDB calls with new MCP tool calls
- Connect flask-app frontend to new backend
Module Structure
memory/
├── core/
│   ├── __init__.py              # GPU embedder initialization
│   └── config.py                # Weaviate connection config
├── mcp/
│   ├── __init__.py              # Tool exports
│   ├── thought_tools.py         # Thought handlers
│   ├── message_tools.py         # Message handlers
│   └── conversation_tools.py    # Conversation handlers
└── README_MCP_TOOLS.md          # This file
Dependencies
- weaviate-client >= 4.0.0
- PyTorch 2.6.0+cu124
- transformers (for BAAI/bge-m3)
- pydantic (for input validation)
- FastMCP framework
Related Documentation
- Weaviate Schema: memory/schemas/ (Thought, Message, Conversation schemas)
- Migration Scripts: memory/migration/ (ChromaDB → Weaviate migration)
- Library RAG README: generations/library_rag/README.md
Last Updated: 2025-01-08
Status: Phase 4 Complete ✓
Next Phase: Phase 5 - Backend Integration