feat: Add Memory system with Weaviate integration and MCP tools

MEMORY SYSTEM ARCHITECTURE:
- Weaviate-based memory storage (Thought, Message, Conversation collections)
- GPU embeddings with BAAI/bge-m3 (1024-dim, RTX 4070)
- 9 MCP tools for Claude Desktop integration

CORE MODULES (memory/):
- core/embedding_service.py: GPU embedder singleton with PyTorch
- schemas/memory_schemas.py: Weaviate schema definitions
- mcp/thought_tools.py: add_thought, search_thoughts, get_thought
- mcp/message_tools.py: add_message, get_messages, search_messages
- mcp/conversation_tools.py: get_conversation, search_conversations, list_conversations

FLASK TEMPLATES:
- conversation_view.html: Display single conversation with messages
- conversations.html: List all conversations with search
- memories.html: Browse and search thoughts

FEATURES:
- Semantic search across thoughts, messages, conversations
- Privacy levels (private, shared, public)
- Thought types (reflection, question, intuition, observation)
- Conversation categories with filtering
- Message ordering and role-based display

DATA (as of 2026-01-08):
- 102 Thoughts
- 377 Messages
- 12 Conversations

DOCUMENTATION:
- memory/README_MCP_TOOLS.md: Complete API reference and usage examples

All MCP tools tested and validated (see test_memory_mcp_tools.py in archive).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Commit: 2f34125ef6 (parent 187ba4854e)
Date: 2026-01-08 18:08:13 +01:00
13 changed files with 2145 additions and 0 deletions

memory/README_MCP_TOOLS.md (new file, 440 lines)

@@ -0,0 +1,440 @@
# Memory MCP Tools Documentation
## Overview
The Memory MCP tools provide a complete interface for managing thoughts, messages, and conversations in the unified Weaviate-based memory system. These tools are integrated into the Library RAG MCP server (`generations/library_rag/mcp_server.py`) and use GPU-accelerated embeddings for semantic search.
## Architecture
- **Backend**: Weaviate 1.34.4 (local instance)
- **Embeddings**: BAAI/bge-m3 model (1024 dimensions, FP16 precision)
- **GPU**: CUDA-enabled (RTX 4070) via PyTorch 2.6.0+cu124
- **Collections**: 3 Weaviate collections (Thought, Message, Conversation)
- **Integration**: FastMCP framework with async handlers
## Available Tools
### Thought Tools (3)
#### 1. add_thought
Add a new thought to the memory system.
**Parameters:**
- `content` (str, required): The thought content
- `thought_type` (str, default="reflection"): Type of thought (reflection, question, intuition, observation, etc.)
- `trigger` (str, default=""): What triggered this thought
- `concepts` (list[str], default=[]): Related concepts/tags
- `privacy_level` (str, default="private"): Privacy level (private, shared, public)
**Returns:**
```python
{
"success": True,
"uuid": "730c1a8e-b09f-4889-bbe9-4867d0ee7f1a",
"content": "This is a test thought...",
"thought_type": "observation"
}
```
**Example:**
```python
result = await add_thought(
content="Exploring vector databases for semantic search",
thought_type="observation",
trigger="Research session",
concepts=["weaviate", "embeddings", "gpu"],
privacy_level="private"
)
```
#### 2. search_thoughts
Search thoughts using semantic similarity.
**Parameters:**
- `query` (str, required): Search query text
- `limit` (int, default=10, range=1-100): Maximum results to return
- `thought_type_filter` (str, optional): Filter by thought type
**Returns:**
```python
{
"success": True,
"query": "vector databases GPU",
"results": [
{
"uuid": "...",
"content": "...",
"thought_type": "observation",
"timestamp": "2025-01-08T...",
"trigger": "...",
"concepts": ["weaviate", "gpu"]
}
],
"count": 5
}
```
#### 3. get_thought
Retrieve a specific thought by UUID.
**Parameters:**
- `uuid` (str, required): Thought UUID
**Returns:**
```python
{
"success": True,
"uuid": "730c1a8e-b09f-4889-bbe9-4867d0ee7f1a",
"content": "...",
"thought_type": "observation",
"timestamp": "2025-01-08T...",
"trigger": "...",
"concepts": [...],
"privacy_level": "private",
"emotional_state": "",
"context": ""
}
```
---
### Message Tools (3)
#### 1. add_message
Add a new message to a conversation.
**Parameters:**
- `content` (str, required): Message content
- `role` (str, required): Role (user, assistant, system)
- `conversation_id` (str, required): Conversation identifier
- `order_index` (int, default=0): Position in conversation
**Returns:**
```python
{
"success": True,
"uuid": "...",
"content": "Hello, this is a test...",
"role": "user",
"conversation_id": "test_conversation_001"
}
```
**Example:**
```python
result = await add_message(
content="Explain transformers in AI",
role="user",
conversation_id="chat_2025_01_08",
order_index=0
)
```
#### 2. get_messages
Get all messages from a conversation in order.
**Parameters:**
- `conversation_id` (str, required): Conversation identifier
- `limit` (int, default=50, range=1-500): Maximum messages to return
**Returns:**
```python
{
"success": True,
"conversation_id": "test_conversation_001",
"messages": [
{
"uuid": "...",
"content": "...",
"role": "user",
"timestamp": "2025-01-08T...",
"order_index": 0
},
{
"uuid": "...",
"content": "...",
"role": "assistant",
"timestamp": "2025-01-08T...",
"order_index": 1
}
],
"count": 2
}
```
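Under the hood, the handler fetches messages by `conversation_id` and restores conversation order with a client-side sort on `order_index`. A dependency-free sketch of that ordering step (the message dicts are illustrative):

```python
# Messages as returned by Weaviate, in arbitrary order (illustrative data)
fetched = [
    {"content": "Sure, here is an overview.", "role": "assistant", "order_index": 1},
    {"content": "Explain transformers in AI", "role": "user", "order_index": 0},
    {"content": "Thanks!", "role": "user", "order_index": 2},
]

# Client-side sort restores conversation order, as the handler does
messages = sorted(fetched, key=lambda m: m["order_index"])
roles = [m["role"] for m in messages]  # ['user', 'assistant', 'user']
```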
#### 3. search_messages
Search messages using semantic similarity.
**Parameters:**
- `query` (str, required): Search query text
- `limit` (int, default=10, range=1-100): Maximum results
- `conversation_id_filter` (str, optional): Filter by conversation
**Returns:**
```python
{
"success": True,
"query": "transformers AI",
"results": [...],
"count": 5
}
```
---
### Conversation Tools (3)
#### 1. get_conversation
Get a specific conversation by ID.
**Parameters:**
- `conversation_id` (str, required): Conversation identifier
**Returns:**
```python
{
"success": True,
"conversation_id": "ikario_derniere_pensee",
"category": "testing",
"summary": "Conversation with 2 participants...",
"timestamp_start": "2025-01-06T...",
"timestamp_end": "2025-01-06T...",
"participants": ["assistant", "user"],
"tags": [],
"message_count": 19
}
```
#### 2. search_conversations
Search conversations using semantic similarity on summaries.
**Parameters:**
- `query` (str, required): Search query text
- `limit` (int, default=10, range=1-50): Maximum results
- `category_filter` (str, optional): Filter by category
**Returns:**
```python
{
"success": True,
"query": "philosophical discussion",
"results": [
{
"conversation_id": "...",
"category": "philosophy",
"summary": "...",
"timestamp_start": "...",
"timestamp_end": "...",
"participants": [...],
"message_count": 25
}
],
"count": 5
}
```
#### 3. list_conversations
List all conversations with optional filtering.
**Parameters:**
- `limit` (int, default=20, range=1-100): Maximum conversations to return
- `category_filter` (str, optional): Filter by category
**Returns:**
```python
{
"success": True,
"conversations": [
{
"conversation_id": "...",
"category": "testing",
"summary": "Conversation with 2 participants... (truncated)",
"timestamp_start": "...",
"message_count": 19,
"participants": [...]
}
],
"count": 10
}
```
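The `(truncated)` summaries come from a simple client-side cutoff; the handler uses a 100-character limit. A minimal sketch of the same logic:

```python
def truncate_summary(summary: str, max_len: int = 100) -> str:
    """Shorten long summaries for list views, marking the cut with '...'."""
    return summary[:max_len] + "..." if len(summary) > max_len else summary

short = truncate_summary("Brief summary")  # returned unchanged
long_ = truncate_summary("x" * 150)        # cut to 100 chars plus "..."
```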
---
## Implementation Details
### Handler Pattern
All tools follow a consistent async handler pattern:
```python
async def tool_handler(input_data: InputModel) -> Dict[str, Any]:
"""Handler function."""
try:
# 1. Connect to Weaviate
client = weaviate.connect_to_local()
try:
# 2. Get GPU embedder (for vectorization)
embedder = get_embedder()
# 3. Generate vector (if needed)
vector = embedder.embed_batch([text])[0]
# 4. Query/Insert data
collection = client.collections.get("CollectionName")
result = collection.data.insert(...)
# 5. Return success response
return {"success": True, ...}
finally:
client.close()
except Exception as e:
return {"success": False, "error": str(e)}
```
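The success/error envelope can be exercised without Weaviate at all. A self-contained sketch of the same try/except shape, where a raised exception stands in for a failed Weaviate call:

```python
import asyncio
from typing import Any, Dict

async def demo_handler(fail: bool) -> Dict[str, Any]:
    """Same envelope as the real handlers: success payload or error string."""
    try:
        if fail:
            # Stand-in for a connection or query failure
            raise ConnectionError("Failed to connect to Weaviate")
        return {"success": True, "count": 0}
    except Exception as e:
        return {"success": False, "error": str(e)}

ok = asyncio.run(demo_handler(fail=False))   # {"success": True, "count": 0}
err = asyncio.run(demo_handler(fail=True))   # {"success": False, "error": ...}
```

Because every failure path is converted to a dict rather than a raised exception, MCP clients always receive a well-formed response they can branch on.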
### GPU Vectorization
All text content is vectorized using the GPU-accelerated embedder:
```python
from memory.core import get_embedder
embedder = get_embedder() # Returns PyTorch GPU embedder
vector = embedder.embed_batch([content])[0] # Returns 1024-dim FP16 vector
```
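Near-vector search ranks stored vectors by their similarity to the query vector. A dependency-free sketch of cosine similarity, the usual metric (toy 3-dimensional vectors stand in for the real 1024-dimensional embeddings):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

query = [1.0, 0.0, 1.0]
docs = {"close": [0.9, 0.1, 1.1], "far": [-1.0, 0.5, 0.0]}

# Rank candidates by similarity to the query, as near-vector search does
best = max(docs, key=lambda k: cosine_similarity(query, docs[k]))  # "close"
```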
### Weaviate Connection
Each tool handler creates a new connection and closes it after use:
```python
client = weaviate.connect_to_local() # Connects to localhost:8080
try:
# Perform operations
collection = client.collections.get("Thought")
# ...
finally:
client.close() # Always close connection
```
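The same always-close guarantee can be expressed with `contextlib.closing`. A sketch using a stub client (the real handlers use explicit try/finally, which is equivalent):

```python
from contextlib import closing

class StubClient:
    """Stand-in for a Weaviate client; only the close() contract matters here."""
    def __init__(self):
        self.closed = False
    def close(self):
        self.closed = True

client = StubClient()
with closing(client):
    pass  # perform operations; close() runs even if they raise

# client.closed is now True
```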
## Testing
A comprehensive test suite is available at `test_memory_mcp_tools.py`:
```bash
python test_memory_mcp_tools.py
```
**Test Results (2025-01-08):**
```
============================================================
TESTING THOUGHT TOOLS
============================================================
[OK] add_thought: Created thought with UUID
[OK] search_thoughts: Found 5 thoughts
[OK] get_thought: Retrieved thought successfully
============================================================
TESTING MESSAGE TOOLS
============================================================
[OK] add_message: Added 3 messages (user, assistant, user)
[OK] get_messages: Retrieved 3 messages in order
[OK] search_messages: Found 5 messages
============================================================
TESTING CONVERSATION TOOLS
============================================================
[OK] list_conversations: Found 10 conversations
[OK] get_conversation: Retrieved conversation metadata
[OK] search_conversations: Found 5 conversations
[OK] ALL TESTS COMPLETED
============================================================
```
## Integration with MCP Server
The Memory tools are integrated into `generations/library_rag/mcp_server.py` alongside the existing Library RAG tools:
**Total tools available: 17**
- Library RAG: 8 tools (search_documents, add_document, etc.)
- Memory: 9 tools (thought, message, conversation tools)
**Configuration:**
The MCP server is configured in Claude Desktop settings:
```json
{
"mcpServers": {
"library-rag": {
"command": "python",
"args": ["C:/GitHub/linear_coding_library_rag/generations/library_rag/mcp_server.py"]
}
}
}
```
## Error Handling
All tools return consistent error responses:
```python
{
"success": False,
"error": "Error message description"
}
```
Common errors:
- Connection errors: "Failed to connect to Weaviate"
- Not found: "Conversation {id} not found"
- Validation errors: "Invalid parameter: {details}"
## Performance
- **Vectorization**: ~50-100ms per text on RTX 4070 GPU
- **Search latency**: <100ms for near-vector queries
- **Batch operations**: Use embedder.embed_batch() for efficiency
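The embedder batches internally, but callers chunk large workloads themselves before handing them to `embed_batch()`. A pure-Python sketch of that chunking, using the batch size of 48 noted elsewhere in this document (`embed_batch` itself is not called here):

```python
def chunked(items, batch_size=48):
    """Split a list into consecutive batches of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

texts = [f"text {i}" for i in range(100)]
batches = chunked(texts)            # batch sizes: 48, 48, 4
sizes = [len(b) for b in batches]
```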
## Next Steps
**Phase 5: Backend Integration** (Pending)
- Update Flask routes to use Weaviate Memory tools
- Replace ChromaDB calls with new MCP tool calls
- Connect flask-app frontend to new backend
## Module Structure
```
memory/
├── core/
│ ├── __init__.py # GPU embedder initialization
│ └── config.py # Weaviate connection config
├── mcp/
│ ├── __init__.py # Tool exports
│ ├── thought_tools.py # Thought handlers
│ ├── message_tools.py # Message handlers
│ └── conversation_tools.py # Conversation handlers
└── README_MCP_TOOLS.md # This file
```
## Dependencies
- weaviate-client >= 4.0.0
- PyTorch 2.6.0+cu124
- transformers (for BAAI/bge-m3)
- pydantic (for input validation)
- FastMCP framework
## Related Documentation
- Weaviate Schema: `memory/schemas/` (Thought, Message, Conversation schemas)
- Migration Scripts: `memory/migration/` (ChromaDB → Weaviate migration)
- Library RAG README: `generations/library_rag/README.md`
---
**Last Updated**: 2025-01-08
**Status**: Phase 4 Complete ✓
**Next Phase**: Phase 5 - Backend Integration

memory/__init__.py (new file, 18 lines)

@@ -0,0 +1,18 @@
"""
Ikario Unified Memory System.
This package provides the unified RAG system combining:
- Personal memory (thoughts, conversations)
- Philosophical library (works, documents, chunks)
Architecture:
- Weaviate 1.34.4 vector database
- GPU embeddings (BAAI/bge-m3 on RTX 4070)
- 7 collections: Thought, Conversation, Message, Work, Document, Chunk, Summary
- 17 MCP tools (9 memory + 8 library)
See: PLAN_MIGRATION_WEAVIATE_GPU.md
"""
__version__ = "2.0.0"
__author__ = "Ikario Project"

memory/core/__init__.py (new file, 31 lines)

@@ -0,0 +1,31 @@
"""
Memory Core Module - GPU Embedding Service and Utilities.
This module provides core functionality for the unified RAG system:
- GPU-accelerated embeddings (RTX 4070 + PyTorch CUDA)
- Singleton embedding service
- Weaviate connection utilities
Usage:
from memory.core import get_embedder, embed_text
# Get singleton embedder
embedder = get_embedder()
# Embed text
embedding = embed_text("Hello world")
"""
from memory.core.embedding_service import (
GPUEmbeddingService,
get_embedder,
embed_text,
embed_texts,
)
__all__ = [
"GPUEmbeddingService",
"get_embedder",
"embed_text",
"embed_texts",
]

memory/core/embedding_service.py (new file, 271 lines)

@@ -0,0 +1,271 @@
#!/usr/bin/env python3
"""
GPU Embedding Service - Singleton for RTX 4070.
This module provides a singleton service for generating embeddings using
BAAI/bge-m3 model on GPU with FP16 precision.
Architecture:
- Singleton pattern: One model instance shared across application
- PyTorch CUDA: RTX 4070 with 8 GB VRAM
- FP16 precision: Reduces VRAM usage by ~50%
- Optimal batch size: 48 (tested for RTX 4070 with 5.3 GB available)
Performance (RTX 4070):
- Single embedding: ~17 ms
- Batch 48: ~34 ms (0.71 ms per item)
- VRAM usage: ~2.6 GB peak
Usage:
from memory.core.embedding_service import get_embedder
embedder = get_embedder()
# Single text
embedding = embedder.embed_single("Test text")
# Batch
embeddings = embedder.embed_batch(["Text 1", "Text 2", ...])
"""
import torch
from sentence_transformers import SentenceTransformer
from typing import List, Union
import logging
import numpy as np
logger = logging.getLogger(__name__)
class GPUEmbeddingService:
"""Singleton GPU embedding service using BAAI/bge-m3."""
_instance = None
_initialized = False
def __new__(cls):
"""Singleton pattern: only one instance."""
if cls._instance is None:
cls._instance = super(GPUEmbeddingService, cls).__new__(cls)
return cls._instance
def __init__(self):
"""Initialize GPU embedder (only once)."""
if self._initialized:
return
logger.info("Initializing GPU Embedding Service...")
# Check CUDA availability
if not torch.cuda.is_available():
raise RuntimeError(
"CUDA not available! GPU embedding service requires PyTorch with CUDA.\n"
"Install with: pip install torch --index-url https://download.pytorch.org/whl/cu124"
)
# Device configuration
self.device = torch.device("cuda:0")
logger.info(f"Using GPU: {torch.cuda.get_device_name(0)}")
# Model configuration
self.model_name = "BAAI/bge-m3"
self.embedding_dim = 1024
self.max_seq_length = 8192
# Load model on GPU
logger.info(f"Loading {self.model_name} on GPU...")
self.model = SentenceTransformer(self.model_name, device=str(self.device))
# Convert to FP16 for memory efficiency
logger.info("Converting model to FP16 precision...")
self.model.half()
# Optimal batch size for RTX 4070 (5.3 GB VRAM available)
# Tested: batch 48 uses ~3.5 GB VRAM, leaves ~1.8 GB buffer
self.optimal_batch_size = 48
# VRAM monitoring
self._log_vram_usage()
self._initialized = True
logger.info("GPU Embedding Service initialized successfully")
def _log_vram_usage(self):
"""Log current VRAM usage."""
allocated = torch.cuda.memory_allocated(0) / 1024**3
reserved = torch.cuda.memory_reserved(0) / 1024**3
total = torch.cuda.get_device_properties(0).total_memory / 1024**3
logger.info(
f"VRAM: {allocated:.2f} GB allocated, "
f"{reserved:.2f} GB reserved, "
f"{total:.2f} GB total"
)
def embed_single(self, text: str) -> np.ndarray:
"""
Embed a single text.
Args:
text: Text to embed.
Returns:
Embedding vector (1024 dimensions).
Example:
>>> embedder = get_embedder()
>>> emb = embedder.embed_single("Hello world")
>>> emb.shape
(1024,)
"""
# Use convert_to_numpy=False to keep tensor on GPU
embedding_tensor = self.model.encode(
text,
convert_to_numpy=False,
show_progress_bar=False
)
# Convert to numpy on CPU
return embedding_tensor.cpu().numpy()
def embed_batch(
self,
texts: List[str],
        batch_size: int | None = None,
show_progress: bool = False
) -> np.ndarray:
"""
Embed a batch of texts.
Args:
texts: List of texts to embed.
batch_size: Batch size (default: optimal_batch_size=48).
show_progress: Show progress bar.
Returns:
Array of embeddings, shape (len(texts), 1024).
Example:
>>> embedder = get_embedder()
>>> texts = ["Text 1", "Text 2", "Text 3"]
>>> embs = embedder.embed_batch(texts)
>>> embs.shape
(3, 1024)
"""
if batch_size is None:
batch_size = self.optimal_batch_size
        # Cap batch size at the tested optimum to avoid running out of VRAM
if batch_size > self.optimal_batch_size:
logger.warning(
f"Batch size {batch_size} exceeds optimal {self.optimal_batch_size}, "
f"reducing to avoid OOM"
)
batch_size = self.optimal_batch_size
# Encode on GPU, keep as tensor
embeddings_tensor = self.model.encode(
texts,
batch_size=batch_size,
convert_to_numpy=False,
show_progress_bar=show_progress
)
# Handle both tensor and list of tensors
if isinstance(embeddings_tensor, list):
embeddings_tensor = torch.stack(embeddings_tensor)
# Convert to numpy on CPU
return embeddings_tensor.cpu().numpy()
def get_embedding_dimension(self) -> int:
"""Get embedding dimension (1024 for bge-m3)."""
return self.embedding_dim
def get_model_info(self) -> dict:
"""
Get model information.
Returns:
Dictionary with model metadata.
"""
return {
"model_name": self.model_name,
"embedding_dim": self.embedding_dim,
"max_seq_length": self.max_seq_length,
"device": str(self.device),
"optimal_batch_size": self.optimal_batch_size,
"precision": "FP16",
"vram_allocated_gb": torch.cuda.memory_allocated(0) / 1024**3,
"vram_reserved_gb": torch.cuda.memory_reserved(0) / 1024**3,
}
def clear_cache(self):
"""Clear CUDA cache to free VRAM."""
torch.cuda.empty_cache()
logger.info("CUDA cache cleared")
self._log_vram_usage()
def adjust_batch_size(self, new_batch_size: int):
"""
Adjust optimal batch size (for OOM handling).
Args:
new_batch_size: New batch size to use.
"""
logger.warning(
f"Adjusting batch size from {self.optimal_batch_size} to {new_batch_size}"
)
self.optimal_batch_size = new_batch_size
# Singleton accessor
_embedder_instance = None
def get_embedder() -> GPUEmbeddingService:
"""
Get the singleton GPU embedding service.
Returns:
Initialized GPUEmbeddingService instance.
Example:
>>> from memory.core.embedding_service import get_embedder
>>> embedder = get_embedder()
>>> emb = embedder.embed_single("Test")
"""
global _embedder_instance
if _embedder_instance is None:
_embedder_instance = GPUEmbeddingService()
return _embedder_instance
# Convenience functions
def embed_text(text: str) -> np.ndarray:
"""
Convenience function to embed single text.
Args:
text: Text to embed.
Returns:
Embedding vector (1024 dimensions).
"""
return get_embedder().embed_single(text)
def embed_texts(texts: List[str], batch_size: int | None = None) -> np.ndarray:
"""
Convenience function to embed batch of texts.
Args:
texts: Texts to embed.
batch_size: Batch size (default: optimal).
Returns:
Array of embeddings, shape (len(texts), 1024).
"""
return get_embedder().embed_batch(texts, batch_size=batch_size)

memory/mcp/__init__.py (new file, 56 lines)

@@ -0,0 +1,56 @@
"""
Memory MCP Tools Package.
Provides MCP tools for Memory system (Thoughts, Messages, Conversations).
"""
from memory.mcp.thought_tools import (
AddThoughtInput,
SearchThoughtsInput,
add_thought_handler,
search_thoughts_handler,
get_thought_handler,
)
from memory.mcp.message_tools import (
AddMessageInput,
GetMessagesInput,
SearchMessagesInput,
add_message_handler,
get_messages_handler,
search_messages_handler,
)
from memory.mcp.conversation_tools import (
GetConversationInput,
SearchConversationsInput,
ListConversationsInput,
get_conversation_handler,
search_conversations_handler,
list_conversations_handler,
)
__all__ = [
# Thought tools
"AddThoughtInput",
"SearchThoughtsInput",
"add_thought_handler",
"search_thoughts_handler",
"get_thought_handler",
# Message tools
"AddMessageInput",
"GetMessagesInput",
"SearchMessagesInput",
"add_message_handler",
"get_messages_handler",
"search_messages_handler",
# Conversation tools
"GetConversationInput",
"SearchConversationsInput",
"ListConversationsInput",
"get_conversation_handler",
"search_conversations_handler",
"list_conversations_handler",
]

memory/mcp/conversation_tools.py (new file, 208 lines)

@@ -0,0 +1,208 @@
"""
Conversation MCP Tools - Handlers for conversation-related operations.
Provides tools for searching and retrieving conversations.
"""
import weaviate
from typing import Any, Dict
from pydantic import BaseModel, Field
from memory.core import get_embedder
class GetConversationInput(BaseModel):
"""Input for get_conversation tool."""
conversation_id: str = Field(..., description="Conversation identifier")
class SearchConversationsInput(BaseModel):
"""Input for search_conversations tool."""
query: str = Field(..., description="Search query text")
limit: int = Field(default=10, ge=1, le=50, description="Maximum results")
category_filter: str | None = Field(default=None, description="Filter by category")
class ListConversationsInput(BaseModel):
"""Input for list_conversations tool."""
limit: int = Field(default=20, ge=1, le=100, description="Maximum conversations")
category_filter: str | None = Field(default=None, description="Filter by category")
async def get_conversation_handler(input_data: GetConversationInput) -> Dict[str, Any]:
"""
Get a specific conversation by ID.
Args:
input_data: Query parameters.
Returns:
Dictionary with conversation data.
"""
try:
# Connect to Weaviate
client = weaviate.connect_to_local()
try:
# Get collection
collection = client.collections.get("Conversation")
# Fetch by conversation_id
results = collection.query.fetch_objects(
filters=weaviate.classes.query.Filter.by_property("conversation_id").equal(input_data.conversation_id),
limit=1,
)
if not results.objects:
return {
"success": False,
"error": f"Conversation {input_data.conversation_id} not found",
}
obj = results.objects[0]
return {
"success": True,
"conversation_id": obj.properties['conversation_id'],
"category": obj.properties['category'],
"summary": obj.properties['summary'],
"timestamp_start": obj.properties['timestamp_start'],
"timestamp_end": obj.properties['timestamp_end'],
"participants": obj.properties['participants'],
"tags": obj.properties.get('tags', []),
"message_count": obj.properties['message_count'],
}
finally:
client.close()
except Exception as e:
return {
"success": False,
"error": str(e),
}
async def search_conversations_handler(input_data: SearchConversationsInput) -> Dict[str, Any]:
"""
Search conversations using semantic similarity.
Args:
input_data: Search parameters.
Returns:
Dictionary with search results.
"""
try:
# Connect to Weaviate
client = weaviate.connect_to_local()
try:
# Get embedder
embedder = get_embedder()
# Generate query vector
query_vector = embedder.embed_batch([input_data.query])[0]
# Get collection
collection = client.collections.get("Conversation")
            # Build optional category filter
            filters = None
            if input_data.category_filter:
                filters = weaviate.classes.query.Filter.by_property("category").equal(
                    input_data.category_filter
                )
            # Execute near-vector search (filter is applied server-side)
            results = collection.query.near_vector(
                near_vector=query_vector.tolist(),
                limit=input_data.limit,
                filters=filters,
            ).objects
# Format results
conversations = []
for obj in results:
conversations.append({
"conversation_id": obj.properties['conversation_id'],
"category": obj.properties['category'],
"summary": obj.properties['summary'],
"timestamp_start": obj.properties['timestamp_start'],
"timestamp_end": obj.properties['timestamp_end'],
"participants": obj.properties['participants'],
"message_count": obj.properties['message_count'],
})
return {
"success": True,
"query": input_data.query,
"results": conversations,
"count": len(conversations),
}
finally:
client.close()
except Exception as e:
return {
"success": False,
"error": str(e),
}
async def list_conversations_handler(input_data: ListConversationsInput) -> Dict[str, Any]:
"""
List all conversations with filtering.
Args:
input_data: Query parameters.
Returns:
Dictionary with conversation list.
"""
try:
# Connect to Weaviate
client = weaviate.connect_to_local()
try:
# Get collection
collection = client.collections.get("Conversation")
# Build query
if input_data.category_filter:
results = collection.query.fetch_objects(
filters=weaviate.classes.query.Filter.by_property("category").equal(input_data.category_filter),
limit=input_data.limit,
)
else:
results = collection.query.fetch_objects(
limit=input_data.limit,
)
# Format results
conversations = []
for obj in results.objects:
conversations.append({
"conversation_id": obj.properties['conversation_id'],
"category": obj.properties['category'],
"summary": obj.properties['summary'][:100] + "..." if len(obj.properties['summary']) > 100 else obj.properties['summary'],
"timestamp_start": obj.properties['timestamp_start'],
"message_count": obj.properties['message_count'],
"participants": obj.properties['participants'],
})
return {
"success": True,
"conversations": conversations,
"count": len(conversations),
}
finally:
client.close()
except Exception as e:
return {
"success": False,
"error": str(e),
}

memory/mcp/message_tools.py (new file, 213 lines)

@@ -0,0 +1,213 @@
"""
Message MCP Tools - Handlers for message-related operations.
Provides tools for adding, searching, and retrieving conversation messages.
"""
import weaviate
from datetime import datetime, timezone
from typing import Any, Dict
from pydantic import BaseModel, Field
from memory.core import get_embedder
class AddMessageInput(BaseModel):
"""Input for add_message tool."""
content: str = Field(..., description="Message content")
role: str = Field(..., description="Role: user, assistant, system")
conversation_id: str = Field(..., description="Conversation identifier")
order_index: int = Field(default=0, description="Position in conversation")
class GetMessagesInput(BaseModel):
"""Input for get_messages tool."""
conversation_id: str = Field(..., description="Conversation identifier")
limit: int = Field(default=50, ge=1, le=500, description="Maximum messages")
class SearchMessagesInput(BaseModel):
"""Input for search_messages tool."""
query: str = Field(..., description="Search query text")
limit: int = Field(default=10, ge=1, le=100, description="Maximum results")
conversation_id_filter: str | None = Field(default=None, description="Filter by conversation")
async def add_message_handler(input_data: AddMessageInput) -> Dict[str, Any]:
"""
Add a new message to Weaviate.
Args:
input_data: Message data to add.
Returns:
Dictionary with success status and message UUID.
"""
try:
# Connect to Weaviate
client = weaviate.connect_to_local()
try:
# Get embedder
embedder = get_embedder()
# Generate vector for message content
vector = embedder.embed_batch([input_data.content])[0]
# Get collection
collection = client.collections.get("Message")
# Insert message
uuid = collection.data.insert(
properties={
"content": input_data.content,
"role": input_data.role,
"timestamp": datetime.now(timezone.utc).isoformat(),
"conversation_id": input_data.conversation_id,
"order_index": input_data.order_index,
"conversation": {
"conversation_id": input_data.conversation_id,
"category": "general", # Default
},
},
vector=vector.tolist()
)
return {
"success": True,
"uuid": str(uuid),
"content": input_data.content[:100] + "..." if len(input_data.content) > 100 else input_data.content,
"role": input_data.role,
"conversation_id": input_data.conversation_id,
}
finally:
client.close()
except Exception as e:
return {
"success": False,
"error": str(e),
}
async def get_messages_handler(input_data: GetMessagesInput) -> Dict[str, Any]:
"""
Get all messages from a conversation.
Args:
input_data: Query parameters.
Returns:
Dictionary with messages in order.
"""
try:
# Connect to Weaviate
client = weaviate.connect_to_local()
try:
# Get collection
collection = client.collections.get("Message")
# Fetch messages for conversation
results = collection.query.fetch_objects(
filters=weaviate.classes.query.Filter.by_property("conversation_id").equal(input_data.conversation_id),
limit=input_data.limit,
)
            # Collect message payloads
messages = []
for obj in results.objects:
messages.append({
"uuid": str(obj.uuid),
"content": obj.properties['content'],
"role": obj.properties['role'],
"timestamp": obj.properties['timestamp'],
"order_index": obj.properties['order_index'],
})
# Sort by order_index
messages.sort(key=lambda m: m['order_index'])
return {
"success": True,
"conversation_id": input_data.conversation_id,
"messages": messages,
"count": len(messages),
}
finally:
client.close()
except Exception as e:
return {
"success": False,
"error": str(e),
}
async def search_messages_handler(input_data: SearchMessagesInput) -> Dict[str, Any]:
"""
Search messages using semantic similarity.
Args:
input_data: Search parameters.
Returns:
Dictionary with search results.
"""
try:
# Connect to Weaviate
client = weaviate.connect_to_local()
try:
# Get embedder
embedder = get_embedder()
# Generate query vector
query_vector = embedder.embed_batch([input_data.query])[0]
# Get collection
collection = client.collections.get("Message")
            # Build optional conversation filter
            filters = None
            if input_data.conversation_id_filter:
                filters = weaviate.classes.query.Filter.by_property("conversation_id").equal(
                    input_data.conversation_id_filter
                )
            # Execute near-vector search (filter is applied server-side)
            results = collection.query.near_vector(
                near_vector=query_vector.tolist(),
                limit=input_data.limit,
                filters=filters,
            ).objects
# Format results
messages = []
for obj in results:
messages.append({
"uuid": str(obj.uuid),
"content": obj.properties['content'],
"role": obj.properties['role'],
"timestamp": obj.properties['timestamp'],
"conversation_id": obj.properties['conversation_id'],
"order_index": obj.properties['order_index'],
})
return {
"success": True,
"query": input_data.query,
"results": messages,
"count": len(messages),
}
finally:
client.close()
except Exception as e:
return {
"success": False,
"error": str(e),
}

memory/mcp/thought_tools.py (new file, 203 lines)

@@ -0,0 +1,203 @@
"""
Thought MCP Tools - Handlers for thought-related operations.
Provides tools for adding, searching, and retrieving thoughts from Weaviate.
"""
import weaviate
from datetime import datetime, timezone
from typing import Any, Dict
from pydantic import BaseModel, Field
from memory.core import get_embedder
class AddThoughtInput(BaseModel):
"""Input for add_thought tool."""
content: str = Field(..., description="The thought content")
thought_type: str = Field(default="reflection", description="Type: reflection, question, intuition, observation, etc.")
trigger: str = Field(default="", description="What triggered this thought")
concepts: list[str] = Field(default_factory=list, description="Related concepts/tags")
privacy_level: str = Field(default="private", description="Privacy: private, shared, public")
class SearchThoughtsInput(BaseModel):
"""Input for search_thoughts tool."""
query: str = Field(..., description="Search query text")
limit: int = Field(default=10, ge=1, le=100, description="Maximum results")
thought_type_filter: str | None = Field(default=None, description="Filter by thought type")
async def add_thought_handler(input_data: AddThoughtInput) -> Dict[str, Any]:
    """
    Add a new thought to Weaviate.

    Args:
        input_data: Thought data to add.

    Returns:
        Dictionary with success status and thought UUID.
    """
    try:
        # Connect to Weaviate
        client = weaviate.connect_to_local()
        try:
            # Get embedder
            embedder = get_embedder()
            # Generate vector for thought content
            vector = embedder.embed_batch([input_data.content])[0]
            # Get collection
            collection = client.collections.get("Thought")
            # Insert thought
            uuid = collection.data.insert(
                properties={
                    "content": input_data.content,
                    "thought_type": input_data.thought_type,
                    "timestamp": datetime.now(timezone.utc).isoformat(),
                    "trigger": input_data.trigger,
                    "concepts": input_data.concepts,
                    "privacy_level": input_data.privacy_level,
                    "emotional_state": "",
                    "context": "",
                },
                vector=vector.tolist(),
            )
            return {
                "success": True,
                "uuid": str(uuid),
                "content": input_data.content[:100] + "..." if len(input_data.content) > 100 else input_data.content,
                "thought_type": input_data.thought_type,
            }
        finally:
            client.close()
    except Exception as e:
        return {
            "success": False,
            "error": str(e),
        }
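The handler returns a shortened preview of long content in its result envelope. As a standalone sketch of that truncation rule (the helper name `preview` is hypothetical, not part of the module):

```python
def preview(text: str, limit: int = 100) -> str:
    """Return text unchanged if it fits, else the first `limit` chars plus an ellipsis."""
    return text if len(text) <= limit else text[:limit] + "..."

print(preview("short thought"))   # unchanged
print(len(preview("x" * 500)))    # 103 (100 chars + "...")
```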
async def search_thoughts_handler(input_data: SearchThoughtsInput) -> Dict[str, Any]:
    """
    Search thoughts using semantic similarity.

    Args:
        input_data: Search parameters.

    Returns:
        Dictionary with search results.
    """
    try:
        # Connect to Weaviate
        client = weaviate.connect_to_local()
        try:
            # Get embedder
            embedder = get_embedder()
            # Generate query vector
            query_vector = embedder.embed_batch([input_data.query])[0]
            # Get collection
            collection = client.collections.get("Thought")
            # Build optional thought_type filter (v4 client: filters are
            # passed into the query call, not chained onto its result)
            from weaviate.classes.query import Filter
            filters = None
            if input_data.thought_type_filter:
                filters = Filter.by_property("thought_type").equal(input_data.thought_type_filter)
            # Execute search
            response = collection.query.near_vector(
                near_vector=query_vector.tolist(),
                limit=input_data.limit,
                filters=filters,
            )
            # Format results
            thoughts = []
            for obj in response.objects:
                thoughts.append({
                    "uuid": str(obj.uuid),
                    "content": obj.properties['content'],
                    "thought_type": obj.properties['thought_type'],
                    "timestamp": obj.properties['timestamp'],
                    "trigger": obj.properties.get('trigger', ''),
                    "concepts": obj.properties.get('concepts', []),
                })
            return {
                "success": True,
                "query": input_data.query,
                "results": thoughts,
                "count": len(thoughts),
            }
        finally:
            client.close()
    except Exception as e:
        return {
            "success": False,
            "error": str(e),
        }
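`near_vector` ranks stored embeddings by their distance to the query vector (Weaviate does this with an HNSW index, not a linear scan). The ranking idea itself can be sketched in pure Python; names and toy 2-d vectors below are illustrative only:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rank(query_vec: list[float], stored: dict[str, list[float]], limit: int = 10) -> list[str]:
    """Return ids of stored vectors, most similar to the query first."""
    ordered = sorted(stored, key=lambda k: cosine(query_vec, stored[k]), reverse=True)
    return ordered[:limit]

vectors = {"t1": [1.0, 0.0], "t2": [0.0, 1.0], "t3": [0.7, 0.7]}
print(rank([1.0, 0.1], vectors, limit=2))  # ['t1', 't3']
```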
async def get_thought_handler(uuid: str) -> Dict[str, Any]:
    """
    Get a specific thought by UUID.

    Args:
        uuid: Thought UUID.

    Returns:
        Dictionary with thought data.
    """
    try:
        # Connect to Weaviate
        client = weaviate.connect_to_local()
        try:
            # Get collection
            collection = client.collections.get("Thought")
            # Fetch by UUID
            obj = collection.query.fetch_object_by_id(uuid)
            if not obj:
                return {
                    "success": False,
                    "error": f"Thought {uuid} not found",
                }
            return {
                "success": True,
                "uuid": str(obj.uuid),
                "content": obj.properties['content'],
                "thought_type": obj.properties['thought_type'],
                "timestamp": obj.properties['timestamp'],
                "trigger": obj.properties.get('trigger', ''),
                "concepts": obj.properties.get('concepts', []),
                "privacy_level": obj.properties.get('privacy_level', 'private'),
                "emotional_state": obj.properties.get('emotional_state', ''),
                "context": obj.properties.get('context', ''),
            }
        finally:
            client.close()
    except Exception as e:
        return {
            "success": False,
            "error": str(e),
        }
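All three handlers share the same result envelope: a `{"success": True, ...}` payload on the happy path, `{"success": False, "error": ...}` on any exception. A minimal Weaviate-free sketch of that pattern (all names here are hypothetical):

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

async def with_envelope(op: Callable[[], Awaitable[Dict[str, Any]]]) -> Dict[str, Any]:
    """Run an async operation and normalize its outcome into the handlers' envelope."""
    try:
        payload = await op()
        return {"success": True, **payload}
    except Exception as e:
        return {"success": False, "error": str(e)}

async def _fake_fetch(uuid: str) -> Dict[str, Any]:
    """Stand-in for a Weaviate lookup."""
    if not uuid:
        raise ValueError("empty uuid")
    return {"uuid": uuid}

async def demo():
    ok = await with_envelope(lambda: _fake_fetch("abc"))
    bad = await with_envelope(lambda: _fake_fetch(""))
    return ok, bad

ok, bad = asyncio.run(demo())
print(ok)   # {'success': True, 'uuid': 'abc'}
print(bad)  # {'success': False, 'error': 'empty uuid'}
```

Centralizing this in one wrapper would keep MCP tool responses uniform without repeating the try/except in every handler.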

24
memory/schemas/__init__.py Normal file
View File

@@ -0,0 +1,24 @@
"""
Memory Schemas Package.
Defines Weaviate schemas for Memory collections:
- Thought
- Conversation
- Message
"""
from memory.schemas.memory_schemas import (
create_thought_collection,
create_conversation_collection,
create_message_collection,
create_all_memory_schemas,
delete_memory_schemas,
)
__all__ = [
"create_thought_collection",
"create_conversation_collection",
"create_message_collection",
"create_all_memory_schemas",
"delete_memory_schemas",
]

308
memory/schemas/memory_schemas.py Normal file
View File

@@ -0,0 +1,308 @@
#!/usr/bin/env python3
"""
Memory Collections Schemas for Weaviate.

This module defines the schema for 3 Memory collections:
- Thought: Individual thoughts/reflections
- Conversation: Complete conversations
- Message: Individual messages in conversations

All collections use manual vectorization (GPU embeddings).
"""
import weaviate
import weaviate.classes.config as wvc
def create_thought_collection(client: weaviate.WeaviateClient) -> None:
    """
    Create Thought collection.

    Schema:
    - content: TEXT (vectorized) - The thought content
    - thought_type: TEXT - Type (reflection, question, intuition, observation, etc.)
    - timestamp: DATE - When created
    - trigger: TEXT (optional) - What triggered the thought
    - emotional_state: TEXT (optional) - Emotional state
    - concepts: TEXT_ARRAY (vectorized) - Related concepts/tags
    - privacy_level: TEXT - private, shared, public
    - context: TEXT (optional) - Additional context
    """
    # Check if exists
    if "Thought" in client.collections.list_all():
        print("[WARN] Thought collection already exists, skipping")
        return

    client.collections.create(
        name="Thought",
        # Manual vectorization (GPU) - single default vector
        vectorizer_config=wvc.Configure.Vectorizer.none(),
        properties=[
            # Vectorized fields
            wvc.Property(
                name="content",
                data_type=wvc.DataType.TEXT,
                description="The thought content",
            ),
            wvc.Property(
                name="concepts",
                data_type=wvc.DataType.TEXT_ARRAY,
                description="Related concepts/tags",
            ),
            # Metadata fields (not vectorized)
            wvc.Property(
                name="thought_type",
                data_type=wvc.DataType.TEXT,
                skip_vectorization=True,
                description="Type: reflection, question, intuition, observation, etc.",
            ),
            wvc.Property(
                name="timestamp",
                data_type=wvc.DataType.DATE,
                description="When the thought was created",
            ),
            wvc.Property(
                name="trigger",
                data_type=wvc.DataType.TEXT,
                skip_vectorization=True,
                description="What triggered the thought (optional)",
            ),
            wvc.Property(
                name="emotional_state",
                data_type=wvc.DataType.TEXT,
                skip_vectorization=True,
                description="Emotional state (optional)",
            ),
            wvc.Property(
                name="privacy_level",
                data_type=wvc.DataType.TEXT,
                skip_vectorization=True,
                description="Privacy level: private, shared, public",
            ),
            wvc.Property(
                name="context",
                data_type=wvc.DataType.TEXT,
                skip_vectorization=True,
                description="Additional context (optional)",
            ),
        ],
    )
    print("[OK] Thought collection created")
def create_conversation_collection(client: weaviate.WeaviateClient) -> None:
    """
    Create Conversation collection.

    Schema:
    - conversation_id: TEXT - Unique conversation ID
    - category: TEXT - philosophy, technical, personal, etc.
    - timestamp_start: DATE - Conversation start
    - timestamp_end: DATE (optional) - Conversation end
    - summary: TEXT (vectorized) - Conversation summary
    - participants: TEXT_ARRAY - List of participants
    - tags: TEXT_ARRAY - Semantic tags
    - message_count: INT - Number of messages
    - context: TEXT (optional) - Global context
    """
    # Check if exists
    if "Conversation" in client.collections.list_all():
        print("[WARN] Conversation collection already exists, skipping")
        return

    client.collections.create(
        name="Conversation",
        # Manual vectorization (GPU)
        vectorizer_config=wvc.Configure.Vectorizer.none(),
        properties=[
            # Vectorized field
            wvc.Property(
                name="summary",
                data_type=wvc.DataType.TEXT,
                description="Conversation summary",
            ),
            # Metadata fields (not vectorized)
            wvc.Property(
                name="conversation_id",
                data_type=wvc.DataType.TEXT,
                skip_vectorization=True,
                description="Unique conversation identifier",
            ),
            wvc.Property(
                name="category",
                data_type=wvc.DataType.TEXT,
                skip_vectorization=True,
                description="Category: philosophy, technical, personal, etc.",
            ),
            wvc.Property(
                name="timestamp_start",
                data_type=wvc.DataType.DATE,
                description="Conversation start time",
            ),
            wvc.Property(
                name="timestamp_end",
                data_type=wvc.DataType.DATE,
                description="Conversation end time (optional)",
            ),
            wvc.Property(
                name="participants",
                data_type=wvc.DataType.TEXT_ARRAY,
                skip_vectorization=True,
                description="List of participants",
            ),
            wvc.Property(
                name="tags",
                data_type=wvc.DataType.TEXT_ARRAY,
                skip_vectorization=True,
                description="Semantic tags",
            ),
            wvc.Property(
                name="message_count",
                data_type=wvc.DataType.INT,
                description="Number of messages in conversation",
            ),
            wvc.Property(
                name="context",
                data_type=wvc.DataType.TEXT,
                skip_vectorization=True,
                description="Global context (optional)",
            ),
        ],
    )
    print("[OK] Conversation collection created")
def create_message_collection(client: weaviate.WeaviateClient) -> None:
    """
    Create Message collection.

    Schema:
    - content: TEXT (vectorized) - Message content
    - role: TEXT - user, assistant, system
    - timestamp: DATE - When sent
    - conversation_id: TEXT - Link to parent Conversation
    - order_index: INT - Position in conversation
    - conversation: OBJECT (nested) - Denormalized conversation data
        - conversation_id: TEXT
        - category: TEXT
    """
    # Check if exists
    if "Message" in client.collections.list_all():
        print("[WARN] Message collection already exists, skipping")
        return

    client.collections.create(
        name="Message",
        # Manual vectorization (GPU)
        vectorizer_config=wvc.Configure.Vectorizer.none(),
        properties=[
            # Vectorized field
            wvc.Property(
                name="content",
                data_type=wvc.DataType.TEXT,
                description="Message content",
            ),
            # Metadata fields (not vectorized)
            wvc.Property(
                name="role",
                data_type=wvc.DataType.TEXT,
                skip_vectorization=True,
                description="Role: user, assistant, system",
            ),
            wvc.Property(
                name="timestamp",
                data_type=wvc.DataType.DATE,
                description="When the message was sent",
            ),
            wvc.Property(
                name="conversation_id",
                data_type=wvc.DataType.TEXT,
                skip_vectorization=True,
                description="Link to parent Conversation",
            ),
            wvc.Property(
                name="order_index",
                data_type=wvc.DataType.INT,
                description="Position in conversation",
            ),
            # Nested object (denormalized for performance)
            wvc.Property(
                name="conversation",
                data_type=wvc.DataType.OBJECT,
                skip_vectorization=True,
                description="Denormalized conversation data",
                nested_properties=[
                    wvc.Property(
                        name="conversation_id",
                        data_type=wvc.DataType.TEXT,
                    ),
                    wvc.Property(
                        name="category",
                        data_type=wvc.DataType.TEXT,
                    ),
                ],
            ),
        ],
    )
    print("[OK] Message collection created")
def create_all_memory_schemas(client: weaviate.WeaviateClient) -> None:
    """
    Create all 3 Memory collections.

    Args:
        client: Connected Weaviate client.
    """
    print("=" * 60)
    print("Creating Memory Schemas")
    print("=" * 60)
    create_thought_collection(client)
    create_conversation_collection(client)
    create_message_collection(client)
    print("\n" + "=" * 60)
    print("Memory Schemas Created Successfully")
    print("=" * 60)

    # List all collections
    all_collections = client.collections.list_all()
    print(f"\nTotal collections: {len(all_collections)}")
    memory_cols = [c for c in all_collections.keys() if c in ["Thought", "Conversation", "Message"]]
    library_cols = [c for c in all_collections.keys() if c in ["Work", "Document", "Chunk", "Summary"]]
    print(f"\nMemory collections ({len(memory_cols)}): {', '.join(sorted(memory_cols))}")
    print(f"Library collections ({len(library_cols)}): {', '.join(sorted(library_cols))}")
def delete_memory_schemas(client: weaviate.WeaviateClient) -> None:
    """
    Delete all Memory collections (for testing/cleanup).

    WARNING: This deletes all data in Memory collections!
    """
    print("[WARN] WARNING: Deleting all Memory collections...")
    for collection_name in ["Thought", "Conversation", "Message"]:
        try:
            client.collections.delete(collection_name)
            print(f"Deleted {collection_name}")
        except Exception as e:
            print(f"Could not delete {collection_name}: {e}")