Commit Graph

4 Commits

Author SHA1 Message Date
e78c3ae292 docs: Update README to reflect 6 collections (3 RAG + 3 Memory)
Architecture clarification:
- Updated schema section: 4 → 6 collections
- Clarified separation: RAG (3) vs Memory (3)
- Removed Document collection references
- Updated collection names: Chunk → Chunk_v2, Summary → Summary_v2

Schema changes reflected:
- RAG: Work, Chunk_v2, Summary_v2 (schema.py)
- Memory: Conversation, Message, Thought (memory/schemas/memory_schemas.py)
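
The RAG/Memory split listed above can be pictured as a simple mapping. This is only an illustrative sketch: the collection names and module paths come from this commit message, while the `COLLECTIONS` dictionary and the `group_of` and `total_collections` helpers are hypothetical names, not code from the repository.

```python
# Hypothetical summary of the six collections named in this commit.
# The dict layout and helpers are illustrative, not the repository's code.
COLLECTIONS = {
    "rag": {
        "module": "schema.py",
        "names": ["Work", "Chunk_v2", "Summary_v2"],
    },
    "memory": {
        "module": "memory/schemas/memory_schemas.py",
        "names": ["Conversation", "Message", "Thought"],
    },
}


def group_of(collection: str) -> str:
    """Return the group ("rag" or "memory") that a collection belongs to."""
    for group, info in COLLECTIONS.items():
        if collection in info["names"]:
            return group
    raise KeyError(f"unknown collection: {collection}")


def total_collections() -> int:
    """Count all collections across both groups."""
    return sum(len(info["names"]) for info in COLLECTIONS.values())
```

Calling `group_of("Chunk_v2")` returns `"rag"`, and `total_collections()` returns the 6 collections the README now documents.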

Vectorization details:
- All 5 vectorized collections use GPU embedder (BAAI/bge-m3, RTX 4070)
- Manual vectorization in Python using PyTorch with CUDA
- 1024 dimensions, cosine similarity
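
As a reminder of what the cosine metric measures on 1024-dimensional vectors, here is a minimal pure-Python sketch. It does not call BAAI/bge-m3 or Weaviate; the toy vectors and the `cosine_similarity` helper are assumptions made for illustration, and only the dimension and the metric come from this commit message.

```python
import math

EMBEDDING_DIM = 1024  # dimension stated in the commit message


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings.
u = [1.0] * EMBEDDING_DIM
v = [2.0] * EMBEDDING_DIM               # same direction as u, larger norm
w = [1.0, -1.0] * (EMBEDDING_DIM // 2)  # orthogonal to u
```

Because cosine similarity ignores vector length, `u` and `v` score as identical in direction, while `u` and `w` score as unrelated.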

Updated diagrams:
- Architecture mermaid diagram shows 6 collections
- Pipeline diagram updated to 6 collections
- Added memory/ module structure

Updated examples:
- Replaced Chunk with Chunk_v2 in all code examples
- Added Memory collections documentation
- Clarified separation of concerns

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 14:36:08 +01:00
187ba4854e chore: Major cleanup - archive migration scripts and remove temp files
CLEANUP ACTIONS:
- Archived 11 migration/optimization scripts to archive/migration_scripts/
- Archived 11 phase documentation files to archive/documentation/
- Moved backups/, docs/, scripts/ to archive/
- Deleted 30+ temporary debug/test/fix scripts
- Cleaned Python cache (__pycache__/, *.pyc)
- Cleaned log files (*.log)
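
The cache and log cleanup above could be reproduced with a short dry-run script. This is a sketch under assumptions, not the procedure actually used for this commit: the `find_cleanup_targets` helper is hypothetical, and it only lists matching paths without deleting anything.

```python
from pathlib import Path


def find_cleanup_targets(root: Path) -> list[Path]:
    """List __pycache__ directories, *.pyc files and *.log files under root.

    Dry run only: nothing is deleted.
    """
    targets: set[Path] = set()
    for path in root.rglob("*"):
        if path.is_dir() and path.name == "__pycache__":
            targets.add(path)
        elif path.is_file() and path.suffix in {".pyc", ".log"}:
            targets.add(path)
    return sorted(targets)


if __name__ == "__main__":
    # Print candidates relative to the current directory for manual review.
    for target in find_cleanup_targets(Path(".")):
        print(target)
```

Reviewing the printed list before removing anything keeps the cleanup reversible up to the last step.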

NEW FILES:
- CHANGELOG.md: Consolidated project history and migration documentation
- Updated .gitignore: Added *.log, *.pyc, archive/ exclusions

FINAL ROOT STRUCTURE (19 items):
- Core framework: agent.py, autonomous_agent_demo.py, client.py, security.py, progress.py, prompts.py
- Config: requirements.txt, package.json, .gitignore
- Docs: README.md, CHANGELOG.md, project_progress.md
- Directories: archive/, generations/, memory/, prompts/, utils/

ARCHIVED SCRIPTS (in archive/migration_scripts/):
01-11: Migration & optimization scripts (migrate, schema, rechunk, vectorize, etc.)

ARCHIVED DOCS (in archive/documentation/):
PHASE_0-8: Detailed phase summaries
MIGRATION_README.md, PLAN_MIGRATION_WEAVIATE_GPU.md

Repository is now clean and production-ready with all important files preserved in archive/.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-08 18:05:43 +01:00
ca221887eb docs: Update README for schema changes and Docker config
- Add vectorized 'summary' field to Chunk collection description
- Update vectorization strategy (text/summary/keywords)
- Add HNSW + RQ vector index configuration section
- Correct Docker config: BGE-M3 ONNX is CPU-only (not CUDA)
- Add llm_summarizer.py and summary generation scripts to project structure
- Update appendix with accurate GPU/VRAM information
- Remove incorrect GPU configuration example

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-07 23:10:36 +01:00
d2f7165120 Add Library RAG project and cleanup root directory
- Add complete Library RAG application (Flask + MCP server)
  - PDF processing pipeline with OCR and LLM extraction
  - Weaviate vector database integration (BGE-M3 embeddings)
  - Flask web interface with search and document management
  - MCP server for Claude Desktop integration
  - Comprehensive test suite (134 tests)

- Clean up root directory
  - Remove obsolete documentation files
  - Remove backup and temporary files
  - Update autonomous agent configuration

- Update prompts
  - Enhance initializer bis prompt with better instructions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-30 11:57:12 +01:00