linear-coding-agent/.gitignore
David Blanc Brioir 53f6a92365 feat: Remove Document collection from schema
BREAKING CHANGE: Document collection removed from Weaviate schema

Architecture simplification:
- Removed Document collection (unused by Flask app)
- All metadata now in Work collection or file-based (chunks.json)
- Simplified from 4 collections to 3 (Work, Chunk_v2, Summary_v2)

Schema changes (schema.py):
- Removed create_document_collection() function
- Updated verify_schema() to expect 3 collections
- Updated display_schema() and print_summary()
- Updated documentation to reflect Chunk_v2/Summary_v2
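
The three-collection check described in the schema changes above might look roughly like the sketch below. It is not the project's actual verify_schema(): the `existing` argument, the `EXPECTED_COLLECTIONS` constant, and the return shape are assumptions made for illustration, and how the names are fetched from Weaviate is deliberately left out.

```python
from typing import Iterable

# Collection names are taken from the commit message; the constant name is hypothetical.
EXPECTED_COLLECTIONS = frozenset({"Work", "Chunk_v2", "Summary_v2"})


def verify_schema(existing: Iterable[str]) -> tuple[bool, set[str], set[str]]:
    """Compare the collection names found in the database with the expected three.

    Returns (ok, missing, unexpected). Fetching `existing` from Weaviate
    is out of scope for this sketch.
    """
    found = set(existing)
    missing = set(EXPECTED_COLLECTIONS - found)
    unexpected = found - EXPECTED_COLLECTIONS
    # Treat leftovers (e.g. an old Document collection) as a failure too.
    return (not missing and not unexpected, missing, unexpected)
```

Flagging unexpected names as well as missing ones is a design choice in this sketch: it would surface a leftover Document collection after the migration instead of silently accepting it.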

Ingestion changes (weaviate_ingest.py):
- Removed ingest_document_metadata() function
- Removed ingest_document_collection parameter
- Updated IngestResult to use work_uuid instead of document_uuid
- Removed Document deletion from delete_document_chunks()
- Updated DeleteResult TypedDict

Type changes (types.py):
- WeaviateIngestResult: document_uuid → work_uuid
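
To illustrate the rename above, here is a minimal sketch of what the result type could look like after the change. Only the `work_uuid` key and the removed `document_uuid` key come from the commit message; the `success` and `count` fields and the `uses_removed_key` helper are hypothetical and may not match types.py.

```python
from typing import Any, Mapping, Optional, TypedDict


class WeaviateIngestResult(TypedDict):
    # Hypothetical fields, shown only to give the dict some shape.
    success: bool
    count: int
    # From the commit message: replaces the former document_uuid key.
    work_uuid: Optional[str]


def uses_removed_key(result: Mapping[str, Any]) -> bool:
    """Return True if a result dict still carries the removed document_uuid key."""
    return "document_uuid" in result
```

A helper like this could be used in tests to catch callers that still build or read results in the old shape.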

Documentation updates (.claude/CLAUDE.md):
- Updated schema diagram (4 → 3 collections)
- Removed Document references
- Updated to reflect manual GPU vectorization

Database changes:
- Deleted Document collection (13 objects)
- Deleted Chunk collection (0 objects, old schema)

Benefits:
- Simpler architecture (3 collections vs 4)
- No redundant data storage
- All metadata available via Work or file-based storage
- Reduced Weaviate memory footprint

Migration:
- See DOCUMENT_COLLECTION_ANALYSIS.md for detailed analysis
- See migrate_chunk_v2_to_none_vectorizer.py for vectorizer migration

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 14:13:51 +01:00

49 lines
755 B
Plaintext

# Agent-generated output directories
generations/*
!generations/library_rag/
# Python cache and compiled files
__pycache__/
*.pyc
*.pyo
*.pyd
# Log files
logs/
*.log
# Environment files and virtual environments
.env
venv
# Node modules (if any)
node_modules/
package-lock.json
# Backup and temporary files
backup_migration_*/
restoration_log.txt
restoration_remaining_log.txt
summary_generation_progress.json
nul
# Test files and temporary scripts (Jan 2026)
test_*.txt
test_ingestion*.py
test_direct*.py
test_upload*.py
*_backup.json
chunks_to_vectorize.json
output/
check_chunks.py
verify_works.py
complete_*.py
extract_*.py
fast_extract.py
stream_extract.py
quick_vectorize.py
vectorize_remaining.py
migrate_chunk_*.py
# Archives (migration scripts moved here)
archive/chunk_v2_backup.json