feat: Add multi-file batch upload with sequential processing

Implements comprehensive batch upload system with real-time progress tracking:

Backend Infrastructure:
- Add batch_jobs global dict for batch orchestration
- Add BatchFileInfo and BatchJob TypedDicts to utils/types.py
- Create run_batch_sequential() worker function with thread.join() synchronization
- Modify /upload POST route to detect single vs multi-file uploads
- Add 3 batch API routes: /upload/batch/progress, /status, /result
- Add timestamp_to_date Jinja2 template filter

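The `batch_jobs` registry and batch creation described above might be sketched as follows. Only `batch_jobs` and the `BatchJob` field shape come from this commit; `create_batch` and its internals are hypothetical illustration:

```python
# Minimal sketch of the batch orchestration state (assumptions labeled).
import time
import uuid

# Global registry mapping batch_id -> BatchJob-shaped dict (from the commit).
batch_jobs: dict = {}

def create_batch(filenames: list) -> str:
    """Hypothetical helper: register a new batch and return its ID."""
    batch_id = uuid.uuid4().hex
    batch_jobs[batch_id] = {
        "job_ids": [],
        "files": [
            {"filename": name, "job_id": None, "status": "pending",
             "error": None, "size_bytes": 0}
            for name in filenames
        ],
        "total_files": len(filenames),
        "completed_files": 0,
        "failed_files": 0,
        "status": "processing",
        "current_job_id": None,
        "options": {},  # ProcessingOptions in the real module
        "created_at": time.time(),
    }
    return batch_id
```

The three batch API routes would then read from this registry keyed by batch ID.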
Frontend:
- Update upload.html with 'multiple' attribute and file counter
- Create upload_batch_progress.html: Real-time dashboard with SSE per file
- Create upload_batch_result.html: Final summary with statistics

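The `timestamp_to_date` filter used by these templates is presumably a thin wrapper over `datetime`; a sketch (the format string and the Flask registration line are assumptions, not from the commit):

```python
from datetime import datetime

def timestamp_to_date(ts: float, fmt: str = "%Y-%m-%d %H:%M") -> str:
    """Render a Unix timestamp (e.g. BatchJob.created_at) as a date string."""
    return datetime.fromtimestamp(ts).strftime(fmt)

# Hypothetical registration with a Flask app's Jinja2 environment:
# app.jinja_env.filters["timestamp_to_date"] = timestamp_to_date
```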
Architecture:
- Backward compatible: single-file upload unchanged
- Sequential processing: one file after another (respects API limits)
- N parallel SSE connections: one per file for real-time progress
- Polling mechanism to discover job IDs as files start processing
- 1-hour timeout per file with error handling and continuation

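The sequential worker described above could be sketched like this. The `thread.join()` synchronization, per-file timeout, and continue-on-error behavior are from the commit; the `process_file` callable and the exact field updates are assumptions:

```python
import threading
from typing import Callable

def run_batch_sequential(batch: dict,
                         process_file: Callable[[dict], None],
                         timeout_s: float = 3600.0) -> None:
    """Process each file in its own worker thread, one after another (sketch)."""
    for file_info in batch["files"]:
        file_info["status"] = "processing"
        t = threading.Thread(target=process_file, args=(file_info,))
        t.start()
        t.join(timeout=timeout_s)  # 1-hour timeout per file by default
        if t.is_alive() or file_info.get("error"):
            # Error handling + continuation: record the failure, move on.
            file_info["status"] = "error"
            batch["failed_files"] += 1
            continue
        file_info["status"] = "complete"
        batch["completed_files"] += 1
    # Overall batch status mirrors the BatchJob docstring: partial on failures.
    batch["status"] = "complete" if batch["failed_files"] == 0 else "partial"
```

Running files strictly one at a time (start, join, then start the next) is what keeps the upstream API within its limits, at the cost of total wall-clock time.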
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
commit b70b796ef8 (parent 7a7a2b8e19)
Date: 2026-01-08 22:41:52 +01:00
5 changed files with 819 additions and 37 deletions


@@ -1216,3 +1216,47 @@ class DeleteDocumentResult(TypedDict, total=False):
    deleted_sections: int
    deleted_document: bool
    error: Optional[str]


class BatchFileInfo(TypedDict, total=False):
    """Information about a single file in a batch upload.

    Attributes:
        filename: Original filename
        job_id: Processing job ID (assigned when processing starts)
        status: Current status (pending, processing, complete, error)
        error: Error message if processing failed
        size_bytes: File size in bytes
    """

    filename: str
    job_id: Optional[str]
    status: str  # Literal["pending", "processing", "complete", "error"]
    error: Optional[str]
    size_bytes: int


class BatchJob(TypedDict, total=False):
    """Batch processing job tracking multiple file uploads.

    Attributes:
        job_ids: List of individual processing job IDs
        files: List of file information dictionaries
        total_files: Total number of files in batch
        completed_files: Number of files successfully processed
        failed_files: Number of files that failed processing
        status: Overall batch status (processing, complete, partial)
        current_job_id: Currently processing job ID (None if between files)
        options: Processing options applied to all files
        created_at: Timestamp when batch was created
    """

    job_ids: List[str]
    files: List[BatchFileInfo]
    total_files: int
    completed_files: int
    failed_files: int
    status: str  # Literal["processing", "complete", "partial"]
    current_job_id: Optional[str]
    options: ProcessingOptions
    created_at: float
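
Because both classes are declared with `total=False`, every key is optional at construction time, which is what lets a batch record exist before any job IDs have been assigned. A self-contained usage sketch (`ProcessingOptions` stubbed as a plain dict here, since its definition lives elsewhere in utils/types.py):

```python
import time
from typing import List, Optional, TypedDict

class BatchFileInfo(TypedDict, total=False):
    filename: str
    job_id: Optional[str]
    status: str
    error: Optional[str]
    size_bytes: int

class BatchJob(TypedDict, total=False):
    job_ids: List[str]
    files: List[BatchFileInfo]
    total_files: int
    completed_files: int
    failed_files: int
    status: str
    current_job_id: Optional[str]
    options: dict  # stand-in for ProcessingOptions
    created_at: float

# A freshly created batch: files are pending, no job IDs assigned yet.
job: BatchJob = {
    "files": [{"filename": "report.pdf", "status": "pending"}],
    "total_files": 1,
    "completed_files": 0,
    "failed_files": 0,
    "status": "processing",
    "created_at": time.time(),
}
```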