Add feature specification for native Markdown (.md) file support:
- Skip OCR for .md files (0€ cost vs ~0.003€/page for PDF)
- Process Markdown directly through LLM pipeline
- Maintain full compatibility with existing PDF workflow
- Spec covers 10 features, 5 implementation steps, and comprehensive tests
The feature lets users upload pre-digitized philosophical texts in
Markdown format without incurring OCR costs, while still benefiting
from LLM-based metadata extraction, TOC generation, semantic chunking,
and Weaviate vectorization.
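The routing described above could be sketched roughly as follows. This is
only an illustration under stated assumptions, not the project's actual
code: the function name `plan_ingestion`, the stage labels, and the
`OCR_COST_PER_PAGE_EUR` constant are hypothetical, and the per-page price
is simply the approximate figure quoted in this message.

```python
from pathlib import Path

# Assumption: approximate OCR price per PDF page, as quoted in this message.
OCR_COST_PER_PAGE_EUR = 0.003

# Hypothetical labels for the LLM-based stages shared by both file types.
LLM_STAGES = [
    "metadata_extraction",
    "toc_generation",
    "semantic_chunking",
    "weaviate_vectorization",
]


def plan_ingestion(filename: str, page_count: int = 0) -> dict:
    """Return the stages that would run for a file and its estimated OCR cost."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".md":
        # Markdown is already text, so the OCR stage is skipped entirely.
        return {"stages": list(LLM_STAGES), "ocr_cost_eur": 0.0}
    if suffix == ".pdf":
        # The existing PDF workflow is unchanged: OCR first, then the LLM stages.
        cost = round(page_count * OCR_COST_PER_PAGE_EUR, 4)
        return {"stages": ["ocr"] + LLM_STAGES, "ocr_cost_eur": cost}
    raise ValueError(f"Unsupported file type: {suffix or '(none)'}")


if __name__ == "__main__":
    print(plan_ingestion("meditations.md"))
    print(plan_ingestion("meditations.pdf", page_count=200))
```

Dispatching on the file extension keeps the PDF path untouched, so the only
difference for a .md upload is that the OCR stage and its cost drop out.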
🤖 Generated with Claude Code
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>