Add TTS (Text-to-Speech) feature with XTTS v2

- Add TTS>=0.22.0 to the dependencies
- Create the utils/tts_generator.py module using Coqui XTTS v2
  * GPU support with mixed precision (FP16)
  * Lazy loading with a singleton pattern
  * Automatic chunking for long texts
  * Multilingual support (fr, en, es, de, etc.)
- Add the /chat/export-audio route in flask_app.py
- Add the Audio button in chat.html (next to Word/PDF)
- Generate a downloadable WAV audio file from responses

Optimized for a 4070 GPU (8GB VRAM): uses 4-6GB, fast generation
Quality: natural-sounding French voice with expressive prosody

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
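The diff shown on this page only covers the Flask route, so utils/tts_generator.py itself is not visible. As a rough illustration of two ideas the commit message names (lazy loading behind a singleton, and chunking of long texts), here is a minimal sketch. The names `LazySingleton` and `split_into_chunks` and the `max_chars=250` default are hypothetical and are not taken from the commit; the real module may work differently.

```python
import re
import threading
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class LazySingleton(Generic[T]):
    """Build an expensive object on first use and reuse it afterwards.

    Hypothetical helper: in the real module the loader would presumably
    construct the XTTS v2 model, which is why loading is deferred.
    """

    def __init__(self, loader: Callable[[], T]) -> None:
        self._loader = loader
        self._instance: Optional[T] = None
        self._lock = threading.Lock()

    def get(self) -> T:
        # Double-checked locking so concurrent requests load the model once.
        if self._instance is None:
            with self._lock:
                if self._instance is None:
                    self._instance = self._loader()
        return self._instance


def split_into_chunks(text: str, max_chars: int = 250) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence ends.

    max_chars=250 is an illustrative default, not a value from the commit.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?…])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut on word boundaries.
        while len(sentence) > max_chars:
            cut = sentence.rfind(" ", 0, max_chars + 1)
            if cut <= 0:
                cut = max_chars  # no usable space: hard cut
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:cut].strip())
            sentence = sentence[cut:].strip()
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

With helpers like these, a generator could fetch the model through `LazySingleton.get()` on each request, synthesize each chunk separately, and concatenate the resulting audio. How the actual module joins chunks is not shown in this commit view.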
@@ -1575,6 +1575,72 @@ def chat_export_pdf() -> Union[WerkzeugResponse, tuple[Dict[str, Any], int]]:
         return jsonify({"error": f"Export failed: {str(e)}"}), 500
 
 
+@app.route("/chat/export-audio", methods=["POST"])
+def chat_export_audio() -> Union[WerkzeugResponse, tuple[Dict[str, Any], int]]:
+    """Export a chat exchange to audio format (TTS).
+
+    Generates a natural-sounding speech audio file (.wav) from the assistant's
+    response using Coqui XTTS v2 multilingual TTS model. Supports GPU acceleration
+    for faster generation.
+
+    Request JSON:
+        assistant_response (str): The assistant's complete response (required).
+        language (str, optional): Language code for TTS ("fr", "en", etc.).
+            Default: "fr" (French).
+
+    Returns:
+        Audio file download (.wav) on success.
+        JSON error response with 400/500 status on failure.
+
+    Example:
+        POST /chat/export-audio
+        Content-Type: application/json
+
+        {
+            "assistant_response": "La phénoménologie est une approche philosophique...",
+            "language": "fr"
+        }
+
+        Response: chat_audio_20250130_143045.wav (download)
+
+    Note:
+        First call will download XTTS v2 model (~2GB) and cache it.
+        GPU usage: 4-6GB VRAM. Falls back to CPU if no GPU available.
+    """
+    try:
+        data = request.get_json()
+
+        if not data:
+            return jsonify({"error": "No JSON data provided"}), 400
+
+        assistant_response = data.get("assistant_response")
+        language = data.get("language", "fr")
+
+        if not assistant_response:
+            return jsonify({"error": "assistant_response is required"}), 400
+
+        # Import TTS generator
+        from utils.tts_generator import generate_speech
+
+        # Generate audio file
+        filepath = generate_speech(
+            text=assistant_response,
+            output_dir=app.config["UPLOAD_FOLDER"],
+            language=language,
+        )
+
+        # Send file as download
+        return send_from_directory(
+            directory=filepath.parent,
+            path=filepath.name,
+            as_attachment=True,
+            download_name=filepath.name,
+        )
+
+    except Exception as e:
+        return jsonify({"error": f"TTS failed: {str(e)}"}), 500
+
+
 # ═══════════════════════════════════════════════════════════════════════════════
 # PDF Upload & Processing
 # ═══════════════════════════════════════════════════════════════════════════════
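To try the new route from outside the browser, a client only needs to POST the JSON body described in the docstring and save the returned bytes. The sketch below uses only the standard library. The function names and the base URL `http://localhost:5000` (Flask's default development address) are assumptions for illustration and do not come from the commit.

```python
import json
import urllib.request


def build_export_audio_request(
    base_url: str, text: str, language: str = "fr"
) -> urllib.request.Request:
    """Build the POST request expected by /chat/export-audio."""
    payload = json.dumps(
        {"assistant_response": text, "language": language}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/export-audio",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def download_audio(
    base_url: str,
    text: str,
    out_path: str,
    language: str = "fr",
    timeout: float = 300.0,
) -> None:
    """Send the request and write the returned WAV bytes to out_path.

    urlopen raises urllib.error.HTTPError for the 400/500 JSON error
    responses, so callers may want to catch it. The generous timeout is a
    guess, since the first call may have to download the model.
    """
    request = build_export_audio_request(base_url, text, language)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        audio_bytes = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio_bytes)


if __name__ == "__main__":
    # Assumes the Flask app is running locally on the default port.
    download_audio("http://localhost:5000", "Bonjour tout le monde.", "out.wav")
```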