feat: wire PDF and text file content into AI chat messages

PDF uploads were stored in S3 and had text extracted during upload, but
fetch_upload_images() filtered exclusively for image MIME types, so
document content never reached the AI.

- Add fetch_upload_documents() in storage_service.py to retrieve
  extracted_content for PDFs and text files
- Update ai_sessions.py chat endpoint to call both fetch_upload_images
  and fetch_upload_documents, injecting document text as context
- Add PDF text extraction in _generate_ai_description (pypdf)
- Add pypdf>=4.0.0 to requirements.txt
- Fix test_db teardown to avoid connection pool issues
- Add 5 tests for fetch_upload_documents
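
A minimal sketch of the document path described above, under assumptions:
the Upload record, the MIME type set, the "[Document: ...]" label format,
and the helper names are all hypothetical, and this is not the code in
storage_service.py or ai_sessions.py. It only illustrates selecting PDF and
text uploads that have extracted_content and prepending that text to the
user message as context.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed MIME types; the commit does not list the exact set it accepts.
DOCUMENT_MIME_TYPES = {"application/pdf"}
TEXT_MIME_PREFIX = "text/"


@dataclass
class Upload:
    """Hypothetical stand-in for a stored upload record."""
    filename: str
    mime_type: str
    extracted_content: Optional[str] = None


def is_document(upload: Upload) -> bool:
    """True for PDFs and text files, False for images and anything else."""
    return (
        upload.mime_type in DOCUMENT_MIME_TYPES
        or upload.mime_type.startswith(TEXT_MIME_PREFIX)
    )


def fetch_upload_documents_sketch(uploads: List[Upload]) -> List[Tuple[str, str]]:
    """Return (filename, text) pairs for documents with non-empty extracted text."""
    documents = []
    for upload in uploads:
        text = (upload.extracted_content or "").strip()
        if is_document(upload) and text:
            documents.append((upload.filename, text))
    return documents


def inject_documents(user_message: str, uploads: List[Upload]) -> str:
    """Prepend document text to the user message; return it unchanged if none."""
    documents = fetch_upload_documents_sketch(uploads)
    if not documents:
        return user_message
    context = "\n\n".join(f"[Document: {name}]\n{text}" for name, text in documents)
    return f"{context}\n\n{user_message}"
```

Image uploads and documents without extracted text are skipped, so a message
with no usable documents passes through unchanged.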

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
chihlasm
2026-03-27 21:02:56 +00:00
parent 3cea949519
commit 11de850054
6 changed files with 324 additions and 12 deletions

@@ -57,3 +57,6 @@ boto3>=1.34.0
# Image processing (vision upload resize)
Pillow>=10.0.0
# PDF text extraction (upload analysis)
pypdf>=4.0.0