feat: wire PDF and text file content into AI chat messages

PDF uploads were stored in S3 and had text extracted during upload, but
fetch_upload_images() filtered exclusively for image MIME types, so
document content never reached the AI.

- Add fetch_upload_documents() in storage_service.py to retrieve
  extracted_content for PDFs and text files
- Update ai_sessions.py chat endpoint to call both fetch_upload_images
  and fetch_upload_documents, injecting document text as context
- Add PDF text extraction in _generate_ai_description (pypdf)
- Add pypdf>=4.0.0 to requirements.txt
- Fix test_db teardown to avoid connection pool issues
- Add 5 tests for fetch_upload_documents

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
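
The document path described above can be sketched as follows. This is a minimal illustration, not the code from this commit: the `Upload` record, `is_document`, `select_documents`, `build_document_context`, and the `max_chars` limit are all hypothetical names, and the actual signature of fetch_upload_documents() is not shown here.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical record shape; the real upload model is not shown in this commit.
@dataclass
class Upload:
    filename: str
    mime_type: str
    extracted_content: Optional[str] = None


def is_document(mime_type: str) -> bool:
    """Treat PDFs and text/* files as documents (assumed rule)."""
    return mime_type == "application/pdf" or mime_type.startswith("text/")


def select_documents(uploads: list[Upload]) -> list[Upload]:
    """Keep uploads that are documents and have non-empty extracted text."""
    return [
        u for u in uploads
        if is_document(u.mime_type) and (u.extracted_content or "").strip()
    ]


def build_document_context(uploads: list[Upload], max_chars: int = 20_000) -> str:
    """Join extracted text into one context string, truncating each document."""
    parts = []
    for u in select_documents(uploads):
        text = u.extracted_content.strip()[:max_chars]
        parts.append(f"[Document: {u.filename}]\n{text}")
    return "\n\n".join(parts)
```

Images are skipped by the MIME check, so an image-only filter and a document-only filter can run over the same upload list without overlapping.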


chihlasm

2026-03-27 21:02:56 +00:00

parent 3cea949519

commit 11de850054

6 changed files with 324 additions and 12 deletions


backend/requirements.txt


@@ -57,3 +57,6 @@ boto3>=1.34.0
 # Image processing (vision upload resize)
 Pillow>=10.0.0
+# PDF text extraction (upload analysis)
+pypdf>=4.0.0
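
For context on what the new dependency provides, here is a minimal sketch of PDF text extraction with pypdf. It is an assumption about how the extraction might look, not the body of _generate_ai_description; `extract_pdf_text` and `join_page_texts` are hypothetical helper names.

```python
import io


def join_page_texts(page_texts: list) -> str:
    """Join per-page text, skipping pages with no extractable text."""
    cleaned = [(t or "").strip() for t in page_texts]
    return "\n\n".join(t for t in cleaned if t)


def extract_pdf_text(data: bytes) -> str:
    """Return the text of every page in a PDF given as raw bytes."""
    # Imported here so this module still loads where pypdf is not installed.
    from pypdf import PdfReader

    reader = PdfReader(io.BytesIO(data))
    return join_page_texts([page.extract_text() for page in reader.pages])
```

pypdf does not perform OCR, so a scanned PDF without a text layer would yield an empty string here.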
