feat(platform): Complete Postgres-Only architecture refactoring (Phase 1-7)

Major Changes:
- Implement Platform-Only architecture pattern (unified task management)
- Add PostgresCacheAdapter for unified caching (platform_schema.app_cache)
- Add PgBossQueue for job queue management (platform_schema.job)
- Implement CheckpointService using job.data (generic for all modules)
- Add intelligent threshold-based dual-mode processing (THRESHOLD=50)
- Add task splitting mechanism (auto chunk size recommendation)
- Refactor ASL screening service with smart mode selection
- Refactor DC extraction service with smart mode selection
- Register workers for ASL and DC modules

Technical Highlights:
- All task management data stored in platform_schema.job.data (JSONB)
- Business tables remain clean (no task management fields)
- CheckpointService is generic (shared by all modules)
- Zero code duplication (DRY principle)
- Follows 3-layer architecture principle
- Zero additional cost (no Redis needed, save 8400 CNY/year)

Code Statistics:
- New code: ~1750 lines
- Modified code: ~500 lines
- Test code: ~1800 lines
- Documentation: ~3000 lines

Testing:
- Unit tests: 8/8 passed
- Integration tests: 2/2 passed
- Architecture validation: passed
- Linter errors: 0

Files:
- Platform layer: PostgresCacheAdapter, PgBossQueue, CheckpointService, utils
- ASL module: screeningService, screeningWorker
- DC module: ExtractionController, extractionWorker
- Tests: 11 test files
- Docs: Updated 4 key documents

Status: Phase 1-7 completed, Phase 8-9 pending
2025-12-13 16:10:04 +08:00
parent a3586cdf30
commit fa72beea6c
135 changed files with 17508 additions and 91 deletions
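
The commit message describes threshold-based mode selection (THRESHOLD=50), task splitting, and periodic checkpoints. The following is a minimal TypeScript sketch of how those three rules might fit together. It is not the code from this commit: every name (`selectMode`, `splitIntoBatches`, `resumeIndex`, `shouldCheckpoint`, the `Checkpoint` shape) is hypothetical, the batch size of 50 is taken from the documented examples, and the 10-paper checkpoint interval comes from the updated module guide.

```typescript
// Hypothetical sketch; none of these identifiers come from the actual commit.
const QUEUE_THRESHOLD = 50;     // THRESHOLD=50 from the commit message
const BATCH_SIZE = 50;          // assumed from the "100 papers -> 2 batches" example
const CHECKPOINT_INTERVAL = 10; // assumed from "a checkpoint every 10 papers"

type ProcessingMode = "direct" | "queue";

// Small tasks (<50 items) run directly; larger ones go through the job queue.
function selectMode(itemCount: number): ProcessingMode {
  return itemCount >= QUEUE_THRESHOLD ? "queue" : "direct";
}

// Split a task into consecutive batches of at most `batchSize` items.
function splitIntoBatches<T>(items: T[], batchSize: number = BATCH_SIZE): T[][] {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Assumed checkpoint shape; the real payload kept in job.data is not shown here.
interface Checkpoint {
  processedCount: number;
}

// Index to resume from: 0 when no checkpoint exists, else the saved position.
function resumeIndex(checkpoint: Checkpoint | null): number {
  return checkpoint?.processedCount ?? 0;
}

// Save after every CHECKPOINT_INTERVAL items and once more at the end of a batch.
function shouldCheckpoint(processedCount: number, total: number): boolean {
  return processedCount % CHECKPOINT_INTERVAL === 0 || processedCount === total;
}

const papers = Array.from({ length: 100 }, (_, i) => `paper-${i}`);
console.log(selectMode(7));                   // "direct"
console.log(selectMode(papers.length));       // "queue"
console.log(splitIntoBatches(papers).length); // 2
```

Under these assumptions, a worker would start each batch at `resumeIndex(...)` and persist progress whenever `shouldCheckpoint(...)` returns true, so a restarted instance repeats at most 9 papers of work per batch.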
--- a/docs/03-业务模块/ASL-AI智能文献/00-模块当前状态与开发指南.md
+++ b/docs/03-业务模块/ASL-AI智能文献/00-模块当前状态与开发指南.md
@@ -1,9 +1,10 @@
 # AI Intelligent Literature Module - Current Status and Development Guide

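-> **Document version:** v1.3  
+> **Document version:** v1.4  
 > **Created:** 2025-11-21  
 > **Maintainer:** AI Intelligent Literature development team  
-> **Last updated:** 2025-11-23 (after Day 5 completion)  
+> **Last updated:** 2025-12-13 🏆 **Postgres-Only architecture refactoring completed**  
+> **Major progress:** Platform-Only architecture refactoring - intelligent dual-mode processing, task splitting, checkpoint resume  
 > **Document purpose:** Reflect the module's actual state and help new developers get up to speed quickly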

 ---
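@@ -35,6 +36,50 @@ The AI Intelligent Literature module is a literature screening system based on large language models (LLMs)
 - **Model support:** DeepSeek-V3 + Qwen-Max dual-model screening
 - **Deployment status:** ✅ Running normally in the local development environment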

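+### 🏆 Postgres-Only Architecture Refactoring (completed 2025-12-13)
+
+**Refactoring goals:**
+- Support long-running tasks of 2-24 hours (screening 1000 papers)
+- Tasks can be recovered after an instance restart (checkpoint resume)
+- Zero additional cost (uses Postgres, no Redis required)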
+
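+**Core implementation:**
+
+1. **Intelligent dual-mode processing** 🎯
+   - Threshold: 50 papers
+   - Small tasks (<50 papers): processed directly for a fast response (<1 minute)
+   - Large tasks (≥50 papers): processed via the queue for high reliability (supports checkpoint resume)
+
+2. **Task splitting mechanism** 📦
+   - 100 papers → 2 batches (50 papers each)
+   - 1000 papers → 20 batches (50 papers each)
+   - Batch size is recommended automatically
+
+3. **Checkpoint resume mechanism** 🔄
+   - A checkpoint is saved every 10 papers
+   - Checkpoint data is stored in `platform_schema.job.data` (pg-boss)
+   - After an instance restart, processing resumes automatically from the last saved position
+
+4. **Unified management in the Platform layer** 🏗️
+   - Task management information is not stored in `asl_schema.screening_tasks`
+   - It is stored centrally in `platform_schema.job.data` (JSONB)
+   - `CheckpointService` reads and writes job.data (shared by all modules)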
+
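+**Refactored files:**
+- `screeningService.ts`: adds the intelligent threshold check and pushes batch jobs to pg-boss
+- `screeningWorker.ts`: batch processing logic and the checkpoint resume implementation
+- `CheckpointService.ts`: operates on job.data, with no dependency on business tables
+
+**Test verification:**
+- ✅ Small task (7 papers) - direct mode test passed
+- ✅ Large task (100 papers) - queue mode test passed
+- ✅ Task splitting logic verified
+- ✅ Platform-Only architecture verified
+
+**Technical debt:**
+- ⚠️ Phase 8 comprehensive testing (checkpoint resume stress test, full 1000-paper workflow)
+- ⚠️ Phase 9 SAE deployment verification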
+
 ### Key milestones

 **Title/abstract initial screening (completed)**: