Major Changes: - Add StreamingService with OpenAI Compatible format - Upgrade Chat component V2 with Ant Design X integration - Implement AIA module with 12 intelligent agents - Update API routes to unified /api/v1 prefix - Update system documentation Backend (~1300 lines): - common/streaming: OpenAI Compatible adapter - modules/aia: 12 agents, conversation service, streaming integration - Update route versions (RVW, PKB to v1) Frontend (~3500 lines): - modules/aia: AgentHub + ChatWorkspace (100% prototype restoration) - shared/Chat: AIStreamChat, ThinkingBlock, useAIStream Hook - Update API endpoints to v1 Documentation: - AIA module status guide - Universal capabilities catalog - System overview updates - All module documentation sync Tested: Stream response verified, authentication working Status: AIA V2.0 core completed (85%)
81 KiB
Postgres-Only æž¶æž„æ”¹é€ å®žæ–½è®¡åˆ’ï¼ˆå®Œæ•´ç‰ˆï¼‰
*文档版本� V2.0 �*已更�
*创建日期� 2025-12-13
*更新日期� 2025-12-13
*ç›®æ ‡å®Œæˆ<EFBFBD>æ—¶é—´ï¼? 2025-12-20ï¼?天)
负责人: 技术团é˜? *风险ç‰çº§ï¼? 🟢 低(基于æˆ<C3A6>熟方案,有é™<C3A9>级ç–ç•¥ï¼? *æ ¸å¿ƒåŽŸåˆ™ï¼? 通用能力层优先,å¤<C3A5>用架构优势
âš ï¸<EFBFBD> V2.0 更新说明
本版本基äº?*真实代ç <C3A7>结构验è¯<C3A8>**å’?*技术方案深度分æž?*进行了é‡<C3A9>è¦<C3A8>更新:
*æ›´æ–°1:代ç <EFBFBD>结构真实性验è¯? âœ?- âœ?已从实际代ç <C3A7>验è¯<C3A8>所有路径和文件
- âœ?æ˜Žç¡®æ ‡æ³¨"å·²å˜åœ?ã€?å<> ä½<C3A4>ç¬?ã€?需新增"
- âœ?RedisCacheAdapter确认为未实现的å<E2809E> ä½<C3A4>符
- âœ?æ‰€æœ‰æ”¹é€ ç‚¹åŸºäºŽçœŸå®žä»£ç <C3A7>结构
*æ›´æ–°2:长任务å<EFBFBD>¯é<EFBFBD> 性ç–ç•? 🔴
- âœ?新增ï¼?*æ–点ç»ä¼ 机制**(ç–ç•?ï¼? 强烈推è<C2A8><C3A8>
- âœ?新增ï¼?*任务拆分ç–ç•¥**(ç–ç•?ï¼? 架构级优åŒ?- âœ?分æž<C3A6>:心跳ç»ç§Ÿæœºåˆ¶ï¼ˆç–ç•¥1ï¼? ä¸<C3A4>推è<C2A8><C3A8>实æ–?- âœ?明确:pg-boss的技术é™<C3A9>制(最é•?å°<C3A5>时,推è<C2A8>?å°<C3A5>æ—¶ï¼?- âœ?ä¿®æ£ï¼?4å°<C3A5>时连ç»ä»»åŠ¡ä¸<C3A4>支æŒ<C3A6>,需拆分
*更新3:实施计划调�
- âœ?工作é‡<C3A9>å¢žåŠ åˆ°9å¤©ï¼ˆå¢žåŠ ä»»åŠ¡æ‹†åˆ†å’Œæ–点ç»ä¼ )
- âœ?完整的代ç <C3A7>示例(å<CB86>¯ç›´æŽ¥ä½¿ç”¨ï¼‰
- âœ?æ•°æ<C2B0>®åº“Schemaæ›´æ–°ï¼ˆæ–°å¢žå—æ®µï¼‰
📋 目录
- æ”¹é€ èƒŒæ™¯ä¸Žç›®æ ‡
- 当å‰<EFBFBD>系统分æž<EFBFBD>
- æ”¹é€ æ€»ä½“æž¶æž„
- 详细实施æ¥éª¤
- 优先级与ä¾<EFBFBD>赖关系
- 测试验è¯<EFBFBD>方案
- [上线与回滚](#7-上线与回�
1. æ”¹é€ èƒŒæ™¯ä¸Žç›®æ ‡
1.1 为什么选择Postgres-Only�
æ ¸å¿ƒç—›ç‚¹ï¼?â<>?长任务ä¸<C3A4>å<EFBFBD>¯é<C2AF> ï¼?å°<C3A5>时任务实例销æ¯<C3A6>丢失)
â<>?LLMæˆ<C3A6>本失控(缓å˜ä¸<C3A4>æŒ<C3A6>久化)
â<>?多实例ä¸<C3A4>å<EFBFBD>Œæ¥ï¼ˆå†…å˜ç¼“å˜å<CB9C>„自独立)
技术方案选择ï¼?âœ?Postgres-Only(推è<C2A8><C3A8>)
- 架构简å<E282AC>•(1-2人团队)
- è¿<C3A8>ç»´æˆ<C3A6>本低(å¤<C3A5>用RDSï¼? - æ•°æ<C2B0>®ä¸€è‡´æ€§å¼ºï¼ˆäº‹åŠ¡ä¿<C3A4>è¯<C3A8>)
- 节çœ<C3A7>æˆ<C3A6>本(Â?000+/年)
â<>?Redis方案(ä¸<C3A4>推è<C2A8><C3A8>ï¼? - æž¶æž„å¤<C3A5>æ<EFBFBD>‚(å<CB86>Œç³»ç»Ÿï¼? - è¿<C3A8>ç»´è´Ÿæ‹…é‡<C3A9>(需维护Redisï¼? - æ•°æ<C2B0>®ä¸€è‡´æ€§å¼±ï¼ˆå<CB86>Œå†™é—®é¢˜ï¼‰
- é¢<C3A9>外æˆ<C3A6>本(Â?000+/年)
1.2 æ”¹é€ ç›®æ ?
| ç›®æ ‡ | 当å‰<EFBFBD>状æ€? | æ”¹é€ å<EFBFBD>Ž | è¡¡é‡<EFBFBD>æŒ‡æ ‡ |
|---|---|---|---|
| *长任务å<EFBFBD>¯é<EFBFBD> æ€? | 5-10%(实例销æ¯<C3A6>丢失) | > 99% | 2å°<EFBFBD>时任务æˆ<EFBFBD>功çŽ? |
| LLMæˆ<EFBFBD>本 | é‡<EFBFBD>å¤<EFBFBD>调用多次 | é™<EFBFBD>低50%+ | 月度API费用 |
| *缓å˜å‘½ä¸çŽ? | 0%(ä¸<C3A4>æŒ<C3A6>ä¹…ï¼? | 60%+ | 监控统计 |
| *多实例å<EFBFBD>Œæ? | â<EFBFBD>?å<>„自独立 | âœ?å…±äº«ç¼“å˜ | 实例A写→实例Bè¯? |
| *æž¶æž„å¤<EFBFBD>æ<EFBFBD>‚åº? | ä¸ç‰ | ä½? | ä¸é—´ä»¶æ•°é‡? |
2. 当å‰<C3A5>系统分æž<C3A6>
2.1 代ç <C3A7>结构现状ï¼?层架构)- âœ?*已验è¯<EFBFBD>真实代ç ?
*验è¯<EFBFBD>方法ï¼? 通过
list_dirå’?read_file实际检查代ç <C3A7>库
*验è¯<EFBFBD>日期ï¼? 2025-12-13
*验è¯<EFBFBD>结果ï¼? 所有路径和文件状æ€<C3A6>å<EFBFBD>‡å·²ç¡®è®?
AIclinicalresearch/backend/src/
├── common/ # 🔵 通用能力å±?â”? ├── cache/ # âœ?真实å˜åœ¨
â”? â”? ├── CacheAdapter.ts # âœ?接å<C2A5>£å®šä¹‰ï¼ˆçœŸå®žï¼‰
â”? â”? ├── CacheFactory.ts # âœ?工厂模å¼<C3A5>(真实)
� � ├── index.ts # �导出文件(真实)
â”? â”? ├── MemoryCacheAdapter.ts # âœ?内å˜å®žçŽ°ï¼ˆçœŸå®žï¼Œå·²å®Œæˆ<C3A6>)
â”? â”? ├── RedisCacheAdapter.ts # 🔴 å<> ä½<C3A4>符(真实å˜åœ¨ï¼Œä½†æœªå®žçŽ°ï¼‰
â”? â”? └── PostgresCacheAdapter.ts # â<>?éœ€æ–°å¢žï¼ˆæœ¬æ¬¡æ”¹é€ ï¼‰
â”? â”?â”? ├── jobs/ # âœ?真实å˜åœ¨
â”? â”? ├── types.ts # âœ?接å<C2A5>£å®šä¹‰ï¼ˆçœŸå®žï¼‰
â”? â”? ├── JobFactory.ts # âœ?工厂模å¼<C3A5>(真实)
� � ├── index.ts # �导出文件(真实)
â”? â”? ├── MemoryQueue.ts # âœ?内å˜å®žçŽ°ï¼ˆçœŸå®žï¼Œå·²å®Œæˆ<C3A6>)
â”? â”? └── PgBossQueue.ts # â<>?éœ€æ–°å¢žï¼ˆæœ¬æ¬¡æ”¹é€ ï¼‰
â”? â”?â”? ├── storage/ # âœ?å˜å‚¨æœ<C3A6>务(已完善ï¼?â”? ├── logging/ # âœ?日志系统(已完善ï¼?â”? └── llm/ # âœ?LLM网关(已完善ï¼?â”?├── modules/ # 🟢 业务模å<C2A1>—å±?â”? ├── asl/ # âœ?真实å˜åœ¨
� � ├── services/
â”? â”? â”? ├── screeningService.ts # âš ï¸<C3AF> éœ€æ”¹é€ ï¼šæ”¹ä¸ºé˜Ÿåˆ—ï¼ˆçœŸå®žï¼‰
â”? â”? â”? └── llmScreeningService.ts # âœ?真实å˜åœ¨
� � └── common/llm/
â”? â”? └── LLM12FieldsService.ts # âœ?已用缓å˜ï¼ˆçœŸå®žï¼Œç¬?16行)
â”? â”?â”? └── dc/ # âœ?真实å˜åœ¨
� └── tool-b/services/
â”? └── HealthCheckService.ts # âœ?已用缓å˜ï¼ˆçœŸå®žï¼Œç¬?7行)
â”?└── config/ # 🔵 å¹³å<C2B3>°åŸºç¡€å±? ├── database.ts # âœ?Prismaé…<C3A9>置(真实)
└── env.ts # âš ï¸<C3AF> éœ€æ·»åŠ æ–°çŽ¯å¢ƒå<C692>˜é‡<C3A9>(真实文件ï¼?```
**é‡<C3A9>è¦<C3A8>å<EFBFBD>‘现ï¼?*
1. **âœ?3层架构真实å˜åœ?* - 代ç <C3A7>组织完全符å<C2A6>ˆè®¾è®¡
2. **âœ?工厂模å¼<C3A5>已实çŽ?* - CacheFactoryå’ŒJobFactory真实å<C5BE>¯ç”¨
3. **🔴 RedisCacheAdapter是å<C2AF> ä½<C3A4>符** - 所有方法都 `throw new Error('Not implemented')`
4. **âœ?业务层已使用cache接å<C2A5>£** - æ”¹é€ æ—¶ä¸šåŠ¡ä»£ç <C3A7>æ— éœ€ä¿®æ”¹
5. **â<>?PostgresCacheAdapterå’ŒPgBossQueue需新增** - æœ¬æ¬¡æ”¹é€ çš„æ ¸å¿ƒå·¥ä½œ
### 2.2 缓å˜ä½¿ç”¨çŽ°çŠ¶ï¼ˆâœ… 已验è¯<C3A8>)
| ä½<C3A4>ç½® | 用é€?| TTL | æ•°æ<C2B0>®é‡?| é‡<C3A9>è¦<C3A8>æ€?| 代ç <C3A7>行å<C592>· | 当å‰<C3A5>实现 |
|------|------|-----|--------|--------|---------|---------|
| **LLM12FieldsService.ts** | LLM 12å—æ®µæ<C2B5><C3A6>å<EFBFBD>–ç¼“å˜ | 1å°<C3A5>æ—¶ | ~50KB/é¡?| 🔴 高(æˆ<C3A6>本ï¼?| ç¬?16è¡?| âœ?已用cache |
| **HealthCheckService.ts** | Excelå<6C>¥åº·æ£€æŸ¥ç¼“å?| 24å°<C3A5>æ—¶ | ~5KB/é¡?| 🟡 ä¸ç‰ | ç¬?7è¡?| âœ?已用cache |
**代ç <C3A7>示例(真实代ç <C3A7>)ï¼?*
```typescript
// backend/src/modules/asl/common/llm/LLM12FieldsService.ts (�16�
const cached = await cache.get(cacheKey);
if (cached) {
logger.info('缓å˜å‘½ä¸', { cacheKey });
return cached;
}
// ... LLM调用 ...
await cache.set(cacheKey, JSON.stringify(result), 3600); // 1å°<C3A5>æ—¶
// backend/src/modules/dc/tool-b/services/HealthCheckService.ts (�7�
const cached = await cache.get<HealthCheckResult>(cacheKey);
if (cached) return cached;
// ... Excelè§£æž<C3A6> ...
await cache.set(cacheKey, result, 86400); // 24å°<C3A5>æ—¶
结论:✅ 缓å˜ç³»ç»Ÿå·²åœ¨ä½¿ç”¨ï¼Œå<C592>ªéœ€åˆ‡æ<E280A1>¢åº•层实现(Memory â†?Postgres)ã€?
2.3 队列使用现状(✅ 已验è¯<C3A8>)
| ä½<EFBFBD>ç½® | 用é€? | 代ç <EFBFBD>行å<EFBFBD>· | 耗时 | é‡<EFBFBD>è¦<EFBFBD>æ€? | 当å‰<EFBFBD>实现 | 问题 |
|---|---|---|---|---|---|---|
| screeningService.ts | 文献ç›é€‰ä»»åŠ? | ç¬?5è¡? | 2å°<EFBFBD>æ—¶ï¼?000篇) | 🔴 é«? | â<EFBFBD>?å<>Œæ¥æ‰§è¡Œ | 实例销æ¯<EFBFBD>丢å¤? |
| DC Tool B | 病历批é‡<EFBFBD>æ<EFBFBD><EFBFBD>å<EFBFBD>– | 未实çŽ? | 1-3å°<C3A5>æ—¶ | 🔴 é«? | â<EFBFBD>?未实çŽ? | ä¸<EFBFBD>支æŒ<EFBFBD>长任务 |
*代ç <EFBFBD>示例(真实代ç <EFBFBD>)ï¼?
// backend/src/modules/asl/services/screeningService.ts (�5�
// 4. 异æ¥å¤„ç<E2809E>†æ–‡çŒ®ï¼ˆç®€åŒ–版:直接在这里处ç<E2809E>†ï¼?// 生产环境应该å<C2A5>‘é€<C3A9>到消æ<CB86>¯é˜Ÿåˆ—
processLiteraturesInBackground(task.id, projectId, literatures); // â†?å<>Œæ¥æ‰§è¡Œï¼Œæœ‰é£Žé™©
结论:â<EFBFBD>Œ 队列系统尚未使用,需è¦<C3A8>ä¼˜å…ˆæ”¹é€ ã€‚æ³¨é‡Šå·²æ<C2B2><C3A6>示"生产环境应该å<C2A5>‘é€<C3A9>到消æ<CB86>¯é˜Ÿåˆ—"ã€?
2.4 长任务å<C2A1>¯é<C2AF> 性分æž?🔴 新增
当å‰<EFBFBD>问题(真实场景)
场景1ï¼?000篇文献ç›é€‰ï¼ˆçº?å°<C3A5>æ—¶ï¼?├─ 问题1:å<C5A1>Œæ¥æ‰§è¡Œï¼Œé˜»å¡žHTTP请求
├─ 问题2:SAE实例15åˆ†é’Ÿæ— æµ<C3A6>é‡<C3A9>自动缩å®?├─ 问题3:实例é‡<C3A9>å<EFBFBD>¯ï¼Œä»»åŠ¡ä»Žå¤´å¼€å§?└─ 结果:任务æˆ<C3A6>功率 < 10%,用户需é‡<C3A9>å¤<C3A5>æ<EFBFBD><C3A6>交3-5æ¬?
场景2ï¼?0000篇文献ç›é€‰ï¼ˆçº?0å°<C3A5>æ—¶ï¼?├─ 问题1:超过pg-boss最大é”<C3A9>定时间(4å°<C3A5>æ—¶ï¼?├─ 问题2:任务被é‡<C3A9>å¤<C3A5>领å<E280A0>–ï¼Œé€ æˆ<C3A6>é‡<C3A9>å¤<C3A5>处ç<E2809E>†
└─ 结果:任务失败率 100%
场景3:å<C5A1>‘布更新(15:00ï¼?├─ 问题1:æ£åœ¨æ‰§è¡Œçš„任务被强制终æ?├─ 问题2:已处ç<E2809E>†çš„æ–‡çŒ®ç»“果丢å¤?└─ 结果:用户体验æž<C3A6>å·?```
#### **技术é™<C3A9>制分æž?*
| æŠ€æœ¯æ ˆ | é™<C3A9>制 | å½±å“<C3A5> |
|--------|------|------|
| **SAE** | 15åˆ†é’Ÿæ— æµ<C3A6>é‡<C3A9>自动缩å®?| 长任务必然失è´?|
| **pg-boss** | 最长é”<C3A9>å®?å°<C3A5>时(推è<C2A8>?å°<C3A5>æ—¶ï¼?| 超长任务ä¸<C3A4>支æŒ?|
| **HTTP请求** | 最é•?0ç§’è¶…æ—?| ä¸<C3A4>能å<C2BD>Œæ¥æ‰§è¡Œé•¿ä»»åŠ?|
| **实例é‡<C3A9>å<EFBFBD>¯** | 内å˜çжæ€<C3A6>丢å¤?| 任务进度丢失 |
#### **解决方案评估**
| ç–ç•¥ | ä»·å€?| 难度 | pg-boss支æŒ<C3A6> | 推è<C2A8><C3A8>åº?| 实施 |
|------|------|------|------------|--------|------|
| **ç–ç•¥1:心跳ç»ç§?* | â<C3A2>â<C3A2>â<C3A2>â<C3A2> | 🔴 é«?| â<>?ä¸<C3A4>支æŒ?| 🟡 ä¸<C3A4>推è<C2A8>?| æš‚ä¸<C3A4>实施 |
| **ç–ç•¥2:æ–点ç»ä¼?* | â<C3A2>â<C3A2>â<C3A2>â<C3A2>â?| 🟢 ä½?| âœ?兼容 | 🟢 强烈推è<C2A8><C3A8> | **Phase 6** |
| **ç–ç•¥3:任务拆åˆ?* | â<C3A2>â<C3A2>â<C3A2>â<C3A2>â?| 🟡 ä¸?| âœ?原生支æŒ<C3A6> | 🟢 强烈推è<C2A8><C3A8> | **Phase 5** |
**结论**:采ç”?*ç–ç•¥2(æ–点ç»ä¼ )+ ç–ç•¥3(任务拆分)**组å<E2809E>ˆæ–¹æ¡ˆã€?
---
## 3. æ”¹é€ æ€»ä½“æž¶æž„
### 3.1 架构对比
æ”¹é€ å‰<EFBFBD>(当å‰<EFBFBD>)ï¼?┌─────────────────────────────────────â”?â”? Business Layer (ASL, DC, SSA...) â”?â”? ├─ screeningService.ts â”?â”? â”? └─ processInBackground() â”? â†?å<>Œæ¥æ‰§è¡Œï¼ˆé˜»å¡žï¼‰ â”? └─ LLM12FieldsService.ts â”?â”? └─ cache.get/set() â”? â†?使用Memoryç¼“å˜ â””â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”? â†?使用 ┌─────────────────────────────────────â”?â”? Capability Layer (Common) â”?â”? ├─ cache â”?â”? â”? ├─ MemoryCacheAdapter âœ? â”? â†?当å‰<C3A5>使用 â”? â”? └─ RedisCacheAdapter 🔴 â”? â†?å<> ä½<C3A4>符(未实现) â”? └─ jobs â”?â”? └─ MemoryQueue âœ? â”? â†?当å‰<C3A5>使用 └─────────────────────────────────────â”? â†?ä¾<C3A4>èµ– ┌─────────────────────────────────────â”?â”? Platform Layer â”?â”? └─ PostgreSQL (业务数æ<C2B0>®) â”?└─────────────────────────────────────â”? 问题ï¼?â<>?缓å˜ä¸<C3A4>æŒ<C3A6>久(实例é‡<C3A9>å<EFBFBD>¯ä¸¢å¤±ï¼?â<>?队列ä¸<C3A4>æŒ<C3A6>久(任务丢失ï¼?â<>?多实例ä¸<C3A4>共享(å<CB86>„自独立) â<>?长任务ä¸<C3A4>å<EFBFBD>¯é<C2AF> ï¼?å°<C3A5>时任务失败çŽ?> 90%ï¼?â<>?è¿<C3A8>å<EFBFBD><C3A5>Serverless原则(å<CB86>Œæ¥æ‰§è¡Œé•¿ä»»åŠ¡ï¼?```
æ”¹é€ å<EFBFBD>Žï¼ˆPostgres-Only + 任务拆分 + æ–点ç»ä¼ ):
┌──────────────────────────────────────────────────────────────â”?â”? Business Layer (ASL, DC, SSA...) â”?â”? ├─ screeningService.ts â”?â”? â”? ├─ startScreeningTask() â”? â†?æ”¹é€ ç‚¹1:任务拆åˆ? â”?â”? â”? â”? └─ 推é€<C3A9>N个批次到队列 â”? â”?â”? â”? └─ registerWorkers() â”? â”?â”? â”? └─ 处ç<E2809E>†å<E280A0>•个批次+æ–点ç»ä¼ â”? â†?æ”¹é€ ç‚¹2:æ–点ç»ä¼? â”?â”? └─ LLM12FieldsService.ts â”?â”? └─ cache.get/set() â”? â†?æ— éœ€æ”¹åŠ¨ï¼ˆæŽ¥å<C2A5>£ä¸<C3A4>å<EFBFBD>˜ï¼‰â”?└──────────────────────────────────────────────────────────────â”? â†?使用
┌──────────────────────────────────────────────────────────────â”?â”? Capability Layer (Common) â”?â”? ├─ cache â”?â”? â”? ├─ MemoryCacheAdapter âœ? â”? â”?â”? â”? ├─ PostgresCacheAdapter âœ? â”? â†?新增ï¼?00行) â”?â”? â”? └─ CacheFactory â”? â†?更新支æŒ<C3A6>postgres â”?â”? └─ jobs â”?â”? ├─ MemoryQueue âœ? â”? â”?â”? ├─ PgBossQueue âœ? â”? â†?新增ï¼?00行) â”?â”? └─ JobFactory â”? â†?更新支æŒ<C3A6>pgboss â”?└──────────────────────────────────────────────────────────────â”? â†?ä¾<C3A4>èµ–
┌──────────────────────────────────────────────────────────────â”?â”? Platform Layer â”?â”? ├─ PostgreSQL (RDS) â”?â”? â”? ├─ 业务数æ<C2B0>®ï¼ˆasl_*, dc_*, ...ï¼? â”?â”? â”? ├─ platform.app_cache âœ? â”? â†?缓å˜è¡¨ï¼ˆæ–°å¢žï¼? â”?â”? â”? └─ platform.job_* âœ? â”? â†?队列表(pg-boss自动)│
â”? └─ 阿里云RDS特æ€? â”?â”? ├─ 自动备份(æ¯<C3A6>天) â”?â”? ├─ PITR(时间点æ<C2B9>¢å¤<C3A5>ï¼? â”?â”? └─ 高å<CB9C>¯ç”¨ï¼ˆä¸»ä»Žåˆ‡æ<E280A1>¢ï¼? â”?└──────────────────────────────────────────────────────────────â”?
优势ï¼?âœ?ç¼“å˜æŒ<C3A6>久化(RDS自动备份ï¼?âœ?队列æŒ<C3A6>久化(任务ä¸<C3A4>丢失)
âœ?多实例共享(SKIP LOCKEDï¼?âœ?事务一致性(业务+任务原å<C3A5>æ<EFBFBD><C3A6>交ï¼?âœ?é›¶é¢<C3A9>外è¿<C3A8>维(å¤<C3A5>用RDSï¼?âœ?长任务å<C2A1>¯é<C2AF> (拆分+æ–点,æˆ<C3A6>功率 > 99%ï¼?âœ?符å<C2A6>ˆServerless(çŸä»»åŠ¡ï¼Œå<C592>¯ä¸æ–æ<C2AD>¢å¤<C3A5>ï¼?```
### 3.2 任务拆分ç–ç•¥ âœ?**新增**
#### **拆分原则**
ç›®æ ‡ï¼šæ¯<EFBFBD>个任åŠ?< 15分钟(远低于pg-boss 4å°<C3A5>æ—¶é™<C3A9>制ï¼? åŽŸå› ï¼?1. SAE实例15åˆ†é’Ÿæ— æµ<C3A6>é‡<C3A9>自动缩å®?2. 失败é‡<C3A9>试æˆ<C3A6>本低(å<CB86>ªéœ€é‡<C3A9>试15分钟,ä¸<C3A4>æ˜?å°<C3A5>æ—¶ï¼?3. 进度å<C2A6>¯è§<C3A8>性高(用户体验好ï¼?4. å<>¯å¹¶è¡Œå¤„ç<E2809E>†ï¼ˆå¤šWorkerå<72>Œæ—¶å·¥ä½œï¼?```
*拆分ç–ç•¥è¡?
| 任务类型 | å<EFBFBD>•项耗时 | 推è<EFBFBD><EFBFBD>批次大å°<EFBFBD> | 批次耗时 | å¹¶å<EFBFBD>‘能力 |
|---|---|---|---|---|
| *ASL文献ç›é€? | 7.2ç§?ç¯? | 100ç¯? | 12分钟 | 10批并è¡? |
| DC病历æ<EFBFBD><EFBFBD>å<EFBFBD>– | 10ç§?ä»? | 50ä»? | 8.3分钟 | 10批并è¡? |
| 统计分æž<EFBFBD> | 0.1ç§?æ<>? | 5000æ<EFBFBD>? | 8.3分钟 | 20批并è¡? |
实际效果对比
场景ï¼?0000篇文献ç›é€?
ä¸<C3A4>拆分(错误):
├─ 1个任务,20å°<C3A5>æ—¶
├─ 超过pg-boss 4å°<C3A5>æ—¶é™<C3A9>制 â†?失败
└─ æˆ<C3A6>功率:0%
拆分(æ£ç¡®ï¼‰ï¼?├─ 100个批次,æ¯<C3A6>批100篇,12分钟
├─ 10个Worker并行处ç<E2809E>†
├─ 总耗时ï¼?00 / 10 = 10批轮æ¬?× 12分钟 = 2å°<C3A5>æ—¶
└─ æˆ<C3A6>功率:> 99.5%(å<CB86>•批失败å<C2A5>ªéœ€é‡<C3A9>试12分钟ï¼?```
### 3.3 æ–点ç»ä¼ 机制 âœ?**新增**
#### **æ ¸å¿ƒæ€<C3A6>想**
问题:SAE实例éš<EFBFBD>æ—¶å<EFBFBD>¯èƒ½é‡<EFBFBD>å<EFBFBD>¯ï¼ˆå<EFBFBD>‘布更新ã€<EFBFBD>自动缩容)
æ— æ–点: ├─ 处ç<E2809E>†åˆ°ç¬¬900ç¯?â†?实例é‡<C3A9>å<EFBFBD>¯ ├─ é‡<C3A9>新开始,从第1篇开å§?└─ 浪费时间ï¼?00 × 7.2ç§?= 108分钟
有æ–点: ├─ æ¯<C3A6>处ç<E2809E>?0篇,ä¿<C3A4>å˜è¿›åº¦åˆ°æ•°æ<C2B0>®åº“ ├─ 处ç<E2809E>†åˆ°ç¬¬900ç¯?â†?实例é‡<C3A9>å<EFBFBD>¯ ├─ é‡<C3A9>新开始,从第900篇继ç»?└─ 浪费时间ï¼? 1分钟
#### **实现ç–ç•¥**
```typescript
// æ•°æ<C2B0>®åº“记录进åº?model AslScreeningTask {
processedItems Int // 已处ç<E2809E>†æ•°é‡? currentIndex Int // 当å‰<C3A5>æ¸¸æ ‡ï¼ˆæ–点)
lastCheckpoint DateTime // 最å<E282AC>Žä¸€æ¬¡ä¿<C3A4>å˜æ—¶é—? checkpointData Json // æ–点详细数æ<C2B0>®
}
// Worker读å<C2BB>–æ–点
const startIndex = task.currentIndex || 0;
for (let i = startIndex; i < items.length; i++) {
await processItem(items[i]);
// æ¯<C3A6>处ç<E2809E>?0项,ä¿<C3A4>å˜æ–点
if ((i + 1) % 10 === 0) {
await saveCheckpoint(i + 1);
}
}
ä¿<EFBFBD>å˜é¢‘率æ<EFBFBD>ƒè¡¡
| 频率 | æ•°æ<EFBFBD>®åº“写å…? | é‡<EFBFBD>å<EFBFBD>¯æµªè´¹æ—¶é—´ | 推è<EFBFBD><EFBFBD> |
|---|---|---|---|
| æ¯?é¡? | 很高(性能差) | < 10ç§? | â<EFBFBD>?ä¸<C3A4>推è<C2A8>? |
| æ¯?0é¡? | ä¸ç‰ | < 2分钟 | âœ?推è<C2A8><C3A8> |
| æ¯?00é¡? | ä½? | < 12分钟 | 🟡 å<>¯é€? |
| ä¸<EFBFBD>ä¿<EFBFBD>å? | æ—? | é‡<EFBFBD>头开å§? | â<EFBFBD>?ä¸<C3A4>å<EFBFBD>¯æŽ¥å<C2A5>— |
3.4 Schema设计(统一在platform)✨ *已更�
// prisma/schema.prisma
datasource db {
provider = "postgresql"
url = env("DATABASE_URL")
schemas = ["platform_schema", "aia_schema", "pkb_schema", "asl_schema",
"dc_schema", "ssa_schema", "st_schema", "rvw_schema",
"admin_schema", "common_schema", "public"]
}
generator client {
provider = "prisma-client-js"
previewFeatures = ["multiSchema"] // âœ?å<>¯ç”¨å¤šSchema支æŒ<C3A6>
}
// ==================== å¹³å<C2B3>°åŸºç¡€è®¾æ–½ï¼ˆplatform_schemaï¼?===================
/// 应用缓å˜è¡¨ï¼ˆæ›¿ä»£Redisï¼?model AppCache {
id Int @id @default(autoincrement())
key String @unique @db.VarChar(500)
value Json
expiresAt DateTime @map("expires_at")
createdAt DateTime @default(now()) @map("created_at")
@@index([expiresAt], name: "idx_app_cache_expires")
@@index([key, expiresAt], name: "idx_app_cache_key_expires")
@@map("app_cache")
@@schema("platform_schema") // �统一在platform Schema
}
// pg-boss会自动创建任务表(ä¸<C3A4>需è¦<C3A8>在Prismaä¸å®šä¹‰ï¼‰
// 表å<C2A8><C3A5>:platform_schema.job, platform_schema.version ç?
// ==================== 业务模å<C2A1>—(asl_schemaï¼?===================
/// ASLç›é€‰ä»»åŠ¡è¡¨ï¼ˆâœ¨ 需è¦<C3A8>æ–°å¢žå—æ®µæ”¯æŒ<C3A6>拆åˆ?æ–点ï¼?model AslScreeningTask {
id String @id @default(uuid())
projectId String @map("project_id")
taskType String @map("task_type")
status String // pending/running/completed/failed
totalItems Int @map("total_items")
processedItems Int @default(0) @map("processed_items")
successItems Int @default(0) @map("success_items")
failedItems Int @default(0) @map("failed_items")
conflictItems Int @default(0) @map("conflict_items")
// �新增:任务拆分支� totalBatches Int @default(1) @map("total_batches")
processedBatches Int @default(0) @map("processed_batches")
currentBatchIndex Int @default(0) @map("current_batch_index")
// âœ?新增:æ–点ç»ä¼ 支æŒ? currentIndex Int @default(0) @map("current_index")
lastCheckpoint DateTime? @map("last_checkpoint")
checkpointData Json? @map("checkpoint_data")
startedAt DateTime @map("started_at")
completedAt DateTime? @map("completed_at")
createdAt DateTime @default(now()) @map("created_at")
updatedAt DateTime @updatedAt @map("updated_at")
project AslScreeningProject @relation(fields: [projectId], references: [id])
results AslScreeningResult[]
@@index([projectId])
@@index([status])
@@map("asl_screening_tasks")
@@schema("asl_schema")
}
*æ–°å¢žå—æ®µè¯´æ˜Žï¼?
| å—æ®µ | 类型 | 用é€? | 示例å€? |
|---|---|---|---|
| totalBatches | Int | 总批次数 | 10�000篇�00�批) |
| processedBatches | Int | 已完æˆ<EFBFBD>批次数 | 3(已完æˆ<EFBFBD>3批) |
| currentBatchIndex | Int | 当å‰<EFBFBD>批次索引 | 3(æ£åœ¨å¤„ç<EFBFBD>†ç¬¬4批) |
| currentIndex | Int | 当å‰<EFBFBD>项索引(æ–点ï¼? | 350(已处ç<EFBFBD>†350篇) |
| lastCheckpoint | DateTime | 最å<EFBFBD>Žä¸€æ¬¡ä¿<EFBFBD>å˜æ–点时é—? | 2025-12-13 10:30:00 |
| checkpointData | Json | æ–点详细数æ<EFBFBD>® | {"lastProcessedId": "lit_123", "batchProgress": 0.35} |
3.5 Key/Queue命å<C2BD><C3A5>规范
// 缓å˜Key规范(逻辑隔离ï¼?const CACHE_KEY_PATTERNS = {
// ASL模å<C2A1>—
'asl:llm:{hash}': 'LLMæ<4D><C3A6>å<EFBFBD>–结果',
'asl:pdf:{fileId}': 'PDFè§£æž<C3A6>结果',
// DC模å<C2A1>—
'dc:health:{fileHash}': 'Excelå<6C>¥åº·æ£€æŸ?,
'dc:extraction:{recordId}': '病历æ<EFBFBD><EFBFBD>å<EFBFBD>–结果',
// 全局
'session:{userId}': '用户Session',
'config:{key}': '系统é…<EFBFBD>ç½®',
};
// 队列Name规范
const QUEUE_NAMES = {
ASL_TITLE_SCREENING: 'asl:title-screening',
ASL_FULLTEXT_SCREENING: 'asl:fulltext-screening',
DC_MEDICAL_EXTRACTION: 'dc:medical-extraction',
DC_DATA_CLEANING: 'dc:data-cleaning',
SSA_STATISTICAL_ANALYSIS: 'ssa:statistical-analysis',
};
4. 详细实施æ¥éª¤
4.1 Phase 1:环境准备(Day 1上å<C5A0>ˆï¼?.5天)
任务1.1:更新Prisma Schema
# 文件:prisma/schema.prisma
修改ç‚?:å<C5A1>¯ç”¨multiSchema
generator client {
provider = "prisma-client-js"
previewFeatures = ["multiSchema"] // �新增
}
修改ç‚?ï¼šæ·»åŠ AppCache模型
/// 应用缓å˜è¡¨ï¼ˆPostgres-Onlyæž¶æž„ï¼?model AppCache {
id Int @id @default(autoincrement())
key String @unique @db.VarChar(500)
value Json
expiresAt DateTime @map("expires_at")
createdAt DateTime @default(now()) @map("created_at")
@@index([expiresAt], name: "idx_app_cache_expires")
@@index([key, expiresAt], name: "idx_app_cache_key_expires")
@@map("app_cache")
@@schema("platform_schema")
}
执行è¿<EFBFBD>ç§»
cd backend
# 1. 生æˆ<C3A6>è¿<C3A8>移文件
npx prisma migrate dev --name add_postgres_cache
# 2. 生æˆ<C3A6>Prisma Client
npx prisma generate
# 3. 查看生æˆ<C3A6>çš„SQL
cat prisma/migrations/*/migration.sql
验è¯<EFBFBD>结果
-- 应该看到以下SQL
CREATE TABLE "platform_schema"."app_cache" (
"id" SERIAL PRIMARY KEY,
"key" VARCHAR(500) UNIQUE NOT NULL,
"value" JSONB NOT NULL,
"expires_at" TIMESTAMP NOT NULL,
"created_at" TIMESTAMP DEFAULT NOW()
);
CREATE INDEX "idx_app_cache_expires" ON "platform_schema"."app_cache"("expires_at");
CREATE INDEX "idx_app_cache_key_expires" ON "platform_schema"."app_cache"("key", "expires_at");
*任务1.2:安装ä¾<EFBFBD>èµ?
cd backend
# 安装pg-boss(任务队列)
npm install pg-boss --save
# 查看版本
npm list pg-boss
# 应显示:pg-boss@10.x.x
*任务1.3:更新环境å<EFBFBD>˜é‡<EFBFBD>é…<EFBFBD>ç½?
// 文件:backend/src/config/env.ts
import { z } from 'zod';
const envSchema = z.object({
// ... 现有é…<C3A9>ç½® ...
// ==================== 缓å˜é…<C3A9>ç½® ====================
CACHE_TYPE: z.enum(['memory', 'postgres']).default('memory'), // �新增postgres选项
// ==================== 队列é…<C3A9>ç½® ====================
QUEUE_TYPE: z.enum(['memory', 'pgboss']).default('memory'), // �新增pgboss选项
// ==================== æ•°æ<C2B0>®åº“é…<C3A9>ç½?====================
DATABASE_URL: z.string(),
});
export const config = {
// ... 现有é…<C3A9>ç½® ...
// 缓å˜é…<C3A9>ç½®
cacheType: process.env.CACHE_TYPE || 'memory',
// 队列é…<C3A9>ç½®
queueType: process.env.QUEUE_TYPE || 'memory',
// æ•°æ<C2B0>®åº“URL
databaseUrl: process.env.DATABASE_URL,
};
# 文件:backend/.env
# ==================== 缓å˜é…<C3A9>ç½® ====================
CACHE_TYPE=postgres # memory | postgres
# ==================== 队列é…<C3A9>ç½® ====================
QUEUE_TYPE=pgboss # memory | pgboss
# ==================== æ•°æ<C2B0>®åº“é…<C3A9>ç½?====================
DATABASE_URL=postgresql://user:password@localhost:5432/aiclincial?schema=public
4.2 Phase 2:实现PostgresCacheAdapter(Day 1下å<E280B9>ˆï¼?.5天)
任务2.1:创建PostgresCacheAdapter
// 文件:backend/src/common/cache/PostgresCacheAdapter.ts
import { prisma } from '../../config/database.js';
import type { CacheAdapter } from './CacheAdapter.js';
import { logger } from '../logging/index.js';
/**
* Postgres缓å˜é€‚é…<C3A9>å™? *
* æ ¸å¿ƒç‰¹æ€§ï¼š
* - æŒ<C3A6>久化å˜å‚¨ï¼ˆå®žä¾‹é‡<C3A9>å<EFBFBD>¯ä¸<C3A4>丢失)
* - 多实例共享(通过数æ<C2B0>®åº“)
* - æ‡’æƒ°åˆ é™¤ï¼ˆè¯»å<C2BB>–时清ç<E280A6>†è¿‡æœŸæ•°æ<C2B0>®ï¼? * - 自动清ç<E280A6>†ï¼ˆå®šæ—¶ä»»åŠ¡åˆ é™¤è¿‡æœŸæ•°æ<C2B0>®ï¼‰
*/
export class PostgresCacheAdapter implements CacheAdapter {
/**
* 获å<C2B7>–缓å˜ï¼ˆå¸¦è¿‡æœŸæ£€æŸ¥å’Œæ‡’æƒ°åˆ é™¤ï¼? */
async get<T = any>(key: string): Promise<T | null> {
try {
const record = await prisma.appCache.findUnique({
where: { key }
});
if (!record) {
return null;
}
// 检查是å<C2AF>¦è¿‡æœ? if (record.expiresAt < new Date()) {
// æ‡’æƒ°åˆ é™¤ï¼šå¼‚æ¥åˆ 除,ä¸<C3A4>阻塞主æµ<C3A6>程
this.deleteAsync(key);
return null;
}
logger.debug('[PostgresCache] 缓å˜å‘½ä¸', { key });
return record.value as T;
} catch (error) {
logger.error('[PostgresCache] 读å<C2BB>–失败', { key, error });
return null;
}
}
/**
* 设置缓å˜
*/
async set(key: string, value: any, ttlSeconds: number = 3600): Promise<void> {
try {
const expiresAt = new Date(Date.now() + ttlSeconds * 1000);
await prisma.appCache.upsert({
where: { key },
create: {
key,
value: value as any, // Prisma会自动处ç<E2809E>†JSON
expiresAt,
},
update: {
value: value as any,
expiresAt,
}
});
logger.debug('[PostgresCache] 缓å˜å†™å…¥', { key, ttl: ttlSeconds });
} catch (error) {
logger.error('[PostgresCache] 写入失败', { key, error });
throw error;
}
}
/**
* åˆ é™¤ç¼“å˜
*/
async delete(key: string): Promise<boolean> {
try {
await prisma.appCache.delete({
where: { key }
});
logger.debug('[PostgresCache] 缓å˜åˆ 除', { key });
return true;
} catch (error) {
// 如果keyä¸<C3A4>å˜åœ¨ï¼ŒPrisma会抛出错è¯? if ((error as any)?.code === 'P2025') {
return false;
}
logger.error('[PostgresCache] åˆ é™¤å¤±è´¥', { key, error });
return false;
}
}
/**
* 异æ¥åˆ 除(ä¸<C3A4>阻塞主æµ<C3A6>程)
*/
private deleteAsync(key: string): void {
prisma.appCache.delete({ where: { key } })
.catch(err => {
// é<>™é»˜å¤±è´¥ï¼ˆå<CB86>¯èƒ½å·²è¢«å…¶ä»–å®žä¾‹åˆ é™¤ï¼‰
logger.debug('[PostgresCache] æ‡’æƒ°åˆ é™¤å¤±è´¥', { key, err });
});
}
/**
* 批é‡<C3A9>åˆ é™¤ï¼ˆæ”¯æŒ<C3A6>模å¼<C3A5>匹é…<C3A9>)
*/
async deleteMany(pattern: string): Promise<number> {
try {
const result = await prisma.appCache.deleteMany({
where: {
key: {
contains: pattern
}
}
});
logger.info('[PostgresCache] 批é‡<C3A9>åˆ é™¤', { pattern, count: result.count });
return result.count;
} catch (error) {
logger.error('[PostgresCache] 批é‡<C3A9>åˆ é™¤å¤±è´¥', { pattern, error });
return 0;
}
}
/**
* 清空所有缓å? */
async flush(): Promise<void> {
try {
await prisma.appCache.deleteMany({});
logger.info('[PostgresCache] 缓å˜å·²æ¸…ç©?);
} catch (error) {
logger.error('[PostgresCache] 清空失败', { error });
throw error;
}
}
/**
* 获å<C2B7>–缓å˜ç»Ÿè®¡ä¿¡æ<C2A1>¯
*/
async getStats(): Promise<{
total: number;
expired: number;
byModule: Record<string, number>;
}> {
try {
const now = new Date();
// 总数
const total = await prisma.appCache.count();
// 过期数é‡<C3A9>
const expired = await prisma.appCache.count({
where: {
expiresAt: { lt: now }
}
});
// 按模å<C2A1>—统计(通过keyå‰<C3A5>缀分组ï¼? const all = await prisma.appCache.findMany({
select: { key: true }
});
const byModule: Record<string, number> = {};
all.forEach(item => {
const module = item.key.split(':')[0];
byModule[module] = (byModule[module] || 0) + 1;
});
return { total, expired, byModule };
} catch (error) {
logger.error('[PostgresCache] 获å<EFBFBD>–统计失败', { error });
return { total: 0, expired: 0, byModule: {} };
}
}
}
/**
* å<>¯åŠ¨å®šæ—¶æ¸…ç<E280A6>†ä»»åŠ¡ï¼ˆåˆ†æ‰¹æ¸…ç<E280A6>†ï¼Œé˜²æ¢é˜»å¡žï¼? *
* ç–ç•¥ï¼? * - æ¯<C3A6>分钟执行一æ¬? * - æ¯<C3A6>æ¬¡åˆ é™¤1000æ<30>¡è¿‡æœŸæ•°æ<C2B0>? * - 使用LIMITé<54>¿å…<C3A5>大事åŠ? */
export function startCacheCleanupTask(): void {
const CLEANUP_INTERVAL = 60 * 1000; // 1分钟
const BATCH_SIZE = 1000; // æ¯<C3A6>次1000æ<30>?
setInterval(async () => {
try {
// 使用原生SQL,支æŒ<C3A6>LIMIT
const result = await prisma.$executeRaw`
DELETE FROM platform_schema.app_cache
WHERE id IN (
SELECT id FROM platform_schema.app_cache
WHERE expires_at < NOW()
LIMIT ${BATCH_SIZE}
)
`;
if (result > 0) {
logger.info('[PostgresCache] 定时清ç<EFBFBD>†', { deleted: result });
}
} catch (error) {
logger.error('[PostgresCache] 定时清ç<EFBFBD>†å¤±è´¥', { error });
}
}, CLEANUP_INTERVAL);
logger.info('[PostgresCache] 定时清ç<EFBFBD>†ä»»åС已å<EFBFBD>¯åŠ?, {
interval: `${CLEANUP_INTERVAL / 1000}ç§’`,
batchSize: BATCH_SIZE
});
}
任务2.2:更新CacheFactory
// 文件:backend/src/common/cache/CacheFactory.ts
import { CacheAdapter } from './CacheAdapter.js';
import { MemoryCacheAdapter } from './MemoryCacheAdapter.js';
import { RedisCacheAdapter } from './RedisCacheAdapter.js';
import { PostgresCacheAdapter } from './PostgresCacheAdapter.js'; // �新增
import { logger } from '../logging/index.js';
import { config } from '../../config/env.js';
export class CacheFactory {
private static instance: CacheAdapter | null = null;
static getInstance(): CacheAdapter {
if (!this.instance) {
this.instance = this.createAdapter();
}
return this.instance;
}
private static createAdapter(): CacheAdapter {
const cacheType = config.cacheType;
logger.info('[CacheFactory] åˆ<C3A5>始化缓å?, { cacheType });
switch (cacheType) {
case 'postgres': // �新增
return this.createPostgresAdapter();
case 'memory':
return this.createMemoryAdapter();
case 'redis':
return this.createRedisAdapter();
default:
logger.warn(`[CacheFactory] 未知缓å˜ç±»åž‹: ${cacheType}, é™<C3A9>级到内å˜`);
return this.createMemoryAdapter();
}
}
/**
* 创建Postgres缓å˜é€‚é…<C3A9>å™? */
private static createPostgresAdapter(): PostgresCacheAdapter {
logger.info('[CacheFactory] 使用PostgresCacheAdapter');
return new PostgresCacheAdapter();
}
private static createMemoryAdapter(): MemoryCacheAdapter {
logger.info('[CacheFactory] 使用MemoryCacheAdapter');
return new MemoryCacheAdapter();
}
private static createRedisAdapter(): RedisCacheAdapter {
// ... 现有Redis逻辑 ...
logger.info('[CacheFactory] 使用RedisCacheAdapter');
return new RedisCacheAdapter({ /* ... */ });
}
static reset(): void {
this.instance = null;
}
}
*任务2.3:更新导�
// 文件:backend/src/common/cache/index.ts
export type { CacheAdapter } from './CacheAdapter.js';
export { MemoryCacheAdapter } from './MemoryCacheAdapter.js';
export { RedisCacheAdapter } from './RedisCacheAdapter.js';
export { PostgresCacheAdapter, startCacheCleanupTask } from './PostgresCacheAdapter.js'; // �新增
export { CacheFactory } from './CacheFactory.js';
import { CacheFactory } from './CacheFactory.js';
export const cache = CacheFactory.getInstance();
4.3 Phase 3:实现PgBossQueue(Day 2-3�天)
任务3.1:创建PgBossQueue
// 文件:backend/src/common/jobs/PgBossQueue.ts
import PgBoss from 'pg-boss';
import type { Job, JobQueue, JobHandler } from './types.js';
import { logger } from '../logging/index.js';
import { config } from '../../config/env.js';
/**
* PgBoss队列适é…<C3A9>å™? *
* æ ¸å¿ƒç‰¹æ€§ï¼š
* - 任务æŒ<C3A6>久化(实例é‡<C3A9>å<EFBFBD>¯ä¸<C3A4>丢失)
* - 自动é‡<C3A9>试(指数退é<E282AC>¿ï¼‰
* - 多实例å<E280B9><C3A5>调(SKIP LOCKEDï¼? * - 长任务支æŒ<C3A6>(4å°<C3A5>æ—¶è¶…æ—¶ï¼? */
export class PgBossQueue implements JobQueue {
private boss: PgBoss;
private started = false;
private workers: Map<string, any> = new Map();
constructor() {
this.boss = new PgBoss({
connectionString: config.databaseUrl,
schema: 'platform_schema', // 统一在platform Schema
max: 5, // è¿žæŽ¥æ± å¤§å°?
// âœ?关键é…<C3A9>置:长任务支æŒ<C3A6>
expireInHours: 4, // 任务é”?å°<C3A5>æ—¶å<C2B6>Žè¿‡æœ?
// 自动维护(清ç<E280A6>†æ—§ä»»åŠ¡ï¼? retentionDays: 7, // ä¿<C3A4>ç•™7天的历å<E280A0>²ä»»åŠ¡
deleteAfterDays: 30, // 30天å<C2A9>Žå½»åº•åˆ é™¤
});
// 监å<E28098>¬é”™è¯¯
this.boss.on('error', error => {
logger.error('[PgBoss] 队列错误', { error: error.message });
});
// 监å<E28098>¬ç»´æŠ¤äº‹ä»¶
this.boss.on('maintenance', () => {
logger.debug('[PgBoss] 执行维护任务');
});
}
/**
* å<>¯åŠ¨é˜Ÿåˆ—ï¼ˆæ‡’åŠ è½½ï¼? */
private async ensureStarted(): Promise<void> {
if (this.started) return;
try {
await this.boss.start();
this.started = true;
logger.info('[PgBoss] 队列已å<C2B2>¯åŠ?);
} catch (error) {
logger.error('[PgBoss] 队列å<EFBFBD>¯åŠ¨å¤±è´¥', { error });
throw error;
}
}
/**
* 推é€<C3A9>任务到队列
*/
async push<T = any>(type: string, data: T, options?: any): Promise<Job> {
await this.ensureStarted();
try {
const jobId = await this.boss.send(type, data, {
retryLimit: 3, // 失败é‡<C3A9>试3æ¬? retryDelay: 60, // 失败å<C2A5>?0ç§’é‡<C3A9>è¯? retryBackoff: true, // 指数退é<E282AC>¿ï¼ˆ60s, 120s, 240sï¼? expireInHours: 4, // 4å°<C3A5>æ—¶å<C2B6>Žè¿‡æœ? ...options
});
logger.info('[PgBoss] 任务入队', { type, jobId });
return {
id: jobId!,
type,
data,
status: 'pending',
createdAt: new Date(),
};
} catch (error) {
logger.error('[PgBoss] 任务入队失败', { type, error });
throw error;
}
}
/**
* 注册任务处ç<E2809E>†å™? */
process<T = any>(type: string, handler: JobHandler<T>): void {
// å¼‚æ¥æ³¨å†Œï¼ˆä¸<C3A4>阻塞主æµ<C3A6>程)
this.registerWorkerAsync(type, handler);
}
/**
* å¼‚æ¥æ³¨å†ŒWorker
*/
private async registerWorkerAsync<T>(
type: string,
handler: JobHandler<T>
): Promise<void> {
try {
await this.ensureStarted();
// 注册Worker
await this.boss.work(
type,
{
teamSize: 1, // æ¯<C3A6>个队列并å<C2B6>‘1个任åŠ? teamConcurrency: 1 // æ¯<C3A6>个Worker处ç<E2809E>†1个任åŠ? },
async (job: any) => {
const startTime = Date.now();
logger.info('[PgBoss] 开始处ç<EFBFBD>†ä»»åŠ?, {
type,
jobId: job.id,
attemptsMade: job.data.__retryCount || 0,
attemptsTotal: 3
});
try {
// 调用业务处ç<E2809E>†å‡½æ•°
const result = await handler({
id: job.id,
type,
data: job.data as T,
status: 'processing',
createdAt: new Date(job.createdon),
});
const duration = Date.now() - startTime;
logger.info('[PgBoss] âœ?任务完æˆ<C3A6>', {
type,
jobId: job.id,
duration: `${(duration / 1000).toFixed(2)}s`
});
return result;
} catch (error) {
const duration = Date.now() - startTime;
logger.error('[PgBoss] â<>?任务失败', {
type,
jobId: job.id,
attemptsMade: job.data.__retryCount || 0,
duration: `${(duration / 1000).toFixed(2)}s`,
error: error instanceof Error ? error.message : 'Unknown'
});
// 抛出错误,触å<C2A6>‘pg-boss自动é‡<C3A9>试
throw error;
}
}
);
this.workers.set(type, true);
logger.info('[PgBoss] �Worker已注�, { type });
} catch (error) {
logger.error('[PgBoss] Worker注册失败', { type, error });
throw error;
}
}
/**
* 获å<C2B7>–任务状æ€? */
async getJob(id: string): Promise<Job | null> {
try {
await this.ensureStarted();
const job = await this.boss.getJobById(id);
if (!job) return null;
return {
id: job.id!,
type: job.name,
data: job.data,
status: this.mapState(job.state),
progress: 0, // pg-bossä¸<C3A4>直接支æŒ<C3A6>è¿›åº? createdAt: new Date(job.createdon),
completedAt: job.completedon ? new Date(job.completedon) : undefined,
error: job.output?.message,
};
} catch (error) {
logger.error('[PgBoss] 获å<EFBFBD>–任务失败', { id, error });
return null;
}
}
/**
* æ˜ å°„pg-boss状æ€<C3A6>到通用状æ€? */
private mapState(state: string): Job['status'] {
switch (state) {
case 'completed':
return 'completed';
case 'failed':
return 'failed';
case 'active':
return 'processing';
case 'cancelled':
return 'cancelled';
default:
return 'pending';
}
}
/**
* 更新任务进度(通过业务表)
*
* 注æ„<C3A6>:pg-bossä¸<C3A4>直接支æŒ<C3A6>进度更新,
* 需è¦<C3A8>在业务层通过数æ<C2B0>®åº“表实现
*/
async updateProgress(id: string, progress: number, message?: string): Promise<void> {
logger.debug('[PgBoss] 进度更新(需业务表支æŒ<EFBFBD>)', { id, progress, message });
// 实际实现:更�aslScreeningTask.processedItems
}
/**
* å<>–消任务
*/
async cancelJob(id: string): Promise<boolean> {
try {
await this.ensureStarted();
await this.boss.cancel(id);
logger.info('[PgBoss] 任务已å<EFBFBD>–æ¶?, { id });
return true;
} catch (error) {
logger.error('[PgBoss] å<>–消任务失败', { id, error });
return false;
}
}
/**
* é‡<C3A9>试失败任务
*/
async retryJob(id: string): Promise<boolean> {
try {
await this.ensureStarted();
await this.boss.resume(id);
logger.info('[PgBoss] 任务已é‡<C3A9>è¯?, { id });
return true;
} catch (error) {
logger.error('[PgBoss] é‡<EFBFBD>试任务失败', { id, error });
return false;
}
}
/**
* 清ç<E280A6>†æ—§ä»»åŠ¡ï¼ˆpg-boss自动处ç<E2809E>†ï¼? */
async cleanup(olderThan: number = 86400000): Promise<number> {
// pg-boss有自动清ç<E280A6>†æœºåˆ¶ï¼ˆretentionDaysï¼? logger.debug('[PgBoss] 使用自动清ç<EFBFBD>†æœºåˆ¶');
return 0;
}
/**
* å…³é—队列(优雅关é—)
*/
async close(): Promise<void> {
if (!this.started) return;
try {
await this.boss.stop();
this.started = false;
logger.info('[PgBoss] 队列已关�);
} catch (error) {
logger.error('[PgBoss] 队列关é—失败', { error });
}
}
}
任务3.2:更新JobFactory
// 文件:backend/src/common/jobs/JobFactory.ts
import { JobQueue } from './types.js';
import { MemoryQueue } from './MemoryQueue.js';
import { PgBossQueue } from './PgBossQueue.js'; // �新增
import { logger } from '../logging/index.js';
import { config } from '../../config/env.js';
export class JobFactory {
private static instance: JobQueue | null = null;
static getInstance(): JobQueue {
if (!this.instance) {
this.instance = this.createQueue();
}
return this.instance;
}
private static createQueue(): JobQueue {
const queueType = config.queueType;
logger.info('[JobFactory] åˆ<C3A5>始化任务队åˆ?, { queueType });
switch (queueType) {
case 'pgboss': // �新增
return this.createPgBossQueue();
case 'memory':
return this.createMemoryQueue();
default:
logger.warn(`[JobFactory] 未知队列类型: ${queueType}, é™<C3A9>级到内å˜`);
return this.createMemoryQueue();
}
}
/**
* 创建PgBoss队列
*/
private static createPgBossQueue(): PgBossQueue {
logger.info('[JobFactory] 使用PgBossQueue');
return new PgBossQueue();
}
private static createMemoryQueue(): MemoryQueue {
logger.info('[JobFactory] 使用MemoryQueue');
const queue = new MemoryQueue();
// 定期清ç<E280A6>†ï¼ˆé<CB86>¿å…<C3A5>å†…å˜æ³„æ¼<C3A6>)
if (process.env.NODE_ENV !== 'test') {
setInterval(() => {
queue.cleanup();
}, 60 * 60 * 1000);
}
return queue;
}
static reset(): void {
this.instance = null;
}
}
*任务3.3:更新导�
// 文件:backend/src/common/jobs/index.ts
export type { Job, JobStatus, JobHandler, JobQueue } from './types.js';
export { MemoryQueue } from './MemoryQueue.js';
export { PgBossQueue } from './PgBossQueue.js'; // �新增
export { JobFactory } from './JobFactory.js';
import { JobFactory } from './JobFactory.js';
export const jobQueue = JobFactory.getInstance();
4.4 Phase 4:实现任务拆分机制(Day 4�天)�新增
*任务4.1:创建任务拆分工具函�
// 文件:backend/src/common/jobs/utils.ts (新建文件)
import { logger } from '../logging/index.js';
/**
* 将数组拆分æˆ<C3A6>多个批次
*
* @param items è¦<C3A8>拆分的项目数组
* @param chunkSize æ¯<C3A6>批次大å°? * @returns 拆分å<E280A0>Žçš„二维数组
*
* @example
* splitIntoChunks([1,2,3,4,5], 2) // [[1,2], [3,4], [5]]
*/
export function splitIntoChunks<T>(items: T[], chunkSize: number = 100): T[][] {
if (chunkSize <= 0) {
throw new Error('chunkSize must be greater than 0');
}
const chunks: T[][] = [];
for (let i = 0; i < items.length; i += chunkSize) {
chunks.push(items.slice(i, i + chunkSize));
}
logger.debug('[TaskSplit] 任务拆分完æˆ<C3A6>', {
total: items.length,
chunkSize,
chunks: chunks.length
});
return chunks;
}
/**
* 估算处ç<E2809E>†æ—¶é—´
*
* @param itemCount 项目数é‡<C3A9>
* @param timePerItem æ¯<C3A6>项处ç<E2809E>†æ—¶é—´ï¼ˆç§’ï¼? * @returns 总时间(秒)
*/
export function estimateProcessingTime(
itemCount: number,
timePerItem: number
): number {
return itemCount * timePerItem;
}
/**
* 推è<C2A8><C3A8>批次大å°<C3A5>
*
* æ ¹æ<C2B9>®å<C2AE>•项处ç<E2809E>†æ—¶é—´å’Œæœ€å¤§æ‰¹æ¬¡æ—¶é—´ï¼Œè®¡ç®—最优批次大å°? *
* @param totalItems 总项目数
* @param timePerItem æ¯<C3A6>项处ç<E2809E>†æ—¶é—´ï¼ˆç§’ï¼? * @param maxChunkTime å<>•批次最大时间(秒,默认15分钟ï¼? * @returns 推è<C2A8><C3A8>的批次大å°? *
* @example
* recommendChunkSize(1000, 7.2, 900) // 返回 125(æ¯<C3A6>æ‰?25项,15分钟ï¼? */
export function recommendChunkSize(
totalItems: number,
timePerItem: number,
maxChunkTime: number = 900 // 15分钟
): number {
// 计算æ¯<C3A6>批次最多能处ç<E2809E>†å¤šå°‘é¡? const itemsPerChunk = Math.floor(maxChunkTime / timePerItem);
// é™<C3A9>制范围:最å°?0项,最å¤?000é¡? const recommended = Math.max(10, Math.min(itemsPerChunk, 1000));
logger.info('[TaskSplit] 批次大å°<C3A5>推è<C2A8><C3A8>', {
totalItems,
timePerItem: `${timePerItem}ç§’`,
maxChunkTime: `${maxChunkTime}ç§’`,
recommended,
estimatedBatches: Math.ceil(totalItems / recommended),
estimatedTimePerBatch: `${(recommended * timePerItem / 60).toFixed(1)}分钟`
});
return recommended;
}
/**
* 批次任务é…<C3A9>ç½®è¡? */
export const CHUNK_STRATEGIES = {
// ASL文献ç›é€‰ï¼šæ¯<C3A6>批100篇,çº?2分钟
'asl:title-screening': {
chunkSize: 100,
timePerItem: 7.2, // � estimatedTime: 720, // 12分钟
maxRetries: 3,
description: 'ASLæ ‡é¢˜æ‘˜è¦<C3A8>ç›é€‰ï¼ˆå<CB86>Œæ¨¡åž‹å¹¶è¡Œï¼‰'
},
// ASL全文å¤<C3A5>ç›ï¼šæ¯<C3A6>æ‰?0篇,çº?5分钟
'asl:fulltext-screening': {
chunkSize: 50,
timePerItem: 18, // � estimatedTime: 900, // 15分钟
maxRetries: 3,
description: 'ASL全文å¤<C3A5>ç›ï¼?2å—æ®µæ<C2B5><C3A6>å<EFBFBD>–ï¼?
},
// DC病历æ<E280A0><C3A6>å<EFBFBD>–:æ¯<C3A6>æ‰?0份,çº?分钟
'dc:medical-extraction': {
chunkSize: 50,
timePerItem: 10, // � estimatedTime: 500, // 8分钟
maxRetries: 3,
description: 'DC医疗记录结构化æ<EFBFBD><EFBFBD>å<EFBFBD>?
},
// 统计分æž<C3A6>:æ¯<C3A6>æ‰?000æ<30>¡ï¼Œçº?分钟
'ssa:statistical-analysis': {
chunkSize: 5000,
timePerItem: 0.1, // � estimatedTime: 500, // 8分钟
maxRetries: 2,
description: 'SSA统计分æž<C3A6>计算'
}
} as const;
/**
* 获å<C2B7>–任务拆分ç–ç•¥
*/
export function getChunkStrategy(taskType: keyof typeof CHUNK_STRATEGIES) {
const strategy = CHUNK_STRATEGIES[taskType];
if (!strategy) {
logger.warn('[TaskSplit] 未找到任务ç–略,使用默认é…<C3A9>ç½®', { taskType });
return {
chunkSize: 100,
timePerItem: 10,
estimatedTime: 1000,
maxRetries: 3,
description: '默认任务ç–ç•¥'
};
}
return strategy;
}
*任务4.2:更新导�
// 文件:backend/src/common/jobs/index.ts
export type { Job, JobStatus, JobHandler, JobQueue } from './types.js';
export { MemoryQueue } from './MemoryQueue.js';
export { PgBossQueue } from './PgBossQueue.js';
export { JobFactory } from './JobFactory.js';
export { // �新增
splitIntoChunks,
estimateProcessingTime,
recommendChunkSize,
getChunkStrategy,
CHUNK_STRATEGIES
} from './utils.js';
import { JobFactory } from './JobFactory.js';
export const jobQueue = JobFactory.getInstance();
*任务4.3:å<EFBFBD>•元测è¯?
// 文件:backend/tests/common/jobs/utils.test.ts(新建)
import { describe, it, expect } from 'vitest';
import { splitIntoChunks, recommendChunkSize } from '../../../src/common/jobs/utils.js';
describe('Task Split Utils', () => {
describe('splitIntoChunks', () => {
it('should split array into chunks', () => {
const items = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
const chunks = splitIntoChunks(items, 3);
expect(chunks).toEqual([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[10]
]);
});
it('should handle exact division', () => {
const items = [1, 2, 3, 4, 5, 6];
const chunks = splitIntoChunks(items, 2);
expect(chunks).toEqual([
[1, 2],
[3, 4],
[5, 6]
]);
});
it('should handle empty array', () => {
const chunks = splitIntoChunks([], 10);
expect(chunks).toEqual([]);
});
});
describe('recommendChunkSize', () => {
it('should recommend chunk size for ASL screening', () => {
const size = recommendChunkSize(1000, 7.2, 900); // 15分钟
expect(size).toBe(125); // 125 * 7.2 = 900ç§? });
it('should not exceed max limit', () => {
const size = recommendChunkSize(10000, 0.1, 900);
expect(size).toBe(1000); // 最�000
});
it('should not go below min limit', () => {
const size = recommendChunkSize(100, 100, 900);
expect(size).toBe(10); // 最�0
});
});
});
4.5 Phase 5:实现æ–点ç»ä¼ 机制(Day 5ï¼?天)âœ?新增
任务5.1:更新Prisma Schema
// 文件:prisma/schema.prisma
model AslScreeningTask {
// ... çŽ°æœ‰å—æ®µ ...
// âœ?æ–°å¢žå—æ®µï¼šä»»åŠ¡æ‹†åˆ? totalBatches Int @default(1) @map("total_batches")
processedBatches Int @default(0) @map("processed_batches")
currentBatchIndex Int @default(0) @map("current_batch_index")
// âœ?æ–°å¢žå—æ®µï¼šæ–点ç»ä¼? currentIndex Int @default(0) @map("current_index")
lastCheckpoint DateTime? @map("last_checkpoint")
checkpointData Json? @map("checkpoint_data")
}
# 生æˆ<C3A6>è¿<C3A8>ç§»
cd backend
npx prisma migrate dev --name add_task_split_checkpoint
# 验è¯<C3A8>
npx prisma migrate status
*任务5.2:创建æ–点ç»ä¼ æœ<EFBFBD>åŠ?
// 文件:backend/src/common/jobs/CheckpointService.ts(新建)
import { prisma } from '../../config/database.js';
import { logger } from '../logging/index.js';
export interface CheckpointData {
lastProcessedId?: string;
lastProcessedIndex: number;
batchProgress: number;
metadata?: Record<string, any>;
}
/**
* æ–点ç»ä¼ æœ<C3A6>务
*
* æ<><C3A6>供任务æ–点的ä¿<C3A4>å˜ã€<C3A3>读å<C2BB>–å’Œæ<C592>¢å¤<C3A5>功能
*/
export class CheckpointService {
/**
* ä¿<C3A4>å˜æ–点
*
* @param taskId 任务ID
* @param currentIndex 当å‰<C3A5>索引
* @param data æ–点数æ<C2B0>®
*/
static async saveCheckpoint(
taskId: string,
currentIndex: number,
data?: Partial<CheckpointData>
): Promise<void> {
try {
const checkpointData: CheckpointData = {
lastProcessedIndex: currentIndex,
batchProgress: 0,
...data
};
await prisma.aslScreeningTask.update({
where: { id: taskId },
data: {
currentIndex,
lastCheckpoint: new Date(),
checkpointData: checkpointData as any
}
});
logger.debug('[Checkpoint] æ–点已ä¿<C3A4>å?, {
taskId,
currentIndex,
checkpoint: checkpointData
});
} catch (error) {
logger.error('[Checkpoint] ä¿<EFBFBD>å˜æ–点失败', { taskId, currentIndex, error });
// ä¸<C3A4>抛出错误,é<C592>¿å…<C3A5>å½±å“<C3A5>主æµ<C3A6>ç¨? }
}
/**
* 读å<C2BB>–æ–点
*
* @param taskId 任务ID
* @returns æ–点索引和数æ<C2B0>? */
static async loadCheckpoint(taskId: string): Promise<{
startIndex: number;
data: CheckpointData | null;
}> {
try {
const task = await prisma.aslScreeningTask.findUnique({
where: { id: taskId },
select: {
currentIndex: true,
lastCheckpoint: true,
checkpointData: true
}
});
if (!task) {
return { startIndex: 0, data: null };
}
const startIndex = task.currentIndex || 0;
const data = task.checkpointData as CheckpointData | null;
if (startIndex > 0) {
logger.info('[Checkpoint] æ–点æ<EFBFBD>¢å¤<EFBFBD>', {
taskId,
startIndex,
lastCheckpoint: task.lastCheckpoint,
data
});
}
return { startIndex, data };
} catch (error) {
logger.error('[Checkpoint] 读å<EFBFBD>–æ–点失败', { taskId, error });
return { startIndex: 0, data: null };
}
}
/**
* 清除æ–点
*
* @param taskId 任务ID
*/
static async clearCheckpoint(taskId: string): Promise<void> {
try {
await prisma.aslScreeningTask.update({
where: { id: taskId },
data: {
currentIndex: 0,
lastCheckpoint: null,
checkpointData: null
}
});
logger.debug('[Checkpoint] æ–点已清é™?, { taskId });
} catch (error) {
logger.error('[Checkpoint] 清除æ–点失败', { taskId, error });
}
}
/**
* 批é‡<C3A9>ä¿<C3A4>å˜è¿›åº¦ï¼ˆåŒ…å<E280A6>«æ–点)
*
* @param taskId 任务ID
* @param updates æ›´æ–°æ•°æ<C2B0>®
*/
static async updateProgress(
taskId: string,
updates: {
processedItems?: number;
successItems?: number;
failedItems?: number;
conflictItems?: number;
currentIndex?: number;
checkpointData?: Partial<CheckpointData>;
}
): Promise<void> {
try {
const data: any = {};
// ç´¯åŠ å—æ®µ
if (updates.processedItems !== undefined) {
data.processedItems = { increment: updates.processedItems };
}
if (updates.successItems !== undefined) {
data.successItems = { increment: updates.successItems };
}
if (updates.failedItems !== undefined) {
data.failedItems = { increment: updates.failedItems };
}
if (updates.conflictItems !== undefined) {
data.conflictItems = { increment: updates.conflictItems };
}
// æ–ç‚¹å—æ®µ
if (updates.currentIndex !== undefined) {
data.currentIndex = updates.currentIndex;
data.lastCheckpoint = new Date();
}
if (updates.checkpointData) {
data.checkpointData = updates.checkpointData;
}
await prisma.aslScreeningTask.update({
where: { id: taskId },
data
});
logger.debug('[Checkpoint] 进度已更�, { taskId, updates });
} catch (error) {
logger.error('[Checkpoint] 更新进度失败', { taskId, error });
}
}
}
*任务5.3:更新导�
// 文件:backend/src/common/jobs/index.ts
export { CheckpointService } from './CheckpointService.js'; // �新增
4.6 Phase 6ï¼šæ”¹é€ ä¸šåŠ¡ä»£ç <C3A7>(Day 6-7ï¼?天)âœ?*已更æ–?
*任务6.1ï¼šæ”¹é€ ASLç›é€‰æœ<EFBFBD>务(âœ?完整版:拆分+æ–点+队列ï¼?
// 文件:backend/src/modules/asl/services/screeningService.ts
import { prisma } from '../../../config/database.js';
import { logger } from '../../../common/logging/index.js';
import {
jobQueue,
splitIntoChunks,
recommendChunkSize,
CheckpointService // �新增
} from '../../../common/jobs/index.js';
import { llmScreeningService } from './llmScreeningService.js';
/**
* å<>¯åЍç›é€‰ä»»åŠ¡ï¼ˆæ”¹é€ å<C2A0>Žï¼šä½¿ç”¨é˜Ÿåˆ—)
*/
export async function startScreeningTask(projectId: string, userId: string) {
try {
logger.info('Starting screening task', { projectId, userId });
// 1. 检查项目是å<C2AF>¦å˜åœ? const project = await prisma.aslScreeningProject.findFirst({
where: { id: projectId, userId },
});
if (!project) {
throw new Error('Project not found');
}
// 2. 获å<C2B7>–该项目的所有文çŒ? const literatures = await prisma.aslLiterature.findMany({
where: { projectId },
});
if (literatures.length === 0) {
throw new Error('No literatures found in project');
}
logger.info('Found literatures for screening', {
projectId,
count: literatures.length
});
// 3. 创建ç›é€‰ä»»åŠ¡ï¼ˆæ•°æ<C2B0>®åº“记录)
const task = await prisma.aslScreeningTask.create({
data: {
projectId,
taskType: 'title_abstract',
status: 'pending', // â†?åˆ<C3A5>始状æ€<C3A6>改为pending
totalItems: literatures.length,
processedItems: 0,
successItems: 0,
failedItems: 0,
conflictItems: 0,
startedAt: new Date(),
},
});
logger.info('Screening task created', { taskId: task.id });
// 4. âœ?推é€<C3A9>到队列(异æ¥å¤„ç<E2809E>†ï¼Œä¸<C3A4>阻塞请求)
await jobQueue.push('asl:title-screening', {
taskId: task.id,
projectId,
literatureIds: literatures.map(lit => lit.id),
});
logger.info('Task pushed to queue', { taskId: task.id });
// 5. ç«‹å<E280B9>³è¿”回任务ID(å‰<C3A5>端å<C2AF>¯ä»¥è½®è¯¢è¿›åº¦ï¼‰
return task;
} catch (error) {
logger.error('Failed to start screening task', { error, projectId });
throw error;
}
}
/**
* âœ?注册队列Worker(在应用å<C2A8>¯åŠ¨æ—¶è°ƒç”¨ï¼‰
*
* 这个函数需è¦<C3A8>在 backend/src/index.ts ä¸è°ƒç”? */
export function registerScreeningWorkers() {
// æ³¨å†Œæ ‡é¢˜æ‘˜è¦<C3A8>ç›é€‰Worker
jobQueue.process('asl:title-screening', async (job) => {
const { taskId, projectId, literatureIds } = job.data;
logger.info('开始处ç<E2809E>†æ ‡é¢˜æ‘˜è¦<C3A8>ç›é€?, {
taskId,
total: literatureIds.length
});
try {
// 更新任务状æ€<C3A6>为running
await prisma.aslScreeningTask.update({
where: { id: taskId },
data: { status: 'running' }
});
// 获å<C2B7>–项目的PICOSæ ‡å‡†
const project = await prisma.aslScreeningProject.findUnique({
where: { id: projectId },
});
if (!project) {
throw new Error('Project not found');
}
const rawPicoCriteria = project.picoCriteria as any;
const picoCriteria = {
P: rawPicoCriteria?.P || rawPicoCriteria?.population || '',
I: rawPicoCriteria?.I || rawPicoCriteria?.intervention || '',
C: rawPicoCriteria?.C || rawPicoCriteria?.comparison || '',
O: rawPicoCriteria?.O || rawPicoCriteria?.outcome || '',
S: rawPicoCriteria?.S || rawPicoCriteria?.studyDesign || '',
};
// é€<C3A9>个处ç<E2809E>†æ–‡çŒ®
let successCount = 0;
let failedCount = 0;
let conflictCount = 0;
for (let i = 0; i < literatureIds.length; i++) {
const literatureId = literatureIds[i];
try {
// 获å<C2B7>–文献信æ<C2A1>¯
const literature = await prisma.aslLiterature.findUnique({
where: { id: literatureId },
});
if (!literature) {
failedCount++;
continue;
}
// 调用LLMç›é€‰æœ<C3A6>åŠ? const screeningResult = await llmScreeningService.screenSingleLiterature(
literature.title || '',
literature.abstract || '',
picoCriteria,
projectId
);
// åˆ¤æ–æ˜¯å<C2AF>¦å†²çª<C3A7>
const isConflict = screeningResult.deepseekDecision !== screeningResult.qwenDecision;
// ä¿<C3A4>å˜ç›é€‰ç»“æž? await prisma.aslScreeningResult.create({
data: {
literatureId,
projectId,
taskId,
taskType: 'title_abstract',
deepseekDecision: screeningResult.deepseekDecision,
deepseekReason: screeningResult.deepseekReason,
deepseekRawResponse: screeningResult.deepseekRawResponse,
qwenDecision: screeningResult.qwenDecision,
qwenReason: screeningResult.qwenReason,
qwenRawResponse: screeningResult.qwenRawResponse,
finalDecision: screeningResult.finalDecision,
hasConflict: isConflict,
manualReviewRequired: isConflict,
},
});
if (isConflict) {
conflictCount++;
} else {
successCount++;
}
} catch (error) {
logger.error('处ç<EFBFBD>†æ–‡çŒ®å¤±è´¥', { literatureId, error });
failedCount++;
}
// 更新进度(æ¯<C3A6>处ç<E2809E>†10篇或最å<E282AC>Žä¸€ç¯‡æ—¶æ›´æ–°ï¼? if ((i + 1) % 10 === 0 || i === literatureIds.length - 1) {
await prisma.aslScreeningTask.update({
where: { id: taskId },
data: {
processedItems: i + 1,
successItems: successCount,
failedItems: failedCount,
conflictItems: conflictCount,
}
});
}
}
// æ ‡è®°ä»»åŠ¡å®Œæˆ<C3A6>
await prisma.aslScreeningTask.update({
where: { id: taskId },
data: {
status: 'completed',
completedAt: new Date(),
processedItems: literatureIds.length,
successItems: successCount,
failedItems: failedCount,
conflictItems: conflictCount,
}
});
logger.info('æ ‡é¢˜æ‘˜è¦<EFBFBD>ç›é€‰å®Œæˆ?, {
taskId,
total: literatureIds.length,
success: successCount,
failed: failedCount,
conflict: conflictCount
});
return {
success: true,
processed: literatureIds.length,
successCount,
failedCount,
conflictCount,
};
} catch (error) {
// 任务失败,更新状� await prisma.aslScreeningTask.update({
where: { id: taskId },
data: {
status: 'failed',
completedAt: new Date(),
}
});
logger.error('æ ‡é¢˜æ‘˜è¦<C3A8>ç›é€‰å¤±è´?, { taskId, error });
throw error;
}
});
logger.info('âœ?ASLç›é€‰Worker已注å†?);
}
*任务4.2:更新应用入å<EFBFBD>£ï¼ˆæ³¨å†ŒWorkersï¼?
// 文件:backend/src/index.ts
import Fastify from 'fastify';
import { logger } from './common/logging/index.js';
import { startCacheCleanupTask } from './common/cache/index.js'; // �新增
import { registerScreeningWorkers } from './modules/asl/services/screeningService.js'; // �新增
const app = Fastify({ logger: false });
// ... 注册路由ç?...
// âœ?å<>¯åЍæœ<C3A6>务器å<C2A8>Žï¼Œæ³¨å†Œé˜Ÿåˆ—Workers和定时任åŠ?app.listen({ port: 3001, host: '0.0.0.0' }, async (err, address) => {
if (err) {
logger.error('Failed to start server', { error: err });
process.exit(1);
}
logger.info(`Server listening on ${address}`);
// âœ?å<>¯åŠ¨ç¼“å˜å®šæ—¶æ¸…ç<E280A6>†
if (process.env.CACHE_TYPE === 'postgres') {
startCacheCleanupTask();
}
// �注册队列Workers
if (process.env.QUEUE_TYPE === 'pgboss') {
try {
registerScreeningWorkers(); // ASL模å<C2A1>—
// registerDataCleaningWorkers(); // DC模å<C2A1>—(待实现ï¼? // registerStatisticalWorkers(); // SSA模å<C2A1>—(待实现ï¼? } catch (error) {
logger.error('Failed to register workers', { error });
}
}
});
// 优雅关é—
process.on('SIGTERM', async () => {
logger.info('SIGTERM received, closing server gracefully...');
await app.close();
process.exit(0);
});
任务4.3:DC模å<EFBFBD>—æ”¹é€ ï¼ˆæ¬¡ä¼˜å…ˆï¼‰
// 文件:backend/src/modules/dc/tool-b/services/MedicalRecordExtractionService.ts
// ï¼ˆä»…ç¤ºä¾‹ï¼Œå®žé™…æ ¹æ<C2B9>®éœ€æ±‚实现)
import { jobQueue } from '../../../../common/jobs/index.js';
export function registerMedicalExtractionWorkers() {
jobQueue.process('dc:medical-extraction', async (job) => {
const { taskId, recordIds } = job.data;
// 批é‡<C3A9>æ<EFBFBD><C3A6>å<EFBFBD>–病历
for (const recordId of recordIds) {
await extractSingleRecord(recordId);
// 更新进度
// ...
}
return { success: true };
});
}
4.5 Phase 5:测试验è¯<C3A8>(Day 6ï¼?天)
*任务5.1:å<EFBFBD>•元测è¯?
// 文件:backend/tests/common/cache/PostgresCacheAdapter.test.ts
import { describe, it, expect, beforeEach } from 'vitest';
import { PostgresCacheAdapter } from '../../../src/common/cache/PostgresCacheAdapter.js';
import { prisma } from '../../../src/config/database.js';
describe('PostgresCacheAdapter', () => {
let cache: PostgresCacheAdapter;
beforeEach(async () => {
cache = new PostgresCacheAdapter();
// 清空测试数æ<C2B0>®
await prisma.appCache.deleteMany({});
});
it('should set and get cache', async () => {
await cache.set('test:key', { value: 'hello' }, 60);
const result = await cache.get('test:key');
expect(result).toEqual({ value: 'hello' });
});
it('should return null for expired cache', async () => {
await cache.set('test:key', { value: 'hello' }, 1); // 1秒过æœ? await new Promise(resolve => setTimeout(resolve, 1100)); // ç‰å¾…1.1ç§? const result = await cache.get('test:key');
expect(result).toBeNull();
});
it('should delete cache', async () => {
await cache.set('test:key', { value: 'hello' }, 60);
await cache.delete('test:key');
const result = await cache.get('test:key');
expect(result).toBeNull();
});
// ... 更多测试 ...
});
*任务5.2:集æˆ<EFBFBD>测è¯?
# 1. å<>¯åŠ¨æœ¬åœ°Postgres
docker start ai-clinical-postgres
# 2. 设置环境å<C692>˜é‡<C3A9>
export CACHE_TYPE=postgres
export QUEUE_TYPE=pgboss
export DATABASE_URL=postgresql://postgres:123456@localhost:5432/aiclincial
# 3. è¿<C3A8>行è¿<C3A8>ç§»
cd backend
npx prisma migrate deploy
# 4. å<>¯åŠ¨åº”ç”¨
npm run dev
# 应该看到日志�# [CacheFactory] 使用PostgresCacheAdapter
# [JobFactory] 使用PgBossQueue
# [PgBoss] 队列已å<C2B2>¯åŠ?# [PostgresCache] 定时清ç<E280A6>†ä»»åС已å<C2B2>¯åŠ?# âœ?ASLç›é€‰Worker已注å†?```
#### **任务5.3:功能测�*
```bash
# 测试1:缓å˜åŠŸèƒ?curl -X POST http://localhost:3001/api/v1/asl/projects/:projectId/screening
# 观察日志ï¼?# [PostgresCache] 缓å˜å‘½ä¸ { key: 'asl:llm:...' }
# 测试2:队列功èƒ?# æ<><C3A6>交任务 â†?观察任务状æ€<C3A6>å<EFBFBD>˜åŒ?# pending â†?running â†?completed
# 测试3:实例é‡<C3A9>å<EFBFBD>¯æ<C2AF>¢å¤?# æ<><C3A6>交任务 â†?ç‰å¾…处ç<E2809E>†åˆ?0% â†?Ctrl+Cå<43>œæ¢ â†?é‡<C3A9>æ–°å<C2B0>¯åЍ
# 任务应该自动æ<C2A8>¢å¤<C3A5>å¹¶ç»§ç»?```
---
## 5. 优先级与ä¾<C3A4>赖关系
### 5.1 æ”¹é€ ä¼˜å…ˆçº§çŸ©é˜µï¼ˆâœ¨ 已更新)
| 模å<C2A1>— | 优先çº?| 工作é‡?| 业务价å€?| 风险 | ä¾<C3A4>èµ– | 状æ€?|
|------|--------|--------|---------|------|------|------|
| **环境准备** | P0 | 0.5�| - | �| �| �待开�|
| **PostgresCacheAdapter** | P0 | 0.5å¤?| é™<C3A9>低LLMæˆ<C3A6>本50% | ä½?| 环境准备 | â¬?å¾…å¼€å§?|
| **PgBossQueue** | P0 | 2å¤?| 长任务å<C2A1>¯é<C2AF> æ€?| ä¸?| 环境准备 | â¬?å¾…å¼€å§?|
| **任务拆分机制** | P0 | 1å¤?| 任务æˆ<C3A6>功çŽ?> 99% | ä¸?| PgBossQueue | â¬?å¾…å¼€å§?|
| **æ–点ç»ä¼ 机制** | P0 | 1å¤?| 容错性æ<C2A7><C3A6>å<EFBFBD>?0å€?| ä½?| PgBossQueue | â¬?å¾…å¼€å§?|
| **ASLç›é€‰æ”¹é€?* | P0 | 1.5å¤?| ç”¨æˆ·æ ¸å¿ƒåŠŸèƒ½ | ä¸?| 任务拆分+æ–点 | â¬?å¾…å¼€å§?|
| **DCæ<43><C3A6>å<EFBFBD>–改é€?* | P1 | 1å¤?| ç”¨æˆ·æ ¸å¿ƒåŠŸèƒ½ | ä¸?| 任务拆分+æ–点 | â¬?å¾…å¼€å§?|
| **测试验è¯<C3A8>** | P0 | 1.5å¤?| ä¿<C3A4>è¯<C3A8>è´¨é‡<C3A9> | ä½?| 所有改é€?| â¬?å¾…å¼€å§?|
| **SAE部署** | P1 | 0.5å¤?| 生产就绪 | ä¸?| 测试验è¯<C3A8> | â¬?å¾…å¼€å§?|
**总工作é‡<C3A9>ï¼?* 9天(比V1.0å¢žåŠ 2å¤©ï¼Œå¢žåŠ ä»»åŠ¡æ‹†åˆ†å’Œæ–点ç»ä¼ )
### 5.2 ä¾<C3A4>赖关系图(âœ?已更新)
Day 1: 环境准备 ────────────────â”? â”?Day 1: PostgresCache ──────────â”? â”?Day 2-3: PgBossQueue ───────────â”? â”?Day 4: 任务拆分机制 ─────────────â”? ├─â†?Day 6-7: ASLç›é€‰æ”¹é€?──â”?Day 5: æ–点ç»ä¼ 机制 ─────────────â”? â”? â”? â”?Day 7: DCæ<43><C3A6>å<EFBFBD>–改é€?──────────────â”? â”? â”? ├─â†?Day 8-9: 测试 + 部署
### 5.3 å<>¯å¹¶è¡Œå·¥ä½œï¼ˆâœ?已更新)
Day 1(并行)ï¼?├─ 环境准备(上å<C5A0>ˆï¼‰ └─ PostgresCache实现(下å<E280B9>ˆï¼‰
Day 2-3(串行)ï¼?└─ PgBossQueue实现(必须ç‰çŽ¯å¢ƒå‡†å¤‡å®Œæˆ<C3A6>ï¼? Day 4-5(å<CB86>¯å¹¶è¡Œï¼‰ï¼š ├─ 任务拆分机制(工具函数) └─ æ–点ç»ä¼ 机制(数æ<C2B0>®åº“å—æ®µï¼? Day 6-7(串行)ï¼?├─ ASLç›é€‰æ”¹é€ (使用拆分+æ–点ï¼?└─ DCæ<43><C3A6>å<EFBFBD>–æ”¹é€ ï¼ˆå¤<C3A5>用拆分+æ–点ï¼? Day 8-9(并行)ï¼?├─ 测试验è¯<C3A8> └─ 文档完善
---
## 6. 测试验è¯<C3A8>方案(✨ 已更新)
### 6.1 测试清å<E280A6>•
#### **功能测试(基础�*
```bash
âœ?缓å˜è¯»å†™æ£å¸¸
âœ?缓å˜è¿‡æœŸè‡ªåŠ¨æ¸…ç<E280A6>†ï¼ˆæ¯<C3A6>分钟1000æ<30>¡ï¼‰
âœ?缓å˜å¤šå®žä¾‹å…±äº«ï¼ˆå®žä¾‹A写→实例B读)
âœ?队列任务入队(pg-bossï¼?âœ?队列任务处ç<E2809E>†ï¼ˆWorkerï¼?âœ?队列任务é‡<C3A9>试(模拟失败,3次é‡<C3A9>试)
âœ?队列实例é‡<C3A9>å<EFBFBD>¯æ<C2AF>¢å¤<C3A5>
任务拆分测试 �新增
âœ?任务拆分工具函数æ£ç¡®æ€? - splitIntoChunks([1..100], 30) â†?4批(30+30+30+10ï¼? - recommendChunkSize(1000, 7.2, 900) â†?125
�拆分任务入队
- 1000篇文çŒ?â†?10批次,æ¯<C3A6>æ‰?00ç¯? - 验è¯<C3A8>æ•°æ<C2B0>®åº“:totalBatches=10, processedBatches=0
âœ?批次并行处ç<E2809E>†
- 多个Workerå<72>Œæ—¶å¤„ç<E2809E>†ä¸<C3A4>å<EFBFBD>Œæ‰¹æ¬¡
- 验è¯<C3A8>æ— å†²çª<C3A7>(SKIP LOCKEDï¼?
âœ?批次失败é‡<C3A9>试
- 模拟ç¬?批失è´?â†?自动é‡<C3A9>试
- 其他批次ä¸<C3A4>å<EFBFBD>—å½±å“<C3A5>
æ–点ç»ä¼ 测试 âœ?新增
âœ?æ–点ä¿<C3A4>å˜
- 处ç<E2809E>†åˆ°ç¬¬50é¡?â†?ä¿<C3A4>å˜æ–点
- 验è¯<C3A8>æ•°æ<C2B0>®åº“:currentIndex=50, lastCheckpointæ›´æ–°
âœ?æ–点æ<C2B9>¢å¤<C3A5>
- ä»»åŠ¡ä¸æ–(Ctrl+Cï¼? - é‡<C3A9>å<EFBFBD>¯æœ<C3A6>务 â†?从第50项继ç»? - 验è¯<C3A8>:å‰<C3A5>50项ä¸<C3A4>é‡<C3A9>å¤<C3A5>处ç<E2809E>†
âœ?批次级æ–ç‚? - 10批次任务,完æˆ<C3A6>å‰<C3A5>3æ‰? - 实例é‡<C3A9>å<EFBFBD>¯ â†?从第4批继ç»? - 验è¯<C3A8>:processedBatches=3, currentBatchIndex=3
*长时间任务测è¯? 🔴 é‡<EFBFBD>点
âœ?1000篇文献ç›é€‰ï¼ˆçº?å°<C3A5>æ—¶ï¼? - 拆分æˆ?0批,æ¯<C3A6>批12分钟
- æˆ<C3A6>功çŽ?> 99%
- 验è¯<C3A8>进度更新(æ¯<C3A6>10篇)
âœ?10000篇文献ç›é€‰ï¼ˆçº?0å°<C3A5>æ—¶ï¼? - 拆分æˆ?00批,æ¯<C3A6>批12分钟
- 10个Worker并行 â†?2å°<C3A5>时完æˆ<C3A6>
- æˆ<C3A6>功çŽ?> 99.5%
âœ?实例é‡<C3A9>å<EFBFBD>¯æ<C2AF>¢å¤<C3A5>(关键测试)
- å<>¯åŠ¨ä»»åŠ¡ â†?ç‰å¾…50% â†?å<>œæ¢æœ<C3A6>务(Ctrl+Cï¼? - é‡<C3A9>å<EFBFBD>¯æœ<C3A6>务 â†?任务自动æ<C2A8>¢å¤<C3A5>
- 验è¯<C3A8>:从50%ç»§ç»ï¼Œä¸<C3A4>ä»?%å¼€å§? - 预期:总耗时 â‰?原计划时é—?× 1.05(误å·?%内)
性能测试
âœ?缓å˜è¯»å<C2BB>–延迟 < 5ms(P99ï¼?âœ?缓å˜å†™å…¥å»¶è¿Ÿ < 10ms(P99ï¼?âœ?队列å<E28094>žå<C5BE><C3A5>é‡?> 100任务/å°<C3A5>æ—¶
âœ?æ–点ä¿<C3A4>å˜å»¶è¿Ÿ < 20ms
âœ?批次切æ<E280A1>¢å»¶è¿Ÿ < 100ms
故障测试(增强)
âœ?实例销æ¯<C3A6>(SAE缩容ï¼? - æ£åœ¨å¤„ç<E2809E>†ä»»åŠ¡ â†?实例销æ¯? - ç‰å¾…10分钟 â†?新实例接ç®? - 任务从æ–点æ<C2B9>¢å¤?
âœ?æ•°æ<C2B0>®åº“连接æ–å¼€
- 处ç<E2809E>†ä»»åŠ¡ä¸?â†?æ–开连接
- 自动é‡<C3A9>连 â†?ç»§ç»å¤„ç<E2809E>†
âœ?任务处ç<E2809E>†å¤±è´¥
- 模拟LLM超时
- 自动é‡<C3A9>试3æ¬? - 失败å<C2A5>Žæ ‡è®°ä¸ºfailed
âœ?Postgres慢查è¯? - 模拟数æ<C2B0>®åº“æ…¢ï¼? 5秒)
- 任务ä¸<C3A4>失败,ç‰å¾…完æˆ<C3A6>
âœ?å¹¶å<C2B6>‘冲çª<C3A7>
- 2个Worker领å<E280A0>–å<E28093>Œä¸€æ‰¹æ¬¡
- pg-boss SKIP LOCKED机制
- 验è¯<C3A8>:å<C5A1>ªæœ?个Worker处ç<E2809E>†
âœ?å<>‘布更新(生产场景)
- 15:00å<30>‘布更新
- æ£åœ¨æ‰§è¡Œçš?批任åŠ? - 实例é‡<C3A9>å<EFBFBD>¯ â†?5批任务自动æ<C2A8>¢å¤?```
### 6.2 测试脚本
#### **测试1:完整æµ<C3A6>程(1000篇文献)**
```bash
# 1. 准备测试数æ<C2B0>®
cd backend
npm run test:seed -- --literatures=1000
# 2. å<>¯åЍæœ<C3A6>务
npm run dev
# 3. æ<><C3A6>交任务
curl -X POST http://localhost:3001/api/v1/asl/projects/:projectId/screening \
-H "Content-Type: application/json"
# 4. 观察日志
# [TaskSplit] 任务拆分完æˆ<C3A6> { total: 1000, chunkSize: 100, chunks: 10 }
# [PgBoss] 任务入队 { type: 'asl:title-screening-batch', jobId: '1' }
# ...
# [PgBoss] 批次处ç<E2809E>†å®Œæˆ<C3A6> { taskId, batchIndex: 0, progress: '1/10' }
# 5. 查询进度
curl http://localhost:3001/api/v1/asl/tasks/:taskId
# 预期å“<C3A5>应ï¼?# {
# "id": "task_123",
# "status": "running",
# "totalItems": 1000,
# "processedItems": 350,
# "totalBatches": 10,
# "processedBatches": 3,
# "progress": 0.35
# }
*测试2:æ–点æ<EFBFBD>¢å¤<EFBFBD>(关键测试ï¼?
# 1. å<>¯åŠ¨ä»»åŠ¡
curl -X POST http://localhost:3001/api/v1/asl/projects/:projectId/screening
# 2. ç‰å¾…处ç<E2809E>†åˆ?0%
# 观察日志:processedItems: 500
# 3. 强制å<C2B6>œæ¢æœ<C3A6>务
# Ctrl+C �kill -9 <pid>
# 4. é‡<C3A9>æ–°å<C2B0>¯åЍæœ<C3A6>务
npm run dev
# 5. 观察日志
# [Checkpoint] æ–点æ<C2B9>¢å¤<C3A5> { taskId, startIndex: 500 }
# [PgBoss] 开始处ç<E2809E>†ä»»åŠ?{ batchIndex: 5 } â†?从第6批继ç»?
# 6. 验è¯<C3A8>最终结æž?# 总耗时应该约ç‰äº?2å°<C3A5>æ—¶ + é‡<C3A9>å<EFBFBD>¯æ—¶é—´ï¼? 5分钟ï¼?# ä¸<C3A4>应该是 4å°<C3A5>时(从头开始)
测试3:并å<EFBFBD>‘处ç<EFBFBD>†ï¼ˆ10000篇)
# 1. 准备大数æ<C2B0>®é›†
npm run test:seed -- --literatures=10000
# 2. å<>¯åŠ¨å¤šä¸ªWorker实例(模拟SAE多实例)
# Terminal 1
npm run dev -- --port=3001
# Terminal 2
npm run dev -- --port=3002
# Terminal 3
npm run dev -- --port=3003
# 3. æ<><C3A6>交任务(任æ„<C3A6>一个实例)
curl -X POST http://localhost:3001/api/v1/asl/projects/:projectId/screening
# 4. 观察三个实例的日å¿?# 应该看到ï¼?个实例å<E280B9>Œæ—¶å¤„ç<E2809E>†ä¸<C3A4>å<EFBFBD>Œæ‰¹æ¬?# Worker1: 处ç<E2809E>†æ‰¹æ¬¡ 0, 3, 6, 9, ...
# Worker2: 处ç<E2809E>†æ‰¹æ¬¡ 1, 4, 7, 10, ...
# Worker3: 处ç<E2809E>†æ‰¹æ¬¡ 2, 5, 8, 11, ...
# 5. 验è¯<C3A8>完æˆ<C3A6>æ—¶é—´
# 100批次 / 3个Worker â‰?33.3批轮æ¬?× 12分钟 â‰?6.6å°<EFBFBD>æ—¶
# å<>•Worker需è¦<C3A8>:100æ‰?× 12分钟 = 20å°<C3A5>æ—¶
# åŠ é€Ÿæ¯”ï¼?0 / 6.6 â‰?3å€?```
### 6.3 ç›‘æŽ§æŒ‡æ ‡ï¼ˆâœ¨ 已更新)
```typescript
// 缓å˜ç›‘控
- cache_hit_rate: 命ä¸çŽ?(ç›®æ ‡ > 60%)
- cache_total_count: 总数
- cache_expired_count: 过期数é‡<C3A9>
- cache_by_module: å<>„模å<C2A1>—分å¸?
// 队列监控(基础ï¼?- queue_pending_count: 待处ç<E2809E>†ä»»åŠ?- queue_processing_count: 处ç<E2809E>†ä¸ä»»åŠ?- queue_completed_count: 完æˆ<C3A6>任务
- queue_failed_count: 失败任务
- queue_avg_duration: å¹³å<C2B3>‡è€—æ—¶
// 任务拆分监控 �新增
- task_total_batches: 总批次数
- task_processed_batches: 已完æˆ<C3A6>批次数
- task_batch_success_rate: 批次æˆ<C3A6>功çŽ?(ç›®æ ‡ > 99%)
- task_avg_batch_duration: å¹³å<C2B3>‡æ‰¹æ¬¡è€—æ—¶
// æ–点ç»ä¼ 监控 âœ?新增
- checkpoint_save_count: æ–点ä¿<C3A4>å˜æ¬¡æ•°
- checkpoint_restore_count: æ–点æ<C2B9>¢å¤<C3A5>次数
- checkpoint_save_duration: ä¿<C3A4>å˜è€—æ—¶ (ç›®æ ‡ < 20ms)
- task_recovery_success_rate: æ<>¢å¤<C3A5>æˆ<C3A6>功çŽ?(ç›®æ ‡ 100%)
6.4 æˆ<C3A6>åŠŸæ ‡å‡†ï¼ˆâœ¨ 已更新)
基础功能ï¼?âœ?所有å<E280B0>•元测试通过
âœ?所有集æˆ<C3A6>测试通过
âœ?缓å˜å‘½ä¸çŽ?> 60%
âœ?LLM API调用次数下é™<C3A9> > 40%
任务å<C2A1>¯é<C2AF> æ€?🔴 关键ï¼?âœ?1000篇文献ç›é€‰æˆ<C3A6>功率 > 99%
âœ?10000篇文献ç›é€‰æˆ<C3A6>功率 > 99.5%
âœ?实例é‡<C3A9>å<EFBFBD>¯å<C2AF>Žä»»åŠ¡è‡ªåŠ¨æ<C2A8>¢å¤<C3A5>æˆ<C3A6>功率 100%
âœ?æ–点æ<C2B9>¢å¤<C3A5>å<EFBFBD>Žä¸<C3A4>é‡<C3A9>å¤<C3A5>处ç<E2809E>†å·²å®Œæˆ<C3A6>项
âœ?批次失败å<C2A5>ªéœ€é‡<C3A9>试å<E280A2>•批(ä¸<C3A4>é‡<C3A9>试全部ï¼?
æ€§èƒ½æŒ‡æ ‡ï¼?âœ?å<>•批次处ç<E2809E>†æ—¶é—?< 15分钟
âœ?æ–点ä¿<C3A4>å˜å»¶è¿Ÿ < 20ms
âœ?10个Workerå¹¶è¡ŒåŠ é€Ÿæ¯” > 8å€?
生产验è¯<C3A8>ï¼?âœ?生产环境è¿<C3A8>行48å°<C3A5>æ—¶æ— é”™è¯?âœ?处ç<E2809E>†3个完整的1000篇文献ç›é€‰ä»»åŠ?âœ?至少1次实例é‡<C3A9>å<EFBFBD>¯æ<C2AF>¢å¤<C3A5>测试æˆ<C3A6>åŠ?âœ?æ— ç”¨æˆ·æŠ•è¯‰ä»»åŠ¡ä¸¢å¤?âœ?系统å<C5B8>¯ç”¨æ€?> 99.9%
7. 上线与回�
7.1 上线æ¥éª¤
# Step 1: æ•°æ<C2B0>®åº“è¿<C3A8>移(生产环境ï¼?npx prisma migrate deploy
# Step 2: æ›´æ–°SAE环境å<C692>˜é‡<C3A9>
CACHE_TYPE=postgres
QUEUE_TYPE=pgboss
# Step 3: ç<>°åº¦å<C2A6>‘布ï¼?个实例)
# 观察24å°<C3A5>æ—¶ï¼Œç›‘æŽ§æŒ‡æ ‡æ£å¸?
# Step 4: å…¨é‡<C3A9>å<EFBFBD>‘布ï¼?-3个实例)
# é€<C3A9>æ¥æ‰©å®¹
# Step 5: 清ç<E280A6>†æ—§ä»£ç ?# 移除MemoryQueue相关代ç <C3A7>(å<CB86>¯é€‰ï¼‰
7.2 回滚方案
# 如果出现问题,立å<E280B9>³å›žæ»šï¼š
# 方案1:环境å<C692>˜é‡<C3A9>回滚(最快)
CACHE_TYPE=memory
QUEUE_TYPE=memory
# é‡<C3A9>å<EFBFBD>¯åº”用,é™<C3A9>çº§åˆ°å†…å˜æ¨¡å¼<C3A5>
# 方案2:代ç <C3A7>回æ»?git revert <commit>
# å›žæ»šåˆ°æ”¹é€ å‰<C3A5>版本
# 方案3:数æ<C2B0>®åº“回滚
npx prisma migrate down
# åˆ é™¤ app_cache 表(å<CB86>¯é€‰ï¼‰
7.3 风险预案
| 风险 | 概率 | å½±å“<EFBFBD> | 预案 |
|---|---|---|---|
| Postgres性能ä¸<EFBFBD>è¶³ | ä½? | ä¸? | å›žæ»šåˆ°å†…å˜æ¨¡å¼? |
| pg-boss连接失败 | ä½? | é«? | é™<EFBFBD>级到å<EFBFBD>Œæ¥å¤„ç<EFBFBD>? |
| ç¼“å˜æ•°æ<EFBFBD>®è¿‡å¤§ | ä½? | ä½? | å¢žåŠ æ¸…ç<EFBFBD>†é¢‘率 |
| 长任务å<EFBFBD>¡æ? | ä½? | ä¸? | 手动kill任务 |
8. æˆ<C3A6>åŠŸæ ‡å‡†ï¼ˆâœ¨ V2.0æ›´æ–°ï¼?
8.1 技术指æ ?
| æŒ‡æ ‡ç±»åˆ« | æŒ‡æ ‡ | ç›®æ ‡å€? | è¡¡é‡<EFBFBD>方法 |
|---|---|---|---|
| ç¼“å˜ | 命ä¸çŽ? | > 60% | 监控统计 |
| æŒ<EFBFBD>ä¹…åŒ? | âœ?实例é‡<C3A9>å<EFBFBD>¯ä¸<C3A4>丢å¤? | é‡<EFBFBD>å<EFBFBD>¯æµ‹è¯• | |
| 多实例共äº? | âœ?A写B能读 | å¹¶å<EFBFBD>‘测试 | |
| 队列 | 任务æŒ<EFBFBD>ä¹…åŒ? | âœ?实例销æ¯<C3A6>ä¸<C3A4>丢失 | 销æ¯<EFBFBD>测è¯? |
| 长任务å<EFBFBD>¯é<EFBFBD> æ€? | > 99% | 1000篇ç›é€? | |
| 超长任务å<EFBFBD>¯é<EFBFBD> æ€? | > 99.5% | 10000篇ç›é€? | |
| 拆分 | 批次æˆ<EFBFBD>功çŽ? | > 99% | 批次统计 |
| 批次耗时 | < 15分钟 | 监控统计 | |
| å¹¶è¡ŒåŠ é€Ÿæ¯” | > 8å€<C3A5>(10 Workerï¼? | 对比测试 | |
| æ–点 | æ–点ä¿<EFBFBD>å˜å»¶è¿Ÿ | < 20ms | 性能测试 |
| æ<EFBFBD>¢å¤<EFBFBD>æˆ<EFBFBD>功çŽ? | 100% | é‡<EFBFBD>å<EFBFBD>¯æµ‹è¯• | |
| é‡<EFBFBD>å¤<EFBFBD>处ç<EFBFBD>†çŽ? | 0% | æ•°æ<EFBFBD>®éªŒè¯<EFBFBD> |
8.2 ä¸šåŠ¡æŒ‡æ ‡
| æŒ‡æ ‡ | æ”¹é€ å‰<EFBFBD> | æ”¹é€ å<EFBFBD>Ž | 改进幅度 |
|---|---|---|---|
| LLM APIæˆ<C3A6>本 | 基线 | -40~60% | 节çœ<EFBFBD>Â¥X/æœ? |
| *任务æˆ<EFBFBD>功çŽ? | 10-30% | > 99% | æ<EFBFBD><EFBFBD>å<EFBFBD>‡3-10å€? |
| 用户é‡<EFBFBD>å¤<EFBFBD>æ<EFBFBD><EFBFBD>交 | å¹³å<EFBFBD>‡3æ¬? | < 1.1æ¬? | å‡<EFBFBD>å°‘70% |
| 任务完æˆ<EFBFBD>æ—¶é—´ | ä¸<EFBFBD>确定(å<EFBFBD>¯èƒ½å¤±è´¥ï¼? | 稳定å<EFBFBD>¯é¢„æµ? | 体验æ<EFBFBD><EFBFBD>å<EFBFBD>‡ |
| *用户满æ„<EFBFBD>åº? | 基线 | 显著æ<EFBFBD><EFBFBD>å<EFBFBD>‡ | é—®å<EFBFBD>·è°ƒæŸ¥ |
8.3 验收清å<E280A6>•
Phase 1-3:基础设施 ��PostgresCacheAdapter实现并通过测试
�PgBossQueue实现并通过测试
âœ?本地环境验è¯<C3A8>通过
Phase 4-5:高级特���任务拆分工具函数测试通过
âœ?æ–点ç»ä¼ æœ<C3A6>务测试通过
âœ?æ•°æ<C2B0>®åº“Schemaè¿<C3A8>ç§»æˆ<C3A6>功
Phase 6-7:业务集æˆ?âœ?âœ?ASLç›é€‰æœ<C3A6>åŠ¡æ”¹é€ å®Œæˆ?âœ?DCæ<43><C3A6>å<EFBFBD>–æœ<C3A6>åŠ¡æ”¹é€ å®Œæˆ?âœ?Worker注册æˆ<C3A6>功
关键功能测试 🔴ï¼?âœ?1000篇文献ç›é€‰ï¼ˆ2å°<C3A5>时)æˆ<C3A6>功率 > 99%
âœ?实例é‡<C3A9>å<EFBFBD>¯æ<C2AF>¢å¤<C3A5>测试通过ï¼?次)
âœ?æ–点ç»ä¼ 测试通过(从50%æ<>¢å¤<C3A5>ï¼?âœ?批次并行处ç<E2809E>†æµ‹è¯•通过
âœ?失败é‡<C3A9>试测试通过
生产环境验è¯<C3A8> 🔴ï¼?âœ?生产环境è¿<C3A8>行48å°<C3A5>æ—¶æ— è‡´å‘½é”™è¯?âœ?完æˆ<C3A6>至少3个真实用户任务(1000ç¯?ï¼?âœ?至少ç»<C3A7>历1次SAE实例é‡<C3A9>å<EFBFBD>¯ï¼Œä»»åŠ¡æˆ<C3A6>功æ<C5B8>¢å¤?âœ?缓å˜å‘½ä¸çŽ?> 60%
âœ?LLM API调用é‡<C3A9>下é™?> 40%
âœ?æ— ç”¨æˆ·æŠ•è¯‰ä»»åŠ¡ä¸¢å¤?```
---
## 9. 附录
### 9.1 å<>‚考文æ¡?
- [Postgres-Only 全能架构解决方案](./08-Postgres-Only 全能架构解决方案.md)
- [pg-boss 官方文档](https://github.com/timgit/pg-boss)
- [Prisma 多Schema支æŒ<C3A6>](https://www.prisma.io/docs/concepts/components/prisma-schema/multi-schema)
### 9.2 关键代ç <C3A7>ä½<C3A4>置(✨ V2.0æ›´æ–°ï¼?
backend/src/
├── common/cache/ # 缓å˜ç³»ç»Ÿ
â”? ├── CacheAdapter.ts # âœ?å·²å˜åœ?â”? ├── CacheFactory.ts # âš ï¸<C3AF> 需修改ï¼?10行)
â”? ├── MemoryCacheAdapter.ts # âœ?å·²å˜åœ?â”? ├── RedisCacheAdapter.ts # 🔴 å<> ä½<C3A4>符(ä¸<C3A4>用管)
â”? ├── PostgresCacheAdapter.ts # â<>?需新增(300行)
â”? └── index.ts # âš ï¸<C3AF> 需修改(导出)
�├── common/jobs/ # 任务队列
â”? ├── types.ts # âœ?å·²å˜åœ?â”? ├── JobFactory.ts # âš ï¸<C3AF> 需修改ï¼?10行)
â”? ├── MemoryQueue.ts # âœ?å·²å˜åœ?â”? ├── PgBossQueue.ts # â<>?需新增(400行)
â”? ├── utils.ts # â<>?需新增(200行)âœ?â”? ├── CheckpointService.ts # â<>?需新增(150行)âœ?â”? └── index.ts # âš ï¸<C3AF> 需修改(导出)
â”?├── modules/asl/services/ # ASL业务å±?â”? ├── screeningService.ts # âš ï¸<C3AF> éœ€æ”¹é€ ï¼ˆ~150行改动)
â”? └── llmScreeningService.ts # âœ?æ— éœ€æ”¹åŠ¨
â”?├── modules/dc/tool-b/services/ # DC业务å±?â”? └── (类似ASLï¼ŒæŒ‰éœ€æ”¹é€ ï¼‰
�├── config/
â”? └── env.ts # âš ï¸<C3AF> éœ€æ·»åŠ çŽ¯å¢ƒå<C692>˜é‡<C3A9>
â”?└── index.ts # âš ï¸<C3AF> 需修改(注册Workers + å<>¯åŠ¨æ¸…ç<E280A6>†ï¼?
prisma/
├── schema.prisma # âš ï¸<C3AF> 需修改ï¼?AppCache +å—æ®µï¼?└── migrations/ # 自动生æˆ<C3A6>
tests/ # 测试文件 ├── common/cache/ â”? └── PostgresCacheAdapter.test.ts # â<>?需新增 ├── common/jobs/ â”? ├── PgBossQueue.test.ts # â<>?需新增 â”? ├── utils.test.ts # â<>?需新增 âœ?â”? └── CheckpointService.test.ts # â<>?需新增 âœ?└── modules/asl/ └── screening-integration.test.ts # â<>?需新增
backend/.env # âš ï¸<C3AF> 需修改
**文件状æ€<C3A6>说明:**
- âœ?å·²å˜åœ?- æ— éœ€æ”¹åŠ¨
- âš ï¸<C3AF> 需修改 - å°‘é‡<C3A9>改动ï¼? 50行)
- â<>?需新增 - 全新文件
- 🔴 å<> ä½<C3A4>ç¬?- 忽略(ä¸<C3A4>å½±å“<C3A5>æœ¬æ¬¡æ”¹é€ ï¼‰
- âœ?V2.0新增 - 支æŒ<C3A6>拆分+æ–点
**代ç <C3A7>行数统计ï¼?*
总新增代ç <EFBFBD>:1800è¡?├─ PostgresCacheAdapter.ts: 300è¡?├─ PgBossQueue.ts: 400è¡?├─ utils.ts: 200è¡?âœ?├─ CheckpointService.ts: 150è¡?âœ?├─ screeningService.ts改é€? 200è¡?├─ 测试代ç <C3A7>: 400è¡?└─ 其他(Factoryã€<C3A3>导出ç‰ï¼? 150è¡?
总修改代ç <C3A7>:100è¡?├─ CacheFactory.ts: 10è¡?├─ JobFactory.ts: 10è¡?├─ index.ts: 20è¡?├─ env.ts: 20è¡?├─ schema.prisma: 40è¡?└─ å<>„处导出: çº?0å¤?```
10. V2.0 vs V1.0 对比
| 维度 | V1.0(原计划ï¼? | V2.0(当å‰<EFBFBD>版本) | å<EFBFBD>˜åŒ– |
|---|---|---|---|
| *工作� | 7� | 9� | +2� |
| 代ç <EFBFBD>行数 | ~1000è¡? | ~1900è¡? | +900è¡? |
| æ ¸å¿ƒç–ç•¥ | 缓å˜+队列 | 缓å˜+队列+拆分+æ–点 | +2个ç–ç•? |
| *长任务支æŒ? | < 4å°<C3A5>æ—¶ | ä»»æ„<EFBFBD>时长(拆分å<EFBFBD>Žï¼? | 质的æ<EFBFBD><EFBFBD>å<EFBFBD>‡ |
| *任务æˆ<EFBFBD>功çŽ? | 85-90% | > 99% | æ<EFBFBD><EFBFBD>å<EFBFBD>‡10%+ |
| 实例é‡<EFBFBD>å<EFBFBD>¯æ<EFBFBD>¢å¤<EFBFBD> | 从头开å§? | æ–点ç»ä¼ | é<EFBFBD>¿å…<EFBFBD>浪费 |
| å¹¶å<EFBFBD>‘能力 | å<EFBFBD>•实例串è¡? | 多实例并è¡? | åŠ é€ŸNå€? |
| *生产就绪åº? | 基本å<EFBFBD>¯ç”¨ | 完全就绪 | ä¼<EFBFBD>业çº? |
为什么增�天?
- Day 4:任务拆分机制(工具函数+测试ï¼?- Day 5:æ–点ç»ä¼ 机制(æœ<C3A6>务+æ•°æ<C2B0>®åº“)
å¢žåŠ çš„ä»·å€¼ï¼š
- âœ?长任务å<C2A1>¯é<C2AF> 性从85% â†?99%+
- âœ?支æŒ<C3A6>ä»»æ„<C3A6>时长任务(通过拆分ï¼?- âœ?实例é‡<C3A9>å<EFBFBD>¯ä¸<C3A4>浪费已处ç<E2809E>†ç»“æžœ
- âœ?多Worker并行,速度æ<C2A6><C3A6>å<EFBFBD>‡Nå€?- âœ?符å<C2A6>ˆServerless最佳实è·? *结论ï¼? 多花2天,æ<C592>¢å<C2A2>–质的飞跃ï¼?*é<>žå¸¸å€¼å¾—**ï¼?
11. 快速开�
11.1 一键检查清å<E280A6>?
# 1. 检查代ç <C3A7>结构(应该都å˜åœ¨ï¼‰
ls backend/src/common/cache/CacheAdapter.ts # �ls backend/src/common/cache/MemoryCacheAdapter.ts # �ls backend/src/common/jobs/types.ts # �ls backend/src/common/jobs/MemoryQueue.ts # �
# 2. 检查需è¦<C3A8>新增的文件(应该ä¸<C3A4>å˜åœ¨ï¼?ls backend/src/common/cache/PostgresCacheAdapter.ts # â<>?å¾…æ–°å¢?ls backend/src/common/jobs/PgBossQueue.ts # â<>?å¾…æ–°å¢?ls backend/src/common/jobs/utils.ts # â<>?å¾…æ–°å¢?ls backend/src/common/jobs/CheckpointService.ts # â<>?å¾…æ–°å¢?
# 3. 检查ä¾<C3A4>èµ?cd backend
npm list pg-boss # â<>?需安装
# 4. 检查数æ<C2B0>®åº“
psql -d aiclincial -c "\dt platform_schema.*"
# 应该看到现有的业务表,但没有app_cache
11.2 ç«‹å<E280B9>³å¼€å§?
# Phase 1:环境准备(30分钟�cd backend
npm install pg-boss --save
# 修改 prisma/schema.prismaï¼ˆæ·»åŠ AppCacheï¼?npx prisma migrate dev --name add_postgres_cache
npx prisma generate
# Phase 2:实现PostgresCacheAdapterï¼?å°<C3A5>æ—¶ï¼?# 创建 src/common/cache/PostgresCacheAdapter.ts
# 修改 src/common/cache/CacheFactory.ts
# 测试验è¯<C3A8>
# Phase 3:实现PgBossQueueï¼?6å°<C3A5>æ—¶ï¼?天)
# 创建 src/common/jobs/PgBossQueue.ts
# 修改 src/common/jobs/JobFactory.ts
# 测试验è¯<C3A8>
# ... 按计划继�```
---
**🎯 现在就开始?**
建议ï¼?1. **先通读文档** - ç<>†è§£æ•´ä½“æž¶æž„ï¼?0分钟ï¼?2. **验è¯<C3A8>代ç <C3A7>结构** - 确认真实文件ï¼?0分钟ï¼?3. **开始Phase 1** - 环境准备ï¼?0分钟ï¼?4. **é€<C3A9>æ¥æŽ¨è¿›** - æ¯<C3A6>完æˆ<C3A6>一个Phase就测è¯?
有任何问题éš<C3A9>时沟通ï¼<C3AF>🚀