Files
AIclinicalresearch/docs/07-运维文档/09-Postgres-Only改造实施计划(完整版).md
HaHafeng 1b53ab9d52 feat(aia): Complete AIA V2.0 with universal streaming capabilities
Major Changes:
- Add StreamingService with OpenAI Compatible format
- Upgrade Chat component V2 with Ant Design X integration
- Implement AIA module with 12 intelligent agents
- Update API routes to unified /api/v1 prefix
- Update system documentation

Backend (~1300 lines):
- common/streaming: OpenAI Compatible adapter
- modules/aia: 12 agents, conversation service, streaming integration
- Update route versions (RVW, PKB to v1)

Frontend (~3500 lines):
- modules/aia: AgentHub + ChatWorkspace (100% prototype restoration)
- shared/Chat: AIStreamChat, ThinkingBlock, useAIStream Hook
- Update API endpoints to v1

Documentation:
- AIA module status guide
- Universal capabilities catalog
- System overview updates
- All module documentation sync

Tested: Stream response verified, authentication working
Status: AIA V2.0 core completed (85%)
2026-01-14 19:15:01 +08:00

81 KiB
Raw Blame History

Postgres-Only 架构改造实施计划(完整版)

*文档版本� V2.0 �*已更�
*创建日期� 2025-12-13
*更新日期� 2025-12-13
*ç®æ ‡å®Œæˆ<EFBFBD>æ—¶é—´ï¼? 2025-12-20ï¼?天)
负责人: 技术团é˜? *风险等级ï¼? 🟢 低(基于æˆ<C3A6>ç†Ÿæ¹æ¡ˆï¼Œæœ‰é™<C3A9>级ç­ç•¥ï¼? *核心原则ï¼? 通用能åŠå±ä¼˜å…ˆï¼Œå¤<C3A5>用架构优势


âš ï¸<EFBFBD> V2.0 更新说明

本版本基äº?*真实代ç <C3A7>结构验è¯<C3A8>**å’?*技术方案深度分æž?*è¿è¡Œäº†é‡<C3A9>è¦<C3A8>æ´æ°ï¼š

*æ´æ°1:代ç <EFBFBD>结构真实性验è¯? âœ?- âœ?已从实际代ç <C3A7>验è¯<C3A8>æ‰€æœ‰è·¯å¾„åŒæ‡ä»¶

  • âœ?明确标注"已存åœ?ã€?å<> ä½<C3A4>ç¬?ã€?需新增"
  • âœ?RedisCacheAdapter确认为未实现的å<E2809E> ä½<C3A4>符
  • âœ?所有改造ç¹åŸºäºŽçœŸå®žä»£ç <C3A7>结构

*æ´æ°2:长任务å<EFBFBD>¯é<EFBFBD> æ€§ç­ç•? 🔴

  • âœ?新增ï¼?*断点续传机制**(策ç•?ï¼? 强烈推è<C2A8><C3A8>
  • âœ?新增ï¼?*任务拆分策略**(策ç•?ï¼? 架构级优åŒ?- âœ?分æž<C3A6>:心跳续租机制(ç­ç•¥1ï¼? ä¸<C3A4>推è<C2A8><C3A8>实æ?- âœ?明确:pg-boss的技术é™<C3A9>制(最é•?å°<C3A5>时,推è<C2A8>?å°<C3A5>æ—¶ï¼?- âœ?修正ï¼?4å°<C3A5>时连续任务ä¸<C3A4>支æŒ<C3A6>,需æ†åˆ

*æ´æ°3:实æ½è®¡åˆè°ƒæ•?

  • âœ?工作é‡<C3A9>增加到9天(增加任务æ†åˆ†åŒæ­ç¹ç»­ä¼ ï¼‰
  • âœ?完整的代ç <C3A7>示ä¾ï¼ˆå<CB86>¯ç´æŽ¥ä½¿ç”¨ï¼‰
  • âœ?æ•°æ<C2B0>®åº“Schemaæ´æ°ï¼ˆæ°å¢žå­—段)

📋 目录

  1. 改造背景与目标
  2. 当å‰<EFBFBD>系统分æž<EFBFBD>
  3. 改造总体架构
  4. 详细实施步骤
  5. 优先级与ä¾<EFBFBD>èµå…³ç³»
  6. æµè¯•验è¯<EFBFBD>æ¹æ¡ˆ
  7. [上线与回滚](#7-上线与回�

1. 改造背景与目标

1.1 为什么选æ©Postgres-Onlyï¼?

核心痛点ï¼?â<>?长任务ä¸<C3A4>å<EFBFBD>¯é<C2AF> ï¼?å°<C3A5>时任务实ä¾é”€æ¯<C3A6>丢失)
â<>?LLMæˆ<C3A6>本失控(缓存ä¸<C3A4>æŒ<C3A6>ä¹…åŒï¼‰
â<>?多实ä¾ä¸<C3A4>å<EFBFBD>Œæ­¥ï¼ˆå†…存缓存å<CB9C>„自ç¬ç«ï¼‰

技术方案选择ï¼?âœ?Postgres-Only(推è<C2A8><C3A8>)
  - 架构简å<E282AC>•(1-2人å¢é˜Ÿï¼‰
  - è¿<C3A8>ç»´æˆ<C3A6>本低(å¤<C3A5>用RDSï¼?  - æ•°æ<C2B0>®ä¸€è‡´æ€§å¼ºï¼ˆäºåŠ¡ä¿<C3A4>è¯<C3A8>)
  - èŠçœ<C3A7>æˆ<C3A6>本(Â?000+/年)

â<>?Redisæ¹æ¡ˆï¼ˆä¸<C3A4>推è<C2A8><C3A8>ï¼?  - æž¶æž„å¤<C3A5>æ<EFBFBD>(å<CB86>Œç³»ç»Ÿï¼?  - è¿<C3A8>ç»´è´Ÿæ…é‡<C3A9>(需维护Redisï¼?  - æ•°æ<C2B0>®ä¸€è‡´æ€§å¼±ï¼ˆå<CB86>Œå†™é—®é¢˜ï¼‰
  - é¢<C3A9>夿ˆ<C3A6>本(Â?000+/年)

1.2 改造目�

目标 当å‰<EFBFBD>状æ€? 改造å<EFBFBD>Ž è¡¡é‡<EFBFBD>指标
*长任务å<EFBFBD>¯é<EFBFBD> æ€? 5-10%(实ä¾é”€æ¯<C3A6>丢失) > 99% 2å°<EFBFBD>时任务æˆ<EFBFBD>功çŽ?
LLMæˆ<EFBFBD>本 é‡<EFBFBD>å¤<EFBFBD>调用多次 é™<EFBFBD>低50%+ 月度API费用
*缓存命中çŽ? 0%(ä¸<C3A4>æŒ<C3A6>ä¹…ï¼? 60%+ 监控统计
*多实ä¾å<EFBFBD>Œæ­? â<EFBFBD><>„自ç¬ç« âœ?共享缓存 实ä¾A写â†å®žä¾Bè¯?
*æž¶æž„å¤<EFBFBD>æ<EFBFBD>åº? 中等 ä½? 中间件数é‡?

2. 当å‰<C3A5>系统分æž<C3A6>

2.1 代ç <C3A7>结构现状ï¼?层架构)- âœ?*已验è¯<EFBFBD>真实代ç ?

*验è¯<EFBFBD>æ¹æ³•ï¼? 通过 list_dir å’?read_file 实际检查代ç <C3A7>库
*验è¯<EFBFBD>日期ï¼? 2025-12-13
*验è¯<EFBFBD>结果ï¼? æ‰€æœ‰è·¯å¾„åŒæ‡ä»¶çжæ€<C3A6>å<EFBFBD>‡å·²ç¡®è®?

AIclinicalresearch/backend/src/
├── common/                           # 🔵 通用能力��  ├── cache/                        # �真实存在
â”?  â”?  ├── CacheAdapter.ts           # âœ?接å<C2A5>£å®šä¹‰ï¼ˆçœŸå®žï¼‰
â”?  â”?  ├── CacheFactory.ts           # âœ?å·¥åŽæ¨¡å¼<C3A5>(真实)
�  �  ├── index.ts                  # �导出文件(真实)
â”?  â”?  ├── MemoryCacheAdapter.ts     # âœ?内存实现(真实,已完æˆ<C3A6>)
â”?  â”?  ├── RedisCacheAdapter.ts      # 🔴 å<> ä½<C3A4>符(真实存在,但未实现)
â”?  â”?  └── PostgresCacheAdapter.ts   # â<>?需新增(本次改造)
�  ��  ├── jobs/                         # �真实存在
â”?  â”?  ├── types.ts                  # âœ?接å<C2A5>£å®šä¹‰ï¼ˆçœŸå®žï¼‰
â”?  â”?  ├── JobFactory.ts             # âœ?å·¥åŽæ¨¡å¼<C3A5>(真实)
�  �  ├── index.ts                  # �导出文件(真实)
â”?  â”?  ├── MemoryQueue.ts            # âœ?内存实现(真实,已完æˆ<C3A6>)
â”?  â”?  └── PgBossQueue.ts            # â<>?需新增(本次改造)
â”?  â”?â”?  ├── storage/                      # âœ?å­˜å¨æœ<C3A6>务(已完å„ï¼?â”?  ├── logging/                      # âœ?日志系统(已完善ï¼?â”?  └── llm/                          # âœ?LLMç½å…³ï¼ˆå·²å®Œå„ï¼?â”?├── modules/                          # 🟢 业务模å<C2A1>—å±?â”?  ├── asl/                          # âœ?真实存在
�  �  ├── services/
â”?  â”?  â”?  ├── screeningService.ts      # âš ï¸<C3AF> 需改造:改为队列(真实)
�  �  �  └── llmScreeningService.ts   # �真实存在
�  �  └── common/llm/
�  �      └── LLM12FieldsService.ts    # �已用缓存(真实,�16行)
�  ��  └── dc/                           # �真实存在
�      └── tool-b/services/
�          └── HealthCheckService.ts     # �已用缓存(真实,�7行)
â”?└── config/                           # 🔵 å¹³å<C2B3>°åŸºç¡€å±?    ├── database.ts                   # âœ?Prismaé…<C3A9>置(真实)
    └── env.ts                        # âš ï¸<C3AF> 需添加æ°çŽ¯å¢ƒå<C692>˜é‡<C3A9>(真实æ‡ä»¶ï¼?```

**é‡<C3A9>è¦<C3A8>å<EFBFBD>现ï¼?*

1. **âœ?3屿ž¶æž„真实存åœ?* - 代ç <C3A7>组织完全符å<C2A6>ˆè®¾è®¡
2. **âœ?å·¥åŽæ¨¡å¼<C3A5>已实çŽ?* - CacheFactoryåŒJobFactory真实å<C5BE>¯ç”¨
3. **🔴 RedisCacheAdapter是å<C2AF> ä½<C3A4>符** - 所有方法都 `throw new Error('Not implemented')`
4. **âœ?业务å±å·²ä½¿ç”¨cache接å<C2A5>£** - 改造时业务代ç <C3A7>无需修改
5. **â<>?PostgresCacheAdapteråŒPgBossQueue需æ°å¢ž** - 本次改造的核心工作

### 2.2 缓存使用现状(✅ 已验è¯<C3A8>)

| ä½<C3A4>ç½® | 用é€?| TTL | æ•°æ<C2B0>®é‡?| é‡<C3A9>è¦<C3A8>æ€?| 代ç <C3A7>行å<C592>· | 当å‰<C3A5>实现 |
|------|------|-----|--------|--------|---------|---------|
| **LLM12FieldsService.ts** | LLM 12字段æ<C2B5><C3A6>å<EFBFBD>缓存 | 1å°<C3A5>æ—¶ | ~50KB/é¡?| 🔴 高(æˆ<C3A6>本ï¼?| ç¬?16è¡?| âœ?已用cache |
| **HealthCheckService.ts** | Excelå<6C>¥åº·æ£€æŸ¥ç¼“å­?| 24å°<C3A5>æ—¶ | ~5KB/é¡?| 🟡 中等 | ç¬?7è¡?| âœ?已用cache |

**代ç <C3A7>示ä¾ï¼ˆçœŸå®žä»£ç <C3A7>)ï¼?*

```typescript
// backend/src/modules/asl/common/llm/LLM12FieldsService.ts (�16�
const cached = await cache.get(cacheKey);
if (cached) {
  logger.info('缓存命中', { cacheKey });
  return cached;
}
// ... LLM调用 ...
await cache.set(cacheKey, JSON.stringify(result), 3600);  // 1å°<C3A5>æ—¶

// backend/src/modules/dc/tool-b/services/HealthCheckService.ts (�7�
const cached = await cache.get<HealthCheckResult>(cacheKey);
if (cached) return cached;
// ... Excelè§£æž<C3A6> ...
await cache.set(cacheKey, result, 86400);  // 24å°<C3A5>æ—¶

结论:✅ 缓存系统已在使用,å<C592>ªéœ€åˆ‡æ<E280A1>¢åº•å±å®žçŽ°ï¼ˆMemory â†?Postgres)ã€?

2.3 队列使用现状(✅ 已验è¯<C3A8>)

ä½<EFBFBD>ç½® 用é€? 代ç <EFBFBD>行å<EFBFBD>· 耗时 é‡<EFBFBD>è¦<EFBFBD>æ€? 当å‰<EFBFBD>实现 问题
screeningService.ts 文献筛选任åŠ? ç¬?5è¡? 2å°<EFBFBD>æ—¶ï¼?000篇) 🔴 é«? â<EFBFBD><>Œæ­¥æ‰§è¡Œ 实ä¾é”€æ¯<EFBFBD>丢å¤?
DC Tool B 病历批é‡<EFBFBD>æ<EFBFBD><EFBFBD>å<EFBFBD> 未实çŽ? 1-3å°<C3A5>æ—¶ 🔴 é«? â<EFBFBD>?未实çŽ? ä¸<EFBFBD>支æŒ<EFBFBD>长任务

*代ç <EFBFBD>示ä¾ï¼ˆçœŸå®žä»£ç <EFBFBD>)ï¼?

// backend/src/modules/asl/services/screeningService.ts (�5�
// 4. 弿­¥å¤„ç<E2809E>†æ‡çŒ®ï¼ˆç®€åŒç‰ˆï¼šç´æŽ¥åœ¨è¿™é‡Œå¤„ç<E2809E>†ï¼?// 生产环境应该å<C2A5>é€<C3A9>到消æ<CB86>¯é˜ŸåˆprocessLiteraturesInBackground(task.id, projectId, literatures);  // â†?å<>Œæ­¥æ‰§è¡Œï¼Œæœ‰é£Žé™©

结论:â<EFBFBD>Œ 队列系统尚未使用,需è¦<C3A8>ä¼˜å…ˆæ”¹é€ ã€æ³¨é‡Šå·²æ<C2B2><C3A6>示"生产环境应该å<C2A5>é€<C3A9>到消æ<CB86>¯é˜Ÿåˆ—"ã€?

2.4 长任务å<C2A1>¯é<C2AF> æ€§åˆ†æž?🔴 新增

当å‰<EFBFBD>问题(真实场景)

场景1ï¼?000篇æ‡çŒ®ç­é€‰ï¼ˆçº?å°<C3A5>æ—¶ï¼?├─ 问题1:å<C5A1>Œæ­¥æ‰§è¡Œï¼Œé˜»å¡žHTTP请æ±
├─ 问题2:SAE实ä¾15åˆ†éŸæ— æµ<C3A6>é‡<C3A9>自动缩å®?├─ 问题3:实ä¾é‡<C3A9>å<EFBFBD>¯ï¼Œä»»åŠ¡ä»Žå¤´å¼€å§?└─ 结果:任务æˆ<C3A6>功率 < 10%,用户需é‡<C3A9>å¤<C3A5>æ<EFBFBD><C3A6>交3-5æ¬?
场景2ï¼?0000篇æ‡çŒ®ç­é€‰ï¼ˆçº?0å°<C3A5>æ—¶ï¼?├─ 问题1:超过pg-boss最大é”<C3A9>定时间(4å°<C3A5>æ—¶ï¼?├─ 问题2:任务被é‡<C3A9>å¤<C3A5>领å<E280A0>,造æˆ<C3A6>é‡<C3A9>å¤<C3A5>处ç<E2809E>†
└─ 结果:任务失败率 100%

场景3:å<C5A1>å¸ƒæ´æ°ï¼ˆ15:00ï¼?├─ 问题1:正在执行的任务被强制终æ­?├─ 问题2:已处ç<E2809E>†çš„æ‡çŒ®ç»“æžœä¸¢å¤?└─ 结果:用户体验æž<C3A6>å·?```

#### **技术é™<C3A9>制分æž?*

| 技术栈 | é™<C3A9>制 | å½±å“<C3A5> |
|--------|------|------|
| **SAE** | 15åˆ†éŸæ— æµ<C3A6>é‡<C3A9>自动缩å®?| 长任务必然失è´?|
| **pg-boss** | 最长é”<C3A9>å®?å°<C3A5>时(推è<C2A8>?å°<C3A5>æ—¶ï¼?| 超长任务ä¸<C3A4>支æŒ?|
| **HTTP请æ±** | 最é•?0ç§è¶…æ—?| ä¸<C3A4>能å<C2BD>Œæ­¥æ‰§è¡Œé•¿ä»»åŠ?|
| **实ä¾é‡<C3A9>å<EFBFBD>¯** | 内存状æ€<C3A6>丢å¤?| 任务进度丢失 |

#### **解决方案评估**

| ç­–ç•¥ | ä»·å€?| 难度 | pg-boss支æŒ<C3A6> | 推è<C2A8><C3A8>åº?| 实施 |
|------|------|------|------------|--------|------|
| **ç­ç•¥1:心跳续ç§?* | â­<C3A2>â­<C3A2>â­<C3A2>â­<C3A2> | 🔴 é«?| â<>¸<C3A4>支æŒ?| 🟡 ä¸<C3A4>推è<C2A8>?| æšä¸<C3A4>å®žæ½ |
| **ç­ç•¥2:æ­ç¹ç»­ä¼?* | â­<C3A2>â­<C3A2>â­<C3A2>â­<C3A2>â­?| 🟢 ä½?| âœ?兼容 | 🟢 强烈推è<C2A8><C3A8> | **Phase 6** |
| **ç­ç•¥3:任务æ†åˆ?* | â­<C3A2>â­<C3A2>â­<C3A2>â­<C3A2>â­?| 🟡 ä¸?| âœ?原生支æŒ<C3A6> | 🟢 强烈推è<C2A8><C3A8> | **Phase 5** |

**结论**:采ç”?*ç­ç•¥2(æ­ç¹ç»­ä¼ ï¼‰+ ç­ç•¥3(任务æ†åˆ†ï¼‰**组å<E2809E>ˆæ¹æ¡ˆã€?
---

## 3. 改造总体架构

### 3.1 架构对比

改造å‰<EFBFBD>(当å‰<EFBFBD>)ï¼?┌─────────────────────────────────────â”?â”? Business Layer (ASL, DC, SSA...) â”?â”? ├─ screeningService.ts â”?â”? â”? └─ processInBackground() â”? â†?å<>Œæ­¥æ‰§è¡Œï¼ˆé˜»å¡žï¼‰ â”? └─ LLM12FieldsService.ts â”?â”? └─ cache.get/set() â”? â†?使用Memory缓存 └─────────────────────────────────────â”? â†?使用 ┌─────────────────────────────────────â”?â”? Capability Layer (Common) â”?â”? ├─ cache â”?â”? â”? ├─ MemoryCacheAdapter âœ? â”? â†?当å‰<C3A5>使用 â”? â”? └─ RedisCacheAdapter 🔴 â”? â†?å<> ä½<C3A4>符(未实现) â”? └─ jobs â”?â”? └─ MemoryQueue âœ? â”? â†?当å‰<C3A5>使用 └─────────────────────────────────────â”? â†?ä¾<C3A4>èµ â”Œâ”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”€â”?â”? Platform Layer â”?â”? └─ PostgreSQL (业务数æ<C2B0>®) â”?└─────────────────────────────────────â”? 问题ï¼?â<>?缓存ä¸<C3A4>æŒ<C3A6>久(实ä¾é‡<C3A9>å<EFBFBD>¯ä¸¢å¤±ï¼?â<>˜Ÿåˆ—ä¸<C3A4>æŒ<C3A6>久(任务丢失ï¼?â<>?多实ä¾ä¸<C3A4>共享(å<CB86>„自ç¬ç«ï¼‰ â<>?长任务ä¸<C3A4>å<EFBFBD>¯é<C2AF> ï¼?å°<C3A5>时任务失败çŽ?> 90%ï¼?â<>?è¿<C3A8>å<EFBFBD><C3A5>Serverless原则(å<CB86>Œæ­¥æ‰§è¡Œé•¿ä»»åŠ¡ï¼?```

改造å<EFBFBD>Žï¼ˆPostgres-Only + 任务拆分 + 断点续传):
┌──────────────────────────────────────────────────────────────â”?â”? Business Layer (ASL, DC, SSA...)                            â”?â”? ├─ screeningService.ts                                      â”?â”? â”?  ├─ startScreeningTask()        â”? â†?改造ç¹1:任务æ†åˆ?  â”?â”? â”?  â”?  └─ 推é€<C3A9>¸ªæ‰¹æ¬¡åˆ°é˜Ÿåˆ—        â”?                      â”?â”? â”?  └─ registerWorkers()           â”?                      â”?â”? â”?      └─ 处ç<E2809E>†å<E280A0>•个批次+断点续传    â”? â†?改造ç¹2:æ­ç¹ç»­ä¼?  â”?â”? └─ LLM12FieldsService.ts                                    â”?â”?     └─ cache.get/set()             â”? â†?无需改动(接å<C2A5>£ä¸<C3A4>å<EFBFBD>˜ï¼‰â”?└──────────────────────────────────────────────────────────────â”?         â†?使用
┌──────────────────────────────────────────────────────────────â”?â”? Capability Layer (Common)                                   â”?â”? ├─ cache                                                    â”?â”? â”?  ├─ MemoryCacheAdapter âœ?      â”?                      â”?â”? â”?  ├─ PostgresCacheAdapter âœ?    â”? â†?新增ï¼?00行)       â”?â”? â”?  └─ CacheFactory               â”? â†?æ´æ°æ”¯æŒ<C3A6>postgres    â”?â”? └─ jobs                                                     â”?â”?     ├─ MemoryQueue âœ?              â”?                      â”?â”?     ├─ PgBossQueue âœ?              â”? â†?新增ï¼?00行)       â”?â”?     └─ JobFactory                  â”? â†?æ´æ°æ”¯æŒ<C3A6>pgboss     â”?└──────────────────────────────────────────────────────────────â”?         â†?ä¾<C3A4>èµ
┌──────────────────────────────────────────────────────────────â”?â”? Platform Layer                                              â”?â”? ├─ PostgreSQL (RDS)                                         â”?â”? â”?  ├─ 业务数æ<C2B0>®ï¼ˆasl_*, dc_*, ...ï¼?                        â”?â”? â”?  ├─ platform.app_cache âœ?      â”? â†?缓存表(新增ï¼?      â”?â”? â”?  └─ platform.job_* âœ?          â”? â†?队列表(pg-boss自动)â”
â”? └─ 阿里äºRDS特æ€?                                           â”?â”?     ├─ 自动备份(æ¯<C3A6>天)                                      â”?â”?     ├─ PITRï¼ˆæ—¶é—´ç¹æ<C2B9>¢å¤<C3A5>ï¼?                                   â”?â”?     └─ 高å<CB9C>¯ç”¨ï¼ˆä¸»ä»Žåˆ‡æ<E280A1>¢ï¼?                                   â”?└──────────────────────────────────────────────────────────────â”?
优势ï¼?âœ?缓存æŒ<C3A6>ä¹…åŒï¼ˆRDS自动备份ï¼?âœ?队列æŒ<C3A6>ä¹…åŒï¼ˆä»»åŠ¡ä¸<C3A4>丢失)
âœ?多实ä¾å…±äº«ï¼ˆSKIP LOCKEDï¼?âœ?事务一致性(业务+任务原å­<C3A5>æ<EFBFBD><C3A6>交ï¼?âœ?é¶é¢<C3A9>å¤è¿<C3A8>维(å¤<C3A5>用RDSï¼?âœ?长任务å<C2A1>¯é<C2AF> ï¼ˆæ†åˆ†+æ­ç¹ï¼Œæˆ<C3A6>功率 > 99%ï¼?âœ?符å<C2A6>ˆServerless(短任务,å<C592>¯ä¸­æ­æ<C2AD>¢å¤<C3A5>ï¼?```

### 3.2 任务拆分策略 �**新增**

#### **拆分原则**

ç®æ ‡ï¼šæ¯<EFBFBD>个任åŠ?< 15分éŸï¼ˆè¿œä½ŽäºŽpg-boss 4å°<C3A5>æ—¶é™<C3A9>制ï¼? 原因ï¼?1. SAE实ä¾15åˆ†éŸæ— æµ<C3A6>é‡<C3A9>自动缩å®?2. 失败é‡<C3A9>试æˆ<C3A6>本低(å<CB86>ªéœ€é‡<C3A9>试15分éŸï¼Œä¸<C3A4>æ˜?å°<C3A5>æ—¶ï¼?3. è¿åº¦å<C2A6>¯è§<C3A8>性高(用户体验好ï¼?4. å<>¯å¹¶è¡Œå¤„ç<E2809E>†ï¼ˆå¤šWorkerå<72>Œæ—¶å·¥ä½œï¼?```

*拆分策略�

任务类型 å<EFBFBD>•项耗时 推è<EFBFBD><EFBFBD>批次大å°<EFBFBD> 批次耗时 å¹¶å<EFBFBD>能åŠ
*ASLæ‡çŒ®ç­é€? 7.2ç§?ç¯? 100ç¯? 12åˆ†éŸ 10批并è¡?
DC病历æ<EFBFBD><EFBFBD>å<EFBFBD> 10ç§?ä»? 50ä»? 8.3分钟 10批并è¡?
统计分æž<EFBFBD> 0.1ç§?æ<>? 5000æ<EFBFBD>? 8.3分钟 20批并è¡?

实际效果对比

场景ï¼?0000篇æ‡çŒ®ç­é€?
ä¸<C3A4>æ†åˆ†ï¼ˆé”™è¯¯ï¼‰ï¼š
├─ 1个任务,20å°<C3A5>æ—¶
├─ 超过pg-boss 4å°<C3A5>æ—¶é™<C3A9>制 â†?失败
└─ æˆ<C3A6>功率:0%

拆分(正确)ï¼?├─ 100个批次,æ¯<C3A6>批100篇,12分éŸ
├─ 10个Worker并行处ç<E2809E>†
├─ 总耗时ï¼?00 / 10 = 10批轮æ¬?× 12åˆ†éŸ = 2å°<C3A5>æ—¶
└─ æˆ<C3A6>功率:> 99.5%(å<CB86>•批失败å<C2A5>ªéœ€é‡<C3A9>试12分éŸï¼?```

### 3.3 断点续传机制 �**新增**

#### **核心æ€<C3A6>想**

问题:SAE实ä¾éš<EFBFBD>æ—¶å<EFBFBD>¯èƒ½é‡<EFBFBD>å<EFBFBD>¯ï¼ˆå<EFBFBD>å¸ƒæ´æ°ã€<EFBFBD>自动缩容)

无断点: ├─ 处ç<E2809E>†åˆ°ç¬¬900ç¯?â†?实ä¾é‡<C3A9>å<EFBFBD>¯ ├─ é‡<C3A9>æ°å¼€å§ï¼Œä»Žç¬¬1篇开å§?└─ 浪费时间ï¼?00 × 7.2ç§?= 108分éŸ

有断点: ├─ æ¯<C3A6>处ç<E2809E>?0篇,ä¿<C3A4>å­˜è¿åº¦åˆ°æ•°æ<C2B0>®åº“ ├─ 处ç<E2809E>†åˆ°ç¬¬900ç¯?â†?实ä¾é‡<C3A9>å<EFBFBD>¯ ├─ é‡<C3A9>æ°å¼€å§ï¼Œä»Žç¬¬900篇继ç»?└─ 浪费时间ï¼? 1分éŸ


#### **实现策略**

```typescript
// æ•°æ<C2B0>®åº“记录è¿åº?model AslScreeningTask {
  processedItems  Int      // 已处ç<E2809E>†æ•°é‡?  currentIndex    Int      // 当å‰<C3A5>游标(æ­ç¹ï¼‰
  lastCheckpoint  DateTime // 最å<E282AC>Žä¸€æ¬¡ä¿<C3A4>存时é—?  checkpointData  Json     // æ­ç¹è¯¦ç»†æ•°æ<C2B0>®
}

// Worker读å<C2BB>æ­ç¹
const startIndex = task.currentIndex || 0;
for (let i = startIndex; i < items.length; i++) {
  await processItem(items[i]);
  
  // æ¯<C3A6>处ç<E2809E>?0项,ä¿<C3A4>å­˜æ­ç¹
  if ((i + 1) % 10 === 0) {
    await saveCheckpoint(i + 1);
  }
}

ä¿<EFBFBD>å­˜é¢çއæ<EFBFBD>ƒè¡¡

频率 æ•°æ<EFBFBD>®åº“写å…? é‡<EFBFBD>å<EFBFBD>¯æµªè´¹æ—¶é—´ 推è<EFBFBD><EFBFBD>
æ¯?é¡? 很高(性能差) < 10ç§? â<EFBFBD>¸<C3A4>推è<C2A8>?
æ¯?0é¡? 中等 < 2åˆ†éŸ âœ?推è<C2A8><C3A8>
æ¯?00é¡? ä½? < 12åˆ†éŸ ðŸŸ¡ å<>¯é€?
ä¸<EFBFBD>ä¿<EFBFBD>å­? æ—? é‡<EFBFBD>头开å§? â<EFBFBD>¸<C3A4>å<EFBFBD>¯æŽ¥å<C2A5>

3.4 Schema设计(统一在platform)✨ *已更�

// prisma/schema.prisma

datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
  schemas  = ["platform_schema", "aia_schema", "pkb_schema", "asl_schema", 
              "dc_schema", "ssa_schema", "st_schema", "rvw_schema", 
              "admin_schema", "common_schema", "public"]
}

generator client {
  provider        = "prisma-client-js"
  previewFeatures = ["multiSchema"]  // âœ?å<>¯ç”¨å¤šSchema支æŒ<C3A6>
}

// ==================== å¹³å<C2B3>°åŸºç¡€è®¾æ½ï¼ˆplatform_schemaï¼?===================

/// 应用缓存表(æ¿ä»£Redisï¼?model AppCache {
  id        Int      @id @default(autoincrement())
  key       String   @unique @db.VarChar(500)
  value     Json
  expiresAt DateTime @map("expires_at")
  createdAt DateTime @default(now()) @map("created_at")

  @@index([expiresAt], name: "idx_app_cache_expires")
  @@index([key, expiresAt], name: "idx_app_cache_key_expires")
  @@map("app_cache")
  @@schema("platform_schema")  // �统一在platform Schema
}

// pg-boss会自动åˆå»ºä»»åŠ¡è¡¨ï¼ˆä¸<C3A4>需è¦<C3A8>在Prisma中定义)
// 表å<C2A8><C3A5>:platform_schema.job, platform_schema.version ç­?
// ==================== 业务模å<C2A1>—(asl_schemaï¼?===================

/// ASLç­é€‰ä»»åŠ¡è¡¨ï¼ˆâœ¨ 需è¦<C3A8>æ°å¢žå­—段支æŒ<C3A6>æ†åˆ?断点ï¼?model AslScreeningTask {
  id              String   @id @default(uuid())
  projectId       String   @map("project_id")
  taskType        String   @map("task_type")
  status          String   // pending/running/completed/failed
  totalItems      Int      @map("total_items")
  processedItems  Int      @default(0) @map("processed_items")
  successItems    Int      @default(0) @map("success_items")
  failedItems     Int      @default(0) @map("failed_items")
  conflictItems   Int      @default(0) @map("conflict_items")
  
  // �新增:任务拆分支�  totalBatches      Int      @default(1) @map("total_batches")
  processedBatches  Int      @default(0) @map("processed_batches")
  currentBatchIndex Int      @default(0) @map("current_batch_index")
  
  // �新增:断点续传支�  currentIndex    Int       @default(0) @map("current_index")
  lastCheckpoint  DateTime? @map("last_checkpoint")
  checkpointData  Json?     @map("checkpoint_data")
  
  startedAt       DateTime  @map("started_at")
  completedAt     DateTime? @map("completed_at")
  createdAt       DateTime  @default(now()) @map("created_at")
  updatedAt       DateTime  @updatedAt @map("updated_at")

  project         AslScreeningProject @relation(fields: [projectId], references: [id])
  results         AslScreeningResult[]

  @@index([projectId])
  @@index([status])
  @@map("asl_screening_tasks")
  @@schema("asl_schema")
}

*新增字段说明�

字段 类型 用� 示例�
totalBatches Int 总批次数 10�000篇�00�批)
processedBatches Int 已完æˆ<EFBFBD>批次数 3(已完æˆ<EFBFBD>3批)
currentBatchIndex Int 当å‰<EFBFBD>批次索引 3(正在处ç<EFBFBD>†ç¬¬4批)
currentIndex Int 当å‰<EFBFBD>项索引(æ­ç¹ï¼? 350(已处ç<EFBFBD>†350篇)
lastCheckpoint DateTime 最å<EFBFBD>Žä¸€æ¬¡ä¿<EFBFBD>å­˜æ­ç¹æ—¶é—? 2025-12-13 10:30:00
checkpointData Json æ­ç¹è¯¦ç»†æ•°æ<EFBFBD>® {"lastProcessedId": "lit_123", "batchProgress": 0.35}

3.5 Key/Queueå½å<C2BD><C3A5>规范

// 缓存Key规范(逻è¾éš”离ï¼?const CACHE_KEY_PATTERNS = {
  // ASL模å<C2A1>  'asl:llm:{hash}': 'LLMæ<4D><C3A6>å<EFBFBD>结果',
  'asl:pdf:{fileId}': 'PDFè§£æž<C3A6>结果',
  
  // DC模å<C2A1>  'dc:health:{fileHash}': 'Excelå<6C>¥åº·æ£€æŸ?,
  'dc:extraction:{recordId}': '病历æ<EFBFBD><EFBFBD>å<EFBFBD>结果',
  
  // 全局
  'session:{userId}': '用户Session',
  'config:{key}': '系统é…<EFBFBD>ç½®',
};

// 队列Name规范
const QUEUE_NAMES = {
  ASL_TITLE_SCREENING: 'asl:title-screening',
  ASL_FULLTEXT_SCREENING: 'asl:fulltext-screening',
  DC_MEDICAL_EXTRACTION: 'dc:medical-extraction',
  DC_DATA_CLEANING: 'dc:data-cleaning',
  SSA_STATISTICAL_ANALYSIS: 'ssa:statistical-analysis',
};

4. 详细实施步骤

4.1 Phase 1:环境准备(Day 1上å<C5A0>ˆï¼?.5天)

任务1.1ï¼šæ´æ°Prisma Schema

# æ‡ä»¶ï¼šprisma/schema.prisma

修改ç‚?:å<C5A1>¯ç”¨multiSchema

generator client {
  provider        = "prisma-client-js"
  previewFeatures = ["multiSchema"]  // �新增
}

修改ç‚?:添加AppCache模åž

/// 应用缓存表(Postgres-Only架构�model AppCache {
  id        Int      @id @default(autoincrement())
  key       String   @unique @db.VarChar(500)
  value     Json
  expiresAt DateTime @map("expires_at")
  createdAt DateTime @default(now()) @map("created_at")

  @@index([expiresAt], name: "idx_app_cache_expires")
  @@index([key, expiresAt], name: "idx_app_cache_key_expires")
  @@map("app_cache")
  @@schema("platform_schema")
}

执行è¿<EFBFBD>ç§»

cd backend

# 1. 生æˆ<C3A6>è¿<C3A8>ç§»æ‡ä»¶
npx prisma migrate dev --name add_postgres_cache

# 2. 生æˆ<C3A6>Prisma Client
npx prisma generate

# 3. 查çœç”Ÿæˆ<C3A6>çš„SQL
cat prisma/migrations/*/migration.sql

验è¯<EFBFBD>结果

-- 应该çœåˆ°ä»¥ä¸SQL
CREATE TABLE "platform_schema"."app_cache" (
  "id" SERIAL PRIMARY KEY,
  "key" VARCHAR(500) UNIQUE NOT NULL,
  "value" JSONB NOT NULL,
  "expires_at" TIMESTAMP NOT NULL,
  "created_at" TIMESTAMP DEFAULT NOW()
);

CREATE INDEX "idx_app_cache_expires" ON "platform_schema"."app_cache"("expires_at");
CREATE INDEX "idx_app_cache_key_expires" ON "platform_schema"."app_cache"("key", "expires_at");

*任务1.2:安装ä¾<EFBFBD>èµ?

cd backend

# 安装pg-boss(任务队列)
npm install pg-boss --save

# 查看版本
npm list pg-boss
# 应显示:pg-boss@10.x.x

*任务1.3ï¼šæ´æ°çŽ¯å¢ƒå<EFBFBD>˜é‡<EFBFBD>é…<EFBFBD>ç½?

// æ‡ä»¶ï¼šbackend/src/config/env.ts

import { z } from 'zod';

const envSchema = z.object({
  // ... 现有é…<C3A9>ç½® ...
  
  // ==================== 缓存é…<C3A9>ç½® ====================
  CACHE_TYPE: z.enum(['memory', 'postgres']).default('memory'),  // â†?æ°å¢žpostgres选项
  
  // ==================== 队列é…<C3A9>ç½® ====================
  QUEUE_TYPE: z.enum(['memory', 'pgboss']).default('memory'),  // â†?æ°å¢žpgboss选项
  
  // ==================== æ•°æ<C2B0>®åº“é…<C3A9>ç½?====================
  DATABASE_URL: z.string(),
});

export const config = {
  // ... 现有é…<C3A9>ç½® ...
  
  // 缓存é…<C3A9>ç½®
  cacheType: process.env.CACHE_TYPE || 'memory',
  
  // 队列é…<C3A9>ç½®
  queueType: process.env.QUEUE_TYPE || 'memory',
  
  // æ•°æ<C2B0>®åº“URL
  databaseUrl: process.env.DATABASE_URL,
};
# æ‡ä»¶ï¼šbackend/.env

# ==================== 缓存é…<C3A9>ç½® ====================
CACHE_TYPE=postgres  # memory | postgres

# ==================== 队列é…<C3A9>ç½® ====================
QUEUE_TYPE=pgboss    # memory | pgboss

# ==================== æ•°æ<C2B0>®åº“é…<C3A9>ç½?====================
DATABASE_URL=postgresql://user:password@localhost:5432/aiclincial?schema=public

4.2 Phase 2:实现PostgresCacheAdapter(Day 1ä¸å<E280B9>ˆï¼?.5天)

任务2.1:åˆå»ºPostgresCacheAdapter

// æ‡ä»¶ï¼šbackend/src/common/cache/PostgresCacheAdapter.ts

import { prisma } from '../../config/database.js';
import type { CacheAdapter } from './CacheAdapter.js';
import { logger } from '../logging/index.js';

/**
 * Postgres缓存é€é…<C3A9>å™? * 
 * 核心特性:
 * - æŒ<C3A6>ä¹…åŒå­˜å¨ï¼ˆå®žä¾é‡<C3A9>å<EFBFBD>¯ä¸<C3A4>丢失)
 * - 多实ä¾å…±äº«ï¼ˆé€šè¿‡æ•°æ<C2B0>®åº“)
 * - æ‡æƒ°åˆ é™¤ï¼ˆè¯»å<C2BB>时清ç<E280A6>†è¿‡æœŸæ•°æ<C2B0>®ï¼? * - 自动清ç<E280A6>†ï¼ˆå®šæ—¶ä»»åŠ¡åˆ é™¤è¿‡æœŸæ•°æ<C2B0>®ï¼‰
 */
export class PostgresCacheAdapter implements CacheAdapter {
  
  /**
   * 获å<C2B7>ç¼“å­˜ï¼ˆå¸¦è¿‡æœŸæ£€æŸ¥åŒæ‡æƒ°åˆ é™¤ï¼?   */
  async get<T = any>(key: string): Promise<T | null> {
    try {
      const record = await prisma.appCache.findUnique({
        where: { key }
      });
      
      if (!record) {
        return null;
      }
      
      // 检查是å<C2AF>¦è¿‡æœ?      if (record.expiresAt < new Date()) {
        // æ‡æƒ°åˆ é™¤ï¼šå¼æ­¥åˆ é™¤ï¼Œä¸<C3A4>阻塞主æµ<C3A6>ç¨
        this.deleteAsync(key);
        return null;
      }
      
      logger.debug('[PostgresCache] 缓存命中', { key });
      return record.value as T;
      
    } catch (error) {
      logger.error('[PostgresCache] 读å<C2BB>失败', { key, error });
      return null;
    }
  }

  /**
   * 设置缓存
   */
  async set(key: string, value: any, ttlSeconds: number = 3600): Promise<void> {
    try {
      const expiresAt = new Date(Date.now() + ttlSeconds * 1000);
      
      await prisma.appCache.upsert({
        where: { key },
        create: {
          key,
          value: value as any,  // Prisma会自动处ç<E2809E>†JSON
          expiresAt,
        },
        update: {
          value: value as any,
          expiresAt,
        }
      });
      
      logger.debug('[PostgresCache] 缓存写入', { key, ttl: ttlSeconds });
      
    } catch (error) {
      logger.error('[PostgresCache] 写入失败', { key, error });
      throw error;
    }
  }

  /**
   * 删除缓存
   */
  async delete(key: string): Promise<boolean> {
    try {
      await prisma.appCache.delete({
        where: { key }
      });
      
      logger.debug('[PostgresCache] 缓存删除', { key });
      return true;
      
    } catch (error) {
      // 妿žœkeyä¸<C3A4>存在,Prisma会æŠå‡ºé”™è¯?      if ((error as any)?.code === 'P2025') {
        return false;
      }
      logger.error('[PostgresCache] 删除失败', { key, error });
      return false;
    }
  }

  /**
   * 弿­¥åˆ é™¤ï¼ˆä¸<C3A4>阻塞主æµ<C3A6>ç¨ï¼‰
   */
  private deleteAsync(key: string): void {
    prisma.appCache.delete({ where: { key } })
      .catch(err => {
        // é<>™é»˜å¤±è´¥ï¼ˆå<CB86>¯èƒ½å·²è¢«å…¶ä»å®žä¾åˆ é™¤ï¼‰
        logger.debug('[PostgresCache] 懒惰删除失败', { key, err });
      });
  }

  /**
   * 批é‡<C3A9>删除(支æŒ<C3A6>模å¼<C3A5>匹é…<C3A9>)
   */
  async deleteMany(pattern: string): Promise<number> {
    try {
      const result = await prisma.appCache.deleteMany({
        where: {
          key: {
            contains: pattern
          }
        }
      });
      
      logger.info('[PostgresCache] 批é‡<C3A9>删除', { pattern, count: result.count });
      return result.count;
      
    } catch (error) {
      logger.error('[PostgresCache] 批é‡<C3A9>删除失败', { pattern, error });
      return 0;
    }
  }

  /**
   * 清空所有缓�   */
  async flush(): Promise<void> {
    try {
      await prisma.appCache.deleteMany({});
      logger.info('[PostgresCache] 缓存已清�);
    } catch (error) {
      logger.error('[PostgresCache] 清空失败', { error });
      throw error;
    }
  }

  /**
   * 获å<C2B7>缓存统计信æ<C2A1>¯
   */
  async getStats(): Promise<{
    total: number;
    expired: number;
    byModule: Record<string, number>;
  }> {
    try {
      const now = new Date();
      
      // 总数
      const total = await prisma.appCache.count();
      
      // 过期数é‡<C3A9>
      const expired = await prisma.appCache.count({
        where: {
          expiresAt: { lt: now }
        }
      });
      
      // 按模å<C2A1>—统计(通过keyå‰<C3A5>缀分组ï¼?      const all = await prisma.appCache.findMany({
        select: { key: true }
      });
      
      const byModule: Record<string, number> = {};
      all.forEach(item => {
        const module = item.key.split(':')[0];
        byModule[module] = (byModule[module] || 0) + 1;
      });
      
      return { total, expired, byModule };
      
    } catch (error) {
      logger.error('[PostgresCache] 获å<EFBFBD>统计失败', { error });
      return { total: 0, expired: 0, byModule: {} };
    }
  }
}

/**
 * å<>¯åŠ¨å®šæ—¶æ¸…ç<E280A6>†ä»»åŠ¡ï¼ˆåˆ†æ‰¹æ¸…ç<E280A6>†ï¼Œé˜²æ­¢é˜»å¡žï¼? * 
 * ç­–ç•¥ï¼? * - æ¯<C3A6>åˆ†éŸæ‰§è¡Œä¸€æ¬? * - æ¯<C3A6>次删除1000æ<30>¡è¿‡æœŸæ•°æ<C2B0>? * - 使用LIMITé<54>¿å…<C3A5>大äºåŠ? */
export function startCacheCleanupTask(): void {
  const CLEANUP_INTERVAL = 60 * 1000;  // 1分éŸ
  const BATCH_SIZE = 1000;             // æ¯<C3A6>次1000æ<30>?  
  setInterval(async () => {
    try {
      // 使用原生SQL,支æŒ<C3A6>LIMIT
      const result = await prisma.$executeRaw`
        DELETE FROM platform_schema.app_cache 
        WHERE id IN (
          SELECT id FROM platform_schema.app_cache 
          WHERE expires_at < NOW() 
          LIMIT ${BATCH_SIZE}
        )
      `;
      
      if (result > 0) {
        logger.info('[PostgresCache] 定时清ç<EFBFBD>', { deleted: result });
      }
      
    } catch (error) {
      logger.error('[PostgresCache] 定时清ç<EFBFBD>†å¤±è´¥', { error });
    }
  }, CLEANUP_INTERVAL);
  
  logger.info('[PostgresCache] 定时清ç<EFBFBD>†ä»»åС已å<EFBFBD>¯åŠ?, {
    interval: `${CLEANUP_INTERVAL / 1000}ç§’`,
    batchSize: BATCH_SIZE
  });
}

任务2.2ï¼šæ´æ°CacheFactory

// æ‡ä»¶ï¼šbackend/src/common/cache/CacheFactory.ts

import { CacheAdapter } from './CacheAdapter.js';
import { MemoryCacheAdapter } from './MemoryCacheAdapter.js';
import { RedisCacheAdapter } from './RedisCacheAdapter.js';
import { PostgresCacheAdapter } from './PostgresCacheAdapter.js';  // �新增
import { logger } from '../logging/index.js';
import { config } from '../../config/env.js';

export class CacheFactory {
  private static instance: CacheAdapter | null = null;

  static getInstance(): CacheAdapter {
    if (!this.instance) {
      this.instance = this.createAdapter();
    }
    return this.instance;
  }

  private static createAdapter(): CacheAdapter {
    const cacheType = config.cacheType;

    logger.info('[CacheFactory] åˆ<C3A5>å§åŒç¼“å­?, { cacheType });

    switch (cacheType) {
      case 'postgres':  // �新增
        return this.createPostgresAdapter();
      
      case 'memory':
        return this.createMemoryAdapter();
      
      case 'redis':
        return this.createRedisAdapter();
      
      default:
        logger.warn(`[CacheFactory] 未知缓存类型: ${cacheType}, é™<C3A9>级到内存`);
        return this.createMemoryAdapter();
    }
  }

  /**
   * åˆå»ºPostgres缓存é€é…<C3A9>å™?   */
  private static createPostgresAdapter(): PostgresCacheAdapter {
    logger.info('[CacheFactory] 使用PostgresCacheAdapter');
    return new PostgresCacheAdapter();
  }

  private static createMemoryAdapter(): MemoryCacheAdapter {
    logger.info('[CacheFactory] 使用MemoryCacheAdapter');
    return new MemoryCacheAdapter();
  }

  private static createRedisAdapter(): RedisCacheAdapter {
    // ... 现有Redisé€»è¾ ...
    logger.info('[CacheFactory] 使用RedisCacheAdapter');
    return new RedisCacheAdapter({ /* ... */ });
  }

  static reset(): void {
    this.instance = null;
  }
}

*任务2.3:更新导�

// æ‡ä»¶ï¼šbackend/src/common/cache/index.ts

export type { CacheAdapter } from './CacheAdapter.js';
export { MemoryCacheAdapter } from './MemoryCacheAdapter.js';
export { RedisCacheAdapter } from './RedisCacheAdapter.js';
export { PostgresCacheAdapter, startCacheCleanupTask } from './PostgresCacheAdapter.js';  // �新增
export { CacheFactory } from './CacheFactory.js';

import { CacheFactory } from './CacheFactory.js';

export const cache = CacheFactory.getInstance();

4.3 Phase 3:实现PgBossQueue(Day 2-3�天)

任务3.1:åˆå»ºPgBossQueue

// æ‡ä»¶ï¼šbackend/src/common/jobs/PgBossQueue.ts

import PgBoss from 'pg-boss';
import type { Job, JobQueue, JobHandler } from './types.js';
import { logger } from '../logging/index.js';
import { config } from '../../config/env.js';

/**
 * PgBoss队列é€é…<C3A9>å™? * 
 * 核心特性:
 * - 任务æŒ<C3A6>ä¹…åŒï¼ˆå®žä¾é‡<C3A9>å<EFBFBD>¯ä¸<C3A4>丢失)
 * - 自动é‡<C3A9>试(指数退é<E282AC>¿ï¼‰
 * - 多实ä¾å<E280B9><C3A5>调(SKIP LOCKEDï¼? * - 长任务支æŒ<C3A6>(4å°<C3A5>æ—¶è¶…æ—¶ï¼? */
export class PgBossQueue implements JobQueue {
  private boss: PgBoss;
  private started = false;
  private workers: Map<string, any> = new Map();

  constructor() {
    this.boss = new PgBoss({
      connectionString: config.databaseUrl,
      schema: 'platform_schema',  // 统一在platform Schema
      max: 5,                      // 连接池大�      
      // âœ?关键é…<C3A9>置:长任务支æŒ<C3A6>
      expireInHours: 4,            // 任务é”?å°<C3A5>æ—¶å<C2B6>Žè¿‡æœ?      
      // 自动维护(清ç<E280A6>†æ—§ä»»åŠ¡ï¼?      retentionDays: 7,            // ä¿<C3A4>ç•™7天的历å<E280A0>²ä»»åŠ¡
      deleteAfterDays: 30,         // 30天å<C2A9>Žå½»åº•删除
    });

    // çå<E28098>¬é”™è¯¯
    this.boss.on('error', error => {
      logger.error('[PgBoss] 队列错误', { error: error.message });
    });

    // çå<E28098>¬ç»´æŠ¤äºä»¶
    this.boss.on('maintenance', () => {
      logger.debug('[PgBoss] 执行维护任务');
    });
  }

  /**
   * å<>¯åŠ¨é˜Ÿåˆ—ï¼ˆæ‡åŠ è½½ï¼?   */
  private async ensureStarted(): Promise<void> {
    if (this.started) return;
    
    try {
      await this.boss.start();
      this.started = true;
      logger.info('[PgBoss] 队列已å<C2B2>¯åŠ?);
    } catch (error) {
      logger.error('[PgBoss] 队列å<EFBFBD>¯åŠ¨å¤±è´¥', { error });
      throw error;
    }
  }

  /**
   * 推é€<C3A9>ä»»åŠ¡åˆ°é˜Ÿåˆ   */
  async push<T = any>(type: string, data: T, options?: any): Promise<Job> {
    await this.ensureStarted();
    
    try {
      const jobId = await this.boss.send(type, data, {
        retryLimit: 3,                 // 失败é‡<C3A9>试3æ¬?        retryDelay: 60,                // 失败å<C2A5>?0ç§é‡<C3A9>è¯?        retryBackoff: true,            // 指数退é<E282AC>¿ï¼ˆ60s, 120s, 240sï¼?        expireInHours: 4,              // 4å°<C3A5>æ—¶å<C2B6>Žè¿‡æœ?        ...options
      });

      logger.info('[PgBoss] 任务入队', { type, jobId });

      return {
        id: jobId!,
        type,
        data,
        status: 'pending',
        createdAt: new Date(),
      };
      
    } catch (error) {
      logger.error('[PgBoss] 任务入队失败', { type, error });
      throw error;
    }
  }

  /**
   * 注册任务处ç<E2809E>†å™?   */
  process<T = any>(type: string, handler: JobHandler<T>): void {
    // 弿­¥æ³¨å†Œï¼ˆä¸<C3A4>阻塞主æµ<C3A6>ç¨ï¼‰
    this.registerWorkerAsync(type, handler);
  }

  /**
   * 弿­¥æ³¨å†ŒWorker
   */
  private async registerWorkerAsync<T>(
    type: string, 
    handler: JobHandler<T>
  ): Promise<void> {
    try {
      await this.ensureStarted();
      
      // 注册Worker
      await this.boss.work(
        type,
        {
          teamSize: 1,       // æ¯<C3A6>个队列并å<C2B6>¸ªä»»åŠ?          teamConcurrency: 1 // æ¯<C3A6>个Worker处ç<E2809E>†1个任åŠ?        },
        async (job: any) => {
          const startTime = Date.now();
          
          logger.info('[PgBoss] å¼€å§å¤„ç<EFBFBD>†ä»»åŠ?, { 
            type, 
            jobId: job.id,
            attemptsMade: job.data.__retryCount || 0,
            attemptsTotal: 3
          });

          try {
            // 调用业务处ç<E2809E>†å‡½æ•°
            const result = await handler({
              id: job.id,
              type,
              data: job.data as T,
              status: 'processing',
              createdAt: new Date(job.createdon),
            });

            const duration = Date.now() - startTime;
            logger.info('[PgBoss] âœ?任务完æˆ<C3A6>', { 
              type, 
              jobId: job.id,
              duration: `${(duration / 1000).toFixed(2)}s`
            });

            return result;

          } catch (error) {
            const duration = Date.now() - startTime;
            logger.error('[PgBoss] â<>?任务失败', { 
              type, 
              jobId: job.id,
              attemptsMade: job.data.__retryCount || 0,
              duration: `${(duration / 1000).toFixed(2)}s`,
              error: error instanceof Error ? error.message : 'Unknown'
            });
            
            // æŠå‡ºé”™è¯¯ï¼Œè§¦å<C2A6>pg-boss自动é‡<C3A9>试
            throw error;
          }
        }
      );

      this.workers.set(type, true);
      logger.info('[PgBoss] �Worker已注�, { type });

    } catch (error) {
      logger.error('[PgBoss] Worker注册失败', { type, error });
      throw error;
    }
  }

  /**
   * 获å<C2B7>任务状æ€?   */
  async getJob(id: string): Promise<Job | null> {
    try {
      await this.ensureStarted();
      
      const job = await this.boss.getJobById(id);
      if (!job) return null;

      return {
        id: job.id!,
        type: job.name,
        data: job.data,
        status: this.mapState(job.state),
        progress: 0,  // pg-bossä¸<C3A4>ç´æŽ¥æ”¯æŒ<C3A6>è¿åº?        createdAt: new Date(job.createdon),
        completedAt: job.completedon ? new Date(job.completedon) : undefined,
        error: job.output?.message,
      };
      
    } catch (error) {
      logger.error('[PgBoss] 获å<EFBFBD>任务失败', { id, error });
      return null;
    }
  }

  /**
   * 映射pg-boss状æ€<C3A6>到通用状æ€?   */
  private mapState(state: string): Job['status'] {
    switch (state) {
      case 'completed':
        return 'completed';
      case 'failed':
        return 'failed';
      case 'active':
        return 'processing';
      case 'cancelled':
        return 'cancelled';
      default:
        return 'pending';
    }
  }

  /**
   * 更新任务进度(通过业务表)
   * 
   * 注æ„<C3A6>:pg-bossä¸<C3A4>ç´æŽ¥æ”¯æŒ<C3A6>è¿åº¦æ´æ°ï¼Œ
   * 需è¦<C3A8>在业务å±é€šè¿‡æ•°æ<C2B0>®åº“表实现
   */
  async updateProgress(id: string, progress: number, message?: string): Promise<void> {
    logger.debug('[PgBoss] è¿åº¦æ´æ°ï¼ˆéœ€ä¸šåŠ¡è¡¨æ”¯æŒ<EFBFBD>)', { id, progress, message });
    // 实际实现:更�aslScreeningTask.processedItems
  }

  /**
   * å<>消任务
   */
  async cancelJob(id: string): Promise<boolean> {
    try {
      await this.ensureStarted();
      await this.boss.cancel(id);
      logger.info('[PgBoss] 任务已å<EFBFBD>æ¶?, { id });
      return true;
    } catch (error) {
      logger.error('[PgBoss] å<>消任务失败', { id, error });
      return false;
    }
  }

  /**
   * é‡<C3A9>试失败任务
   */
  async retryJob(id: string): Promise<boolean> {
    try {
      await this.ensureStarted();
      await this.boss.resume(id);
      logger.info('[PgBoss] 任务已é‡<C3A9>è¯?, { id });
      return true;
    } catch (error) {
      logger.error('[PgBoss] é‡<EFBFBD>试任务失败', { id, error });
      return false;
    }
  }

  /**
   * 清ç<E280A6>†æ—§ä»»åŠ¡ï¼ˆpg-boss自动处ç<E2809E>†ï¼?   */
  async cleanup(olderThan: number = 86400000): Promise<number> {
    // pg-boss有自动清ç<E280A6>†æœºåˆ¶ï¼ˆretentionDaysï¼?    logger.debug('[PgBoss] 使用自动清ç<EFBFBD>†æœºåˆ');
    return 0;
  }

  /**
   * 关闭队列(优雅关闭)
   */
  async close(): Promise<void> {
    if (!this.started) return;
    
    try {
      await this.boss.stop();
      this.started = false;
      logger.info('[PgBoss] 队列已关�);
    } catch (error) {
      logger.error('[PgBoss] 队列关闭失败', { error });
    }
  }
}

任务3.2ï¼šæ´æ°JobFactory

// æ‡ä»¶ï¼šbackend/src/common/jobs/JobFactory.ts

import { JobQueue } from './types.js';
import { MemoryQueue } from './MemoryQueue.js';
import { PgBossQueue } from './PgBossQueue.js';  // �新增
import { logger } from '../logging/index.js';
import { config } from '../../config/env.js';

export class JobFactory {
  private static instance: JobQueue | null = null;

  static getInstance(): JobQueue {
    if (!this.instance) {
      this.instance = this.createQueue();
    }
    return this.instance;
  }

  private static createQueue(): JobQueue {
    const queueType = config.queueType;

    logger.info('[JobFactory] åˆ<C3A5>å§åŒä»»åŠ¡é˜Ÿåˆ?, { queueType });

    switch (queueType) {
      case 'pgboss':  // �新增
        return this.createPgBossQueue();
      
      case 'memory':
        return this.createMemoryQueue();
      
      default:
        logger.warn(`[JobFactory] 未知队列类型: ${queueType}, é™<C3A9>级到内存`);
        return this.createMemoryQueue();
    }
  }

  /**
   * åˆå»ºPgBossé˜Ÿåˆ   */
  private static createPgBossQueue(): PgBossQueue {
    logger.info('[JobFactory] 使用PgBossQueue');
    return new PgBossQueue();
  }

  private static createMemoryQueue(): MemoryQueue {
    logger.info('[JobFactory] 使用MemoryQueue');
    
    const queue = new MemoryQueue();
    
    // 定期清ç<E280A6>†ï¼ˆé<CB86>¿å…<C3A5>内存泄æ¼<C3A6>)
    if (process.env.NODE_ENV !== 'test') {
      setInterval(() => {
        queue.cleanup();
      }, 60 * 60 * 1000);
    }
    
    return queue;
  }

  static reset(): void {
    this.instance = null;
  }
}

*任务3.3:更新导�

// æ‡ä»¶ï¼šbackend/src/common/jobs/index.ts

export type { Job, JobStatus, JobHandler, JobQueue } from './types.js';
export { MemoryQueue } from './MemoryQueue.js';
export { PgBossQueue } from './PgBossQueue.js';  // �新增
export { JobFactory } from './JobFactory.js';

import { JobFactory } from './JobFactory.js';

export const jobQueue = JobFactory.getInstance();

4.4 Phase 4:实现任务æ†åˆ†æœºåˆ¶ï¼ˆDay 4ï¼?天)âœ?新增

*任务4.1:创建任务拆分工具函�

// æ‡ä»¶ï¼šbackend/src/common/jobs/utils.ts (新建文件)

import { logger } from '../logging/index.js';

/**
 * 将数组æ†åˆ†æˆ<C3A6>多个批次
 * 
 * @param items è¦<C3A8>æ†åˆ†çš„é¡¹ç®æ•°ç»„
 * @param chunkSize æ¯<C3A6>批次大å°? * @returns æ†åˆ†å<E280A0>Žçš„二维数组
 * 
 * @example
 * splitIntoChunks([1,2,3,4,5], 2)  // [[1,2], [3,4], [5]]
 */
export function splitIntoChunks<T>(items: T[], chunkSize: number = 100): T[][] {
  if (chunkSize <= 0) {
    throw new Error('chunkSize must be greater than 0');
  }
  
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    chunks.push(items.slice(i, i + chunkSize));
  }
  
  logger.debug('[TaskSplit] 任务æ†åˆ†å®Œæˆ<C3A6>', {
    total: items.length,
    chunkSize,
    chunks: chunks.length
  });
  
  return chunks;
}

/**
 * 估算处ç<E2809E>†æ—¶é—´
 * 
 * @param itemCount é¡¹ç®æ•°é‡<C3A9>
 * @param timePerItem æ¯<C3A6>项处ç<E2809E>†æ—¶é—´ï¼ˆç§ï¼? * @returns 总时间(秒)
 */
export function estimateProcessingTime(
  itemCount: number,
  timePerItem: number
): number {
  return itemCount * timePerItem;
}

/**
 * 推è<C2A8><C3A8>批次大å°<C3A5>
 * 
 * æ ¹æ<C2B9>®å<C2AE>•项处ç<E2809E>†æ—¶é—´åŒæœ€å¤§æ‰¹æ¬¡æ—¶é—´ï¼Œè®¡ç®—最优批次大å°? * 
 * @param totalItems 总项目数
 * @param timePerItem æ¯<C3A6>项处ç<E2809E>†æ—¶é—´ï¼ˆç§ï¼? * @param maxChunkTime å<>•批次最大时间(ç§ï¼Œé»˜è®¤15分éŸï¼? * @returns 推è<C2A8><C3A8>的批次大å°? * 
 * @example
 * recommendChunkSize(1000, 7.2, 900)  // 返回 125(æ¯<C3A6>æ‰?25项,15分éŸï¼? */
export function recommendChunkSize(
  totalItems: number,
  timePerItem: number,
  maxChunkTime: number = 900  // 15分éŸ
): number {
  // 计算æ¯<C3A6>批次最多能处ç<E2809E>†å¤šå°é¡?  const itemsPerChunk = Math.floor(maxChunkTime / timePerItem);
  
  // é™<C3A9>制范å´ï¼šæœ€å°?0项,最å¤?000é¡?  const recommended = Math.max(10, Math.min(itemsPerChunk, 1000));
  
  logger.info('[TaskSplit] 批次大å°<C3A5>推è<C2A8><C3A8>', {
    totalItems,
    timePerItem: `${timePerItem}ç§’`,
    maxChunkTime: `${maxChunkTime}ç§’`,
    recommended,
    estimatedBatches: Math.ceil(totalItems / recommended),
    estimatedTimePerBatch: `${(recommended * timePerItem / 60).toFixed(1)}分钟`
  });
  
  return recommended;
}

/**
 * 批次任务é…<C3A9>ç½®è¡? */
export const CHUNK_STRATEGIES = {
  // ASLæ‡çŒ®ç­é€‰ï¼šæ¯<C3A6>批100篇,çº?2分éŸ
  'asl:title-screening': {
    chunkSize: 100,
    timePerItem: 7.2,  // ç§?    estimatedTime: 720,  // 12分éŸ
    maxRetries: 3,
    description: 'ASL标题æ˜è¦<C3A8>ç­é€‰ï¼ˆå<CB86>Œæ¨¡åžå¹¶è¡Œï¼‰'
  },
  
  // ASLå…¨æ‡å¤<C3A5>ç­ï¼šæ¯<C3A6>æ‰?0篇,çº?5分éŸ
  'asl:fulltext-screening': {
    chunkSize: 50,
    timePerItem: 18,  // ç§?    estimatedTime: 900,  // 15分éŸ
    maxRetries: 3,
    description: 'ASLå…¨æ‡å¤<C3A5>ç­ï¼?2字段æ<C2B5><C3A6>å<EFBFBD>ï¼?
  },
  
  // DC病历æ<E280A0><C3A6>å<EFBFBD>:æ¯<C3A6>æ‰?0份,çº?分钟
  'dc:medical-extraction': {
    chunkSize: 50,
    timePerItem: 10,  // ç§?    estimatedTime: 500,  // 8分éŸ
    maxRetries: 3,
    description: 'DC医ç—è®°å½•ç»“æž„åŒæ<EFBFBD><EFBFBD>å<EFBFBD>?
  },
  
  // 统计分æž<C3A6>:æ¯<C3A6>æ‰?000æ<30>¡ï¼Œçº?分钟
  'ssa:statistical-analysis': {
    chunkSize: 5000,
    timePerItem: 0.1,  // ç§?    estimatedTime: 500,  // 8分éŸ
    maxRetries: 2,
    description: 'SSA统计分æž<C3A6>计算'
  }
} as const;

/**
 * 获å<C2B7>任务æ†åˆ†ç­ç•¥
 */
export function getChunkStrategy(taskType: keyof typeof CHUNK_STRATEGIES) {
  const strategy = CHUNK_STRATEGIES[taskType];
  if (!strategy) {
    logger.warn('[TaskSplit] 未找到任务ç­ç•¥ï¼Œä½¿ç”¨é»˜è®¤é…<C3A9>ç½®', { taskType });
    return {
      chunkSize: 100,
      timePerItem: 10,
      estimatedTime: 1000,
      maxRetries: 3,
      description: '默认任务策略'
    };
  }
  return strategy;
}

*任务4.2:更新导�

// æ‡ä»¶ï¼šbackend/src/common/jobs/index.ts

export type { Job, JobStatus, JobHandler, JobQueue } from './types.js';
export { MemoryQueue } from './MemoryQueue.js';
export { PgBossQueue } from './PgBossQueue.js';
export { JobFactory } from './JobFactory.js';
export {  // �新增
  splitIntoChunks,
  estimateProcessingTime,
  recommendChunkSize,
  getChunkStrategy,
  CHUNK_STRATEGIES
} from './utils.js';

import { JobFactory } from './JobFactory.js';

export const jobQueue = JobFactory.getInstance();

*任务4.3:å<EFBFBD>•å…ƒæµè¯?

// æ‡ä»¶ï¼šbackend/tests/common/jobs/utils.test.ts(æ°å»ºï¼‰

import { describe, it, expect } from 'vitest';
import { splitIntoChunks, recommendChunkSize } from '../../../src/common/jobs/utils.js';

describe('Task Split Utils', () => {
  describe('splitIntoChunks', () => {
    it('should split array into chunks', () => {
      const items = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
      const chunks = splitIntoChunks(items, 3);
      
      expect(chunks).toEqual([
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
        [10]
      ]);
    });

    it('should handle exact division', () => {
      const items = [1, 2, 3, 4, 5, 6];
      const chunks = splitIntoChunks(items, 2);
      
      expect(chunks).toEqual([
        [1, 2],
        [3, 4],
        [5, 6]
      ]);
    });

    it('should handle empty array', () => {
      const chunks = splitIntoChunks([], 10);
      expect(chunks).toEqual([]);
    });
  });

  describe('recommendChunkSize', () => {
    it('should recommend chunk size for ASL screening', () => {
      const size = recommendChunkSize(1000, 7.2, 900);  // 15分éŸ
      expect(size).toBe(125);  // 125 * 7.2 = 900ç§?    });

    it('should not exceed max limit', () => {
      const size = recommendChunkSize(10000, 0.1, 900);
      expect(size).toBe(1000);  // 最�000
    });

    it('should not go below min limit', () => {
      const size = recommendChunkSize(100, 100, 900);
      expect(size).toBe(10);  // 最�0
    });
  });
});

4.5 Phase 5:实现æ­ç¹ç»­ä¼ æœºåˆ¶ï¼ˆDay 5ï¼?天)âœ?新增

任务5.1ï¼šæ´æ°Prisma Schema

// æ‡ä»¶ï¼šprisma/schema.prisma

model AslScreeningTask {
  // ... 现有字段 ...
  
  // �新增字段:任务拆�  totalBatches      Int      @default(1) @map("total_batches")
  processedBatches  Int      @default(0) @map("processed_batches")
  currentBatchIndex Int      @default(0) @map("current_batch_index")
  
  // �新增字段:断点续�  currentIndex      Int       @default(0) @map("current_index")
  lastCheckpoint    DateTime? @map("last_checkpoint")
  checkpointData    Json?     @map("checkpoint_data")
}
# 生æˆ<C3A6>è¿<C3A8>ç§»
cd backend
npx prisma migrate dev --name add_task_split_checkpoint

# 验è¯<C3A8>
npx prisma migrate status

*任务5.2:åˆå»ºæ­ç¹ç»­ä¼ æœ<EFBFBD>åŠ?

// æ‡ä»¶ï¼šbackend/src/common/jobs/CheckpointService.ts(æ°å»ºï¼‰

import { prisma } from '../../config/database.js';
import { logger } from '../logging/index.js';

export interface CheckpointData {
  lastProcessedId?: string;
  lastProcessedIndex: number;
  batchProgress: number;
  metadata?: Record<string, any>;
}

/**
 * æ­ç¹ç»­ä¼ æœ<C3A6>务
 * 
 * æ<><C3A6>ä¾ä»»åŠ¡æ­ç¹çš„ä¿<C3A4>å­˜ã€<C3A3>读å<C2BB>åŒæ<C592>¢å¤<C3A5>功能
 */
export class CheckpointService {
  /**
   * ä¿<C3A4>å­˜æ­ç¹
   * 
   * @param taskId 任务ID
   * @param currentIndex 当å‰<C3A5>索引
   * @param data æ­ç¹æ•°æ<C2B0>®
   */
  static async saveCheckpoint(
    taskId: string,
    currentIndex: number,
    data?: Partial<CheckpointData>
  ): Promise<void> {
    try {
      const checkpointData: CheckpointData = {
        lastProcessedIndex: currentIndex,
        batchProgress: 0,
        ...data
      };

      await prisma.aslScreeningTask.update({
        where: { id: taskId },
        data: {
          currentIndex,
          lastCheckpoint: new Date(),
          checkpointData: checkpointData as any
        }
      });

      logger.debug('[Checkpoint] æ­ç¹å·²ä¿<C3A4>å­?, {
        taskId,
        currentIndex,
        checkpoint: checkpointData
      });

    } catch (error) {
      logger.error('[Checkpoint] ä¿<EFBFBD>å­˜æ­ç¹å¤±è´¥', { taskId, currentIndex, error });
      // ä¸<C3A4>æŠå‡ºé”™è¯¯ï¼Œé<C592>¿å…<C3A5>å½±å“<C3A5>主æµ<C3A6>ç¨?    }
  }

  /**
   * 读å<C2BB>æ­ç¹
   * 
   * @param taskId 任务ID
   * @returns æ­ç¹ç´¢å¼•åŒæ•°æ<C2B0>?   */
  static async loadCheckpoint(taskId: string): Promise<{
    startIndex: number;
    data: CheckpointData | null;
  }> {
    try {
      const task = await prisma.aslScreeningTask.findUnique({
        where: { id: taskId },
        select: {
          currentIndex: true,
          lastCheckpoint: true,
          checkpointData: true
        }
      });

      if (!task) {
        return { startIndex: 0, data: null };
      }

      const startIndex = task.currentIndex || 0;
      const data = task.checkpointData as CheckpointData | null;

      if (startIndex > 0) {
        logger.info('[Checkpoint] æ­ç¹æ<EFBFBD>¢å¤<EFBFBD>', {
          taskId,
          startIndex,
          lastCheckpoint: task.lastCheckpoint,
          data
        });
      }

      return { startIndex, data };

    } catch (error) {
      logger.error('[Checkpoint] 读å<EFBFBD>æ­ç¹å¤±è´¥', { taskId, error });
      return { startIndex: 0, data: null };
    }
  }

  /**
   * 清除断点
   * 
   * @param taskId 任务ID
   */
  static async clearCheckpoint(taskId: string): Promise<void> {
    try {
      await prisma.aslScreeningTask.update({
        where: { id: taskId },
        data: {
          currentIndex: 0,
          lastCheckpoint: null,
          checkpointData: null
        }
      });

      logger.debug('[Checkpoint] 断点已清�, { taskId });

    } catch (error) {
      logger.error('[Checkpoint] 清除断点失败', { taskId, error });
    }
  }

  /**
   * 批é‡<C3A9>ä¿<C3A4>å­˜è¿åº¦ï¼ˆåŒ…å<E280A6>«æ­ç¹ï¼‰
   * 
   * @param taskId 任务ID
   * @param updates æ´æ°æ•°æ<C2B0>®
   */
  static async updateProgress(
    taskId: string,
    updates: {
      processedItems?: number;
      successItems?: number;
      failedItems?: number;
      conflictItems?: number;
      currentIndex?: number;
      checkpointData?: Partial<CheckpointData>;
    }
  ): Promise<void> {
    try {
      const data: any = {};

      // 累加字段
      if (updates.processedItems !== undefined) {
        data.processedItems = { increment: updates.processedItems };
      }
      if (updates.successItems !== undefined) {
        data.successItems = { increment: updates.successItems };
      }
      if (updates.failedItems !== undefined) {
        data.failedItems = { increment: updates.failedItems };
      }
      if (updates.conflictItems !== undefined) {
        data.conflictItems = { increment: updates.conflictItems };
      }

      // 断点字段
      if (updates.currentIndex !== undefined) {
        data.currentIndex = updates.currentIndex;
        data.lastCheckpoint = new Date();
      }
      if (updates.checkpointData) {
        data.checkpointData = updates.checkpointData;
      }

      await prisma.aslScreeningTask.update({
        where: { id: taskId },
        data
      });

      logger.debug('[Checkpoint] 进度已更�, { taskId, updates });

    } catch (error) {
      logger.error('[Checkpoint] 更新进度失败', { taskId, error });
    }
  }
}

*任务5.3:更新导�

// æ‡ä»¶ï¼šbackend/src/common/jobs/index.ts

export { CheckpointService } from './CheckpointService.js';  // �新增

4.6 Phase 6:改造业务代ç <C3A7>(Day 6-7ï¼?天)âœ?*已更æ–?

*任务6.1:改造ASLç­é€‰æœ<EFBFBD>务(âœ?完整版:拆分+断点+队列ï¼?

// æ‡ä»¶ï¼šbackend/src/modules/asl/services/screeningService.ts

import { prisma } from '../../../config/database.js';
import { logger } from '../../../common/logging/index.js';
import { 
  jobQueue, 
  splitIntoChunks, 
  recommendChunkSize,
  CheckpointService  // �新增
} from '../../../common/jobs/index.js';
import { llmScreeningService } from './llmScreeningService.js';

/**
 * å<>¯åЍç­é€‰ä»»åŠ¡ï¼ˆæ”¹é€ å<C2A0>Žï¼šä½¿ç”¨é˜Ÿåˆ—)
 */
export async function startScreeningTask(projectId: string, userId: string) {
  try {
    logger.info('Starting screening task', { projectId, userId });

    // 1. æ£€æŸ¥é¡¹ç®æ˜¯å<C2AF>¦å­˜åœ?    const project = await prisma.aslScreeningProject.findFirst({
      where: { id: projectId, userId },
    });

    if (!project) {
      throw new Error('Project not found');
    }

    // 2. 获å<C2B7>该项ç®çš„æ‰€æœ‰æ‡çŒ?    const literatures = await prisma.aslLiterature.findMany({
      where: { projectId },
    });

    if (literatures.length === 0) {
      throw new Error('No literatures found in project');
    }

    logger.info('Found literatures for screening', { 
      projectId, 
      count: literatures.length 
    });

    // 3. åˆå»ºç­é€‰ä»»åŠ¡ï¼ˆæ•°æ<C2B0>®åº“记录)
    const task = await prisma.aslScreeningTask.create({
      data: {
        projectId,
        taskType: 'title_abstract',
        status: 'pending',  // â†?åˆ<C3A5>å§çжæ€<C3A6>改为pending
        totalItems: literatures.length,
        processedItems: 0,
        successItems: 0,
        failedItems: 0,
        conflictItems: 0,
        startedAt: new Date(),
      },
    });

    logger.info('Screening task created', { taskId: task.id });

    // 4. âœ?推é€<C3A9>åˆ°é˜Ÿåˆ—ï¼ˆå¼æ­¥å¤„ç<E2809E>†ï¼Œä¸<C3A4>阻塞请æ±ï¼‰
    await jobQueue.push('asl:title-screening', {
      taskId: task.id,
      projectId,
      literatureIds: literatures.map(lit => lit.id),
    });

    logger.info('Task pushed to queue', { taskId: task.id });

    // 5. ç«å<E280B9>³è¿”åžä»»åŠ¡ID(å‰<C3A5>端å<C2AF>¯ä»¥è½®è¯¢è¿åº¦ï¼‰
    return task;
    
  } catch (error) {
    logger.error('Failed to start screening task', { error, projectId });
    throw error;
  }
}

/**
 * âœ?注册队列Worker(在应用å<C2A8>¯åŠ¨æ—¶è°ƒç”¨ï¼‰
 * 
 * 这个函数需è¦<C3A8>在 backend/src/index.ts 中调ç”? */
export function registerScreeningWorkers() {
  // 注册标题æ˜è¦<C3A8>ç­é€‰Worker
  jobQueue.process('asl:title-screening', async (job) => {
    const { taskId, projectId, literatureIds } = job.data;
    
    logger.info('å¼€å§å¤„ç<E2809E>†æ ‡é¢˜æ˜è¦<C3A8>ç­é€?, { 
      taskId, 
      total: literatureIds.length 
    });

    try {
      // æ´æ°ä»»åŠ¡çŠ¶æ€<C3A6>为running
      await prisma.aslScreeningTask.update({
        where: { id: taskId },
        data: { status: 'running' }
      });

      // 获å<C2B7>项ç®çš„PICOS标准
      const project = await prisma.aslScreeningProject.findUnique({
        where: { id: projectId },
      });

      if (!project) {
        throw new Error('Project not found');
      }

      const rawPicoCriteria = project.picoCriteria as any;
      const picoCriteria = {
        P: rawPicoCriteria?.P || rawPicoCriteria?.population || '',
        I: rawPicoCriteria?.I || rawPicoCriteria?.intervention || '',
        C: rawPicoCriteria?.C || rawPicoCriteria?.comparison || '',
        O: rawPicoCriteria?.O || rawPicoCriteria?.outcome || '',
        S: rawPicoCriteria?.S || rawPicoCriteria?.studyDesign || '',
      };

      // é€<C3A9>个处ç<E2809E>†æ‡çŒ®
      let successCount = 0;
      let failedCount = 0;
      let conflictCount = 0;

      for (let i = 0; i < literatureIds.length; i++) {
        const literatureId = literatureIds[i];
        
        try {
          // 获å<C2B7>æ‡çŒ®ä¿¡æ<C2A1>¯
          const literature = await prisma.aslLiterature.findUnique({
            where: { id: literatureId },
          });

          if (!literature) {
            failedCount++;
            continue;
          }

          // 调用LLMç­é€‰æœ<C3A6>åŠ?          const screeningResult = await llmScreeningService.screenSingleLiterature(
            literature.title || '',
            literature.abstract || '',
            picoCriteria,
            projectId
          );

          // åˆ¤æ­æ˜¯å<C2AF>¦å†²çª<C3A7>
          const isConflict = screeningResult.deepseekDecision !== screeningResult.qwenDecision;

          // ä¿<C3A4>å­˜ç­é€‰ç»“æž?          await prisma.aslScreeningResult.create({
            data: {
              literatureId,
              projectId,
              taskId,
              taskType: 'title_abstract',
              
              deepseekDecision: screeningResult.deepseekDecision,
              deepseekReason: screeningResult.deepseekReason,
              deepseekRawResponse: screeningResult.deepseekRawResponse,
              
              qwenDecision: screeningResult.qwenDecision,
              qwenReason: screeningResult.qwenReason,
              qwenRawResponse: screeningResult.qwenRawResponse,
              
              finalDecision: screeningResult.finalDecision,
              hasConflict: isConflict,
              manualReviewRequired: isConflict,
            },
          });

          if (isConflict) {
            conflictCount++;
          } else {
            successCount++;
          }

        } catch (error) {
          logger.error('处ç<EFBFBD>†æ‡çŒ®å¤±è´¥', { literatureId, error });
          failedCount++;
        }

        // æ´æ°è¿åº¦ï¼ˆæ¯<C3A6>处ç<E2809E>†10ç¯‡æˆæœ€å<E282AC>Žä¸€ç¯‡æ—¶æ´æ°ï¼?        if ((i + 1) % 10 === 0 || i === literatureIds.length - 1) {
          await prisma.aslScreeningTask.update({
            where: { id: taskId },
            data: {
              processedItems: i + 1,
              successItems: successCount,
              failedItems: failedCount,
              conflictItems: conflictCount,
            }
          });
        }
      }

      // 标记任务完æˆ<C3A6>
      await prisma.aslScreeningTask.update({
        where: { id: taskId },
        data: {
          status: 'completed',
          completedAt: new Date(),
          processedItems: literatureIds.length,
          successItems: successCount,
          failedItems: failedCount,
          conflictItems: conflictCount,
        }
      });

      logger.info('标题æ˜è¦<EFBFBD>ç­é€‰å®Œæˆ?, { 
        taskId, 
        total: literatureIds.length,
        success: successCount,
        failed: failedCount,
        conflict: conflictCount
      });

      return {
        success: true,
        processed: literatureIds.length,
        successCount,
        failedCount,
        conflictCount,
      };

    } catch (error) {
      // 任务失败,更新状�      await prisma.aslScreeningTask.update({
        where: { id: taskId },
        data: {
          status: 'failed',
          completedAt: new Date(),
        }
      });

      logger.error('标题æ˜è¦<C3A8>ç­é€‰å¤±è´?, { taskId, error });
      throw error;
    }
  });

  logger.info('�ASL筛选Worker已注�);
}

*任务4.2ï¼šæ´æ°åº”用入å<EFBFBD>£ï¼ˆæ³¨å†ŒWorkersï¼?

// æ‡ä»¶ï¼šbackend/src/index.ts

import Fastify from 'fastify';
import { logger } from './common/logging/index.js';
import { startCacheCleanupTask } from './common/cache/index.js';  // �新增
import { registerScreeningWorkers } from './modules/asl/services/screeningService.js';  // �新增

const app = Fastify({ logger: false });

// ... 注册路由�...

// âœ?å<>¯åЍæœ<C3A6>务器å<C2A8>Žï¼Œæ³¨å†Œé˜Ÿåˆ—WorkersåŒå®šæ—¶ä»»åŠ?app.listen({ port: 3001, host: '0.0.0.0' }, async (err, address) => {
  if (err) {
    logger.error('Failed to start server', { error: err });
    process.exit(1);
  }

  logger.info(`Server listening on ${address}`);

  // âœ?å<>¯åŠ¨ç¼“å­˜å®šæ—¶æ¸…ç<E280A6>  if (process.env.CACHE_TYPE === 'postgres') {
    startCacheCleanupTask();
  }

  // �注册队列Workers
  if (process.env.QUEUE_TYPE === 'pgboss') {
    try {
      registerScreeningWorkers();  // ASL模å<C2A1>      // registerDataCleaningWorkers();  // DC模å<C2A1>—(待实现ï¼?      // registerStatisticalWorkers();   // SSA模å<C2A1>—(待实现ï¼?    } catch (error) {
      logger.error('Failed to register workers', { error });
    }
  }
});

// 优雅关闭
process.on('SIGTERM', async () => {
  logger.info('SIGTERM received, closing server gracefully...');
  await app.close();
  process.exit(0);
});

任务4.3:DC模å<EFBFBD>—改造(次优先)

// æ‡ä»¶ï¼šbackend/src/modules/dc/tool-b/services/MedicalRecordExtractionService.ts
// (仅示ä¾ï¼Œå®žé™…æ ¹æ<C2B9>®éœ€æ±å®žçŽ°ï¼‰

import { jobQueue } from '../../../../common/jobs/index.js';

export function registerMedicalExtractionWorkers() {
  jobQueue.process('dc:medical-extraction', async (job) => {
    const { taskId, recordIds } = job.data;
    
    // 批é‡<C3A9>æ<EFBFBD><C3A6>å<EFBFBD>病历
    for (const recordId of recordIds) {
      await extractSingleRecord(recordId);
      
      // 更新进度
      // ...
    }
    
    return { success: true };
  });
}

4.5 Phase 5:æµè¯•验è¯<C3A8>(Day 6ï¼?天)

*任务5.1:å<EFBFBD>•å…ƒæµè¯?

// æ‡ä»¶ï¼šbackend/tests/common/cache/PostgresCacheAdapter.test.ts

import { describe, it, expect, beforeEach } from 'vitest';
import { PostgresCacheAdapter } from '../../../src/common/cache/PostgresCacheAdapter.js';
import { prisma } from '../../../src/config/database.js';

describe('PostgresCacheAdapter', () => {
  let cache: PostgresCacheAdapter;

  beforeEach(async () => {
    cache = new PostgresCacheAdapter();
    // 清空æµè¯•æ•°æ<C2B0>®
    await prisma.appCache.deleteMany({});
  });

  it('should set and get cache', async () => {
    await cache.set('test:key', { value: 'hello' }, 60);
    const result = await cache.get('test:key');
    expect(result).toEqual({ value: 'hello' });
  });

  it('should return null for expired cache', async () => {
    await cache.set('test:key', { value: 'hello' }, 1);  // 1ç§è¿‡æœ?    await new Promise(resolve => setTimeout(resolve, 1100));  // 等待1.1ç§?    const result = await cache.get('test:key');
    expect(result).toBeNull();
  });

  it('should delete cache', async () => {
    await cache.set('test:key', { value: 'hello' }, 60);
    await cache.delete('test:key');
    const result = await cache.get('test:key');
    expect(result).toBeNull();
  });

  // ... 更多测试 ...
});

*任务5.2ï¼šé†æˆ<EFBFBD>æµè¯?

# 1. å<>¯åŠ¨æœ¬åœ°Postgres
docker start ai-clinical-postgres

# 2. 设置环境å<C692>˜é‡<C3A9>
export CACHE_TYPE=postgres
export QUEUE_TYPE=pgboss
export DATABASE_URL=postgresql://postgres:123456@localhost:5432/aiclincial

# 3. è¿<C3A8>行è¿<C3A8>ç§»
cd backend
npx prisma migrate deploy

# 4. å<>¯åŠ¨åº”ç”¨
npm run dev

# 应该看到日志�# [CacheFactory] 使用PostgresCacheAdapter
# [JobFactory] 使用PgBossQueue
# [PgBoss] 队列已å<C2B2>¯åŠ?# [PostgresCache] 定时清ç<E280A6>†ä»»åС已å<C2B2>¯åŠ?# âœ?ASLç­é€‰Worker已注å†?```

#### **任务5.3:功能测�*

```bash
# æµè¯•1:缓存功èƒ?curl -X POST http://localhost:3001/api/v1/asl/projects/:projectId/screening

# 观察日志�# [PostgresCache] 缓存命中 { key: 'asl:llm:...' }

# æµè¯•2:队列功èƒ?# æ<><C3A6>交任务 â†?è§å¯Ÿä»»åŠ¡çŠ¶æ€<C3A6>å<EFBFBD>˜åŒ?# pending â†?running â†?completed

# æµè¯•3:实ä¾é‡<C3A9>å<EFBFBD>¯æ<C2AF>¢å¤?# æ<><C3A6>交任务 â†?等待处ç<E2809E>†åˆ?0% â†?Ctrl+Cå<43>œæ­¢ â†?é‡<C3A9>æ°å<C2B0>¯åЍ
# 任务应该自动æ<C2A8>¢å¤<C3A5>å¹¶ç»§ç»?```

---

## 5. 优先级与ä¾<C3A4>èµå…³ç³»

### 5.1 改造优先级矩阵(✨ 已更新)

| 模å<C2A1>| 优先çº?| 工作é‡?| 业务价å€?| 风险 | ä¾<C3A4>èµ | 状æ€?|
|------|--------|--------|---------|------|------|------|
| **环境准备** | P0 | 0.5�| - | �| �| �待开�|
| **PostgresCacheAdapter** | P0 | 0.5å¤?| é™<C3A9>低LLMæˆ<C3A6>本50% | ä½?| 环境准备 | â¬?å¾…å¼€å§?|
| **PgBossQueue** | P0 | 2å¤?| 长任务å<C2A1>¯é<C2AF> æ€?| ä¸?| 环境准备 | â¬?å¾…å¼€å§?|
| **任务拆分机制** | P0 | 1å¤?| 任务æˆ<C3A6>功çŽ?> 99% | ä¸?| PgBossQueue | â¬?å¾…å¼€å§?|
| **断点续传机制** | P0 | 1å¤?| 容错性æ<C2A7><C3A6>å<EFBFBD>?0å€?| ä½?| PgBossQueue | â¬?å¾…å¼€å§?|
| **ASLç­é€‰æ”¹é€?* | P0 | 1.5å¤?| 用户核心功能 | ä¸?| 任务拆分+断点 | â¬?å¾…å¼€å§?|
| **DCæ<43><C3A6>å<EFBFBD>改é€?* | P1 | 1å¤?| 用户核心功能 | ä¸?| 任务拆分+断点 | â¬?å¾…å¼€å§?|
| **æµè¯•验è¯<C3A8>** | P0 | 1.5å¤?| ä¿<C3A4>è¯<C3A8>è´¨é‡<C3A9> | ä½?| 所有改é€?| â¬?å¾…å¼€å§?|
| **SAE部署** | P1 | 0.5å¤?| 生产就绪 | ä¸?| æµè¯•验è¯<C3A8> | â¬?å¾…å¼€å§?|

**总工作é‡<C3A9>ï¼?* 9天(比V1.0增加2天,增加任务æ†åˆ†åŒæ­ç¹ç»­ä¼ ï¼‰

### 5.2 ä¾<C3A4>èµå…³ç³»å¾ï¼ˆâœ?已更新)

Day 1: 环境准备 ────────────────â”? â”?Day 1: PostgresCache ──────────â”? â”?Day 2-3: PgBossQueue ───────────â”? â”?Day 4: 任务拆分机制 ─────────────â”? ├─â†?Day 6-7: ASLç­é€‰æ”¹é€?──â”?Day 5: 断点续传机制 ─────────────â”? â”? â”? â”?Day 7: DCæ<43><C3A6>å<EFBFBD>改é€?──────────────â”? â”? â”? ├─â†?Day 8-9: 测试 + 部署


### 5.3 å<>¯å¹¶è¡Œå·¥ä½œï¼ˆâœ?已更新)

Day 1(并行)ï¼?├─ 环境准备(上å<C5A0>ˆï¼‰ └─ PostgresCache实现(ä¸å<E280B9>ˆï¼‰

Day 2-3(串行)ï¼?└─ PgBossQueue实现(必须等环境准备完æˆ<C3A6>ï¼? Day 4-5(å<CB86>¯å¹¶è¡Œï¼‰ï¼š ├─ 任务拆分机制(工具函数) └─ æ­ç¹ç»­ä¼ æœºåˆ¶ï¼ˆæ•°æ<C2B0>®åº“字段ï¼? Day 6-7(串行)ï¼?├─ ASLç­é€‰æ”¹é€ ï¼ˆä½¿ç”¨æ†åˆ†+断点ï¼?└─ DCæ<43><C3A6>å<EFBFBD>改造(å¤<C3A5>用æ†åˆ†+断点ï¼? Day 8-9(并行)ï¼?├─ æµè¯•验è¯<C3A8> └─ 文档完善


---

## 6. æµè¯•验è¯<C3A8>æ¹æ¡ˆï¼ˆâœ¨ 已更新)

### 6.1 æµè¯•清å<E280A6>•

#### **功能测试(基础�*
```bash
�缓存读写正常
âœ?缓存过期自动清ç<E280A6>†ï¼ˆæ¯<C3A6>分éŸ1000æ<30>¡ï¼‰
âœ?缓存多实ä¾å…±äº«ï¼ˆå®žä¾A写â†å®žä¾B读)
âœ?队列任务入队(pg-bossï¼?âœ?队列任务处ç<E2809E>†ï¼ˆWorkerï¼?âœ?队列任务é‡<C3A9>试(模æŸå¤±è´¥ï¼Œ3次é‡<C3A9>试)
âœ?队列实ä¾é‡<C3A9>å<EFBFBD>¯æ<C2AF>¢å¤<C3A5>

任务拆分测试 �新增

�任务拆分工具函数正确�  - splitIntoChunks([1..100], 30) �4批(30+30+30+10�  - recommendChunkSize(1000, 7.2, 900) �125

�拆分任务入队
  - 1000篇æ‡çŒ?â†?10批次,æ¯<C3A6>æ‰?00ç¯?  - 验è¯<C3A8>æ•°æ<C2B0>®åº“:totalBatches=10, processedBatches=0

âœ?批次并行处ç<E2809E>†
  - 多个Workerå<72>Œæ—¶å¤„ç<E2809E>†ä¸<C3A4>å<EFBFBD>Œæ‰¹æ¬¡
  - 验è¯<C3A8>无冲çª<C3A7>(SKIP LOCKEDï¼?
âœ?批次失败é‡<C3A9>试
  - 模拟ç¬?批失è´?â†?自动é‡<C3A9>试
  - 其仿‰¹æ¬¡ä¸<C3A4>å<EFBFBD>—å½±å“<C3A5>

断点续传测试 �新增

âœ?æ­ç¹ä¿<C3A4>å­˜
  - 处ç<E2809E>†åˆ°ç¬¬50é¡?â†?ä¿<C3A4>å­˜æ­ç¹
  - 验è¯<C3A8>æ•°æ<C2B0>®åº“:currentIndex=50, lastCheckpointæ´æ°

âœ?æ­ç¹æ<C2B9>¢å¤<C3A5>
  - 任务中æ­ï¼ˆCtrl+Cï¼?  - é‡<C3A9>å<EFBFBD>¯æœ<C3A6>务 â†?从第50项继ç»?  - 验è¯<C3A8>:å‰<C3A5>50项ä¸<C3A4>é‡<C3A9>å¤<C3A5>处ç<E2809E>†

âœ?批次级断ç‚?  - 10批次任务,完æˆ<C3A6>å‰<C3A5>3æ‰?  - 实ä¾é‡<C3A9>å<EFBFBD>¯ â†?从第4批继ç»?  - 验è¯<C3A8>:processedBatches=3, currentBatchIndex=3

*长时间任务测è¯? 🔴 é‡<EFBFBD>ç¹

âœ?1000篇æ‡çŒ®ç­é€‰ï¼ˆçº?å°<C3A5>æ—¶ï¼?  - 拆分æˆ?0批,æ¯<C3A6>批12分éŸ
  - æˆ<C3A6>功çŽ?> 99%
  - 验è¯<C3A8>è¿åº¦æ´æ°ï¼ˆæ¯<C3A6>10篇)

âœ?10000篇æ‡çŒ®ç­é€‰ï¼ˆçº?0å°<C3A5>æ—¶ï¼?  - 拆分æˆ?00批,æ¯<C3A6>批12分éŸ
  - 10个Worker并行 â†?2å°<C3A5>时完æˆ<C3A6>
  - æˆ<C3A6>功çŽ?> 99.5%

âœ?实ä¾é‡<C3A9>å<EFBFBD>¯æ<C2AF>¢å¤<C3A5>(关键æµè¯•)
  - å<>¯åŠ¨ä»»åŠ¡ â†?等待50% â†?å<>œæ­¢æœ<C3A6>务(Ctrl+Cï¼?  - é‡<C3A9>å<EFBFBD>¯æœ<C3A6>务 â†?任务自动æ<C2A8>¢å¤<C3A5>
  - 验è¯<C3A8>:从50%继续,ä¸<C3A4>ä»?%å¼€å§?  - 预期:总耗时 â‰?原计划时é—?× 1.05(误å·?%内)

性能测试

âœ?缓存读å<C2BB>延迟 < 5ms(P99ï¼?âœ?缓存写入延迟 < 10ms(P99ï¼?âœ?队列å<E28094>žå<C5BE><C3A5>é‡?> 100任务/å°<C3A5>æ—¶
âœ?æ­ç¹ä¿<C3A4>存延迟 < 20ms
âœ?批次切æ<E280A1>¢å»¶è¿Ÿ < 100ms

故障测试(增强)

âœ?实ä¾é”€æ¯<C3A6>(SAE缩容ï¼?  - 正在处ç<E2809E>†ä»»åŠ¡ â†?实例销æ¯?  - 等待10åˆ†éŸ â†?新实例接ç®?  - 任务从æ­ç¹æ<C2B9>¢å¤?
âœ?æ•°æ<C2B0>®åº“连接æ­å¼€
  - 处ç<E2809E>†ä»»åŠ¡ä¸?â†?断开连接
  - 自动é‡<C3A9>连 â†?继续处ç<E2809E>†

âœ?任务处ç<E2809E>†å¤±è´¥
  - 模æŸLLMè¶…æ—¶
  - 自动é‡<C3A9>试3æ¬?  - 失败å<C2A5>Žæ ‡è®°ä¸ºfailed

âœ?Postgres慢查è¯?  - æ¨¡æŸæ•°æ<C2B0>®åº“æ…¢ï¼? 5ç§ï¼‰
  - 任务ä¸<C3A4>失败,等待完æˆ<C3A6>

âœ?å¹¶å<C2B6>冲çª<C3A7>
  - 2个Worker领å<E280A0>å<E28093>Œä¸€æ‰¹æ¬¡
  - pg-boss SKIP LOCKED机制
  - 验è¯<C3A8>:å<C5A1>ªæœ?个Worker处ç<E2809E>†

âœ?å<>å¸ƒæ´æ°ï¼ˆç”Ÿäº§åœºæ™¯ï¼‰
  - 15:00å<30>å¸ƒæ´æ°
  - 正在执行çš?批任åŠ?  - 实ä¾é‡<C3A9>å<EFBFBD>¯ â†?5批任务自动æ<C2A8>¢å¤?```

### 6.2 测试脚本

#### **æµè¯•1:完整æµ<C3A6>ç¨ï¼ˆ1000篇æ‡çŒ®ï¼‰**

```bash
# 1. 准备æµè¯•æ•°æ<C2B0>®
cd backend
npm run test:seed -- --literatures=1000

# 2. å<>¯åЍæœ<C3A6>务
npm run dev

# 3. æ<><C3A6>交任务
curl -X POST http://localhost:3001/api/v1/asl/projects/:projectId/screening \
  -H "Content-Type: application/json"

# 4. 观察日志
# [TaskSplit] 任务æ†åˆ†å®Œæˆ<C3A6> { total: 1000, chunkSize: 100, chunks: 10 }
# [PgBoss] 任务入队 { type: 'asl:title-screening-batch', jobId: '1' }
# ...
# [PgBoss] 批次处ç<E2809E>†å®Œæˆ<C3A6> { taskId, batchIndex: 0, progress: '1/10' }

# 5. 查询进度
curl http://localhost:3001/api/v1/asl/tasks/:taskId

# 预期å“<C3A5>应ï¼?# {
#   "id": "task_123",
#   "status": "running",
#   "totalItems": 1000,
#   "processedItems": 350,
#   "totalBatches": 10,
#   "processedBatches": 3,
#   "progress": 0.35
# }

*æµè¯•2:æ­ç¹æ<EFBFBD>¢å¤<EFBFBD>(关键æµè¯•ï¼?

# 1. å<>¯åŠ¨ä»»åŠ¡
curl -X POST http://localhost:3001/api/v1/asl/projects/:projectId/screening

# 2. 等待处ç<E2809E>†åˆ?0%
# è§å¯Ÿæ—¥å¿—:processedItems: 500

# 3. 强制å<C2B6>œæ­¢æœ<C3A6>务
# Ctrl+C �kill -9 <pid>

# 4. é‡<C3A9>æ°å<C2B0>¯åЍæœ<C3A6>务
npm run dev

# 5. 观察日志
# [Checkpoint] æ­ç¹æ<C2B9>¢å¤<C3A5> { taskId, startIndex: 500 }
# [PgBoss] å¼€å§å¤„ç<E2809E>†ä»»åŠ?{ batchIndex: 5 }  â†?从第6批继ç»?
# 6. 验è¯<C3A8>最终结æž?# 总耗时应该约等äº?2å°<C3A5>æ—¶ + é‡<C3A9>å<EFBFBD>¯æ—¶é—´ï¼? 5分éŸï¼?# ä¸<C3A4>应该是 4å°<C3A5>时(从头开å§ï¼‰

æµè¯•3:并å<EFBFBD>处ç<EFBFBD>†ï¼ˆ10000篇)

# 1. 准备大数æ<C2B0>®é
npm run test:seed -- --literatures=10000

# 2. å<>¯åŠ¨å¤šä¸ªWorker实ä¾ï¼ˆæ¨¡æŸSAE多实ä¾ï¼‰
# Terminal 1
npm run dev -- --port=3001

# Terminal 2
npm run dev -- --port=3002

# Terminal 3
npm run dev -- --port=3003

# 3. æ<><C3A6>交任务(任æ„<C3A6>一个实ä¾ï¼‰
curl -X POST http://localhost:3001/api/v1/asl/projects/:projectId/screening

# 4. 观察三个实例的日å¿?# 应该看到ï¼?个实ä¾å<E280B9>Œæ—¶å¤„ç<E2809E>†ä¸<C3A4>å<EFBFBD>Œæ‰¹æ¬?# Worker1: 处ç<E2809E>†æ‰¹æ¬¡ 0, 3, 6, 9, ...
# Worker2: 处ç<E2809E>†æ‰¹æ¬¡ 1, 4, 7, 10, ...
# Worker3: 处ç<E2809E>†æ‰¹æ¬¡ 2, 5, 8, 11, ...

# 5. 验è¯<C3A8>完æˆ<C3A6>æ—¶é—´
# 100批次 / 3个Worker â‰?33.3批轮æ¬?× 12åˆ†éŸ â‰?6.6å°<EFBFBD>æ—¶
# å<>•Worker需è¦<C3A8>:100æ‰?× 12åˆ†éŸ = 20å°<C3A5>æ—¶
# 加速比�0 / 6.6 �3�```

### 6.3 监控指标(✨ 已更新)

```typescript
// 缓存监控
- cache_hit_rate: 命中�(目标 > 60%)
- cache_total_count: 总数
- cache_expired_count: 过期数é‡<C3A9>
- cache_by_module: å<>„模å<C2A1>—分å¸?
// 队列监控(基础ï¼?- queue_pending_count: 待处ç<E2809E>†ä»»åŠ?- queue_processing_count: 处ç<E2809E>†ä¸­ä»»åŠ?- queue_completed_count: 完æˆ<C3A6>任务
- queue_failed_count: 失败任务
- queue_avg_duration: å¹³å<C2B3>‡è€—æ—¶

// 任务拆分监控 �新增
- task_total_batches: 总批次数
- task_processed_batches: 已完æˆ<C3A6>批次数
- task_batch_success_rate: 批次æˆ<C3A6>功çŽ?(目标 > 99%)
- task_avg_batch_duration: å¹³å<C2B3>‡æ‰¹æ¬¡è€—æ—¶

// 断点续传监控 �新增
- checkpoint_save_count: æ­ç¹ä¿<C3A4>存次数
- checkpoint_restore_count: æ­ç¹æ<C2B9>¢å¤<C3A5>次数
- checkpoint_save_duration: ä¿<C3A4>存耗时 (目标 < 20ms)
- task_recovery_success_rate: æ<>¢å¤<C3A5>æˆ<C3A6>功çŽ?(目标 100%)

6.4 æˆ<C3A6>功标准(✨ 已更新)

基础功能ï¼?âœ?所有å<E280B0>•å…ƒæµè¯•通过
âœ?æ‰€æœ‰é†æˆ<C3A6>æµè¯•通过
�缓存命中�> 60%
âœ?LLM API调用次数ä¸é™<C3A9> > 40%

任务å<C2A1>¯é<C2AF> æ€?🔴 关键ï¼?âœ?1000篇æ‡çŒ®ç­é€‰æˆ<C3A6>功率 > 99%
âœ?10000篇æ‡çŒ®ç­é€‰æˆ<C3A6>功率 > 99.5%
âœ?实ä¾é‡<C3A9>å<EFBFBD>¯å<C2AF>Žä»»åŠ¡è‡ªåŠ¨æ<C2A8>¢å¤<C3A5>æˆ<C3A6>功率 100%
âœ?æ­ç¹æ<C2B9>¢å¤<C3A5>å<EFBFBD>Žä¸<C3A4>é‡<C3A9>å¤<C3A5>处ç<E2809E>†å·²å®Œæˆ<C3A6>项
âœ?批次失败å<C2A5>ªéœ€é‡<C3A9>试å<E280A2>•批(ä¸<C3A4>é‡<C3A9>试全部ï¼?
性能指标ï¼?âœ?å<>•批次处ç<E2809E>†æ—¶é—?< 15分éŸ
âœ?æ­ç¹ä¿<C3A4>存延迟 < 20ms
�10个Worker并行加速比 > 8�
生产验è¯<C3A8>ï¼?âœ?生产环境è¿<C3A8>行48å°<C3A5>æ—¶æ— é”™è¯?âœ?处ç<E2809E>†3个完整的1000篇æ‡çŒ®ç­é€‰ä»»åŠ?âœ?至å°1次实ä¾é‡<C3A9>å<EFBFBD>¯æ<C2AF>¢å¤<C3A5>æµè¯•æˆ<C3A6>åŠ?âœ?无用户投诉任务丢å¤?âœ?系统å<C5B8>¯ç”¨æ€?> 99.9%

7. 上线与回�

7.1 上线步骤

# Step 1: æ•°æ<C2B0>®åº“è¿<C3A8>移(生产环境ï¼?npx prisma migrate deploy

# Step 2: æ´æ°SAE环境å<C692>˜é‡<C3A9>
CACHE_TYPE=postgres
QUEUE_TYPE=pgboss

# Step 3: ç<>°åº¦å<C2A6>布ï¼?个实例)
# è§å¯Ÿ24å°<C3A5>æ—¶ï¼ŒçæŽ§æŒ‡æ ‡æ­£å¸?
# Step 4: å…¨é‡<C3A9>å<EFBFBD>布ï¼?-3个实ä¾ï¼‰
# é€<C3A9>步扩容

# Step 5: 清ç<E280A6>†æ—§ä»£ç ?# 移除MemoryQueueç¸å…³ä»£ç <C3A7>(å<CB86>¯é€‰ï¼‰

7.2 回滚方案

# 妿žœå‡ºçŽ°é—®é¢˜ï¼Œç«å<E280B9>³åžæ»šï¼š

# æ¹æ¡ˆ1:环境å<C692>˜é‡<C3A9>åžæ»šï¼ˆæœ€å¿«ï¼‰
CACHE_TYPE=memory
QUEUE_TYPE=memory
# é‡<C3A9>å<EFBFBD>¯åº”用,é™<C3A9>级到内存模å¼<C3A5>

# æ¹æ¡ˆ2:代ç <C3A7>åžæ»?git revert <commit>
# åžæ»šåˆ°æ”¹é€ å‰<C3A5>版本

# æ¹æ¡ˆ3:数æ<C2B0>®åº“åžæ»š
npx prisma migrate down
# 删除 app_cache 表(å<CB86>¯é€‰ï¼‰

7.3 风险预案

风险 概率 å½±å“<EFBFBD> 预案
Postgres性能ä¸<EFBFBD>è¶³ ä½? ä¸? 回滚到内存模å¼?
pg-boss连接失败 ä½? é«? é™<EFBFBD>级到å<EFBFBD>Œæ­¥å¤„ç<EFBFBD>?
缓存数æ<EFBFBD>®è¿‡å¤§ ä½? ä½? 增加清ç<EFBFBD>†é¢çއ
长任务å<EFBFBD>¡æ­? ä½? ä¸? æ‰åЍkill任务

8. æˆ<C3A6>功标准(✨ V2.0æ›´æ–°ï¼?

8.1 技术指�

指标类别 指标 目标å€? è¡¡é‡<EFBFBD>æ¹æ³•
缓存 命中� > 60% 监控统计
æŒ<EFBFBD>ä¹…åŒ? âœ?实ä¾é‡<C3A9>å<EFBFBD>¯ä¸<C3A4>丢å¤? é‡<EFBFBD>å<EFBFBD>¯æµè¯•
多实例共äº? âœ?A写B能读 å¹¶å<EFBFBD>æµè¯•
队列 任务æŒ<EFBFBD>ä¹…åŒ? âœ?实ä¾é”€æ¯<C3A6>ä¸<C3A4>丢失 销æ¯<EFBFBD>æµè¯?
长任务å<EFBFBD>¯é<EFBFBD> æ€? > 99% 1000篇ç­é€?
超长任务å<EFBFBD>¯é<EFBFBD> æ€? > 99.5% 10000篇ç­é€?
拆分 批次æˆ<EFBFBD>功çŽ? > 99% 批次统计
批次耗时 < 15åˆ†éŸ ç›‘æŽ§ç»Ÿè®¡
并行加速比 > 8å€<C3A5>(10 Workerï¼? 对比测试
断点 æ­ç¹ä¿<EFBFBD>存延迟 < 20ms 性能测试
æ<EFBFBD>¢å¤<EFBFBD>æˆ<EFBFBD>功çŽ? 100% é‡<EFBFBD>å<EFBFBD>¯æµè¯•
é‡<EFBFBD>å¤<EFBFBD>处ç<EFBFBD>†çŽ? 0% æ•°æ<EFBFBD>®éªŒè¯<EFBFBD>

8.2 业务指标

指标 改造å‰<EFBFBD> 改造å<EFBFBD>Ž æ”¹è¿›å¹…åº¦
LLM APIæˆ<C3A6>本 基线 -40~60% èŠçœ<EFBFBD>Â¥X/æœ?
*任务æˆ<EFBFBD>功çŽ? 10-30% > 99% æ<EFBFBD><EFBFBD>å<EFBFBD>‡3-10å€?
用户é‡<EFBFBD>å¤<EFBFBD>æ<EFBFBD><EFBFBD>交 å¹³å<EFBFBD>‡3æ¬? < 1.1æ¬? å‡<EFBFBD>å°70%
任务完æˆ<EFBFBD>æ—¶é—´ ä¸<EFBFBD>确定(å<EFBFBD>¯èƒ½å¤±è´¥ï¼? 稳定å<EFBFBD>¯é¢„æµ? 体验æ<EFBFBD><EFBFBD>å<EFBFBD>
*用户满æ„<EFBFBD>åº? 基线 æ˜¾è—æ<EFBFBD><EFBFBD>å<EFBFBD> é—®å<EFBFBD>·è°ƒæŸ¥

8.3 验收清å<E280A6>

Phase 1-3ï¼šåŸºç¡€è®¾æ½ âœ?âœ?PostgresCacheAdapter实现并通过æµè¯•
âœ?PgBossQueue实现并通过æµè¯•
âœ?本地环境验è¯<C3A8>通过

Phase 4-5:高级特���任务拆分工具函数测试通过
âœ?æ­ç¹ç»­ä¼ æœ<C3A6>务æµè¯•通过
âœ?æ•°æ<C2B0>®åº“Schemaè¿<C3A8>ç§»æˆ<C3A6>功

Phase 6-7ï¼šä¸šåŠ¡é†æˆ?âœ?âœ?ASLç­é€‰æœ<C3A6>务改造完æˆ?âœ?DCæ<43><C3A6>å<EFBFBD>æœ<C3A6>务改造完æˆ?âœ?Worker注册æˆ<C3A6>功

关键功能测试 🔴ï¼?âœ?1000篇æ‡çŒ®ç­é€‰ï¼ˆ2å°<C3A5>时)æˆ<C3A6>功率 > 99%
âœ?实ä¾é‡<C3A9>å<EFBFBD>¯æ<C2AF>¢å¤<C3A5>æµè¯•通过ï¼?次)
âœ?æ­ç¹ç»­ä¼ æµè¯•通过(从50%æ<>¢å¤<C3A5>ï¼?âœ?批次并行处ç<E2809E>†æµè¯•通过
âœ?失败é‡<C3A9>试æµè¯•通过

生产环境验è¯<C3A8> 🔴ï¼?âœ?生产环境è¿<C3A8>行48å°<C3A5>时无致å½é”™è¯?âœ?完æˆ<C3A6>至尸ªçœŸå®žç”¨æˆ·ä»»åŠ¡ï¼ˆ1000ç¯?ï¼?âœ?至å°ç»<C3A7>历1次SAE实ä¾é‡<C3A9>å<EFBFBD>¯ï¼Œä»»åŠ¡æˆ<C3A6>功æ<C5B8>¢å¤?âœ?缓存命中çŽ?> 60%
âœ?LLM API调用é‡<C3A9>ä¸é™?> 40%
�无用户投诉任务丢�```

---

## 9. 附录

### 9.1 å<>è€ƒæ‡æ¡?
- [Postgres-Only 全能架构解决方案](./08-Postgres-Only 全能架构解决方案.md)
- [pg-boss 官方文档](https://github.com/timgit/pg-boss)
- [Prisma 多Schema支æŒ<C3A6>](https://www.prisma.io/docs/concepts/components/prisma-schema/multi-schema)

### 9.2 关键代ç <C3A7>ä½<C3A4>置(✨ V2.0æ›´æ–°ï¼?

backend/src/ ├── common/cache/ # 缓存系统 â”? ├── CacheAdapter.ts # âœ?已存åœ?â”? ├── CacheFactory.ts # âš ï¸<C3AF> 需修改ï¼?10行) â”? ├── MemoryCacheAdapter.ts # âœ?已存åœ?â”? ├── RedisCacheAdapter.ts # 🔴 å<> ä½<C3A4>符(ä¸<C3A4>用管) â”? ├── PostgresCacheAdapter.ts # â<>?需新增(300行) â”? └── index.ts # âš ï¸<C3AF> 需修改(导出) â”?├── common/jobs/ # 任务队列 â”? ├── types.ts # âœ?已存åœ?â”? ├── JobFactory.ts # âš ï¸<C3AF> 需修改ï¼?10行) â”? ├── MemoryQueue.ts # âœ?已存åœ?â”? ├── PgBossQueue.ts # â<>?需新增(400行) â”? ├── utils.ts # â<>?需新增(200行)âœ?â”? ├── CheckpointService.ts # â<>?需新增(150行)âœ?â”? └── index.ts # âš ï¸<C3AF> 需修改(导出) â”?├── modules/asl/services/ # ASL业务å±?â”? ├── screeningService.ts # âš ï¸<C3AF> 需改造(~150行改动) â”? └── llmScreeningService.ts # âœ?无需改动 â”?├── modules/dc/tool-b/services/ # DC业务å±?â”? └── (类似ASL,按需改造) â”?├── config/ â”? └── env.ts # âš ï¸<C3AF> 需添加环境å<C692>˜é‡<C3A9> â”?└── index.ts # âš ï¸<C3AF> 需修改(注册Workers + å<>¯åŠ¨æ¸…ç<E280A6>†ï¼? prisma/ ├── schema.prisma # âš ï¸<C3AF> 需修改ï¼?AppCache +字段ï¼?└── migrations/ # 自动生æˆ<C3A6>

tests/ # 测试文件 ├── common/cache/ â”? └── PostgresCacheAdapter.test.ts # â<>?需新增 ├── common/jobs/ â”? ├── PgBossQueue.test.ts # â<>?需新增 â”? ├── utils.test.ts # â<>?需新增 âœ?â”? └── CheckpointService.test.ts # â<>?需新增 âœ?└── modules/asl/ └── screening-integration.test.ts # â<>?需新增

backend/.env # âš ï¸<C3AF> 需修改


**æ‡ä»¶çжæ€<C3A6>说明:**
- �已存�- 无需改动
- âš ï¸<C3AF> 需修改 - å°é‡<C3A9>改动ï¼? 50行)
- â<>?需新增 - 全新文件
- 🔴 å<> ä½<C3A4>ç¬?- 忽略(ä¸<C3A4>å½±å“<C3A5>本次改造)
- âœ?V2.0新增 - 支æŒ<C3A6>æ†åˆ†+断点

**代ç <C3A7>行数统计ï¼?*

总æ°å¢žä»£ç <EFBFBD>:1800è¡?├─ PostgresCacheAdapter.ts: 300è¡?├─ PgBossQueue.ts: 400è¡?├─ utils.ts: 200è¡?âœ?├─ CheckpointService.ts: 150è¡?âœ?├─ screeningService.ts改é€? 200è¡?├─ æµè¯•代ç <C3A7>: 400è¡?└─ å…¶ä»ï¼ˆFactoryã€<C3A3>导出等ï¼? 150è¡? 总修改代ç <C3A7>:100è¡?├─ CacheFactory.ts: 10è¡?├─ JobFactory.ts: 10è¡?├─ index.ts: 20è¡?├─ env.ts: 20è¡?├─ schema.prisma: 40è¡?└─ å<>„处导出: çº?0å¤?```


10. V2.0 vs V1.0 对比

维度 V1.0(原计划ï¼? V2.0(当å‰<EFBFBD>版本) å<EFBFBD>˜åŒ
*工作� 7� 9� +2�
代ç <EFBFBD>行数 ~1000è¡? ~1900è¡? +900è¡?
核心策略 缓存+队列 缓存+队列+拆分+断点 +2个ç­ç•?
*长任务支æŒ? < 4å°<C3A5>æ—¶ ä»»æ„<EFBFBD>时长(æ†åˆ†å<EFBFBD>Žï¼? 质的æ<EFBFBD><EFBFBD>å<EFBFBD>
*任务æˆ<EFBFBD>功çŽ? 85-90% > 99% æ<EFBFBD><EFBFBD>å<EFBFBD>‡10%+
实ä¾é‡<EFBFBD>å<EFBFBD>¯æ<EFBFBD>¢å¤<EFBFBD> 从头开å§? 断点续传 é<EFBFBD>¿å…<EFBFBD>浪费
å¹¶å<EFBFBD>èƒ½åŠ å<EFBFBD>•实ä¾ä¸²è¡? 多实例并è¡? 加速Nå€?
*生产就绪åº? 基本å<EFBFBD>¯ç”¨ 完全就绪 ä¼<EFBFBD>业çº?

为什么增�天?

  • Day 4:任务æ†åˆ†æœºåˆ¶ï¼ˆå·¥å…·å‡½æ•°+测试ï¼?- Day 5:æ­ç¹ç»­ä¼ æœºåˆ¶ï¼ˆæœ<C3A6>务+æ•°æ<C2B0>®åº“)

增加的价值:

  • âœ?长任务å<C2A1>¯é<C2AF> æ€§ä»Ž85% â†?99%+
  • âœ?支æŒ<C3A6>ä»»æ„<C3A6>时长任务(通过æ†åˆ†ï¼?- âœ?实ä¾é‡<C3A9>å<EFBFBD>¯ä¸<C3A4>浪费已处ç<E2809E>†ç»“æžœ
  • âœ?多Worker并行,速度æ<C2A6><C3A6>å<EFBFBD>‡Nå€?- âœ?符å<C2A6>ˆServerless最佳实è·? *结论ï¼? 多花2天,æ<C592>¢å<C2A2>质的飞跃ï¼?*é<>žå¸¸å€¼å¾—**ï¼?

11. 快速开�

11.1 一键检查清å<E280A6>?

# 1. 检查代ç <C3A7>结构(应该都存在)
ls backend/src/common/cache/CacheAdapter.ts     # �ls backend/src/common/cache/MemoryCacheAdapter.ts # �ls backend/src/common/jobs/types.ts              # �ls backend/src/common/jobs/MemoryQueue.ts        # �
# 2. 检查需è¦<C3A8>æ°å¢žçš„æ‡ä»¶ï¼ˆåº”è¯¥ä¸<C3A4>存在ï¼?ls backend/src/common/cache/PostgresCacheAdapter.ts  # â<>?å¾…æ–°å¢?ls backend/src/common/jobs/PgBossQueue.ts            # â<>?å¾…æ–°å¢?ls backend/src/common/jobs/utils.ts                  # â<>?å¾…æ–°å¢?ls backend/src/common/jobs/CheckpointService.ts     # â<>?å¾…æ–°å¢?
# 3. 检查ä¾<C3A4>èµ?cd backend
npm list pg-boss   # â<>?需安装

# 4. 检查数æ<C2B0>®åº“
psql -d aiclincial -c "\dt platform_schema.*"
# 应该çœåˆ°çŽ°æœ‰çš„ä¸šåŠ¡è¡¨ï¼Œä½†æ²¡æœ‰app_cache

11.2 ç«å<E280B9>³å¼€å§?

# Phase 1:环境准备(30分éŸï¼?cd backend
npm install pg-boss --save
# 修改 prisma/schema.prisma(添加AppCache�npx prisma migrate dev --name add_postgres_cache
npx prisma generate

# Phase 2:实现PostgresCacheAdapterï¼?å°<C3A5>æ—¶ï¼?# 创建 src/common/cache/PostgresCacheAdapter.ts
# 修改 src/common/cache/CacheFactory.ts
# æµè¯•验è¯<C3A8>

# Phase 3:实现PgBossQueueï¼?6å°<C3A5>æ—¶ï¼?天)
# 创建 src/common/jobs/PgBossQueue.ts
# 修改 src/common/jobs/JobFactory.ts
# æµè¯•验è¯<C3A8>

# ... 按计划继�```

---

**🎯 现在就开始?**

建议ï¼?1. **先通读文档** - ç<>†è§£æ•´ä½“æž¶æž„ï¼?0分éŸï¼?2. **验è¯<C3A8>代ç <C3A7>结构** - 确认真实文件ï¼?0分éŸï¼?3. **å¼€å§Phase 1** - 环境准备ï¼?0分éŸï¼?4. **é€<C3A9>步推è¿** - æ¯<C3A6>完æˆ<C3A6>一个Phaseå°±æµè¯?
有任何问题éš<C3A9>时沟通ï¼<C3AF>🚀