# Real LLM Integration Completion Report
**Date**: 2025-11-21
**Task**: Replace the mock AI with real LLM calls
**Status**: ✅ Complete

---
## 📋 Background
### Previous state
- ✅ Prompt design complete (v1.0.0-MVP)
- ✅ `llmScreeningService.ts` implemented (real LLM calls)
- ✅ Test framework and quality validation complete
- ❌ **Problem**: `screeningService.ts` still used `mockAIScreening` to generate fake data
### User requirement
After real literature data is uploaded from the "Settings & Launch" page, **screening must use the real DeepSeek and Qwen APIs** rather than simulated data.

---
## ✅ What was completed
### 1. Modified `screeningService.ts`
**File**: `backend/src/modules/asl/services/screeningService.ts`
#### Core changes
**Import the real LLM service**:
```typescript
import { llmScreeningService } from './llmScreeningService.js';
```
**Replace the processing logic**:
```typescript
// ❌ Old code (mock)
const result = await mockAIScreening(projectId, literature);

// ✅ New code (real LLM)
const screeningResult = await llmScreeningService.dualModelScreening(
  literature.id,
  literature.title,
  literature.abstract,
  picoCriteria,
  inclusionCriteria,
  exclusionCriteria,
  [models[0], models[1]],
  screeningConfig?.style || 'standard',
  literature.authors,
  literature.journal,
  literature.publicationYear
);
```
#### New features
1. **Read the PICOS criteria from the project**:
```typescript
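const project = await prisma.aslScreeningProject.findUnique({
  where: { id: projectId },
});
// findUnique resolves to null when no row matches, so guard before reading fields.
if (!project) {
  throw new Error(`Screening project not found: ${projectId}`);
}
const picoCriteria = project.picoCriteria;
const inclusionCriteria = project.inclusionCriteria;
const exclusionCriteria = project.exclusionCriteria;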
```
2. **Support custom model selection**:
```typescript
const models = screeningConfig?.models || ['deepseek-chat', 'qwen-max'];
```
3. **Detailed logging**:
```typescript
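logger.info('Processing literature', {
  literatureId: literature.id,
  // Truncate the title for the log; skip it entirely when it is missing
  // instead of logging the string "undefined...".
  title: literature.title ? `${literature.title.substring(0, 50)}...` : undefined,
});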
```
4. **Map the result to the database format**:
```typescript
const dbResult = {
  projectId,
  literatureId: literature.id,
  // DeepSeek result
  dsModelName: screeningResult.deepseekModel,
  dsPJudgment: screeningResult.deepseek.judgment.P,
  // ... full field mapping
};
```
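Item 2 lets a caller override the model list, while the screening call reads `models[0]` and `models[1]`. A custom list with fewer than two entries would therefore pass `undefined` to the LLM layer. The following is a minimal sketch of a guard, not the project's actual code: `resolveModelPair` and the simplified `ScreeningConfig` shape are hypothetical, and only the two default model names come from this report.

```typescript
// Hypothetical helper, not part of the original service.
type ModelPair = [string, string];

// Defaults taken from the model-selection snippet in this report.
const DEFAULT_MODELS: ModelPair = ['deepseek-chat', 'qwen-max'];

// Simplified, assumed shape of the screening configuration.
interface ScreeningConfig {
  models?: string[];
  style?: string;
}

function resolveModelPair(config?: ScreeningConfig): ModelPair {
  // Drop blank entries so that an empty string never reaches the LLM layer.
  const requested = (config?.models ?? []).filter((name) => name.trim() !== '');
  const [first, second] = requested;
  // Dual-model screening needs two distinct models; otherwise use the defaults.
  if (first === undefined || second === undefined || first === second) {
    return [DEFAULT_MODELS[0], DEFAULT_MODELS[1]];
  }
  return [first, second];
}
```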
---
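## 🔄 End-to-end flow
### User workflow
```
1. Open the "Settings & Launch" page
   ↓
2. Fill in the PICOS criteria
   ↓
3. Upload the Excel literature list (e.g. 199 papers)
   ↓
4. Click "Start AI Initial Screening"
   ↓
5. The backend processes the request automatically:
   a. Create the project
   b. Import the literature
   c. Start the screening task
   ↓
6. Real LLM processing (about 10-15 s per paper)
   a. Call the DeepSeek API
   b. Call the Qwen API
   c. Compare the results and detect conflicts
   d. Save to the database
   ↓
7. The frontend redirects automatically to the "Review Workbench"
   ↓
8. The real AI screening results are displayed
```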
### Technical flow
```
Frontend: TitleScreeningSettings.tsx
  ↓ POST /api/v1/asl/literatures/import
Backend: literatureController.ts
  ↓ importLiteratures()
  ↓ startScreeningTask()
Backend: screeningService.ts
  ↓ processLiteraturesInBackground()
  ↓ for each literature:
  ↓ llmScreeningService.dualModelScreening()
Backend: llmScreeningService.ts
  ↓ Promise.all([
      screenWithModel('deepseek-chat', ...),
      screenWithModel('qwen-max', ...),
    ])
Backend: LLMFactory
  ↓ getAdapter('deepseek-v3')
  ↓ getAdapter('qwen3-72b')
Real API calls
  ↓ DeepSeek API
  ↓ Qwen API
Result persistence
  ↓ AslScreeningResult table
Frontend: ScreeningWorkbench.tsx
  ↓ GET /api/v1/asl/projects/:projectId/screening-results
  ↓ Display the real results
```
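The `Promise.all` step and the conflict check can be sketched as follows. This is an illustration under stated assumptions, not the contents of `llmScreeningService.ts`: `ScreenFn` is a hypothetical stand-in for `screenWithModel` (whose real signature is not shown in this report), the result shape is reduced to a few fields, and a conflict is taken to mean that the two conclusions differ.

```typescript
// Conclusions that appear in this report; the real schema may allow more values.
type Conclusion = 'include' | 'exclude';

// Reduced, assumed result shape for a single model.
interface ModelResult {
  model: string;
  conclusion: Conclusion;
  reason: string;
}

// Hypothetical stand-in for screenWithModel().
type ScreenFn = (model: string, title: string, abstract: string) => Promise<ModelResult>;

interface DualModelResult {
  results: [ModelResult, ModelResult];
  hasConflict: boolean;
}

async function dualModelScreeningSketch(
  screen: ScreenFn,
  models: [string, string],
  title: string,
  abstract: string,
): Promise<DualModelResult> {
  // Both requests start immediately. Promise.all resolves once both have
  // finished and rejects as soon as either one fails.
  const [first, second] = await Promise.all([
    screen(models[0], title, abstract),
    screen(models[1], title, abstract),
  ]);
  // A conflict is recorded when the two conclusions differ.
  return {
    results: [first, second],
    hasConflict: first.conclusion !== second.conclusion,
  };
}
```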
---
## ⏱️ Expected performance
### Processing time per paper
| Step | Time (serial) |
|-----|------------|
| DeepSeek API call | 5-10 s |
| Qwen API call | 5-10 s |
| Saving the result | 0.1 s |
| **Total** | **10-20 s** |

### Batch processing time (199 papers)
| Mode | Time | Notes |
|-----|------|-----|
| **Serial** | 33-66 min | Current implementation (avoids API rate limiting) |
| Concurrent (3 workers) | 11-22 min | Optional optimization (needs testing) |
| Concurrent (10 workers) | 3-7 min | Risk: may hit the API quota |

**Current strategy**: serial processing (stability first)

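
The batch figures follow from multiplying the per-paper time by 199 and dividing by the number of workers. Here is a small sketch of that arithmetic, assuming ideal scaling (no rate limiting, retries, or queueing overhead); `estimateBatchMinutes` is a hypothetical helper name.

```typescript
interface MinutesRange {
  min: number;
  max: number;
}

// Idealized estimate: total time = papers × seconds per paper ÷ workers.
function estimateBatchMinutes(
  paperCount: number,
  secondsPerPaper: [number, number],
  workers = 1,
): MinutesRange {
  if (paperCount < 0 || workers < 1) {
    throw new RangeError('paperCount must be >= 0 and workers must be >= 1');
  }
  const toMinutes = (seconds: number): number => (paperCount * seconds) / workers / 60;
  return { min: toMinutes(secondsPerPaper[0]), max: toMinutes(secondsPerPaper[1]) };
}

// 199 papers at 10-20 s each, as in the per-paper table.
const serialEstimate = estimateBatchMinutes(199, [10, 20]); // about 33 to 66 minutes
const threeWorkers = estimateBatchMinutes(199, [10, 20], 3); // about 11 to 22 minutes
const tenWorkers = estimateBatchMinutes(199, [10, 20], 10); // about 3 to 7 minutes
```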
---
## 🎯 Comparison with mock data
### Mock data (old)
```javascript
// ❌ Fake data
dsPEvidence: "Mock evidence: the study population matches the P criterion in PICO"
dsReason: "Based on analysis of the title and abstract, this paper meets the inclusion criteria."
dsConclusion: randomConclusion() // random!
// Characteristics:
// - 199 papers finished in 1 second
// - Every piece of evidence is "mock evidence"
// - Conclusions are generated randomly
```
### Real LLM (new)
```javascript
// ✅ Real data
dsPEvidence: "This study included adult patients with type 2 diabetes mellitus aged 18 years or older, which matches the population criteria."
dsReason: "The study population consists of T2DM patients, the intervention is an SGLT2 inhibitor (empagliflozin), the comparator is placebo, and the study design is a randomized controlled trial. All PICO criteria are met. The study reports on cardiovascular outcomes including MACE, heart failure hospitalization, and cardiovascular death, which are the outcomes of interest."
dsConclusion: "include" // a real AI judgment
// Characteristics:
// - 199 papers finished in 33-66 minutes
// - Evidence quotes the original text of the paper
// - Judgments are based on Prompt v1.0.0-MVP
// - Accuracy: 60% (first test)
```
---
## 🔍 Data verification
### How to verify
```bash
cd AIclinicalresearch/backend
node check-data.mjs
```
### Expected output (real data)
```
🔬 Screening result samples
[1] Paper: Assessment of Thrombectomy versus Combined...
    DeepSeek: include (P:match, I:partial, C:mismatch, S:match)
    Qwen: exclude (P:mismatch, I:mismatch, C:partial, S:match)
    Conflict status: conflict
    Has evidence: DeepSeek=true, Qwen=true ✅
    Evidence samples:
    - dsPEvidence: "The study population consists of..."
    - qwenPEvidence: "Patients with acute ischemic stroke..."
```
---
## 📊 Quality assurance
### Quality measures already in place
1. **JSON Schema validation**:
   - Every LLM output must pass schema validation
   - Outputs that fail validation are rejected
2. **Error handling**:
   - A failure on a single paper does not affect the overall task
   - Detailed error logging
3. **Progress tracking**:
   - Progress is updated every 10 papers
   - Real-time counts of successes, conflicts, and failures
4. **Traceability**:
   - The raw LLM output is recorded (`rawOutput`)
   - The prompt version is recorded (`promptVersion`)
   - The processing time is recorded (`aiProcessedAt`)
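
Validation, per-paper failure isolation, and periodic progress updates could fit together roughly as follows. This is a sketch under assumptions: `ScreeningOutput` is a reduced, hypothetical output shape (the real schema lives in `screening.schema.ts` and is not reproduced in this report), the hand-written type guard stands in for JSON Schema validation, and `screenBatch`, `onProgress`, and `progressEvery` are illustrative names.

```typescript
// Assumed minimal output shape, for illustration only.
interface ScreeningOutput {
  conclusion: 'include' | 'exclude';
  reason: string;
}

// Hand-written guard standing in for real JSON Schema validation.
function isScreeningOutput(value: unknown): value is ScreeningOutput {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  const conclusion = record['conclusion'];
  return (
    (conclusion === 'include' || conclusion === 'exclude') &&
    typeof record['reason'] === 'string'
  );
}

interface BatchStats {
  processed: number;
  success: number;
  failed: number;
}

async function screenBatch(
  ids: readonly string[],
  screen: (id: string) => Promise<unknown>,
  onProgress: (stats: BatchStats) => void,
  progressEvery: number,
): Promise<BatchStats> {
  const step = Math.max(1, Math.floor(progressEvery));
  const stats: BatchStats = { processed: 0, success: 0, failed: 0 };
  for (const id of ids) {
    try {
      const raw = await screen(id);
      // Reject any output that does not match the expected shape.
      if (!isScreeningOutput(raw)) {
        throw new Error(`LLM output for ${id} failed validation`);
      }
      stats.success += 1;
    } catch {
      // One bad paper is counted and skipped; the loop keeps going.
      stats.failed += 1;
    }
    stats.processed += 1;
    // Report after every `step` papers and once more at the end.
    if (stats.processed % step === 0 || stats.processed === ids.length) {
      onProgress({ ...stats });
    }
  }
  return stats;
}
```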
---
## 🚀 Test procedure
### Step 1: Prepare test data
```
Use the existing test files:
- PICOS: docs/.../测试案例的PICOS、纳入标准、排除标准.txt
- Excel: docs/.../Test Cases.xlsx (199 papers)
```
### Step 2: Run the test
1. Start the backend: `cd backend && npm run dev`
2. Start the frontend: `cd frontend-v2 && npm run dev`
3. Open: `http://localhost:3001`
4. Fill in the PICOS criteria and upload the Excel file
5. Click "Start AI Initial Screening"
6. **Wait 30-60 minutes** (199 papers at roughly 10-20 s each)
7. Check the Review Workbench
### Step 3: Verify the results
```bash
cd backend
node check-data.mjs
```
**Checklist**:
- [ ] Every paper has a screening result
- [ ] Evidence is no longer the "mock evidence" placeholder
- [ ] Evidence quotes the original text of the paper
- [ ] The reasoning is detailed and logically sound
- [ ] Conflict detection is accurate (a conflict means the two models' `conclusion` values differ)
---
## ⚠️ Notes
### API key configuration
Make sure the environment variables are set:
```bash
# .env
DEEPSEEK_API_KEY=sk-xxxxx
QWEN_API_KEY=sk-xxxxx
```
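A missing key would otherwise only surface on the first API call, partway into a long batch. The sketch below shows a startup check, assuming the two variable names above are the only required ones. `findMissingApiKeys` is a hypothetical helper; it takes the environment as a parameter so it can be called with `process.env` in Node or with a plain object in tests.

```typescript
// Variable names taken from the .env example in this report.
const REQUIRED_API_KEYS = ['DEEPSEEK_API_KEY', 'QWEN_API_KEY'] as const;

// Returns the names of required keys that are unset or blank.
function findMissingApiKeys(env: Record<string, string | undefined>): string[] {
  return REQUIRED_API_KEYS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === '';
  });
}

// Example with a plain object; in the backend this would be process.env.
const missingKeys = findMissingApiKeys({ DEEPSEEK_API_KEY: 'sk-xxxxx' });
if (missingKeys.length > 0) {
  console.warn(`Missing API keys: ${missingKeys.join(', ')}`);
}
```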
### API rate limits
- DeepSeek: 60 RPM (requests per minute)
- Qwen: 60 RPM

**Current strategy**: serial processing, which does not trigger rate limiting
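
A rough way to check a concurrency setting against the 60 RPM limit, assuming each paper triggers exactly one request per provider, each worker handles one paper at a time, and retries are ignored. Both function names are hypothetical.

```typescript
// Requests per minute sent to ONE provider under the assumptions above.
function estimateRequestsPerMinute(workers: number, secondsPerPaper: number): number {
  if (workers < 1 || secondsPerPaper <= 0) {
    throw new RangeError('workers must be >= 1 and secondsPerPaper must be > 0');
  }
  return (workers * 60) / secondsPerPaper;
}

// Largest worker count whose estimated rate stays at or below the limit.
function maxWorkersWithinLimit(limitRpm: number, secondsPerPaper: number): number {
  return Math.floor((limitRpm * secondsPerPaper) / 60);
}

const serialRpm = estimateRequestsPerMinute(1, 10); // 6 RPM at the fast end
const threeWorkerRpm = estimateRequestsPerMinute(3, 10); // 18 RPM
const tenWorkerRpm = estimateRequestsPerMinute(10, 10); // 60 RPM, right at the limit
```

Under these assumptions serial processing stays far below 60 RPM, while ten workers at 10 s per paper reach the limit exactly, which is consistent with the risk noted for the 10-worker mode.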
### Cost estimate
- DeepSeek: ~$0.001/paper × 199 = **$0.20**
- Qwen: ~$0.001/paper × 199 = **$0.20**
- **Total**: **$0.40** per full test run
---
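## 💡 Optimization suggestions
### Short term (Week 2, Day 4-5)
1. **Concurrency control**: move to 3 concurrent workers (33 min → 11 min)
2. **Progress display**: have the frontend poll and show a progress percentage
3. **Error retry**: automatically retry failed papers a fixed number of times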
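
Items 1 and 3 could be prototyped with a small worker pool and a retry wrapper. This is a sketch under assumptions, not the planned implementation: `mapWithConcurrency` and `withRetry` are hypothetical helpers, the retry count is left as a parameter, and no backoff delay is applied between attempts.

```typescript
// Retries a task up to maxAttempts times and rethrows the last error.
async function withRetry<T>(task: () => Promise<T>, maxAttempts: number): Promise<T> {
  if (maxAttempts < 1) throw new RangeError('maxAttempts must be >= 1');
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

// Runs worker() over items with at most `limit` calls in flight;
// results keep the order of the input.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array<R>(items.length);
  let next = 0;
  async function runWorker(): Promise<void> {
    // Each runner claims the next unprocessed index until none remain.
    while (next < items.length) {
      const index = next;
      next += 1;
      results[index] = await worker(items[index] as T, index);
    }
  }
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runWorker()));
  return results;
}
```

With three workers, a call might look like `mapWithConcurrency(papers, 3, (paper) => withRetry(() => screen(paper), attempts))`, where `papers`, `screen`, and `attempts` are placeholders. Because `Promise.all` rejects on the first error, a worker that should not abort the batch needs to catch its own failures.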
### Medium term (Week 3)
1. **Message queue**: process asynchronously with Bull Queue
2. **Batch optimization**: use a batch API endpoint, if one is available
3. **Caching**: do not re-screen identical papers
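
The caching idea in item 3 can be sketched with an in-memory map keyed by normalized title and abstract. Everything here is an assumption for illustration: `ScreeningCache` and `cacheKey` are hypothetical, and the `scope` argument stands for whatever else should invalidate a cached result (for example the PICOS criteria and the prompt version). A real deployment would more likely persist the cache in the database.

```typescript
// Builds a key that ignores case and whitespace differences.
function cacheKey(scope: string, title: string, abstract: string): string {
  const normalize = (text: string): string =>
    text.trim().toLowerCase().replace(/\s+/g, ' ');
  return [scope, normalize(title), normalize(abstract)].join('\u0000');
}

class ScreeningCache<R> {
  // Storing the promise lets concurrent requests for the same paper share one call.
  private readonly store = new Map<string, Promise<R>>();

  getOrCompute(
    scope: string,
    title: string,
    abstract: string,
    compute: () => Promise<R>,
  ): Promise<R> {
    const key = cacheKey(scope, title, abstract);
    const cached = this.store.get(key);
    if (cached !== undefined) return cached;
    const pending = compute().catch((error: unknown) => {
      // Do not cache failures, so that a later attempt can retry.
      this.store.delete(key);
      throw error;
    });
    this.store.set(key, pending);
    return pending;
  }
}
```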
---
## 📁 Related files
### Modified files
- `backend/src/modules/asl/services/screeningService.ts` ⭐
### Dependencies (already existing)
- `backend/src/modules/asl/services/llmScreeningService.ts`
- `backend/src/modules/asl/schemas/screening.schema.ts`
- `backend/prompts/asl/screening/v1.0.0-mvp.txt`
- `backend/src/common/llm/adapters/LLMFactory.ts`
### Test files
- `backend/scripts/test-llm-screening.ts`
- `backend/scripts/test-samples/asl-test-literatures.json`
---
## 🎉 Summary of results
### Implemented
- ✅ Real LLM calls replace the mock data
- ✅ PICOS criteria are read from the project
- ✅ Both models screen in parallel
- ✅ Conflicts are detected and flagged
- ✅ Complete log trail
- ✅ Error handling
### To be optimized
- ⚠️ Long processing time (30-60 minutes)
- ⚠️ Serial processing (could be made concurrent)
- ⚠️ Frontend progress display (polling frequency needs tuning)
---
## 🔗 References
- [Prompt Design and Testing Completion Report](./2025-11-18-Prompt设计与测试完成报告.md)
- [Stroke Data Generalization Test Report](./2025-11-18-卒中数据泛化测试报告.md)
- [Task Breakdown](../04-开发计划/03-任务分解.md)
---
**Author**: AI Assistant
**Date**: 2025-11-21
**Version**: v1.0.0