# Real LLM Integration Completion Report

**Date**: 2025-11-21
**Task**: Replace the mock AI with real LLM calls
**Status**: ✅ Completed


---

## 📋 Background

### Previous State
- ✅ Prompt design completed (v1.0.0-MVP)
- ✅ `llmScreeningService.ts` implemented (real LLM calls)
- ✅ Test framework and quality validation completed
- ❌ **Problem**: `screeningService.ts` still used `mockAIScreening` to generate fake data

### User Requirement
After real literature data is uploaded from the "Setup & Launch" page, screening must **use the real DeepSeek and Qwen APIs** rather than simulated data.


---

## ✅ Completed Work

### 1. Modify `screeningService.ts`

**File**: `backend/src/modules/asl/services/screeningService.ts`

#### Core Changes

**Import the real LLM service**:
```typescript
import { llmScreeningService } from './llmScreeningService.js';
```

**Replace the processing logic**:
```typescript
// ❌ Old code (mock)
const result = await mockAIScreening(projectId, literature);

// ✅ New code (real LLM): both models screen the same paper
const screeningResult = await llmScreeningService.dualModelScreening(
  literature.id,
  literature.title,
  literature.abstract,
  picoCriteria,
  inclusionCriteria,
  exclusionCriteria,
  [models[0], models[1]],
  screeningConfig?.style || 'standard',
  literature.authors,
  literature.journal,
  literature.publicationYear
);
```

#### New Features

1. **Read the PICOS criteria from the project**:
   ```typescript
   const project = await prisma.aslScreeningProject.findUnique({
     where: { id: projectId },
   });

   // findUnique resolves to null when no row matches, so guard before use
   if (!project) {
     throw new Error(`Screening project not found: ${projectId}`);
   }

   const picoCriteria = project.picoCriteria;
   const inclusionCriteria = project.inclusionCriteria;
   const exclusionCriteria = project.exclusionCriteria;
   ```

2. **Support custom model selection**:
   ```typescript
   // Fall back to the default model pair when none is configured
   const models = screeningConfig?.models || ['deepseek-chat', 'qwen-max'];
   ```

3. **Detailed logging**:
   ```typescript
   logger.info('Processing literature', {
     literatureId: literature.id,
     // Default to '' so a missing title is not logged as "undefined..."
     title: (literature.title ?? '').substring(0, 50) + '...',
   });
   ```

4. **Map results to the database format**:
   ```typescript
   const dbResult = {
     projectId,
     literatureId: literature.id,
     // DeepSeek results
     dsModelName: screeningResult.deepseekModel,
     dsPJudgment: screeningResult.deepseek.judgment.P,
     // ... full field mapping
   };
   ```

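
The field mapping in item 4 is abbreviated in this report. The following is a minimal sketch of what a complete mapping for the P dimension might look like. Only `dsModelName`, `dsPJudgment`, `dsPEvidence`, `dsReason`, `dsConclusion` and `qwenPEvidence` are named elsewhere in this report; the type names, the `evidence` shape, and the remaining `qwen*` column names are assumptions made for illustration and may not match the actual schema.

```typescript
// Hypothetical shapes for illustration. Every type name below, and every
// column not named in this report, is an assumption.
type Judgment = 'match' | 'partial' | 'mismatch';
type Conclusion = 'include' | 'exclude';

interface ModelResult {
  judgment: { P: Judgment };
  evidence: { P: string };
  conclusion: Conclusion;
  reason: string;
}

interface DualScreeningResult {
  deepseekModel: string;
  qwenModel: string;
  deepseek: ModelResult;
  qwen: ModelResult;
}

// Flatten the nested dual-model result into one row-shaped object.
function toDbResult(projectId: string, literatureId: string, r: DualScreeningResult) {
  return {
    projectId,
    literatureId,
    // DeepSeek columns
    dsModelName: r.deepseekModel,
    dsPJudgment: r.deepseek.judgment.P,
    dsPEvidence: r.deepseek.evidence.P,
    dsConclusion: r.deepseek.conclusion,
    dsReason: r.deepseek.reason,
    // Qwen columns (names assumed to mirror the ds* columns)
    qwenModelName: r.qwenModel,
    qwenPJudgment: r.qwen.judgment.P,
    qwenPEvidence: r.qwen.evidence.P,
    qwenConclusion: r.qwen.conclusion,
    qwenReason: r.qwen.reason,
  };
}
```
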

---

## 🔄 Complete Flow

### User Workflow
```
1. Open the "Setup & Launch" page
   ↓
2. Fill in the PICOS criteria
   ↓
3. Upload the Excel literature list (e.g., 199 papers)
   ↓
4. Click "Start AI Initial Screening"
   ↓
5. The backend processes automatically:
   a. Create the project
   b. Import the literature
   c. Start the screening task
   ↓
6. Real LLM processing (about 10-15 seconds per paper)
   a. Call the DeepSeek API
   b. Call the Qwen API
   c. Compare results and detect conflicts
   d. Save to the database
   ↓
7. The frontend automatically redirects to the "Review Workbench"
   ↓
8. The real AI screening results are displayed
```

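
Step 6c (compare results and detect conflicts) can be sketched as a small pure function. This is an illustration only, under two assumptions: a conflict means the two models reached different conclusions (consistent with the verification checklist later in this report), and the non-conflict label `'none'` is made up here. The type and function names are hypothetical and not taken from `llmScreeningService.ts`.

```typescript
type Conclusion = 'include' | 'exclude';
type Dimension = 'P' | 'I' | 'C' | 'S';
type Judgment = 'match' | 'partial' | 'mismatch';

interface ModelVerdict {
  conclusion: Conclusion;
  judgment: Record<Dimension, Judgment>;
}

interface ConflictReport {
  status: 'conflict' | 'none'; // 'none' is an assumed label
  differingDimensions: Dimension[];
}

// A conflict is flagged only when the final conclusions differ; the list of
// differing dimensions is extra context for the human reviewer.
function detectConflict(a: ModelVerdict, b: ModelVerdict): ConflictReport {
  const dimensions: Dimension[] = ['P', 'I', 'C', 'S'];
  const differingDimensions = dimensions.filter((d) => a.judgment[d] !== b.judgment[d]);
  return {
    status: a.conclusion !== b.conclusion ? 'conflict' : 'none',
    differingDimensions,
  };
}
```

Keeping the per-dimension differences separate from the conflict flag means two models can disagree on a single dimension without sending the paper to manual review, as long as they agree on the conclusion.
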
### Technical Flow

```
Frontend: TitleScreeningSettings.tsx
  ↓ POST /api/v1/asl/literatures/import

Backend: literatureController.ts
  ↓ importLiteratures()
  ↓ startScreeningTask()

Backend: screeningService.ts
  ↓ processLiteraturesInBackground()
  ↓ for each literature:
      ↓ llmScreeningService.dualModelScreening()

Backend: llmScreeningService.ts
  ↓ Promise.all([
      screenWithModel('deepseek-chat', ...),
      screenWithModel('qwen-max', ...),
    ])

Backend: LLMFactory
  ↓ getAdapter('deepseek-v3')
  ↓ getAdapter('qwen3-72b')

Real API calls
  ↓ DeepSeek API
  ↓ Qwen API

Result persistence
  ↓ AslScreeningResult table

Frontend: ScreeningWorkbench.tsx
  ↓ GET /api/v1/asl/projects/:projectId/screening-results
  ↓ Display the real results
```

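
The `Promise.all` step in the flow above can be illustrated as follows. This is a sketch, not the code in `llmScreeningService.ts`: the `ScreenFn` type, the `screenWithBothModels` name, and the reduced parameter list are assumptions, and the screening function is injected so the example runs without any API access. Note that `Promise.all` rejects as soon as either call rejects, so a caller would need its own error handling around it.

```typescript
interface ModelVerdict {
  model: string;
  conclusion: 'include' | 'exclude';
}

// Hypothetical signature; the real screenWithModel takes more arguments.
type ScreenFn = (model: string, title: string, abstract: string) => Promise<ModelVerdict>;

// Start both model calls at once and wait for both to settle successfully.
async function screenWithBothModels(
  screen: ScreenFn,
  models: [string, string],
  title: string,
  abstract: string,
) {
  const [first, second] = await Promise.all([
    screen(models[0], title, abstract),
    screen(models[1], title, abstract),
  ]);
  return { first, second, hasConflict: first.conclusion !== second.conclusion };
}
```
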

---

## ⏱️ Expected Performance

### Processing Time per Paper
| Step | Time (serial) |
|-----|------------|
| DeepSeek API call | 5-10 s |
| Qwen API call | 5-10 s |
| Saving results | 0.1 s |
| **Total** | **10-20 s** |

### Batch Processing Time (199 papers)
| Mode | Time | Notes |
|-----|------|-----|
| **Serial processing** | 33-66 min | Current implementation (avoids API rate limiting) |
| Concurrent processing (3) | 11-22 min | Optional optimization (needs testing) |
| Concurrent processing (10) | 3-7 min | Risk: may hit the API quota |

**Current strategy**: Serial processing (stability first)

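
The batch figures in the table follow from the per-paper range by simple arithmetic. The sketch below reproduces them under an idealized assumption that throughput scales linearly with the number of concurrent workers; the function and type names are made up for this example.

```typescript
interface Range {
  min: number;
  max: number;
}

// Estimated wall-clock minutes for a batch, assuming perfect linear scaling.
function estimateBatchMinutes(paperCount: number, secondsPerPaper: Range, concurrency = 1): Range {
  if (paperCount < 0 || concurrency < 1) {
    throw new RangeError('paperCount must be >= 0 and concurrency >= 1');
  }
  const toMinutes = (seconds: number) => (paperCount * seconds) / concurrency / 60;
  return { min: toMinutes(secondsPerPaper.min), max: toMinutes(secondsPerPaper.max) };
}

// 199 papers at 10-20 s each:
//   serial        -> about 33 to 66 minutes
//   3 concurrent  -> about 11 to 22 minutes
//   10 concurrent -> about 3 to 7 minutes
const serial = estimateBatchMinutes(199, { min: 10, max: 20 });
const three = estimateBatchMinutes(199, { min: 10, max: 20 }, 3);
const ten = estimateBatchMinutes(199, { min: 10, max: 20 }, 10);
```
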

---

## 🎯 Comparison with Mock Data

### Mock Data (Old)
```javascript
// ❌ Fake data
dsPEvidence: "模拟证据: 研究人群与PICO中的P标准匹配", // "Mock evidence: the study population matches criterion P of the PICO"
dsReason: "基于标题和摘要分析,该文献符合纳入标准。", // "Based on title and abstract analysis, this paper meets the inclusion criteria."
dsConclusion: randomConclusion(), // random!

// Characteristics:
// - 199 papers finished in 1 second
// - All evidence is "模拟证据" (mock evidence)
// - Judgments are randomly generated
```

### Real LLM (New)
```javascript
// ✅ Real data
dsPEvidence: "This study included adult patients with type 2 diabetes mellitus aged 18 years or older, which matches the population criteria.",
dsReason: "The study population consists of T2DM patients, the intervention is an SGLT2 inhibitor (empagliflozin), the comparator is placebo, and the study design is a randomized controlled trial. All PICO criteria are met. The study reports on cardiovascular outcomes including MACE, heart failure hospitalization, and cardiovascular death, which are the outcomes of interest.",
dsConclusion: "include", // the AI's actual judgment!

// Characteristics:
// - 199 papers finished in 33-66 minutes
// - Evidence quotes the source text of the paper
// - Judgments are based on Prompt v1.0.0-MVP
// - Accuracy: 60% (first test)
```


---

## 🔍 Data Verification

### Verification Method
```bash
cd AIclinicalresearch/backend
node check-data.mjs
```

### Expected Output (Real Data)
```
🔬 Screening result sample:
[1] Paper: Assessment of Thrombectomy versus Combined...
    DeepSeek: include (P:match, I:partial, C:mismatch, S:match)
    Qwen: exclude (P:mismatch, I:mismatch, C:partial, S:match)
    Conflict status: conflict
    Has evidence: DeepSeek=true, Qwen=true ✅

    Evidence examples:
    - dsPEvidence: "The study population consists of..."
    - qwenPEvidence: "Patients with acute ischemic stroke..."
```


---

## 📊 Quality Assurance

### Quality Measures in Place

1. **JSON Schema validation**:
   - All LLM output must pass schema validation
   - Non-conforming output is rejected

2. **Error handling**:
   - A failure on a single paper does not affect the overall task
   - Detailed error logs are recorded

3. **Progress tracking**:
   - Progress is updated every 10 papers
   - Success, conflict, and failure counts are tracked in real time

4. **Traceability**:
   - The raw LLM output is recorded (`rawOutput`)
   - The prompt version is recorded (`promptVersion`)
   - The processing time is recorded (`aiProcessedAt`)

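
Measures 1 to 3 can be sketched together in one small loop. This is an illustration under stated assumptions, not the project's code: a hand-written type guard stands in for the JSON Schema validation, the output shape is reduced to two fields, and `screenAll`, `BatchStats` and the `progressEvery` parameter are hypothetical names.

```typescript
interface ScreeningOutput {
  conclusion: 'include' | 'exclude';
  reason: string;
}

// Stand-in for JSON Schema validation: reject anything not shaped as expected.
function isScreeningOutput(value: unknown): value is ScreeningOutput {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  const conclusionOk = v.conclusion === 'include' || v.conclusion === 'exclude';
  return conclusionOk && typeof v.reason === 'string' && v.reason.length > 0;
}

interface BatchStats {
  success: number;
  failed: number;
}

// Serial loop: one paper at a time, failures are counted but never abort the batch.
async function screenAll<T>(
  papers: T[],
  screenOne: (paper: T) => Promise<unknown>,
  onProgress: (done: number, stats: BatchStats) => void,
  progressEvery = 10,
): Promise<BatchStats> {
  const stats: BatchStats = { success: 0, failed: 0 };
  let done = 0;
  for (const paper of papers) {
    try {
      const raw = await screenOne(paper);
      if (!isScreeningOutput(raw)) throw new Error('LLM output failed validation');
      stats.success++;
    } catch {
      stats.failed++; // isolate the failure to this paper
    }
    done++;
    // Report at the configured interval and once more at the very end.
    if (done % progressEvery === 0 || done === papers.length) {
      onProgress(done, { ...stats });
    }
  }
  return stats;
}
```
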

---

## 🚀 Test Steps

### Step 1: Prepare Test Data
```
Use the existing test files:
- PICOS: docs/.../测试案例的PICOS、纳入标准、排除标准.txt
- Excel: docs/.../Test Cases.xlsx (199 papers)
```

### Step 2: Run the Test
1. Start the backend: `cd backend && npm run dev`
2. Start the frontend: `cd frontend-v2 && npm run dev`
3. Open: `http://localhost:3001`
4. Fill in the PICOS criteria and upload the Excel file
5. Click "Start AI Initial Screening"
6. **Wait 30-60 minutes** (199 papers × 10 seconds)
7. Check the Review Workbench

### Step 3: Verify the Results
```bash
cd backend
node check-data.mjs
```

**Checklist**:
- [ ] Every paper has a screening result
- [ ] Evidence is no longer "模拟证据" (mock evidence)
- [ ] Evidence quotes the source text of the paper
- [ ] The reasoning is detailed and logically sound
- [ ] Conflict detection is accurate (flagged when the conclusions differ)

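
Three of the checklist items (coverage, no leftover mock evidence, conflict flag consistency) are mechanical and could be automated. The sketch below shows one way to do that; it does not describe what `check-data.mjs` actually does. `qwenConclusion`, `conflictStatus`, and the helper names are assumptions, and whether evidence truly quotes the paper and whether the reasoning is sound still need manual review.

```typescript
interface StoredResult {
  literatureId: string;
  dsPEvidence: string;
  qwenPEvidence: string;
  dsConclusion: string;
  qwenConclusion: string; // assumed column name
  conflictStatus: string; // assumed column name; 'conflict' when the models disagree
}

// Prefix the old mock data put on every evidence string ("mock evidence").
const MOCK_MARKER = '模拟证据';

// Returns a list of human-readable problems; an empty list means all checks passed.
function checkResults(literatureIds: string[], results: StoredResult[]): string[] {
  const problems: string[] = [];
  const byId = new Map(results.map((r) => [r.literatureId, r] as const));
  for (const id of literatureIds) {
    const r = byId.get(id);
    if (!r) {
      problems.push(`${id}: no screening result`);
      continue;
    }
    for (const evidence of [r.dsPEvidence, r.qwenPEvidence]) {
      if (evidence.trim() === '' || evidence.startsWith(MOCK_MARKER)) {
        problems.push(`${id}: mock or empty evidence`);
      }
    }
    const shouldConflict = r.dsConclusion !== r.qwenConclusion;
    if (shouldConflict !== (r.conflictStatus === 'conflict')) {
      problems.push(`${id}: conflict flag does not match conclusions`);
    }
  }
  return problems;
}
```
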

---

## ⚠️ Notes

### API Key Configuration
Make sure the environment variables are configured:
```bash
# .env
DEEPSEEK_API_KEY=sk-xxxxx
QWEN_API_KEY=sk-xxxxx
```

### API Rate Limits
- DeepSeek: 60 RPM (requests per minute)
- Qwen: 60 RPM

**Current strategy**: Serial processing, which does not trigger rate limiting

### Cost Estimate
- DeepSeek: ~$0.001/paper × 199 = **$0.20**
- Qwen: ~$0.001/paper × 199 = **$0.20**
- **Total**: **$0.40** per full test run

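
Both the cost figure and the claim that serial processing stays under the rate limit can be checked with a few lines of arithmetic. The sketch below assumes each paper triggers exactly one request per model and that request rate scales linearly with concurrency; the function names are made up for this example.

```typescript
// Total cost in USD: every paper is screened once by each model.
function estimateCostUsd(paperCount: number, costPerPaperUsd: number, modelCount: number): number {
  return paperCount * costPerPaperUsd * modelCount;
}

// Requests per minute sent to ONE model, given how long one paper takes.
function requestsPerMinutePerModel(secondsPerPaper: number, concurrency = 1): number {
  return (60 / secondsPerPaper) * concurrency;
}

// 199 papers x $0.001 x 2 models = $0.398, which rounds to the $0.40 above.
const totalCost = estimateCostUsd(199, 0.001, 2);

// Serial at 10 s per paper is 6 RPM per model, far below the 60 RPM limit.
// At 10 concurrent workers the same estimate reaches 60 RPM, the limit itself.
const serialRpm = requestsPerMinutePerModel(10);
const tenWorkerRpm = requestsPerMinutePerModel(10, 10);
```
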

---

## 💡 Optimization Suggestions

### Short-Term Optimization (Week 2 - Day 4-5)
1. **Concurrency control**: Switch to 3 concurrent workers (33 min → 11 min)
2. **Progress display**: Have the frontend poll and show a progress percentage
3. **Error retry**: Automatically retry failed papers a fixed number of times

### Mid-Term Optimization (Week 3)
1. **Message queue**: Use Bull Queue for asynchronous processing
2. **Batch optimization**: Use a batch API endpoint (if one is available)
3. **Caching**: Do not re-screen identical papers

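
The short-term suggestions on concurrency control and error retry can be sketched without any extra dependency. This is one possible approach, not the planned implementation: `mapWithConcurrency` and `withRetry` are hypothetical helpers, the retry count is left to the caller, and there is no backoff delay between attempts.

```typescript
// Run `worker` over all items with at most `limit` calls in flight at once.
// Results keep the order of the input array.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  if (limit < 1) throw new RangeError('limit must be at least 1');
  const results: R[] = new Array<R>(items.length);
  let next = 0;
  // Each lane pulls the next unclaimed index until none are left.
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T);
    }
  }
  const lanes: Promise<void>[] = [];
  for (let i = 0; i < Math.min(limit, items.length); i++) lanes.push(runLane());
  await Promise.all(lanes);
  return results;
}

// Call `task` once, then up to `maxRetries` more times if it keeps failing.
async function withRetry<R>(task: () => Promise<R>, maxRetries: number): Promise<R> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```
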

---

## 📁 Related Files

### Modified Files
- `backend/src/modules/asl/services/screeningService.ts` ⭐

### Dependencies (Already Existing)
- `backend/src/modules/asl/services/llmScreeningService.ts`
- `backend/src/modules/asl/schemas/screening.schema.ts`
- `backend/prompts/asl/screening/v1.0.0-mvp.txt`
- `backend/src/common/llm/adapters/LLMFactory.ts`

### Test Files
- `backend/scripts/test-llm-screening.ts`
- `backend/scripts/test-samples/asl-test-literatures.json`


---

## 🎉 Summary of Results

### Implemented
- ✅ Real LLM calls replace the mock data
- ✅ PICOS criteria are read from the project
- ✅ Dual-model parallel screening
- ✅ Conflict detection and flagging
- ✅ Complete log tracing
- ✅ Error handling mechanism

### To Be Optimized
- ⚠️ Long processing time (30-60 minutes)
- ⚠️ Serial processing (could be made concurrent)
- ⚠️ Frontend progress display (polling frequency needs tuning)


---

## 🔗 Reference Documents

- [Prompt Design and Testing Completion Report](./2025-11-18-Prompt设计与测试完成报告.md)
- [Stroke Data Generalization Test Report](./2025-11-18-卒中数据泛化测试报告.md)
- [Task Breakdown](../04-开发计划/03-任务分解.md)


---

**Reported by**: AI Assistant
**Date**: 2025-11-21
**Version**: v1.0.0