Real LLM Integration Completion Report
Date: 2025-11-21
Task: Replace the mock AI with real LLM calls
Status: ✅ Completed
📋 Background
Previous state
- ✅ Prompt design completed (v1.0.0-MVP)
- ✅ llmScreeningService.ts implemented (real LLM calls)
- ✅ Test framework and quality validation completed
- ❌ Problem: screeningService.ts still used mockAIScreening to generate fake data
User requirement
After real literature data is uploaded from the "Setup & Launch" page, screening must **use the real DeepSeek and Qwen APIs** rather than simulated data.
✅ Completed work
1. Modified screeningService.ts
File: backend/src/modules/asl/services/screeningService.ts
Core changes
Import the real LLM service:
import { llmScreeningService } from './llmScreeningService.js';
Replace the processing logic:
// ❌ Old code (mock)
const result = await mockAIScreening(projectId, literature);
// ✅ New code (real LLM)
const screeningResult = await llmScreeningService.dualModelScreening(
  literature.id,
  literature.title,
  literature.abstract,
  picoCriteria,
  inclusionCriteria,
  exclusionCriteria,
  [models[0], models[1]],
  screeningConfig?.style || 'standard',
  literature.authors,
  literature.journal,
  literature.publicationYear
);
New features
- Read the PICOS criteria from the project:
const project = await prisma.aslScreeningProject.findUnique({
  where: { id: projectId },
});
// findUnique returns null when no row matches, so guard before reading fields
if (!project) throw new Error(`Project not found: ${projectId}`);
const picoCriteria = project.picoCriteria;
const inclusionCriteria = project.inclusionCriteria;
const exclusionCriteria = project.exclusionCriteria;
- Support custom model selection:
const models = screeningConfig?.models || ['deepseek-chat', 'qwen-max'];
- Detailed logging:
logger.info('Processing literature', {
  literatureId: literature.id,
  title: literature.title?.substring(0, 50) + '...',
});
- Map results to the database format:
const dbResult = {
  projectId,
  literatureId: literature.id,
  // DeepSeek result
  dsModelName: screeningResult.deepseekModel,
  dsPJudgment: screeningResult.deepseek.judgment.P,
  // ... full field mapping
};
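🔄 End-to-end flow
User workflow
1. Open the "Setup & Launch" page
↓
2. Fill in the PICOS criteria
↓
3. Upload the Excel literature list (for example, 199 papers)
↓
4. Click "Start AI initial screening"
↓
5. The backend processes the request automatically:
   a. Create the project
   b. Import the literature
   c. Start the screening task
↓
6. Real LLM processing (about 10-15 seconds per paper)
   a. Call the DeepSeek API
   b. Call the Qwen API
   c. Compare the results and detect conflicts
   d. Save to the database
↓
7. The frontend automatically redirects to the "Review Workbench"
↓
8. The real AI screening results are displayed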
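Step 6c above (compare the results, detect conflicts) can be sketched as follows. This is a minimal illustration, not the project's actual code: the type names and the "none" status value are assumptions, and only the rule itself (two verdicts conflict when their conclusions differ) comes from this report.

```typescript
// Hypothetical types: only `conclusion` and the P/I/C/S judgments appear in the report.
type Judgment = "match" | "partial" | "mismatch";
type Conclusion = "include" | "exclude";

interface ModelVerdict {
  conclusion: Conclusion;
  judgment: { P: Judgment; I: Judgment; C: Judgment; S: Judgment };
}

// "none" is an assumed label for the non-conflict case.
type ConflictStatus = "conflict" | "none";

// Two verdicts conflict when their final conclusions differ,
// regardless of how the individual PICOS judgments compare.
function detectConflict(deepseek: ModelVerdict, qwen: ModelVerdict): ConflictStatus {
  return deepseek.conclusion === qwen.conclusion ? "none" : "conflict";
}

// Sample verdicts: DeepSeek includes, Qwen excludes.
const dsVerdict: ModelVerdict = {
  conclusion: "include",
  judgment: { P: "match", I: "partial", C: "mismatch", S: "match" },
};
const qwenVerdict: ModelVerdict = {
  conclusion: "exclude",
  judgment: { P: "mismatch", I: "mismatch", C: "partial", S: "match" },
};
const sampleStatus = detectConflict(dsVerdict, qwenVerdict); // "conflict"
```

Comparing only the conclusion keeps the rule simple; papers where the models agree on the outcome but differ on individual criteria are not flagged under this sketch.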
Technical flow
Frontend: TitleScreeningSettings.tsx
  ↓ POST /api/v1/asl/literatures/import
Backend: literatureController.ts
  ↓ importLiteratures()
  ↓ startScreeningTask()
Backend: screeningService.ts
  ↓ processLiteraturesInBackground()
  ↓ for each literature:
  ↓ llmScreeningService.dualModelScreening()
Backend: llmScreeningService.ts
  ↓ Promise.all([
      screenWithModel('deepseek-chat', ...),
      screenWithModel('qwen-max', ...),
    ])
Backend: LLMFactory
  ↓ getAdapter('deepseek-v3')
  ↓ getAdapter('qwen3-72b')
Real API calls
  ↓ DeepSeek API
  ↓ Qwen API
Result persistence
  ↓ AslScreeningResult table
Frontend: ScreeningWorkbench.tsx
  ↓ GET /api/v1/asl/projects/:projectId/screening-results
  ↓ Display real results
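The Promise.all step in the flow above can be sketched as below. This is an illustration under stated assumptions: `ScreenFn`, `dualModelScreeningSketch`, and `stubScreen` are hypothetical names, and the real `screenWithModel` signature is not shown in this report. The screening function is injected so the sketch runs without network access.

```typescript
type Conclusion = "include" | "exclude";

interface ModelResult {
  model: string;
  conclusion: Conclusion;
}

// Hypothetical stand-in for screenWithModel; the real signature is not shown in this report.
type ScreenFn = (model: string, title: string, abstractText: string) => Promise<ModelResult>;

// Query both models at the same time and compare their conclusions.
async function dualModelScreeningSketch(
  screen: ScreenFn,
  title: string,
  abstractText: string,
  models: [string, string] = ["deepseek-chat", "qwen-max"],
): Promise<{ first: ModelResult; second: ModelResult; hasConflict: boolean }> {
  const [first, second] = await Promise.all([
    screen(models[0], title, abstractText),
    screen(models[1], title, abstractText),
  ]);
  return { first, second, hasConflict: first.conclusion !== second.conclusion };
}

// Stub that needs no network access: "deepseek-*" includes, everything else excludes.
const stubScreen: ScreenFn = async (model) => {
  const conclusion: Conclusion = model.indexOf("deepseek") === 0 ? "include" : "exclude";
  return { model, conclusion };
};
```

Note that `Promise.all` rejects as soon as either call fails, so in this sketch a failure of one model fails the whole paper; the caller is expected to catch that and record the paper as failed.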
⏱️ Expected performance
Processing time per paper
| Step | Duration (serial) |
|---|---|
| DeepSeek API call | 5-10 s |
| Qwen API call | 5-10 s |
| Saving the result | 0.1 s |
| Total | **10-20 s** |
Batch processing time (199 papers)
| Mode | Duration | Notes |
|---|---|---|
| Serial processing | 33-66 min | Current implementation (avoids API rate limiting) |
| Concurrent processing (3 workers) | 11-22 min | Optional optimization (needs testing) |
| Concurrent processing (10 workers) | 3-7 min | Risk: may hit API quota limits |
Current strategy: serial processing (stability first)
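The batch estimates above follow from simple arithmetic: papers × seconds per paper ÷ concurrency. The helper below is an illustrative sketch (the function name is hypothetical) and assumes throughput scales perfectly with concurrency, which real API latency and rate limits will not guarantee.

```typescript
interface MinutesRange {
  min: number;
  max: number;
}

// papers × seconds-per-paper ÷ concurrency, converted to minutes.
function estimateBatchMinutes(
  papers: number,
  secPerPaperMin: number,
  secPerPaperMax: number,
  concurrency: number,
): MinutesRange {
  const toMinutes = (sec: number): number => (papers * sec) / concurrency / 60;
  return { min: toMinutes(secPerPaperMin), max: toMinutes(secPerPaperMax) };
}

// 199 papers at 10-20 s each, matching the per-paper total above.
const serialEstimate = estimateBatchMinutes(199, 10, 20, 1);   // ≈ 33.2 to 66.3 minutes
const threeWayEstimate = estimateBatchMinutes(199, 10, 20, 3); // ≈ 11.1 to 22.1 minutes
const tenWayEstimate = estimateBatchMinutes(199, 10, 20, 10);  // ≈ 3.3 to 6.6 minutes
```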
🎯 Comparison with mock data
Mock data (old)
// ❌ Fake data
dsPEvidence: "Simulated evidence: the study population matches the P criterion in PICO"
dsReason: "Based on analysis of the title and abstract, this paper meets the inclusion criteria."
dsConclusion: randomConclusion() // random!
// Characteristics:
- 199 papers completed in 1 second
- Every evidence field is "simulated evidence"
- Conclusions are generated at random
Real LLM (new)
// ✅ Real data
dsPEvidence: "This study included adult patients with type 2 diabetes mellitus aged 18 years or older, which matches the population criteria."
dsReason: "The study population consists of T2DM patients, the intervention is an SGLT2 inhibitor (empagliflozin), the comparator is placebo, and the study design is a randomized controlled trial. All PICO criteria are met. The study reports on cardiovascular outcomes including MACE, heart failure hospitalization, and cardiovascular death, which are the outcomes of interest."
dsConclusion: "include" // the AI's actual judgment!
// Characteristics:
- 199 papers completed in 33-66 minutes
- Evidence quotes the original text of the paper
- Judgments are based on Prompt v1.0.0-MVP
- Accuracy: 60% (first test)
🔍 Data verification
Verification method
cd AIclinicalresearch/backend
node check-data.mjs
Expected output (real data)
🔬 Screening result samples
[1] Paper: Assessment of Thrombectomy versus Combined...
DeepSeek: include (P:match, I:partial, C:mismatch, S:match)
Qwen: exclude (P:mismatch, I:mismatch, C:partial, S:match)
Conflict status: conflict
Has evidence: DeepSeek=true, Qwen=true ✅
Evidence samples:
- dsPEvidence: "The study population consists of..."
- qwenPEvidence: "Patients with acute ischemic stroke..."
📊 Quality assurance
Quality measures in place
- JSON Schema validation:
  - Every LLM output must pass schema validation
  - Outputs that fail validation are rejected
- Error handling:
  - A failure on a single paper does not affect the overall task
  - Detailed error logs are recorded
- Progress tracking:
  - Progress is updated periodically (once per fixed batch of papers)
  - Success, conflict, and failure counts are tallied in real time
- Traceability:
  - The raw LLM output is recorded (rawOutput)
  - The prompt version is recorded (promptVersion)
  - The processing time is recorded (aiProcessedAt)
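The "reject outputs that fail validation" rule can be illustrated with a hand-written check. This is a sketch only: the real schema lives in screening.schema.ts and is not reproduced in this report, so the `ScreeningOutput` shape below is an assumption built from the field values that do appear here (P/I/C/S judgments, an include/exclude conclusion, and a reason).

```typescript
type Judgment = "match" | "partial" | "mismatch";
type Conclusion = "include" | "exclude";

// Assumed output shape; the project's actual schema may contain more fields.
interface ScreeningOutput {
  judgment: Record<"P" | "I" | "C" | "S", Judgment>;
  conclusion: Conclusion;
  reason: string;
}

const JUDGMENTS: readonly string[] = ["match", "partial", "mismatch"];
const CONCLUSIONS: readonly string[] = ["include", "exclude"];

// Returns the parsed output, or null when the raw text must be rejected.
function parseScreeningOutput(raw: string): ScreeningOutput | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON at all
  }
  if (typeof data !== "object" || data === null) return null;
  const obj = data as Record<string, unknown>;

  const judgment = obj["judgment"];
  if (typeof judgment !== "object" || judgment === null) return null;
  const j = judgment as Record<string, unknown>;
  for (const key of ["P", "I", "C", "S"]) {
    const value = j[key];
    if (typeof value !== "string" || JUDGMENTS.indexOf(value) === -1) return null;
  }

  const conclusion = obj["conclusion"];
  if (typeof conclusion !== "string" || CONCLUSIONS.indexOf(conclusion) === -1) return null;
  if (typeof obj["reason"] !== "string") return null;

  return data as ScreeningOutput;
}
```

A caller would store the raw text (rawOutput) regardless of the outcome and treat a `null` result as a failed paper, which keeps rejected outputs traceable.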
🚀 Test steps
Step 1: Prepare test data
Use the existing test files:
- PICOS: docs/.../测试案例的PICOS、纳入标准、排除标准.txt
- Excel: docs/.../Test Cases.xlsx (199 papers)
Step 2: Run the test
- Start the backend: cd backend && npm run dev
- Start the frontend: cd frontend-v2 && npm run dev
- Open: http://localhost:3001
- Fill in the PICOS criteria and upload the Excel file
- Click "Start AI initial screening"
- Wait 30-60 minutes (199 papers × 10-20 s each)
- Check the Review Workbench
Step 3: Verify the results
cd backend
node check-data.mjs
Checklist:
- Every paper has a screening result
- Evidence is no longer "simulated evidence"
- Evidence includes quotations from the original paper
- The reasoning is detailed and logically sound
- Conflict detection is accurate (flagged when the conclusions differ)
⚠️ Notes
API key configuration
Make sure the environment variables are configured:
# .env
DEEPSEEK_API_KEY=sk-xxxxx
QWEN_API_KEY=sk-xxxxx
API rate limits
- DeepSeek: 60 RPM (requests per minute)
- Qwen: 60 RPM
Current strategy: serial processing, which does not trigger rate limiting
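A rough upper bound shows why serial processing stays under the limit. The sketch below assumes each paper in flight issues one request per provider and that a call takes at least 5 s (the lower end of the per-call time quoted in this report); the function name is hypothetical.

```typescript
// Upper bound on requests per minute sent to ONE provider:
// each in-flight paper issues one request to that provider, and a request
// takes at least `minSecondsPerCall` seconds to complete.
function peakRequestsPerMinute(concurrency: number, minSecondsPerCall: number): number {
  return (60 / minSecondsPerCall) * concurrency;
}

const RPM_LIMIT = 60; // per-provider limit quoted above

const serialPeak = peakRequestsPerMinute(1, 5); // 12
const threePeak = peakRequestsPerMinute(3, 5);  // 36
const tenPeak = peakRequestsPerMinute(10, 5);   // 120, above the 60 RPM limit
```

Under these assumptions, serial processing and 3 concurrent workers stay below 60 RPM, while 10 concurrent workers could exceed it.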
Cost estimate
- DeepSeek: ~$0.001/paper × 199 = $0.20
- Qwen: ~$0.001/paper × 199 = $0.20
- Total: $0.40 per full test run
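The cost figures can be reproduced with a few lines. The per-paper prices are the rough estimates quoted above, not official pricing, and the names below are illustrative.

```typescript
// Rough per-paper prices from this report (assumed, not official pricing).
const COST_PER_PAPER_USD = { deepseek: 0.001, qwen: 0.001 };

function estimateRunCostUsd(papers: number): { deepseek: number; qwen: number; total: number } {
  const deepseek = papers * COST_PER_PAPER_USD.deepseek;
  const qwen = papers * COST_PER_PAPER_USD.qwen;
  return { deepseek, qwen, total: deepseek + qwen };
}

const fullRunCost = estimateRunCostUsd(199);
const fullRunLabel = `$${fullRunCost.total.toFixed(2)}`; // "$0.40"
```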
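💡 Optimization suggestions
Short-term optimizations (Week 2, Day 4-5)
- Concurrency control: switch to 3 concurrent workers (33 min → 11 min)
- Progress display: the frontend polls and shows a progress percentage
- Error retry: failed papers are retried automatically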
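The concurrency-control and retry suggestions could be combined along the following lines. This is a sketch, not the planned implementation: `mapWithConcurrency` and `Outcome` are hypothetical names, and the retry count is left as a parameter because this report does not fix a value.

```typescript
type Outcome<R> = { ok: true; value: R } | { ok: false; error: unknown };

// Run `worker` over `items` with at most `limit` tasks in flight.
// A failing item is retried up to `retries` extra times, then recorded as an
// error so that one bad paper does not abort the whole batch.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  retries: number,
  worker: (item: T) => Promise<R>,
): Promise<Outcome<R>[]> {
  const results: Outcome<R>[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      let outcome: Outcome<R> = { ok: false, error: new Error("not attempted") };
      for (let attempt = 0; attempt <= retries; attempt++) {
        try {
          outcome = { ok: true, value: await worker(items[index]) };
          break; // success: stop retrying
        } catch (error) {
          outcome = { ok: false, error };
        }
      }
      results[index] = outcome;
    }
  }

  // Start `limit` lanes (at least one); each lane pulls items until none remain.
  const laneCount = Math.max(1, Math.min(limit, items.length));
  const lanes: Promise<void>[] = [];
  for (let i = 0; i < laneCount; i++) lanes.push(runLane());
  await Promise.all(lanes);
  return results;
}
```

With `limit` set to 3 this matches the suggested concurrency, and results come back in input order regardless of which lane finished first.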
Mid-term optimizations (Week 3)
- Message queue: use Bull Queue for asynchronous processing
- Batch optimization: use a batch API endpoint (if available)
- Caching: do not re-screen identical papers
📁 Related files
Modified files
- backend/src/modules/asl/services/screeningService.ts ✅
Dependencies (already existing)
- backend/src/modules/asl/services/llmScreeningService.ts
- backend/src/modules/asl/schemas/screening.schema.ts
- backend/prompts/asl/screening/v1.0.0-mvp.txt
- backend/src/common/llm/adapters/LLMFactory.ts
Test files
- backend/scripts/test-llm-screening.ts
- backend/scripts/test-samples/asl-test-literatures.json
🎉 Summary of results
Implemented
✅ Real LLM calls replace mock data
✅ PICOS criteria are read from the project
✅ Dual-model parallel screening
✅ Conflict detection and flagging
✅ Complete log tracing
✅ Error handling
To be optimized
⚠️ Long processing time (30-60 minutes)
⚠️ Serial processing (can be switched to concurrent)
⚠️ Frontend progress display (polling frequency needs tuning)
Reported by: AI Assistant
Date: 2025-11-21
Version: v1.0.0