# [AI Onboarding] ASL Quick Context

> **Reading time:** 3-5 minutes | **Token cost:** ~2,000 tokens
> **Level:** L2 | **Prerequisite reading:** 00-系统总体设计/[AI对接] 快速上下文.md

---

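## 📋 Module Positioning

**An AI-powered literature screening system** that helps researchers screen and analyze large volumes of literature quickly, making systematic reviews more efficient.

**Commercial value:** ⭐⭐⭐⭐⭐ Can be sold as a standalone product
**Development status:** ⏳ Upcoming (Week 2-4)
**Dependencies:** LLM gateway (P0), document processing engine, RAG engine

---
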
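## 🎯 Core Features (6 modules)

1. ✅ **Title and abstract screening** - dual-model AI judgment → focus of Week 2-3
2. ✅ **Full-text screening** - full-text PDF analysis → focus of Week 3-4
3. ⏳ Full-text parsing and data extraction
4. ⏳ Data analysis and report generation
5. ⏳ Systematic review and meta-analysis
6. ⏳ Literature management

**Focus of this development round:** title and abstract screening + full-text screening

---
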
## 🏗️ Technical Architecture at a Glance

### Frontend (React)
```
src/pages/Literature/
├── ProjectManagement/     # Literature project management
├── TitleScreening/        # Title and abstract screening ⭐
├── FullTextScreening/     # Full-text screening ⭐
├── DataExtraction/        # Data extraction
└── Management/            # Literature management
```

### Backend (Node.js)
```
backend/src/modules/asl/
├── controllers/
│   ├── projectController.ts      # Project management
│   ├── screeningController.ts    # Screening control ⭐
│   └── extractionController.ts   # Data extraction
├── services/
│   ├── screeningService.ts       # Screening business logic ⭐
│   └── extractionService.ts
└── routes/
    └── literatureRoutes.ts
```

### Database (asl_schema)
```sql
CREATE SCHEMA asl_schema;

-- Core tables:
--   literature_projects   literature projects
--   literature_items      literature entries (imported from CSV)
--   pico_configs          PICO(S) inclusion/exclusion criteria configuration
--   screening_results     screening results (INCLUDE/EXCLUDE/UNCERTAIN)
--   screening_history     screening history (traceable)
--   extraction_tasks      extraction tasks
--   extraction_results    extraction results
```

---

## 💡 Core Business Workflows

### Title and Abstract Screening Workflow ⭐

```
1. The user uploads a CSV file (title, abstract, authors, etc.)
   ↓
2. Configure the PICO(S) inclusion/exclusion criteria
   - P: Population
   - I: Intervention
   - C: Comparison
   - O: Outcome
   - S: Study Design
   ↓
3. Dual-model AI judgment (DeepSeek + Qwen3)
   - Each paper is judged independently
   - The two models vote
   - Processing runs at a fixed concurrency of 3
   ↓
4. Return the result: INCLUDE / EXCLUDE / UNCERTAIN
   - INCLUDE: both models judge that the paper should be included
   - EXCLUDE: both models judge that the paper should be excluded
   - UNCERTAIN: the two models disagree; manual review is required
   ↓
5. Export to Excel (two-sheet design)
   - Sheet1: papers that passed (INCLUDE)
   - Sheet2: papers that did not pass (EXCLUDE + UNCERTAIN)
```
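
Step 5 of the workflow above can be sketched as a small partition function. This is a minimal illustration only: the `ScreeningRow` shape and the `splitIntoSheets` helper are hypothetical names, not part of the original design, and the real export would pass the two arrays to whatever Excel library the project uses.

```typescript
// Hypothetical row shape for illustration; the real screening_results schema may differ.
type Decision = "INCLUDE" | "EXCLUDE" | "UNCERTAIN";

interface ScreeningRow {
  title: string;
  decision: Decision;
}

interface ExportSheets {
  sheet1: ScreeningRow[]; // papers that passed (INCLUDE)
  sheet2: ScreeningRow[]; // papers that did not pass (EXCLUDE + UNCERTAIN)
}

// Partition screening rows into the two sheets described in step 5.
function splitIntoSheets(rows: ScreeningRow[]): ExportSheets {
  const sheets: ExportSheets = { sheet1: [], sheet2: [] };
  for (const row of rows) {
    if (row.decision === "INCLUDE") {
      sheets.sheet1.push(row);
    } else {
      sheets.sheet2.push(row);
    }
  }
  return sheets;
}

const demoSheets = splitIntoSheets([
  { title: "Paper A", decision: "INCLUDE" },
  { title: "Paper B", decision: "EXCLUDE" },
  { title: "Paper C", decision: "UNCERTAIN" },
]);
console.log(demoSheets.sheet1.length, demoSheets.sheet2.length); // 1 2
```

Keeping UNCERTAIN rows together with EXCLUDE rows on Sheet2 follows the two-sheet design above; a reviewer can still filter Sheet2 by the decision column to find the papers that need manual review.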

### AI Judgment Logic (critical!)
```typescript
// Dual-model voting mechanism
type Decision = "INCLUDE" | "EXCLUDE" | "UNCERTAIN";

function vote(deepseekResult: Decision, qwen3Result: Decision): Decision {
  if (deepseekResult === "INCLUDE" && qwen3Result === "INCLUDE") {
    return "INCLUDE";
  } else if (deepseekResult === "EXCLUDE" && qwen3Result === "EXCLUDE") {
    return "EXCLUDE";
  } else {
    // The models disagree, or at least one of them answered UNCERTAIN
    return "UNCERTAIN"; // flagged for manual review
  }
}
```

---

## 📚 Existing Design Documents

### PRD documents (complete!)
- `00-项目概述/AI智能文献PRD(1)-产品概览.md`
- `00-项目概述/AI智能文献PRD(2)-初筛与复筛.md`
- `00-项目概述/AI智能文献PRD(3)-提取与分析模块.md`

**Contents:** complete functional requirements, user stories, and acceptance criteria

### Technical design (complete!)
- `01-设计文档/02-数据库设计.md` - complete table structure
- `01-设计文档/03-API设计.md` - all API endpoints
- `01-设计文档/04-前端组件设计.md` - component tree
- `01-设计文档/05-AI模型集成设计.md` - dual-model voting logic

### UI prototypes (complete!)
- `01-设计文档/07-UI设计/标题摘要初筛原型.html` ⭐
- `01-设计文档/07-UI设计/全文复筛原型.html` ⭐

---

## 🔗 Shared Capabilities ASL Depends On

### 1. LLM gateway (❌ not yet implemented, P0) ⭐ **must be implemented first**

**Why ASL needs the LLM gateway:**
- Title and abstract screening calls 2 LLM models
- Full-text screening calls 1 LLM model
- Cost control and quota management are required

**Interface requirements:**
```typescript
// Interface the ASL module needs

// Assumed message shape; the original spec leaves `Message` undefined.
interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface LLMGateway {
  // Single call (non-streaming)
  chat(params: {
    userId: string;
    modelType: 'deepseek-v3' | 'qwen3';
    messages: Message[];
  }): Promise<{
    content: string;
    tokenUsage: number;
  }>;

  // Check quota
  checkQuota(userId: string): Promise<boolean>;
}
```
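
Because the gateway is not implemented yet, ASL code could be developed against an in-memory stand-in in the meantime. The sketch below is one possible stub, not the planned gateway: `StubGateway`, its per-user token budget, the character-count "token" estimate, and the `Message` shape are all assumptions made for illustration.

```typescript
// Types restated so the sketch is self-contained; `Message` is an assumed shape.
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}
type ModelType = "deepseek-v3" | "qwen3";
interface ChatParams {
  userId: string;
  modelType: ModelType;
  messages: Message[];
}
interface ChatResult {
  content: string;
  tokenUsage: number;
}
interface LLMGateway {
  chat(params: ChatParams): Promise<ChatResult>;
  checkQuota(userId: string): Promise<boolean>;
}

// Hypothetical in-memory stub: returns a canned reply and tracks a per-user budget.
class StubGateway implements LLMGateway {
  private readonly used = new Map<string, number>();
  private readonly budget: number;
  private readonly reply: string;

  constructor(budget: number, reply: string) {
    this.budget = budget;
    this.reply = reply;
  }

  // A user has quota left while their recorded usage is below the budget.
  async checkQuota(userId: string): Promise<boolean> {
    return (this.used.get(userId) ?? 0) < this.budget;
  }

  async chat(params: ChatParams): Promise<ChatResult> {
    if (!(await this.checkQuota(params.userId))) {
      throw new Error(`quota exceeded for ${params.userId}`);
    }
    // Crude stand-in for token counting: one "token" per character of input and reply.
    const inputChars = params.messages.reduce((n, m) => n + m.content.length, 0);
    const tokenUsage = inputChars + this.reply.length;
    this.used.set(params.userId, (this.used.get(params.userId) ?? 0) + tokenUsage);
    return { content: this.reply, tokenUsage };
  }
}
```

A screening service written against the `LLMGateway` interface can then be exercised in tests with `new StubGateway(1000, "INCLUDE")` and switched to the real gateway once it exists.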

**Implementation suggestion:** develop the LLM gateway in parallel during Week 2, Day 1-3

---

### 2. Document processing engine (✅ implemented)

**ASL use case:**
- Full-text screening: full-text extraction from PDFs

**Existing interface:**
```
POST /api/extract/pdf    # Already provided by extraction_service
```

---

### 3. RAG engine (✅ implemented, optional)

**ASL use cases (optional):**
- Literature content retrieval
- Literature similarity analysis

---

## 📋 API Endpoint List

### Project management
```
POST   /api/v1/literature/projects              # Create a literature project
GET    /api/v1/literature/projects              # List projects
GET    /api/v1/literature/projects/:id          # Get project details
PUT    /api/v1/literature/projects/:id          # Update a project
DELETE /api/v1/literature/projects/:id          # Delete a project
```

### Title and abstract screening ⭐
```
POST   /api/v1/literature/projects/:id/items/import        # Import CSV
POST   /api/v1/literature/projects/:id/pico                # Configure PICO
POST   /api/v1/literature/projects/:id/screening/title     # Run title and abstract screening
GET    /api/v1/literature/projects/:id/screening/status    # Query progress
GET    /api/v1/literature/projects/:id/screening/results   # Get results
POST   /api/v1/literature/projects/:id/screening/export    # Export to Excel
```
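
On the frontend, the six endpoints above could be addressed through one typed path helper so that the URL strings live in a single place. This is only a sketch: `screeningPath` and `ScreeningStep` are hypothetical names, and only the paths themselves come from the list above.

```typescript
const PROJECTS_BASE = "/api/v1/literature/projects";

// The sub-paths listed above, as a closed union so typos fail at compile time.
type ScreeningStep =
  | "items/import"
  | "pico"
  | "screening/title"
  | "screening/status"
  | "screening/results"
  | "screening/export";

// Build the URL path for one step of the title and abstract screening flow.
function screeningPath(projectId: string, step: ScreeningStep): string {
  return `${PROJECTS_BASE}/${encodeURIComponent(projectId)}/${step}`;
}

console.log(screeningPath("42", "screening/status"));
// /api/v1/literature/projects/42/screening/status
```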

### Full-text screening
```
POST   /api/v1/literature/projects/:id/screening/fulltext  # Run full-text screening
```

---

## 📅 Development Plan

### Week 2 (November 11-15)
- **Day 1-2:** basic CRUD for project management
- **Day 3-4:** title and abstract screening backend (including the LLM gateway)
- **Day 5:** title and abstract screening frontend

### Week 3 (November 18-22)
- **Day 1-2:** full-text screening backend
- **Day 3-4:** full-text screening frontend
- **Day 5:** testing and optimization

### Week 4 (November 25-29)
- **Day 1-2:** data extraction features
- **Day 3-5:** end-to-end testing and documentation

---

## ⚠️ Key Technical Challenges

### 1. AI judgment accuracy
**Solutions:**
- Dual-model voting mechanism
- Optimized PICO prompts
- A manual review entry point (for UNCERTAIN items)

### 2. Large-batch processing
**Solutions:**
- Fixed concurrency of 3 (p-queue)
- Real-time progress display
- Retry on failure
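
The fixed-concurrency idea can be illustrated without any dependency. The sketch below is a hand-rolled stand-in for p-queue, not the p-queue API itself; `mapWithConcurrency` and its `onProgress` callback are hypothetical names, and retry on failure is deliberately left out.

```typescript
// Run `worker` over all items with at most `limit` calls in flight at once.
// Results keep the order of `items`; `onProgress` fires after each completed item.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
  onProgress?: (done: number, total: number) => void,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item
  let done = 0; // number of completed items

  // Each runner repeatedly claims the next index until the list is exhausted.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++; // no race: JavaScript runs this synchronously between awaits
      results[i] = await worker(items[i], i);
      done++;
      if (onProgress) onProgress(done, items.length);
    }
  }

  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

Screening a batch would then look like `mapWithConcurrency(papers, 3, screenOne, reportProgress)`, where `screenOne` and `reportProgress` are placeholders for the project's own functions. If any worker rejects, the returned promise rejects as well, so failures that should not abort the batch need to be caught inside the worker.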

### 3. CSV parsing
**Solutions:**
- Use the papaparse library
- Support multiple encodings (UTF-8, GBK)
- Fault-tolerant handling
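
One way to support both encodings is to decode the uploaded bytes as strict UTF-8 first and fall back to GBK only when that fails, then hand the resulting string to the CSV parser. The sketch below uses the standard `TextDecoder`; `decodeCsvBytes` is a hypothetical helper, and the `gbk` label is assumed to be available in the runtime (it requires full ICU data in Node.js).

```typescript
interface DecodedCsv {
  text: string;
  encoding: "utf-8" | "gbk";
}

// Try strict UTF-8 first; if the bytes are not valid UTF-8, fall back to GBK.
function decodeCsvBytes(bytes: Uint8Array): DecodedCsv {
  try {
    // fatal: true makes the decoder throw on invalid byte sequences
    // instead of silently inserting replacement characters.
    const text = new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return { text, encoding: "utf-8" };
  } catch {
    const text = new TextDecoder("gbk").decode(bytes);
    return { text, encoding: "gbk" };
  }
}
```

This is a heuristic rather than true encoding detection: a GBK file whose bytes happen to form valid UTF-8 would be misread, so showing the detected encoding to the user and allowing re-upload remains useful.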

### 4. Full-text PDF extraction
**Solutions:**
- Call extraction_service
- Fallback strategy: Nougat → PyMuPDF
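
The fallback strategy amounts to trying extractors in order and keeping the first usable result. The sketch below shows that pattern only: the `Extractor` interface and `extractWithFallback` are hypothetical, the extractors are passed in by the caller rather than wired to extraction_service, and treating empty output as a failure is an added assumption.

```typescript
// Hypothetical extractor abstraction; the caller supplies the real implementations.
interface Extractor {
  name: string;
  extract: (pdfPath: string) => Promise<string>;
}

interface ExtractionOutcome {
  text: string;
  usedExtractor: string;
}

// Try each extractor in order and return the first non-empty result.
async function extractWithFallback(
  pdfPath: string,
  extractors: Extractor[],
): Promise<ExtractionOutcome> {
  const errors: string[] = [];
  for (const { name, extract } of extractors) {
    try {
      const text = await extract(pdfPath);
      if (text.trim().length > 0) {
        return { text, usedExtractor: name };
      }
      errors.push(`${name}: empty output`); // assumption: empty text counts as a failure
    } catch (err) {
      errors.push(`${name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  // Every extractor failed; the caller can prompt the user to handle the file manually.
  throw new Error(`all extractors failed (${errors.join("; ")})`);
}
```

Passing the extractors as `[nougat, pymupdf]` reproduces the Nougat → PyMuPDF order, and the collected error messages give the user a concrete reason when both fail.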

---

## ✅ Quick Development Checklist

**Confirm before starting development:**
- [ ] Is the LLM gateway implemented? (If not, develop it in parallel during Week 2, Day 1-3)
- [ ] Has the database schema been created? (asl_schema)
- [ ] Has the Prisma schema been updated?
- [ ] Are the API routes registered?
- [ ] Are the frontend routes configured?

**FAQ:**

**Q: What if an LLM call times out?**
A: Set timeout=60s and add a retry mechanism (up to 3 times)
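
The timeout and retry advice can be sketched as two small helpers. This is an illustration under assumptions: `withTimeout` and `callWithRetry` are hypothetical names, "up to 3 times" is read here as 3 attempts in total, and a timed-out call is abandoned rather than cancelled (real cancellation would need something like an `AbortController` passed to the underlying request).

```typescript
// Reject if `task` has not settled within `ms` milliseconds.
function withTimeout<T>(task: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    task.then(
      (value) => {
        clearTimeout(timer); // stop the timer so it cannot keep the process alive
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Call `call` up to `maxAttempts` times, applying the timeout to each attempt.
async function callWithRetry<T>(
  call: () => Promise<T>,
  timeoutMs = 60_000,
  maxAttempts = 3,
): Promise<T> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await withTimeout(call(), timeoutMs);
    } catch (err) {
      lastError = err; // remember the failure and move on to the next attempt
    }
  }
  throw lastError;
}
```

A production version would usually wait between attempts (for example with exponential backoff) and retry only on errors that are worth retrying; both are omitted here to keep the sketch short.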
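
**Q: What if CSV parsing fails?**
A: Check the file encoding, show a clear error message, and allow the user to re-upload

**Q: What if both models return UNCERTAIN?**
A: Mark the paper as UNCERTAIN and tell the user that manual review is required

**Q: What if PDF extraction fails?**
A: Fallback strategy: Nougat → PyMuPDF → prompt the user to handle the file manually

---
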
## 📖 Further Details

**For the complete PRD:**
→ `00-项目概述/AI智能文献PRD(1-3).md` (3 documents)

**For database details:**
→ `02-技术设计/01-数据库设计.md` under the AI智能文献 directory

**For API details:**
→ `02-技术设计/02-API设计规范.md` under the AI智能文献 directory

**For UI design:**
→ `01-设计文档/AI智能文献-标题摘要初筛原型.html`
→ `01-设计文档/AI智能文献-全文复筛.html`

---

**Last updated:** 2025-11-06
**Maintainer:** Technical Architect