# AI Literature Module - Technical Debt List

> **Document version:** v1.0
> **Created:** 2025-11-21
> **Maintainer:** AI Literature development team
> **Last updated:** 2025-11-21
> **Purpose:** record technical issues to be optimized after MVP completion

---

## 📋 Document Overview

This document records issues in the AI Literature module that were found after MVP development and need optimization but do not block core functionality. They will be addressed by priority once the MVP is running stably.

**Current MVP status**:

- ✅ Core flow complete (upload → screen → review)
- ✅ Dual-model screening available (DeepSeek + Qwen)
- ✅ Frontend/backend integration passed
- ⚠️ Accuracy is 60%, below the 85% target
- ⚠️ Slow: 199 papers take about 33-66 minutes

---

## 🔴 Priority 1: Quality Optimization (Accuracy)

### Problem Description

**Current state**:

- Accuracy: 60%
- Target: ≥85%
- Gap: 25 percentage points

**Impact**:

- Directly affects user trust in AI screening results
- Increases the manual review workload
- May cause papers to be wrongly included or excluded

**Root causes** (from the 2025-11-18 test report):

1. **Prompt not clear enough**: the AI interprets "borderline cases" differently from humans
2. **No few-shot examples**: without reference cases, the models struggle to apply the criteria
3. **Vague PICOS criteria**: user-entered criteria can be ambiguous
4. **Insensitive conflict detection**: only conclusion mismatches are detected; confidence and PICO differences are ignored

---

### Optimization Plan 1: Few-shot Examples

**Goal**: add 3-5 high-quality examples to the prompt

**Steps**:

#### Step 1: Design the example structure
```
Each example contains:
1. Paper title and abstract (condensed)
2. PICOS criteria
3. Inclusion/exclusion criteria
4. The correct decision (include/exclude)
5. A detailed reasoning trace
```
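
The structure above could be captured as a small TypeScript type. This is a sketch only; the `FewShotExample` name and its fields are illustrative assumptions, not existing code:

```typescript
// Sketch of a few-shot example record; the type name and fields are
// illustrative assumptions, not part of the existing codebase.
interface FewShotExample {
  title: string;          // paper title (condensed)
  abstract: string;       // paper abstract (condensed)
  picos: string;          // PICOS criteria as shown to the model
  criteria: string;       // inclusion/exclusion criteria
  decision: 'include' | 'exclude';
  reasoning: string;      // detailed reasoning trace
}

const example1: FewShotExample = {
  title: 'SGLT2 inhibitors in adults with type 2 diabetes: an RCT',
  abstract: 'Randomized controlled trial of an SGLT2 inhibitor in adults with T2DM...',
  picos: 'P: adults with type 2 diabetes; I: SGLT2 inhibitors; ...',
  criteria: 'Include RCTs in adults; exclude type 1 diabetes and pediatric studies',
  decision: 'include',
  reasoning: 'Population, intervention, and study design all match the PICOS criteria.',
};
```

Typing the examples this way also makes it easy to validate them before they are rendered into the prompt file.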

#### Step 2: Choose example types
```
Example 1: Clear include - matches all PICOS criteria
Example 2: Clear exclude - population mismatch
Example 3: Clear exclude - study design mismatch
Example 4: Borderline - partial match, but should be included
Example 5: Borderline - looks like a match, but should be excluded
```

#### Step 3: Write the examples
```
Draw on successful and failed cases from real test runs
Make sure the examples cover the common decision scenarios
```

#### Step 4: Integrate into the prompt
```
Location: backend/prompts/asl/screening/v1.1.0-fewshot.txt
Format:
---
## Example 1: Clear include
[Paper]: ...
[PICOS]: ...
[Decision]: include
[Reason]: ...
---
```
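
The block format above could be rendered from structured examples with a small helper. A minimal sketch, assuming a hypothetical `buildFewShotBlock` function and example shape (the real integration reads the prompt file directly):

```typescript
// Hypothetical helper that renders few-shot examples in the block format
// shown above; the function name and example shape are assumptions.
interface ScreeningExample {
  label: string;                     // e.g. "Example 1: Clear include"
  literature: string;
  picos: string;
  decision: 'include' | 'exclude';
  reason: string;
}

function buildFewShotBlock(examples: ScreeningExample[]): string {
  return examples
    .map(ex =>
      [
        '---',
        `## ${ex.label}`,
        `[Paper]: ${ex.literature}`,
        `[PICOS]: ${ex.picos}`,
        `[Decision]: ${ex.decision}`,
        `[Reason]: ${ex.reason}`,
        '---',
      ].join('\n')
    )
    .join('\n');
}

const block = buildFewShotBlock([
  {
    label: 'Example 1: Clear include',
    literature: 'RCT of SGLT2 inhibitors in adults with T2DM',
    picos: 'P: adults with T2DM; I: SGLT2 inhibitors',
    decision: 'include',
    reason: 'All PICOS elements match.',
  },
]);
```

Generating the block from data keeps the examples consistent when they are edited during iteration.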

**Expected gain**: accuracy +10-15% (60% → 70-75%)

**Estimated effort**: 1 day

---

### Optimization Plan 2: Clarify PICOS Criteria

**Goal**: help the AI interpret the user's PICOS criteria more accurately

**Steps**:

#### Step 1: Enrich the PICOS input
```typescript
// Current input
picoCriteria: {
  P: "Adults with type 2 diabetes",
  I: "SGLT2 inhibitors",
  ...
}

// Enriched input
picoCriteria: {
  P: {
    description: "Adults with type 2 diabetes",
    keywords: ["type 2 diabetes", "adult", "T2DM"],
    mustInclude: ["diabetes"],
    mustExclude: ["type 1", "children", "adolescents"]
  },
  ...
}
```

#### Step 2: Spell out requirements in the prompt
```
Add to the prompt:
- Which keywords must appear
- Which keywords must not appear
- Criteria for judging partial matches (i.e. what "partial match" means)
```

#### Step 3: Adjust the frontend form
```
In TitleScreeningSettings.tsx:
- Add a "keyword extraction" helper for each PICO field
- Add advanced "must include" / "must exclude" options
- Provide standard templates
```
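
As a cheap sanity check before any model call, the enriched `mustInclude` / `mustExclude` fields could drive a keyword pre-screen. A sketch under assumptions (the `keywordPreScreen` function and its return values are illustrative, not existing code):

```typescript
// Hypothetical pre-screen using the enriched PICOS fields: a paper that
// contains a mustExclude keyword, or misses a mustInclude keyword, can be
// flagged before any LLM call is made.
interface EnhancedCriterion {
  description: string;
  keywords: string[];
  mustInclude: string[];
  mustExclude: string[];
}

function keywordPreScreen(
  text: string,
  criterion: EnhancedCriterion
): 'exclude' | 'pass' {
  const lower = text.toLowerCase();
  // Any forbidden keyword present: exclude outright
  if (criterion.mustExclude.some(kw => lower.includes(kw.toLowerCase()))) {
    return 'exclude';
  }
  // A required keyword missing: also exclude
  if (!criterion.mustInclude.every(kw => lower.includes(kw.toLowerCase()))) {
    return 'exclude';
  }
  return 'pass'; // hand over to the LLM for the real decision
}

const p: EnhancedCriterion = {
  description: 'Adults with type 2 diabetes',
  keywords: ['type 2 diabetes', 'adult', 'T2DM'],
  mustInclude: ['diabetes'],
  mustExclude: ['type 1', 'children'],
};
```

Simple substring matching like this only catches the obvious cases; borderline papers still go to the dual-model screening.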

**Expected gain**: accuracy +5-10% (75% → 80-85%)

**Estimated effort**: 2 days

---

### Optimization Plan 3: Confidence Threshold Tuning

**Goal**: raise the confidence of model decisions and reduce uncertainty

**Steps**:

#### Step 1: Analyze the confidence distribution
```sql
-- Query the confidence distribution
SELECT
  ROUND(ds_confidence * 10) / 10 AS confidence_range,
  COUNT(*) AS count
FROM asl_schema.screening_results
GROUP BY confidence_range
ORDER BY confidence_range;
```

#### Step 2: Adjust the prompt requirements
```
Make explicit in the prompt:
- When to give high confidence (0.8-1.0)
- When to give medium confidence (0.5-0.8)
- When to give low confidence (0-0.5)
- Results below 0.7 are automatically flagged "needs human review"
```
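
The 0.7 cutoff from the last rule could be applied when results are saved. A minimal sketch, assuming a hypothetical `needsHumanReview` helper:

```typescript
// Flag a dual-model result for human review when either model is below
// the 0.7 confidence threshold described above. Helper name is hypothetical.
const REVIEW_THRESHOLD = 0.7;

function needsHumanReview(dsConfidence: number, qwenConfidence: number): boolean {
  return dsConfidence < REVIEW_THRESHOLD || qwenConfidence < REVIEW_THRESHOLD;
}
```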

#### Step 3: Improve conflict detection
```typescript
// Current: only conclusion mismatches are detected
hasConflict = (dsConclusion !== qwenConclusion);

// Improved: also detect confidence gaps and key PICO disagreements
hasConflict =
  (dsConclusion !== qwenConclusion) ||                              // conclusions differ
  (Math.abs(dsConfidence - qwenConfidence) > 0.3) ||                // large confidence gap
  (dsJudgments.P !== qwenJudgments.P && important.includes('P'));   // key PICO element disagrees
```

**Expected gain**: conflict detection accuracy +10%, fewer missed conflicts

**Estimated effort**: 0.5 day

---

### Optimization Plan 4: Test and Iterate

**Goal**: keep testing and tuning until accuracy is ≥85%

**Steps**:

#### Step 1: Use the existing test script
```bash
cd backend
npm run test:llm

# or run it directly
npx ts-node scripts/test-llm-screening.ts
```

#### Step 2: Analyze failed cases
```
For each failed case:
1. Record the AI's decision
2. Record the correct answer
3. Analyze why they differ
4. Adjust the prompt or the examples
```

#### Step 3: A/B test
```
Test different prompt versions:
- v1.0.0-mvp (current, 60%)
- v1.1.0-fewshot (+ few-shot examples)
- v1.2.0-picos-enhanced (+ clarified PICOS)
- v1.3.0-confidence (+ confidence tuning)
```

#### Step 4: Record the test results
```
Create a test report with:
- Accuracy over time
- Version-by-version comparison
- Failed case analysis
- Final recommended version
```
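
Comparing versions only requires computing accuracy per prompt version over the same labeled test set. A sketch (the `TestCase` shape and sample data are illustrative assumptions):

```typescript
// Compute accuracy for one prompt version against gold labels.
// The record shape and the sample data are illustrative assumptions.
interface TestCase {
  predicted: 'include' | 'exclude';
  expected: 'include' | 'exclude';
}

function accuracy(cases: TestCase[]): number {
  const correct = cases.filter(c => c.predicted === c.expected).length;
  return correct / cases.length;
}

const v1Results: TestCase[] = [
  { predicted: 'include', expected: 'include' },
  { predicted: 'exclude', expected: 'include' },
  { predicted: 'exclude', expected: 'exclude' },
  { predicted: 'include', expected: 'include' },
  { predicted: 'exclude', expected: 'exclude' },
];
// accuracy(v1Results) → 0.8
```

Running this over each version's outputs produces the accuracy curve called for in the test report.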

**Estimated effort**: 1-2 days (iterative)

---

### Quality Optimization Summary

**Expected gain**: accuracy 60% → 85-90%

**Estimated total effort**: 4-5 days

**Owner**: AI engineer + medical expert

**Acceptance criteria**:

- ✅ Accuracy ≥ 85%
- ✅ Dual-model agreement rate ≥ 80%
- ✅ Manual review queue ≤ 20%
- ✅ Reasonable confidence distribution (60%+ high confidence)

---

## 🟡 Priority 2: Performance Optimization (Concurrency)

### Problem Description

**Current state**:

- Processing: serial, one paper at a time
- Speed: 10-20 s/paper (DeepSeek + Qwen in parallel)
- Total: ~33-66 minutes for 199 papers

**Target**:

- Processing: 3-5 concurrent papers
- Total: ~10-20 minutes for 199 papers (about 3x faster)

**Impact**:

- User experience (long waits)
- Cloud cost (resources held for a long time)

---

### Optimization Plan: Concurrent Processing

**Steps**:

#### Step 1: Install a concurrency limiter
```bash
cd backend
npm install p-limit
```

#### Step 2: Modify the screening service
```typescript
// File: backend/src/modules/asl/services/screeningService.ts

import pLimit from 'p-limit';

// Inside processLiteraturesInBackground

// ❌ Current: serial processing
for (const literature of literatures) {
  await llmScreeningService.dualModelScreening(...);
}

// ✅ Improved: concurrent processing
const concurrency = 3; // 3 papers at a time
const limit = pLimit(concurrency);

const tasks = literatures.map((literature, index) =>
  limit(async () => {
    try {
      console.log(`\n🔍 Processing paper ${index + 1}/${literatures.length}`);

      // Run the LLM screening
      const screeningResult = await llmScreeningService.dualModelScreening(...);

      // Save the result
      await prisma.aslScreeningResult.create({ data: screeningResult });

      // Update progress
      await updateTaskProgress(...);

      console.log(`✅ Paper ${index + 1}/${literatures.length} done`);
    } catch (error) {
      console.error(`❌ Paper ${index + 1}/${literatures.length} failed:`, error);
      // Keep processing the remaining papers
    }
  })
);

await Promise.all(tasks);
```

#### Step 3: Batch the progress updates
```typescript
// Problem: under high concurrency, the database is updated too often
// Solution: batch the updates, or keep counters in memory

let processedCount = 0;
let successCount = 0;
let conflictCount = 0;
let failedCount = 0;

// Flush to the database every 5 papers or every 10 seconds
const updateInterval = setInterval(async () => {
  await prisma.aslScreeningTask.update({
    where: { id: taskId },
    data: {
      processedItems: processedCount,
      successItems: successCount,
      conflictItems: conflictCount,
      failedItems: failedCount,
    }
  });
}, 10000); // every 10 seconds

// Clean up when processing is done
clearInterval(updateInterval);
```

#### Step 4: Add rate-limit protection
```typescript
// Avoid hitting provider API rate limits
const API_RATE_LIMITS = {
  'deepseek-chat': { rpm: 30, tpm: 100000 }, // 30 requests/minute
  'qwen-max': { rpm: 60, tpm: 200000 },
};

// Derive the concurrency dynamically
function calculateOptimalConcurrency(model: string): number {
  const limit = API_RATE_LIMITS[model];
  // Conservative: stay well under the limit
  return Math.floor(limit.rpm / 20); // DeepSeek: 1-2, Qwen: 3
}

const concurrency = Math.min(
  calculateOptimalConcurrency('deepseek-chat'),
  calculateOptimalConcurrency('qwen-max')
); // take the minimum, about 3
```

#### Step 5: Add retries on error
```typescript
async function processWithRetry(
  literature: any,
  maxRetries: number = 2
): Promise<any> {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await llmScreeningService.dualModelScreening(...);
    } catch (error) {
      console.error(`❌ Attempt ${attempt}/${maxRetries} failed:`, error);
      if (attempt === maxRetries) throw error;
      // Wait before retrying (backoff grows with each attempt)
      await new Promise(resolve => setTimeout(resolve, 1000 * attempt));
    }
  }
}
```

**Expected gain**:

- ~3x faster processing
- 199 papers: 33-66 minutes → 10-20 minutes
- Noticeably better user experience

**Estimated effort**: 0.5-1 day

**Owner**: backend developer

**Acceptance criteria**:

- ✅ Screening 199 papers takes ≤ 20 minutes
- ✅ API calls do not trigger rate limiting
- ✅ Error rate does not increase
- ✅ Progress display works correctly

---

## 🟢 Priority 3: User Experience Optimization

### Issue List

#### 1. Browser performance warning
```
[Violation] 'setTimeout' handler took 72ms
```

**Cause**:

- Expensive React component renders
- Large table data volume

**Solution**:

- Use virtual scrolling (`react-window`)
- Optimize table rendering (avoid unnecessary re-renders)
- Cache derived values with `useMemo`

**Estimated effort**: 0.5 day

---

#### 2. No estimated time remaining

**Problem**: users don't know how much longer to wait

**Solution**:
```typescript
// Compute the estimate
const avgTimePerLit = (Date.now() - task.startedAt) / task.processedItems;
const remainingLits = task.totalItems - task.processedItems;
const estimatedTimeRemaining = avgTimePerLit * remainingLits;

// Display
<div>
  Estimated time remaining: {formatDuration(estimatedTimeRemaining)}
</div>
```
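
The snippet assumes a `formatDuration` helper. A minimal sketch of what it might look like (the real helper, if one exists, may format differently):

```typescript
// Minimal formatDuration sketch: milliseconds → "Xm Ys".
// This is an illustrative assumption, not the project's actual helper.
function formatDuration(ms: number): string {
  const totalSeconds = Math.round(ms / 1000);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return minutes > 0 ? `${minutes}m ${seconds}s` : `${seconds}s`;
}
```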

**Estimated effort**: 0.5 day

---

#### 3. No display of the paper currently being processed

**Problem**: users can't see which paper the AI is working on

**Solution**:
```typescript
// In screeningService.ts
await prisma.aslScreeningTask.update({
  where: { id: taskId },
  data: {
    currentLiteratureTitle: literature.title, // new field
    currentLiteratureId: literature.id,
  }
});

// Frontend display
<div>
  Currently processing: {task.currentLiteratureTitle}
</div>
```

**Estimated effort**: 0.5 day

---

#### 4. Table layout on small screens

**Problem**: the table columns don't fit on small screens

**Solution**:

- Use a responsive layout
- Add a "compact mode" toggle
- Replace the table with a card layout on mobile

**Estimated effort**: 1 day

---

## 🟣 Priority 4: Excel Export Optimization

### Problem Description

**Current state**:

- Export method: generated in the frontend (`xlsx` library)
- Suitable data volume: <5000 rows
- Generation speed: ~2-3 seconds for <1000 rows

**Target state** (when data exceeds 5000 rows or complex formatting is needed):

- Export method: backend generation + OSS storage
- Data volume: unlimited
- Complex formatting supported: multiple sheets, charts, custom styles

**Trigger conditions**:

- A single export exceeds 5000 rows
- Complex Excel formatting is needed (multiple sheets, charts, etc.)
- Users report that frontend export freezes

---

### Optimization Plan: Backend Export + OSS Storage

**Steps**:

#### Step 1: Install an Excel library on the backend
```bash
cd backend
npm install exceljs
```

#### Step 2: Implement the backend export service
```typescript
// backend/src/modules/asl/services/exportService.ts
import ExcelJS from 'exceljs';
import { storage } from '@/common/storage';
import { logger } from '@/common/logging';

export async function exportScreeningResults(projectId: string, filter: string) {
  // 1. Query the data
  const results = await prisma.aslScreeningResult.findMany({
    where: buildWhereClause(projectId, filter),
    include: { literature: true },
  });

  // 2. Generate the Excel workbook (in memory)
  const workbook = new ExcelJS.Workbook();
  const worksheet = workbook.addWorksheet('Screening Results');

  // Header row
  worksheet.columns = [
    { header: 'No.', key: 'index', width: 6 },
    { header: 'Title', key: 'title', width: 50 },
    // ... more columns
  ];

  // Fill in the rows
  results.forEach((result, idx) => {
    worksheet.addRow({
      index: idx + 1,
      title: result.literature.title,
      // ... more fields
    });
  });

  // 3. Convert to a Buffer
  const buffer = await workbook.xlsx.writeBuffer();

  // 4. ⭐ Upload to OSS (via the platform storage service)
  const key = `asl/exports/${projectId}/${Date.now()}.xlsx`;
  const url = await storage.upload(key, Buffer.from(buffer));

  // 5. Log it
  logger.info('Excel exported', { projectId, recordCount: results.length, url });

  return {
    url,
    filename: `screening-results-${Date.now()}.xlsx`,
    recordCount: results.length,
  };
}
```

#### Step 3: Implement the export API
```typescript
// backend/src/modules/asl/controllers/exportController.ts
export async function exportResults(
  request: FastifyRequest<{
    Params: { projectId: string };
    Querystring: { filter?: string };
  }>,
  reply: FastifyReply
) {
  try {
    const { projectId } = request.params;
    const filter = request.query.filter || 'all';

    // Export and upload to OSS
    const result = await exportService.exportScreeningResults(projectId, filter);

    return reply.send({
      success: true,
      data: result,
    });
  } catch (error) {
    logger.error('Export failed', { error });
    return reply.status(500).send({
      success: false,
      error: 'Export failed',
    });
  }
}
```

#### Step 4: Call it from the frontend
```typescript
// Frontend
const handleExportLarge = async () => {
  try {
    message.loading('Generating Excel, please wait...', 0);

    // Call the backend export API
    const { data } = await aslApi.exportResults(projectId, { filter: 'all' });

    message.destroy();
    message.success(`Exported ${data.recordCount} records`);

    // Download via the OSS URL
    window.open(data.url, '_blank');
  } catch (error) {
    message.destroy();
    message.error('Export failed');
  }
};
```

#### Step 5: OSS file cleanup (optional)
```typescript
// Scheduled job: delete export files older than 7 days
import { jobQueue } from '@/common/jobs';

jobQueue.schedule('cleanup-exports', '0 2 * * *', async () => {
  const sevenDaysAgo = new Date(Date.now() - 7 * 24 * 60 * 60 * 1000);

  // List and delete expired files
  const files = await storage.list('asl/exports/');
  for (const file of files) {
    if (file.lastModified < sevenDaysAgo) {
      await storage.delete(file.key);
    }
  }

  logger.info('Cleaned up old export files');
});
```

**Expected gain**:

- Unlimited data volume
- Complex formatting (multiple sheets, charts, styles)
- No load on the frontend

**Estimated effort**: 1-2 days

**Owner**: backend developer

**Acceptance criteria**:

- ✅ Can export >5000 rows
- ✅ Files uploaded to OSS
- ✅ Frontend downloads via URL
- ✅ Cloud-native compliant (uses the platform storage service)

---

## 🔵 Priority 5: Architecture Optimization (Cloud-Native)

### Issue List

#### 1. Async tasks don't use a message queue

**Current state**:

- Screening tasks run in a background thread
- Tasks are lost when the service restarts

**Target state**:

- Use a Bull queue (Redis)
- Tasks are persisted
- Distributed processing supported

**Solution**:
```typescript
// Use the platform-provided jobQueue
import { jobQueue } from '@/common/jobs';

// Enqueue a task
await jobQueue.push('asl:screening', {
  projectId,
  literatures,
  config,
});

// Worker
jobQueue.process('asl:screening', async (job) => {
  await screeningService.processLiteratures(job.data);
});
```

**Estimated effort**: 1-2 days

---

#### 2. No resume after interruption

**Problem**: an interrupted task has to start over from the beginning

**Solution**:
```typescript
// Check for an unfinished task
const existingTask = await prisma.aslScreeningTask.findFirst({
  where: {
    projectId,
    status: 'running',
  }
});

if (existingTask) {
  // Resume: skip papers that were already processed
  const processedLiteratureIds = await getProcessedLiteratureIds(existingTask.id);
  const remainingLiteratures = literatures.filter(
    lit => !processedLiteratureIds.includes(lit.id)
  );
  await resumeTask(existingTask.id, remainingLiteratures);
} else {
  // Start a new task
  await startNewTask(projectId, literatures);
}
```

**Estimated effort**: 1 day

---

#### 3. No cost control

**Problem**: API call costs cannot be estimated or capped

**Solution**:
```typescript
// Add a cost estimate
interface CostEstimate {
  totalTokens: number;
  estimatedCost: number; // USD
  processingTime: number; // seconds
}

function estimateCost(literatures: Literature[]): CostEstimate {
  const avgTokensPerLit = 1500; // title + abstract is about 1500 tokens
  const totalTokens = literatures.length * avgTokensPerLit * 2; // two models

  const deepseekCost = (totalTokens / 1000) * 0.001; // $0.001/1K tokens
  const qwenCost = (totalTokens / 1000) * 0.002; // $0.002/1K tokens

  return {
    totalTokens,
    estimatedCost: deepseekCost + qwenCost,
    processingTime: literatures.length * 15, // 15 s/paper
  };
}

// Frontend display
const estimate = estimateCost(literatures);
<Alert>
  Estimated tokens: {estimate.totalTokens}
  Estimated cost: ${estimate.estimatedCost.toFixed(2)}
  Estimated time: {formatDuration(estimate.processingTime)}
</Alert>
```

**Estimated effort**: 0.5 day

---

## 📊 Technical Debt Priority Matrix

| Debt item | Scope | Urgency | Est. effort | ROI | Priority |
|-----------|-------|---------|-------------|-----|----------|
| **Prompt optimization** | Core quality | High | 4-5 days | High | P1 🔴 |
| **Concurrent processing** | User experience | Medium | 0.5-1 day | High | P2 🟡 |
| **Estimated time remaining** | User experience | Medium | 0.5 day | Medium | P3 🟢 |
| **Current paper display** | User experience | Low | 0.5 day | Medium | P3 🟢 |
| **Browser performance** | User experience | Low | 0.5 day | Low | P4 🔵 |
| **Message queue** | Architecture stability | Low | 1-2 days | Medium | P4 🔵 |
| **Resume after interruption** | User experience | Low | 1 day | Medium | P4 🔵 |
| **Cost control** | Operations | Low | 0.5 day | Low | P4 🔵 |
| **Small-screen layout** | User experience | Low | 1 day | Low | P4 🔵 |

---

## 🗓️ Recommended Order

### Phase 1: Quality optimization (must)
```
Time: 1 week
Tasks:
1. Few-shot example design (1 day)
2. Clarify PICOS criteria (2 days)
3. Confidence tuning (0.5 day)
4. Test iterations (1-2 days)
Goal: accuracy 60% → 85%
```

### Phase 2: Performance optimization (recommended)
```
Time: 1-2 days
Tasks:
1. Concurrent processing (0.5-1 day)
2. Progress update batching (0.5 day)
Goal: 199 papers 33-66 min → 10-20 min
```

### Phase 3: UX optimization (optional)
```
Time: 2-3 days
Tasks:
1. Estimated time remaining (0.5 day)
2. Current paper display (0.5 day)
3. Browser performance (0.5 day)
4. Small-screen layout (1 day)
Goal: better user experience
```

### Phase 4: Architecture optimization (long term)
```
Time: 3-4 days
Tasks:
1. Message queue integration (1-2 days)
2. Resume after interruption (1 day)
3. Cost control (0.5 day)
Goal: production ready
```

---

## 📝 Decision Log

### 2025-11-21: Defer quality optimization, finish Week 4 features first

**Decision maker**: user

**Decision**:

- Record prompt optimization, concurrent processing, etc. as technical debt
- Finish the Week 4 features first (results view, statistics, export)
- After Week 4 is done, address the debt as actually needed

**Rationale**:

1. The MVP core features already work, so the feature loop can be closed first
2. Statistics and export are strong user needs
3. Quality optimization can be iterated on once the features are complete

**Follow-up**:

- Reassess after the Week 4 features are done
- Set optimization priorities based on user feedback

---

## 📚 Related Documents

- [Module Status and Development Guide](../00-模块当前状态与开发指南.md) - source of known issues
- [Task Breakdown](../04-开发计划/03-任务分解.md) - Week 4 task list
- [Prompt Design and Test Report](../05-开发记录/2025-11-18-Prompt设计与测试完成报告.md) - quality issue analysis
- [Daily Work Summary](../05-开发记录/2025-11-18-今日工作总结.md) - borderline case diagnosis

---

**Document maintenance**:

- Update whenever new technical debt is found
- Mark status whenever a debt item is resolved
- Reassess priorities regularly (monthly)

**Last updated**: 2025-11-21
**Next review**: after Week 4 is complete