# Intelligent Prompt Generation Module - Development Plan

**Version**: v1.0
**Date**: 2025-11-18
**Principles**: simple, direct, actionable

---

## Core Goal

**Problem to solve**: eliminate the gap between how the AI and how human reviewers interpret boundary cases

**Core flow**:
```
User enters PICOS → AI interprets and analyzes → Prompt is generated → User edits → Screening starts
```

---

## MVP Phase (Required)

### Feature Scope

#### 1. User Input ✅

**Frontend form**:
```typescript
// Data collected by the PICOS form (the interface name is illustrative)
interface PicosFormInput {
  pico: {
    population: string;    // study population
    intervention: string;  // intervention
    comparison: string;    // comparator
    outcome: string;       // outcome measures
    studyDesign: string;   // study design
  };
  inclusionCriteria: string;  // inclusion criteria
  exclusionCriteria: string;  // exclusion criteria
}
```

**Implementation**: a single form page with 7 input fields

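Before the form is submitted for analysis, the seven fields can be checked for empty values. The following is a minimal sketch, assuming all seven fields are required (the plan does not say so explicitly); `findMissingFields` and the local `PicosFormInput` type are hypothetical names used only for illustration.

```typescript
// Hypothetical shape mirroring the seven form fields listed above.
interface PicosFormInput {
  pico: {
    population: string;
    intervention: string;
    comparison: string;
    outcome: string;
    studyDesign: string;
  };
  inclusionCriteria: string;
  exclusionCriteria: string;
}

// Returns the names of fields that are empty or whitespace-only.
// Assumption: every field is mandatory.
function findMissingFields(input: PicosFormInput): string[] {
  const missing: string[] = [];
  for (const [key, value] of Object.entries(input.pico)) {
    if (value.trim() === "") missing.push(`pico.${key}`);
  }
  if (input.inclusionCriteria.trim() === "") missing.push("inclusionCriteria");
  if (input.exclusionCriteria.trim() === "") missing.push("exclusionCriteria");
  return missing;
}
```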

---

#### 2. AI Understanding and Analysis 🆕

**Input**: the user's PICOS plus the inclusion/exclusion criteria

**Output**:
```typescript
// Structured result of the AI analysis (the interface name is illustrative)
interface PicosAnalysisResult {
  understanding: {
    mustInclude: string[];  // elements that must be included (3-5 items)
    mustExclude: string[];  // elements that must be excluded (3-5 items)
    ambiguities: Array<{    // ambiguous boundary cases (5-8 items)
      id: number;
      question: string;     // e.g. "What if the population is European/American but the RCT is high quality?"
      aiSuggestion: 'include' | 'exclude' | 'uncertain';
      reason: string;       // rationale for the AI's suggestion
    }>;
  };
}
```

**API**:
```
POST /api/v1/asl/prompt/analyze
```

**Implementation**: call an LLM to analyze the user's input

---

#### 3. User Confirmation Screen 🆕

**Displays**:
- ✅ Must include (each item can be checked/unchecked)
- ❌ Must exclude (each item can be checked/unchecked)
- 🤔 Boundary cases (confirmed one by one: include / exclude / uncertain)

**Implementation**: a modal dialog with three sections

---

#### 4. Automatic Prompt Generation 🆕

**Input**: the rules after user confirmation

**Output**: the complete screening prompt

**Key point**: inject the boundary rules the user confirmed into the prompt

```
## Special rules (based on your confirmations)

1. Region: Asian populations are preferred, but high-quality RCTs in European/American populations may also be included
2. Study type: reviews are excluded, but meta-analyses published after 2020 may be included
3. Comparator: placebo control; another standard drug is also acceptable
...
```

**API**:
```
POST /api/v1/asl/prompt/confirm-rules
```

---

#### 5. Prompt Editor 🆕

**Features**:
- Display the generated prompt
- Let the user edit it
- Save and use it

**Implementation**: a simple textarea plus a Save button

---

#### 6. Enhanced Screening Results ⭐ **Important**

**Current problem**: only the final decision (include/exclude/pending) is shown

**Improvement**: show the **full reasoning of both models**

```typescript
// Example result for one paper (values are illustrative)
const screeningResult = {
  literatureId: 'uuid',
  finalDecision: 'pending',  // 'include' | 'exclude' | 'pending'

  // ⭐ New: detailed results from both models
  model1: {
    modelName: 'DeepSeek-V3',
    conclusion: 'exclude',
    confidence: 0.92,
    judgment: { P: 'match', I: 'match', C: 'mismatch', S: 'match' },
    reason: 'P, I and S match, but the comparator is another drug rather than a placebo...'  // ⭐ key field
  },
  model2: {
    modelName: 'Qwen-Max',
    conclusion: 'include',
    confidence: 0.85,
    judgment: { P: 'match', I: 'match', C: 'partial', S: 'match' },
    reason: 'Population and intervention match; the comparator is not a placebo but is still a meaningful comparison...'  // ⭐ key field
  },

  hasConflict: true,  // the two models disagree
  conflictFields: ['conclusion', 'C']
};
```

**Frontend display**:
```jsx
<Card title="Screening Result">
  <Alert type={finalDecision === 'pending' ? 'warning' : 'success'}>
    Final decision: {finalDecision}
  </Alert>

  <Divider />

  <Row gutter={16}>
    <Col span={12}>
      <Card title="🤖 DeepSeek-V3" type="inner">
        <Tag color={model1.conclusion === 'include' ? 'green' : 'red'}>
          {model1.conclusion}
        </Tag>
        <Statistic title="Confidence" value={model1.confidence} />
        <Divider />
        <h4>Reasoning:</h4>
        <p>{model1.reason}</p>  {/* ⭐ show the reasoning */}
        <Collapse>
          <Panel header="PICO dimension details">
            P: {model1.judgment.P}<br/>
            I: {model1.judgment.I}<br/>
            C: {model1.judgment.C}<br/>
            S: {model1.judgment.S}
          </Panel>
        </Collapse>
      </Card>
    </Col>

    <Col span={12}>
      <Card title="🤖 Qwen-Max" type="inner">
        {/* same as above */}
      </Card>
    </Col>
  </Row>

  {hasConflict && (
    <Alert type="warning" showIcon>
      ⚠️ The two models disagree; manual review is recommended
    </Alert>
  )}

  {/* ⭐ Manual review button */}
  <Button type="primary" onClick={handleManualReview}>
    Review this paper manually
  </Button>
</Card>
```

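The `hasConflict`, `conflictFields` and `finalDecision` fields shown above could be derived by comparing the two model results field by field. The sketch below is one possible way to do that, not the module's confirmed logic. It assumes that any differing field counts as a conflict and that the final decision is `pending` whenever the two conclusions differ. `compareModels`, `ModelResult` and `ConflictSummary` are hypothetical names.

```typescript
type Verdict = "match" | "partial" | "mismatch";
type Conclusion = "include" | "exclude";

// Hypothetical types mirroring the per-model result shown above.
interface ModelResult {
  modelName: string;
  conclusion: Conclusion;
  confidence: number;
  judgment: Record<"P" | "I" | "C" | "S", Verdict>;
  reason: string;
}

interface ConflictSummary {
  finalDecision: Conclusion | "pending";
  hasConflict: boolean;
  conflictFields: string[];
}

// Assumption: 'pending' when the conclusions differ; any differing field is a conflict.
function compareModels(a: ModelResult, b: ModelResult): ConflictSummary {
  const conflictFields: string[] = [];
  if (a.conclusion !== b.conclusion) conflictFields.push("conclusion");
  for (const dim of ["P", "I", "C", "S"] as const) {
    if (a.judgment[dim] !== b.judgment[dim]) conflictFields.push(dim);
  }
  return {
    finalDecision: a.conclusion === b.conclusion ? a.conclusion : "pending",
    hasConflict: conflictFields.length > 0,
    conflictFields,
  };
}
```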

---

### MVP Development Checklist

**Week 1: Backend**

| Task | Estimate | Priority |
|------|----------|----------|
| API: analyze PICOS | 2 days | P0 |
| API: generate prompt | 1 day | P0 |
| Enhance the screening result structure | 0.5 day | P0 |
| Testing | 0.5 day | P0 |

**Week 2: Frontend**

| Task | Estimate | Priority |
|------|----------|----------|
| PICOS input form | 0.5 day | P0 |
| User confirmation screen | 1.5 days | P0 |
| Prompt editor | 0.5 day | P0 |
| Enhanced result display | 1 day | P0 |
| Testing and tuning | 0.5 day | P0 |

**Total**: 2 weeks (10 working days; the estimates above sum to 8 days)

---

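## Phase 2.0 (Optional Features)

### Feature 1: Automatic Few-shot Learning 🔮

**Trigger**: after the user corrects an AI judgment

**Flow**:
```
1. AI judgment: Exclude
2. User correction: should be Include
3. User explains why: "The population is European/American, but the RCT is high quality"
   ↓
4. The system records the case
   ↓
5. In the next screening run, the case is added to the prompt as a few-shot example
```

**Data structure**:
```typescript
// One user correction stored as a few-shot case (the interface name is illustrative)
interface FewShotCase {
  caseId: string;
  literature: {
    title: string;
    abstract: string;
  };
  aiDecision: 'include' | 'exclude';    // e.g. 'exclude'
  userDecision: 'include' | 'exclude';  // e.g. 'include'
  userReason: string;                   // e.g. "The population is European/American, but the RCT is high quality"
  picoCriteria: PicoCriteria;           // the PICOS in effect at the time
  createdAt: Date;
}
```

**Prompt enhancement**:
```
## Reference cases (few-shot examples)

The following are cases you corrected earlier; please use them as a reference:

Case 1:
Title: TICA-CLOP STUDY...
AI judgment: Exclude (North African population)
Your decision: Include
Your reason: The population is North African, but the RCT is high quality and its methods are a useful reference
→ Takeaway: the region requirement can be relaxed when study quality is high

Case 2:
...
```

**Implementation complexity**: medium (requires managing a case library)
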
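The few-shot section of the prompt could be rendered from stored cases roughly as follows. This is a sketch under stated assumptions: `buildFewShotSection` is a hypothetical helper, the `maxCases` cap is an assumed safeguard against overly long prompts, and the "Takeaway" line from the example above is omitted because producing it would need a separate summarization step.

```typescript
// Hypothetical, trimmed-down view of a stored few-shot case.
interface FewShotCaseView {
  title: string;
  aiDecision: "include" | "exclude";
  aiReason: string;
  userDecision: "include" | "exclude";
  userReason: string;
}

// Renders the most recent cases as a prompt section; returns "" when there are none.
function buildFewShotSection(cases: FewShotCaseView[], maxCases = 5): string {
  if (cases.length === 0) return "";
  const blocks = cases.slice(-maxCases).map((c, i) =>
    [
      `Case ${i + 1}:`,
      `Title: ${c.title}`,
      `AI judgment: ${c.aiDecision} (${c.aiReason})`,
      `Your decision: ${c.userDecision}`,
      `Your reason: ${c.userReason}`,
    ].join("\n")
  );
  return ["## Reference cases (few-shot examples)", ...blocks].join("\n\n");
}
```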

---

### Feature 2: Test Mode 🧪

**Use case**: the user wants to try 10 papers first to train the AI's understanding

**Flow**:
```
1. The user uploads 10 test papers (5 to include + 5 to exclude)
   ↓
2. The user labels each paper: Include/Exclude + reason
   ↓
3. The AI learns the user's judgment patterns
   ↓
4. A customized prompt is generated
   ↓
5. The prompt is used for the real screening run
```

**UI**:
```jsx
<TestMode>
  <Upload>Upload 10 test papers (Excel/JSON)</Upload>

  <Table>
    {testCases.map(lit => (
      <Row key={lit.title}>
        <td>{lit.title}</td>
        <td>
          <Radio.Group>
            <Radio value="include">Include</Radio>
            <Radio value="exclude">Exclude</Radio>
          </Radio.Group>
        </td>
        <td>
          <Input.TextArea placeholder="Please explain your reason" />
        </td>
      </Row>
    ))}
  </Table>

  <Button onClick={analyzeTestCases}>
    Analyze my judgment patterns
  </Button>
</TestMode>
```

**AI analysis**:
```
Analysis of the user's judgment patterns:

1. Regional flexibility:
   - Case 1 (North African RCT) → include
   - Case 3 (European cohort study) → exclude
   → Conclusion: non-Asian populations are acceptable as long as the study is an RCT

2. Study type:
   - Case 2 (meta-analysis) → include
   - Case 5 (traditional review) → exclude
   → Conclusion: meta-analyses are acceptable; traditional reviews are excluded

3. Publication date:
   - Case 4 (published in 2019) → exclude
   → Conclusion: strictly enforce the post-2020 requirement
```

**Implementation complexity**: high (requires pattern recognition)

---

### Feature 3: Prompt Template Library 📚

**Features**:
- Save a user-generated prompt as a template
- Reuse it directly next time
- Share it with team members

**Implementation complexity**: low

---

### Phase 2.0 Development Checklist

| Feature | Estimate | Priority | Depends on |
|---------|----------|----------|------------|
| Few-shot learning | 3 days | P1 | MVP complete |
| Test mode | 5 days | P2 | MVP complete |
| Prompt template library | 2 days | P1 | MVP complete |

**Total**: 2 weeks

---

## Technical Implementation Details

### 1. Prompt for the AI Analysis of PICOS

```typescript
const analyzePrompt = `
You are an expert in medical literature screening. The user has provided PICOS criteria and inclusion/exclusion criteria. Analyze them and produce the following:

[User input]
Population: ${population}
Intervention: ${intervention}
Comparison: ${comparison}
Outcome: ${outcome}
Study design: ${studyDesign}

Inclusion criteria:
${inclusionCriteria}

Exclusion criteria:
${exclusionCriteria}

[Analysis tasks]
1. Extract the core elements that must be included (3-5 items)
2. Extract the elements that must be excluded (3-5 items)
3. Identify ambiguous boundary cases (5-8 items). For each boundary case, give:
   - A specific description of the question
   - Your suggestion (include/exclude/uncertain)
   - The reason for the suggestion

[Output format]
Strict JSON:
{
  "mustInclude": ["element 1", "element 2", ...],
  "mustExclude": ["element 1", "element 2", ...],
  "ambiguities": [
    {
      "id": 1,
      "question": "What if the population is European/American but the RCT is high quality?",
      "aiSuggestion": "exclude",
      "reason": "The user explicitly requires an 'Asian population'; other regions do not qualify"
    },
    ...
  ]
}
`;
```

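Because the prompt above asks for strict JSON, the backend will likely need to parse and validate the model's reply before returning it to the frontend. The following is a minimal sketch of that step. `parseUnderstanding` is a hypothetical helper, and the tolerance for extra text around the JSON is an assumption about how LLM replies tend to look rather than something the plan specifies.

```typescript
type Suggestion = "include" | "exclude" | "uncertain";

interface Ambiguity {
  id: number;
  question: string;
  aiSuggestion: Suggestion;
  reason: string;
}

interface Understanding {
  mustInclude: string[];
  mustExclude: string[];
  ambiguities: Ambiguity[];
}

const SUGGESTIONS: readonly string[] = ["include", "exclude", "uncertain"];

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Parses an LLM reply into the structure requested by the analysis prompt.
// Throws if the reply does not contain a valid object of that shape.
function parseUnderstanding(reply: string): Understanding {
  // Assumption: the reply may wrap the JSON in prose or code fences,
  // so take the span between the first "{" and the last "}".
  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start === -1 || end <= start) throw new Error("No JSON object found in LLM reply");
  const data: unknown = JSON.parse(reply.slice(start, end + 1));
  if (typeof data !== "object" || data === null) throw new Error("LLM reply is not a JSON object");

  const { mustInclude, mustExclude, ambiguities } = data as Record<string, unknown>;
  if (!isStringArray(mustInclude) || !isStringArray(mustExclude)) {
    throw new Error("mustInclude and mustExclude must be string arrays");
  }
  if (!Array.isArray(ambiguities)) throw new Error("ambiguities must be an array");

  const parsed = ambiguities.map((raw: unknown, index: number): Ambiguity => {
    const a = (raw ?? {}) as Record<string, unknown>;
    if (
      typeof a.question !== "string" ||
      typeof a.reason !== "string" ||
      typeof a.aiSuggestion !== "string" ||
      !SUGGESTIONS.includes(a.aiSuggestion)
    ) {
      throw new Error(`Invalid ambiguity at index ${index}`);
    }
    return {
      id: typeof a.id === "number" ? a.id : index + 1, // fall back to position if id is missing
      question: a.question,
      aiSuggestion: a.aiSuggestion as Suggestion,
      reason: a.reason,
    };
  });
  return { mustInclude, mustExclude, ambiguities: parsed };
}
```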

---

### 2. Core Logic for Generating the Prompt

```typescript
function generateCustomPrompt(
  pico: PicoCriteria,
  inclusionCriteria: string,
  exclusionCriteria: string,
  userConfirmedRules: BoundaryRule[]
): string {

  // Base prompt (start from the standard template)
  let prompt = getStandardPromptTemplate();

  // Label for each user decision; 'uncertain' must not be rendered as an exclusion
  const decisionLabels = {
    include: '✅ May be included',
    exclude: '❌ Must be excluded',
    uncertain: '🤔 Uncertain'
  };

  // Inject the boundary rules confirmed by the user
  const boundaryRulesSection = `
## ⭐ Special boundary rules (based on your confirmations)

${userConfirmedRules.map((rule, index) => `
${index + 1}. ${rule.category}:
   - Standard rule: ${rule.standardRule}
   - Your confirmation: ${decisionLabels[rule.userDecision]}
   - Specific situation: ${rule.situation}
`).join('\n')}

⚠️ Follow the special rules above strictly; they are judgment criteria explicitly confirmed by the user.
`;

  // Insert the boundary rules at the appropriate place in the prompt.
  // A replacer function keeps any '$' in the rules from being read as a replacement pattern.
  prompt = prompt.replace(
    '## Screening task',
    () => boundaryRulesSection + '\n\n## Screening task'
  );

  return prompt;
}
```

---

### 3. Database Design

**New table: prompt_configurations**

```sql
CREATE TABLE asl_schema.prompt_configurations (
  id UUID PRIMARY KEY,
  user_id VARCHAR(50) NOT NULL,
  project_id UUID NOT NULL,

  -- User input
  pico_criteria JSONB NOT NULL,
  inclusion_criteria TEXT NOT NULL,
  exclusion_criteria TEXT NOT NULL,

  -- AI analysis result
  ai_understanding JSONB NOT NULL,  -- mustInclude, mustExclude, ambiguities

  -- User confirmation
  -- Nullable: the row is created by the analyze step (which returns configId),
  -- before any rules are confirmed
  user_confirmed_rules JSONB,       -- boundary rules after user confirmation

  -- Generated prompt (nullable for the same reason)
  generated_prompt TEXT,
  final_prompt TEXT,                -- final version after user edits

  -- Metadata
  version VARCHAR(20) DEFAULT 'v1.0',
  is_template BOOLEAN DEFAULT false,
  template_name VARCHAR(100),

  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```

**New table: few_shot_cases** (Phase 2.0)

```sql
CREATE TABLE asl_schema.few_shot_cases (
  id UUID PRIMARY KEY,
  user_id VARCHAR(50) NOT NULL,
  project_id UUID NOT NULL,

  -- Paper information
  literature_id UUID NOT NULL,
  literature_title TEXT NOT NULL,
  literature_abstract TEXT NOT NULL,

  -- AI judgment
  ai_decision VARCHAR(20) NOT NULL,  -- include/exclude
  ai_reason TEXT NOT NULL,

  -- User correction
  user_decision VARCHAR(20) NOT NULL,
  user_reason TEXT NOT NULL,

  -- PICOS context
  pico_criteria JSONB NOT NULL,

  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```

---

## API Design

### MVP Phase

#### 1. Analyze PICOS

```
POST /api/v1/asl/prompt/analyze

Request:
{
  "projectId": "uuid",
  "pico": {
    "population": "...",
    "intervention": "...",
    "comparison": "...",
    "outcome": "...",
    "studyDesign": "..."
  },
  "inclusionCriteria": "...",
  "exclusionCriteria": "..."
}

Response:
{
  "success": true,
  "data": {
    "configId": "uuid",  // ID of the saved configuration
    "understanding": {
      "mustInclude": ["element 1", "element 2"],
      "mustExclude": ["element 1", "element 2"],
      "ambiguities": [
        {
          "id": 1,
          "question": "...",
          "aiSuggestion": "exclude",
          "reason": "..."
        }
      ]
    }
  }
}
```

---

#### 2. Confirm Boundary Rules

```
POST /api/v1/asl/prompt/confirm-rules

Request:
{
  "configId": "uuid",
  "confirmedRules": [
    {
      "ambiguityId": 1,
      "userDecision": "include",  // include/exclude/uncertain
      "userNote": "Not an Asian population, but the RCT is high quality"  // optional
    }
  ]
}

Response:
{
  "success": true,
  "data": {
    "generatedPrompt": "Full prompt text..."
  }
}
```

---

#### 3. Save the Final Prompt

```
POST /api/v1/asl/prompt/save

Request:
{
  "configId": "uuid",
  "finalPrompt": "Prompt after user edits...",
  "saveAsTemplate": false,
  "templateName": ""  // set when saving as a template
}

Response:
{
  "success": true,
  "data": {
    "configId": "uuid",
    "promptVersion": "v1.0.1"
  }
}
```

---

#### 4. Screen with the Custom Prompt

```
POST /api/v1/asl/screen/literature

Request:
{
  "projectId": "uuid",
  "literatureId": "uuid",
  "configId": "uuid",  // which prompt configuration to use
  "models": ["deepseek-chat", "qwen-max"]
}

Response:
{
  "success": true,
  "data": {
    "literatureId": "uuid",
    "finalDecision": "pending",

    // ⭐ Key: detailed results from both models
    "model1": {
      "modelName": "DeepSeek-V3",
      "conclusion": "exclude",
      "confidence": 0.92,
      "judgment": {...},
      "evidence": {...},
      "reason": "Full reason for exclusion..."  // ⭐
    },
    "model2": {
      "modelName": "Qwen-Max",
      "conclusion": "include",
      "confidence": 0.85,
      "judgment": {...},
      "evidence": {...},
      "reason": "Full reason for inclusion..."  // ⭐
    },

    "hasConflict": true,
    "conflictFields": ["conclusion"]
  }
}
```

---

### Phase 2.0 (Optional)

#### 5. Submit a Few-shot Case

```
POST /api/v1/asl/prompt/add-few-shot

Request:
{
  "configId": "uuid",
  "literatureId": "uuid",
  "aiDecision": "exclude",
  "aiReason": "...",
  "userDecision": "include",
  "userReason": "The population is European/American, but..."
}

Response:
{
  "success": true,
  "data": {
    "caseId": "uuid",
    "totalCases": 3  // number of few-shot cases stored so far
  }
}
```

---

#### 6. Regenerate the Prompt from Few-shot Cases

```
POST /api/v1/asl/prompt/regenerate-with-few-shot

Request:
{
  "configId": "uuid"
}

Response:
{
  "success": true,
  "data": {
    "updatedPrompt": "New prompt including the few-shot examples...",
    "fewShotCasesUsed": 3
  }
}
```

---

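## Test Plan

### MVP Testing

**Test data**: stroke studies (5 papers already available)

**Test scenarios**:

1. **Scenario 1: Normal flow**
   - Enter PICOS → AI analysis → user confirmation → prompt generation → screening
   - Verify: the reasoning of both models is displayed in full

2. **Scenario 2: Boundary case confirmation**
   - The user confirms "European/American RCTs may be included" → verify the prompt contains this rule
   - Verify: the rule is actually followed during screening

3. **Scenario 3: User edits the prompt**
   - The user modifies the generated prompt → verify the change takes effect

4. **Scenario 4: Model conflict**
   - Verify: when the two models disagree, their reasoning is presented clearly

**Test metrics**:
- Prompt generation accuracy: >90%
- User satisfaction: >80%
- Completeness of reasoning display: 100%

---

### Phase 2.0 Testing

**Test scenarios**:

1. **Few-shot learning**
   - The user corrects 3 cases → verify the prompt contains these cases
   - Verify: subsequent screening improves

2. **Test mode**
   - The user labels 10 papers → AI analyzes the patterns → prompt is generated
   - Verify: the generated prompt matches the user's preferences

---
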
## Success Criteria

### MVP Phase

| Metric | Target |
|--------|--------|
| Prompt generation accuracy | >90% |
| Time for a user to complete configuration | <5 minutes |
| Completeness of reasoning display | 100% |
| Model conflict detection rate | 100% |
| User satisfaction | >80% |

### Phase 2.0

| Metric | Target |
|--------|--------|
| Accuracy improvement from few-shot | +15% |
| Test mode match rate | >85% |
| Prompt template reuse rate | >60% |

---

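## Risks and Mitigations

### Risk 1: The quality of LLM-generated boundary questions is unstable

**Mitigation**:
- Use a few-shot prompt
- Manually review common boundary cases
- Provide a default library of boundary questions

### Risk 2: Users are unwilling to spend time confirming

**Mitigation**:
- Show only the 5 highest-priority questions
- Use the AI's default suggestion for the rest
- Offer a "quick mode" (skip confirmation)
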
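The fallback described for Risk 2 (use the AI's suggestion wherever the user gave no answer) could be sketched as below. `resolveDecisions` and the `source` field are hypothetical and are not part of the API defined earlier; `source` is included only so that defaulted decisions stay distinguishable from explicit ones.

```typescript
type Decision = "include" | "exclude" | "uncertain";

interface AmbiguityItem {
  id: number;
  question: string;
  aiSuggestion: Decision;
  reason: string;
}

// Hypothetical result shape; `source` is an illustrative addition.
interface ResolvedRule {
  ambiguityId: number;
  userDecision: Decision;
  source: "user" | "ai-default";
}

// Uses the user's answer when present, otherwise the AI's suggestion.
// Passing an empty map corresponds to "quick mode" (confirmation skipped).
function resolveDecisions(
  ambiguities: AmbiguityItem[],
  userAnswers: Map<number, Decision>
): ResolvedRule[] {
  return ambiguities.map((a): ResolvedRule => {
    const answer = userAnswers.get(a.id);
    return answer !== undefined
      ? { ambiguityId: a.id, userDecision: answer, source: "user" }
      : { ambiguityId: a.id, userDecision: a.aiSuggestion, source: "ai-default" };
  });
}
```

Whether defaulted decisions should be injected into the prompt with the same weight as explicit confirmations is a design choice this sketch leaves open.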
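
### Risk 3: The two models' reasons are too long to compare easily

**Mitigation**:
- Extract the key sentences of each reason (first 100 characters)
- Provide expand/collapse buttons
- Highlight the points of conflict
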
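The "first 100 characters" mitigation could be implemented with a small truncation helper such as the one below. `summarizeReason` is a hypothetical name, and cutting at a fixed character count is the simplest reading of the bullet; extracting true key sentences would need more than this sketch does.

```typescript
// Returns a preview of at most `maxChars` characters plus a flag the UI
// can use to decide whether to show an expand/collapse button.
function summarizeReason(
  reason: string,
  maxChars = 100
): { summary: string; truncated: boolean } {
  // Array.from splits by code point, so CJK text and emoji are not cut in half.
  const chars = Array.from(reason.trim());
  if (chars.length <= maxChars) {
    return { summary: chars.join(""), truncated: false };
  }
  return { summary: chars.slice(0, maxChars).join("") + "…", truncated: true };
}
```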

---

## Summary

### MVP Core (Required)

1. ✅ PICOS input form
2. ✅ AI analysis and boundary question generation
3. ✅ User confirmation screen
4. ✅ Automatic prompt generation
5. ✅ Prompt editor
6. ✅ **Show the full reasoning of both models** ⭐

**Development time**: 2 weeks

---

### Phase 2.0 Extensions (Optional)

1. 🔮 Automatic few-shot learning
2. 🧪 Test mode
3. 📚 Prompt template library

**Development time**: 2 weeks

---

**Principle**: make the MVP simple and usable first; add intelligence in 2.0

**Next step**: start MVP development

---

**Document version**: v1.0
**Author**: AI Assistant
**Reviewer**: [pending user confirmation]
**Date**: 2025-11-18