AIclinicalresearch/backend/scripts/test-results/test-report-2025-11-18T07-48-51-245Z.md
HaHafeng 3634933ece refactor(asl): ASL frontend architecture refactoring with left navigation
- feat: Create ASLLayout component with 7-module left navigation
- feat: Implement Title Screening Settings page with optimized PICOS layout
- feat: Add placeholder pages for Workbench and Results
- fix: Fix nested routing structure for React Router v6
- fix: Resolve Spin component warning in MainLayout
- fix: Add QueryClientProvider to App.tsx
- style: Optimize PICOS form layout (P+I left, C+O+S right)
- style: Align Inclusion/Exclusion criteria side-by-side
- docs: Add architecture refactoring and routing fix reports

Ref: Week 2 Frontend Development
Scope: ASL module MVP - Title Abstract Screening
2025-11-18 21:51:51 +08:00


# LLM Screening Quality Test Report
**Test time**: 2025-11-18T07:48:51.247Z
**Models tested**: deepseek-chat + qwen-max
**Number of test samples**: 10
---
## Quality Metrics
| Metric | Actual | Target | Status |
|--------|--------|--------|--------|
| Accuracy | 0.0% | ≥85% | ❌ |
| Agreement rate | 0.0% | ≥80% | ❌ |
| Mean confidence | 0.00 | - | - |
| Manual review rate | 100.0% | ≤20% | ❌ |
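
The report does not state how these four metrics are computed. The following is a minimal Python sketch, assuming each one is a simple proportion (or mean) over all samples and that a sample whose decision is `error` counts as incorrect. `SampleResult` and `summarize` are hypothetical names chosen for illustration, not identifiers from the project. Fed the ten expected decisions listed in this report, with every actual decision set to `error`, it reproduces the values in the table.

```python
from dataclasses import dataclass


@dataclass
class SampleResult:
    expected: str           # "include", "exclude", or "uncertain"
    actual: str             # same labels, or "error" when screening failed
    models_agree: bool      # True when both models reached the same conclusion
    mean_confidence: float  # mean of the two model confidences (0.0 if missing)
    needs_review: bool      # True when the sample is flagged for manual review


def summarize(results):
    """Aggregate per-sample results into report-level metrics (assumed formulas)."""
    n = len(results)
    if n == 0:
        raise ValueError("no results to summarize")
    return {
        "accuracy_pct": 100.0 * sum(r.actual == r.expected for r in results) / n,
        "agreement_pct": 100.0 * sum(r.models_agree for r in results) / n,
        "mean_confidence": sum(r.mean_confidence for r in results) / n,
        "manual_review_pct": 100.0 * sum(r.needs_review for r in results) / n,
        # Not a metric in the report; added here to make failed runs visible.
        "error_count": sum(r.actual == "error" for r in results),
    }


# Expected decisions for test-001 .. test-010, as listed in this report.
expected = ["include", "include", "exclude", "exclude", "include",
            "exclude", "exclude", "exclude", "exclude", "uncertain"]
results = [SampleResult(e, "error", False, 0.0, True) for e in expected]
summary = summarize(results)
print(summary)
```

Under these assumptions an `error_count` equal to the sample count suggests the 0.0% accuracy reflects a failed run rather than poor screening decisions.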
---
## Confusion Matrix
```
                 Predicted include  Predicted exclude  Uncertain
Actual include   0                  0                  -
Actual exclude   0                  0                  -
Uncertain        -                  -                  0
```
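
An all-zero matrix is what one would expect if samples whose decision is `error` are left out of the counts. The sketch below illustrates that assumption in Python; `confusion_counts` is a hypothetical helper, and it counts all nine label pairs, whereas the report prints `-` for the cross cells involving the uncertain label.

```python
from collections import Counter

LABELS = ("include", "exclude", "uncertain")


def confusion_counts(pairs):
    """Count (expected, actual) pairs; pairs with a label outside LABELS are skipped."""
    counts = Counter()
    skipped = 0
    for expected_label, actual_label in pairs:
        if expected_label in LABELS and actual_label in LABELS:
            counts[(expected_label, actual_label)] += 1
        else:
            skipped += 1  # e.g. actual_label == "error"
    return counts, skipped


# Expected decisions for test-001 .. test-010; every actual decision was "error".
expected = ["include", "include", "exclude", "exclude", "include",
            "exclude", "exclude", "exclude", "exclude", "uncertain"]
counts, skipped = confusion_counts((e, "error") for e in expected)
total_counted = sum(counts.values())
print(total_counted, skipped)
```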
---
## Detailed Results
### 1. test-001
**Title**: Efficacy and Safety of Empagliflozin in Patients with Type 2 Diabetes: A Randomized, Double-Blind, Placebo-Controlled Trial
**Expected decision**: include
**Actual decision**: error
**Result**: ❌ Incorrect
**Consistency**: ❌ Conflict
**Mean confidence**: 0.00
**Processing time**: 8868ms
**Needs manual review**: Yes
**DeepSeek conclusion**: undefined (confidence: undefined)
**Qwen conclusion**: undefined (confidence: undefined)
### 2. test-002
**Title**: Cardiovascular Outcomes with Ertugliflozin in Type 2 Diabetes
**Expected decision**: include
**Actual decision**: error
**Result**: ❌ Incorrect
**Consistency**: ❌ Conflict
**Mean confidence**: 0.00
**Processing time**: 7365ms
**Needs manual review**: Yes
**DeepSeek conclusion**: undefined (confidence: undefined)
**Qwen conclusion**: undefined (confidence: undefined)
### 3. test-003
**Title**: Systematic Review and Meta-Analysis of SGLT2 Inhibitors in Type 2 Diabetes: A Comprehensive Assessment
**Expected decision**: exclude
**Actual decision**: error
**Result**: ❌ Incorrect
**Consistency**: ❌ Conflict
**Mean confidence**: 0.00
**Processing time**: 8163ms
**Needs manual review**: Yes
**DeepSeek conclusion**: undefined (confidence: undefined)
**Qwen conclusion**: undefined (confidence: undefined)
### 4. test-004
**Title**: Dapagliflozin Improves Cardiac Function in Diabetic Rats: An Experimental Study
**Expected decision**: exclude
**Actual decision**: error
**Result**: ❌ Incorrect
**Consistency**: ❌ Conflict
**Mean confidence**: 0.00
**Processing time**: 12106ms
**Needs manual review**: Yes
**DeepSeek conclusion**: undefined (confidence: undefined)
**Qwen conclusion**: undefined (confidence: undefined)
### 5. test-005
**Title**: Canagliflozin and Renal Outcomes in Type 2 Diabetes and Nephropathy
**Expected decision**: include
**Actual decision**: error
**Result**: ❌ Incorrect
**Consistency**: ❌ Conflict
**Mean confidence**: 0.00
**Processing time**: 4700ms
**Needs manual review**: Yes
**DeepSeek conclusion**: undefined (confidence: undefined)
**Qwen conclusion**: undefined (confidence: undefined)
### 6. test-006
**Title**: Real-World Experience with SGLT2 Inhibitors: A Retrospective Cohort Study
**Expected decision**: exclude
**Actual decision**: error
**Result**: ❌ Incorrect
**Consistency**: ❌ Conflict
**Mean confidence**: 0.00
**Processing time**: 7922ms
**Needs manual review**: Yes
**DeepSeek conclusion**: undefined (confidence: undefined)
**Qwen conclusion**: undefined (confidence: undefined)
### 7. test-007
**Title**: Pharmacokinetics and Pharmacodynamics of Empagliflozin in Healthy Volunteers
**Expected decision**: exclude
**Actual decision**: error
**Result**: ❌ Incorrect
**Consistency**: ❌ Conflict
**Mean confidence**: 0.00
**Processing time**: 7877ms
**Needs manual review**: Yes
**DeepSeek conclusion**: undefined (confidence: undefined)
**Qwen conclusion**: undefined (confidence: undefined)
### 8. test-008
**Title**: Comparative Effectiveness of SGLT2 Inhibitors versus DPP-4 Inhibitors in Elderly Patients with Type 2 Diabetes
**Expected decision**: exclude
**Actual decision**: error
**Result**: ❌ Incorrect
**Consistency**: ❌ Conflict
**Mean confidence**: 0.00
**Processing time**: 11004ms
**Needs manual review**: Yes
**DeepSeek conclusion**: undefined (confidence: undefined)
**Qwen conclusion**: undefined (confidence: undefined)
### 9. test-009
**Title**: Severe Diabetic Ketoacidosis Associated with SGLT2 Inhibitor Use: A Case Report
**Expected decision**: exclude
**Actual decision**: error
**Result**: ❌ Incorrect
**Consistency**: ❌ Conflict
**Mean confidence**: 0.00
**Processing time**: 11130ms
**Needs manual review**: Yes
**DeepSeek conclusion**: undefined (confidence: undefined)
**Qwen conclusion**: undefined (confidence: undefined)
### 10. test-010
**Title**: Effect of Sotagliflozin on Cardiovascular and Renal Events in Patients with Type 2 Diabetes and Moderate Renal Impairment
**Expected decision**: uncertain
**Actual decision**: error
**Result**: ❌ Incorrect
**Consistency**: ❌ Conflict
**Mean confidence**: 0.00
**Processing time**: 7387ms
**Needs manual review**: Yes
**DeepSeek conclusion**: undefined (confidence: undefined)
**Qwen conclusion**: undefined (confidence: undefined)
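
Every sample above shows the same pattern: no conclusion from either model, a conflict, a mean confidence of 0.00, and a manual review flag. One way such a row could arise is a combining step that falls back to safe defaults when a model output is missing. The Python sketch below illustrates that idea only; `ModelOutput`, `combine`, and the fallback rules are assumptions for illustration, not the project's actual logic.

```python
from typing import NamedTuple, Optional


class ModelOutput(NamedTuple):
    conclusion: str    # "include", "exclude", or "uncertain"
    confidence: float  # value in [0.0, 1.0]


def combine(a: Optional[ModelOutput], b: Optional[ModelOutput]) -> dict:
    """Merge two model outputs into one per-sample row (hypothetical rules)."""
    if a is None or b is None:
        # A missing output is treated as a failed sample, never as a decision.
        return {"decision": "error", "consistent": False,
                "mean_confidence": 0.0, "needs_review": True}
    consistent = a.conclusion == b.conclusion
    return {
        # Assumption: disagreement is recorded as "uncertain".
        "decision": a.conclusion if consistent else "uncertain",
        "consistent": consistent,
        "mean_confidence": (a.confidence + b.confidence) / 2,
        # Assumption: any disagreement triggers manual review.
        "needs_review": not consistent,
    }


failed = combine(None, None)
agreed = combine(ModelOutput("include", 1.0), ModelOutput("include", 0.5))
print(failed)
print(agreed)
```

Keeping `error` distinct from the three screening labels lets a report separate pipeline failures from genuine disagreements between the models.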
---
**Generated at**: 2025-11-18T07:48:51.247Z