Features - User Management (Phase 4.1):
- Database: Add user_modules table for fine-grained module permissions
- Database: Add 4 user permissions (view/create/edit/delete) to role_permissions
- Backend: UserService (780 lines) - CRUD with tenant isolation
- Backend: UserController + UserRoutes (648 lines) - 13 API endpoints
- Backend: Batch import users from Excel
- Frontend: UserListPage (412 lines) - list/filter/search/pagination
- Frontend: UserFormPage (341 lines) - create/edit with module config
- Frontend: UserDetailPage (393 lines) - details/tenant/module management
- Frontend: 3 modal components (592 lines) - import/assign/configure
- API: GET/POST/PUT/DELETE /api/admin/users/* endpoints

Architecture Upgrade - Module Permission System:
- Backend: Add getUserModules() method in auth.service
- Backend: Login API returns modules array in user object
- Frontend: AuthContext adds hasModule() method
- Frontend: Navigation filters modules based on user.modules
- Frontend: RouteGuard checks requiredModule instead of requiredVersion
- Frontend: Remove deprecated version-based permission system
- UX: Only show accessible modules in navigation (clean UI)
- UX: Smart redirect after login (avoid 403 for regular users)

Fixes:
- Fix UTF-8 encoding corruption in ~100 docs files
- Fix pageSize type conversion in userService (String to Number)
- Fix authUser undefined error in TopNavigation
- Fix login redirect logic with role-based access check
- Update Git commit guidelines v1.2 with UTF-8 safety rules

Database Changes:
- CREATE TABLE user_modules (user_id, tenant_id, module_code, is_enabled)
- ADD UNIQUE CONSTRAINT (user_id, tenant_id, module_code)
- INSERT 4 permissions + role assignments
- UPDATE PUBLIC tenant with 8 module subscriptions

Technical:
- Backend: 5 new files (~2400 lines)
- Frontend: 10 new files (~2500 lines)
- Docs: 1 development record + 2 status updates + 1 guideline update
- Total: ~4900 lines of code

Status: User management 100% complete, module permission system operational

# ASL Test Dataset

> **Created:** 2025-11-15
> **Maintainer:** ASL development team
> **Purpose:** Testing the accuracy and quality of the AI intelligent literature module

---

## 📋 Dataset Overview

This directory contains the test datasets used to test each feature of the ASL module:

| Test type | Folder | Size | Status |
|---------|--------|--------|------|
| **Title/abstract screening** | `screening/` | 199 articles | ✅ Awaiting import |
| **PDF full-text extraction** | `pdf-extraction/` | To be added | ⏳ To be added |

---

## 📁 Folder Structure

```
03-测试数据/
├── README.md                          ← this file
│
├── screening/                         ← title/abstract screening test data
│   ├── literature-list-199.xlsx       ← list of 199 articles (title + abstract)
│   ├── picos-criteria.txt             ← PICOS criteria definition
│   ├── inclusion-criteria.txt         ← inclusion criteria
│   ├── exclusion-criteria.txt         ← exclusion criteria
│   └── gold-standard.json             ← manually annotated correct results (gold standard)
│
└── pdf-extraction/                    ← PDF full-text extraction test data
    ├── sample-01-rct.pdf              ← RCT study sample
    ├── sample-02-cohort.pdf           ← cohort study sample
    ├── sample-03-with-tables.pdf      ← sample with complex tables
    ├── sample-04-chinese.pdf          ← Chinese-language article sample
    └── README.md
```
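
Before running any test it can help to confirm that the screening files from the tree above are actually in place. The following is a minimal Python sketch of such a check, not part of the project's backend: the function name `missing_files` and the choice to require only the top-level README and the `screening/` files are assumptions made for illustration.

```python
from pathlib import Path

# Relative paths copied from the folder tree above. The pdf-extraction samples
# are left out because that dataset is still marked as "to be added".
EXPECTED_FILES = [
    "README.md",
    "screening/literature-list-199.xlsx",
    "screening/picos-criteria.txt",
    "screening/inclusion-criteria.txt",
    "screening/exclusion-criteria.txt",
    "screening/gold-standard.json",
]


def missing_files(root, expected=EXPECTED_FILES):
    """Return the expected relative paths that are not regular files under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    # Hypothetical usage: run from the directory that contains this README.
    for rel in missing_files("."):
        print(f"missing: {rel}")
```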

---

## 🎯 Usage

### 1. Import the test data

**Follow the steps below to import your test data:**

#### (1) Title/abstract screening test data

**File list:**
- `literature-list-199.xlsx`: list of 199 English-language articles
- `picos-criteria.txt`: PICOS criteria (Population, Intervention, Comparison, Outcome, Study Design)
- `gold-standard.json`: manually annotated correct results

**Excel file format requirements:**
```
Column names (required unless marked optional):
- Title
- Abstract
- DOI (optional)
- Authors (optional)
- Year (optional)
- Journal (optional)

Example:
| Title                          | Abstract                  | DOI           | Authors      | Year | Journal |
|--------------------------------|---------------------------|---------------|--------------|------|---------|
| Effect of aspirin on ...       | Background: ...           | 10.1038/...   | Smith J, ... | 2020 | NEJM    |
```
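
The column rules above can be checked mechanically once the header row has been read. This is a minimal Python sketch under stated assumptions: `check_header` is a hypothetical helper, the header is passed in as a plain list of cell values, and reading the `.xlsx` file itself (which needs a spreadsheet library) is deliberately left out.

```python
# Column names taken from the format requirements above.
REQUIRED_COLUMNS = ("Title", "Abstract")
OPTIONAL_COLUMNS = ("DOI", "Authors", "Year", "Journal")


def check_header(header):
    """Check a header row given as a list of cell values.

    Returns (missing, unknown): required columns that are absent, and
    non-empty column names that are neither required nor optional.
    """
    # Ignore empty cells and surrounding whitespace.
    names = [str(cell).strip() for cell in header
             if cell is not None and str(cell).strip()]
    missing = [col for col in REQUIRED_COLUMNS if col not in names]
    known = set(REQUIRED_COLUMNS) | set(OPTIONAL_COLUMNS)
    unknown = [name for name in names if name not in known]
    return missing, unknown


if __name__ == "__main__":
    print(check_header(["Title", "Abstract", "DOI", "Authors", "Year", "Journal"]))
    print(check_header(["Title", "Summary", None]))
```

The comparison is case-sensitive on purpose, so that a header such as `title` is reported rather than silently accepted; whether the real importer is this strict is not stated in this document.
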

**PICOS criteria format:**
```txt
# PICOS criteria

## Population
- Adult patients with hypertension (age ≥ 18 years)
- No history of cardiovascular disease

## Intervention
- Aspirin 100 mg daily

## Comparison
- Placebo or no treatment

## Outcome
- Primary outcome: incidence of cardiovascular events
- Secondary outcome: all-cause mortality

## Study Design
- Randomized controlled trial (RCT)
- Cohort study
```
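
Because the criteria file uses `## ` headings and `- ` bullets, it can be turned into a dictionary with a few lines of code. The sketch below is one possible reading of that layout, not the project's parser: `parse_picos` and `missing_sections` are hypothetical names, and dropping a parenthetical gloss after a heading is an assumption made so that headings with an appended translation still match.

```python
import re

# The five sections named in the PICOS format above.
PICOS_KEYS = ("Population", "Intervention", "Comparison", "Outcome", "Study Design")


def parse_picos(text):
    """Parse '## Section' headings and '- item' bullets into {section: [items]}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            # Keep only the part before an optional ASCII or full-width parenthesis.
            current = re.split(r"[((]", line[3:], maxsplit=1)[0].strip()
            sections[current] = []
        elif line.startswith("- ") and current is not None:
            sections[current].append(line[2:].strip())
    return sections


def missing_sections(sections):
    """Return the PICOS sections that are absent or have no bullet items."""
    return [key for key in PICOS_KEYS if not sections.get(key)]


if __name__ == "__main__":
    sample = "## Population\n- Adult patients with hypertension\n"
    parsed = parse_picos(sample)
    print(parsed)
    print(missing_sections(parsed))
```
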

**Gold standard format (JSON):**
```json
{
  "metadata": {
    "total": 199,
    "annotatedBy": "Name of the medical expert",
    "annotatedDate": "2025-11-15",
    "expectedAccuracy": 0.90
  },
  "results": [
    {
      "id": 1,
      "doi": "10.1038/nature12373",
      "title": "...",
      "decision": "include",
      "reason": "Meets the PICO criteria: population is adult patients with hypertension, intervention is aspirin...",
      "confidence": 1.0
    },
    {
      "id": 2,
      "decision": "exclude",
      "reason": "Does not meet the inclusion criteria: population is pediatric patients",
      "confidence": 0.95
    }
  ]
}
```
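
A structural check on the gold standard file can catch annotation slips before they distort the accuracy numbers. The sketch below is illustrative only. The rules it enforces are assumptions rather than a documented schema: `metadata.total` should equal the number of results, ids should be unique, `confidence` should lie in [0, 1], and `decision` should be one of `include`, `exclude`, or `uncertain` (the third value is inferred from the expected distribution listed elsewhere in this README).

```python
import json

# Assumed set of decision labels; "uncertain" is inferred, not documented.
ALLOWED_DECISIONS = {"include", "exclude", "uncertain"}


def validate_gold_standard(doc, check_total=True):
    """Return a list of problems found in a parsed gold standard document.

    An empty list means none of the assumed rules was violated.
    """
    problems = []
    results = doc.get("results", [])
    total = doc.get("metadata", {}).get("total")
    if check_total and total != len(results):
        problems.append(f"metadata.total is {total} but results has {len(results)} entries")
    seen = set()
    for entry in results:
        rid = entry.get("id")
        if rid in seen:
            problems.append(f"duplicate id {rid}")
        seen.add(rid)
        decision = entry.get("decision")
        if decision not in ALLOWED_DECISIONS:
            problems.append(f"id {rid}: unexpected decision {decision!r}")
        conf = entry.get("confidence")
        if not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
            problems.append(f"id {rid}: confidence {conf!r} is not a number in [0, 1]")
    return problems


if __name__ == "__main__":
    sample = json.loads('{"metadata": {"total": 1}, "results": '
                        '[{"id": 1, "decision": "include", "confidence": 1.0}]}')
    print(validate_gold_standard(sample))
```

Pass `check_total=False` when validating an abbreviated excerpt such as the two-entry example above, where `total` still refers to the full 199-article set.
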
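
#### (2) PDF full-text extraction test data

**Suggested sample types:**
- RCT studies (randomized controlled trials)
- Cohort studies
- Articles with complex tables
- Articles with mathematical formulas
- Chinese-language medical articles (to test language detection)

**Suggested number of samples:** 5-10
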
### 2. Run the tests

#### (1) Title/abstract screening test

```bash
# Enter the backend directory
cd AIclinicalresearch/backend

# Run the screening test
npm run test:asl:screening

# Or test manually:
# 1. Start the backend service
npm run dev

# 2. Upload literature-list-199.xlsx through the frontend
# 3. Configure the PICOS criteria (copy the contents of picos-criteria.txt)
# 4. Run batch screening
# 5. Export the results and compare them with gold-standard.json
```

#### (2) Evaluate accuracy

```bash
# Automatically evaluate accuracy (compared against the gold standard)
npm run test:asl:evaluate -- \
  --result ./screening-result.json \
  --gold-standard ./gold-standard.json

# Example output:
# ✅ Accuracy: 92.5%
# ✅ Agreement rate: 88.9%
# ⚠️ False positive rate: 5.2%
# ⚠️ False negative rate: 2.3%
```
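
How the evaluation script computes these figures is not documented here, so the following Python sketch only illustrates one plausible definition. It assumes that decisions are keyed by article id, that `include` is the positive class, and that the false positive and false negative rates are fractions of all compared articles; that last assumption is suggested by the example output, where 92.5% + 5.2% + 2.3% adds up to 100%. The function names are hypothetical.

```python
def screening_metrics(gold, predicted):
    """Compare predicted decisions with gold decisions, both given as {id: decision}.

    Only ids present in both mappings are compared. All rates are fractions
    of the number of compared articles (an assumption, see the lead-in).
    """
    ids = sorted(set(gold) & set(predicted))
    if not ids:
        raise ValueError("no shared ids to compare")
    n = len(ids)
    correct = sum(gold[i] == predicted[i] for i in ids)
    # False positive: the model includes an article the gold standard excludes.
    fp = sum(predicted[i] == "include" and gold[i] == "exclude" for i in ids)
    # False negative: the model excludes an article the gold standard includes.
    fn = sum(predicted[i] == "exclude" and gold[i] == "include" for i in ids)
    return {
        "accuracy": correct / n,
        "false_positive_rate": fp / n,
        "false_negative_rate": fn / n,
    }


def agreement_rate(model_a, model_b):
    """Fraction of shared ids on which two models give the same decision."""
    ids = set(model_a) & set(model_b)
    if not ids:
        raise ValueError("no shared ids to compare")
    return sum(model_a[i] == model_b[i] for i in ids) / len(ids)


if __name__ == "__main__":
    gold = {1: "include", 2: "exclude", 3: "exclude", 4: "include"}
    pred = {1: "include", 2: "include", 3: "exclude", 4: "exclude"}
    print(screening_metrics(gold, pred))
```

Under this definition, `uncertain` decisions count toward accuracy only when both sides say `uncertain`, and they never count as false positives or false negatives.
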

### 3. Quality metrics

| Metric | MVP target | V1.0 target | V2.0 target |
|------|---------|----------|----------|
| **Accuracy** | ≥ 85% | ≥ 90% | ≥ 95% |
| **Agreement rate** (dual-model) | ≥ 80% | ≥ 85% | ≥ 90% |
| **False positive rate** | ≤ 10% | ≤ 5% | ≤ 3% |
| **False negative rate** | ≤ 5% | ≤ 3% | ≤ 2% |

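
The targets in the table translate directly into a pass/fail check. The Python sketch below encodes the table as fractions and reports which metrics miss the target for a given stage; `TARGETS`, `failed_targets`, and the metric key names are hypothetical and chosen only for this illustration.

```python
# Thresholds copied from the quality metrics table, expressed as fractions.
TARGETS = {
    "MVP": {"accuracy": 0.85, "agreement_rate": 0.80,
            "false_positive_rate": 0.10, "false_negative_rate": 0.05},
    "V1.0": {"accuracy": 0.90, "agreement_rate": 0.85,
             "false_positive_rate": 0.05, "false_negative_rate": 0.03},
    "V2.0": {"accuracy": 0.95, "agreement_rate": 0.90,
             "false_positive_rate": 0.03, "false_negative_rate": 0.02},
}

# For these metrics the target is an upper bound; for the others, a lower bound.
LOWER_IS_BETTER = {"false_positive_rate", "false_negative_rate"}


def failed_targets(metrics, stage):
    """Return the names of the metrics that miss the target for the given stage."""
    failures = []
    for name, target in TARGETS[stage].items():
        value = metrics[name]
        ok = value <= target if name in LOWER_IS_BETTER else value >= target
        if not ok:
            failures.append(name)
    return failures


if __name__ == "__main__":
    # Figures from the example output of the evaluation command.
    example = {"accuracy": 0.925, "agreement_rate": 0.889,
               "false_positive_rate": 0.052, "false_negative_rate": 0.023}
    for stage in TARGETS:
        print(stage, failed_targets(example, stage))
```

With the figures from the example evaluation output, this check passes every MVP target but flags the false positive rate for V1.0, since 5.2% is above the 5% limit.
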

---

## 📊 Test Data Statistics

### Title/abstract screening dataset

**Basic information:**
- **Total**: 199 articles
- **Data source**: [please fill in the data source]
- **Field**: medicine / clinical research
- **Language**: English
- **Year range**: [please fill in]

**Expected distribution:**
```
Include:   ~45 articles (23%)
Exclude:   ~132 articles (66%)
Uncertain: ~22 articles (11%)
```

**Study type distribution (estimated):**
```
RCT:             ~60 articles (30%)
Cohort:          ~50 articles (25%)
Case-control:    ~30 articles (15%)
Cross-sectional: ~30 articles (15%)
Other:           ~29 articles (15%)
```
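
The percentages above are the article counts divided by the 199-article total and rounded to whole numbers. A short Python sketch of that arithmetic, with a hypothetical helper name `shares`, makes it easy to recompute the figures once the real counts are known:

```python
def shares(counts):
    """Return each count as a whole-number percentage of the overall total."""
    total = sum(counts.values())
    return {name: round(100 * value / total) for name, value in counts.items()}


# Estimated counts copied from the two distributions above.
decisions = {"include": 45, "exclude": 132, "uncertain": 22}
study_types = {"rct": 60, "cohort": 50, "case_control": 30,
               "cross_sectional": 30, "other": 29}

if __name__ == "__main__":
    print(sum(decisions.values()), shares(decisions))
    print(sum(study_types.values()), shares(study_types))
```

Both sets of estimated counts sum to 199, so each distribution accounts for every article in the dataset.
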

### PDF full-text extraction dataset

**To be added**

---

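## ⚠️ Data Usage Notes

### 1. Copyright

- This test dataset is for ASL module development and testing only
- It must not be used commercially
- It must not be publicly distributed or shared
- Comply with the copyright licenses of the original articles

### 2. Data privacy

- Make sure the test data contains no sensitive information
- Any patient data it contains must already be de-identified
- Comply with data protection regulations such as GDPR and HIPAA

### 3. Quality requirements

- **The gold standard must be annotated by medical experts**
- Annotators need expertise in the relevant field
- The annotation process needs a quality control mechanism
- Two annotators should label independently, with conflicts resolved by third-party arbitration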
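
For the dual-annotation step, the articles that need arbitration are those on which the two annotators disagree. The Python sketch below shows one way to list them; it assumes each annotator's labels are available as an `{id: decision}` mapping, and `annotation_conflicts` is a hypothetical helper rather than an existing project function.

```python
def annotation_conflicts(annotator_a, annotator_b):
    """Return the sorted ids that should go to third-party arbitration.

    An id is a conflict when the two annotators chose different decisions,
    or when only one of them labeled the article at all.
    """
    ids = set(annotator_a) | set(annotator_b)
    return sorted(i for i in ids if annotator_a.get(i) != annotator_b.get(i))


if __name__ == "__main__":
    a = {1: "include", 2: "exclude", 3: "uncertain"}
    b = {1: "include", 2: "include"}
    print(annotation_conflicts(a, b))
```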

---

## 🔄 Data Update Log

| Date | Change | Updated by |
|------|---------|--------|
| 2025-11-15 | Created the test data directory structure | ASL team |
| Pending | Import the 199-article test data | - |
| Pending | Add PDF sample data | - |

---

## 📞 Contact

For questions, please contact:
- **Project lead**: [name]
- **Email**: [email]
- **Documentation maintenance**: [document path]

---

## 📚 Related Documents

- [Title/abstract screening test cases](../02-标题摘要初筛测试用例.md)
- [Test plan](../01-测试计划.md)
- [Literature processing technology selection](../../02-技术设计/07-文献处理技术选型.md)
- [Quality assurance and traceability strategy](../../02-技术设计/06-质量保障与可追溯策略.md)

---

**Next steps:**
1. ✅ Create the test data directory structure
2. ⏳ Import your 199-article test data (`literature-list-199.xlsx`)
3. ⏳ Create the PICOS criteria file (`picos-criteria.txt`)
4. ⏳ Prepare the gold standard annotations (`gold-standard.json`)
5. ⏳ Add PDF sample data