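API Design Document - Tool B (Medical Record Structuring Bot)

Module: DC Data Cleaning - Tool B
Version: V2.0 (MVP)
Base URL: /api/v1/dc/tool-b
Last updated: 2025-12-03
Status: ✅ MVP complete (all 8 API endpoints available and verified)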

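1. API Overview

1.1 Endpoint List

#	Method	Path	Description	Backend	Frontend	Tests
0	POST	`/upload`	Upload file	✅ Done	✅ Integrated	✅ Passed
1	POST	`/health-check`	Health check	✅ Done	✅ Integrated	✅ Passed
2	GET	`/templates`	List templates	✅ Done	✅ Integrated	✅ Passed
3	POST	`/tasks`	Create extraction task	✅ Done	✅ Integrated	✅ Passed
4	GET	`/tasks/:taskId/progress`	Query task progress	✅ Done	✅ Integrated	✅ Passed
5	GET	`/tasks/:taskId/items`	Get verification grid data	✅ Done	✅ Integrated	✅ Passed
6	POST	`/items/:itemId/resolve`	Resolve conflict	✅ Done	✅ Integrated	✅ Passed
7	GET	`/tasks/:taskId/export`	Export results to Excel	✅ Done	✅ Integrated	✅ Passed

✅ MVP completion status (2025-12-03):

Backend code: ~2,200 lines (Service, Controller, and Routes)
Frontend code: ~1,400 lines (complete 5-step workflow)
Database: 4 tables created, 3 preset templates ready
API integration: all 8 endpoints integrated and tested
LLM calls: dual-model verification with DeepSeek-V3 + Qwen-Max succeeded
Real-data test: 9 pathology records extracted successfully, ~10k tokens consumed
Known issues: 4 technical-debt items (see 07-技术债务/Tool-B技术债务清单.md)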

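1.2 General Conventions

Request headers:

Content-Type: application/json
Authorization: Bearer {token}  # planned, not yet implemented

Response format:

{
  "data": {...},      // returned on success
  "error": "...",     // returned on failure
  "code": 200
}

HTTP status codes:

200: Success
400: Invalid request parameters
401: Unauthenticated
403: Forbidden
404: Resource not found
500: Internal server error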
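
The response envelope can be modeled on the client as a generic type with a small unwrapping helper. This is a minimal sketch only: `ApiEnvelope` and `unwrap` are hypothetical names, and treating a response as failed whenever `error` is set or `data` is missing is an assumption, since the document does not state how clients should tell the two cases apart.

```typescript
// Hypothetical client-side model of the response envelope.
interface ApiEnvelope<T> {
  data?: T;       // present on success
  error?: string; // present on failure
  code: number;   // status code carried in the body
}

// Returns the payload, or throws when the envelope carries an error.
// Assumption: a response has failed when `error` is set or `data` is missing.
function unwrap<T>(envelope: ApiEnvelope<T>): T {
  if (envelope.error !== undefined || envelope.data === undefined) {
    const reason = envelope.error ?? 'missing data';
    throw new Error(`API error ${envelope.code}: ${reason}`);
  }
  return envelope.data;
}
```

Callers can then work with the typed payload directly, for example `unwrap<{ taskId: string }>(body).taskId`.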

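2. Authentication and Authorization

2.1 Authentication Mechanism

Current stage (MVP):

❌ Authentication is not implemented yet
A temporary userId (taken from the request context) identifies the caller

Planned (V1.0):

Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...

2.2 Permission Model

Operation	Required permission	Notes
Health check	user	All users
View templates	user	All users
Create task	user	All users
Query task	owner	Task creator only
Resolve conflict	owner	Task creator only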
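
The permission table can be expressed as a lookup plus an ownership check. The sketch below is illustrative: `Operation`, `REQUIRED`, and `isAllowed` are hypothetical names, and comparing the caller's userId with the userId stored on the task is an assumption about how "owner" would be enforced.

```typescript
type Operation = 'healthCheck' | 'viewTemplates' | 'createTask' | 'queryTask' | 'resolveConflict';
type Requirement = 'user' | 'owner';

// Mirrors the permission table: which level each operation requires.
const REQUIRED: Record<Operation, Requirement> = {
  healthCheck: 'user',
  viewTemplates: 'user',
  createTask: 'user',
  queryTask: 'owner',
  resolveConflict: 'owner',
};

// `taskOwnerId` is the userId stored on the task; it is only consulted
// for owner-level operations.
function isAllowed(op: Operation, userId: string, taskOwnerId?: string): boolean {
  if (REQUIRED[op] === 'user') {
    return true; // any user may perform this operation
  }
  return taskOwnerId !== undefined && taskOwnerId === userId;
}
```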

3. API Endpoint Details

3.1 Health Check

Endpoint: POST /api/v1/dc/tool-b/health-check

Purpose: Checks the data quality of an Excel column and blocks low-quality data

Request body:

{
  "fileKey": "uploads/user123/data.xlsx",
  "columnName": "病历文本"
}

Request parameters:

Field	Type	Required	Description
`fileKey`	string	✅	File path in Storage
`columnName`	string	✅	Name of the column to check

Response (success - 200):

{
  "status": "good",
  "emptyRate": 0.12,
  "avgLength": 256.8,
  "totalRows": 500,
  "estimatedTokens": 150000,
  "message": "健康度良好，预计消耗约 150.0k Token（双模型约 300.0k Token）"
}

Response (check failed - HTTP 200 with status=bad):

{
  "status": "bad",
  "emptyRate": 0.85,
  "avgLength": 256.8,
  "totalRows": 500,
  "estimatedTokens": 0,
  "message": "空值率过高（85.0%），该列不适合提取"
}

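Response fields:

Field	Type	Description
`status`	string	`good` or `bad`
`emptyRate`	number	Empty-value rate (0-1)
`avgLength`	number	Average text length
`totalRows`	number	Total number of rows
`estimatedTokens`	number	Estimated token count
`message`	string	Human-readable message

Business rules:

Empty rate > 80% → status = 'bad'
Average length < 10 → status = 'bad'
Only the first 100 rows are checked (performance optimization)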
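
The three rules above can be sketched as a pure function over the cells of one column. This is not the service's actual implementation: `checkColumnHealth` and `ColumnHealth` are hypothetical names, whitespace-only cells are assumed to count as empty, and the average length is assumed to be taken over the non-empty cells of the sample. Token estimation is left out because the document does not give its formula.

```typescript
interface ColumnHealth {
  status: 'good' | 'bad';
  emptyRate: number; // fraction of empty cells in the sample, 0-1
  avgLength: number; // mean length of the non-empty cells in the sample
  totalRows: number;
}

function checkColumnHealth(
  cells: Array<string | null | undefined>,
  sampleSize = 100,
): ColumnHealth {
  // Only the first `sampleSize` rows are inspected.
  const sample = cells.slice(0, sampleSize);
  const nonEmpty = sample
    .map((cell) => (cell ?? '').trim())
    .filter((text) => text.length > 0);

  const emptyRate =
    sample.length === 0 ? 1 : (sample.length - nonEmpty.length) / sample.length;
  const totalLength = nonEmpty.reduce((sum, text) => sum + text.length, 0);
  const avgLength = nonEmpty.length === 0 ? 0 : totalLength / nonEmpty.length;

  // Either rule is enough to reject the column.
  const status = emptyRate > 0.8 || avgLength < 10 ? 'bad' : 'good';
  return { status, emptyRate, avgLength, totalRows: cells.length };
}
```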

Error response:

{
  "error": "列'病历文本'不存在",
  "code": 400
}

3.2 List Templates

Endpoint: GET /api/v1/dc/tool-b/templates

Purpose: Returns all preset extraction templates

Request: no parameters

Response (200):

{
  "templates": [
    {
      "diseaseType": "lung_cancer",
      "reportType": "pathology",
      "displayName": "肺癌病理报告",
      "fields": [
        {
          "name": "病理类型",
          "desc": "如：浸润性腺癌、鳞状细胞癌",
          "width": "w-40"
        },
        {
          "name": "分化程度",
          "desc": "高/中/低分化",
          "width": "w-32"
        }
      ]
    },
    {
      "diseaseType": "diabetes",
      "reportType": "admission",
      "displayName": "糖尿病入院记录",
      "fields": [...]
    }
  ]
}

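Response fields:

Field	Type	Description
`templates`	array	Template list
`templates[].diseaseType`	string	Disease type
`templates[].reportType`	string	Report type
`templates[].displayName`	string	Display name
`templates[].fields`	array	Extraction field configuration

Caching strategy:

Client-side cache: 1 hour
Server-side cache: permanent (until restart)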
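
The one-hour client-side cache can be approximated with a small time-based holder. This is a sketch under stated assumptions: `TtlCache` and `ONE_HOUR_MS` are hypothetical names, and the clock is injectable only so that expiry can be exercised without waiting.

```typescript
// Minimal single-value cache with a time-to-live.
class TtlCache<T> {
  private value: T | undefined = undefined;
  private expiresAt = 0;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or undefined once the TTL has elapsed.
  get(): T | undefined {
    return this.now() < this.expiresAt ? this.value : undefined;
  }

  set(value: T): void {
    this.value = value;
    this.expiresAt = this.now() + this.ttlMs;
  }
}

const ONE_HOUR_MS = 60 * 60 * 1000;
```

A client would call `get()` before requesting `/templates` and call `set()` with the response whenever `get()` returns `undefined`.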

3.3 Create Extraction Task

Endpoint: POST /api/v1/dc/tool-b/tasks

Purpose: Creates a batch extraction task and pushes it to the asynchronous queue

Request body:

{
  "projectName": "肺癌病理数据提取-2025Q1",
  "fileKey": "uploads/user123/lung_cancer_pathology.xlsx",
  "textColumn": "病历文本",
  "diseaseType": "lung_cancer",
  "reportType": "pathology",
  "targetFields": [
    {
      "name": "病理类型",
      "desc": "如：浸润性腺癌、鳞状细胞癌"
    },
    {
      "name": "分化程度",
      "desc": "高/中/低分化"
    }
  ]
}

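Request parameters:

Field	Type	Required	Description
`projectName`	string	✅	Task name
`fileKey`	string	✅	File path in Storage
`textColumn`	string	✅	Name of the text column
`diseaseType`	string	✅	Disease type
`reportType`	string	✅	Report type
`targetFields`	array	✅	Extraction field configuration

Response (200):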

{
  "taskId": "550e8400-e29b-41d4-a716-446655440000"
}

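Flow:

Verify that the file exists
Parse the Excel file and count the total rows
Create the task record (status=pending)
Push the task to the BullMQ queue
Return the taskId immediately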
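
The five steps above can be sketched with the storage, parser, database, and queue hidden behind an interface. Everything named here (`TaskDeps`, `TaskRecord`, `createTask`, and the individual methods) is hypothetical; the real service's storage access and BullMQ wiring are not shown in this document.

```typescript
interface CreateTaskRequest {
  projectName: string;
  fileKey: string;
  textColumn: string;
}

interface TaskRecord extends CreateTaskRequest {
  id: string;
  status: 'pending';
  totalCount: number;
}

// Hypothetical collaborators standing in for storage, Excel parsing,
// the database, and the background queue.
interface TaskDeps {
  fileExists(fileKey: string): Promise<boolean>;
  countRows(fileKey: string): Promise<number>;
  insertTask(task: TaskRecord): Promise<void>;
  enqueue(taskId: string): Promise<void>;
  newId(): string;
}

async function createTask(req: CreateTaskRequest, deps: TaskDeps): Promise<{ taskId: string }> {
  // 1. Verify that the file exists.
  if (!(await deps.fileExists(req.fileKey))) {
    throw new Error(`File not found: ${req.fileKey}`);
  }
  // 2. Parse the Excel file and count the rows.
  const totalCount = await deps.countRows(req.fileKey);
  // 3. Create the task record with status=pending.
  const id = deps.newId();
  await deps.insertTask({ ...req, id, status: 'pending', totalCount });
  // 4. Hand the task to the queue; a worker processes it later.
  await deps.enqueue(id);
  // 5. Return right away with the task id.
  return { taskId: id };
}
```

Keeping the extraction itself out of this function is what lets the endpoint respond quickly regardless of file size.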

Error response:

{
  "error": "文件不存在: uploads/user123/lung_cancer_pathology.xlsx",
  "code": 404
}

3.4 Query Task Progress

Endpoint: GET /api/v1/dc/tool-b/tasks/:taskId/progress

Purpose: Queries task processing progress in real time

Request:

GET /api/v1/dc/tool-b/tasks/550e8400-e29b-41d4-a716-446655440000/progress

Response (200):

{
  "taskId": "550e8400-e29b-41d4-a716-446655440000",
  "status": "processing",
  "progress": 50,
  "totalCount": 500,
  "processedCount": 250,
  "cleanCount": 200,
  "conflictCount": 45,
  "failedCount": 5,
  "totalTokens": 75000,
  "totalCost": 0.135,
  "startedAt": "2025-11-27T10:00:00.000Z",
  "completedAt": null
}

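Response fields:

Field	Type	Description
`status`	string	`pending/processing/completed/failed`
`progress`	number	Progress percentage (0-100)
`totalCount`	number	Total number of records
`processedCount`	number	Records processed so far
`cleanCount`	number	Records on which both models agree
`conflictCount`	number	Records with conflicts
`failedCount`	number	Records that failed
`totalTokens`	number	Cumulative token count
`totalCost`	number	Cumulative cost ($)

Polling recommendations:

The client polls once every 3 seconds
Stop polling when status = 'completed' (or 'failed', which is also a terminal state)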
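
A polling loop that follows these recommendations might look like the sketch below. `pollUntilDone`, `fetchProgress`, and `sleep` are hypothetical names; the fetcher and timer are passed in so the loop does not depend on a particular HTTP client. Stopping on `failed` as well as `completed` is an assumption made here so that the loop cannot run forever.

```typescript
type TaskStatus = 'pending' | 'processing' | 'completed' | 'failed';

interface TaskProgress {
  status: TaskStatus;
  progress: number; // 0-100
}

async function pollUntilDone(
  fetchProgress: () => Promise<TaskProgress>,
  sleep: (ms: number) => Promise<void>,
  intervalMs = 3000,
): Promise<TaskProgress> {
  for (;;) {
    const current = await fetchProgress();
    // Both terminal states end the loop.
    if (current.status === 'completed' || current.status === 'failed') {
      return current;
    }
    await sleep(intervalMs);
  }
}
```

In a browser, `sleep` could be `(ms) => new Promise((resolve) => setTimeout(resolve, ms))`, and `fetchProgress` would wrap a request to the progress endpoint.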

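3.5 Get Verification Grid Data

Endpoint: GET /api/v1/dc/tool-b/tasks/:taskId/items

Purpose: Returns the dual-model extraction results for manual adjudication

Request:

GET /api/v1/dc/tool-b/tasks/550e8400.../items?page=1&limit=50&status=conflict

Query parameters:

Parameter	Type	Required	Default	Description
`page`	number	❌	1	Page number
`limit`	number	❌	50	Items per page
`status`	string	❌	-	Status filter

Response (200):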

{
  "items": [
    {
      "id": "item-123",
      "rowIndex": 5,
      "originalText": "患者，男，45岁，诊断为浸润性腺癌，中分化，肿瘤最大径3cm...",
      "resultA": {
        "病理类型": "浸润性腺癌",
        "分化程度": "中分化",
        "肿瘤大小": "3cm"
      },
      "resultB": {
        "病理类型": "浸润性腺癌",
        "分化程度": "中分化",
        "肿瘤大小": "3.0cm"
      },
      "status": "conflict",
      "conflictFields": ["肿瘤大小"],
      "finalResult": null
    }
  ],
  "pagination": {
    "total": 45,
    "page": 1,
    "pageSize": 50,
    "totalPages": 1
  }
}

Response fields:

Field	Type	Description
`items`	array	Record list
`items[].status`	string	`clean/conflict/resolved/failed`
`items[].conflictFields`	array	List of conflicting fields
`pagination`	object	Pagination info
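
How `conflictFields` could be derived from `resultA` and `resultB` is sketched below. The function names are hypothetical, and exact string comparison is an assumption; it is at least consistent with the sample response, where "3cm" and "3.0cm" are reported as a conflict.

```typescript
type Extraction = Record<string, string>;

// Returns the names of the fields whose values differ between the two
// model outputs, including fields that only one model produced.
function findConflictFields(resultA: Extraction, resultB: Extraction): string[] {
  const fields = Object.keys(resultA);
  for (const key of Object.keys(resultB)) {
    if (fields.indexOf(key) === -1) {
      fields.push(key);
    }
  }
  return fields.filter((field) => resultA[field] !== resultB[field]);
}

// A record is clean only when the two models agree on every field.
function classify(resultA: Extraction, resultB: Extraction): 'clean' | 'conflict' {
  return findConflictFields(resultA, resultB).length === 0 ? 'clean' : 'conflict';
}
```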

3.6 Resolve Conflict

Endpoint: POST /api/v1/dc/tool-b/items/:itemId/resolve

Purpose: A reviewer manually selects the correct extraction result

Request:

{
  "field": "肿瘤大小",
  "chosenValue": "3cm"
}

Request parameters:

Field	Type	Required	Description
`field`	string	✅	Name of the conflicting field
`chosenValue`	string	✅	The selected value

Response (200):

{
  "success": true
}

Business logic:

Set finalResult[field] = chosenValue
Remove the field from conflictFields
Once all conflicts are resolved, set status = 'resolved'
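
These three rules translate into a small state update. The sketch below uses the hypothetical names `ReviewItem` and `resolveField`, and returns a new object rather than mutating the input, which is a design choice made here and not something the document specifies.

```typescript
interface ReviewItem {
  status: 'pending' | 'clean' | 'conflict' | 'resolved' | 'failed';
  conflictFields: string[];
  finalResult?: Record<string, string> | null;
}

// Applies one adjudication and returns the updated item.
function resolveField(item: ReviewItem, field: string, chosenValue: string): ReviewItem {
  // Record the chosen value for this field.
  const finalResult = { ...(item.finalResult ?? {}), [field]: chosenValue };
  // The field is no longer in conflict.
  const conflictFields = item.conflictFields.filter((name) => name !== field);
  // The item is resolved only when nothing is left to adjudicate.
  const status = conflictFields.length === 0 ? 'resolved' : item.status;
  return { ...item, finalResult, conflictFields, status };
}
```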

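3.7 Export Results

Endpoint: GET /api/v1/dc/tool-b/tasks/:taskId/export

Purpose: Exports the final extraction results as an Excel file

Request:

GET /api/v1/dc/tool-b/tasks/550e8400.../export?format=xlsx

Query parameters:

Parameter	Type	Required	Default	Description
`format`	string	❌	`xlsx`	Export format: `xlsx/csv`

Response (200):

File stream download
Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Content-Disposition: attachment; filename="extraction_result_2025-11-27.xlsx"

Export contents:

Includes the original columns plus all extracted fields
Only records with status clean or resolved are included
Conflict records are not exported (they require manual adjudication first)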
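
The export filter can be sketched as follows. `ExportableItem` and `selectExportRows` are hypothetical names. Using `finalResult` when present and falling back to `resultA` for clean records is an assumption; the document does not say which model's output is exported when both agree.

```typescript
interface ExportableItem {
  rowIndex: number;
  status: 'pending' | 'clean' | 'conflict' | 'resolved' | 'failed';
  resultA?: Record<string, string>;
  finalResult?: Record<string, string> | null;
}

interface ExportRow {
  rowIndex: number;
  fields: Record<string, string>;
}

// Keeps only the records that the export rules allow.
function selectExportRows(items: ExportableItem[]): ExportRow[] {
  return items
    .filter((item) => item.status === 'clean' || item.status === 'resolved')
    .map((item) => ({
      rowIndex: item.rowIndex,
      // Assumption: clean records may have no finalResult, so model A's
      // output is used for them.
      fields: item.finalResult ?? item.resultA ?? {},
    }));
}
```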

4. Data Models

4.1 HealthCheckResult

interface HealthCheckResult {
  status: 'good' | 'bad';
  emptyRate: number;
  avgLength: number;
  totalRows: number;
  estimatedTokens: number;
  message: string;
}

4.2 Template

interface Template {
  diseaseType: string;
  reportType: string;
  displayName: string;
  fields: TemplateField[];
}

interface TemplateField {
  name: string;
  desc: string;
  width?: string;
}

4.3 ExtractionTask

interface ExtractionTask {
  id: string;
  userId: string;
  projectName: string;
  sourceFileKey: string;
  textColumn: string;
  
  diseaseType: string;
  reportType: string;
  targetFields: TemplateField[];
  
  status: 'pending' | 'processing' | 'completed' | 'failed';
  totalCount: number;
  processedCount: number;
  cleanCount: number;
  conflictCount: number;
  failedCount: number;
  
  totalTokens: number;
  totalCost: number;
  
  createdAt: Date;
  startedAt?: Date;
  completedAt?: Date;
}

4.4 ExtractionItem

interface ExtractionItem {
  id: string;
  taskId: string;
  rowIndex: number;
  originalText: string;
  
  resultA?: Record<string, any>;
  resultB?: Record<string, any>;
  
  status: 'pending' | 'clean' | 'conflict' | 'resolved' | 'failed';
  conflictFields: string[];
  
  finalResult?: Record<string, any>;
  
  tokensA: number;
  tokensB: number;
}

5. Error Handling

5.1 Error Response Format

{
  "error": "错误描述",
  "code": 400,
  "details": {
    "field": "fileKey",
    "reason": "文件不存在"
  }
}

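5.2 Common Error Codes

HTTP status	code	Description	Example
400	`INVALID_PARAMS`	Invalid parameters	fileKey is missing
400	`COLUMN_NOT_FOUND`	Column not found	Column "病历文本" does not exist
400	`BAD_HEALTH`	Health check failed	Empty rate too high
404	`FILE_NOT_FOUND`	File not found	Invalid file path
404	`TASK_NOT_FOUND`	Task not found	Invalid taskId
403	`FORBIDDEN`	Access denied	Users can only access their own tasks
500	`INTERNAL_ERROR`	Server error	Database connection failed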
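
The table can be captured as a typed lookup so that handlers and clients share one source of truth. This is an illustrative sketch; `ErrorCode`, `HTTP_STATUS`, and the two helpers are hypothetical names.

```typescript
type ErrorCode =
  | 'INVALID_PARAMS'
  | 'COLUMN_NOT_FOUND'
  | 'BAD_HEALTH'
  | 'FILE_NOT_FOUND'
  | 'TASK_NOT_FOUND'
  | 'FORBIDDEN'
  | 'INTERNAL_ERROR';

// HTTP status for each error code, copied from the table above.
const HTTP_STATUS: Record<ErrorCode, number> = {
  INVALID_PARAMS: 400,
  COLUMN_NOT_FOUND: 400,
  BAD_HEALTH: 400,
  FILE_NOT_FOUND: 404,
  TASK_NOT_FOUND: 404,
  FORBIDDEN: 403,
  INTERNAL_ERROR: 500,
};

function httpStatusFor(code: ErrorCode): number {
  return HTTP_STATUS[code];
}

// True for errors the caller can fix by changing the request (4xx).
function isClientError(code: ErrorCode): boolean {
  const status = HTTP_STATUS[code];
  return status >= 400 && status < 500;
}
```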

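5.3 Error Handling Best Practices

Client side:

async function runHealthCheck(fileKey: string, columnName: string) {
  try {
    const response = await fetch('/api/v1/dc/tool-b/health-check', {
      method: 'POST',
      // The API expects a JSON body, so the content type must be set explicitly
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ fileKey, columnName })
    });

    // Non-2xx responses carry an error body
    if (!response.ok) {
      const error = await response.json();
      throw new Error(error.error);
    }

    const data = await response.json();

    // A failed health check still returns HTTP 200, so inspect the status field
    if (data.status === 'bad') {
      alert(data.message); // health check did not pass
      return;
    }

    // Continue to the next step
  } catch (error) {
    console.error('Health check failed:', error);
  }
}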

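6. Performance Targets

6.1 Response Time Targets

API	Target	Notes
`/health-check`	< 3 s	Excel parsing + statistics
`/templates`	< 100 ms	In-memory cache
`/tasks` (create)	< 500 ms	Create quickly and return
`/tasks/:id/progress`	< 100 ms	Single database query
`/tasks/:id/items`	< 500 ms	Paginated query
`/items/:id/resolve`	< 200 ms	Single-row update
`/tasks/:id/export`	< 10 s	Excel file generation

6.2 Concurrency Capacity

Health check: 10 req/s (I/O-bound)
Task creation: 5 req/s (database writes)
Progress query: 100 req/s (read-heavy, cacheable)
Verification grid: 50 req/s (paginated query)

6.3 Optimization Strategies

Caching:

/templates → permanent cache (in memory)
/tasks/:id/progress → Redis cache (5-second TTL)

Asynchronous processing:

Task processing runs in a BullMQ background queue
This avoids blocking user requests

Pagination:

The verification grid defaults to 50 items per page
The maximum is 1,000 items per page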
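
The pagination limits can be enforced when the query string is parsed. The sketch below is an assumption-laden illustration: `paginate` is a hypothetical name, and clamping an oversized `limit` to the maximum (rather than rejecting the request with a 400) is a choice made here, not something the document specifies.

```typescript
const DEFAULT_PAGE_SIZE = 50;
const MAX_PAGE_SIZE = 1000;

interface Pagination {
  total: number;
  page: number;
  pageSize: number;
  totalPages: number;
}

// Normalizes raw query values (which arrive as strings) into pagination info.
function paginate(total: number, pageRaw?: string, limitRaw?: string): Pagination {
  // Invalid or missing values fall back to the defaults.
  const page = Math.max(1, Math.floor(Number(pageRaw ?? 1)) || 1);
  const requested = Math.floor(Number(limitRaw ?? DEFAULT_PAGE_SIZE)) || DEFAULT_PAGE_SIZE;
  // Clamp the page size into the range 1..MAX_PAGE_SIZE.
  const pageSize = Math.min(MAX_PAGE_SIZE, Math.max(1, requested));
  return { total, page, pageSize, totalPages: Math.ceil(total / pageSize) };
}
```

With no query parameters and 45 records, this yields page 1 of 1 with a page size of 50, matching the sample response in section 3.5.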

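7. Versioning

7.1 API Versioning Strategy

Current version: v1

URL format: /api/v1/dc/tool-b/*

Backward-compatibility commitment:

The v1 API remains stable until 2026
New features are added through optional parameters
Breaking changes are released as v2

7.2 Deprecation Notice

When an API needs to be deprecated: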

HTTP/1.1 200 OK
X-API-Deprecated: true
X-API-Sunset: 2026-12-31
X-API-Replacement: /api/v2/dc/tool-b/health-check
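
A client can watch for these headers and surface a warning. The sketch below reads them from a plain header map; `DeprecationInfo` and `readDeprecation` are hypothetical names, and the header names are taken from the example above.

```typescript
interface DeprecationInfo {
  deprecated: boolean;
  sunset: string | undefined;      // value of X-API-Sunset, if present
  replacement: string | undefined; // value of X-API-Replacement, if present
}

// Reads the deprecation headers from a plain header map.
// HTTP header names are case-insensitive, so keys are lower-cased first.
function readDeprecation(headers: Record<string, string>): DeprecationInfo {
  const lower: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    const value = headers[name];
    if (value !== undefined) {
      lower[name.toLowerCase()] = value;
    }
  }
  return {
    deprecated: lower['x-api-deprecated'] === 'true',
    sunset: lower['x-api-sunset'],
    replacement: lower['x-api-replacement'],
  };
}
```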

8. Testing

8.1 Postman Collection

The complete API test collection:

docs/03-业务模块/DC-数据清洗整理/02-技术设计/ToolB-API.postman_collection.json

8.2 Sample Requests

Health check:

curl -X POST http://localhost:3001/api/v1/dc/tool-b/health-check \
  -H "Content-Type: application/json" \
  -d '{
    "fileKey": "uploads/test.xlsx",
    "columnName": "病历文本"
  }'

List templates:

curl http://localhost:3001/api/v1/dc/tool-b/templates

Create task:

curl -X POST http://localhost:3001/api/v1/dc/tool-b/tasks \
  -H "Content-Type: application/json" \
  -d '{
    "projectName": "测试任务",
    "fileKey": "uploads/test.xlsx",
    "textColumn": "病历文本",
    "diseaseType": "lung_cancer",
    "reportType": "pathology",
    "targetFields": [{"name": "病理类型", "desc": "..."}]
  }'

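9. Appendix

9.1 Related Documents

Database design document
[PRD document](../01-需求分析/PRD：Tool B - 病历结构化机器人 (The AI Structurer).md)
Development plan

9.2 Changelog

Version	Date	Changes
V1.0	2025-11-27	Initial version, 7 API endpoints

End of document ✅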
