Data ETL Engine
Capability tier: General capability layer
Reuse rate: 29% (2 dependent modules)
Priority: P2
Status: ⏳ Not yet implemented
📋 Capability Overview
The Data ETL Engine is responsible for:
- Multi-table Excel JOINs
- Data cleaning
- Data transformation
- Data validation
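📊 Dependent Modules
Two modules depend on this engine (29% reuse rate):
- DC - Data Cleaning and Organization (core dependency)
- SSA - Smart Statistical Analysis (data preprocessing)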
💡 Core Features
1. Excel multi-table processing
- Read multiple Excel files
- Automatic JOIN operations
- GROUP BY aggregation
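The JOIN and GROUP BY steps above can be sketched in plain Python to show the intended semantics. This is only an illustration, not the engine's implementation: the helper names (`inner_join`, `join_all`, `group_sum`) and the sample columns (`patient_id`, `site`, `cost`) are hypothetical, and rows are assumed to have already been read from Excel into lists of dicts.

```python
from collections import defaultdict
from functools import reduce
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def inner_join(left: List[Row], right: List[Row], keys: List[str]) -> List[Row]:
    """Hash join: index the right table by key, then probe it with each left row."""
    index: Dict[Tuple, List[Row]] = defaultdict(list)
    for row in right:
        index[tuple(row[k] for k in keys)].append(row)
    joined = []
    for row in left:
        for match in index.get(tuple(row[k] for k in keys), []):
            joined.append({**row, **match})
    return joined


def join_all(tables: List[List[Row]], keys: List[str]) -> List[Row]:
    """Join any number of tables, left to right, on the same key columns."""
    return reduce(lambda acc, table: inner_join(acc, table, keys), tables)


def group_sum(rows: Iterable[Row], by: str, value: str) -> Dict[object, float]:
    """GROUP BY `by`, summing the `value` column."""
    totals: Dict[object, float] = defaultdict(float)
    for row in rows:
        totals[row[by]] += float(row[value])
    return dict(totals)


# Hypothetical sample data standing in for two Excel sheets.
patients = [
    {"patient_id": 1, "site": "A"},
    {"patient_id": 2, "site": "B"},
    {"patient_id": 3, "site": "A"},
]
visits = [
    {"patient_id": 1, "cost": 10.0},
    {"patient_id": 1, "cost": 5.0},
    {"patient_id": 3, "cost": 7.5},
]

merged = join_all([patients, visits], keys=["patient_id"])
cost_by_site = group_sum(merged, by="site", value="cost")
```

Patient 2 has no visits, so an inner join drops that row; whether the engine should default to inner or left joins is a design decision the source does not specify.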
2. Data cleaning
- Missing-value handling
- Duplicate handling
- Outlier detection
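A minimal sketch of the three cleaning steps, using only the standard library. The strategies chosen here (median imputation, exact-row deduplication, the 1.5 x IQR rule) are assumptions for illustration; the source does not say which rules the engine will apply, and the function names and sample data are hypothetical.

```python
import statistics
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]


def drop_duplicates(rows: List[Row]) -> List[Row]:
    """Keep the first occurrence of each fully identical row."""
    seen = set()
    unique = []
    for r in rows:
        fingerprint = tuple(sorted(r.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(r)
    return unique


def fill_missing(rows: List[Row], column: str) -> List[Row]:
    """Replace None in `column` with the median of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    median = statistics.median(observed)
    return [{**r, column: median if r[column] is None else r[column]} for r in rows]


def flag_outliers(rows: List[Row], column: str, k: float = 1.5) -> List[bool]:
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = [r[column] for r in rows]
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v < low or v > high for v in values]


# Hypothetical sample: one missing age, one duplicated row, one implausible age.
raw = [
    {"id": 1, "age": 30.0},
    {"id": 2, "age": None},
    {"id": 3, "age": 32.0},
    {"id": 3, "age": 32.0},
    {"id": 4, "age": 31.0},
    {"id": 5, "age": 29.0},
    {"id": 6, "age": 300.0},
]

deduped = drop_duplicates(raw)
filled = fill_missing(deduped, "age")
flags = flag_outliers(filled, "age")
```

Deduplicating before imputing keeps the repeated row from being counted twice when the median is computed.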
3. Data transformation
- Type conversion
- Format standardization
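As an illustration of type conversion and format standardization, the sketch below converts spreadsheet-style text cells to floats and normalizes dates to ISO 8601. The accepted input formats in `DATE_FORMATS` and both function names are assumptions, not part of the planned engine.

```python
from datetime import datetime
from typing import Optional

# Assumed input formats; a real rule set would come from configuration.
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%d.%m.%Y")


def to_float(raw: str) -> Optional[float]:
    """Convert a text cell to float; blank or non-numeric cells become None."""
    text = raw.strip().replace(",", "")  # drop thousands separators
    try:
        return float(text)
    except ValueError:
        return None


def to_iso_date(raw: str) -> Optional[str]:
    """Normalize a date string to YYYY-MM-DD, or None if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


amount = to_float(" 1,234.5 ")
blank = to_float("")
slash_date = to_iso_date("2025/11/06")
dotted_date = to_iso_date("06.11.2025")
unknown_date = to_iso_date("Nov 6")
```

Returning `None` instead of raising lets a later validation step decide how to report cells that failed conversion.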
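🏗️ Technical Approach
Cloud version (preferred)
# Based on Polars (very high performance)
from __future__ import annotations  # `File` is left undefined in this interface sketch
from typing import Dict, List
from polars import DataFrame

class ETLEngine:
    def read_excel(self, files: List[File]) -> List[DataFrame]: ...
    def join(self, dfs: List[DataFrame], keys: List[str]) -> DataFrame: ...
    def clean(self, df: DataFrame, rules: Dict) -> DataFrame: ...
    def export(self, df: DataFrame, format: str) -> bytes: ...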
Standalone version (compatible)
# Based on SQLite (memory-friendly)
# Chunked reads; the database engine performs the JOIN
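The standalone approach could look roughly like the sketch below, which uses Python's built-in `sqlite3` module: rows are inserted in small batches and the JOIN and aggregation are delegated to SQLite. The table names, columns, and batch size are hypothetical, and the row iterables stand in for a streaming Excel reader that is not shown here.

```python
import sqlite3
from itertools import islice
from typing import Iterable, Iterator, List, Tuple


def chunks(rows: Iterable[Tuple], size: int) -> Iterator[List[Tuple]]:
    """Yield lists of at most `size` rows so the full sheet is never held at once."""
    iterator = iter(rows)
    while batch := list(islice(iterator, size)):
        yield batch


def load(conn: sqlite3.Connection, table: str, width: int,
         rows: Iterable[Tuple], size: int = 1000) -> None:
    """Insert rows batch by batch, committing after each batch."""
    sql = f"INSERT INTO {table} VALUES ({', '.join('?' * width)})"
    for batch in chunks(rows, size):
        conn.executemany(sql, batch)
        conn.commit()


# ":memory:" keeps the example self-contained; an on-disk database file
# would be the memory-friendly choice for large inputs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (patient_id INTEGER, site TEXT)")
conn.execute("CREATE TABLE visits (patient_id INTEGER, cost REAL)")

# Hypothetical rows standing in for two Excel sheets read as streams.
load(conn, "patients", 2, iter([(1, "A"), (2, "B"), (3, "A")]), size=2)
load(conn, "visits", 2, iter([(1, 10.0), (1, 5.0), (2, 2.0), (3, 7.5)]), size=2)

totals = conn.execute(
    """
    SELECT p.site, SUM(v.cost)
    FROM patients AS p
    JOIN visits AS v ON v.patient_id = p.patient_id
    GROUP BY p.site
    ORDER BY p.site
    """
).fetchall()
conn.close()
```

Because SQLite executes the JOIN itself, Python only ever holds one batch of input rows plus the aggregated result.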
🔗 Related Documents
Last updated: 2025-11-06
Maintainer: Technical Architect