AIclinicalresearch

Author	SHA1	Message	Date
HaHafeng	fa72beea6c	feat(platform): Complete Postgres-Only architecture refactoring (Phase 1-7) Major Changes: - Implement Platform-Only architecture pattern (unified task management) - Add PostgresCacheAdapter for unified caching (platform_schema.app_cache) - Add PgBossQueue for job queue management (platform_schema.job) - Implement CheckpointService using job.data (generic for all modules) - Add intelligent threshold-based dual-mode processing (THRESHOLD=50) - Add task splitting mechanism (auto chunk size recommendation) - Refactor ASL screening service with smart mode selection - Refactor DC extraction service with smart mode selection - Register workers for ASL and DC modules Technical Highlights: - All task management data stored in platform_schema.job.data (JSONB) - Business tables remain clean (no task management fields) - CheckpointService is generic (shared by all modules) - Zero code duplication (DRY principle) - Follows 3-layer architecture principle - Zero additional cost (no Redis needed, save 8400 CNY/year) Code Statistics: - New code: ~1750 lines - Modified code: ~500 lines - Test code: ~1800 lines - Documentation: ~3000 lines Testing: - Unit tests: 8/8 passed - Integration tests: 2/2 passed - Architecture validation: passed - Linter errors: 0 Files: - Platform layer: PostgresCacheAdapter, PgBossQueue, CheckpointService, utils - ASL module: screeningService, screeningWorker - DC module: ExtractionController, extractionWorker - Tests: 11 test files - Docs: Updated 4 key documents Status: Phase 1-7 completed, Phase 8-9 pending	2025-12-13 16:10:04 +08:00
HaHafeng	74cf346453	feat(dc/tool-c): Add missing value imputation feature with 6 methods and MICE Major features: 1. Missing value imputation (6 simple methods + MICE): - Mean/Median/Mode/Constant imputation - Forward fill (ffill) and Backward fill (bfill) for time series - MICE multivariate imputation (in progress, shape issue to fix) 2. Auto precision detection: - Automatically match decimal places of original data - Prevent false precision (e.g. 13.57 instead of 13.566716417910449) 3. Categorical variable detection: - Auto-detect and skip categorical columns in MICE - Show warnings for unsuitable columns - Suggest mode imputation for categorical data 4. UI improvements: - Rename button: "Delete Missing" to "Missing Value Handling" - Remove standalone "Dedup" and "MICE" buttons - 3-tab dialog: Delete / Fill / Advanced Fill - Display column statistics and recommended methods - Extended warning messages (8 seconds for skipped columns) 5. Bug fixes: - Fix sessionService.updateSessionData -> saveProcessedData - Fix OperationResult interface (add message and stats) - Fix Toolbar button labels and removal Modified files: Python: operations/fillna.py (new, 556 lines), main.py (3 new endpoints) Backend: QuickActionService.ts, QuickActionController.ts, routes/index.ts Frontend: MissingValueDialog.tsx (new, 437 lines), Toolbar.tsx, index.tsx Tests: test_fillna_operations.py (774 lines), test scripts and docs Docs: 5 documentation files updated Known issues: - MICE imputation has DataFrame shape mismatch issue (under debugging) - Workaround: Use 6 simple imputation methods first Status: Development complete, MICE debugging in progress Lines added: ~2000 lines across 3 tiers	2025-12-10 13:06:00 +08:00
HaHafeng	75ceeb0653	hotfix(dc/tool-c): Fix compute formula validation and binning NaN serialization Critical fixes: 1. Compute column: Add Chinese comma support in formula validation - Problem: Formula with Chinese comma failed validation - Fix: Add Chinese comma character to allowed_chars regex - Example: Support formulas like 'col1（kg）+ col2，col3' 2. Binning operation: Fix NaN serialization error - Problem: 'Out of range float values are not JSON compliant: nan' - Fix: Enhanced NaN/inf handling in binning endpoint - Added np.inf/-np.inf replacement before JSON serialization - Added manual JSON serialization with NaN->null conversion 3. Enhanced all operation endpoints for consistency - Updated conditional, dropna endpoints with same NaN/inf handling - Ensures all operations return JSON-compliant data Modified files: - extraction_service/operations/compute.py: Add Chinese comma to regex - extraction_service/main.py: Enhanced NaN handling in binning/conditional/dropna Status: Hotfix complete, ready for testing	2025-12-09 08:45:27 +08:00
HaHafeng	91cab452d1	fix(dc/tool-c): Fix special character handling and improve UX Major fixes: - Fix pivot transformation with special characters in column names - Fix compute column validation for Chinese punctuation - Fix recode dialog to fetch unique values from full dataset via new API - Add column mapping mechanism to handle special characters Database migration: - Add column_mapping field to dc_tool_c_sessions table - Migration file: 20251208_add_column_mapping UX improvements: - Darken table grid lines for better visibility - Reduce column width by 40% with tooltip support - Insert new columns next to source columns - Preserve original row order after operations - Add notice about 50-row preview limit Modified files: - Backend: SessionService, SessionController, QuickActionService, routes - Python: pivot.py, compute.py, recode.py, binning.py, conditional.py - Frontend: DataGrid, RecodeDialog, index.tsx, ag-grid-custom.css - Database: schema.prisma, migration SQL Status: Code complete, database migrated, ready for testing	2025-12-08 23:20:55 +08:00
HaHafeng	f729699510	feat(dc): Complete Tool C quick action buttons Phase 1-2 - 7 functions Summary: - Implement 7 quick action functions (filter, recode, binning, conditional, dropna, compute, pivot) - Refactor to pre-written Python functions architecture (stable and secure) - Add 7 Python operations modules with full type hints - Add 7 frontend Dialog components with user-friendly UI - Fix NaN serialization issues and auto type conversion - Update all related documentation Technical Details: - Python: operations/ module (filter.py, recode.py, binning.py, conditional.py, dropna.py, compute.py, pivot.py) - Backend: QuickActionService.ts with 7 execute methods - Frontend: 7 Dialog components with complete validation - Toolbar: Enable 7 quick action buttons Status: Phase 1-2 completed, basic testing passed, ready for further testing	2025-12-08 17:38:08 +08:00
HaHafeng	f01981bf78	feat(dc/tool-c): 完成AI代码生成服务（Day 3 MVP）核心功能： - 新增AICodeService（550行）：AI代码生成核心服务 - 新增AIController（257行）：4个API端点 - 新增dc_tool_c_ai_history表：存储对话历史 - 实现自我修正机制：最多3次智能重试 - 集成LLMFactory：复用通用能力层 - 10个Few-shot示例：覆盖Level 1-4场景技术优化： - 修复NaN序列化问题（Python端转None） - 修复数据传递问题（从Session获取真实数据） - 优化System Prompt（明确环境信息） - 调整Few-shot示例（移除import语句）测试结果： - 通过率：9/11（81.8%）达到MVP标准 - 成功场景：缺失值处理、编码、分箱、BMI、筛选、填补、统计、分类 - 待优化：数值清洗、智能去重（已记录技术债务TD-C-006） API端点： - POST /api/v1/dc/tool-c/ai/generate（生成代码） - POST /api/v1/dc/tool-c/ai/execute（执行代码） - POST /api/v1/dc/tool-c/ai/process（生成并执行，一步到位） - GET /api/v1/dc/tool-c/ai/history/:sessionId（对话历史）文档更新： - 新增Day 3开发完成总结（770行） - 新增复杂场景优化技术债务（TD-C-006） - 更新工具C当前状态文档 - 更新技术债务清单影响范围： - backend/src/modules/dc/tool-c/（新增2个文件，更新1个文件） - backend/scripts/create-tool-c-ai-history-table.mjs（新增） - backend/prisma/schema.prisma（新增DcToolCAiHistory模型） - extraction_service/services/dc_executor.py（NaN序列化修复） - docs/03-业务模块/DC-数据清洗整理/（5份文档更新） Breaking Changes: 无总代码行数：+950行 Refs: #Tool-C-Day3	2025-12-07 16:21:32 +08:00
HaHafeng	2348234013	feat(dc/tool-c): Day 2 - Session管理与数据处理完成核心功能: - 数据库: 创建dc_tool_c_sessions表 (12字段, 3索引) - 服务层: SessionService (383行), DataProcessService (303行) - 控制器: SessionController (300行, 6个API端点) - 路由: 新增6个Session管理路由 - 测试: 7个API测试全部通过 (100%) 技术亮点: - 零落盘架构: Excel内存解析, OSS存储 - Session管理: 10分钟过期, 心跳延长机制 - 云原生规范: storage/logger/prisma全平台复用 - 完整测试: 上传/预览/完整数据/删除/心跳文件清单: - backend/prisma/schema.prisma (新增DcToolCSession模型) - backend/prisma/migrations/create_tool_c_session.sql - backend/scripts/create-tool-c-table.mjs - backend/src/modules/dc/tool-c/services/ (SessionService, DataProcessService) - backend/src/modules/dc/tool-c/controllers/SessionController.ts - backend/src/modules/dc/tool-c/routes/index.ts - backend/test-tool-c-day2.mjs - docs/03-业务模块/DC-数据清洗整理/00-工具C当前状态与开发指南.md - docs/03-业务模块/DC-数据清洗整理/06-开发记录/2025-12-06_工具C_Day2开发完成总结.md 代码统计: ~1900行测试结果: 7/7 通过 (100%) 云原生规范: 完全符合	2025-12-06 22:12:47 +08:00

7 Commits