HaHafeng 74cf346453 feat(dc/tool-c): Add missing value imputation feature with 6 methods and MICE

Major features:
1. Missing value imputation (6 simple methods + MICE):
   - Mean/Median/Mode/Constant imputation
   - Forward fill (ffill) and Backward fill (bfill) for time series
   - MICE multivariate imputation (in progress, shape issue to fix)

2. Auto precision detection:
   - Automatically match decimal places of original data
   - Prevent false precision (e.g. 13.57 instead of 13.566716417910449)

3. Categorical variable detection:
   - Auto-detect and skip categorical columns in MICE
   - Show warnings for unsuitable columns
   - Suggest mode imputation for categorical data

4. UI improvements:
   - Rename button: "Delete Missing" to "Missing Value Handling"
   - Remove standalone "Dedup" and "MICE" buttons
   - 3-tab dialog: Delete / Fill / Advanced Fill
   - Display column statistics and recommended methods
   - Extended warning messages (8 seconds for skipped columns)

5. Bug fixes:
   - Fix sessionService.updateSessionData -> saveProcessedData
   - Fix OperationResult interface (add message and stats)
   - Fix Toolbar button labels and removal

Modified files:
Python: operations/fillna.py (new, 556 lines), main.py (3 new endpoints)
Backend: QuickActionService.ts, QuickActionController.ts, routes/index.ts
Frontend: MissingValueDialog.tsx (new, 437 lines), Toolbar.tsx, index.tsx
Tests: test_fillna_operations.py (774 lines), test scripts and docs
Docs: 5 documentation files updated

Known issues:
- MICE imputation has DataFrame shape mismatch issue (under debugging)
- Workaround: Use 6 simple imputation methods first

Status: Development complete, MICE debugging in progress
Lines added: ~2000 lines across 3 tiers

2025-12-10 13:06:00 +08:00


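🚀 Quick Start - Run the Tests in 1 Minute

Windows users

Method 1: Double-click to run (simplest)

Double-click run_tests.bat
Wait for the tests to finish

Method 2: Command line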

cd AIclinicalresearch\tests
run_tests.bat

Linux/Mac users

cd AIclinicalresearch/tests
chmod +x run_tests.sh
./run_tests.sh

⚠️ Prerequisites

The Python service must be running before you start the tests!

# Open a new terminal
cd AIclinicalresearch/extraction_service
python main.py

The service has started successfully when you see these lines:

INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8001
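
Before launching the test script, it can help to confirm that the service is actually accepting connections. The following is a minimal sketch, not part of the repository: the function name `service_is_up` is hypothetical, and the port 8001 is taken from the Uvicorn log line above. It only checks that a TCP connection can be opened; it does not verify that any particular endpoint responds.

```python
import socket


def service_is_up(host: str = "127.0.0.1", port: int = 8001, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper: the default port comes from the startup log,
    everything else is an assumption made for illustration.
    """
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if service_is_up():
        print("Service reachable on port 8001")
    else:
        print("Service not reachable; start it with: python main.py")
```

If the check fails, start the service as described above and run the check again before running the tests.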

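📊 Expected Results

✅ All tests pass:

Total tests: 18
✅ Passed: 18
❌ Failed: 0
Pass rate: 100.0%

🎉 All tests passed!

⚠️ Some tests fail:

Read the error messages printed in red
Check which specific tests failed
Check the Python service log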

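🎯 Test Coverage

✅ 6 simple imputation methods (mean, median, mode, constant, forward fill, backward fill)
✅ MICE multiple imputation (single column, multiple columns)
✅ Edge cases (100% missing, 0% missing, special characters)
✅ Various data types (numeric, categorical, mixed)
✅ Performance test (1000 rows of data)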
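
For readers unfamiliar with the six simple methods listed above, the sketch below shows what each one does to a single pandas column. It is an illustration only and is not the code in operations/fillna.py: the function name `fill_missing` and its parameters are hypothetical, and the handling of an all-missing column (left unchanged) is an assumption rather than the project's documented behaviour.

```python
import pandas as pd


def fill_missing(series: pd.Series, method: str, value=None) -> pd.Series:
    """Fill NaN values in `series` with one of six simple strategies (illustrative)."""
    if method == "mean":
        return series.fillna(series.mean())
    if method == "median":
        return series.fillna(series.median())
    if method == "mode":
        modes = series.mode(dropna=True)
        # A column that is 100% missing has no mode; return it unchanged.
        return series.fillna(modes.iloc[0]) if not modes.empty else series.copy()
    if method == "constant":
        if value is None:
            raise ValueError("constant imputation requires a fill value")
        return series.fillna(value)
    if method == "ffill":
        # Carry the last observed value forward (useful for time series).
        return series.ffill()
    if method == "bfill":
        # Pull the next observed value backward.
        return series.bfill()
    raise ValueError(f"unknown method: {method}")


if __name__ == "__main__":
    s = pd.Series([1.0, None, 3.0, None, 8.0])
    for m in ("mean", "median", "mode", "constant", "ffill", "bfill"):
        print(m, fill_missing(s, m, value=0).tolist())
```

Note that forward fill leaves a leading gap unfilled and backward fill leaves a trailing gap unfilled, which is one reason edge cases such as fully missing columns are worth testing separately.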

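💡 Tips

The first run installs the dependencies automatically (pandas, numpy, requests)
The tests take about 45-60 seconds
Test data is generated automatically; no manual preparation is needed
Colored output: green = passed, red = failed, yellow = warning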

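🆘 Troubleshooting

Problem 1: Cannot connect to the service

Solution: Make sure the Python service is running (python main.py)

Problem 2: Dependency installation fails

Solution: Install the dependencies manually with pip install pandas numpy requests

Problem 3: Tests fail

Solution: Read the error messages and check the code logic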
