feat(dc/tool-c): Add missing value imputation feature with 6 methods and MICE

Major features:
1. Missing value imputation (6 simple methods + MICE):
   - Mean/Median/Mode/Constant imputation
   - Forward fill (ffill) and Backward fill (bfill) for time series
   - MICE multivariate imputation (in progress, shape issue to fix)
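
A minimal sketch of the six simple methods, for illustration only. It uses plain Python lists with None for missing values rather than the code in operations/fillna.py, which is not shown in this commit view. The function names and the method keys ("mean", "median", "mode") are hypothetical.

```python
from statistics import mean, median, mode
from typing import Callable, Dict, List, Optional

Column = List[Optional[float]]

# Hypothetical method keys; the real identifiers in fillna.py may differ.
STATISTICS: Dict[str, Callable[[List[float]], float]] = {
    "mean": mean,
    "median": median,
    "mode": mode,
}


def fill_statistic(values: Column, method: str) -> Column:
    """Replace None with the mean, median, or mode of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing observed, so leave the column unchanged
    fill = STATISTICS[method](observed)
    return [fill if v is None else v for v in values]


def fill_constant(values: Column, constant: float) -> Column:
    """Replace None with a caller-supplied constant."""
    return [constant if v is None else v for v in values]


def forward_fill(values: Column) -> Column:
    """Carry the last observed value forward; leading gaps stay missing."""
    filled: Column = []
    last: Optional[float] = None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def backward_fill(values: Column) -> Column:
    """Pull the next observed value backward; trailing gaps stay missing."""
    return forward_fill(values[::-1])[::-1]
```

Forward and backward fill depend on row order, which is why they suit time series; the statistic-based fills ignore order entirely.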

2. Auto precision detection:
   - Automatically match decimal places of original data
   - Prevent false precision (e.g. 13.57 instead of 13.566716417910449)
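
One way the precision matching could work, sketched under assumptions: take the largest number of decimal places among the observed values and round the fill value to it. The helper names are hypothetical and the actual detection logic in fillna.py may differ.

```python
from decimal import Decimal
from typing import List, Optional


def decimal_places(value: float) -> int:
    """Count digits after the decimal point in the shortest repr of value."""
    exponent = Decimal(repr(value)).as_tuple().exponent
    return max(0, -exponent)


def detect_precision(values: List[Optional[float]]) -> int:
    """Largest number of decimal places among the observed (non-None) values."""
    observed = [v for v in values if v is not None]
    return max((decimal_places(v) for v in observed), default=0)


def round_to_column_precision(fill_value: float, values: List[Optional[float]]) -> float:
    """Round a computed fill value to the precision already used in the column."""
    return round(fill_value, detect_precision(values))
```

With a column such as [12.5, 14.25, None, 13.95] the detected precision is 2, so a computed fill of 13.566716417910449 is stored as 13.57. Note that this sketch counts a float like 12.0 as one decimal place, since its repr is "12.0".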

3. Categorical variable detection:
   - Auto-detect and skip categorical columns in MICE
   - Show warnings for unsuitable columns
   - Suggest mode imputation for categorical data
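
A sketch of the skip-and-warn behavior described above. The detection rule here (any observed non-numeric value marks the column as categorical) is an assumption, as are the function names and the warning text; the commit does not state the heuristic actually used.

```python
from numbers import Number
from typing import Dict, List, Tuple


def is_categorical(values: list) -> bool:
    """Assumed heuristic: any observed non-numeric (or boolean) value."""
    observed = [v for v in values if v is not None]
    return any(isinstance(v, bool) or not isinstance(v, Number) for v in observed)


def split_columns_for_mice(table: Dict[str, list]) -> Tuple[List[str], List[str]]:
    """Return the columns to pass to MICE and a warning per skipped column."""
    numeric: List[str] = []
    warnings: List[str] = []
    for name, values in table.items():
        if is_categorical(values):
            warnings.append(
                f"Column '{name}' looks categorical and was skipped; "
                "consider mode imputation instead."
            )
        else:
            numeric.append(name)
    return numeric, warnings
```

For example, a table with columns age, sex (strings), and bmi would send age and bmi to MICE and produce one warning for sex.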

4. UI improvements:
   - Rename button: "Delete Missing" to "Missing Value Handling"
   - Remove standalone "Dedup" and "MICE" buttons
   - 3-tab dialog: Delete / Fill / Advanced Fill
   - Display column statistics and recommended methods
   - Extend warning display time (8 seconds for skipped-column warnings)

5. Bug fixes:
   - Replace the sessionService.updateSessionData call with saveProcessedData
   - Fix OperationResult interface (add message and stats)
   - Fix Toolbar button labels and remove obsolete buttons

Modified files:
Python: operations/fillna.py (new, 556 lines), main.py (3 new endpoints)
Backend: QuickActionService.ts, QuickActionController.ts, routes/index.ts
Frontend: MissingValueDialog.tsx (new, 437 lines), Toolbar.tsx, index.tsx
Tests: test_fillna_operations.py (774 lines), test scripts and docs
Docs: 5 documentation files updated

Known issues:
- MICE imputation hits a DataFrame shape mismatch (still being debugged)
- Workaround: Use 6 simple imputation methods first

Status: Simple imputation methods complete; MICE debugging in progress
Lines added: ~2000 across 3 tiers
2025-12-10 13:06:00 +08:00
parent f4f1d09837
commit 74cf346453
102 changed files with 3806 additions and 181 deletions

tests/run_tests.sh Normal file

@@ -0,0 +1,45 @@
#!/bin/bash
# Linux/Mac script - run the missing value handling feature tests
echo "========================================"
echo "Missing Value Handling - Automated Tests"
echo "========================================"
echo
# Check whether Python is installed
if ! command -v python3 &> /dev/null; then
    echo "[ERROR] Python is not installed"
    exit 1
fi
echo "[1/3] Checking Python service status..."
if ! curl -s http://localhost:8001/health > /dev/null 2>&1; then
    echo "[WARNING] Python service is not running; start it first:"
    echo " cd extraction_service"
    echo " python main.py"
    echo
    exit 1
fi
echo "[OK] Python service is running"
echo
echo "[2/3] Checking dependencies..."
python3 -c "import pandas, numpy, requests" 2> /dev/null
if [ $? -ne 0 ]; then
    echo "[WARNING] Missing dependencies, installing..."
    pip3 install pandas numpy requests
fi
echo "[OK] Dependency check complete"
echo
echo "[3/3] Running tests..."
echo
python3 test_fillna_operations.py
echo
echo "========================================"
echo "Tests complete"
echo "========================================"