Files
AIclinicalresearch/extraction_service/test_module.py
HaHafeng 74cf346453 feat(dc/tool-c): Add missing value imputation feature with 6 methods and MICE
Major features:
1. Missing value imputation (6 simple methods + MICE):
   - Mean/Median/Mode/Constant imputation
   - Forward fill (ffill) and Backward fill (bfill) for time series
   - MICE multivariate imputation (in progress, shape issue to fix)

2. Auto precision detection:
   - Automatically match decimal places of original data
   - Prevent false precision (e.g. 13.57 instead of 13.566716417910449)

3. Categorical variable detection:
   - Auto-detect and skip categorical columns in MICE
   - Show warnings for unsuitable columns
   - Suggest mode imputation for categorical data

4. UI improvements:
   - Rename button: "Delete Missing" to "Missing Value Handling"
   - Remove standalone "Dedup" and "MICE" buttons
   - 3-tab dialog: Delete / Fill / Advanced Fill
   - Display column statistics and recommended methods
   - Extended warning messages (8 seconds for skipped columns)

5. Bug fixes:
   - Fix sessionService.updateSessionData -> saveProcessedData
   - Fix OperationResult interface (add message and stats)
   - Fix Toolbar button labels and removal

Modified files:
Python: operations/fillna.py (new, 556 lines), main.py (3 new endpoints)
Backend: QuickActionService.ts, QuickActionController.ts, routes/index.ts
Frontend: MissingValueDialog.tsx (new, 437 lines), Toolbar.tsx, index.tsx
Tests: test_fillna_operations.py (774 lines), test scripts and docs
Docs: 5 documentation files updated

Known issues:
- MICE imputation has DataFrame shape mismatch issue (under debugging)
- Workaround: Use 6 simple imputation methods first

Status: Development complete, MICE debugging in progress
Lines added: ~2000 lines across 3 tiers
2025-12-10 13:06:00 +08:00

34 lines
858 B
Python

"""测试dc_executor模块"""
print("测试dc_executor模块导入...")
try:
from services.dc_executor import validate_code, execute_pandas_code
print("✅ 模块导入成功")
# 测试验证功能
print("\n测试validate_code...")
result = validate_code("df['x'] = 1")
print(f"✅ validate_code成功: {result}")
# 测试执行功能
print("\n测试execute_pandas_code...")
test_data = [{"age": 25}, {"age": 65}]
result = execute_pandas_code(test_data, "df['old'] = df['age'] > 60")
print(f"✅ execute_pandas_code成功: success={result['success']}")
if result['success']:
print(f" 结果: {result['result_data']}")
print("\n🎉 所有模块测试通过!")
except Exception as e:
print(f"❌ 测试失败: {e}")
import traceback
traceback.print_exc()