AIclinicalresearch/extraction_service/test_execute_simple.py
HaHafeng 74cf346453 feat(dc/tool-c): Add missing value imputation feature with 6 methods and MICE
Major features:
1. Missing value imputation (6 simple methods + MICE):
   - Mean/Median/Mode/Constant imputation
   - Forward fill (ffill) and Backward fill (bfill) for time series
   - MICE multivariate imputation (in progress, shape issue to fix)

2. Auto precision detection:
   - Automatically match decimal places of original data
   - Prevent false precision (e.g. 13.57 instead of 13.566716417910449)

3. Categorical variable detection:
   - Auto-detect and skip categorical columns in MICE
   - Show warnings for unsuitable columns
   - Suggest mode imputation for categorical data

4. UI improvements:
   - Rename button: "Delete Missing" to "Missing Value Handling"
   - Remove standalone "Dedup" and "MICE" buttons
   - 3-tab dialog: Delete / Fill / Advanced Fill
   - Display column statistics and recommended methods
   - Extended warning messages (8 seconds for skipped columns)

5. Bug fixes:
   - Fix sessionService.updateSessionData -> saveProcessedData
   - Fix OperationResult interface (add message and stats)
   - Fix Toolbar button labels and removal
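
The auto precision detection in item 2 can be illustrated with a small standard-library sketch. It is not the code in operations/fillna.py. The function names are hypothetical, and the rule used here (round the fill value to the largest number of decimal places seen among the observed values) is an assumption about how "match decimal places of original data" might work.

```python
from decimal import Decimal
from statistics import mean


def decimal_places(value):
    """Count the digits after the decimal point in the shortest repr of value."""
    exponent = Decimal(str(value)).normalize().as_tuple().exponent
    return max(0, -exponent)


def impute_mean_matching_precision(values):
    """Replace None entries with the mean, rounded to the data's own precision.

    Assumption: the target precision is the largest number of decimal
    places found among the observed (non-missing) values.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        # Nothing to compute a mean from; return the column unchanged.
        return list(values)
    places = max(decimal_places(v) for v in observed)
    fill = round(mean(observed), places)
    return [fill if v is None else v for v in values]


if __name__ == "__main__":
    column = [13.5, 13.62, None, 13.58]
    # The raw mean is 13.5666...; the fill is rounded to two decimal places.
    print(impute_mean_matching_precision(column))
```

Rounding to the column's own precision keeps the filled value from looking more exact than the measurements around it, which is the false-precision problem the commit message describes.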

Modified files:
Python: operations/fillna.py (new, 556 lines), main.py (3 new endpoints)
Backend: QuickActionService.ts, QuickActionController.ts, routes/index.ts
Frontend: MissingValueDialog.tsx (new, 437 lines), Toolbar.tsx, index.tsx
Tests: test_fillna_operations.py (774 lines), test scripts and docs
Docs: 5 documentation files updated

Known issues:
- MICE imputation has DataFrame shape mismatch issue (under debugging)
- Workaround: Use 6 simple imputation methods first

Status: Development complete, MICE debugging in progress
Lines added: ~2000 across 3 tiers
2025-12-10 13:06:00 +08:00

54 lines
1.3 KiB
Python

"""Simple code execution test."""
import json

import requests

# Test data
test_data = [
    {"patient_id": "P001", "age": 25, "gender": ""},
    {"patient_id": "P002", "age": 65, "gender": ""},
    {"patient_id": "P003", "age": 45, "gender": ""},
]

# Test code; it refers to the submitted data as `df`
test_code = """
df['age_group'] = df['age'].apply(lambda x: 'elderly' if x > 60 else 'non-elderly')
print(f"Processing complete, {len(df)} rows in total")
"""

print("=" * 60)
print("Test: Pandas code execution")
print("=" * 60)

try:
    response = requests.post(
        "http://localhost:8000/api/dc/execute",
        json={"data": test_data, "code": test_code},
        timeout=10,
    )
    print(f"\nStatus code: {response.status_code}")
    result = response.json()
    print(json.dumps(result, indent=2, ensure_ascii=False))

    if result.get("success"):
        print("\n✅ Code executed successfully!")
        print(f"Result rows: {len(result.get('result_data', []))}")
        print(f"Execution time: {result.get('execution_time', 0):.3f}")
        print(f"\nPrinted output:\n{result.get('output', '')}")
        print("\nSample result rows:")
        for row in result.get("result_data", [])[:3]:
            print(f"  {row}")
    else:
        print(f"\n❌ Code execution failed: {result.get('error')}")
except Exception as e:
    # Covers connection errors, timeouts, and non-JSON responses
    print(f"\n❌ Test error: {e}")
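
The script above only prints the response, so a failed run still exits normally. A minimal sketch of a check that could turn it into a pass/fail test follows. The helper name check_execute_response is hypothetical, and the sample payload is hand-written for illustration, not captured from the service. It relies only on the response keys the script already reads (success, result_data, error) and on the age_group column that the test code creates.

```python
def check_execute_response(result, expected_rows):
    """Return a list of problems found in a decoded execute response.

    An empty list means the response looks as the test script expects.
    """
    problems = []
    if not result.get("success"):
        problems.append(f"execution failed: {result.get('error')}")
        return problems

    rows = result.get("result_data", [])
    if len(rows) != expected_rows:
        problems.append(f"expected {expected_rows} rows, got {len(rows)}")

    # The submitted code adds an 'age_group' column to every row.
    for index, row in enumerate(rows):
        if "age_group" not in row:
            problems.append(f"row {index} is missing 'age_group'")
    return problems


if __name__ == "__main__":
    # Hand-written sample payload (an assumption about the response shape).
    sample = {
        "success": True,
        "result_data": [
            {"patient_id": "P001", "age": 25, "age_group": "non-elderly"},
            {"patient_id": "P002", "age": 65, "age_group": "elderly"},
            {"patient_id": "P003", "age": 45, "age_group": "non-elderly"},
        ],
    }
    print(check_execute_response(sample, expected_rows=3))
```

In the script, the result could be passed to this helper right after response.json(), with a non-empty list treated as a failure.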