feat(dc/tool-c): Add missing value imputation feature with 6 methods and MICE
Major features:
1. Missing value imputation (6 simple methods + MICE):
   - Mean/Median/Mode/Constant imputation
   - Forward fill (ffill) and Backward fill (bfill) for time series
   - MICE multivariate imputation (in progress, shape issue to fix)
2. Auto precision detection:
   - Automatically match decimal places of original data
   - Prevent false precision (e.g. 13.57 instead of 13.566716417910449)
3. Categorical variable detection:
   - Auto-detect and skip categorical columns in MICE
   - Show warnings for unsuitable columns
   - Suggest mode imputation for categorical data
4. UI improvements:
   - Rename button: "Delete Missing" to "Missing Value Handling"
   - Remove standalone "Dedup" and "MICE" buttons
   - 3-tab dialog: Delete / Fill / Advanced Fill
   - Display column statistics and recommended methods
   - Extended warning messages (8 seconds for skipped columns)
5. Bug fixes:
   - Fix sessionService.updateSessionData -> saveProcessedData
   - Fix OperationResult interface (add message and stats)
   - Fix Toolbar button labels and removal

Modified files:
Python: operations/fillna.py (new, 556 lines), main.py (3 new endpoints)
Backend: QuickActionService.ts, QuickActionController.ts, routes/index.ts
Frontend: MissingValueDialog.tsx (new, 437 lines), Toolbar.tsx, index.tsx
Tests: test_fillna_operations.py (774 lines), test scripts and docs
Docs: 5 documentation files updated

Known issues:
- MICE imputation has DataFrame shape mismatch issue (under debugging)
- Workaround: Use 6 simple imputation methods first

Status: Development complete, MICE debugging in progress
Lines added: ~2000 lines across 3 tiers
98
tests/QUICKSTART_快速开始.md
Normal file
@@ -0,0 +1,98 @@
# 🚀 Quick Start - Run the Tests in 1 Minute

## Windows Users

### Method 1: Double-click to run (easiest)
1. Double-click `run_tests.bat`
2. Wait for the tests to finish

### Method 2: Command line
```cmd
cd AIclinicalresearch\tests
run_tests.bat
```

---

## Linux/Mac Users

```bash
cd AIclinicalresearch/tests
chmod +x run_tests.sh
./run_tests.sh
```

---

## ⚠️ Prerequisites

**The Python service must be running before you start the tests!**

```bash
# Open a new terminal
cd AIclinicalresearch/extraction_service
python main.py
```

These lines indicate that the service started successfully:
```
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8001
```

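
Before running the tests, it can help to confirm that something is actually listening on port 8001. The following is a minimal sketch using only the Python standard library. `is_service_up` is a hypothetical helper for illustration, and the default host and port are assumptions based on the startup log above. It only checks that the TCP port accepts connections, not that the service's endpoints behave correctly.

```python
import socket


def is_service_up(host: str = "127.0.0.1", port: int = 8001, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_service_up():
        print("Port 8001 is accepting connections; the tests can be started.")
    else:
        print("Nothing is listening on port 8001; start the service with `python main.py` first.")
```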
---

## 📊 Expected Results

✅ **All tests pass**:
```
Total tests: 18
✅ Passed: 18
❌ Failed: 0
Pass rate: 100.0%

🎉 All tests passed!
```

⚠️ **Some tests fail**:
- Read the error messages printed in red
- Identify which specific tests failed
- Check the Python service logs

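---

## 🎯 What Is Tested

- ✅ 6 simple imputation methods (mean, median, mode, constant, forward fill, backward fill)
- ✅ MICE multiple imputation (single column, multiple columns)
- ✅ Edge cases (100% missing, 0% missing, special characters)
- ✅ Various data types (numeric, categorical, mixed)
- ✅ Performance test (1,000 rows of data)
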
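
For readers unfamiliar with the six simple imputation methods, they map naturally onto pandas operations. The following is a minimal sketch, assuming pandas is installed. `fill_missing` is a hypothetical helper written for illustration and is not taken from the project's `operations/fillna.py`. Note that forward fill cannot fill a gap at the start of a column, and backward fill cannot fill one at the end.

```python
import pandas as pd


def fill_missing(series: pd.Series, method: str, value=None) -> pd.Series:
    """Fill NaN values in a Series using one of six simple strategies."""
    if method == "mean":
        return series.fillna(series.mean())
    if method == "median":
        return series.fillna(series.median())
    if method == "mode":
        modes = series.mode(dropna=True)
        # An all-missing column has no mode; leave it unchanged in that case.
        return series if modes.empty else series.fillna(modes.iloc[0])
    if method == "constant":
        if value is None:
            raise ValueError("constant imputation requires a fill value")
        return series.fillna(value)
    if method == "ffill":
        # Propagate the last observed value forward.
        return series.ffill()
    if method == "bfill":
        # Propagate the next observed value backward.
        return series.bfill()
    raise ValueError(f"unknown method: {method}")


if __name__ == "__main__":
    s = pd.Series([1.0, None, 3.0, 3.0, None])
    for name in ("mean", "median", "mode", "ffill", "bfill"):
        print(name, fill_missing(s, name).tolist())
    print("constant", fill_missing(s, "constant", value=0).tolist())
```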
---

## 💡 Tips

- **The first run** installs the dependencies automatically (pandas, numpy, requests)
- **Test duration** is about 45-60 seconds
- **Test data** is generated automatically; no manual preparation is needed
- **Colored output**: green = passed, red = failed, yellow = warning

---

## 🆘 Running into Problems?

### Problem 1: Cannot connect to the service
**Fix**: Make sure the Python service is running (`python main.py`)

### Problem 2: Dependency installation fails
**Fix**: Install the dependencies manually with `pip install pandas numpy requests`

### Problem 3: Tests fail
**Fix**: Read the error messages and check the code logic

---

**Ready? Start the service and run the tests!** 🚀