feat(dc/tool-c): Add pivot column ordering and NA handling features

Major features:
1. Pivot transformation enhancements:
   - Add option to keep unselected columns with 3 aggregation methods
   - Maintain original column order after pivot (aligned with source file)
   - Preserve pivot value order (first appearance order)

2. NA handling across 4 core functions:
   - Recode: Support keep/map/drop for NA values
   - Filter: Already supports is_null/not_null operators
   - Binning: Support keep/label/assign for NA values (fix nan display)
   - Conditional: Add is_null/not_null operators

3. UI improvements:
   - Enable column header tooltips with custom header component
   - Add closeable alert for 50-row preview
   - Fix page scrollbar issues

Modified files:
Python: pivot.py, recode.py, binning.py, conditional.py, main.py
Backend: SessionController, QuickActionController, QuickActionService
Frontend: PivotDialog, RecodeDialog, BinningDialog, ConditionalDialog, DataGrid, index

Status: Ready for testing
2025-12-09 14:40:14 +08:00
parent 75ceeb0653
commit f4f1d09837
19 changed files with 2314 additions and 123 deletions


@@ -0,0 +1,190 @@
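# Tool C - Pivot Column Ordering Optimization Summary
## 📋 Problem Description
**User requirement**: After a long-to-wide conversion, the columns should keep the same order as in the uploaded file.
**Current problem**: The system sorts the converted columns alphabetically, so the order no longer matches the source file.
---
## 🎯 Solution: Option A - Sort on the Python Side
### Core Idea
1. The Node.js backend reads the **original column order** from the session.
2. The Node.js backend extracts the **original order of the pivot column's values** from the data (order of first appearance).
3. Both are passed to Python.
4. After pivoting, Python reorders the columns to match the original order.
---
## 🛠️ Implementation Details
### 1. Python side: pivot.py
**New parameters**:
- `original_column_order: List[str]`: the original column order (e.g. `['Record ID', 'Event Name', 'FMA', '体重', '收缩压', ...]`, where 体重 is body weight and 收缩压 is systolic blood pressure)
- `pivot_value_order: List[str]`: the original order of the pivot column's values (e.g. `['基线', '1个月', '2个月', ...]`, i.e. baseline, 1 month, 2 months)
**Ordering logic**: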
```python
if original_column_order:
    # 1. The index column always comes first
    final_cols = [index_column]
    # 2. Add the pivoted columns in original column order
    for orig_col in original_column_order:
        if orig_col in value_columns:
            # Find every new column derived from this source column
            related_cols = [c for c in df_pivot.columns if c.startswith(f'{orig_col}___')]
            # ✨ Sort by the original order of the pivot values
            if pivot_value_order:
                pivot_order_map = {val: idx for idx, val in enumerate(pivot_value_order)}
                related_cols_sorted = sorted(
                    related_cols,
                    key=lambda c: pivot_order_map.get(c.split('___')[1], 999)
                )
            else:
                related_cols_sorted = sorted(related_cols)
            final_cols.extend(related_cols_sorted)
    # 3. Add the unselected columns (keeping their original order)
    if keep_unused_columns:
        for orig_col in original_column_order:
            if orig_col in df_pivot.columns and orig_col not in final_cols:
                final_cols.append(orig_col)
    # 4. Reorder the columns
    df_pivot = df_pivot[final_cols]
```
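The ordering logic above can be tried in isolation on a plain list of column names. The following is a minimal sketch, not the code from `pivot.py`: the helper name `order_pivot_columns` and its `sep` parameter are hypothetical, and it slices off the `orig_col + sep` prefix instead of calling `split('___')[1]`, on the assumption that a source column name might itself contain `___`. Unknown pivot values are ranked after all known ones rather than with a fixed `999`.

```python
from typing import List, Optional, Sequence


def order_pivot_columns(
    pivot_columns: Sequence[str],
    index_column: str,
    value_columns: Sequence[str],
    original_column_order: Sequence[str],
    pivot_value_order: Optional[Sequence[str]] = None,
    keep_unused_columns: bool = False,
    sep: str = "___",
) -> List[str]:
    """Return the pivoted column names reordered to follow the source file.

    Hypothetical helper for illustration; it mirrors the four steps above.
    """
    rank = {val: idx for idx, val in enumerate(pivot_value_order or [])}
    final_cols = [index_column]  # 1. index column first
    for orig_col in original_column_order:  # 2. follow source column order
        if orig_col not in value_columns:
            continue
        prefix = f"{orig_col}{sep}"
        related = [c for c in pivot_columns if c.startswith(prefix)]
        if rank:
            # Unknown pivot values go after all known ones; the sort is stable.
            related.sort(key=lambda c: rank.get(c[len(prefix):], len(rank)))
        else:
            related.sort()
        final_cols.extend(related)
    if keep_unused_columns:  # 3. unselected columns, in source order
        for orig_col in original_column_order:
            if orig_col in pivot_columns and orig_col not in final_cols:
                final_cols.append(orig_col)
    return final_cols


# Column names in an arbitrary order, as a pivot might return them.
pivot_columns = [
    "Record ID", "FMA___1个月", "FMA___基线",
    "收缩压___1个月", "收缩压___基线", "体重___1个月", "体重___基线",
]
ordered = order_pivot_columns(
    pivot_columns,
    index_column="Record ID",
    value_columns=["FMA", "体重", "收缩压"],
    original_column_order=["Record ID", "Event Name", "FMA", "体重", "收缩压"],
    pivot_value_order=["基线", "1个月"],
)
print(ordered)
```

With a real DataFrame, the returned list would be applied as in step 4, `df_pivot = df_pivot[ordered]`.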
### 2. Python side: main.py
**PivotRequest model**:
```python
class PivotRequest(BaseModel):
    # ... existing fields ...
    original_column_order: List[str] = []  # ✨ new
    pivot_value_order: List[str] = []  # ✨ new
```
**Calling pivot_long_to_wide**:
```python
result_df = pivot_long_to_wide(
    df,
    request.index_column,
    request.pivot_column,
    request.value_columns,
    request.aggfunc,
    request.column_mapping,
    request.keep_unused_columns,
    request.unused_agg_method,
    request.original_column_order,  # ✨ new
    request.pivot_value_order  # ✨ new
)
```
### 3. Node.js backend: QuickActionController.ts
**Get the original column order**:
```typescript
const originalColumnOrder = session.columns || [];
```
**Get the original order of the pivot column's values**:
```typescript
const pivotColumn = params.pivotColumn;
const seenPivotValues = new Set();
const pivotValueOrder: string[] = [];
for (const row of fullData) {
  const pivotValue = row[pivotColumn];
  if (pivotValue !== null && pivotValue !== undefined && !seenPivotValues.has(pivotValue)) {
    seenPivotValues.add(pivotValue);
    pivotValueOrder.push(String(pivotValue));
  }
}
```
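For readers following along on the Python side, the same first-appearance scan can be sketched as below. This is an illustration only: the function name `first_appearance_order` is hypothetical, rows are assumed to be plain dicts with missing values represented as `None`, and, unlike the TypeScript loop above, it de-duplicates on the string form so that `1` and `"1"` yield a single entry.

```python
from typing import Any, Dict, Iterable, List


def first_appearance_order(rows: Iterable[Dict[str, Any]], pivot_column: str) -> List[str]:
    """Collect distinct non-null pivot values in the order they first appear."""
    seen: Dict[str, None] = {}  # dicts preserve insertion order (Python 3.7+)
    for row in rows:
        value = row.get(pivot_column)
        if value is None:
            continue  # skip missing values, as the null/undefined check does
        seen.setdefault(str(value), None)  # de-duplicate on the string form
    return list(seen)


rows = [
    {"Record ID": 1, "Event Name": "基线"},
    {"Record ID": 1, "Event Name": "1个月"},
    {"Record ID": 2, "Event Name": "基线"},
    {"Record ID": 2, "Event Name": None},
    {"Record ID": 2, "Event Name": "2个月"},
]
pivot_value_order = first_appearance_order(rows, "Event Name")
print(pivot_value_order)  # ['基线', '1个月', '2个月']
```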
**Pass to QuickActionService**:
```typescript
executeResult = await quickActionService.executePivot(
  fullData,
  params,
  session.columnMapping,
  originalColumnOrder, // ✨ new
  pivotValueOrder // ✨ new
);
```
### 4. Node.js backend: QuickActionService.ts
**Method signature**:
```typescript
async executePivot(
  data: any[],
  params: PivotParams,
  columnMapping?: any[],
  originalColumnOrder?: string[], // ✨ new
  pivotValueOrder?: string[] // ✨ new
): Promise<OperationResult>
```
**Pass to Python**:
```typescript
const response = await axios.post(`${PYTHON_SERVICE_URL}/api/operations/pivot`, {
  // ... existing parameters ...
  original_column_order: originalColumnOrder || [], // ✨ new
  pivot_value_order: pivotValueOrder || [], // ✨ new
});
```
---
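## 📊 Before and After
### Before (alphabetical order)
```
Record ID | FMA___基线 | FMA___1个月 | 收缩压___基线 | 收缩压___1个月 | 体重___基线 | 体重___1个月
↑ ↑ ↑ ↑ ↑ ↑ ↑
Index column | Starts with F | Starts with F | Starts with S (pinyin) | Starts with S | Starts with T | Starts with T
```
### After (original order)
```
Record ID | FMA___基线 | FMA___1个月 | 体重___基线 | 体重___1个月 | 收缩压___基线 | 收缩压___1个月
↑ ↑ ↑ ↑ ↑ ↑ ↑
Index column | Source column 3 | Source column 3 | Source column 4 | Source column 4 | Source column 5 | Source column 5
```
### Order within each pivot value group (order of first appearance)
```
FMA___基线 | FMA___1个月 | FMA___2个月
↑ ↑ ↑
Appears first | Appears second | Appears third
(rather than the alphabetical order "1个月", "2个月", "基线")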
```
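The difference between alphabetical and first-appearance order for the pivot values can be reproduced with a few lines of Python. This is a small sketch using the sample values from this document; it assumes a plain `sorted()` call, which compares Unicode code points, and is not taken from the actual implementation.

```python
pivot_values = ["基线", "1个月", "2个月"]  # first-appearance order

# Plain sorted() compares Unicode code points, so ASCII digits
# sort before CJK characters.
alphabetical = sorted(pivot_values)
print(alphabetical)  # ['1个月', '2个月', '基线']

# Sorting by position in the first-appearance list restores the intended order.
rank = {value: idx for idx, value in enumerate(pivot_values)}
restored = sorted(alphabetical, key=rank.__getitem__)
print(restored)  # ['基线', '1个月', '2个月']
```

A string sort would also place `10个月` before `2个月`, which is another reason to rank by position rather than by label.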
---
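## ✅ Development Complete
### Modified Files
1. `extraction_service/operations/pivot.py`
2. `extraction_service/main.py`
3. `backend/src/modules/dc/tool-c/controllers/QuickActionController.ts`
4. `backend/src/modules/dc/tool-c/services/QuickActionService.ts`
### Benefits
- ✅ Column order matches the source file (familiar to users)
- ✅ Pivot values follow first-appearance order, which is chronological when the source rows are (基线 → 1个月 → 2个月)
- ✅ Unselected columns also keep their original order
- ✅ Column order is correct when exporting to Excel
---
**Development date**: 2025-12-09
**Status**: ✅ Complete, awaiting testing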