# Tool C - Pivot Column Order Optimization Summary
## 📋 Problem Description
**User requirement**: After the long-to-wide transformation, the columns should keep the same order as in the originally uploaded file.
**Current problem**: The system sorts the transformed columns alphabetically, so the order no longer matches the original file.
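The mismatch can be reproduced with plain pandas: `pivot_table` emits the new columns in sorted label order, not in the order the values appear in the data. A minimal sketch with hypothetical sample data:

```python
import pandas as pd

# Long-format sample: 基线 ("baseline") appears before 1个月 ("1 month") in the file
df = pd.DataFrame({
    'Record ID': [1, 1, 2, 2],
    'Event Name': ['基线', '1个月', '基线', '1个月'],
    'FMA': [10, 12, 20, 22],
})

wide = df.pivot_table(index='Record ID', columns='Event Name',
                      values='FMA', aggfunc='first')
# pandas sorts the column labels, so '1个月' jumps ahead of '基线'
print(list(wide.columns))  # → ['1个月', '基线']
```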
---
## 🎯 Solution: Option A - Sort on the Python Side
### Core Idea
1. The Node.js backend reads the **original column order** from the session
2. The Node.js backend extracts the **original order of the pivot column's values** from the data (by first appearance)
3. Both are passed to Python
4. After pivoting, Python reorders the columns to match the original order
---
## 🛠️ Implementation Details
### 1. Python side: pivot.py
**New parameters**
- `original_column_order: List[str]`: the original column order (e.g. `['Record ID', 'Event Name', 'FMA', '体重', '收缩压', ...]`)
- `pivot_value_order: List[str]`: the original order of the pivot column's values (e.g. `['基线', '1个月', '2个月', ...]`)
**Sorting logic**
```python
if original_column_order:
    # 1. The index column always comes first
    final_cols = [index_column]
    # 2. Add the transformed columns following the original column order
    for orig_col in original_column_order:
        if orig_col in value_columns:
            # Collect all new columns derived from this original column
            related_cols = [c for c in df_pivot.columns if c.startswith(f'{orig_col}___')]
            # ✨ Sort by the pivot values' original order of appearance
            if pivot_value_order:
                pivot_order_map = {val: idx for idx, val in enumerate(pivot_value_order)}
                related_cols_sorted = sorted(
                    related_cols,
                    key=lambda c: pivot_order_map.get(c.split('___')[1], 999)
                )
            else:
                related_cols_sorted = sorted(related_cols)
            final_cols.extend(related_cols_sorted)
    # 3. Append unselected columns (preserving original order)
    if keep_unused_columns:
        for orig_col in original_column_order:
            if orig_col in df_pivot.columns and orig_col not in final_cols:
                final_cols.append(orig_col)
    # 4. Reorder the columns
    df_pivot = df_pivot[final_cols]
```
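The same reordering idea can be run standalone on a toy wide table (hypothetical column names, detached from the actual `pivot_long_to_wide` body):

```python
import pandas as pd

# Wide table as pandas would emit it: column groups in alphabetical order
df_pivot = pd.DataFrame(columns=[
    'Record ID',
    'FMA___1个月', 'FMA___基线',
    '体重___1个月', '体重___基线',
    '收缩压___1个月', '收缩压___基线',
])

index_column = 'Record ID'
value_columns = ['FMA', '体重', '收缩压']
original_column_order = ['Record ID', 'Event Name', 'FMA', '体重', '收缩压']
pivot_value_order = ['基线', '1个月']

final_cols = [index_column]
pivot_order_map = {val: idx for idx, val in enumerate(pivot_value_order)}
for orig_col in original_column_order:
    if orig_col in value_columns:
        related = [c for c in df_pivot.columns if c.startswith(f'{orig_col}___')]
        # Unknown pivot values fall back to 999, i.e. they sort last
        final_cols.extend(sorted(related, key=lambda c: pivot_order_map.get(c.split('___')[1], 999)))

df_pivot = df_pivot[final_cols]
print(list(df_pivot.columns))
# → ['Record ID', 'FMA___基线', 'FMA___1个月', '体重___基线', '体重___1个月', '收缩压___基线', '收缩压___1个月']
```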
### 2. Python side: main.py
**PivotRequest model**
```python
class PivotRequest(BaseModel):
    # ... existing fields ...
    original_column_order: List[str] = []  # ✨ new
    pivot_value_order: List[str] = []  # ✨ new
```
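Because both new fields default to `[]`, older callers that omit them keep working. A quick check against a trimmed-down stand-in model (only fields discussed in this doc; the elided existing fields live in the real `PivotRequest`):

```python
from typing import List
from pydantic import BaseModel

class PivotRequestSketch(BaseModel):
    index_column: str
    pivot_column: str
    value_columns: List[str]
    original_column_order: List[str] = []  # ✨ new, optional
    pivot_value_order: List[str] = []  # ✨ new, optional

# An old-style payload without the new fields still validates
old_payload = {
    'index_column': 'Record ID',
    'pivot_column': 'Event Name',
    'value_columns': ['FMA'],
}
req = PivotRequestSketch(**old_payload)
print(req.original_column_order, req.pivot_value_order)  # → [] []
```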
**Calling pivot_long_to_wide**
```python
result_df = pivot_long_to_wide(
    df,
    request.index_column,
    request.pivot_column,
    request.value_columns,
    request.aggfunc,
    request.column_mapping,
    request.keep_unused_columns,
    request.unused_agg_method,
    request.original_column_order,  # ✨ new
    request.pivot_value_order  # ✨ new
)
```
### 3. Node.js backend: QuickActionController.ts
**Reading the original column order**
```typescript
const originalColumnOrder = session.columns || [];
```
**Extracting the original order of the pivot column's values**
```typescript
const pivotColumn = params.pivotColumn;
const seenPivotValues = new Set();
const pivotValueOrder: string[] = [];
for (const row of fullData) {
  const pivotValue = row[pivotColumn];
  if (pivotValue !== null && pivotValue !== undefined && !seenPivotValues.has(pivotValue)) {
    seenPivotValues.add(pivotValue);
    pivotValueOrder.push(String(pivotValue));
  }
}
```
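The same first-appearance scan, expressed in Python for a quick standalone check (the function name is illustrative, not from the codebase):

```python
def pivot_value_order_from(rows, pivot_column):
    """Collect pivot column values in order of first appearance, skipping nulls."""
    seen = set()
    order = []
    for row in rows:
        val = row.get(pivot_column)
        if val is not None and val not in seen:
            seen.add(val)
            order.append(str(val))
    return order

rows = [
    {'Record ID': 1, 'Event Name': '基线'},
    {'Record ID': 1, 'Event Name': '1个月'},
    {'Record ID': 2, 'Event Name': '基线'},
    {'Record ID': 2, 'Event Name': '2个月'},
]
print(pivot_value_order_from(rows, 'Event Name'))
# → ['基线', '1个月', '2个月']
```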
**Passing to QuickActionService**
```typescript
executeResult = await quickActionService.executePivot(
  fullData,
  params,
  session.columnMapping,
  originalColumnOrder,  // ✨ new
  pivotValueOrder  // ✨ new
);
```
### 4. Node.js backend: QuickActionService.ts
**Method signature**
```typescript
async executePivot(
  data: any[],
  params: PivotParams,
  columnMapping?: any[],
  originalColumnOrder?: string[],  // ✨ new
  pivotValueOrder?: string[]  // ✨ new
): Promise<OperationResult>
```
**Forwarding to Python**
```typescript
const response = await axios.post(`${PYTHON_SERVICE_URL}/api/operations/pivot`, {
  // ... existing parameters ...
  original_column_order: originalColumnOrder || [],  // ✨ new
  pivot_value_order: pivotValueOrder || [],  // ✨ new
});
```
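The body of that POST bridges the Node.js camelCase world and the Python snake_case fields, sketched here in Python (only `pivotColumn` appears in this doc; the other camelCase param names are assumptions):

```python
def build_pivot_body(params, original_column_order=None, pivot_value_order=None):
    """Sketch of the JSON body sent to POST /api/operations/pivot."""
    return {
        'index_column': params['indexColumn'],
        'pivot_column': params['pivotColumn'],
        'value_columns': params['valueColumns'],
        # The `or []` mirrors the `|| []` fallback on the Node.js side
        'original_column_order': original_column_order or [],
        'pivot_value_order': pivot_value_order or [],
    }

body = build_pivot_body({
    'indexColumn': 'Record ID',
    'pivotColumn': 'Event Name',
    'valueColumns': ['FMA'],
})
print(body['original_column_order'])  # → []
```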
---
## 📊 Effect Comparison
### Before (alphabetical order)
```
Record ID | FMA___基线 | FMA___1个月 | 收缩压___基线 | 收缩压___1个月 | 体重___基线 | 体重___1个月
```
Index column first, then groups sorted alphabetically by source column: F (FMA), then S (收缩压, by pinyin), then T (体重, by pinyin).
### After (original file order)
```
Record ID | FMA___基线 | FMA___1个月 | 体重___基线 | 体重___1个月 | 收缩压___基线 | 收缩压___1个月
```
Index column first, then groups following the source columns' positions in the original file: FMA (column 3), 体重 (column 4), 收缩压 (column 5).
### Internal order of pivot values (by first appearance)
```
FMA___基线 | FMA___1个月 | FMA___2个月
```
基线 appears first in the data, 1个月 second, 2个月 third, rather than the alphabetical order "1个月", "2个月", "基线".
---
## ✅ Development Complete
### Modified Files
1. `extraction_service/operations/pivot.py`
2. `extraction_service/main.py`
3. `backend/src/modules/dc/tool-c/controllers/QuickActionController.ts`
4. `backend/src/modules/dc/tool-c/services/QuickActionService.ts`
### Benefits
- ✅ Column order matches the original file (familiar to users)
- ✅ Pivot values follow chronological order: 基线 → 1个月 → 2个月
- ✅ Unselected columns also keep their original order
- ✅ Exported Excel files preserve the correct order
---
**Development date**: 2025-12-09
**Status**: ✅ Completed, pending testing