AIclinicalresearch/python-microservice/operations/recode.py
HaHafeng 5523ef36ea feat(admin): Complete Phase 3.5.1-3.5.4 Prompt Management System (83%)
Summary:
- Implement Prompt management infrastructure and core services
- Build admin portal frontend with light theme
- Integrate CodeMirror 6 editor for non-technical users

Phase 3.5.1: Infrastructure Setup
- Create capability_schema for Prompt storage
- Add prompt_templates and prompt_versions tables
- Add prompt:view/edit/debug/publish permissions
- Migrate RVW prompts to database (RVW_EDITORIAL, RVW_METHODOLOGY)

Phase 3.5.2: PromptService Core
- Implement gray-release preview logic (debuggers see the DRAFT version, regular users see the ACTIVE version)
- Module-level debug control (setDebugMode)
- Handlebars template rendering
- Variable extraction and validation (extractVariables, validateVariables)
- Three-level disaster recovery (database -> cache -> hardcoded fallback)
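
The three-level recovery chain listed above (database -> cache -> hardcoded fallback) could look roughly like the sketch below. It is written in Python to match this file rather than the service's TypeScript, and every name in it (`resolve_prompt`, `load_from_db`, `cache`, `HARDCODED_FALLBACKS`, the sample template text) is hypothetical, not taken from `prompt.service.ts`.

```python
from typing import Callable, Dict, Optional

# Hypothetical last-resort templates shipped with the code.
HARDCODED_FALLBACKS: Dict[str, str] = {
    "RVW_EDITORIAL": "Review the manuscript: {{manuscript}}",
}


def resolve_prompt(
    code: str,
    load_from_db: Callable[[str], Optional[str]],
    cache: Dict[str, str],
) -> str:
    """Return a prompt template, trying database, then cache, then fallback."""
    try:
        template = load_from_db(code)
        if template is not None:
            cache[code] = template  # refresh the cache on every successful read
            return template
    except Exception:
        # Assumed policy: any database failure falls through to the next level.
        pass

    if code in cache:
        return cache[code]

    if code in HARDCODED_FALLBACKS:
        return HARDCODED_FALLBACKS[code]

    raise KeyError(f"No prompt available for '{code}'")
```

Passing the loader and cache in as arguments keeps the sketch testable without a real database; the actual service may wire these differently.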

Phase 3.5.3: Management API
- 8 RESTful endpoints (/api/admin/prompts/*)
- Permission control (PROMPT_ENGINEER can edit, SUPER_ADMIN can publish)

Phase 3.5.4: Frontend Management UI
- Build admin portal architecture (AdminLayout, OrgLayout)
- Add route system (/admin/*, /org/*)
- Implement PromptListPage (filter, search, debug switch)
- Implement PromptEditor (CodeMirror 6 simplified for clinical users)
- Implement PromptEditorPage (edit, save, publish, test, version history)

Technical Details:
- Backend: 6 files, ~2044 lines (prompt.service.ts 596 lines)
- Frontend: 9 files, ~1735 lines (PromptEditorPage.tsx 399 lines)
- CodeMirror 6: Line numbers, auto-wrap, variable highlight, search, undo/redo
- Chinese-friendly: 15px font, 1.8 line-height, system fonts

Next Step: Phase 3.5.5 - Integrate RVW module with PromptService

Tested: Backend API tests passed (8/8), Frontend pending user testing
Status: Ready for Phase 3.5.5 RVW integration
2026-01-11 21:25:16 +08:00

125 lines
2.3 KiB
Python

"""
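Value mapping (recoding) operation

Maps the original values of a categorical variable to new values (e.g. 男 (male) → 1, 女 (female) → 2).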
"""
import pandas as pd
from typing import Dict, Any, Optional


def apply_recode(
    df: pd.DataFrame,
    column: str,
    mapping: Dict[Any, Any],
    create_new_column: bool = True,
    new_column_name: Optional[str] = None
) -> pd.DataFrame:
    """
    Apply a value mapping.

    Args:
        df: Input DataFrame
        column: Name of the column to recode
        mapping: Mapping dictionary, e.g. {'男': 1, '女': 2}
        create_new_column: Whether to create a new column (True) or overwrite the original column (False)
        new_column_name: Name of the new column (used when create_new_column=True)

    Returns:
        The recoded DataFrame

    Examples:
        >>> df = pd.DataFrame({'性别': ['男', '女', '男', '女']})
        >>> mapping = {'男': 1, '女': 2}
        >>> result = apply_recode(df, '性别', mapping, True, '性别_编码')
        >>> result['性别_编码'].tolist()
        [1, 2, 1, 2]
    """
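    if df.empty:
        return df

    # Validate that the column exists
    if column not in df.columns:
        raise KeyError(f"Column '{column}' does not exist")

    if not mapping:
        raise ValueError('Mapping dictionary must not be empty')

    # Determine the target column name
    if create_new_column:
        target_column = new_column_name or f'{column}_编码'
    else:
        target_column = column

    # Work on a copy so the input DataFrame is not modified
    result = df.copy()

    # Apply the mapping; values absent from `mapping` become NaN
    result[target_column] = result[column].map(mapping)

    # Summarize the result
    mapped_count = result[target_column].notna().sum()
    unmapped_count = result[target_column].isna().sum()
    total_count = len(result)

    print(f'Mapping complete: {mapped_count} values mapped successfully')
    if unmapped_count > 0:
        print(f'Warning: {unmapped_count} values had no matching mapping')

        # Collect the distinct unmapped values. Read them from the untouched
        # input `df`, because `result[column]` has already been overwritten
        # when create_new_column=False.
        unmapped_mask = result[target_column].isna()
        unmapped_values = df.loc[unmapped_mask, column].unique()
        print(f'Unmapped values: {list(unmapped_values)[:10]}')  # show at most 10

    # Mapping success rate
    success_rate = (mapped_count / total_count * 100) if total_count > 0 else 0
    print(f'Mapping success rate: {success_rate:.1f}%')

    return result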
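
Because `apply_recode` reports every NaN in the target column as unmapped, rows whose source value was already missing are counted together with values that simply have no key in `mapping`. The sketch below shows one way to separate the two cases. The helper `find_unmapped_values` and the sample value '未知' (unknown) are hypothetical and not part of the original file.

```python
import pandas as pd
from typing import Any, Dict, List


def find_unmapped_values(series: pd.Series, mapping: Dict[Any, Any]) -> List[Any]:
    """Return the distinct non-missing values of `series` that are not keys of `mapping`."""
    present = series.dropna()  # ignore values that were missing to begin with
    unmapped = present[~present.isin(list(mapping.keys()))]
    return unmapped.unique().tolist()


df = pd.DataFrame({'性别': ['男', '女', None, '未知']})
mapping = {'男': 1, '女': 2}

# Only '未知' lacks a mapping; the None row is missing data, not a mapping gap.
print(find_unmapped_values(df['性别'], mapping))

# Series.map alone cannot tell the two apart: both rows come back as NaN.
recoded = df['性别'].map(mapping)
print(recoded.isna().sum())
```

Checking the keys up front also makes it possible to fail fast, or to warn the user, before any column is written.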