feat(aia): Complete AIA V2.0 with universal streaming capabilities

Major Changes:
- Add StreamingService with OpenAI Compatible format
- Upgrade Chat component V2 with Ant Design X integration
- Implement AIA module with 12 intelligent agents
- Update API routes to unified /api/v1 prefix
- Update system documentation

Backend (~1300 lines):
- common/streaming: OpenAI Compatible adapter
- modules/aia: 12 agents, conversation service, streaming integration
- Update route versions (RVW, PKB to v1)

Frontend (~3500 lines):
- modules/aia: AgentHub + ChatWorkspace (100% prototype restoration)
- shared/Chat: AIStreamChat, ThinkingBlock, useAIStream Hook
- Update API endpoints to v1

Documentation:
- AIA module status guide
- Universal capabilities catalog
- System overview updates
- All module documentation sync

Tested: Stream response verified, authentication working
Status: AIA V2.0 core completed (85%)
@@ -1,48 +1,48 @@
# Tool C - AI Copilot Few-shot Example Library

> **Document version**: V1.0
> **Created**: 2025-12-06
> **Purpose**: Few-shot examples for the System Prompt
> **Scenarios covered**: 10 core scenarios, from basic cleaning to advanced imputation

---

## Example Overview

| No. | Scenario | Level | Key technique | Medical value |
|------|---------|------|---------|---------|
| 1 | Unify missing-value markers | Level 1 | replace | Data standardization ⭐⭐⭐ |
| 2 | Numeric column cleaning | Level 1 | regex + type conversion | Lab value processing ⭐⭐⭐⭐ |
| 3 | Categorical variable encoding | Level 2 | map | Statistical modeling ⭐⭐⭐⭐⭐ |
| 4 | Continuous variable binning | Level 2 | cut | Stratified analysis ⭐⭐⭐⭐ |
| 5 | BMI calculation and classification | Level 3 | formula + condition | Clinical indicator ⭐⭐⭐⭐⭐ |
| 6 | Date calculation | Level 3 | datetime | Time interval ⭐⭐⭐⭐⭐ |
| 7 | Conditional filtering | Level 3 | multi-condition filter | Inclusion criteria ⭐⭐⭐⭐⭐ |
| 8 | Simple missing-value imputation | Level 4 | fillna | Missing-data handling ⭐⭐⭐⭐ |
| 9 | Multiple imputation (MICE) | Level 4 | IterativeImputer | Advanced imputation ⭐⭐⭐⭐⭐ |
| 10 | Smart deduplication | Level 4 | sort+drop_duplicates | Data quality ⭐⭐⭐⭐ |

---

## 🎯 Level 1: Basic Data Cleaning (2 examples)

### Example 1: Unify Missing-Value Markers

**User instruction**:
```
Replace every symbol that stands for a missing value (-, 不详, NA, N/A) with the standard null value
```

**AI-generated code**:
```python
# Unify missing-value markers ('不详' = unknown, '未查' = not examined)
df = df.replace(['-', '不详', 'NA', 'N/A', '\\', '未查'], np.nan)
```

**Code explanation**:
- Medical data commonly represents missing values in several different ways
- Unifying them as the standard NaN simplifies later statistical analysis
- Use case: the first step of data cleaning
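
To make the effect of the replacement concrete, here is a minimal sketch on a small made-up table. The column names `smoking` and `creatinine` and the sample values are assumptions for illustration only; the marker list is the one used in the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: each column mixes real values with ad-hoc missing markers.
df = pd.DataFrame({
    "smoking": ["是", "-", "否", "不详"],        # 是 = yes, 否 = no, 不详 = unknown
    "creatinine": ["78", "NA", "未查", "102"],   # 未查 = not examined
})

missing_markers = ["-", "不详", "NA", "N/A", "\\", "未查"]
cleaned = df.replace(missing_markers, np.nan)

# Count missing values per column after unification.
missing_counts = cleaned.isna().sum()
print(missing_counts)
```

When the data is loaded with `pd.read_csv`, the same markers can instead be passed through its `na_values` parameter, which may be simpler if no later step needs the raw strings.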

**Expected result**:
- All of the different missing-value markers are unified as NaN
@@ -54,30 +54,30 @@ df = df.replace(['-', '不详', 'NA', 'N/A', '\\', '未查'], np.nan)

**User instruction**:
```
Strip the non-numeric symbols from the creatinine column, treat <0.1 as 0.05, and convert to a numeric type
```

**AI-generated code**:
```python
# Handle symbols in the numeric column
df['creatinine'] = df['creatinine'].astype(str).str.strip()
# Map the censored value '<0.1' before stripping symbols, so a genuine 0.1 is left alone
df.loc[df['creatinine'] == '<0.1', 'creatinine'] = '0.05'
df['creatinine'] = df['creatinine'].str.replace('>', '').str.replace('<', '')
df['creatinine'] = pd.to_numeric(df['creatinine'], errors='coerce')
```

**Code explanation**:
- Data exported from the clinical lab often contains symbols (>, <, +, etc.)
- It must be cleaned before numeric computation and statistical analysis
- errors='coerce' sets values that cannot be converted to NaN

**Expected result**:
- The creatinine column becomes a purely numeric type
- Symbols are handled correctly
- Anomalous values are converted to NaN

**Medical scenario**:
- Lab reports: creatinine and other test items
- Common symbols: >100, <0.1, 12.5+
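
The symbols listed above (`>100`, `<0.1`, `12.5+`) can also be handled in one reusable step. The sketch below is one possible way to do it, not the document's own method: the helper name `clean_lab_values` and its `censored` parameter are hypothetical, and the rule that `<0.1` maps to `0.05` is taken from the user instruction.

```python
import pandas as pd

def clean_lab_values(raw: pd.Series, censored: dict) -> pd.Series:
    """Convert lab readings such as '>100' or '12.5+' to floats."""
    text = raw.astype(str).str.strip()
    # Apply explicit substitutions for censored readings first (e.g. '<0.1' -> '0.05').
    text = text.replace(censored)
    # Then drop the comparison and trailing-plus symbols that remain.
    text = text.str.replace(r"[<>+]", "", regex=True)
    # Anything still non-numeric becomes NaN.
    return pd.to_numeric(text, errors="coerce")

raw = pd.Series(["85", ">100", "<0.1", "12.5+", "0.1", "未查"])  # 未查 = not examined
cleaned = clean_lab_values(raw, censored={"<0.1": "0.05"})
print(cleaned.tolist())
```

Doing the censored-value substitution before the symbols are stripped keeps a true reading of 0.1 distinct from a below-detection-limit reading.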

---

@@ -87,28 +87,28 @@ df['creatinine'] = pd.to_numeric(df['creatinine'], errors='coerce')

**User instruction**:
```
Convert the gender column to numbers: male = 1, female = 0
```

**AI-generated code**:
```python
# Gender encoding ('男' = male, '女' = female)
df['gender_code'] = df['gender'].map({'男': 1, '女': 0})
```

**Code explanation**:
- Converts a text categorical variable into numeric values
- Facilitates statistical modeling such as logistic regression
- The map method is concise and efficient

**Expected result**:
- A new gender_code column is added
- Male → 1, female → 0
- The original gender column is left unchanged

**Extended scenarios**:
- Marital status: married = 1, unmarried = 0
- Smoking history: yes = 1, no = 0
- Education level: primary school = 1, junior high = 2, senior high = 3, university = 4
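
One behaviour of `map` worth seeing in practice: any value missing from the dictionary silently becomes NaN. The sketch below, on made-up data, shows how such values could be surfaced; the `未知` (unknown) entry is an assumption added for illustration.

```python
import pandas as pd

# 男 = male, 女 = female, 未知 = unknown
df = pd.DataFrame({"gender": ["男", "女", "女", "未知", None]})

gender_codes = {"男": 1, "女": 0}
df["gender_code"] = df["gender"].map(gender_codes)

# map() returns NaN for any value missing from the dictionary, so list those values explicitly.
unmapped = df.loc[df["gender_code"].isna() & df["gender"].notna(), "gender"].unique().tolist()
print(unmapped)
```

Checking `unmapped` before modeling helps separate values that were genuinely missing from values the mapping simply did not anticipate.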

---
@@ -117,7 +117,7 @@ df['gender_code'] = df['gender'].map({'男': 1, '女': 0})

**User instruction**:
```
Split age at 18 and 60 into three groups: minor, adult, elderly
```

**AI-generated code**:
@@ -125,35 +125,35 @@ df['gender_code'] = df['gender'].map({'男': 1, '女': 0})
# Age grouping
df['age_group'] = pd.cut(df['age'],
                         bins=[0, 18, 60, 120],
                         labels=['未成年', '成年', '老年'],  # minor, adult, elderly
                         right=False)
```

**Code explanation**:
- Discretizes a continuous variable into an ordered categorical variable
- bins defines the interval boundaries; labels defines the labels
- right=False means left-closed, right-open intervals, e.g. [0, 18)

**Expected result**:
- A new age_group column is added
- Ages 0-17 → minor
- Ages 18-59 → adult
- Ages 60+ → elderly

**Medical applications**:
- Stratified analysis (incidence by age group)
- Chi-square test (age group vs. disease)
- Descriptive statistics (baseline characteristics table)
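
The boundary behaviour of `right=False` is easiest to check on ages that sit exactly on a cut point. This is a small sketch with made-up ages; English labels are used here only for readability and are not the labels of the original example.

```python
import pandas as pd

df = pd.DataFrame({"age": [5, 17, 18, 45, 59, 60, 83]})

df["age_group"] = pd.cut(
    df["age"],
    bins=[0, 18, 60, 120],
    labels=["minor", "adult", "elderly"],  # English labels used here for readability
    right=False,  # left-closed, right-open: [0, 18), [18, 60), [60, 120)
)

# Boundary ages 18 and 60 fall into the upper group because intervals are left-closed.
groups = df["age_group"].astype(str).tolist()
print(groups)
```

Note that with an upper bin edge of 120 and `right=False`, an age of 120 or above would fall outside every interval and come back as NaN.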

---

## Level 3: Medical Calculation and Filtering (3 examples)

### Example 5: BMI Calculation and Classification

**User instruction**:
```
Compute BMI from height (cm) and weight (kg), and flag BMI ≥ 28 as obese
```

**AI-generated code**:
@@ -167,23 +167,23 @@ df['obesity'] = df['BMI'].apply(lambda x: '肥胖' if x >= 28 else '正常')

**Code explanation**:
- BMI formula: weight (kg) / height (m)²
- Chinese standard: BMI ≥ 28 is obese
- Vectorized computation; avoids loops

**Expected result**:
- A new BMI column is added (numeric)
- A new obesity column is added (categorical)

**Clinical criteria**:
- Underweight: BMI < 18.5
- Normal: 18.5 ≤ BMI < 24
- Overweight: 24 ≤ BMI < 28
- Obese: BMI ≥ 28
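
The four thresholds above map naturally onto `pd.cut`. The following is a sketch under assumptions: the column names `height_cm` and `weight_kg` and the sample values are invented for illustration, and the four-way `bmi_category` column goes beyond the two-way obesity flag in the original example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "height_cm": [160.0, 175.0, 170.0, 165.0],
    "weight_kg": [45.0, 70.0, 75.0, 80.0],
})

# BMI = weight (kg) / height (m)^2, computed column-wise without a Python loop.
df["BMI"] = df["weight_kg"] / (df["height_cm"] / 100) ** 2

df["bmi_category"] = pd.cut(
    df["BMI"],
    bins=[0, 18.5, 24, 28, np.inf],
    labels=["underweight", "normal", "overweight", "obese"],
    right=False,  # left-closed intervals, matching "18.5 ≤ BMI < 24"
)
categories = df["bmi_category"].astype(str).tolist()
print(categories)
```

Rows with a missing height or weight produce a NaN BMI and therefore a NaN category, which keeps them from being silently labelled as normal.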

**Extended scenarios**:
- Body surface area (BSA): dose calculation
- eGFR: kidney function assessment
- APACHE score: critical illness assessment

---

@@ -191,7 +191,7 @@ df['obesity'] = df['BMI'].apply(lambda x: '肥胖' if x >= 28 else '正常')

**User instruction**:
```
Compute the length of hospital stay from the admission date and the discharge date
```

**AI-generated code**:
@@ -208,13 +208,13 @@ df['length_of_stay'] = (df['discharge_date'] - df['admission_date']).dt.days
- .dt.days extracts the number of days

**Expected result**:
- A new length_of_stay column is added
- Numeric type, in days

**Medical applications**:
- Length-of-stay analysis
- Average length of stay (ALOS) statistics
- Correlation analysis with hospitalization cost
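
As a worked illustration of the length-of-stay and ALOS items above, here is a minimal sketch on made-up dates. The column names follow the example's code; the unparseable value `"not recorded"` and the use of `errors="coerce"` are assumptions added to show how bad dates could be tolerated.

```python
import pandas as pd

df = pd.DataFrame({
    "admission_date": ["2025-01-03", "2025-02-10", "not recorded"],
    "discharge_date": ["2025-01-10", "2025-02-12", "2025-03-01"],
})

# errors="coerce" turns unparseable dates into NaT instead of raising.
for col in ["admission_date", "discharge_date"]:
    df[col] = pd.to_datetime(df[col], errors="coerce")

# Subtracting two datetime columns yields timedeltas; .dt.days keeps whole days.
df["length_of_stay"] = (df["discharge_date"] - df["admission_date"]).dt.days

alos = df["length_of_stay"].mean()  # rows with a missing stay are skipped by default
print(alos)
```

A negative `length_of_stay` would indicate swapped or mistyped dates, so a quick `(df["length_of_stay"] < 0).sum()` check is a reasonable follow-up.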

**Extended scenarios**:
- Computing age: (today - date of birth) / 365.25
@@ -223,34 +223,34 @@ df['length_of_stay'] = (df['discharge_date'] - df['admission_date']).dt.days

---

### Example 7: Conditional Filtering (Inclusion Criteria)

**User instruction**:
```
Select patients aged ≥ 18 with a diagnosis of diabetes and blood glucose ≥ 7.0
```

**AI-generated code**:
```python
# Multi-condition filtering ('糖尿病' = diabetes)
df_selected = df[
    (df['age'] >= 18) &
    (df['diagnosis'] == '糖尿病') &
    (df['glucose'] >= 7.0)
]
```

**Code explanation**:
- Boolean indexing; multiple conditions are joined with &
- Each condition must be wrapped in parentheses
- Returns the rows that satisfy all conditions
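
When screening against inclusion criteria, it is often useful to know how many patients fail each criterion, not only who passes all of them. The sketch below is one way to do that on made-up data; the `criteria` dictionary and the `failed_counts` report are illustrative additions, not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    "age": [16, 45, 62, 70],
    "diagnosis": ["糖尿病", "糖尿病", "高血压", "糖尿病"],  # 糖尿病 = diabetes, 高血压 = hypertension
    "glucose": [8.1, 6.5, 9.0, 7.4],
})

# Name each criterion so the number of patients failing it can be reported separately.
criteria = {
    "age >= 18": df["age"] >= 18,
    "diabetes diagnosis": df["diagnosis"] == "糖尿病",
    "glucose >= 7.0": df["glucose"] >= 7.0,
}

# Parentheses matter in the inline form because & binds tighter than >= and ==.
included_mask = pd.concat(criteria, axis=1).all(axis=1)
df_selected = df[included_mask]
failed_counts = {name: int((~mask).sum()) for name, mask in criteria.items()}
print(failed_counts)
```

Comparisons against NaN evaluate to False, so patients with a missing age or glucose value are excluded by this mask; whether that is the intended handling is a study-design decision.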

**Expected result**:
- A new DataFrame containing only the patients who meet the criteria
- The original df is unchanged

**Clinical research applications**:
- Screening by inclusion criteria
- Removal by exclusion criteria
- Subgroup analysis (e.g., elderly diabetic patients)

@@ -260,7 +260,7 @@ df_selected = df[
included = df[
    (df['age'] >= 18) &
    (df['age'] <= 75) &
    (df['diagnosis'].isin(['肺腺癌', '肺鳞癌'])) &  # lung adenocarcinoma, lung squamous cell carcinoma
    (df['stage'].isin(['III', 'IV']))
]

@@ -274,43 +274,43 @@ excluded = df[

---

## 🔬 Level 4: Advanced Data Governance (3 examples)

### Example 8: Simple Missing-Value Imputation

**User instruction**:
```
Fill the missing values in the BMI column with the median
```

**AI-generated code**:
```python
# Fill missing values with the median
bmi_median = df['BMI'].median()
df['BMI'] = df['BMI'].fillna(bmi_median)
```

**Code explanation**:
- median() computes the median (NaN excluded)
- fillna() fills the missing values
- The median is less affected by outliers

**Expected result**:
- The BMI column has no missing values
- Missing positions are replaced by the median

**Choosing an imputation method**:
| Method | Use case | Advantage | Drawback |
|------|---------|------|------|
| Mean | Normal distribution | Simple | Affected by outliers |
| Median | Skewed distribution | Robust | Information loss |
| Mode | Categorical variable | Preserves distribution | May be implausible |
| Forward fill | Time series | Preserves trend | Time series only |

**Notes**:
- ⚠️ Only suitable when the missing rate is < 5%
- ⚠️ Assumes the data are MCAR (missing completely at random)
- ⚠️ May underestimate the standard deviation
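
Two of the notes above can be checked directly in code: the 5% missing-rate limit and the shrinking standard deviation. The helper name `fill_median_if_sparse` and its `max_missing_rate` parameter are hypothetical, and the sample values are made up; this is a sketch of the idea rather than a prescribed implementation.

```python
import numpy as np
import pandas as pd

def fill_median_if_sparse(series: pd.Series, max_missing_rate: float = 0.05) -> pd.Series:
    """Median-fill a numeric column only when its missing rate is below the limit."""
    missing_rate = series.isna().mean()
    if missing_rate >= max_missing_rate:
        raise ValueError(f"missing rate {missing_rate:.1%} is too high for simple imputation")
    return series.fillna(series.median())

# 1 missing value out of 25 rows: a 4% missing rate.
bmi = pd.Series([22.0] * 12 + [26.0] * 12 + [np.nan])
filled = fill_median_if_sparse(bmi)

std_before = bmi.std()    # NaN is skipped
std_after = filled.std()  # smaller, because the filled value sits at the centre
print(std_before, std_after)
```

Because every filled value lands on the median, the spread of the column can only shrink, which is the underestimation the last note warns about.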

---

@@ -318,7 +318,7 @@ df['BMI'] = df['BMI'].fillna(bmi_median)

**User instruction**:
```
Use multiple imputation to fill the missing values in the BMI, age, and creatinine columns
```

**AI-generated code**:
@@ -336,42 +336,42 @@ df[cols] = imputer.fit_transform(df[cols])

**Code explanation**:
- MICE (Multivariate Imputation by Chained Equations)
- Uses the correlations between variables to predict missing values
- max_iter=10: at most 10 iterations
- random_state=0: reproducible results

**Algorithm**:
1. Initial imputation (e.g., with the mean)
2. Iterate:
   - For each variable with missing values, predict it from the other variables
   - Update the imputed values
3. Stop after convergence

**When to use**:
- ✅ Missing rate of 5%-30%
- ✅ Missing mechanism is MAR (missing at random)
- ✅ The variables are correlated
- ✅ The distributional characteristics of the data need to be preserved

**Advantages**:
- Exploits the relationships between variables
- Preserves the characteristics of the data
- Reduces bias
- Statistically more sound

**vs. simple imputation**:
| Metric | Simple imputation | Multiple imputation |
|------|---------|---------|
| Complexity | Low | Medium |
| Computation time | Fast | Slower |
| Bias | Possibly large | Smaller |
| Standard error | Underestimated | Accurate |
| Distribution preserved | Poorly | Well |

**Notes**:
- Only applicable to numeric variables
- Categorical variables must be encoded first
- Any time variables must be converted first
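
The notes above (numeric input only, categorical variables encoded first) can be put together in a short end-to-end sketch. It assumes scikit-learn is installed; the synthetic data, the `gender` column, and the injected missing values are all made up for illustration, while `max_iter=10` and `random_state=0` are the settings named in the explanation.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401  (activates IterativeImputer)
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)
age = rng.integers(20, 80, size=50).astype(float)
df = pd.DataFrame({
    "age": age,
    "BMI": 20 + 0.08 * age + rng.normal(0, 1, size=50),
    "creatinine": 60 + 0.5 * age + rng.normal(0, 5, size=50),
    "gender": rng.choice(["男", "女"], size=50),  # 男 = male, 女 = female
})
df.loc[[3, 17, 30], "BMI"] = np.nan
df.loc[[5, 22], "creatinine"] = np.nan

# Categorical columns are encoded before imputation, as the notes require.
df["gender_code"] = df["gender"].map({"男": 1, "女": 0}).astype(float)

cols = ["age", "BMI", "creatinine", "gender_code"]
imputer = IterativeImputer(max_iter=10, random_state=0)
df[cols] = imputer.fit_transform(df[cols])

remaining_missing = int(df[cols].isna().sum().sum())
print(remaining_missing)
```

With its default settings, `IterativeImputer` returns a single completed dataset. Producing several imputed datasets and pooling the results, as classical multiple imputation does, would require repeated runs with `sample_posterior=True` and different seeds.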

---

@@ -384,7 +384,7 @@ df[cols] = imputer.fit_transform(df[cols])

**AI-generated code**:
```python
# Smart deduplication: sort by date first, then deduplicate by ID, keeping the last record
df['check_date'] = pd.to_datetime(df['check_date'])
df = df.sort_values('check_date').drop_duplicates(subset=['patient_id'], keep='last')
```
@@ -395,19 +395,19 @@ df = df.sort_values('check_date').drop_duplicates(subset=['patient_id'], keep='l
- keep='last' keeps the last record (i.e., the one with the latest date)

**Expected result**:
- Only one record is kept per patient
- The record kept is the one with the latest check date

**Extended scenarios**:

**Scenario 1: Keep the most complete record**
```python
# Compute the completeness of each row
df['completeness'] = df.notna().sum(axis=1)
df = df.sort_values('completeness', ascending=False).drop_duplicates(subset=['patient_id'], keep='first')
```

**Scenario 2: Deduplicate on a combination of fields**
```python
# Deduplicate by patient ID + visit date
df = df.drop_duplicates(subset=['patient_id', 'visit_date'], keep='first')
@@ -415,13 +415,13 @@ df = df.drop_duplicates(subset=['patient_id', 'visit_date'], keep='first')

**Scenario 3: Multi-priority deduplication**
```python
# Priority: latest date > highest completeness
df = df.sort_values(['check_date', 'completeness'], ascending=[False, False]).drop_duplicates(subset=['patient_id'], keep='first')
```

**Medical scenarios**:
- Removing duplicate case entries
- Taking the first/last of multiple visits
- Deduplicating lab results (taking the latest)
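
Scenario 3 depends on a `completeness` column that has to exist before the sort. The sketch below combines both steps on made-up lab records; the `creatinine` and `glucose` columns are assumptions, and completeness is counted over those measurement fields only.

```python
import pandas as pd

df = pd.DataFrame({
    "patient_id": [1, 1, 1, 2, 2],
    "check_date": ["2025-01-05", "2025-03-01", "2025-03-01", "2025-02-01", "2025-01-15"],
    "creatinine": [80.0, None, 95.0, 70.0, 72.0],
    "glucose": [5.6, None, 6.1, None, 5.9],
})
df["check_date"] = pd.to_datetime(df["check_date"])

# Completeness = number of non-missing measurement fields in the row.
df["completeness"] = df[["creatinine", "glucose"]].notna().sum(axis=1)

# Latest date wins; completeness breaks ties between records from the same day.
latest = (
    df.sort_values(["check_date", "completeness"], ascending=[False, False])
      .drop_duplicates(subset=["patient_id"], keep="first")
      .sort_values("patient_id")
)
print(latest)
```

For patient 1, two records share the latest date, and the tie is resolved in favour of the row with both measurements present.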

---
@@ -432,44 +432,44 @@ df = df.sort_values(['check_date', 'completeness'], ascending=[False, False]).dr

````python
system_prompt = f"""
You are an expert in cleaning medical research data, responsible for generating Pandas code that cleans and organizes data.

## Current dataset
- File name: {session.fileName}
- Rows: {session.totalRows}
- Columns: {session.totalCols}
- Column names: {', '.join(session.columns)}

## Safety rules (mandatory)
1. Operate only on the df variable
2. Do not import dangerous modules such as os or sys
3. Do not use dangerous functions such as eval or exec
4. Exceptions must be handled
5. Return format: {{"code": "...", "explanation": "..."}}

## Few-shot examples

### Example 1: Unify Missing-Value Markers
User: Replace every symbol that stands for a missing value with the standard null value
Code:
```python
df = df.replace(['-', '不详', 'NA', 'N/A'], np.nan)
```

### Example 2: Numeric Column Cleaning
User: Strip the non-numeric symbols from the creatinine column and convert to a numeric type
Code:
```python
df['creatinine'] = df['creatinine'].astype(str).str.replace('>', '').str.replace('<', '')
df['creatinine'] = pd.to_numeric(df['creatinine'], errors='coerce')
```

[... the other 8 examples ...]

## Current user request
{user_message}

Generate the code and explain it.
"""
````
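
Safety rules 2 and 3 and the return format in rule 5 lend themselves to a mechanical check on the model's reply. The sketch below uses only the standard library. The function name `check_generated_code` is hypothetical, and the extra banned names `subprocess` and `__import__` are assumptions beyond the modules and functions the prompt lists; this is not the project's actual validation code.

```python
import ast
import json

BANNED_MODULES = {"os", "sys", "subprocess"}   # subprocess is an added assumption
BANNED_CALLS = {"eval", "exec", "__import__"}  # __import__ is an added assumption

def check_generated_code(reply: str) -> list:
    """Return a list of rule violations found in a model reply (empty list = passed)."""
    try:
        payload = json.loads(reply)
        code = payload["code"]
        payload["explanation"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return ["reply is not a JSON object with 'code' and 'explanation'"]

    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return [f"code does not parse: {exc.msg}"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [(node.module or "").split(".")[0]]
        else:
            names = []
        problems += [f"banned import: {name}" for name in names if name in BANNED_MODULES]
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name) and node.func.id in BANNED_CALLS:
            problems.append(f"banned call: {node.func.id}")
    return problems

ok = check_generated_code(json.dumps({"code": "df = df.dropna()", "explanation": "drop rows"}))
bad = check_generated_code(json.dumps({"code": "import os\neval('1')", "explanation": "x"}))
print(ok, bad)
```

A static check like this catches only the obvious cases; it does not replace running generated code in an isolated environment.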
@@ -477,55 +477,56 @@ df['creatinine'] = pd.to_numeric(df['creatinine'], errors='coerce')

## 🎯 Quality Standards

Every example must satisfy:
- ✅ The code runs as-is
- ✅ It has detailed comments
- ✅ It has clearly defined inputs and outputs
- ✅ It follows Python best practices
- ✅ It accounts for exceptional cases
- ✅ It includes a medical scenario description

---

## Test Case Design

Based on these 10 examples, the Day 3 tests should include:

**Basic tests (4)**:
1. Example 1 test (unifying missing values)
2. Example 2 test (numeric cleaning)
3. Example 3 test (gender encoding)
4. Example 4 test (age grouping)

**Intermediate tests (3)**:
5. Example 5 test (BMI calculation)
6. Example 6 test (length of stay)
7. Example 7 test (conditional filtering)

**Advanced tests (3)**:
8. Example 8 test (median imputation)
9. Example 9 test (multiple imputation) ⭐
10. Example 10 test (smart deduplication)

**Extended tests (5)**:
11. Mixed-scenario test (clean first, then compute)
12. Error-scenario test (column does not exist)
13. Boundary-scenario test (all values missing)
14. Self-correction test (retry after the code raises an error)
15. End-to-end test (upload → AI processing → result verification)
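
Tests 12 and 13 can be exercised without a model in the loop by running fixed snippets against small DataFrames. The harness below is a sketch: `run_snippet` is a hypothetical helper, the two snippets are made up, and `exec` is used only inside the test harness on code assumed to have already passed the safety checks.

```python
import numpy as np
import pandas as pd

def run_snippet(code: str, df: pd.DataFrame):
    """Execute a code snippet against a copy of df; return (result_df, error_message)."""
    namespace = {"df": df.copy(), "pd": pd, "np": np}
    try:
        exec(code, namespace)  # test harness only; assumes the snippet already passed safety checks
    except Exception as exc:
        return None, f"{type(exc).__name__}: {exc}"
    return namespace["df"], None

df = pd.DataFrame({"BMI": [np.nan, np.nan, np.nan]})

# Test 12: the snippet references a column that does not exist.
_, missing_col_error = run_snippet("df['age_group'] = df['age'] // 10", df)

# Test 13: every value is missing, so the median itself is NaN and nothing is filled.
all_missing, _ = run_snippet("df['BMI'] = df['BMI'].fillna(df['BMI'].median())", df)
print(missing_col_error)
```

The captured error message is the kind of input a self-correction test (test 14) could feed back into the next prompt.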

---

## Maintenance Log

| Date | Version | Changes | Author |
|------|------|---------|--------|
| 2025-12-06 | V1.0 | Initial creation, 10 core examples | AI Assistant |

---

**Document status**: ✅ Confirmed
**Next step**: Start Day 3 development (AICodeService implementation)