AIclinicalresearch/docs/03-业务模块/DC-数据清洗整理/02-技术设计/工具 C:AI 辅助医疗数据清洗场景分级清单.md
HaHafeng 1b53ab9d52 feat(aia): Complete AIA V2.0 with universal streaming capabilities
2026-01-14 19:15:01 +08:00

# **Tool C: Tiered Checklist of AI-Assisted Medical Data Cleaning Scenarios**
This checklist is ordered by **technical difficulty** and **business-logic complexity**, from lowest to highest. It assumes the data has already been loaded into a Pandas DataFrame (df).
## **Level 1: Basic Cleaning (Data Hygiene)**
*Goal: get the data into a clean, machine-readable state. Excel can do this too, but Python is faster.*
### **1.1 Variable Renaming (Rename)**
* **Scenario:** The header row is in Chinese and contains special characters (年龄(岁), 性别/Gender, names with underscores), which SPSS cannot read.
* **User instruction:** "Convert all column names to lowercase English and remove the special characters."
* **Python logic:** Use a mapping dictionary or regex replacement.
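The two approaches named above (a mapping dictionary and a regex replacement) can be combined. The following is a minimal sketch: the sample headers, the `rename_map` contents, and the helper `to_snake_case` are all assumptions made for illustration, not names from the original design.

```python
import re
import pandas as pd

# Hypothetical raw headers, for illustration only.
df = pd.DataFrame({
    "年龄(岁)": [34, 61],
    "性别/Gender": ["Male", "Female"],
    "Admission No.": [1001, 1002],
})

# Step 1: explicit mapping for headers that need a real translation.
rename_map = {"年龄(岁)": "age", "性别/Gender": "sex"}
df = df.rename(columns=rename_map)

def to_snake_case(name: str) -> str:
    """Lowercase a header and collapse runs of non-alphanumerics into '_'."""
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name).strip("_")
    return cleaned.lower()

# Step 2: regex clean-up for everything else.
df.columns = [to_snake_case(c) for c in df.columns]
print(list(df.columns))
```

Applying the dictionary first matters: the regex step would otherwise reduce a purely Chinese header to an empty string.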
### **1.2 Numeric Column Cleaning (Clean Numeric)**
* **Scenario:** Lab results are exported with symbols mixed into the values (\>100, \<0.1, 12.5+, 未查 (not tested)).
* **User instruction:** "Strip the non-numeric symbols from the lab columns, treat \<0.1 as 0.05, and convert the column to floating point."
* **Python logic:** str.replace \+ regex extraction \+ pd.to\_numeric(errors='coerce').
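A minimal sketch of that pipeline on a single column. The column name `crp` and the sample values are hypothetical, and substituting 0.05 for `<0.1` follows the instruction above rather than any general rule.

```python
import pandas as pd

# Hypothetical lab column with mixed symbols.
raw = pd.Series([">100", "<0.1", "12.5+", "未查", "7.3"], name="crp")

# Special case first: values below the detection limit become 0.05.
cleaned = raw.replace({"<0.1": "0.05"})

# Pull out the first number in each cell; cells without digits become NaN.
number = cleaned.str.extract(r"(\d+\.?\d*)", expand=False)

# Convert to float; anything unparseable is coerced to NaN.
crp = pd.to_numeric(number, errors="coerce")
print(crp.tolist())
```

Note that `>100` is reduced to 100.0 here, which discards the censoring information; whether that is acceptable depends on the analysis.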
### **1.3 Standardizing Missing Values (Standardize Nulls)**
* **Scenario:** The data marks empty values in many different ways: NA, N/A, \-, \\, 不详 (unknown).
* **User instruction:** "Replace every token that means 'no data' with a standard null value."
* **Python logic:** df.replace(\['-', '不详', 'NA'\], np.nan, inplace=True).
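The replacement can be sketched as follows. The column names and the contents of `null_tokens` are assumptions for illustration; the real list should come from inspecting the data.

```python
import numpy as np
import pandas as pd

# Hypothetical columns containing several "empty" markers.
df = pd.DataFrame({
    "smoker": ["Yes", "-", "NA", "No"],
    "stage": ["II", "不详", "N/A", "\\"],
})

# Every token here is treated as "no data".
null_tokens = ["NA", "N/A", "-", "\\", "不详"]

# Reassigning (rather than inplace=True) keeps the step easy to chain.
df = df.replace(null_tokens, np.nan)
print(df.isna().sum())
```

When the file is read with `pd.read_csv` or `pd.read_excel`, the same tokens could instead be passed through the `na_values` parameter so the nulls are standardized at load time.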
## **Level 2: Variable Recoding and Standardization (Recode & Standardization)**
*Goal: prepare the variables for statistical analysis.*
### **2.1 Text-to-Number Mapping (Map Categorical)**
* **Scenario:** Sex is recorded as Male/Female, and another binary field is recorded as Yes/No.
* **User instruction:** "Convert sex to 1 (male) / 0 (female), and convert the Yes/No field to 1/0 as well."
* **Python logic:** df\['sex'\].map({'Male': 1, 'Female': 0}).
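A minimal sketch of the mapping, with two additions that are assumptions rather than part of the original logic: labels are normalized before mapping, and unmapped values are collected for review. The `smoker` column is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "sex": ["Male", "Female", "male ", None],
    "smoker": ["Yes", "No", "Yes", "No"],  # hypothetical Yes/No field
})

# Normalize case and whitespace so "male " and "Male" map to the same code.
sex_map = {"male": 1, "female": 0}
df["sex_code"] = df["sex"].str.strip().str.lower().map(sex_map)

df["smoker_code"] = df["smoker"].map({"Yes": 1, "No": 0})

# Series.map returns NaN for any label missing from the dictionary,
# so these rows are worth inspecting before analysis.
unmapped = df.loc[df["sex_code"].isna(), "sex"]
print(df)
```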
### **2.2 Binning Continuous Variables (Binning)**
* **Scenario:** Age needs to be grouped into bands before running a chi-square test.
* **User instruction:** "Split age into 0-18, 19-60, and 60+, and label the three groups accordingly."
* **Python logic:** The pd.cut() function.
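A minimal sketch of the binning with `pd.cut`. The instruction leaves the boundary at exactly 60 ambiguous; this sketch assumes 60 belongs to the 19-60 band.

```python
import pandas as pd

df = pd.DataFrame({"age": [0, 18, 19, 60, 61, 85]})

# Intervals are right-closed: [0, 18], (18, 60], (60, inf).
# include_lowest=True makes the first interval include age 0.
df["age_group"] = pd.cut(
    df["age"],
    bins=[0, 18, 60, float("inf")],
    labels=["0-18", "19-60", "60+"],
    include_lowest=True,
)
print(df)
```

The result is an ordered categorical column, so it can be passed directly to `pd.crosstab` to build the contingency table for the test.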
### **2.3 Complex Date Calculation (Date Logic)**
* **Scenario:** Compute survival time (OS). Excel frequently gets leap years and month lengths wrong.
* **User instruction:** "From the diagnosis date and the follow-up date, compute survival time in months, keeping 1 decimal place."
* **Python logic:** (df\['end\_date'\] \- df\['start\_date'\]).dt.days / 30.4.
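A minimal sketch of the calculation. The sample dates are made up, and the parsing step with `pd.to_datetime` is an assumption about how the columns arrive (as strings); the 30.4 days-per-month divisor comes from the logic above.

```python
import pandas as pd

df = pd.DataFrame({
    "start_date": ["2020-01-15", "2021-03-01"],
    "end_date": ["2021-01-15", "2021-09-01"],
})

# Parse strings into datetimes; unparseable values become NaT.
for col in ["start_date", "end_date"]:
    df[col] = pd.to_datetime(df[col], errors="coerce")

# Day difference handles leap years correctly (2020 contributes 366 days).
days = (df["end_date"] - df["start_date"]).dt.days
df["os_months"] = (days / 30.4).round(1)
print(df)
```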
## **Level 3: Domain Feature Engineering (Feature Engineering)**
*Goal: derive new features from medical knowledge.*
### **3.1 Complex Formula Calculation (Complex Formula)**
* **Scenario:** Compute eGFR (estimated glomerular filtration rate) or BMI.
* **User instruction:** "Calculate BMI, and if BMI \> 28, flag the patient as obese in a new column."
* **Python logic:** Vectorized calculation df\['weight'\] / (df\['height'\]/100)\*\*2 \+ conditional assignment with np.where.
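A minimal sketch of the BMI step, assuming weight is in kilograms and height in centimeters (the division by 100 in the formula above implies centimeters). Keeping the flag missing when BMI is missing is a design choice added here, not something the original specifies.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "weight": [60.0, 90.0, 72.0],    # kg (assumed unit)
    "height": [170.0, 175.0, np.nan],  # cm (assumed unit)
})

# Vectorized BMI: weight / height_in_meters ** 2.
df["bmi"] = df["weight"] / (df["height"] / 100) ** 2

# A plain np.where(df["bmi"] > 28, 1, 0) would label missing BMI as 0;
# here the flag stays NaN when BMI cannot be computed.
df["obese"] = np.where(df["bmi"].isna(), np.nan, (df["bmi"] > 28).astype(float))
print(df)
```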
### **3.2 Cohort Selection (Cohort Selection)**
* **Scenario:** Select the patients who meet a specific set of criteria.
* **User instruction:** "Filter for patients diagnosed with lung adenocarcinoma who are older than 18 and have no history of hypertension."
* **Python logic:** df.query("diagnosis \== 'Lung Adenocarcinoma' & age \> 18 & hypertension \== 0").
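A minimal sketch of the filter on made-up rows, written once with `DataFrame.query` and once as an explicit boolean mask. It assumes `hypertension` is already coded 0/1.

```python
import pandas as pd

df = pd.DataFrame({
    "diagnosis": ["Lung Adenocarcinoma", "Lung Adenocarcinoma",
                  "Gastric Cancer", "Lung Adenocarcinoma"],
    "age": [45, 17, 60, 70],
    "hypertension": [0, 0, 0, 1],
})

# Query-string form, using `and` for readability.
cohort = df.query(
    "diagnosis == 'Lung Adenocarcinoma' and age > 18 and hypertension == 0"
)

# Equivalent boolean-mask form; each condition needs its own parentheses.
mask = (
    (df["diagnosis"] == "Lung Adenocarcinoma")
    & (df["age"] > 18)
    & (df["hypertension"] == 0)
)
print(cohort)
```

The mask form is often easier to generate programmatically, since each criterion can be built and logged separately.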
### **3.3 Dummy Variable Generation (One-Hot Encoding)**
* **Scenario:** Before a logistic regression, an unordered categorical variable such as blood type (A, B, AB, O) has to be split into indicator columns.
* **User instruction:** "Generate dummy variables for blood type."
* **Python logic:** pd.get\_dummies(df\['blood\_type'\], prefix='blood').
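A minimal sketch of the encoding. The `drop_first=True` and `dtype=int` arguments are additions assumed here for regression use; the logic line above uses neither.

```python
import pandas as pd

df = pd.DataFrame({"blood_type": ["A", "B", "AB", "O", "A"]})

# drop_first=True removes the first category (A) so it acts as the
# reference level; dtype=int yields 0/1 instead of booleans.
dummies = pd.get_dummies(
    df["blood_type"], prefix="blood", drop_first=True, dtype=int
)

df = pd.concat([df, dummies], axis=1)
print(df)
```

Dropping one level avoids perfect collinearity between the indicator columns and the intercept in a regression model.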
## **Level 4: Reshaping and Advanced Governance (Reshaping & Governance)**
*Goal: change the table structure to fit statistical models, and repair the data at the row level.*
### **4.1 Long/Wide Table Conversion (Pivot/Melt): hard to do in Excel**
* **Scenario:** The raw data has one row per patient per visit (Zhang San, 1st test; Zhang San, 2nd test), but the analysis needs one row per patient (Zhang San, result 1, result 2).
* **User instruction:** "Convert the table from long to wide format, using patient ID as the key and spreading the white blood cell results across columns by visit."
* **Python logic:** df.pivot(index='id', columns='visit', values='wbc').
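A minimal sketch of the long-to-wide conversion and its inverse. The sample rows and the `wbc_visit{n}` column naming are assumptions for illustration.

```python
import pandas as pd

# Long format: one row per patient per visit.
long_df = pd.DataFrame({
    "id": ["P01", "P01", "P02", "P02"],
    "visit": [1, 2, 1, 2],
    "wbc": [5.2, 6.1, 7.8, 7.0],
})

# Wide format: one row per patient, one column per visit.
wide = long_df.pivot(index="id", columns="visit", values="wbc")
wide.columns = [f"wbc_visit{v}" for v in wide.columns]
wide = wide.reset_index()

# melt reverses the operation (wide back to long).
back = wide.melt(id_vars="id", var_name="visit", value_name="wbc")
print(wide)
```

`pivot` raises a `ValueError` when an (id, visit) pair occurs more than once; in that case `pivot_table` with an explicit `aggfunc` is one alternative.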
### **4.2 Intelligent Deduplication (Smart Deduplication)**
* **Scenario:** The same patient has two records: one with complete information, one with missing information.
* **User instruction:** "Deduplicate by patient ID. If the dates differ, keep the most recent record; if they are the same, keep the row with the highest data completeness."
* **Python logic:** df.sort\_values(\['date', 'completeness'\]).drop\_duplicates(subset=\['id'\], keep='last').
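The logic above presupposes a `completeness` column. A minimal sketch, assuming completeness is defined as the share of non-missing measurement fields per row (that definition is an assumption, not stated in the original):

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["P01", "P01", "P02", "P02"],
    "date": ["2024-01-01", "2024-03-01", "2024-02-01", "2024-02-01"],
    "wbc": [5.0, None, 6.0, 6.0],
    "bmi": [22.0, 23.0, None, 24.0],
})
df["date"] = pd.to_datetime(df["date"])

# Share of non-missing values among the measurement columns.
df["completeness"] = df.drop(columns=["id", "date"]).notna().mean(axis=1)

# Sort so the preferred row comes last within each patient:
# latest date first, then highest completeness as the tie-breaker.
dedup = (
    df.sort_values(["id", "date", "completeness"])
      .drop_duplicates(subset=["id"], keep="last")
)
print(dedup)
```

For P01 the newer record wins even though it is less complete, which matches the priority order in the instruction; for P02 the dates tie, so the more complete row is kept.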
### **4.3 Cross-Column Logic Check (Cross-Check)**
* **Scenario:** Detect dirty data.
* **User instruction:** "Find the erroneous records where sex is male but the pregnancy count is \>0, and flag them."
* **Python logic:** df.loc\[(df\['sex'\]=='男') & (df\['preg\_count'\]\>0), 'error\_flag'\] \= 1.
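A minimal sketch of the rule on made-up rows, assuming sex is stored as the labels '男' (male) and '女' (female). Initializing `error_flag` to 0 is an addition here so that unflagged rows are not left as NaN.

```python
import pandas as pd

df = pd.DataFrame({
    "sex": ["男", "女", "男", "女"],
    "preg_count": [0, 2, 1, 0],
})

# Start with no errors, then flag rows that violate the rule.
df["error_flag"] = 0
violates = (df["sex"] == "男") & (df["preg_count"] > 0)
df.loc[violates, "error_flag"] = 1
print(df)
```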
### **4.4 Multiple Imputation (Multiple Imputation): advanced statistical imputation**
* **Scenario:** The data has missing values (for example, BMI is missing). Instead of filling in a single constant, the missing values should be predicted from the patient's other variables, such as age.
* **User instruction:** "Use multiple imputation (MICE) to fill the missing values in the BMI and age columns."
* **Python logic:**

  ```python
  # This import is required: it enables the experimental IterativeImputer.
  from sklearn.experimental import enable_iterative_imputer  # noqa: F401
  from sklearn.impute import IterativeImputer

  # Impute numeric columns only.
  # Note: with default settings IterativeImputer returns one completed dataset.
  cols = ['bmi', 'age', 'creatinine']
  imp = IterativeImputer(max_iter=10, random_state=0)
  df[cols] = imp.fit_transform(df[cols])
  ```
## **Level 5: Unstructured Text Mining (Text Mining): where Python dominates**
*Goal: extract data from free-text reports, which Excel handles poorly.*
### **5.1 Regular Expression Extraction (Regex Extraction)**
* **Scenario:** A free-text report column contains descriptions that end in something like "size 3.5\*2cm", and the tumor size needs to be extracted.
* **User instruction:** "Extract the tumor dimensions from the description text."
* **Python logic:** df\['text'\].str.extract(r'(\\d+\\.?\\d\*)\\s\*\[\\\*xX\]\\s\*(\\d+\\.?\\d\*)') and take the maximum.
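A minimal sketch of the extraction on made-up report text. The output column name `max_diameter_cm` and the assumption that all sizes are in centimeters are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "text": [
        "tumor size 3.5*2cm",
        "mass 1.2 x 4 cm",
        "no measurable lesion",
    ]
})

# Two numbers separated by '*', 'x' or 'X', with optional spaces.
pattern = r"(\d+\.?\d*)\s*[*xX]\s*(\d+\.?\d*)"

# One column per capture group; rows without a match are NaN.
dims = df["text"].str.extract(pattern).astype(float)
dims.columns = ["dim_a", "dim_b"]

# Largest of the two dimensions (NaN when nothing was matched).
df["max_diameter_cm"] = dims.max(axis=1)
print(df)
```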
### **5.2 Fuzzy String Matching (Fuzzy Matching)**
* **Scenario:** Hospital names are entered inconsistently: several different spellings containing "协和" all refer to the same hospital.
* **User instruction:** "Normalize the hospital column: any name containing '协和' should be unified to 'PUMCH'."
* **Python logic:** df.loc\[df\['hospital'\].str.contains('协和', na=False), 'hospital'\] \= 'PUMCH'.
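A minimal sketch of the normalization on made-up values. Passing `na=False` and `regex=False` is an assumption added here so that missing names are skipped and the keyword is matched literally.

```python
import pandas as pd

df = pd.DataFrame({
    "hospital": ["协和", "北京协和", "协和医院", "Other Hospital", None]
})

# na=False treats missing names as "no match" so the mask is purely boolean;
# regex=False matches the keyword as a plain substring.
is_pumch = df["hospital"].str.contains("协和", na=False, regex=False)
df.loc[is_pumch, "hospital"] = "PUMCH"
print(df)
```

This is keyword (substring) matching rather than edit-distance matching, so it will not catch misspellings of the keyword itself.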