
Document Processing Engine

Capability positioning: Universal capability layer
Reuse rate: 86% (6 dependent modules)
Priority: P0
Status: ✅ Implemented (Python microservice)


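📋 Capability Overview

The document processing engine is a core foundational capability of the platform. It is responsible for:

  • Multi-format text extraction (PDF, Docx, Txt, Excel)
  • OCR processing
  • Table extraction
  • Language detection
  • Quality assessment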

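📊 Dependent Modules

Six modules depend on this engine (86% reuse rate):

  1. ASL - AI Intelligent Literature (PDF extraction for literature)
  2. PKB - Personal Knowledge Base (knowledge base document upload)
  3. DC - Data Cleaning (Excel/Docx data import)
  4. SSA - Smart Statistical Analysis (data import)
  5. ST - Statistical Analysis Tools (data import)
  6. RVW - Manuscript Review (manuscript document extraction)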

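💡 Core Features

1. PDF extraction

  • Nougat: English-language papers (high quality)
  • PyMuPDF: Chinese PDFs, plus the fallback path (fast)
  • Language detection: automatically distinguishes Chinese from English
  • Quality assessment: scores the extraction quality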
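
The routing described above (language detection selects the extractor, with PyMuPDF as the fallback) can be sketched roughly as follows. This is a minimal illustration, not the code in pdf_extractor.py or language_detector.py: the function names, the CJK-ratio heuristic, and the 0.3 threshold are all assumptions, and the two extractors are passed in as plain callables so the sketch runs without Nougat or PyMuPDF installed.

```python
from typing import Callable, Tuple


def cjk_ratio(text: str) -> float:
    """Fraction of non-whitespace characters that are CJK unified ideographs."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    cjk = sum(1 for c in chars if "\u4e00" <= c <= "\u9fff")
    return cjk / len(chars)


def detect_language(sample: str, threshold: float = 0.3) -> str:
    """Return "zh" for mostly-Chinese text, else "en" (hypothetical heuristic)."""
    return "zh" if cjk_ratio(sample) >= threshold else "en"


def extract_pdf(
    sample_text: str,
    nougat: Callable[[], str],
    pymupdf: Callable[[], str],
) -> Tuple[str, str]:
    """Route by detected language and return (extractor_name, extracted_text).

    Chinese documents go straight to the PyMuPDF callable. English documents
    try the Nougat callable first and fall back to PyMuPDF if it raises.
    """
    if detect_language(sample_text) == "zh":
        return "pymupdf", pymupdf()
    try:
        return "nougat", nougat()
    except Exception:
        # Assumed fallback policy: any Nougat failure degrades to the fast path.
        return "pymupdf", pymupdf()
```

In practice `sample_text` would probably be a quick text sample taken from the first page or two, so the language decision is made before the slower extractor runs.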

2. Docx extraction

  • Mammoth: converts to Markdown
  • python-docx: structured reading

3. Txt extraction

  • Multi-encoding support: UTF-8, GBK, and others
  • chardet: automatic encoding detection
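
Multi-encoding support can be illustrated with a simple ordered-fallback decoder. This is a sketch under stated assumptions rather than the logic in txt_extractor.py: `decode_text` and its candidate list are hypothetical, and it uses only the standard library, whereas the service itself is described as using chardet to guess the encoding.

```python
from typing import Sequence, Tuple


def decode_text(
    data: bytes,
    encodings: Sequence[str] = ("utf-8", "gbk"),
) -> Tuple[str, str]:
    """Try each candidate encoding in order and return (text, encoding_used)."""
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # Last resort: latin-1 maps every byte to a code point, so it never raises.
    return data.decode("latin-1"), "latin-1"
```

Trying UTF-8 first matters: UTF-8 decoding is strict and rejects most GBK byte sequences, while GBK will accept many UTF-8 byte sequences and silently produce the wrong characters.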

4. Excel processing

  • openpyxl: reads Excel files
  • pandas: data processing

🏗️ Technical Architecture

Python microservice (FastAPI):

extraction_service/
  ├── main.py (509 lines)                  - FastAPI main service
  ├── services/
  │   ├── pdf_extractor.py (242 lines)     - PDF extraction coordinator
  │   ├── pdf_processor.py (280 lines)     - PyMuPDF implementation
  │   ├── language_detector.py (120 lines) - Language detection
  │   ├── nougat_extractor.py (242 lines)  - Nougat implementation
  │   ├── docx_extractor.py (253 lines)    - Docx extraction
  │   └── txt_extractor.py (316 lines)     - Txt extraction (multi-encoding)
  └── requirements.txt

📚 API Endpoints

POST /api/extract/pdf      - PDF text extraction
POST /api/extract/docx     - Docx text extraction
POST /api/extract/txt      - Txt text extraction
POST /api/extract/excel    - Excel table extraction
GET  /health               - Health check
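
As a rough sketch of how a client might call one of the extraction endpoints above, the snippet below builds a multipart upload request with the standard library only. The endpoint paths come from the list above. Everything else is an assumption: the form field name `file`, the base URL, and the helper names are hypothetical, and the response schema is not documented here, so the sketch stops at constructing the request.

```python
import urllib.request
import uuid
from typing import Tuple


def build_multipart(field: str, filename: str, content: bytes) -> Tuple[bytes, str]:
    """Encode a single file as multipart/form-data; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def build_extract_request(
    base_url: str, kind: str, filename: str, content: bytes
) -> urllib.request.Request:
    """Build a POST request for /api/extract/{kind} (kind: pdf, docx, txt, excel)."""
    # "file" is an assumed form field name, not confirmed by the service docs.
    body, content_type = build_multipart("file", filename, content)
    return urllib.request.Request(
        f"{base_url}/api/extract/{kind}",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

Sending the request is then a matter of passing it to `urllib.request.urlopen` against a running instance, for example with a base URL such as `http://localhost:8000` (also an assumption).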

🔗 Related Documents


Last updated: 2025-11-06
Maintainer: Technical Architect