# Common Capability Layer

> **Layer definition:** Core technical capabilities shared across business modules
> **Core principles:** Reusable, highly cohesive, independently deployable

---

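## 📋 Capability List

| Capability | Description | Reuse rate (modules using it / 7) | Priority | Status |
|------|------|-------|--------|------|
| **01-LLM Gateway** | Unified management of LLM calls, cost control, and model switching | 71% (5/7) | P0 | ⏳ To be implemented |
| **02-Document Processing Engine** | PDF/Docx/Txt extraction, OCR, table extraction | 86% (6/7) | P0 | ✅ Implemented |
| **03-RAG Engine** | Vector retrieval, semantic search, RAG question answering | 43% (3/7) | P1 | ✅ Implemented |
| **04-Data ETL Engine** | Excel JOIN, data cleaning, data transformation | 29% (2/7) | P2 | ⏳ To be implemented |
| **05-Medical NLP Engine** | Medical entity recognition, terminology standardization | 14% (1/7) | P2 | ⏳ To be implemented |

---
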
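The LLM Gateway row in the capability list names three responsibilities: unified call management, cost control, and model switching. Since the gateway is still to be implemented, the following is only a minimal sketch of how those three responsibilities might sit behind one entry point. Every name here (`LLMGateway`, `ModelBackend`, the flat per-call cost, the budget cap) is a hypothetical illustration, not the project's actual design.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ModelBackend:
    """Hypothetical backend: a callable plus an assumed flat cost per call."""
    call: Callable[[str], str]
    cost_per_call: float


class LLMGateway:
    """Single entry point for LLM calls: routing, cost tracking, model switching."""

    def __init__(self, budget: float) -> None:
        self.budget = budget          # assumed spending cap, in any currency unit
        self.spent = 0.0
        self.backends: Dict[str, ModelBackend] = {}
        self.active = ""

    def register(self, name: str, backend: ModelBackend) -> None:
        self.backends[name] = backend
        if not self.active:
            self.active = name        # first registered model becomes the default

    def switch(self, name: str) -> None:
        if name not in self.backends:
            raise KeyError(f"unknown model: {name}")
        self.active = name

    def complete(self, prompt: str) -> str:
        backend = self.backends[self.active]
        # Cost control: refuse the call if it would exceed the budget.
        if self.spent + backend.cost_per_call > self.budget:
            raise RuntimeError("LLM budget exceeded")
        self.spent += backend.cost_per_call
        return backend.call(prompt)


# Usage with stand-in backends (no real model is called).
gateway = LLMGateway(budget=0.25)
gateway.register("model-a", ModelBackend(lambda p: "A:" + p, 0.10))
gateway.register("model-b", ModelBackend(lambda p: "B:" + p, 0.10))
first = gateway.complete("hello")     # routed to model-a
gateway.switch("model-b")
second = gateway.complete("hello")    # routed to model-b
```

Business modules would call only `complete`, so swapping models or tightening the budget would not require changes on their side.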
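## 🎯 Design Principles

### 1. Reusability
- Shared by multiple business modules
- Avoids duplicated development

### 2. Independent Deployment
- Can be split out as a standalone microservice
- Supports independent scaling

### 3. High Cohesion
- Each capability has a single responsibility
- Clear interfaces

### 4. Domain Knowledge
- Embeds business domain knowledge
- Not a purely technical component

---
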
## 📊 Reuse Rate Analysis

**LLM Gateway** - 71% reuse rate (highest priority)
- AIA (AI Q&A)
- ASL (AI literature)
- PKB (personal knowledge base)
- DC (data cleaning)
- RVW (manuscript review)

**Document Processing Engine** - 86% reuse rate (implemented)
- ASL, PKB, DC, SSA, ST, RVW

**RAG Engine** - 43% reuse rate (implemented)
- AIA, ASL, PKB

---

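The reuse rates in the analysis above are simply the number of consuming modules divided by the total number of business modules. A small sketch of that arithmetic follows. It assumes the seven business modules are the union of the module codes named in the lists (AIA, ASL, PKB, DC, RVW, SSA, ST); the document gives the denominator 7 but does not enumerate the modules, so that list is an inference.

```python
# Assumption: the seven business modules are the union of the codes listed above.
ALL_MODULES = ["AIA", "ASL", "PKB", "DC", "RVW", "SSA", "ST"]

# Consumers of each capability, copied from the reuse rate analysis.
CONSUMERS = {
    "LLM Gateway": ["AIA", "ASL", "PKB", "DC", "RVW"],
    "Document Processing Engine": ["ASL", "PKB", "DC", "SSA", "ST", "RVW"],
    "RAG Engine": ["AIA", "ASL", "PKB"],
}


def reuse_rate(consumers, all_modules):
    """Return (consumer count, module total, percentage rounded to an integer)."""
    used = set(consumers) & set(all_modules)
    return len(used), len(all_modules), round(100 * len(used) / len(all_modules))


for name, consumers in CONSUMERS.items():
    count, total, percent = reuse_rate(consumers, ALL_MODULES)
    print(f"{name}: {percent}% ({count}/{total})")
# LLM Gateway: 71% (5/7)
# Document Processing Engine: 86% (6/7)
# RAG Engine: 43% (3/7)
```

The Data ETL Engine and Medical NLP Engine are left out because the analysis does not list their consuming modules.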
## 📚 Quick Navigation

### Quick Context
- **[AI对接] 通用能力快速上下文.md** (quick context for AI handoff) - get up to speed on the common capability layer in 2-3 minutes

### Core Capabilities
1. [LLM Gateway](./01-LLM大模型网关/README.md) - P0 priority ⭐
2. [Document Processing Engine](./02-文档处理引擎/README.md) - implemented
3. [RAG Engine](./03-RAG引擎/README.md) - implemented
4. [Data ETL Engine](./04-数据ETL引擎/README.md)
5. [Medical NLP Engine](./05-医学NLP引擎/README.md)

---

## 🔗 Related Documents

- [System Architecture Layering Design](../00-系统总体设计/01-系统架构分层设计.md)
- [Platform Foundation Layer](../01-平台基础层/README.md)
- [Business Module Layer](../03-业务模块/README.md)

---

**Last updated:** 2025-11-06
**Maintainer:** Technical architect