Core Components:
- PDFStorageService with Dify/OSS adapters
- LLM12FieldsService with Nougat-first + dual-model + 3-layer JSON parsing
- PromptBuilder for dynamic prompt assembly
- MedicalLogicValidator with 5 rules + fault tolerance
- EvidenceChainValidator for citation integrity
- ConflictDetectionService for dual-model comparison

Prompt Engineering:
- System Prompt (6601 chars, Section-Aware strategy)
- User Prompt template (PICOS context injection)
- JSON Schema (12 fields constraints)
- Cochrane standards (not loaded in MVP)

Key Innovations:
- 3-layer JSON parsing (JSON.parse + json-repair + code block extraction)
- Promise.allSettled for dual-model fault tolerance
- safeGetFieldValue for robust field extraction
- Mixed CN/EN token calculation

Integration Tests:
- integration-test.ts (full test)
- quick-test.ts (quick test)
- cached-result-test.ts (fault tolerance test)

Documentation Updates:
- Development record (Day 2-3 summary)
- Quality assurance strategy (full-text screening)
- Development plan (progress update)
- Module status (v1.1 update)
- Technical debt (10 new items)

Test Results:
- JSON parsing success rate: 100%
- Medical logic validation: 5/5 passed
- Dual-model parallel processing: OK
- Cost per PDF: CNY 0.10

Files: 238 changed, 14383 insertions(+), 32 deletions(-)
Docs: docs/03-业务模块/ASL-AI智能文献/05-开发记录/2025-11-22_Day2-Day3_LLM服务与验证系统开发.md

93 lines
1.1 KiB
Markdown
# Medical NLP Engine

> **Capability tier:** Common capability layer
> **Reuse rate:** 14% (1 dependent module)
> **Priority:** P2
> **Status:** ⏳ Not yet implemented

---

## 📋 Capability Overview

The medical NLP engine is responsible for:
- Medical named entity recognition (NER)
- Medical terminology standardization
- Disease and drug recognition

---

## 📊 Dependent Modules

**1 dependent module (14% reuse rate):**
1. **DC** - Data Cleaning and Organization (NER extraction from case data)

---

## 💡 Core Features

### 1. Medical Entity Recognition
- Disease recognition
- Drug recognition
- Surgical procedure recognition
- TNM staging extraction

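Of the entity types listed above, TNM staging is the most pattern-like, so it lends itself to a small rule-based sketch. The example below is only an illustration under stated assumptions: the `TNMStage` type, the `extract_tnm` function, and the regular expression are hypothetical names and rules, not part of the engine (which is not yet implemented), and the pattern covers only compact forms such as `pT2N1M0` or `cT3 N2 M0`.

```python
import re
from typing import List, NamedTuple


class TNMStage(NamedTuple):
    """One TNM mention. All field names here are hypothetical."""
    prefix: str  # "c", "p", "y", "yp", ... or "" when no prefix is written
    t: str       # tumour category, e.g. "2", "1a", "is", "X"
    n: str       # node category
    m: str       # metastasis category


# Assumed, simplified pattern: optional prefix, then T, N and M in that order,
# optionally separated by whitespace. Real reports contain many more variants.
TNM_PATTERN = re.compile(
    r"\b(?P<prefix>[cpy]{0,2})"
    r"T(?P<t>[0-4][a-d]?|is|[Xx])\s*"
    r"N(?P<n>[0-3][a-c]?|[Xx])\s*"
    r"M(?P<m>[01][a-c]?|[Xx])\b"
)


def extract_tnm(text: str) -> List[TNMStage]:
    """Return every TNM mention found in the text, in reading order."""
    return [
        TNMStage(m["prefix"], m["t"], m["n"], m["m"])
        for m in TNM_PATTERN.finditer(text)
    ]
```

Calling `extract_tnm("Pathology: pT2N1M0 adenocarcinoma")` yields one `TNMStage` with prefix `p`. A production extractor would likely combine rules like this with model-based recognition for free-text staging descriptions.
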
### 2. Terminology Standardization
- ICD coding
- ATC coding

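Terminology standardization can be pictured as mapping a free-text mention to a code in a controlled vocabulary. The sketch below assumes ICD-10 for diseases and the WHO ATC classification for drugs; the lexicon is a tiny hand-picked sample, and `normalize_term`, `SYNONYMS`, and the dictionary names are hypothetical. A real implementation would load the full code lists and handle fuzzy matching.

```python
from typing import Dict, Optional, Tuple

# Tiny illustrative lexicons (assumption: ICD-10 and ATC are the target systems).
ICD10_BY_TERM: Dict[str, str] = {
    "type 2 diabetes mellitus": "E11",
    "essential hypertension": "I10",
}
ATC_BY_TERM: Dict[str, str] = {
    "metformin": "A10BA02",
    "atorvastatin": "C10AA05",
}
# Hypothetical synonym table mapping surface forms to a canonical term.
SYNONYMS: Dict[str, str] = {
    "t2dm": "type 2 diabetes mellitus",
    "type 2 diabetes": "type 2 diabetes mellitus",
}


def normalize_term(mention: str) -> Optional[Tuple[str, str]]:
    """Map a mention to (coding system, code), or None if it is unknown."""
    # Lowercase and collapse whitespace so lookups are insensitive to formatting.
    key = " ".join(mention.lower().split())
    key = SYNONYMS.get(key, key)
    if key in ICD10_BY_TERM:
        return ("ICD-10", ICD10_BY_TERM[key])
    if key in ATC_BY_TERM:
        return ("ATC", ATC_BY_TERM[key])
    return None
```

For example, `normalize_term("T2DM")` resolves through the synonym table to the ICD-10 category, while an unknown mention returns `None` so the caller can decide whether to flag it for review.
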
### 3. Relation Extraction
- Disease-drug relations
- Symptom-disease relations

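To make the two relation types above concrete, here is a deliberately naive baseline: two entities are linked when they appear in the same sentence and their labels form one of the supported pairs. This is a sketch for illustration, not the engine's method; the `Entity` and `Relation` types, the label names, and `cooccurrence_relations` are all hypothetical.

```python
from dataclasses import dataclass
from itertools import product
from typing import List


@dataclass(frozen=True)
class Entity:
    text: str
    label: str     # assumed labels: "DISEASE", "DRUG", "SYMPTOM"
    sentence: int  # index of the sentence that contains the mention


@dataclass(frozen=True)
class Relation:
    head: str
    tail: str
    kind: str


# (head label, tail label) -> relation name, mirroring the two types listed above.
RELATION_TYPES = {
    ("DISEASE", "DRUG"): "disease-drug",
    ("SYMPTOM", "DISEASE"): "symptom-disease",
}


def cooccurrence_relations(entities: List[Entity]) -> List[Relation]:
    """Pair entities that share a sentence and have a supported label pair."""
    relations = []
    for head, tail in product(entities, repeat=2):
        kind = RELATION_TYPES.get((head.label, tail.label))
        if kind is not None and head.sentence == tail.sentence:
            relations.append(Relation(head.text, tail.text, kind))
    return relations
```

Co-occurrence over-generates (it cannot tell "treated with" from "allergic to"), so in practice it would serve at most as a candidate generator ahead of a classifier or an LLM-based check.
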
---

## 🏗️ Technical Approach

### Cloud Version (high accuracy)
```python
# Based on an LLM API (Claude/GPT)
# Structured output via JSON mode
```

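The cloud path can be sketched without committing to a particular vendor SDK. In the example below, `call_llm` is a hypothetical callable (prompt in, raw response text out) standing in for whichever LLM client is eventually chosen; the prompt wording, the label set, and the function names are likewise assumptions. The parser accepts either bare JSON or JSON wrapped in a fenced code block, since models sometimes add a fence even when asked for JSON only.

```python
import json
import re
from typing import Callable, Dict, List

# Assumed prompt and label set; neither is specified by the design yet.
PROMPT_TEMPLATE = (
    "Extract medical entities from the text below. "
    'Respond with JSON only, shaped as {{"entities": [{{"text": str, "label": str}}]}}, '
    "where label is one of DISEASE, DRUG, PROCEDURE.\n\nText: {text}"
)
ALLOWED_LABELS = {"DISEASE", "DRUG", "PROCEDURE"}

# Matches a fenced block, with or without a "json" language tag.
FENCE = "`" * 3
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_entities(raw: str) -> List[Dict[str, str]]:
    """Parse the model reply and keep only well-formed, allowed entities."""
    match = FENCED_BLOCK.search(raw)
    payload = match.group(1) if match else raw
    data = json.loads(payload)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        return []
    return [
        {"text": item["text"], "label": item["label"]}
        for item in data.get("entities", [])
        if isinstance(item, dict)
        and isinstance(item.get("text"), str)
        and item.get("label") in ALLOWED_LABELS
    ]


def extract_entities(text: str, call_llm: Callable[[str], str]) -> List[Dict[str, str]]:
    """Build the prompt, call the (hypothetical) LLM client, parse the reply."""
    return parse_entities(call_llm(PROMPT_TEMPLATE.format(text=text)))
```

Injecting `call_llm` keeps the parsing logic testable offline with a stub function, and lets the same code sit behind either provider named above.
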
### Standalone Version (privacy first)
```python
# Based on spaCy + a medical model
# Runs 100% locally
```

---

## 🔗 Related Documents

- [Common Capability Layer Overview](../README.md)
- [DC Module Requirements](../../03-业务模块/DC-数据清洗整理/README.md)

---

**Last updated:** 2025-11-06
**Maintainer:** Technical Architect