# SSA - Smart Statistical Analysis

> **Module code:** SSA (Smart Statistical Analysis)
> **Development status:** ⏳ Planning
> **Business value:** ⭐⭐⭐⭐⭐ Essential need
> **Independence:** ⭐⭐⭐⭐
> **Priority:** P2
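
---

## 📋 Module Overview

The Smart Statistical Analysis module provides three core analysis paths that cover the full workflow from data upload to report export.

---

## 🎯 Core Features (3 Paths)
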
### 1. Cohort Study Analysis

- Baseline characteristics analysis
- Survival analysis (Kaplan-Meier)
- Cox regression
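
To make the survival-analysis step concrete, here is a minimal sketch of the Kaplan-Meier product-limit estimator. The module plans to use R as its statistical core, so this Python version is for illustration only; the function name `kaplan_meier` and its input layout are assumptions, not part of any planned interface.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each subject
    events:    1 if the event was observed, 0 if the subject was censored
    (Hypothetical helper for illustration; not the module's API.)
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    # Number of observed events at each time point.
    deaths = Counter(t for t, e in zip(durations, events) if e)
    # Number of subjects leaving the risk set (event or censoring) at each time.
    leaving = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Subjects with time t leave the risk set after t is processed.
        at_risk -= leaving[t]
    return curve


curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
```

Censored subjects never lower the curve directly; they only shrink the risk set used at later event times.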

### 2. Prediction Model Building

- Variable selection
- Model building (logistic regression, random forest)
- Model validation (ROC curve)
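
As an illustration of the model-validation step, the sketch below computes the area under the ROC curve as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. It is a Python illustration only (the planned statistical core is R), and the name `roc_auc` is an assumption.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for positive cases, 0 for negative cases
    scores: predicted risk or probability for each case
    (Hypothetical helper for illustration; O(n_pos * n_neg) time.)
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")

    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a concordant pair
    return wins / (len(pos) * len(neg))


auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

A value of 0.5 corresponds to a model that ranks cases no better than chance, and 1.0 to perfect separation.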

### 3. RCT Analysis

- Randomization check
- Efficacy analysis
- Subgroup analysis
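
The README does not specify how the randomization check will be performed. One common option is to compare baseline covariates between arms with the standardized mean difference (SMD), sketched below in Python for illustration only (the planned statistical core is R). The function name and the choice of SMD are assumptions.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(treatment, control):
    """SMD of one continuous baseline covariate between two trial arms.

    Uses the unweighted pooled standard deviation of the two arms.
    Each arm needs at least two values so a sample variance exists.
    (Hypothetical helper for illustration; not the module's API.)
    """
    pooled_sd = sqrt((variance(treatment) + variance(control)) / 2)
    if pooled_sd == 0:
        raise ValueError("covariate has zero variance in both arms")
    return (mean(treatment) - mean(control)) / pooled_sd


smd = standardized_mean_difference([1, 2, 3], [2, 3, 4])
```

A frequently cited rule of thumb treats an absolute SMD below 0.1 as adequate balance, although the appropriate threshold depends on the study.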

---

## 📂 Documentation Structure

```
SSA-智能统计分析/
├── [AI对接] SSA快速上下文.md        # ⏳ To be created
├── 00-项目概述/
│   └── 01-产品需求文档(PRD).md      # ⏳ To be created
└── README.md                        # ✅ This document
```

---

## 🔗 Shared Capabilities This Module Depends On

- **Document processing engine** - data import
- **ETL engine** - data preprocessing

---

## 🏗️ Tech Stack

- **R** - statistical analysis core
- **Plumber** - exposes R functions as an API
- **Node.js** - glue layer

---

## 🎯 Business Model

**Sold together with the ST module**

---

**Last updated:** 2025-11-06
**Maintainer:** Technical Architect