HaHafeng 8a17369138 feat(dc): Complete Tool B MVP with full API integration and bug fixes

Phase 5: Export Feature
- Add Excel export API endpoint (GET /tasks/:id/export)
- Fix Content-Disposition header encoding for Chinese filenames
- Fix export field order to match template definition
- Export finalResult or resultA as fallback
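A minimal sketch of the kind of header encoding the filename fix above refers to, assuming the common RFC 6266 / RFC 5987 approach (an ASCII `filename` fallback plus a percent-encoded `filename*`). The commit does not show its actual implementation; the function name `contentDisposition` is hypothetical.

```typescript
// Hypothetical helper: builds a Content-Disposition value that keeps
// non-ASCII (e.g. Chinese) filenames intact for the downloading client.
function contentDisposition(filename: string): string {
  // ASCII-only fallback for clients that ignore `filename*`:
  // non-printable-ASCII characters, quotes and backslashes become "_".
  const fallback = filename
    .replace(/[^\x20-\x7e]/g, "_")
    .replace(/["\\]/g, "_");

  // encodeURIComponent leaves ' ( ) * unescaped, but RFC 5987 does not
  // allow them in the encoded value, so escape them explicitly.
  const encoded = encodeURIComponent(filename).replace(
    /['()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase(),
  );

  return `attachment; filename="${fallback}"; filename*=UTF-8''${encoded}`;
}
```

In an Express-style handler this would typically be passed to `res.setHeader("Content-Disposition", contentDisposition(name))` before streaming the workbook.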

API Integration Fixes (Phase 1-5):
- Fix API response parsing (return result.data consistently)
- Fix field name mismatch (fileKey -> sourceFileKey)
- Fix Excel parsing bug (range:99 -> slice(0,100))
- Add file upload with Excel parsing (columns, totalRows)
- Add detailed error logging for debugging

LLM Integration Fixes:
- Fix LLM call method: LLMFactory.createLLM -> getAdapter
- Fix adapter interface: generateText -> chat([messages])
- Fix response fields: text -> content, tokensUsed -> usage.totalTokens
- Fix model names: qwen-max -> qwen3-72b

React Infinite Loop Fixes:
- Step2: Remove updateState from useEffect deps
- Step3: Add useRef to prevent Strict Mode double execution
- Step3: Clear interval on API failure (max 3 retries)
- Step4: Add useRef to prevent infinite data loading
- Add cleanup functions to all useEffect hooks
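The polling fix in Step3 can be sketched without React as a small state transition plus a timer that returns its own cleanup function. This is an illustration under stated assumptions: "max 3 retries" is read as three consecutive failures, and `nextPollState`, `startPolling`, and `fetchProgress` are hypothetical names that do not come from the repository.

```typescript
interface PollState {
  failures: number; // consecutive failed requests so far
  done: boolean;    // true once polling should stop
}

type PollOutcome = { ok: true; progress: number } | { ok: false };

// Pure transition: decides whether polling should continue after one tick.
function nextPollState(
  state: PollState,
  outcome: PollOutcome,
  maxFailures = 3,
): PollState {
  if (state.done) return state;
  if (!outcome.ok) {
    const failures = state.failures + 1;
    return { failures, done: failures >= maxFailures };
  }
  // A successful tick resets the failure counter; 100% means finished.
  return { failures: 0, done: outcome.progress >= 100 };
}

// Starts the interval and returns a cleanup function, which is the shape
// a useEffect callback is expected to return.
function startPolling(
  fetchProgress: () => Promise<number>, // hypothetical API call, 0..100
  onProgress: (pct: number) => void,
  intervalMs = 2000,
): () => void {
  let state: PollState = { failures: 0, done: false };
  const timer = setInterval(async () => {
    try {
      const progress = await fetchProgress();
      onProgress(progress);
      state = nextPollState(state, { ok: true, progress });
    } catch {
      state = nextPollState(state, { ok: false });
    }
    if (state.done) clearInterval(timer); // stop on completion or too many failures
  }, intervalMs);
  return () => clearInterval(timer);
}
```

Keeping the stop decision in a pure function makes the retry limit testable without timers; inside a component, a `useRef` flag would additionally guard against the effect starting twice under Strict Mode.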

Frontend Enhancements:
- Add comprehensive error handling with user-friendly messages
- Remove debug console.logs (production ready)
- Fix TypeScript type definitions (TaskProgress, ExtractionItem)
- Improve Step4Verify data transformation logic

Backend Enhancements:
- Add detailed logging at each step for debugging
- Add parameter validation in controllers
- Improve error messages with stack traces (dev mode)
- Add export field ordering by template definition
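The export behaviour described in this commit (columns ordered by the template definition, with `finalResult` preferred and `resultA` as the fallback) can be sketched as follows. The record shape and the names `ExtractionRecord` and `toExportRow` are assumptions made for illustration; the actual service code is not shown in the commit message.

```typescript
type Row = Record<string, unknown>;

// Hypothetical shape of one stored extraction record.
interface ExtractionRecord {
  finalResult?: Row | null; // set after conflicts are resolved
  resultA?: Row | null;     // first model's raw output
}

// Returns cell values in template order; missing fields become "".
function toExportRow(
  record: ExtractionRecord,
  templateFields: string[],
): unknown[] {
  // Prefer the verified result, fall back to model A's output.
  const source: Row = record.finalResult ?? record.resultA ?? {};
  return templateFields.map((field) => source[field] ?? "");
}
```

Driving the column order from the template field list, rather than from the object's own keys, keeps every exported row aligned with the header row regardless of the order in which a model returned its fields.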

Documentation Updates:
- Update module status: Tool B MVP completed
- Create MVP completion summary (06-开发记录)
- Create technical debt document (07-技术债务)
- Update API documentation with test status
- Update database documentation with verified status
- Update system overview with DC module status
- Document 4 known issues (Excel preprocessing, progress display, etc.)

Testing Results:
- File upload: 9 rows parsed successfully
- Health check: Column validation working
- Dual model extraction: DeepSeek-V3 + Qwen-Max both working
- Processing time: ~49s for 9 records (~5s per record)
- Token usage: ~10k tokens total (~1.1k per record)
- Conflict detection: 1 clean, 8 conflicts (88.9% conflict rate)
- Excel export: Working with proper encoding
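As a rough illustration of how a conflict rate such as the 88.9% above can arise, the sketch below compares two models' outputs field by field and counts the records with at least one differing field. The comparison rule (trimmed, case-insensitive string equality) and the names `findConflicts` and `conflictRate` are assumptions; the commit does not describe the real detection logic.

```typescript
type Extraction = Record<string, string>;

// Returns the names of fields whose values differ between the two models.
function findConflicts(a: Extraction, b: Extraction): string[] {
  const fields = Array.from(
    new Set(Object.keys(a).concat(Object.keys(b))),
  );
  // Assumed normalization: ignore surrounding whitespace and letter case.
  const normalize = (v: string | undefined) => (v ?? "").trim().toLowerCase();
  return fields.filter((f) => normalize(a[f]) !== normalize(b[f]));
}

// Fraction of records that have at least one conflicting field.
function conflictRate(conflictsPerRecord: string[][]): number {
  if (conflictsPerRecord.length === 0) return 0;
  const conflicted = conflictsPerRecord.filter((c) => c.length > 0).length;
  return conflicted / conflictsPerRecord.length;
}
```

With 8 conflicted records out of 9, `conflictRate` returns 8/9, which rounds to the 88.9% reported. A rate that high may say as much about the strictness of the comparison as about the models themselves.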

Files Changed:
Backend (~500 lines):
- ExtractionController.ts: Add upload endpoint, improve logging
- DualModelExtractionService.ts: Fix LLM call methods, add detailed logs
- HealthCheckService.ts: Fix Excel range parsing
- routes/index.ts: Add upload route

Frontend (~200 lines):
- toolB.ts: Fix API response parsing, add error handling
- Step1Upload.tsx: Integrate upload and health check APIs
- Step2Schema.tsx: Fix infinite loop, load templates from API
- Step3Processing.tsx: Fix infinite loop, integrate progress polling
- Step4Verify.tsx: Fix infinite loop, transform backend data correctly
- Step5Result.tsx: Integrate export API
- index.tsx: Add file metadata to state

Scripts:
- check-task-progress.mjs: Database inspection utility

Docs (~8 files):
- 00-模块当前状态与开发指南.md: Update to v2.0
- API设计文档.md: Mark all endpoints as tested
- 数据库设计文档.md: Update verification status
- DC模块Tool-B开发计划.md: Add MVP completion notice
- DC模块Tool-B开发任务清单.md: Update progress to 100%
- Tool-B-MVP完成总结.md: New completion summary
- Tool-B技术债务清单.md: New technical debt document
- 00-系统当前状态与开发指南.md: Update DC module status

Status: Tool B MVP complete and production ready

2025-12-03 15:07:39 +08:00

_templates

feat(asl): Complete Day 5 - Fulltext Screening Backend API Development

2025-11-23 10:52:07 +08:00

00-系统总体设计

feat(dc): Complete Tool B MVP with full API integration and bug fixes

2025-12-03 15:07:39 +08:00

00-项目概述

feat(dc): Complete Phase 1 - Portal workbench page development

2025-12-02 21:53:24 +08:00

01-平台基础层

feat(asl): Complete Day 5 - Fulltext Screening Backend API Development

2025-11-23 10:52:07 +08:00

02-通用能力层

feat(asl): Complete Day 5 - Fulltext Screening Backend API Development

2025-11-23 10:52:07 +08:00

03-业务模块

feat(dc): Complete Tool B MVP with full API integration and bug fixes

2025-12-03 15:07:39 +08:00

04-开发规范

feat(dc): Complete Phase 1 - Portal workbench page development

2025-12-02 21:53:24 +08:00

05-部署文档

feat(asl): Complete Day 5 - Fulltext Screening Backend API Development

2025-11-23 10:52:07 +08:00

06-测试文档

feat(asl): Complete Day 5 - Fulltext Screening Backend API Development

2025-11-23 10:52:07 +08:00

07-运维文档

feat(asl): Complete Day 5 - Fulltext Screening Backend API Development

2025-11-23 10:52:07 +08:00

08-项目管理

feat(asl): Complete Day 5 - Fulltext Screening Backend API Development

2025-11-23 10:52:07 +08:00

09-架构实施

feat(asl): Complete Day 5 - Fulltext Screening Backend API Development

2025-11-23 10:52:07 +08:00

[AI对接] 项目状态与下一步指南.md

feat(asl): Complete Day 5 - Fulltext Screening Backend API Development

2025-11-23 10:52:07 +08:00

[完成] 文档重构总结报告.md

feat(asl): Complete Day 5 - Fulltext Screening Backend API Development

2025-11-23 10:52:07 +08:00

README.md

docs: complete documentation system (250+ files)

2025-11-16 15:43:55 +08:00

文档整理清单.md

docs: complete documentation system (250+ files)

2025-11-16 15:43:55 +08:00

README.md

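📚 壹证循 AI Research Platform - Documentation Center

Document version: v3.0
Last updated: 2025-11-06
Documentation system: three-layer architecture + pyramid-style quick context

🚀 Quick Start

I am a new AI instance in my first conversation

Required reading: 00-系统总体设计/[AI对接] 快速上下文
Reading time: 2 minutes | Token cost: ~800 tokens
Value: a quick overview of the whole project, the 8 business modules, the tech stack, and current progress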

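I want to develop a specific module

Module	Quick context	Reading time	Token cost
ASL - AI Intelligent Literature ⭐ P0	ASL快速上下文	5 minutes	~2000
LLM Gateway ⭐ P0	LLM网关快速上下文	5 minutes	~2000
ADMIN - Operations Admin Console P1	ADMIN快速上下文	5 minutes	~2000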

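I want to understand the overall architecture

Document	Description
系统架构分层设计	Three-layer architecture, 8 business modules
架构设计全景图	Visual architecture diagram
数据库架构说明	PostgreSQL + schema isolation

📂 Documentation Directory Structure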

docs/
├── 📖 00-系统总体设计/              # Overall architecture, technical decisions
│   ├── [AI对接] 快速上下文.md       ⭐ L0 overview (required reading)
│   ├── 01-系统架构分层设计.md
│   ├── 02-文档体系重构方案.md
│   ├── 03-数据库架构说明.md
│   ├── 08-架构设计全景图.md
│   └── README.md
│
├── 📖 01-平台基础层/                # Platform infrastructure (5 modules)
│   ├── [AI对接] 平台层快速上下文.md  ⭐ L1 layer level
│   ├── 01-用户与权限中心(UAM)/
│   ├── 02-存储服务/
│   ├── 03-通知服务/
│   ├── 04-监控与日志/
│   ├── 05-系统配置/
│   └── README.md
│
├── 📖 02-通用能力层/                # Capabilities shared across modules (5 capabilities)
│   ├── [AI对接] 通用能力快速上下文.md ⭐ L1 layer level
│   ├── 01-LLM大模型网关/            ⭐ P0 priority
│   │   ├── [AI对接] LLM网关快速上下文.md ⭐ L2 module level
│   │   └── README.md
│   ├── 02-文档处理引擎/
│   ├── 03-RAG引擎/
│   ├── 04-数据ETL引擎/
│   ├── 05-医学NLP引擎/
│   └── README.md
│
├── 📖 03-业务模块/                  # Business feature modules (8 modules)
│   ├── [AI对接] 业务模块快速上下文.md ⭐ L1 layer level
│   ├── ASL-AI智能文献/              ⭐ P0 priority
│   │   ├── [AI对接] ASL快速上下文.md ⭐ L2 module level
│   │   └── README.md
│   ├── ADMIN-运营管理端/            ⭐ P1 priority
│   │   ├── [AI对接] ADMIN快速上下文.md ⭐ L2 module level
│   │   └── README.md
│   ├── AIA-AI智能问答/              ✅ Completed
│   ├── PKB-个人知识库/              ✅ Completed
│   ├── RVW-稿件审查系统/            ✅ Completed
│   ├── DC-数据清洗整理/
│   ├── SSA-智能统计分析/
│   ├── ST-统计分析工具/
│   └── README.md
│
├── 📖 04-开发规范/                  # Coding standards, API standards
│   ├── 01-数据库设计规范.md
│   ├── 02-API设计规范.md
│   ├── 03-代码规范.md
│   ├── 04-Git提交规范.md
│   └── README.md
│
├── 📖 05-部署文档/                  # 4 deployment modes
│   ├── 01-云端SaaS部署/
│   ├── 02-独立产品包部署/
│   ├── 03-Electron单机版/
│   ├── 04-私有化部署/
│   └── README.md
│
├── 📖 06-测试文档/                  # Test plans, test cases
│   └── README.md
│
├── 📖 07-运维文档/                  # Operations manual, troubleshooting
│   └── README.md
│
├── 📖 08-项目管理/                  # Development plans, daily progress
│   ├── 01-整体开发计划/
│   ├── 02-里程碑规划/
│   └── 03-每日进度/
│
├── 📖 09-架构实施/                  # Architecture evolution, refactoring plans
│   └── README.md
│
├── 📖 _templates/                   # Document templates
│   ├── [AI对接] 快速上下文-模板.md
│   ├── 模块README-模板.md
│   └── README.md
│
└── 📖 00-项目概述/                  # Legacy documents (to be organized)
    ├── 现有系统技术摸底报告.md
    ├── 壹证循科技 AI科研产品需求文档.md
    ├── 壹证循科技AI科研产品 - 技术架构白皮书.md
    └── AI智能文献PRD（1-3）.md

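🎯 Quick Context System (Pyramid Style)

L0 - Overview (800 tokens, 2 minutes)

Goal: let a new AI instance quickly grasp the whole project
Document: 00-系统总体设计/[AI对接] 快速上下文
Contains: the 8 business modules, tech stack, current progress, dependencies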

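L1 - Layer level (1500 tokens per document, 3 minutes)

Goal: understand all modules/capabilities within one layer
Documents:
01-平台基础层/[AI对接] 平台层快速上下文
02-通用能力层/[AI对接] 通用能力快速上下文
03-业务模块/[AI对接] 业务模块快速上下文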

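L2 - Module level (2000 tokens per document, 5 minutes)

Goal: understand the implementation details of a specific module
Documents:
02-通用能力层/01-LLM大模型网关/[AI对接] LLM网关快速上下文
03-业务模块/ASL-AI智能文献/[AI对接] ASL快速上下文
03-业务模块/ADMIN-运营管理端/[AI对接] ADMIN快速上下文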

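📊 Project Status at a Glance

Current stage

🏗️ Architecture design complete, documentation restructuring complete, preparing for ASL module development

Completed modules (3)

Module	Status	Description
AIA - AI Intelligent Q&A	✅	12 agents, multi-turn dialogue
PKB - Personal Knowledge Base	✅	RAG Q&A, smart citations
RVW - Manuscript Review System	✅	Standalone system that can be sold separately

Next up for development (P0)

Module	Priority	Estimated time	Prerequisite
LLM Gateway	P0	3 days	None
ASL - AI Intelligent Literature	P0	3 weeks	LLM Gateway complete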

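🔍 Common Scenario Navigation

Scenario 1: Develop the ASL module (title/abstract initial screening)

Read: ASL快速上下文 (5 minutes)
Read: LLM网关快速上下文 (5 minutes)
Reference: the detailed PRD and technical design documents under the AI智能文献/ directory
Start development

Token cost: ~4000 tokens | Time: 10-15 minutes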

Scenario 2: Implement the LLM gateway

Read: LLM网关快速上下文 (5 minutes)
Read: 平台层快速上下文, to understand Feature Flags (3 minutes)
Reference: API设计规范, 数据库设计规范
Start development

Token cost: ~3500 tokens | Time: 8-10 minutes

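Scenario 3: Understand the overall architecture design

Read: 00-系统总体设计/[AI对接] 快速上下文 (2 minutes)
Read: 系统架构分层设计 (10 minutes)
Read: 架构设计全景图 (5 minutes)
Dig deeper into specific layers as needed

Token cost: ~5000 tokens | Time: 15-20 minutes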

Scenario 4: Modify the database

Read: 数据库架构说明
Reference: 04-开发规范/01-数据库设计规范
Find the database design document for the relevant module
Start making changes

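📝 Documentation Maintenance Rules

1. Quick context documents

✅ Must stay within the specified token range
✅ Include key information and navigation links
✅ Update promptly after every architecture change

2. Detailed design documents

✅ Every module must have a README.md
✅ Complex modules should be split into multiple sub-documents
✅ Use the unified document templates

3. Code comments

✅ Core logic must be commented
✅ Complex algorithms must be documented
✅ API endpoints must have JSDoc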

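⚠️ Important Reminders

Document priority

⭐⭐⭐⭐⭐ Quick context (L0, L1, L2) - must be kept up to date
⭐⭐⭐⭐ Architecture design documents - must be updated on significant changes
⭐⭐⭐ Module README - required for every module
⭐⭐ Detailed design documents - needed for complex modules
⭐ Development log - records key decisions

Token cost optimization

Traditional approach: read all documents → ~30,000 tokens → 1 hour+
Quick context: L0+L1+L2 → ~5,000 tokens → 10-15 minutes
Savings: 83% of tokens + 75% of time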
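The figures above can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in this README (the per-level token budgets, the ~30,000 and ~5,000 token totals, and 60 versus 15 minutes, which assumes the upper end of the 10-15 minute range); the names `TOKEN_BUDGET`, `planTokens`, and `savingsPercent` are illustrative.

```typescript
type Level = "L0" | "L1" | "L2";

// Per-document token budgets as stated for each pyramid level.
const TOKEN_BUDGET: Record<Level, number> = { L0: 800, L1: 1500, L2: 2000 };

// Total token cost of a reading plan made of quick-context documents.
function planTokens(levels: Level[]): number {
  return levels.reduce((sum, level) => sum + TOKEN_BUDGET[level], 0);
}

// Percentage saved when going from `before` to `after`, rounded.
function savingsPercent(before: number, after: number): number {
  return Math.round((1 - after / before) * 100);
}

const scenario1 = planTokens(["L2", "L2"]);       // ASL + LLM gateway
const scenario2 = planTokens(["L2", "L1"]);       // LLM gateway + platform layer
const tokenSavings = savingsPercent(30000, 5000); // tokens
const timeSavings = savingsPercent(60, 15);       // minutes
```

Here `scenario1` and `scenario2` evaluate to 4000 and 3500, matching the scenario totals in this README, and the two savings values come out at 83 and 75.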

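🔗 External Resources

Code repository: D:\MyCursor\AIclinicalresearch\
Database: PostgreSQL (Docker, localhost:5432)
Dify: http://localhost (standalone system)

📞 Contact

Technical architect: responsible for architecture design and documentation maintenance
Document version: v3.0
Last updated: 2025-11-06

🎯 Next action: start ASL module development + LLM gateway implementation (Week 2)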
