Files

HaHafeng 8a17369138 feat(dc): Complete Tool B MVP with full API integration and bug fixes

Phase 5: Export Feature
- Add Excel export API endpoint (GET /tasks/:id/export)
- Fix Content-Disposition header encoding for Chinese filenames
- Fix export field order to match template definition
- Export finalResult or resultA as fallback
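
The export fixes above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `buildContentDisposition`, `pickExportResult`, `orderByTemplate`, and the `FieldValues` type are hypothetical names. Only `finalResult` and `resultA` come from the commit message. The header uses the `filename*` parameter with percent-encoded UTF-8, a common way to make non-ASCII filenames survive.

```typescript
// Sketch only: names are illustrative, except finalResult and resultA,
// which appear in the commit message.
type FieldValues = { [field: string]: string | undefined };

// Build a Content-Disposition value that keeps non-ASCII (e.g. Chinese) filenames intact.
// Sends a percent-encoded UTF-8 `filename*` plus a plain ASCII `filename` fallback.
function buildContentDisposition(filename: string): string {
  // Replace anything outside printable ASCII, plus quote and backslash, with "_".
  const asciiFallback = filename.replace(/[^\x20-\x7E]|["\\]/g, "_");
  // encodeURIComponent leaves ' ( ) * unescaped; encode them as well.
  const encoded = encodeURIComponent(filename).replace(
    /['()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase()
  );
  return `attachment; filename="${asciiFallback}"; filename*=UTF-8''${encoded}`;
}

interface ExtractionRow {
  finalResult?: FieldValues | null;
  resultA?: FieldValues | null;
}

// Export the verified result when present, otherwise fall back to model A's output.
function pickExportResult(row: ExtractionRow): FieldValues {
  return row.finalResult ?? row.resultA ?? {};
}

// Emit values in template order rather than object key order.
function orderByTemplate(result: FieldValues, templateFields: string[]): string[] {
  return templateFields.map((field) => result[field] ?? "");
}
```

With a template of `["a", "b", "c"]`, a result containing only `b` and `a` still yields three cells in template order, with an empty string for the missing field.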

API Integration Fixes (Phase 1-5):
- Fix API response parsing (return result.data consistently)
- Fix field name mismatch (fileKey -> sourceFileKey)
- Fix Excel parsing bug (range:99 -> slice(0,100))
- Add file upload with Excel parsing (columns, totalRows)
- Add detailed error logging for debugging
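
The `range:99 -> slice(0,100)` fix reads as "take the first 100 rows" instead of passing a numeric range to the parser. The commit does not name the Excel library, so the sketch below works on rows that are already parsed. `summarizeRows`, `UploadSummary`, and `SAMPLE_LIMIT` are hypothetical names. For context: in SheetJS a numeric `range` option sets the starting row rather than a row limit. Whether that is the library involved here is an assumption.

```typescript
// Sketch only: operates on already-parsed rows; the Excel library call is left out.
type Row = Record<string, unknown>;

interface UploadSummary {
  columns: string[];
  totalRows: number;
  sampleRows: Row[];
}

const SAMPLE_LIMIT = 100; // mirrors slice(0, 100) from the commit message

function summarizeRows(rows: Row[]): UploadSummary {
  // Column names come from the first row; an empty sheet has no columns.
  const first = rows[0];
  const columns = first ? Object.keys(first) : [];
  return {
    columns,
    totalRows: rows.length, // count every row, not just the sample
    sampleRows: rows.slice(0, SAMPLE_LIMIT), // first 100 rows, starting at row 0
  };
}
```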

LLM Integration Fixes:
- Fix LLM call method: LLMFactory.createLLM -> getAdapter
- Fix adapter interface: generateText -> chat([messages])
- Fix response fields: text -> content, tokensUsed -> usage.totalTokens
- Fix model names: qwen-max -> qwen3-72b
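
A rough sketch of the adapter shape these fixes imply. The names `chat`, `content`, and `usage.totalTokens` are taken from the commit message. The interfaces, the message roles, and the `toExtractionOutput` and `extractWith` helpers are assumptions made for illustration and may not match the real adapter.

```typescript
// Sketch only: shapes inferred from the commit message; helpers are hypothetical.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatResponse {
  content: string;
  usage?: { totalTokens?: number };
}

interface LLMAdapter {
  chat(messages: ChatMessage[]): Promise<ChatResponse>;
}

interface ExtractionOutput {
  text: string;
  tokensUsed: number;
}

// Map the adapter's response fields (content, usage.totalTokens) onto stored fields.
function toExtractionOutput(response: ChatResponse): ExtractionOutput {
  return {
    text: response.content,
    tokensUsed: response.usage?.totalTokens ?? 0, // usage may be absent
  };
}

// Call an adapter with a message array instead of a bare prompt string.
async function extractWith(adapter: LLMAdapter, prompt: string): Promise<ExtractionOutput> {
  const response = await adapter.chat([{ role: "user", content: prompt }]);
  return toExtractionOutput(response);
}
```

Keeping the field mapping in one small function means a future change in the response shape only has to be fixed in one place.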

React Infinite Loop Fixes:
- Step2: Remove updateState from useEffect deps
- Step3: Add useRef to prevent Strict Mode double execution
- Step3: Clear interval on API failure (max 3 retries)
- Step4: Add useRef to prevent infinite data loading
- Add cleanup functions to all useEffect hooks
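
The Step3 polling rule (stop after repeated API failures) can be modeled without React as a small state transition. This is a sketch under assumptions: `PollState`, `PollOutcome`, and `nextPollState` are hypothetical names, and "max 3 retries" is read here as "stop on the third consecutive failure".

```typescript
// Sketch only: a framework-free model of the polling rule; names are hypothetical.
interface PollState {
  consecutiveFailures: number;
  stopped: boolean;
}

type PollOutcome = "ok" | "error" | "completed";

// Assumption: "max 3 retries" means stop on the 3rd consecutive failure.
const MAX_FAILURES = 3;

function nextPollState(state: PollState, outcome: PollOutcome): PollState {
  if (state.stopped) return state; // never restart once stopped
  if (outcome === "completed") return { consecutiveFailures: 0, stopped: true };
  if (outcome === "ok") return { consecutiveFailures: 0, stopped: false };
  const failures = state.consecutiveFailures + 1;
  return { consecutiveFailures: failures, stopped: failures >= MAX_FAILURES };
}
```

In a component, the interval callback would call `clearInterval` as soon as `stopped` becomes true, and the effect's cleanup function would clear the same interval on unmount.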

Frontend Enhancements:
- Add comprehensive error handling with user-friendly messages
- Remove debug console.logs (production ready)
- Fix TypeScript type definitions (TaskProgress, ExtractionItem)
- Improve Step4Verify data transformation logic

Backend Enhancements:
- Add detailed logging at each step for debugging
- Add parameter validation in controllers
- Improve error messages with stack traces (dev mode)
- Add export field ordering by template definition

Documentation Updates:
- Update module status: Tool B MVP completed
- Create MVP completion summary (06-开发记录)
- Create technical debt document (07-技术债务)
- Update API documentation with test status
- Update database documentation with verified status
- Update system overview with DC module status
- Document 4 known issues (Excel preprocessing, progress display, etc.)

Testing Results:
- File upload: 9 rows parsed successfully
- Health check: Column validation working
- Dual model extraction: DeepSeek-V3 + Qwen-Max both working
- Processing time: ~49s for 9 records (~5s per record)
- Token usage: ~10k tokens total (~1.1k per record)
- Conflict detection: 1 clean, 8 conflicts (88.9% conflict rate)
- Excel export: Working with proper encoding
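
For reference, the conflict figures above follow from comparing the two models' outputs record by record. The sketch below is a guess at the general approach, not the service's actual logic: `resultB`, `conflictingFields`, `summarizeConflicts`, and the trim/lowercase normalization are all assumptions.

```typescript
// Sketch only: the comparison rule (trim + lowercase) and all names except resultA are assumptions.
type Extraction = Record<string, string | null | undefined>;

function normalize(value: string | null | undefined): string {
  return (value ?? "").trim().toLowerCase();
}

// Fields on which the two models disagree after normalization.
function conflictingFields(a: Extraction, b: Extraction): string[] {
  const fields = Array.from(new Set(Object.keys(a).concat(Object.keys(b))));
  return fields.filter((f) => normalize(a[f]) !== normalize(b[f]));
}

interface DualResult {
  resultA: Extraction;
  resultB: Extraction;
}

// A record is a conflict if at least one field differs.
function summarizeConflicts(records: DualResult[]): { clean: number; conflicts: number; conflictRate: number } {
  const conflicts = records.filter(
    (r) => conflictingFields(r.resultA, r.resultB).length > 0
  ).length;
  const clean = records.length - conflicts;
  const conflictRate = records.length === 0 ? 0 : conflicts / records.length;
  return { clean, conflicts, conflictRate };
}
```

With 9 records of which 8 differ in at least one field, `conflictRate` is 8/9, which rounds to the 88.9% reported above.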

Files Changed:
Backend (~500 lines):
- ExtractionController.ts: Add upload endpoint, improve logging
- DualModelExtractionService.ts: Fix LLM call methods, add detailed logs
- HealthCheckService.ts: Fix Excel range parsing
- routes/index.ts: Add upload route

Frontend (~200 lines):
- toolB.ts: Fix API response parsing, add error handling
- Step1Upload.tsx: Integrate upload and health check APIs
- Step2Schema.tsx: Fix infinite loop, load templates from API
- Step3Processing.tsx: Fix infinite loop, integrate progress polling
- Step4Verify.tsx: Fix infinite loop, transform backend data correctly
- Step5Result.tsx: Integrate export API
- index.tsx: Add file metadata to state

Scripts:
- check-task-progress.mjs: Database inspection utility

Docs (~8 files):
- 00-模块当前状态与开发指南.md: Update to v2.0
- API设计文档.md: Mark all endpoints as tested
- 数据库设计文档.md: Update verification status
- DC模块Tool-B开发计划.md: Add MVP completion notice
- DC模块Tool-B开发任务清单.md: Update progress to 100%
- Tool-B-MVP完成总结.md: New completion summary
- Tool-B技术债务清单.md: New technical debt document
- 00-系统当前状态与开发指南.md: Update DC module status

Status: Tool B MVP complete and production ready

2025-12-03 15:07:39 +08:00

File	Last commit	Last updated
[AI对接] 快速上下文.md	docs: Update architecture docs with platform infrastructure details	2025-11-17 08:36:10 +08:00
[重要] 2025-11-06 架构设计完成报告.md	feat(asl): Complete Day 5 - Fulltext Screening Backend API Development	2025-11-23 10:52:07 +08:00
00-今日架构设计总结.md	feat(asl): Complete Day 5 - Fulltext Screening Backend API Development	2025-11-23 10:52:07 +08:00
00-核心问题解答.md	feat(asl): Complete Day 5 - Fulltext Screening Backend API Development	2025-11-23 10:52:07 +08:00
00-系统当前状态与开发指南.md	feat(dc): Complete Tool B MVP with full API integration and bug fixes	2025-12-03 15:07:39 +08:00
00-阅读指南.md	feat(asl): Complete Day 5 - Fulltext Screening Backend API Development	2025-11-23 10:52:07 +08:00
01-系统架构分层设计.md	docs: Update architecture docs with platform infrastructure details	2025-11-17 08:36:10 +08:00
02-文档体系重构方案.md	docs: complete documentation system (250+ files)	2025-11-16 15:43:55 +08:00
03-数据库架构说明.md	feat(asl): Complete Day 5 - Fulltext Screening Backend API Development	2025-11-23 10:52:07 +08:00
04-运营管理端架构设计.md	feat(asl): Complete Day 5 - Fulltext Screening Backend API Development	2025-11-23 10:52:07 +08:00
05-Schema隔离方案与成本分析.md	feat(asl): Complete Day 5 - Fulltext Screening Backend API Development	2025-11-23 10:52:07 +08:00
06-模块独立部署与单机版方案.md	feat(asl): Complete Day 5 - Fulltext Screening Backend API Development	2025-11-23 10:52:07 +08:00
07-Monorepo架构评估.md	feat(asl): Complete Day 5 - Fulltext Screening Backend API Development	2025-11-23 10:52:07 +08:00
08-架构设计全景图.md	feat(asl): Complete Day 5 - Fulltext Screening Backend API Development	2025-11-23 10:52:07 +08:00
09-总体需求文档(PRD).md	feat(asl): Complete Day 5 - Fulltext Screening Backend API Development	2025-11-23 10:52:07 +08:00
10-核心业务规则总览.md	feat(asl): Complete Day 5 - Fulltext Screening Backend API Development	2025-11-23 10:52:07 +08:00
99-下一步行动决策建议.md	feat(asl): Complete Day 5 - Fulltext Screening Backend API Development	2025-11-23 10:52:07 +08:00
README.md	docs: complete documentation system (250+ files)	2025-11-16 15:43:55 +08:00
前后端模块化架构设计-V2.md	refactor(asl): ASL frontend architecture refactoring with left navigation	2025-11-18 21:51:51 +08:00

README.md

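System Overall Design

About this directory: this directory contains the overall system design documents for the 壹证循 AI research platform.
Document level: platform level (overall)
Intended readers: technical architects, product managers, project leads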

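📚 Document Navigation

📌 Quick Navigation

🎉 Important: 2025-11-06 架构设计完成报告 (architecture design completion report) - today's major milestone!

👉 First-time readers: 00-阅读指南 (reading guide) - how to read these documents

👉 Need to make a decision: 99-下一步行动决策建议 (next-step decision recommendations) - comparison of 3 options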

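Core Documents

Document	Description	Status	Priority
00-阅读指南	How to read these documents	✅ Done	⭐⭐⭐ Required on first read
00-今日架构设计总结	Work summary for 2025-11-06	✅ Done	⭐⭐ Recommended
99-下一步行动决策建议	Comparison of 3 options plus a recommendation	✅ Done	⭐⭐⭐ Required for decisions
00-核心问题解答	Answers to key architecture questions	✅ Done	P0
01-系统架构分层设计	Three-layer architecture design (platform, capability, business)	✅ Done	P0
02-文档体系重构方案	Documentation restructuring plan v2.0	✅ Done	P0
03-数据库架构说明	PostgreSQL Docker deployment notes	✅ Done	P0
04-运营管理端架构设计	Design of 15 functional modules	✅ Done	P0
05-Schema隔离方案与成本分析	Logical vs. physical isolation	✅ Done	P0
06-模块独立部署与单机版方案	Full packaging + Electron approach	✅ Done	P0
07-Monorepo架构评估	Whether a monorepo is needed at the current stage	✅ Done	P0
08-架构设计全景图	The whole system architecture in one diagram	✅ Done	⭐ Recommended
09-总体需求文档(PRD).md	Overall product requirements	⏳ To be migrated	P1
10-技术架构白皮书.md	Technical architecture overview	⏳ To be migrated	P1
11-商业模式设计.md	Business model and pricing	⏳ To be created	P2
12-版本规划.md	Version roadmap	⏳ To be created	P2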

🎯 Core Content Summary

📌 Quick Start

If this is your first read, start with:

⭐ 今日架构设计总结 - a quick look at today's results
⭐ 架构设计全景图 - the whole system in one diagram

1. System Architecture Layers

Three-layer architecture + 8 business modules:

┌──────────────────────────────────────────────────┐
│     Business module layer (8 modules)            │
│  AIA | ASL | PKB | DC | SSA | ST | RVW | ADMIN   │
└──────────────────────────────────────────────────┘
                    ↓ depends on
┌──────────────────────────────────────────────────┐
│     Shared capability layer (5 capabilities)     │
│  LLM gateway (71%) | Document processing (86%)   │
│  RAG (43%) | ETL (29%) | Medical NLP (14%)       │
└──────────────────────────────────────────────────┘
                    ↓ depends on
┌──────────────────────────────────────────────────┐
│     Platform foundation layer                    │
│  Users & permissions | Storage | Notifications   │
│  Monitoring | Configuration                      │
└──────────────────────────────────────────────────┘

See: 01-系统架构分层设计.md

2. Documentation Structure

New documentation layout:

docs/
  ├── 00-系统总体设计/          # Overall system level
  ├── 01-平台基础层/            # Platform layer (UAM, storage, notifications, etc.)
  ├── 02-通用能力层/            # Shared capabilities (LLM gateway, document processing, etc.)
  ├── 03-业务模块/              # 7 independent business modules
  │   ├── AIA-AI智能问答/
  │   ├── ASL-AI智能文献/
  │   ├── PKB-个人知识库/
  │   ├── DC-数据清洗整理/
  │   ├── SSA-智能统计分析/
  │   ├── ST-统计分析工具/
  │   └── RVW-稿件审查系统/
  ├── 04-开发规范/
  ├── 05-部署文档/
  ├── 06-测试文档/
  ├── 07-运维文档/
  └── 08-项目管理/

See: 02-文档体系重构方案.md

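3. Key Decisions

Deployment modes (4)

✅ Cloud SaaS edition (P0, current)
✅ Standalone product packages (P1, phase two) - supports selling modules individually
✅ Electron desktop edition (P2, phase two) - 85%+ code reuse
✅ Private (on-premises) deployment (P1, phase two)
❌ ~~Hybrid deployment~~ (not under consideration)

Module breakdown (8 business modules)

AIA - AI Q&A ✅ Done
ASL - AI literature ⏳ Next priority
PKB - Personal knowledge base ✅ Done
DC - Data cleaning and organization ⏳ Planned
SSA - Intelligent statistical analysis ⏳ Planned
ST - Statistical analysis tools ⏳ Planned
RVW - Manuscript review system ⚡ Standalone system
ADMIN - Operations admin console ⭐ New in v2.0

Core capabilities (5 shared capabilities)

LLM gateway ⭐ Highest priority; 5 modules depend on it (71% reuse rate)
Document processing engine ✅ Implemented; 6 modules depend on it (86% reuse rate)
RAG engine ✅ Implemented; 3 modules depend on it (43% reuse rate)
ETL engine ⏳ Not yet implemented; 2 modules depend on it (29% reuse rate)
Medical NLP ⏳ Not yet implemented; 1 module depends on it (14% reuse rate)

Technical refactoring decisions

Schema isolation - Recommendation: do it now (1 week) rather than later (3-5 weeks) ⭐⭐⭐⭐⭐
Monorepo conversion - Recommendation: do it now (2-3 days) rather than later (7-11 days) ⭐⭐⭐⭐⭐

See: 00-核心问题解答.md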

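🚀 Architecture Evolution Path

Phase 1: Modular monolith (now - 6 months)

Goal: MVP of the cloud SaaS edition

Key rules:

✅ Organize code strictly by module
✅ Prefix database tables with the module name (logical isolation)
✅ No direct imports between modules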
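
The "no direct imports between modules" rule could be checked mechanically. The sketch below assumes a hypothetical layout in which each business module lives under `modules/<name>/`. That layout and the function names are illustrative assumptions, not something the README specifies.

```typescript
// Sketch only: assumes a hypothetical `modules/<name>/...` directory layout.
function moduleOf(path: string): string | null {
  const match = /(?:^|\/)modules\/([^/]+)\//.exec(path);
  return match?.[1] ?? null;
}

// True when a file in one business module imports a file from another one.
// Files outside `modules/` (e.g. shared capabilities) are never flagged.
function isCrossModuleImport(fromFile: string, importedFile: string): boolean {
  const from = moduleOf(fromFile);
  const to = moduleOf(importedFile);
  return from !== null && to !== null && from !== to;
}
```

For example, an import from `src/modules/asl/service.ts` to `src/modules/dc/utils.ts` would be flagged, while an import of `src/common/llm/gateway.ts` would not.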

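Build first:

ASL (AI literature)
DC (data cleaning)
LLM gateway
Schema isolation

Phase 2: First split (6-18 months)

Triggers:

A customer requires private deployment
A customer requires the desktop edition
A module needs to be sold on its own

Architecture changes:

Introduce an API gateway
Introduce K8s (optional)
Split RVW (review system) into a separate service

Phase 3: Full microservices (18+ months)

Goal: every module deploys independently and can be combined flexibly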

📊 Key Metrics

Module reuse analysis

Shared capability	Usage rate	Modules using it	Priority
LLM gateway	71%	5/7	P0
Document processing	86%	6/7	P0
RAG engine	43%	3/7	P1
ETL engine	29%	2/7	P2
Medical NLP	14%	1/7	P2
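
The percentages in this table are consistent with dividing the number of dependent modules by 7 and rounding to the nearest whole percent. A minimal check, with `reuseRatePercent` as a hypothetical helper name:

```typescript
// Sketch only: reproduces the table's arithmetic; the helper name is hypothetical.
const BUSINESS_MODULE_COUNT = 7; // denominator used in the table (x/7)

function reuseRatePercent(
  dependentModules: number,
  totalModules: number = BUSINESS_MODULE_COUNT
): number {
  return Math.round((dependentModules / totalModules) * 100);
}

// Dependent-module counts as listed in the table.
const dependents: Record<string, number> = {
  "LLM gateway": 5,
  "Document processing": 6,
  "RAG engine": 3,
  "ETL engine": 2,
  "Medical NLP": 1,
};

for (const name of Object.keys(dependents)) {
  console.log(name, reuseRatePercent(dependents[name] ?? 0) + "%");
}
```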

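Module independence analysis

Module	Independence	Commercial value	Sellable on its own
RVW (review)	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐	✅ Yes
ASL (literature)	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐	✅ Yes
DC (data cleaning)	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐	✅ Yes
SSA (statistical analysis)	⭐⭐⭐⭐	⭐⭐⭐⭐⭐	⚠️ Works together with ST
ST (analysis tools)	⭐⭐⭐⭐	⭐⭐⭐⭐	⚠️ Works together with SSA
AIA (AI Q&A)	⭐⭐⭐	⭐⭐⭐⭐	⚠️ Tied to PKB
PKB (knowledge base)	⭐⭐⭐	⭐⭐⭐	⚠️ Tied to AIA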

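✅ Current Task List

P0 tasks (done) ✅

System architecture layering design
Documentation restructuring plan v2.0
Answers to core questions
Database architecture notes
Operations admin console architecture design
Schema isolation plan and cost analysis
Independent module deployment and desktop edition plan
Monorepo architecture evaluation
Architecture overview diagram
Today's work summary

P1 tasks (pending decision) ⏳

Key decision points:

Do Schema isolation now? (Recommendation: yes, 1 week)
Convert to a monorepo now? (Recommendation: yes, 2-3 days)

If the architecture refactoring goes first (option B):

Week 1: Schema isolation + monorepo conversion (6 days)
Week 2+: ASL module development

If development starts immediately (option A):

Week 1+: ASL module development
Later: architecture refactoring (at higher cost)

P2 tasks (later)

Migrate the overall requirements document and the technical architecture white paper
Fill in the missing ASL module documents
Detailed design of the LLM gateway
Planning for RVW as a standalone system
Fill in detailed documentation for the operations admin console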

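📖 Related Documents

Platform foundation layer

01-平台基础层/ - user permissions, storage, notifications, etc.

Shared capability layer

02-通用能力层/ - LLM gateway, document processing, RAG, etc.

Business modules

03-业务模块/ - 7 independent business modules

Project management

08-项目管理/ - development plans, milestones, daily progress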

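🤝 Contributing

How to update the documents

Overall architecture changes: require team discussion; update the documents in this directory
Module design changes: update the documents in the corresponding module directory
Document format: follow Markdown conventions; include a table of contents, tables, and code blocks

Document review process

The technical architect reviews the overall documents
Module owners review module documents
Keep documents and code in sync on a regular basis

Last updated: 2025-11-06
Maintainer: technical architect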
