feat(ssa): Complete QPER architecture - Query, Planner, Execute, Reflection layers

Implement the full QPER intelligent analysis pipeline:

- Phase E+: Block-based standardization for all 7 R tools, DynamicReport renderer, Word export enhancement
- Phase Q: LLM intent parsing with dynamic Zod validation against real column names, ClarificationCard component, DataProfile is_id_like tagging
- Phase P: ConfigLoader with Zod schema validation and hot-reload API, DecisionTableService (4-dimension matching), FlowTemplateService with EPV protection, PlannedTrace audit output
- Phase R: ReflectionService with statistical slot injection, sensitivity analysis conflict rules, ConclusionReport with section reveal animation, conclusion caching API, graceful R error classification

End-to-end test: 40/40 passed across two complete analysis scenarios.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-21 18:15:53 +08:00
parent 428a22adf2
commit 371e1c069c
73 changed files with 9242 additions and 706 deletions
--- a/docs/03-业务模块/SSA-智能统计分析/06-开发记录/SSA-QPER架构开发总结-2026-02-21.md
+++ b/docs/03-业务模块/SSA-智能统计分析/06-开发记录/SSA-QPER架构开发总结-2026-02-21.md
@@ -0,0 +1,108 @@
+# SSA QPER Architecture Development Summary
+
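+> **Date:** 2026-02-21  
+> **Scope:** All four phases (E+ / Q / P / R) completed  
+> **Effort:** ~93.5h (within plan), spanning 2026-02-20 to 2026-02-21  
+> **Result:** The QPER intelligent-analysis main line is closed end to end; 40/40 end-to-end tests pass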
+
+---
+
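+## 1. Completion Overview
+
+| Phase | Name | Key deliverables | Status |
+|-------|------|------------------|--------|
+| **E+** | Block-based standardization | 7 R tools emit `report_blocks`; the frontend `DynamicReport` renders them dynamically; Word export | ✅ 100% |
+| **Q** | Query layer (LLM intent understanding) | `QueryService` + LLM intent parsing + dynamic Zod anti-hallucination validation + clarification card + DataProfile enhancements | ✅ 100% |
+| **P** | Planner layer (decision table + templates) | `ConfigLoader` + `DecisionTableService` + `FlowTemplateService` + `PlannedTrace` + hot-reload API | ✅ 100% |
+| **R** | Reflection layer (LLM conclusions) | `ReflectionService` + slot injection + Zod validation + sensitivity-analysis conflict rules + conclusion caching API + frontend fade-in animation | ✅ 100% |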
+
+---
+
+## 2. Key Files by Phase
+
+### Phase E+: Block-based standardization
+
+| File | Description |
+|------|-------------|
+| `r-statistics-service/tools/*.R` | All 7 R tools emit `report_blocks` |
+| `frontend-v2/.../DynamicReport.tsx` | Renders 4 block types (markdown/table/image/key_value) |
+| `frontend-v2/.../exportBlocksToWord.ts` | Block → Word export |
+
+### Phase Q: LLM intent understanding
+
+| File | Description |
+|------|-------------|
+| `backend/.../services/QueryService.ts` | LLM intent parsing + json-repair + dynamic Zod validation + regex fallback |
+| `backend/.../types/query.types.ts` | `ParsedQuery` / `ClarificationCard` interfaces + `createDynamicIntentSchema` |
+| `backend/scripts/seed-ssa-intent-prompt.ts` | Seed script for the `SSA_QUERY_INTENT` prompt |
+| `extraction_service/operations/data_profile.py` | Automatic `is_id_like` tagging of non-analytic variables |
+| `frontend-v2/.../ClarificationCard.tsx` | Closed-ended clarification card component |
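
The dynamic validation listed for `QueryService` can be pictured with a small, dependency-free sketch: a validator is built from the uploaded dataset's real column names and rejects any variable the LLM invented. The `Intent` shape, `createIntentValidator`, and the sample columns are hypothetical; the real `createDynamicIntentSchema` builds a Zod schema and may differ.

```typescript
// Hypothetical intent shape; field names are assumptions for illustration.
interface Intent {
  goal: string;
  outcome: string;
  predictors: string[];
}

type ValidationResult =
  | { ok: true; intent: Intent }
  | { ok: false; unknownVariables: string[] };

// Build a validator bound to the dataset's real column names, so any
// variable name the LLM made up is reported instead of silently accepted.
function createIntentValidator(columns: string[]): (intent: Intent) => ValidationResult {
  const known = new Set(columns);
  return (intent) => {
    const unknownVariables = [intent.outcome, ...intent.predictors].filter(
      (name) => !known.has(name),
    );
    return unknownVariables.length === 0
      ? { ok: true, intent }
      : { ok: false, unknownVariables };
  };
}

// Sample columns and intents are made up for the example.
const validate = createIntentValidator(["age", "group", "sbp"]);
const good = validate({ goal: "comparison", outcome: "sbp", predictors: ["group"] });
const bad = validate({ goal: "comparison", outcome: "blood_pressure", predictors: ["group"] });
console.log(good.ok, bad.ok);
```

In a pipeline like the one summarized here, a rejected intent would presumably lead to the regex fallback or a clarification card rather than continue to planning.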
+
+### Phase P: Decision table + flow templates
+
+| File | Description |
+|------|-------------|
+| `backend/.../config/ConfigLoader.ts` | Generic JSON loading + Zod validation + in-memory cache + hot reload |
+| `backend/.../config/decision_tables.json` | Four-dimensional matching rules (Goal×Y×X×Design) |
+| `backend/.../config/flow_templates.json` | 4+1 standard flow templates |
+| `backend/.../config/tools_registry.json` | R tool registry |
+| `backend/.../services/DecisionTableService.ts` | Rule-matching engine |
+| `backend/.../services/FlowTemplateService.ts` | Template selection + EPV protection |
+| `backend/.../services/WorkflowPlannerService.ts` | Core planning entry point + `PlannedTrace` output |
+| `backend/.../routes/config.routes.ts` | `POST /reload` hot-reload API |
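
As an illustration of four-dimensional matching, the sketch below picks the first rule whose Goal, Y, X, and Design fields all match the query, treating `*` as a wildcard. The rule shape, the wildcard convention, first-match-wins ordering, and the tool codes are all assumptions; the actual format of `decision_tables.json` is not shown in this summary.

```typescript
// Hypothetical rule shape; the real decision_tables.json may differ.
interface Rule {
  goal: string;
  yType: string;
  xType: string;
  design: string;
  tool: string;
}

type Query = Omit<Rule, "tool">;

const WILDCARD = "*";

// A rule field matches when it is the wildcard or equals the query value.
function fieldMatches(ruleValue: string, queryValue: string): boolean {
  return ruleValue === WILDCARD || ruleValue === queryValue;
}

// Return the first rule that matches on all four dimensions, or undefined.
function matchRule(rules: Rule[], query: Query): Rule | undefined {
  return rules.find(
    (rule) =>
      fieldMatches(rule.goal, query.goal) &&
      fieldMatches(rule.yType, query.yType) &&
      fieldMatches(rule.xType, query.xType) &&
      fieldMatches(rule.design, query.design),
  );
}

// Illustrative rules with made-up tool codes.
const rules: Rule[] = [
  { goal: "comparison", yType: "continuous", xType: "binary", design: "independent", tool: "T_TEST" },
  { goal: "comparison", yType: "continuous", xType: "binary", design: "paired", tool: "PAIRED_T_TEST" },
  { goal: "correlation", yType: "continuous", xType: "continuous", design: "*", tool: "CORRELATION" },
];

const matched = matchRule(rules, {
  goal: "correlation", yType: "continuous", xType: "continuous", design: "independent",
});
console.log(matched?.tool);
```

Returning `undefined` when nothing matches leaves room for a caller to apply a hard-coded default, in the spirit of the P-layer fallback mentioned in this summary.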
+
+### Phase R: Paper-grade LLM conclusions
+
+| File | Description |
+|------|-------------|
+| `backend/.../types/reflection.types.ts` | `ConclusionReport` / `LLMConclusionSchema` / `classifyRError` |
+| `backend/.../services/ReflectionService.ts` | LLM conclusion generation + slot injection + Zod validation + fallback |
+| `backend/scripts/seed-ssa-reflection-prompt.ts` | `SSA_REFLECTION` prompt (including sensitivity-analysis conflict rules) |
+| `backend/.../routes/workflow.routes.ts` | `GET /sessions/:id/conclusion` caching API |
+| `frontend-v2/.../ConclusionReport.tsx` | Per-section fade-in animation + one-click copy + source label |
+| `frontend-v2/.../exportBlocksToWord.ts` | Enhanced Word export (includes the LLM conclusion) |
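
Slot injection can be sketched as follows: the LLM writes prose containing named placeholders, and the service substitutes values taken from the R output. The `{{name}}` placeholder syntax, the `injectSlots` helper, and the sample values are assumptions for illustration, not the format `ReflectionService` actually uses.

```typescript
// Slot values as they might arrive from the execution layer, already formatted.
type Slots = Record<string, string>;

// Replace each {{name}} placeholder with the matching slot value.
// A placeholder with no value throws, so the final text can never contain
// a statistic that did not come from the supplied slots.
function injectSlots(template: string, slots: Slots): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, name: string) => {
    const value = slots[name];
    if (value === undefined) {
      throw new Error(`No value for slot "${name}"`);
    }
    return value;
  });
}

// The template text and the numbers below are made up for the example.
const llmText = "The groups differed significantly (p = {{p_value}}, d = {{effect_size}}).";
const filled = injectSlots(llmText, { p_value: "0.012", effect_size: "0.45" });
console.log(filled);
```

Throwing on an unknown slot is one possible design; a service could instead fall back to a rule-based conclusion when the LLM references a statistic that was never computed.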
+
+---
+
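+## 3. End-to-End Test Results
+
+**Test script:** `backend/scripts/test-ssa-qper-e2e.ts`  
+**How to run:** `npx tsx scripts/test-ssa-qper-e2e.ts`
+
+| Test item | Result |
+|-----------|--------|
+| Login and authentication | ✅ JWT token obtained |
+| Create session + upload CSV | ✅ 21 columns / 311 rows |
+| Data profile (Python) | ✅ Row and column counts correct |
+| **Q layer**: LLM intent | ✅ Goal=comparison, Confidence=0.95, variable names accurate |
+| **P layer**: Plan | ✅ 3-step flow, PlannedTrace complete |
+| **E layer**: R engine execution | ✅ 3/3 steps succeeded, 10 ReportBlocks |
+| **R layer**: LLM conclusion | ✅ source=llm, all 6 elements present (summary/findings/statistics/methods/limitations/recommendations) |
+| Conclusion API cache | ✅ Cache hit in 14ms |
+| Second pipeline (correlation analysis) | ✅ 2/2 steps succeeded, LLM conclusion correct |
+| Error classification check | ✅ Anomalous variable yields confidence=0.4, no crash |
+| **Total** | **40/40 passed, 0 failed** |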
+
+---
+
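+## 4. Architecture Highlights
+
+1. **Four-layer degradation scheme**: every layer has a fallback: Q→regex, P→hard-coded rules, R→rule engine, frontend→legacy components
+2. **Three-layer LLM defense**: `jsonrepair` → `JSON.parse` → `Zod Schema`; the Q and R layers share this pattern
+3. **Statistic slot injection**: the LLM is not permitted to generate numeric values; all p-values and effect sizes come from the R engine's actual output
+4. **Configuration-driven**: decision tables, flow templates, and the tool registry are all JSON, configurable by the methodology team and hot-reloaded via `POST /reload`
+5. **PlannedTrace audit**: the P layer only produces strategy ("if ... then ..."), the E layer's R engine executes and records the facts, and the R layer merges the two into the methods description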
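
The repair, parse, validate chain from the three-layer LLM defense could look roughly like the dependency-free sketch below. Here `repairJson` is a deliberately tiny stand-in for the `jsonrepair` package and the `isConclusion` type guard stands in for a Zod schema; both names, and the `Conclusion` shape, are hypothetical.

```typescript
// Hypothetical conclusion shape; the real LLMConclusionSchema may differ.
interface Conclusion {
  summary: string;
  findings: string[];
}

// Layer 1 stand-in: keep only the outermost {...} span and drop trailing
// commas. The jsonrepair package handles many more malformed cases.
function repairJson(raw: string): string {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  const body = start >= 0 && end > start ? raw.slice(start, end + 1) : raw;
  return body.replace(/,\s*([}\]])/g, "$1");
}

// Layer 3 stand-in: a type guard playing the role of a Zod schema.
function isConclusion(value: unknown): value is Conclusion {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.summary === "string" &&
    Array.isArray(candidate.findings) &&
    candidate.findings.every((item) => typeof item === "string")
  );
}

// Run the three layers in order; return null so the caller can fall back.
function parseLlmOutput(raw: string): Conclusion | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(repairJson(raw)); // layers 1 and 2
  } catch {
    return null;
  }
  return isConclusion(parsed) ? parsed : null; // layer 3
}

// Made-up LLM output with leading chatter and trailing commas.
const messy = 'Here is the result: {"summary": "Group B was higher", "findings": ["p < 0.05",],}';
console.log(parseLlmOutput(messy));
console.log(parseLlmOutput("not json at all"));
```

A `null` result is the point at which a layer would hand over to its fallback, such as regex parsing in the Q layer or the rule engine in the R layer.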
+
+---
+
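+## 5. Next Steps
+
+| Phase | Content | Estimated effort |
+|-------|---------|------------------|
+| **Phase Deploy** | Add the 4 missing atomic R tools (ANOVA/Fisher/Wilcoxon/linear regression) + the composite tool ST_BASELINE_TABLE + deployment | 37h |
+| **Phase Q+** | Variable data dictionary + variable selection confirmation panel (stronger human-in-the-loop collaboration) | 20h |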
+
+---
+
+**Document maintainer:** SSA architecture team  
+**Related document:** `04-开发计划/10-QPER架构开发计划-智能化主线.md`