feat(ssa): Complete Phase I-IV intelligent dialogue and tool system development

Phase I - Session Blackboard + READ Layer:
- SessionBlackboardService with Postgres-Only cache
- DataProfileService for data overview generation
- PicoInferenceService for LLM-driven PICO extraction
- Frontend DataContextCard and VariableDictionaryPanel
- E2E tests: 31/31 passed

Phase II - Conversation Layer LLM + Intent Router:
- ConversationService with SSE streaming
- IntentRouterService (rule-first + LLM fallback, 6 intents)
- SystemPromptService with 6-segment dynamic assembly
- TokenTruncationService for context management
- ChatHandlerService as unified chat entry
- Frontend SSAChatPane and useSSAChat hook
- E2E tests: 38/38 passed
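
The rule-first routing with LLM fallback listed for IntentRouterService can be sketched roughly as below. This is a minimal illustration, not the service's code: the intent labels, the regex rules, and the `llm_classify` callback are all hypothetical, since the commit message does not list the six intents.

```python
import re
from typing import Callable, List, Pattern, Tuple

# Hypothetical rules and intent labels (assumptions for illustration only).
RULES: List[Tuple[Pattern[str], str]] = [
    (re.compile(r"\b(t-test|anova|regression)\b", re.I), "analyze"),
    (re.compile(r"\b(which|what) (test|method)\b", re.I), "consult"),
]


def route_intent(message: str, llm_classify: Callable[[str], str]) -> Tuple[str, str]:
    """Return (intent, source): try cheap rules first, call the LLM only on a miss."""
    for pattern, intent in RULES:
        if pattern.search(message):
            return intent, "rule"
    # No rule matched, so defer to the (injected) LLM classifier.
    return llm_classify(message), "llm"
```

Passing the classifier in as a callback keeps the rule path testable without a model call.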

Phase III - Method Consultation + AskUser Standardization:
- ToolRegistryService with Repository Pattern
- MethodConsultService with DecisionTable + LLM enhancement
- AskUserService with global interrupt handling
- Frontend AskUserCard component
- E2E tests: 13/13 passed

Phase IV - Dialogue-Driven Analysis + QPER Integration:
- ToolOrchestratorService (plan/execute/report)
- analysis_plan SSE event for WorkflowPlan transmission
- Dual-channel confirmation (ask_user card + workspace button)
- PICO as optional hint for LLM parsing
- E2E tests: 25/25 passed

R Statistics Service:
- 5 new R tools: anova_one, baseline_table, fisher, linear_reg, wilcoxon
- Enhanced guardrails and block helpers
- Comprehensive test suite (run_all_tools_test.js)

Documentation:
- Updated system status document (v5.9)
- Updated SSA module status and development plan (v1.8)

Total E2E: 107/107 passed (Phase I: 31, Phase II: 38, Phase III: 13, Phase IV: 25)

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-22 18:53:39 +08:00
parent bf10dec4c8
commit 3446909ff7
68 changed files with 11583 additions and 412 deletions
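
The variable-detail endpoint added in this commit caps histogram bins (`max_bins`, default 30) and Q-Q points (`max_qq_points`, default 200). The body of `analyze_variable_detail` is not part of this excerpt, so the following is only a sketch of how such caps might work. The helper names, the Sturges bin rule, and the `(i + 0.5) / n` plotting positions are assumptions, not the committed implementation.

```python
import math
from statistics import NormalDist


def capped_histogram(values, max_bins=30):
    """Equal-width histogram with at most max_bins bins (hypothetical helper)."""
    if not values:
        raise ValueError("values must not be empty")
    n = len(values)
    # Sturges' rule proposes a bin count; the cap bounds the payload size.
    n_bins = max(1, min(max_bins, math.ceil(math.log2(n)) + 1))
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant data
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # clamp the maximum value
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return {"edges": edges, "counts": counts}


def capped_qq_points(values, max_qq_points=200):
    """Normal Q-Q pairs, thinned to at most max_qq_points (hypothetical helper)."""
    xs = sorted(values)
    n = len(xs)
    k = min(n, max_qq_points)
    # Evenly spaced ranks that always include the smallest and largest value.
    ranks = [0] if k == 1 else [(j * (n - 1)) // (k - 1) for j in range(k)]
    std_normal = NormalDist()
    return [
        {"theoretical": std_normal.inv_cdf((i + 0.5) / n), "sample": xs[i]}
        for i in ranks
    ]
```

Thinning by rank rather than by random sampling keeps both tails of the Q-Q plot, which is where departures from normality usually show.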


@@ -95,7 +95,7 @@ from operations.metric_time_transform import (
 )
 from operations.fillna import fillna_simple, fillna_mice, get_column_missing_stats
 # ✨ SSA Phase 2A: data profile
-from operations.data_profile import generate_data_profile, get_quality_score
+from operations.data_profile import generate_data_profile, get_quality_score, analyze_variable_detail
 # ==================== Pydantic Models ====================
@@ -248,6 +248,14 @@ class DataProfileCSVRequest(BaseModel):
     include_quality_score: bool = True
+class VariableDetailRequest(BaseModel):
+    """Single-variable detail request model (SSA Phase I)"""
+    csv_content: str
+    variable_name: str
+    max_bins: int = 30
+    max_qq_points: int = 200
 class FillnaSimpleRequest(BaseModel):
     """Simple imputation request model"""
     data: List[Dict[str, Any]]
@@ -2265,6 +2273,46 @@ async def ssa_data_profile_csv(request: DataProfileCSVRequest):
         }, status_code=400)
+# ==================== Single-variable detail API (Phase I) ====================
+@app.post("/api/ssa/variable-detail")
+async def ssa_variable_detail(request: VariableDetailRequest):
+    """
+    Detailed single-variable analysis (SSA Phase I)
+    Returns descriptive statistics, distribution histogram data, a normality test, and Q-Q plot data points for the specified variable.
+    Histogram bins are capped at max_bins (default 30, H2 guard); Q-Q points are capped at max_qq_points.
+    """
+    try:
+        import pandas as pd
+        import time
+        from io import StringIO
+        start_time = time.time()
+        df = pd.read_csv(StringIO(request.csv_content))
+        logger.info(f"[SSA] Single-variable detail analysis: {request.variable_name}")
+        detail = analyze_variable_detail(
+            df, request.variable_name,
+            max_bins=request.max_bins,
+            max_qq_points=request.max_qq_points
+        )
+        detail['execution_time'] = round(time.time() - start_time, 3)
+        status_code = 200 if detail.get('success') else 400
+        return JSONResponse(content=detail, status_code=status_code)
+    except Exception as e:
+        logger.error(f"[SSA] Single-variable detail analysis failed: {str(e)}")
+        return JSONResponse(content={
+            "success": False,
+            "error": str(e)
+        }, status_code=400)
 # ==================== Word export API ====================
 @app.get("/api/pandoc/status")