feat(ssa): Complete Phase I-IV intelligent dialogue and tool system development

Phase I - Session Blackboard + READ Layer:
- SessionBlackboardService with Postgres-Only cache
- DataProfileService for data overview generation
- PicoInferenceService for LLM-driven PICO extraction
- Frontend DataContextCard and VariableDictionaryPanel
- E2E tests: 31/31 passed

Phase II - Conversation Layer LLM + Intent Router:
- ConversationService with SSE streaming
- IntentRouterService (rule-first + LLM fallback, 6 intents)
- SystemPromptService with 6-segment dynamic assembly
- TokenTruncationService for context management
- ChatHandlerService as unified chat entry
- Frontend SSAChatPane and useSSAChat hook
- E2E tests: 38/38 passed

Phase III - Method Consultation + AskUser Standardization:
- ToolRegistryService with Repository Pattern
- MethodConsultService with DecisionTable + LLM enhancement
- AskUserService with global interrupt handling
- Frontend AskUserCard component
- E2E tests: 13/13 passed

Phase IV - Dialogue-Driven Analysis + QPER Integration:
- ToolOrchestratorService (plan/execute/report)
- analysis_plan SSE event for WorkflowPlan transmission
- Dual-channel confirmation (ask_user card + workspace button)
- PICO as optional hint for LLM parsing
- E2E tests: 25/25 passed

R Statistics Service:
- 5 new R tools: anova_one, baseline_table, fisher, linear_reg, wilcoxon
- Enhanced guardrails and block helpers
- Comprehensive test suite (run_all_tools_test.js)

Documentation:
- Updated system status document (v5.9)
- Updated SSA module status and development plan (v1.8)

Total E2E: 107/107 passed
(Phase I: 31, Phase II: 38, Phase III: 13, Phase IV: 25)

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-22 18:53:39 +08:00
parent bf10dec4c8
commit 3446909ff7
68 changed files with 11583 additions and 412 deletions
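The diff below adds a `POST /api/ssa/variable-detail` endpoint whose request carries `csv_content`, `variable_name`, `max_bins` (default 30) and `max_qq_points` (default 200). The following is a minimal, standard-library-only sketch of how a caller might build that request body and of what capping histogram bins and Q-Q points could look like. The helper names `build_variable_detail_payload`, `capped_histogram` and `capped_qq_points` are hypothetical, and the capping logic is an assumption for illustration; the actual `analyze_variable_detail` implementation is not shown in this section and may differ.

```python
import csv
import io
import statistics


def build_variable_detail_payload(rows, variable_name, max_bins=30, max_qq_points=200):
    """Serialize dict rows to CSV text and wrap them in the request body shape.

    Field names mirror VariableDetailRequest from the diff; this helper itself is hypothetical.
    """
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return {
        "csv_content": buf.getvalue(),
        "variable_name": variable_name,
        "max_bins": max_bins,
        "max_qq_points": max_qq_points,
    }


def capped_histogram(values, max_bins=30):
    """Equal-width histogram whose bin count never exceeds max_bins (assumed behavior)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: a single degenerate bin.
        return [lo, hi], [len(values)]
    n_bins = max(1, min(max_bins, len(set(values))))
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp so the maximum value falls into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


def capped_qq_points(values, max_qq_points=200):
    """Normal Q-Q pairs (theoretical, observed), thinned to at most max_qq_points."""
    ordered = sorted(values)
    n = len(ordered)
    normal = statistics.NormalDist()
    # Plotting positions (i + 0.5) / n keep probabilities strictly inside (0, 1).
    points = [(normal.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(ordered)]
    if n <= max_qq_points:
        return points
    if max_qq_points < 2:
        return points[:max_qq_points]
    # Evenly spaced indices that always include the first and last point.
    step = (n - 1) / (max_qq_points - 1)
    return [points[round(i * step)] for i in range(max_qq_points)]
```

A payload built this way could be sent as the JSON body of the request. Capping on the server side keeps the response size bounded regardless of how many rows the uploaded CSV contains.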
--- a/extraction_service/main.py
+++ b/extraction_service/main.py
@@ -95,7 +95,7 @@ from operations.metric_time_transform import (
 )
 from operations.fillna import fillna_simple, fillna_mice, get_column_missing_stats
 # ✨ SSA Phase 2A: data profile
-from operations.data_profile import generate_data_profile, get_quality_score
+from operations.data_profile import generate_data_profile, get_quality_score, analyze_variable_detail


 # ==================== Pydantic Models ====================
@@ -248,6 +248,14 @@ class DataProfileCSVRequest(BaseModel):
    include_quality_score: bool = True


+class VariableDetailRequest(BaseModel):
+    """Request model for single-variable detail (SSA Phase I)"""
+    csv_content: str
+    variable_name: str
+    max_bins: int = 30
+    max_qq_points: int = 200
+
+
 class FillnaSimpleRequest(BaseModel):
    """Request model for simple imputation"""
    data: List[Dict[str, Any]]
@@ -2265,6 +2273,46 @@ async def ssa_data_profile_csv(request: DataProfileCSVRequest):
        }, status_code=400)


+# ==================== Single-Variable Detail API (Phase I) ====================
+
+@app.post("/api/ssa/variable-detail")
+async def ssa_variable_detail(request: VariableDetailRequest):
+    """
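+    Detailed single-variable analysis (SSA Phase I)
+    
+    Returns descriptive statistics, histogram data, normality test results, and Q-Q plot points for the given variable.
+    Histogram bins are capped at max_bins (default 30, H2 guard); Q-Q points are capped at max_qq_points.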
+    """
+    try:
+        import pandas as pd
+        import time
+        from io import StringIO
+        
+        start_time = time.time()
+        
+        df = pd.read_csv(StringIO(request.csv_content))
+        
+        logger.info(f"[SSA] Single-variable detail analysis: {request.variable_name}")
+        
+        detail = analyze_variable_detail(
+            df, request.variable_name,
+            max_bins=request.max_bins,
+            max_qq_points=request.max_qq_points
+        )
+        
+        detail['execution_time'] = round(time.time() - start_time, 3)
+        
+        status_code = 200 if detail.get('success') else 400
+        return JSONResponse(content=detail, status_code=status_code)
+        
+    except Exception as e:
+        logger.error(f"[SSA] Single-variable detail analysis failed: {str(e)}")
+        return JSONResponse(content={
+            "success": False,
+            "error": str(e)
+        }, status_code=400)
+
+
 # ==================== Word Export API ====================

 @app.get("/api/pandoc/status")