feat(ssa): SSA Agent mode MVP - prompt management + Phase 5A guardrails + UX enhancements
Backend:
- Agent core prompts (Planner + Coder) now loaded from PromptService with 3-tier fallback (DB -> cache -> hardcoded)
- Seed script (seed-ssa-agent-prompts.ts) for idempotent SSA_AGENT_PLANNER + SSA_AGENT_CODER setup
- SSA fallback prompts added to prompt.fallbacks.ts
- Phase 5A: XML tag extraction, defensive programming prompt, high-fidelity schema injection, AST pre-check
- Default agent mode migration + session CRUD (rename/delete) APIs
- R Docker: structured error handling (20+ patterns) + AST syntax pre-check

Frontend:
- Default agent mode (QPER toggle removed), view code fix, analysis result cards in chat
- Session history sidebar with inline rename/delete, robust plan parsing from reviewResult
- R code export wrapper for local reproducibility (package checks + data loader + polyfills)
- SSA workspace CSS updates for sidebar actions and plan display

Docs:
- SSA module doc v4.2: Prompt inventory (2 Agent active / 11 QPER archived), dev progress updated
- System overview doc v6.8: SSA Agent MVP milestone
- Deployment checklist: DB-5 (seed script) + BE-10 (prompt management)

Made-with: Cursor
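The commit message mentions a 3-tier fallback (DB -> cache -> hardcoded) but this page does not show PromptService itself. The following is a minimal sketch of what such a lookup could look like. Every name in it (`resolvePrompt`, `DbLoader`, `promptCache`) is hypothetical, and treating the cache as "the last value successfully read from the database" is an assumption, not something the commit confirms.

```typescript
// Hypothetical sketch: none of these names come from the commit itself.
type PromptSource = 'db' | 'cache' | 'hardcoded';

interface ResolvedPrompt {
  content: string;
  source: PromptSource;
}

// Loader that may throw (database down) or return null (no ACTIVE version).
type DbLoader = (code: string) => Promise<string | null>;

// Assumed cache semantics: last content successfully read from the database.
const promptCache = new Map<string, string>();

async function resolvePrompt(
  code: string,
  loadFromDb: DbLoader,
  hardcoded: Record<string, string>,
): Promise<ResolvedPrompt> {
  // Tier 1: database. A successful read also refreshes the cache.
  try {
    const fromDb = await loadFromDb(code);
    if (fromDb !== null) {
      promptCache.set(code, fromDb);
      return { content: fromDb, source: 'db' };
    }
  } catch {
    // Database unavailable: fall through to the next tier.
  }

  // Tier 2: cached copy from an earlier successful database read.
  const cached = promptCache.get(code);
  if (cached !== undefined) {
    return { content: cached, source: 'cache' };
  }

  // Tier 3: fallback text compiled into the application.
  const fallback = hardcoded[code];
  if (fallback === undefined) {
    throw new Error(`No prompt available for code ${code}`);
  }
  return { content: fallback, source: 'hardcoded' };
}
```

Returning the `source` alongside the content is a design choice of this sketch: it lets callers log which tier served a prompt, which is useful when diagnosing why an edit made in the admin console has not taken effect.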
backend/prisma/seed-ssa-agent-prompts.ts
@@ -0,0 +1,238 @@
/**
 * SSA Agent prompt seed script
 *
 * Writes the system prompts of PlannerAgent / CoderAgent into prompt_templates + prompt_versions,
 * so that they can be edited online, previewed in a gray release, and version-managed from the operations admin console.
 *
 * Usage:
 *   npx tsx prisma/seed-ssa-agent-prompts.ts
 *
 * Idempotent by design: templates are upserted and version creation is skipped
 * when an ACTIVE version already exists, so the script is safe to run repeatedly.
 */

import { PrismaClient } from '@prisma/client';

const prisma = new PrismaClient();

/* ------------------------------------------------------------------ */
/*  Prompt content                                                    */
/* ------------------------------------------------------------------ */

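const SSA_AGENT_PLANNER_CONTENT = `You are a senior statistical analysis planner (Planner Agent). Your job is to draw up a rigorous statistical analysis plan based on the user's research needs and the characteristics of the data.

## Data context
{{{dataContext}}}

## Planning rules (hard rules)
1. You must state the study design type (cross-sectional / cohort / case-control / RCT / pre-post comparison, etc.)
2. You must specify the variable roles: outcome, predictors, grouping variable, and confounders
3. Every choice of statistical method must come with a justification (data type, distribution, sample size, etc.)
4. Consider normality for continuous variables: normal → parametric methods, non-normal → non-parametric methods
5. When the expected frequency of a categorical variable is < 5, choose Fisher's exact test rather than the chi-square test
6. Multivariable analyses must consider collinearity and EPV (Events Per Variable)
7. Never fabricate data or predict analysis results

## Output format
Output the analysis plan as JSON with the following structure:
\`\`\`json
{
  "title": "title of the analysis plan",
  "designType": "study design type",
  "variables": {
    "outcome": ["outcome variable name"],
    "predictors": ["predictor variable name"],
    "grouping": "grouping variable name or null",
    "confounders": ["confounder"]
  },
  "steps": [
    {
      "order": 1,
      "method": "name of the statistical method",
      "description": "what this step does",
      "rationale": "why this method was chosen"
    }
  ],
  "assumptions": ["statistical assumptions that need to be verified"]
}
\`\`\`

After the JSON code block, you may add further explanation in natural language.`;
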
const SSA_AGENT_CODER_CONTENT = `You are an expert in statistical programming with R (Coder Agent). Your job is to generate, from the analysis plan, R code that can be executed in the R Docker sandbox.

## Data context
{{{dataContext}}}

## R code conventions (hard rules)

### Data loading (important!)
The execution environment has **already loaded** the data into the variable \`df\` (a data.frame).
**Do not** call \`load_input_data()\` yourself; use \`df\` directly.

\`\`\`r
# df already exists; use it directly
str(df)  # inspect the structure
\`\`\`

### Output conventions
The code must end by returning a list that contains a report_blocks field:
\`\`\`r
# Build the blocks with the functions in block_helpers.R
blocks <- list()
blocks[[length(blocks) + 1]] <- make_markdown_block("## Analysis results\\n...")
blocks[[length(blocks) + 1]] <- make_table_block_from_df(result_df, title = "Table 1. Statistical results")
blocks[[length(blocks) + 1]] <- make_image_block(base64_data, title = "Figure 1. Visualization")
blocks[[length(blocks) + 1]] <- make_kv_block(list("P value" = "0.023", "Effect size" = "0.45"))

# Must be returned in this format
list(
  status = "success",
  method = "statistical method used",
  report_blocks = blocks
)
\`\`\`

### Available helper functions (provided by block_helpers.R)
- \`make_markdown_block(content, title)\`: Markdown text block
- \`make_table_block(headers, rows, title, footnote)\`: table block
- \`make_table_block_from_df(df_arg, title, footnote, digits)\`: table block built from a data.frame (note that the parameter name must not clash with the df variable)
- \`make_image_block(base64_data, title, alt)\`: image block
- \`make_kv_block(items, title)\`: key-value block

### Chart generation
\`\`\`r
library(base64enc)
tmp_file <- tempfile(fileext = ".png")
png(tmp_file, width = 800, height = 600, res = 120)
# ... plotting code ...
dev.off()
base64_data <- paste0("data:image/png;base64,", base64encode(tmp_file))
unlink(tmp_file)
\`\`\`

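### Preinstalled packages (only these packages may be used)
base, stats, utils, graphics, grDevices,
ggplot2, dplyr, tidyr, broom, gtsummary, gt, scales, gridExtra,
car, lmtest, survival, meta, base64enc, glue, jsonlite, cowplot

### Defensive programming (mandatory!)
1. **Factor conversion**: call as.factor() on grouping/categorical variables before use; never assume they are already factors
2. **Missing values**: statistical functions must pass na.rm = TRUE or be preceded by na.omit()
3. **Safe test wrapping**: every t.test / wilcox.test / chisq.test or similar test must be wrapped in tryCatch
4. **Sample size check**: before comparing groups, check that each group has n >= 2; otherwise skip the comparison and say so
5. **Column existence check**: before using a column, check it with if ("col" %in% names(df))
6. **Numeric safety**: check that the denominator != 0 before dividing, and filter Inf/NaN results with is.finite()
7. **Chart fault tolerance**: wrap plotting code in tryCatch; on failure, return a text note instead of crashing

### Prohibited
1. No install.packages(): only the preinstalled packages listed above may be used
2. No calls to load_input_data(): the data is already loaded into df
3. No external network access: no httr/curl requests
4. No reading or writing files outside the sandbox: use tempfile() only
5. No system() / shell() commands
6. No packages that are not installed, such as pROC, nortest, exact2x2
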
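## Output format (hard rules! Any violation counts as a failure)
1. **The complete R code must be placed between the <r_code> and </r_code> tags**
2. Outside the <r_code> tags, only a brief explanation (1-3 sentences) is allowed
3. Inside the <r_code> tags, **only pure R code is allowed**; never mix in explanatory prose or natural-language paragraphs
4. The code must be a directly executable R script, with no pseudocode or placeholders
5. The code must end by returning a list that contains report_blocks
6. Natural-language comments may only appear in the code after a leading #; natural-language text without a # is forbidden

Example output format:
Brief explanation...

<r_code>
library(ggplot2)
# Data processing
df$group <- as.factor(df$group)
# ... complete R code ...
list(status = "success", method = "t_test", report_blocks = blocks)
</r_code>`;
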
/* ------------------------------------------------------------------ */
/*  Seed logic                                                        */
/* ------------------------------------------------------------------ */

interface PromptSeed {
  code: string;
  name: string;
  description: string;
  variables: string[];
  content: string;
}

const PROMPTS: PromptSeed[] = [
  {
    code: 'SSA_AGENT_PLANNER',
    name: 'SSA Agent Planner system prompt',
    description: 'Intelligent statistical analysis: system prompt of the Planner Agent, which draws up the statistical analysis plan. Template variable: dataContext (data context)',
    variables: ['dataContext'],
    content: SSA_AGENT_PLANNER_CONTENT,
  },
  {
    code: 'SSA_AGENT_CODER',
    name: 'SSA Agent Coder system prompt',
    description: 'Intelligent statistical analysis: system prompt of the Coder Agent, which generates executable R code. Template variable: dataContext (data context)',
    variables: ['dataContext'],
    content: SSA_AGENT_CODER_CONTENT,
  },
];

async function seedSSAAgentPrompts() {
  console.log('🌱 Seeding SSA Agent prompts...\n');

  for (const p of PROMPTS) {
    // 1. Upsert the template: create it if the code is new, otherwise refresh its metadata
    const template = await prisma.prompt_templates.upsert({
      where: { code: p.code },
      update: {
        name: p.name,
        description: p.description,
        variables: p.variables,
      },
      create: {
        code: p.code,
        name: p.name,
        module: 'SSA',
        description: p.description,
        variables: p.variables,
      },
    });
    console.log(`  ✅ Template: ${p.code} (id=${template.id})`);

    // 2. Check if an ACTIVE version exists; if so, leave it untouched
    const existing = await prisma.prompt_versions.findFirst({
      where: { template_id: template.id, status: 'ACTIVE' },
    });

    if (existing) {
      console.log(`  ⏭  ACTIVE v${existing.version} already exists, skipping version creation`);
      continue;
    }

    // 3. Create version 1 as ACTIVE
    const version = await prisma.prompt_versions.create({
      data: {
        template_id: template.id,
        version: 1,
        content: p.content,
        model_config: { model: 'deepseek-v3', temperature: 0.3 },
        status: 'ACTIVE',
        changelog: 'Initial seed: migrated from hardcoded prompt',
        created_by: 'system-seed',
      },
    });
    console.log(`  ✅ Version v${version.version} created (ACTIVE)`);
  }

  console.log('\n🎉 SSA Agent prompt seeding complete!');
}

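seedSSAAgentPrompts()
  .catch((e) => {
    console.error('❌ Seed failed:', e);
    // Set the exit code instead of calling process.exit(1), which would
    // terminate immediately and skip the disconnect in finally()
    process.exitCode = 1;
  })
  .finally(() => prisma.$disconnect());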
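
Both seeded prompts carry a `{{{dataContext}}}` placeholder, and the Coder prompt requires the reply to wrap its code in `<r_code>` tags. The code that fills the placeholder and extracts the tagged code is not part of this file, so the following is only a rough sketch of the round trip. `renderPrompt` and `extractRCode` are hypothetical names, and the real PromptService may well use a template engine (the triple-brace syntax resembles Mustache/Handlebars unescaped interpolation) rather than a regular expression.

```typescript
// Hypothetical helpers for illustration; neither name appears in the commit.

// Replace {{{name}}} placeholders with values; unknown placeholders are left as-is.
function renderPrompt(template: string, vars: Record<string, string>): string {
  return template.replace(
    /\{\{\{(\w+)\}\}\}/g,
    (whole: string, name: string): string => {
      const value = Object.prototype.hasOwnProperty.call(vars, name)
        ? vars[name]
        : undefined;
      return value === undefined ? whole : value;
    },
  );
}

// Return the code inside the first <r_code>...</r_code> pair,
// or null when the tags are missing or enclose only whitespace.
function extractRCode(llmOutput: string): string | null {
  // Non-greedy match so that only the first tag pair is captured.
  const match = /<r_code>([\s\S]*?)<\/r_code>/.exec(llmOutput);
  if (match === null) {
    return null;
  }
  const code = (match[1] ?? '').trim();
  return code.length > 0 ? code : null;
}
```

Returning `null` rather than throwing leaves the decision to the caller, which could for example retry the model call or surface a structured error when a reply ignores the tag rule.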