HaHafeng 6124c7abc6 docs(platform): Add database documentation system and restructure deployment docs
Completed:
- Add 6 core database documents (docs/01-平台基础层/07-数据库/)
  Architecture overview, migration history, environment comparison,
  tech debt tracking, seed data management, PostgreSQL extensions
- Restructure deployment docs: archive 20 legacy files to _archive-2025/
- Create unified daily operations manual (01-日常更新操作手册.md)
- Add pending deployment change tracker (03-待部署变更清单.md)
- Update database development standard to v3.0 (three iron rules)
- Fix Prisma schema type drift: align @db.* annotations with actual DB
  IIT: UUID/Timestamptz(6), SSA: Timestamp(6)/VarChar(20/50/100)
- Add migration: 20260227_align_schema_with_db_types (idempotent ALTER)
- Add Cursor Rule for auto-reminding deployment change documentation
- Update system status guide v6.4 with deployment and DB doc references
- Add architecture consultation docs (Prisma guide, SAE deployment guide)

Technical details:
- Manual migration due to shadow DB limitation (TD-001 in tech debt)
- Deployment docs reduced from 20+ scattered files to 3 core documents
- Cursor Rule triggers on schema.prisma, package.json, Dockerfile changes

Made-with: Cursor
2026-02-27 14:35:25 +08:00

# Database Migration Plan: Dev Environment → RDS Structure Sync
> **Date**: 2026-02-27
> **Goal**: sync the database structure and required seed data from the dev environment (localhost Docker) to the Alibaba Cloud RDS test environment
> **Core principles**: incremental additions only, never touch existing data; structure first, data second; every step reversible
---
## 1. Environment Information
### 1.1 Source Environment (Dev)
```
Host: localhost
Port: 5432
Database: ai_clinical_research
User: postgres
Password: postgres123
Container name: ai-clinical-postgres
Table count: 96
Migrations: 11 applied + 1 file not yet applied
```
### 1.2 Target Environment (RDS Test)
```
Host: pgm-2zex1m2y3r23hdn5so.pg.rds.aliyuncs.com (public endpoint)
Host: pgm-2zex1m2y3r23hdn5.pg.rds.aliyuncs.com (internal / SAE)
Port: 5432
Database: ai_clinical_research_test
User: airesearch
Password: Xibahe@fengzhibo117
Table count: 66
Migrations: 6
```
### 1.3 DATABASE_URL (for hands-on operations)
```bash
# Public endpoint (for local operations)
postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5so.pg.rds.aliyuncs.com:5432/ai_clinical_research_test
# Internal endpoint (for SAE deployment)
postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5.pg.rds.aliyuncs.com:5432/ai_clinical_research_test?connection_limit=18&pool_timeout=10
```
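Note that the `@` in the password is percent-encoded as `%40` inside the URLs. A quick way to encode an arbitrary password (a Python stdlib one-liner called from the shell; `safe=''` forces every reserved character to be escaped):

```bash
# Percent-encode a DB password for embedding in a connection URL
python3 -c "import urllib.parse; print(urllib.parse.quote('Xibahe@fengzhibo117', safe=''))"
# prints: Xibahe%40fengzhibo117
```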
---
## 2. Gap Analysis Overview
### 2.1 Prisma Migration Gap
| # | Migration file | Local | RDS | Contents |
|---|---------|------|-----|------|
| 1 | 20251010_init | ✅ | ✅ | Initial schema |
| 2 | 20251010_conversation_metadata | ✅ | ✅ | AIA conversations |
| 3 | 20251012_batch_processing | ✅ | ✅ | Batch processing |
| 4 | 20251014_review_tasks | ✅ | ✅ | Paper pre-review |
| 5 | 20251208_column_mapping | ✅ | ✅ | Column mapping |
| 6 | 20260128_system_knowledge_base | ✅ | ✅ | System knowledge base |
| **7** | **20260207_add_iit_manager_agent_tables** | ✅ | ❌ | 8 new IIT Agent tables |
| **8** | **20260208_add_cra_qc_engine_support** | ✅ | ❌ | CRA QC + skills extension |
| **9** | **20260219_add_ssa_module** | ✅ | ❌ | 9 SSA module tables |
| **10** | **20260223_add_deep_research_v2_fields** | ✅ | ❌ | 6 new columns on research_tasks |
| **11** | **20260225_add_extraction_template_engine** | ✅ | ❌ | 4 ASL extraction-engine tables |
| **12** | **20260226_add_equery_critical_events_cron** | ❌ (db push) | ❌ | eQuery + critical events + cron |
> RDS needs migrations #7 through #12: **6 migrations** in total.
### 2.2 Changes Without Migration Files (drift caused by db push)
The following **6 tables + 3 columns + 1 index** exist in the dev database and the Prisma schema but have no corresponding migration files:
| Schema | Object | Type | Description |
|--------|------|------|------|
| iit_schema | `field_metadata` | New table (16 cols) | REDCap field metadata |
| iit_schema | `qc_logs` | New table (16 cols) | QC logs (with history tracing) |
| iit_schema | `qc_project_stats` | New table (10 cols) | Project-level QC statistics |
| iit_schema | `record_summary` | New table (16 cols) | Subject record summary |
| iit_schema | `projects.knowledge_base_id` | New column | Links to an EKB knowledge base |
| iit_schema | `idx_iit_project_kb` | New index | Index on knowledge_base_id |
| ssa_schema | `ssa_workflows` | New table (11 cols) | Statistical analysis workflows |
| ssa_schema | `ssa_workflow_steps` | New table (13 cols) | Workflow step details |
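Drift of this kind can also be detected mechanically. A hedged sketch using `prisma migrate diff`, which compares the live database directly against `schema.prisma` (so no shadow database is needed) and exits non-zero when they differ:

```bash
cd backend
# Compare the live database with schema.prisma; a non-zero exit signals drift
npx prisma migrate diff \
  --from-url "$DATABASE_URL" \
  --to-schema-datamodel prisma/schema.prisma \
  --exit-code
```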
### 2.3 Table Count Comparison
| Schema | Dev | RDS | Diff |
|--------|---------|-----|------|
| admin_schema | 2 | 2 | — |
| agent_schema | 6 | 6 | — |
| aia_schema | 3 | 3 | — |
| **asl_schema** | **12** | **7** | **+4 tables, +6 cols** |
| capability_schema | 4 | 4 | — |
| dc_schema | 6 | 6 | — |
| ekb_schema | 3 | 3 | — |
| **iit_schema** | **20** | **5** | **+15 tables, +3 cols** |
| pkb_schema | 5 | 5 | — |
| platform_schema | 19 | 19 | — |
| protocol_schema | 2 | 2 | — |
| public | 3 | 3 | — |
| rvw_schema | 1 | 1 | — |
| **ssa_schema** | **11** | **0** | **+11 tables** |
| **Total** | **96** | **66** | **+30 tables** |
### 2.4 Column-Level Differences (existing tables)
| Table | Dev | RDS | Missing columns |
|---|---------|-----|--------|
| `asl_schema.research_tasks` | 25 cols | 19 cols | `target_sources`, `confirmed_requirement`, `ai_intent_summary`, `execution_logs`, `synthesis_report`, `result_list` |
| `iit_schema.projects` | 18 cols | 15 cols | `knowledge_base_id`, `cron_enabled`, `cron_expression` |
| `iit_schema.skills` | 16 cols | — (new table) | 4 columns ALTER ADDed in migration #8 |
---
## 3. Seed Data Sync Requirements
### 3.1 Prompt Management Data Comparison
| Table | Dev | RDS | Analysis |
|---|---------|-----|------|
| `prompt_templates` | 27 rows | 14 rows | RDS missing 13 SSA templates (ID 17-29) |
| `prompt_versions` | 42 rows | 23 rows | RDS missing 19 SSA versions (ID 26-44) |
**Important**: the AIA/ASL/DC/RVW templates on RDS (ID 1-16) have been **iterated independently**; some versions are richer than dev (e.g. template 7 on RDS has v2/v3 content of 1044 characters each, while dev holds a short test version). **They must never be overwritten.**
### 3.2 Seed Data To Sync
| Table | Rows | Contents | Operation |
|---|------|------|------|
| `capability_schema.prompt_templates` | 13 rows | SSA module prompts (ID 17-29) | INSERT (skip if existing) |
| `capability_schema.prompt_versions` | 19 rows | SSA prompt versions (ID 26-44) | INSERT (skip if existing) |
| `asl_schema.extraction_templates` | 3 rows | System extraction templates (RCT/Cohort/QC) | INSERT (after the new table is migrated) |
### 3.3 Data Not To Sync
| Table | Reason |
|---|------|
| `prompt_templates` ID 1-16 | RDS version is richer; do not overwrite |
| `prompt_versions` ID 1-25 | RDS version is richer; do not overwrite |
| `platform_schema.users/tenants/...` | RDS holds real user/tenant data |
| `ekb_schema.*` | RDS holds real knowledge-base data (5686 chunks) |
| `iit_schema.qc_logs` etc. | Dev test data; not pushed toward production |
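The "INSERT, skip if existing" semantics can be made explicit at import time. A hedged sketch (the staging-table name `_stage_prompt_templates` is illustrative): a plain `\COPY` straight into the target table aborts on a primary-key conflict, so load into a temp table first and insert only the new IDs:

```sql
-- Illustrative: idempotent seed import via a temp staging table
CREATE TEMP TABLE _stage_prompt_templates
  (LIKE capability_schema.prompt_templates INCLUDING DEFAULTS);
\COPY _stage_prompt_templates FROM '/tmp/seed_prompt_templates.csv' WITH CSV HEADER;
INSERT INTO capability_schema.prompt_templates
SELECT * FROM _stage_prompt_templates
ON CONFLICT (id) DO NOTHING;
```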
---
## 4. Migration Plan
### 4.1 Chosen Approach: Patch Migration + `prisma migrate deploy`
We adopt **Option B**: first create a patch migration file for the changes that `db push` left untracked, then apply everything in one pass with `prisma migrate deploy`.
**Risk assessment**
| Risk | Assessment | Notes |
|-------|------|------|
| Data loss | ✅ Zero risk | All 6 migrations are CREATE TABLE / ALTER TABLE ADD COLUMN; no DROP / DELETE / TRUNCATE anywhere |
| Breaking existing table structure | ✅ Zero risk | Additions only; no existing columns are modified |
| Changing existing data | ✅ Zero risk | Pure DDL; no DML involved |
| Prompt data loss | ✅ Zero risk | Prompts live in capability_schema, which no migration touches |
| Migration failure | 🟡 Low risk | Could fail on RDS permissions; the transaction rolls back without touching data |
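The "zero risk" rows can be spot-checked before deploying, for example by grepping the pending migration files for destructive statements (the glob below assumes this repo's `backend/prisma/migrations` layout):

```bash
# Print any DROP / DELETE / TRUNCATE found in the pending migrations
grep -rniE 'drop |delete from|truncate' \
  backend/prisma/migrations/2026*/migration.sql \
  || echo "no destructive statements found"
```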
### 4.2 Execution Steps Overview
```
Step 1 → Back up the RDS database
Step 2 → Create the patch migration file (covering db push drift)
Step 3 → Mark the patch migration as applied locally (dev already has these tables)
Step 4 → Run prisma migrate deploy against RDS (7 migrations)
Step 5 → Sync seed data (SSA prompts + extraction templates)
Step 6 → Verify
```
---
## 5. Detailed Execution Steps
### Step 1: Back Up the RDS Database
> ⚠️ **Safety first: back up before doing anything!**
Use pg_dump inside the Docker container (avoids PowerShell encoding problems):
```bash
# Run the backup inside the Docker container (avoids PowerShell UTF-8 encoding issues!)
docker exec ai-clinical-postgres pg_dump \
"postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5so.pg.rds.aliyuncs.com:5432/ai_clinical_research_test" \
-f /tmp/backup_rds_before_0227.sql
# Copy the backup file from the container to the local machine
docker cp ai-clinical-postgres:/tmp/backup_rds_before_0227.sql ./backup_rds_before_0227.sql
# Check the backup size and that Chinese content survived
docker exec ai-clinical-postgres head -20 /tmp/backup_rds_before_0227.sql
```
Expected: the backup file should be > 1 MB, with Chinese text displayed correctly (not `????`).
### Step 2: Create the Patch Migration File
> Bring the 6 tables + 3 columns + index that db push skipped into the migration history
Create `backend/prisma/migrations/20260227_patch_db_push_drift/migration.sql`:
```sql
-- ================================================================
-- Patch migration: covers the 6 tables + 3 columns created by db push
-- Every statement uses IF NOT EXISTS for idempotency (safe to re-run)
-- ================================================================
-- ============ iit_schema: 4 new tables + 1 column + 1 index ============
-- 1. IIT field metadata table (REDCap field definition cache)
CREATE TABLE IF NOT EXISTS "iit_schema"."field_metadata" (
"id" TEXT NOT NULL,
"project_id" TEXT NOT NULL,
"field_name" TEXT NOT NULL,
"field_label" TEXT NOT NULL,
"field_type" TEXT NOT NULL,
"form_name" TEXT NOT NULL,
"section_header" TEXT,
"validation" TEXT,
"validation_min" TEXT,
"validation_max" TEXT,
"choices" TEXT,
"required" BOOLEAN NOT NULL DEFAULT false,
"branching" TEXT,
"alias" TEXT,
"rule_source" TEXT,
"synced_at" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT "field_metadata_pkey" PRIMARY KEY ("id")
);
CREATE UNIQUE INDEX IF NOT EXISTS "unique_iit_field_metadata"
ON "iit_schema"."field_metadata"("project_id", "field_name");
CREATE INDEX IF NOT EXISTS "idx_iit_field_metadata_project"
ON "iit_schema"."field_metadata"("project_id");
CREATE INDEX IF NOT EXISTS "idx_iit_field_metadata_form"
ON "iit_schema"."field_metadata"("project_id", "form_name");
-- 2. IIT QC log table
CREATE TABLE IF NOT EXISTS "iit_schema"."qc_logs" (
"id" TEXT NOT NULL,
"project_id" TEXT NOT NULL,
"record_id" TEXT NOT NULL,
"event_id" TEXT,
"qc_type" TEXT NOT NULL,
"form_name" TEXT,
"status" TEXT NOT NULL,
"issues" JSONB NOT NULL DEFAULT '[]',
"rules_evaluated" INTEGER NOT NULL DEFAULT 0,
"rules_skipped" INTEGER NOT NULL DEFAULT 0,
"rules_passed" INTEGER NOT NULL DEFAULT 0,
"rules_failed" INTEGER NOT NULL DEFAULT 0,
"rule_version" TEXT NOT NULL,
"inclusion_passed" BOOLEAN,
"exclusion_passed" BOOLEAN,
"triggered_by" TEXT NOT NULL,
"created_at" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT "qc_logs_pkey" PRIMARY KEY ("id")
);
CREATE INDEX IF NOT EXISTS "idx_iit_qc_log_record_time"
ON "iit_schema"."qc_logs"("project_id", "record_id", "created_at");
CREATE INDEX IF NOT EXISTS "idx_iit_qc_log_status_time"
ON "iit_schema"."qc_logs"("project_id", "status", "created_at");
CREATE INDEX IF NOT EXISTS "idx_iit_qc_log_type_time"
ON "iit_schema"."qc_logs"("project_id", "qc_type", "created_at");
-- 3. IIT subject record summary table
CREATE TABLE IF NOT EXISTS "iit_schema"."record_summary" (
"id" TEXT NOT NULL,
"project_id" TEXT NOT NULL,
"record_id" TEXT NOT NULL,
"enrolled_at" TIMESTAMP(3),
"enrolled_by" TEXT,
"last_updated_at" TIMESTAMP(3) NOT NULL,
"last_updated_by" TEXT,
"last_form_name" TEXT,
"form_status" JSONB NOT NULL DEFAULT '{}',
"total_forms" INTEGER NOT NULL DEFAULT 0,
"completed_forms" INTEGER NOT NULL DEFAULT 0,
"completion_rate" DOUBLE PRECISION NOT NULL DEFAULT 0,
"latest_qc_status" TEXT,
"latest_qc_at" TIMESTAMP(3),
"update_count" INTEGER NOT NULL DEFAULT 0,
"created_at" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
"updated_at" TIMESTAMP(3) NOT NULL,
CONSTRAINT "record_summary_pkey" PRIMARY KEY ("id")
);
CREATE UNIQUE INDEX IF NOT EXISTS "unique_iit_record_summary"
ON "iit_schema"."record_summary"("project_id", "record_id");
CREATE INDEX IF NOT EXISTS "idx_iit_record_summary_enrolled"
ON "iit_schema"."record_summary"("project_id", "enrolled_at");
CREATE INDEX IF NOT EXISTS "idx_iit_record_summary_qc_status"
ON "iit_schema"."record_summary"("project_id", "latest_qc_status");
CREATE INDEX IF NOT EXISTS "idx_iit_record_summary_completion"
ON "iit_schema"."record_summary"("project_id", "completion_rate");
-- 4. IIT project-level QC statistics table
CREATE TABLE IF NOT EXISTS "iit_schema"."qc_project_stats" (
"id" TEXT NOT NULL,
"project_id" TEXT NOT NULL,
"total_records" INTEGER NOT NULL DEFAULT 0,
"passed_records" INTEGER NOT NULL DEFAULT 0,
"failed_records" INTEGER NOT NULL DEFAULT 0,
"warning_records" INTEGER NOT NULL DEFAULT 0,
"inclusion_met" INTEGER NOT NULL DEFAULT 0,
"exclusion_met" INTEGER NOT NULL DEFAULT 0,
"avg_completion_rate" DOUBLE PRECISION NOT NULL DEFAULT 0,
"updated_at" TIMESTAMP(3) NOT NULL,
CONSTRAINT "qc_project_stats_pkey" PRIMARY KEY ("id")
);
CREATE UNIQUE INDEX IF NOT EXISTS "qc_project_stats_project_id_key"
ON "iit_schema"."qc_project_stats"("project_id");
-- 5. Add the knowledge_base_id column to the IIT projects table
ALTER TABLE "iit_schema"."projects"
ADD COLUMN IF NOT EXISTS "knowledge_base_id" TEXT;
CREATE INDEX IF NOT EXISTS "idx_iit_project_kb"
ON "iit_schema"."projects"("knowledge_base_id");
-- ============ ssa_schema: 2 new tables ============
-- 6. SSA workflow table
CREATE TABLE IF NOT EXISTS "ssa_schema"."ssa_workflows" (
"id" TEXT NOT NULL DEFAULT gen_random_uuid()::text,
"session_id" TEXT NOT NULL,
"message_id" TEXT,
"status" VARCHAR NOT NULL DEFAULT 'pending',
"total_steps" INTEGER NOT NULL,
"completed_steps" INTEGER NOT NULL DEFAULT 0,
"workflow_plan" JSONB NOT NULL,
"reasoning" TEXT,
"created_at" TIMESTAMP(3) NOT NULL DEFAULT now(),
"started_at" TIMESTAMP(3),
"completed_at" TIMESTAMP(3),
CONSTRAINT "ssa_workflows_pkey" PRIMARY KEY ("id")
);
CREATE INDEX IF NOT EXISTS "idx_ssa_workflow_session"
ON "ssa_schema"."ssa_workflows"("session_id");
CREATE INDEX IF NOT EXISTS "idx_ssa_workflow_status"
ON "ssa_schema"."ssa_workflows"("status");
-- FK: ssa_workflows.session_id → ssa_sessions.id
DO $$
BEGIN
IF NOT EXISTS (
SELECT 1 FROM information_schema.table_constraints
WHERE constraint_name = 'ssa_workflows_session_id_fkey'
) THEN
ALTER TABLE "ssa_schema"."ssa_workflows"
ADD CONSTRAINT "ssa_workflows_session_id_fkey"
FOREIGN KEY ("session_id")
REFERENCES "ssa_schema"."ssa_sessions"("id")
ON DELETE CASCADE ON UPDATE CASCADE;
END IF;
END $$;
-- 7. SSA workflow step table
CREATE TABLE IF NOT EXISTS "ssa_schema"."ssa_workflow_steps" (
"id" TEXT NOT NULL DEFAULT gen_random_uuid()::text,
"workflow_id" TEXT NOT NULL,
"step_order" INTEGER NOT NULL,
"tool_code" VARCHAR NOT NULL,
"tool_name" VARCHAR NOT NULL,
"status" VARCHAR NOT NULL DEFAULT 'pending',
"input_params" JSONB,
"guardrail_checks" JSONB,
"output_result" JSONB,
"error_info" JSONB,
"execution_ms" INTEGER,
"started_at" TIMESTAMP(3),
"completed_at" TIMESTAMP(3),
CONSTRAINT "ssa_workflow_steps_pkey" PRIMARY KEY ("id")
);
CREATE INDEX IF NOT EXISTS "idx_ssa_workflow_step_workflow"
ON "ssa_schema"."ssa_workflow_steps"("workflow_id");
CREATE INDEX IF NOT EXISTS "idx_ssa_workflow_step_status"
ON "ssa_schema"."ssa_workflow_steps"("status");
-- FK: ssa_workflow_steps.workflow_id → ssa_workflows.id
DO $$
BEGIN
IF NOT EXISTS (
SELECT 1 FROM information_schema.table_constraints
WHERE constraint_name = 'ssa_workflow_steps_workflow_id_fkey'
) THEN
ALTER TABLE "ssa_schema"."ssa_workflow_steps"
ADD CONSTRAINT "ssa_workflow_steps_workflow_id_fkey"
FOREIGN KEY ("workflow_id")
REFERENCES "ssa_schema"."ssa_workflows"("id")
ON DELETE CASCADE ON UPDATE CASCADE;
END IF;
END $$;
```
### Step 3: Mark the Patch Migration as Applied Locally
> The dev environment already has these tables (created via db push); we only need to tell Prisma "this migration has already run".
```bash
cd backend
# Mark the patch migration as applied (does not actually run the SQL)
npx prisma migrate resolve --applied 20260227_patch_db_push_drift
# Also mark the 20260226 migration as applied (dev already has the equery/critical_events tables)
npx prisma migrate resolve --applied 20260226_add_equery_critical_events_cron
```
Verify:
```bash
# Locally this should report all 13 migrations as applied
npx prisma migrate status
```
### Step 4: Run the Migrations Against RDS
> The core operation: apply 7 migrations in one pass (#7 through #12 plus the patch)
```bash
cd backend
# Temporarily point DATABASE_URL at the RDS public endpoint
# Windows PowerShell:
$env:DATABASE_URL="postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5so.pg.rds.aliyuncs.com:5432/ai_clinical_research_test"
# Run the migrations (only pending migrations are applied; already-applied ones are never re-run)
npx prisma migrate deploy
```
Expected output:
```
13 migrations found in prisma/migrations
6 migrations have already been applied
Applying migration `20260207112544_add_iit_manager_agent_tables`
Applying migration `20260208134925_add_cra_qc_engine_support`
Applying migration `20260219_add_ssa_module`
Applying migration `20260223_add_deep_research_v2_fields`
Applying migration `20260225_add_extraction_template_engine`
Applying migration `20260226_add_equery_critical_events_cron`
Applying migration `20260227_patch_db_push_drift`
All migrations have been successfully applied.
```
> **Note**: migration #7 contains `CREATE EXTENSION IF NOT EXISTS vector;`, which requires the pgvector extension on RDS (confirmed installed during the 0126 deployment).
### Step 5: Sync Seed Data
> Only sync the SSA prompts and extraction templates; leave existing RDS data alone
**Method: export from dev → insert into RDS**
```bash
# 5a. Export SSA prompt templates (ID >= 17, SSA additions only)
docker exec ai-clinical-postgres psql -U postgres -d ai_clinical_research -c "\
COPY (SELECT * FROM capability_schema.prompt_templates WHERE id >= 17) \
TO '/tmp/seed_prompt_templates.csv' WITH CSV HEADER;"
# 5b. Export SSA prompt versions (ID >= 26)
docker exec ai-clinical-postgres psql -U postgres -d ai_clinical_research -c "\
COPY (SELECT * FROM capability_schema.prompt_versions WHERE id >= 26) \
TO '/tmp/seed_prompt_versions.csv' WITH CSV HEADER;"
# 5c. Export extraction templates (new table, all 3 rows)
docker exec ai-clinical-postgres psql -U postgres -d ai_clinical_research -c "\
COPY (SELECT * FROM asl_schema.extraction_templates) \
TO '/tmp/seed_extraction_templates.csv' WITH CSV HEADER;"
# 5d. Import into RDS (run inside the Docker container to avoid PowerShell encoding issues)
docker exec ai-clinical-postgres psql \
"postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5so.pg.rds.aliyuncs.com:5432/ai_clinical_research_test" \
-c "\COPY capability_schema.prompt_templates FROM '/tmp/seed_prompt_templates.csv' WITH CSV HEADER;"
docker exec ai-clinical-postgres psql \
"postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5so.pg.rds.aliyuncs.com:5432/ai_clinical_research_test" \
-c "\COPY capability_schema.prompt_versions FROM '/tmp/seed_prompt_versions.csv' WITH CSV HEADER;"
docker exec ai-clinical-postgres psql \
"postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5so.pg.rds.aliyuncs.com:5432/ai_clinical_research_test" \
-c "\COPY asl_schema.extraction_templates FROM '/tmp/seed_extraction_templates.csv' WITH CSV HEADER;"
```
> ⚠️ **Critical**: run every command via `docker exec` inside the container to **avoid PowerShell's UTF-8 encoding problems** (a lesson from the 0126 deployment).
### Step 6: Verification
```bash
# 6a. Check the table count (should go from 66 to 96)
docker exec ai-clinical-postgres psql \
"postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5so.pg.rds.aliyuncs.com:5432/ai_clinical_research_test" \
-c "SELECT count(*) as total_tables FROM information_schema.tables WHERE table_type='BASE TABLE' AND table_schema NOT IN ('pg_catalog','information_schema');"
# 6b. Check the migration records (should be 13 rows)
docker exec ai-clinical-postgres psql \
"postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5so.pg.rds.aliyuncs.com:5432/ai_clinical_research_test" \
-c "SELECT count(*) FROM public._prisma_migrations;"
# 6c. Check the SSA prompt data (should be 27 templates)
docker exec ai-clinical-postgres psql \
"postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5so.pg.rds.aliyuncs.com:5432/ai_clinical_research_test" \
-c "SELECT count(*) FROM capability_schema.prompt_templates;"
# 6d. Check that the SSA tables exist (should be 11 tables)
docker exec ai-clinical-postgres psql \
"postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5so.pg.rds.aliyuncs.com:5432/ai_clinical_research_test" \
-c "SELECT count(*) FROM information_schema.tables WHERE table_schema='ssa_schema';"
# 6e. Check that the key columns exist
docker exec ai-clinical-postgres psql \
"postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5so.pg.rds.aliyuncs.com:5432/ai_clinical_research_test" \
-c "SELECT column_name FROM information_schema.columns WHERE table_schema='asl_schema' AND table_name='research_tasks' AND column_name IN ('target_sources','synthesis_report','result_list');"
# 6f. Check that existing RDS data is untouched (prompts ID 1-16 + user data)
docker exec ai-clinical-postgres psql \
"postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5so.pg.rds.aliyuncs.com:5432/ai_clinical_research_test" \
-c "SELECT id, code, name FROM capability_schema.prompt_templates WHERE id <= 16 ORDER BY id LIMIT 5;"
docker exec ai-clinical-postgres psql \
"postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5so.pg.rds.aliyuncs.com:5432/ai_clinical_research_test" \
-c "SELECT count(*) as user_count FROM platform_schema.users;"
```
**Acceptance criteria**
| Check | Expected |
|-------|-------|
| Total table count | 96 |
| Migration records | 13 |
| prompt_templates total | 27 |
| ssa_schema table count | 11 |
| New research_tasks columns | all 3 present |
| Existing prompts (ID 1-16) | data intact, Chinese text correct |
| User count | 22 (unchanged) |
---
## 6. Rollback Plan
### 6.1 If a Migration Fails
`prisma migrate deploy` runs each migration inside a transaction; on failure it rolls back automatically and existing data is untouched.
Handling: read the error message, fix the SQL, and re-run.
### 6.2 Full Rollback
Restore from the Step 1 backup file:
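If a failure does leave a `failed` record in `_prisma_migrations`, `migrate deploy` refuses to continue until that record is cleared. A sketch of the documented recovery path (run after fixing the SQL; the migration name is the one from this plan):

```bash
cd backend
# Clear the failed record so deploy can retry the migration
npx prisma migrate resolve --rolled-back 20260227_patch_db_push_drift
npx prisma migrate deploy
```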
```bash
# Run inside the Docker container (avoids encoding problems)
# 1. Drop the test database
docker exec ai-clinical-postgres psql \
"postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5so.pg.rds.aliyuncs.com:5432/postgres" \
-c "DROP DATABASE ai_clinical_research_test WITH (FORCE);"
# 2. Recreate it
docker exec ai-clinical-postgres psql \
"postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5so.pg.rds.aliyuncs.com:5432/postgres" \
-c "CREATE DATABASE ai_clinical_research_test ENCODING='UTF8';"
# 3. Install the extensions
docker exec ai-clinical-postgres psql \
"postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5so.pg.rds.aliyuncs.com:5432/ai_clinical_research_test" \
-c "CREATE EXTENSION IF NOT EXISTS pg_bigm; CREATE EXTENSION IF NOT EXISTS vector;"
# 4. Restore the backup
docker exec ai-clinical-postgres psql \
"postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5so.pg.rds.aliyuncs.com:5432/ai_clinical_research_test" \
-f /tmp/backup_rds_before_0227.sql
```
---
## 7. Caveats
### 7.1 PowerShell Encoding Problems
> ⚠️ **Hard-won lesson from the 0126 deployment: PowerShell corrupts UTF-8.**
Every operation involving Chinese data (backups, seed imports) must run directly inside the Docker container (`docker exec`), and **never** through a PowerShell pipe.
### 7.2 RDS Permissions
- `CREATE EXTENSION` requires RDS superuser privileges (the extensions were installed via the RDS console during the 0126 deployment)
- `CREATE SCHEMA` requires OWNER privileges (the airesearch user has them)
### 7.3 Execution Order
The patch migration `20260227` **must sort after** `20260219_add_ssa_module`, because the foreign key on `ssa_workflows` references `ssa_sessions` (created by migration #9). The lexicographic order of the current file names guarantees this.
### 7.4 Prompt ID Sequences
After importing the seed data, reset the ID sequences of the `prompt_templates` and `prompt_versions` tables:
```sql
-- Reset each sequence to the current maximum id
SELECT setval(pg_get_serial_sequence('capability_schema.prompt_templates', 'id'),
(SELECT MAX(id) FROM capability_schema.prompt_templates));
SELECT setval(pg_get_serial_sequence('capability_schema.prompt_versions', 'id'),
(SELECT MAX(id) FROM capability_schema.prompt_versions));
```
---
## 8. Follow-Up: Preventing Schema Drift
### 8.1 Root Cause
The 6 tables + 3 columns without migration files exist because development used `prisma db push` instead of `prisma migrate dev`.
### 8.2 Four Layers of Defense
| Layer | When | Method | Cost |
|------|------|------|---------|
| **Coding standard** | While coding | Ban `db push`; use `migrate dev` exclusively | None |
| **pre-commit hook** | Before git commit | Detect schema.prisma changes without a new migration file | One-off |
| **CI drift check** | Before PR merge | `prisma migrate diff --exit-code` | One-off |
| **Pre-deploy gate** | Before deployment | Compare schema vs migration history | One-off |
The detailed implementation will follow in the coding-standard update (separate document).
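The pre-commit layer can be sketched as a small hook script (the hook location, e.g. `.git/hooks/pre-commit`, and the `prisma/` path patterns are assumptions about this repo's layout):

```bash
#!/bin/sh
# Pre-commit sketch: fail when schema.prisma is staged without a migration file.
# check_staged reads the staged file list on stdin and prints OK or a BLOCK message.
check_staged() {
  files=$(cat)
  echo "$files" | grep -q 'prisma/schema.prisma' || { echo OK; return 0; }
  echo "$files" | grep -q 'prisma/migrations/' && { echo OK; return 0; }
  echo "BLOCK: schema.prisma changed but no migration staged (use 'prisma migrate dev', not 'db push')"
  return 1
}

# In the real hook: git diff --cached --name-only | check_staged || exit 1
printf 'backend/prisma/schema.prisma\n' | check_staged || true
```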
---
## 9. Time Estimate
| Step | Estimated time | Notes |
|------|---------|------|
| Step 1: backup | 5 min | pg_dump, roughly 20 MB |
| Step 2: patch migration | 5 min | Create the file |
| Step 3: local resolve | 2 min | prisma resolve |
| Step 4: RDS migration | 3 min | migrate deploy |
| Step 5: seed data | 5 min | CSV export/import |
| Step 6: verification | 5 min | 7 checks |
| **Total** | **~25 min** | Including buffer |
---
## Changelog
| Version | Date | Contents |
|------|------|------|
| v1.0 | 2026-02-27 | Initial version: complete migration plan |
---
**End of document**