

📦 PostgreSQL Database Upgrade Plan

Document version: v1.0
Created: 2026-01-26
Scope: Alibaba Cloud RDS PostgreSQL 15
Change type: extension installation + environment separation


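📋 1. Change Overview

1.1 Changes

| Change item | Description | Priority |
|---|---|---|
| pg_bigm extension | Full-text search enhancement; 2-gram indexing that works on Chinese text without a word segmenter | 🔴 |
| pgvector extension | Vector storage for RAG vector retrieval | 🔴 |
| Test/production separation | Create a dedicated test database | 🟡 |
| Prisma Schema sync | Ensure iit_schema is configured correctly | 🔴 |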

1.2 Current Database State

Instance ID: pgm-2zex1m2y3r23hdn5
Spec: 2 cores, 4 GB (pg.n2.2c.1m)
Storage: 100GB SSD
Version: PostgreSQL 15.0
Database name: ai_clinical_research
Internal endpoint: pgm-2zex1m2y3r23hdn5.pg.rds.aliyuncs.com:5432

1.3 Target State

Extensions:
  - pg_bigm: installed
  - pgvector: installed

Databases:
  - ai_clinical_research: production
  - ai_clinical_research_test: test

Schema list:
  - platform_schema
  - aia_schema
  - pkb_schema
  - asl_schema
  - dc_schema
  - iit_schema  # make sure this one has been added
  - admin_schema
  - ssa_schema
  - st_schema
  - rvw_schema
  - common_schema
  - public

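🔧 2. Installing the pg_bigm Extension

2.1 About the Extension

pg_bigm is a full-text search extension for PostgreSQL. It indexes text as 2-grams (bigrams), which makes it well suited to searching CJK languages such as Chinese and Japanese.

Key features:

  • Handles Chinese text through 2-gram indexing, with no separate word segmenter required
  • Good fuzzy-search performance
  • Can accelerate LIKE queries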
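
To make the 2-gram idea concrete, here is a minimal Python sketch of how a bigram index can pre-filter candidates for a LIKE '%keyword%' query. It is a simplified illustration, not pg_bigm's implementation: the function names `bigrams` and `may_match` are hypothetical, and the extension's real tokenization may differ in details such as padding.

```python
def bigrams(text: str) -> list[str]:
    """Return the overlapping 2-character substrings of text."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def may_match(document: str, keyword: str) -> bool:
    """Candidate filter: every bigram of the keyword occurs in the document.

    A real index would still recheck the row against the LIKE pattern,
    because sharing all bigrams does not guarantee a contiguous match.
    """
    doc_grams = set(bigrams(document))
    return all(gram in doc_grams for gram in bigrams(keyword))


doc = "这是一个测试文本"
print(bigrams(doc))            # ['这是', '是一', '一个', '个测', '测试', '试文', '文本']
print(may_match(doc, "测试"))   # True
print(may_match(doc, "文档"))   # False
```

Because the index is keyed on character pairs rather than dictionary words, it needs no language-specific tokenizer, which is why the approach works for Chinese text.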

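2.2 Check Whether the Extension Is Available

-- Log in to RDS PostgreSQL and check the available extensions
SELECT * FROM pg_available_extensions WHERE name = 'pg_bigm';

2.3 Install pg_bigm

Option 1: Install through the RDS console (recommended)

  1. Log in to the Alibaba Cloud RDS console
  2. Go to instance details → Database Management
  3. Click "Extension Management"
  4. Search for pg_bigm and click Install

Option 2: Install with SQL

-- Requires superuser privileges
CREATE EXTENSION IF NOT EXISTS pg_bigm;

-- Verify the installation
SELECT * FROM pg_extension WHERE extname = 'pg_bigm';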

2.4 Verify pg_bigm

-- Test Chinese fuzzy search
CREATE TABLE test_bigm (content TEXT);
INSERT INTO test_bigm VALUES ('这是一个测试文本');

-- Create a pg_bigm index
CREATE INDEX idx_bigm ON test_bigm USING gin (content gin_bigm_ops);

-- Run a test search
SELECT * FROM test_bigm WHERE content LIKE '%测试%';

-- Drop the test table
DROP TABLE test_bigm;

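🔧 3. Installing the pgvector Extension

3.1 About the Extension

pgvector is a vector storage and retrieval extension for PostgreSQL, used in AI/RAG scenarios.

Key features:

  • Stores high-dimensional vectors (embeddings)
  • Supports vector similarity search (L2 distance, inner product, cosine distance)
  • Integrates seamlessly with native PostgreSQL features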
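
The three metrics can be worked through by hand. The Python sketch below computes them for the same vectors used in the verification query in section 3.4; it only illustrates the arithmetic and is not how pgvector is implemented. In pgvector itself the corresponding operators are `<->` (L2 distance), `<#>` (negative inner product) and `<=>` (cosine distance). The helper names are hypothetical.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 minus the cosine similarity of two non-zero vectors."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


# Same data as the test table: row id -> embedding
rows = {1: [1, 2, 3], 2: [4, 5, 6]}
query = [2, 3, 4]

# Equivalent of: ORDER BY embedding <-> '[2,3,4]' LIMIT 1
nearest = min(rows, key=lambda row_id: l2_distance(rows[row_id], query))
print(nearest)                          # 1
print(inner_product(rows[1], query))    # 20
```

Row 1 wins because its L2 distance to the query is the square root of 3, while row 2 is the square root of 12 away.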

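3.2 Check Whether the Extension Is Available

-- Check the available extensions
SELECT * FROM pg_available_extensions WHERE name = 'vector';

3.3 Install pgvector

Option 1: Install through the RDS console (recommended)

  1. Log in to the Alibaba Cloud RDS console
  2. Go to instance details → Database Management
  3. Click "Extension Management"
  4. Search for vector and click Install

Option 2: Install with SQL

-- Requires superuser privileges
CREATE EXTENSION IF NOT EXISTS vector;

-- Verify the installation
SELECT * FROM pg_extension WHERE extname = 'vector';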

3.4 Verify pgvector

-- Test the vector functionality
CREATE TABLE test_vectors (
  id SERIAL PRIMARY KEY,
  embedding vector(3)
);

-- Insert test data
INSERT INTO test_vectors (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');

-- Test vector search (L2 distance)
SELECT * FROM test_vectors ORDER BY embedding <-> '[2,3,4]' LIMIT 1;

-- Drop the test table
DROP TABLE test_vectors;

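🗃️ 4. Separating the Test and Production Databases

4.1 Options

Option A: Create a new database on the same RDS instance (recommended)

  • Pros: low cost, simple to manage
  • Cons: resources are shared, so the two environments can affect each other

Option B: Create a new RDS instance

  • Pros: full isolation, no mutual impact
  • Cons: doubles the cost

Recommendation: Option A (create a new database on the same instance)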

4.2 Create the Test Database

-- Log in as a superuser
-- Create the test database
CREATE DATABASE ai_clinical_research_test
  WITH 
  OWNER = airesearch
  ENCODING = 'UTF8'
  LC_COLLATE = 'en_US.utf8'
  LC_CTYPE = 'en_US.utf8'
  TEMPLATE template0;

-- Grant privileges
GRANT ALL PRIVILEGES ON DATABASE ai_clinical_research_test TO airesearch;

4.3 Create the Schemas in the Test Database

-- Switch to the test database
\c ai_clinical_research_test

-- Create all schemas
CREATE SCHEMA IF NOT EXISTS platform_schema;
CREATE SCHEMA IF NOT EXISTS aia_schema;
CREATE SCHEMA IF NOT EXISTS pkb_schema;
CREATE SCHEMA IF NOT EXISTS asl_schema;
CREATE SCHEMA IF NOT EXISTS dc_schema;
CREATE SCHEMA IF NOT EXISTS iit_schema;
CREATE SCHEMA IF NOT EXISTS admin_schema;
CREATE SCHEMA IF NOT EXISTS ssa_schema;
CREATE SCHEMA IF NOT EXISTS st_schema;
CREATE SCHEMA IF NOT EXISTS rvw_schema;
CREATE SCHEMA IF NOT EXISTS common_schema;

-- Grant privileges
GRANT ALL ON SCHEMA platform_schema TO airesearch;
GRANT ALL ON SCHEMA aia_schema TO airesearch;
GRANT ALL ON SCHEMA pkb_schema TO airesearch;
GRANT ALL ON SCHEMA asl_schema TO airesearch;
GRANT ALL ON SCHEMA dc_schema TO airesearch;
GRANT ALL ON SCHEMA iit_schema TO airesearch;
GRANT ALL ON SCHEMA admin_schema TO airesearch;
GRANT ALL ON SCHEMA ssa_schema TO airesearch;
GRANT ALL ON SCHEMA st_schema TO airesearch;
GRANT ALL ON SCHEMA rvw_schema TO airesearch;
GRANT ALL ON SCHEMA common_schema TO airesearch;

-- Install the extensions (each database needs its own installation)
CREATE EXTENSION IF NOT EXISTS pg_bigm;
CREATE EXTENSION IF NOT EXISTS vector;

4.4 Test Environment Connection String

# DATABASE_URL for the test environment
DATABASE_URL=postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5.pg.rds.aliyuncs.com:5432/ai_clinical_research_test?connection_limit=18&pool_timeout=10

# DATABASE_URL for the production environment (unchanged)
DATABASE_URL=postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5.pg.rds.aliyuncs.com:5432/ai_clinical_research?connection_limit=18&pool_timeout=10
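
The `%40` in the connection strings above is a percent-encoded `@`: reserved characters in a password must be encoded so they are not read as URL delimiters. As a minimal sketch, the Python helper below assembles such a URL from its parts. The function name `build_database_url` and all sample values (user, password, host, database) are hypothetical placeholders; only the `connection_limit` and `pool_timeout` defaults are taken from this document.

```python
from urllib.parse import quote


def build_database_url(user, password, host, port, database,
                       connection_limit=18, pool_timeout=10):
    """Assemble a PostgreSQL URL, percent-encoding the credentials."""
    # safe="" forces every reserved character (such as @ and :) to be encoded
    user_enc = quote(user, safe="")
    password_enc = quote(password, safe="")
    return (
        f"postgresql://{user_enc}:{password_enc}@{host}:{port}/{database}"
        f"?connection_limit={connection_limit}&pool_timeout={pool_timeout}"
    )


# Placeholder values for illustration only
url = build_database_url("app_user", "p@ss:word", "db.example.internal", 5432, "app_test")
print(url)
# postgresql://app_user:p%40ss%3Aword@db.example.internal:5432/app_test?connection_limit=18&pool_timeout=10
```

Building the URL from separately stored parts also makes it easier to switch between the test and production database names without retyping the credentials.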

📐 5. Prisma Schema Sync

5.1 Current Problem

The schemas array currently in prisma/schema.prisma:

schemas = ["platform_schema", "aia_schema", "pkb_schema", "asl_schema", "common_schema", "dc_schema", "rvw_schema", "admin_schema", "ssa_schema", "st_schema", "public"]

Problem: iit_schema is missing
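
The gap can be confirmed mechanically by comparing the current array with the target schema list from section 1.3. The short Python sketch below does that comparison; the variable names are illustrative, and the two lists are copied from this document.

```python
# schemas array currently in prisma/schema.prisma
current = [
    "platform_schema", "aia_schema", "pkb_schema", "asl_schema",
    "common_schema", "dc_schema", "rvw_schema", "admin_schema",
    "ssa_schema", "st_schema", "public",
]

# Target schema list from section 1.3
target = [
    "platform_schema", "aia_schema", "pkb_schema", "asl_schema",
    "dc_schema", "iit_schema", "admin_schema", "ssa_schema",
    "st_schema", "rvw_schema", "common_schema", "public",
]

# Schemas that must be added to, or removed from, the Prisma datasource
missing = [name for name in target if name not in current]
unexpected = [name for name in current if name not in target]

print(missing)      # ['iit_schema']
print(unexpected)   # []
```

The same comparison can be rerun after the fix to check that both lists come back empty.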

5.2 Fix

Edit backend/prisma/schema.prisma:

datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
  schemas  = ["platform_schema", "aia_schema", "pkb_schema", "asl_schema", "common_schema", "dc_schema", "rvw_schema", "admin_schema", "ssa_schema", "st_schema", "iit_schema", "public"]
}

5.3 IIT Schema Model Definitions

Append the IIT module's model definitions to the end of schema.prisma:

// ==================== IIT Manager Agent module ====================

model IitProject {
  id                 String   @id @default(uuid())
  name               String
  description        String?
  status             String   @default("active")
  
  // REDCap configuration
  redcapApiUrl       String?  @map("redcap_api_url")
  redcapApiToken     String?  @map("redcap_api_token")
  redcapProjectId    Int?     @map("redcap_project_id")
  
  // Dify configuration
  difyDatasetId      String?  @map("dify_dataset_id")
  difyAgentUrl       String?  @map("dify_agent_url")
  
  // Notification configuration
  notificationConfig Json?    @map("notification_config")
  
  // Sync status
  lastSyncAt         DateTime? @map("last_sync_at")
  
  createdAt          DateTime @default(now()) @map("created_at")
  updatedAt          DateTime @updatedAt @map("updated_at")
  
  pendingActions     IitPendingAction[]
  taskRuns           IitTaskRun[]
  userMappings       IitUserMapping[]
  auditLogs          IitAuditLog[]
  
  @@index([status])
  @@index([redcapProjectId])
  @@map("projects")
  @@schema("iit_schema")
}

model IitPendingAction {
  id         String   @id @default(uuid())
  projectId  String   @map("project_id")
  
  actionType String   @map("action_type")
  status     String   @default("pending")
  
  entityId   String?  @map("entity_id")
  entityType String?  @map("entity_type")
  
  payload    Json?
  result     Json?
  
  createdAt  DateTime @default(now()) @map("created_at")
  updatedAt  DateTime @updatedAt @map("updated_at")
  
  project    IitProject @relation(fields: [projectId], references: [id], onDelete: Cascade)
  
  @@index([projectId])
  @@index([status])
  @@index([actionType])
  @@map("pending_actions")
  @@schema("iit_schema")
}

model IitTaskRun {
  id         String    @id @default(uuid())
  projectId  String    @map("project_id")
  
  taskType   String    @map("task_type")
  status     String    @default("pending")
  
  startedAt  DateTime? @map("started_at")
  completedAt DateTime? @map("completed_at")
  
  result     Json?
  error      String?
  
  createdAt  DateTime  @default(now()) @map("created_at")
  
  project    IitProject @relation(fields: [projectId], references: [id], onDelete: Cascade)
  
  @@index([projectId])
  @@index([status])
  @@index([taskType])
  @@map("task_runs")
  @@schema("iit_schema")
}

model IitUserMapping {
  id              String   @id @default(uuid())
  projectId       String   @map("project_id")
  
  redcapUsername  String?  @map("redcap_username")
  wechatUserId    String?  @map("wechat_user_id")
  wechatOpenId    String?  @map("wechat_open_id")
  
  name            String?
  role            String?
  
  createdAt       DateTime @default(now()) @map("created_at")
  updatedAt       DateTime @updatedAt @map("updated_at")
  
  project         IitProject @relation(fields: [projectId], references: [id], onDelete: Cascade)
  
  @@index([projectId])
  @@index([redcapUsername])
  @@index([wechatUserId])
  @@map("user_mappings")
  @@schema("iit_schema")
}

model IitAuditLog {
  id         String   @id @default(uuid())
  projectId  String   @map("project_id")
  
  actionType String   @map("action_type")
  operator   String?
  
  entityId   String?  @map("entity_id")
  entityType String?  @map("entity_type")
  
  details    Json?
  
  createdAt  DateTime @default(now()) @map("created_at")
  
  project    IitProject @relation(fields: [projectId], references: [id], onDelete: Cascade)
  
  @@index([projectId])
  @@index([actionType])
  @@index([createdAt])
  @@map("audit_logs")
  @@schema("iit_schema")
}

5.4 Run the Prisma Sync

cd backend

# 1. After updating schema.prisma

# 2. Generate the Prisma Client
npx prisma generate

# 3. Push the schema to the database (use caution in production)
npx prisma db push

# 4. Verify
npx prisma studio

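📋 6. Step-by-Step Checklist

Step 1: Back up the database (required)

# Option 1: create a manual backup through the RDS console
# RDS console → Backup and Restore → Manual Backup

# Option 2: use pg_dump (requires public network access to be enabled)
pg_dump -h pgm-2zex1m2y3r23hdn5.pg.rds.aliyuncs.com -p 5432 -U airesearch -d ai_clinical_research -F c -f backup_20260126.dump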

Step 2: Install the pg_bigm extension

-- Connect to the production database
\c ai_clinical_research

-- Install the extension
CREATE EXTENSION IF NOT EXISTS pg_bigm;

-- Verify
SELECT extname, extversion FROM pg_extension WHERE extname = 'pg_bigm';

Step 3: Install the pgvector extension

-- Install the extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Verify
SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';

Step 4: Create the test database

-- Create the database
CREATE DATABASE ai_clinical_research_test
  WITH OWNER = airesearch ENCODING = 'UTF8';

-- Switch to the test database
\c ai_clinical_research_test

-- Create all schemas
CREATE SCHEMA IF NOT EXISTS platform_schema;
CREATE SCHEMA IF NOT EXISTS aia_schema;
CREATE SCHEMA IF NOT EXISTS pkb_schema;
CREATE SCHEMA IF NOT EXISTS asl_schema;
CREATE SCHEMA IF NOT EXISTS dc_schema;
CREATE SCHEMA IF NOT EXISTS iit_schema;
CREATE SCHEMA IF NOT EXISTS admin_schema;
CREATE SCHEMA IF NOT EXISTS ssa_schema;
CREATE SCHEMA IF NOT EXISTS st_schema;
CREATE SCHEMA IF NOT EXISTS rvw_schema;
CREATE SCHEMA IF NOT EXISTS common_schema;

-- Install the extensions
CREATE EXTENSION IF NOT EXISTS pg_bigm;
CREATE EXTENSION IF NOT EXISTS vector;

Step 5: Update the Prisma Schema

cd backend

# 1. Edit prisma/schema.prisma and add iit_schema to the schemas array
# 2. Add the IIT module's model definitions

# 3. Generate the Client
npx prisma generate

# 4. Push to the production database
npx prisma db push

Step 6: Verify

-- Check the extensions
SELECT extname, extversion FROM pg_extension;

-- Check the schemas
SELECT schema_name FROM information_schema.schemata;

-- Check the tables in iit_schema
SELECT table_name FROM information_schema.tables WHERE table_schema = 'iit_schema';

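⚠️ 7. Notes

7.1 Risks

  1. Back up first: take a backup before any database change
  2. Test environment first: verify in the test environment before touching production
  3. Extension compatibility: confirm that the RDS version supports the required extensions
  4. Connection monitoring: watch the connection count during schema sync

7.2 Rollback Plan

Extension rollback

-- Drop the extensions (use caution: CASCADE also drops dependent objects, such as indexes and columns that use the extension's types!)
DROP EXTENSION pg_bigm CASCADE;
DROP EXTENSION vector CASCADE;

Schema rollback

-- Restore from the backup
-- Use the restore feature in the RDS console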

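📞 8. Troubleshooting

Problem 1: Extension installation fails

Possible causes

  • The RDS version does not support the extension
  • Insufficient privileges

Solutions

  • Check the list of extensions supported by RDS
  • Install with a superuser account

Problem 2: prisma db push fails

Possible causes

  • The schema already exists
  • Table structure conflicts

Solutions

  • Use the --accept-data-loss flag (with caution!)
  • Manually adjust the conflicting table structures

Problem 3: Connection limit exceeded

Possible causes

  • Prisma connection pools were not closed
  • Multiple instances are connected at the same time

Solutions

  • Lower the connection_limit parameter
  • Run the migration in batches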
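
A quick way to size connection_limit is to multiply it by the number of service instances and compare the result with the database's connection ceiling. The Python sketch below shows the arithmetic. Only the value 18 comes from this document's DATABASE_URL; `MAX_CONNECTIONS`, `RESERVED` and the instance counts are hypothetical and should be replaced with the real values for the RDS instance.

```python
def total_connections(instances: int, connection_limit: int) -> int:
    """Worst-case connections if every instance fills its pool."""
    return instances * connection_limit


def max_limit_per_instance(instances: int, max_connections: int, reserved: int) -> int:
    """Largest connection_limit that keeps all pools within the budget."""
    return (max_connections - reserved) // instances


CONNECTION_LIMIT = 18   # from the DATABASE_URL in this document
MAX_CONNECTIONS = 100   # hypothetical database ceiling
RESERVED = 10           # hypothetical headroom for admin and migration sessions

for instances in (4, 6):
    used = total_connections(instances, CONNECTION_LIMIT)
    fits = used <= MAX_CONNECTIONS - RESERVED
    print(instances, used, fits)
# 4 72 True
# 6 108 False

print(max_limit_per_instance(6, MAX_CONNECTIONS, RESERVED))  # 15
```

Under these assumed numbers, six instances would need connection_limit lowered to 15 or less to stay within budget.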

Last updated: 2026-01-26
Maintainers: development team
Reference: Alibaba Cloud RDS PostgreSQL extension documentation