Major Changes:
- Database: Install pg_bigm/pgvector plugins, create test database
- Python service: v1.0 -> v1.1, add pymupdf4llm/openpyxl/pypandoc
- Node.js backend: v1.3 -> v1.7, fix pino-pretty and ES Module imports
- Frontend: v1.2 -> v1.3, skip TypeScript check for deployment
- Code recovery: Restore empty files from local backup
Technical Fixes:
- Fix pino-pretty error in production (conditional loading)
- Fix ES Module import paths (add .js extensions)
- Fix OSSAdapter TypeScript errors
- Update Prisma Schema (63 models, 16 schemas)
- Update environment variables (DATABASE_URL, EXTRACTION_SERVICE_URL, OSS)
- Remove deprecated variables (REDIS_URL, DIFY_API_URL, DIFY_API_KEY)
Documentation:
- Create 0126 deployment folder with 8 documents
- Update database development standards v2.0
- Update SAE deployment status records
Deployment Status:
- PostgreSQL: ai_clinical_research_test with plugins
- Python: v1.1 @ 172.17.173.84:8000
- Backend: v1.7 @ 172.17.173.89:3001
- Frontend: v1.3 @ 172.17.173.90:80
Tested: All services running successfully on SAE
📦 PostgreSQL Database Upgrade Plan
Document version: v1.0
Created: 2026-01-26
Scope: Alibaba Cloud RDS PostgreSQL 15
Change type: Plugin installation + environment separation
📋 1. Change Overview
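1.1 Changes
| Change | Description | Priority |
|---|---|---|
| pg_bigm plugin | Full-text search enhancement; 2-gram indexing that handles Chinese text without a word segmenter | 🔴 High |
| pgvector plugin | Vector storage for RAG vector retrieval | 🔴 High |
| Test/production environment separation | Create a separate test database | 🟡 Medium |
| Prisma Schema sync | Ensure iit_schema is configured correctly | 🔴 High |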
1.2 Current Database State
Instance ID: pgm-2zex1m2y3r23hdn5
Spec: 2 vCPU, 4 GB (pg.n2.2c.1m)
Storage: 100 GB SSD
Version: PostgreSQL 15.0
Database name: ai_clinical_research
Intranet address: pgm-2zex1m2y3r23hdn5.pg.rds.aliyuncs.com:5432
1.3 Target State
Plugins:
- pg_bigm: installed
- pgvector: installed
Databases:
- ai_clinical_research: production
- ai_clinical_research_test: test
Schema list:
- platform_schema
- aia_schema
- pkb_schema
- asl_schema
- dc_schema
- iit_schema # make sure this has been added
- admin_schema
- ssa_schema
- st_schema
- rvw_schema
- common_schema
- public
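🔧 2. pg_bigm Plugin Installation
2.1 About the Plugin
pg_bigm is a full-text search extension for PostgreSQL that is particularly well suited to searching CJK text such as Chinese and Japanese.
Key features:
- Indexes text as 2-grams (bigrams), so Chinese text is searchable without a word segmenter
- Good fuzzy-search performance
- Can speed up LIKE queries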
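To make the 2-gram idea concrete, here is a minimal Python sketch of how text can be split into bigrams and how a bigram index can pre-filter LIKE candidates. It is a simplified illustration, not pg_bigm's actual implementation; the names `bigrams` and `may_match` are hypothetical, and the one-space padding on each side is intended to mirror what pg_bigm's `show_bigm` function reports.

```python
def bigrams(text: str) -> list[str]:
    """Return the sorted, de-duplicated 2-grams of text, padded with one space per side."""
    padded = f" {text} "
    return sorted({padded[i:i + 2] for i in range(len(padded) - 1)})


def may_match(document: str, keyword: str) -> bool:
    """Index-style pre-filter for LIKE '%keyword%'.

    Every interior 2-gram of the keyword must occur in the document.
    A True result is only a candidate; the real LIKE comparison still has to run.
    """
    needed = {keyword[i:i + 2] for i in range(len(keyword) - 1)}
    return needed <= set(bigrams(document))


print(bigrams("测试"))
print(may_match("这是一个测试文本", "测试"))
print(may_match("这是一个测试文本", "文档"))
```

A one-character keyword produces no interior 2-grams, so this pre-filter cannot narrow anything down for it; that is one reason very short search terms benefit less from a bigram index.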
2.2 Check Whether the Plugin Is Available
-- Log in to RDS PostgreSQL and check the available plugins
SELECT * FROM pg_available_extensions WHERE name = 'pg_bigm';
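2.3 Install pg_bigm
Option 1: Install through the RDS console (recommended)
- Log in to the Alibaba Cloud RDS console
- Go to instance details → Database Management
- Click "Plugin Management"
- Search for pg_bigm and click Install
Option 2: Install with SQL
-- Requires superuser privileges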
CREATE EXTENSION IF NOT EXISTS pg_bigm;
-- Verify the installation
SELECT * FROM pg_extension WHERE extname = 'pg_bigm';
2.4 Verify pg_bigm
-- Test Chinese fuzzy search
CREATE TABLE test_bigm (content TEXT);
INSERT INTO test_bigm VALUES ('这是一个测试文本');
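-- Create a pg_bigm index
CREATE INDEX idx_bigm ON test_bigm USING gin (content gin_bigm_ops);
-- Test the search (on a table this small the planner may still choose a sequential scan)
SELECT * FROM test_bigm WHERE content LIKE '%测试%';
-- Drop the test table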
DROP TABLE test_bigm;
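🔧 3. pgvector Plugin Installation
3.1 About the Plugin
pgvector is a vector storage and retrieval extension for PostgreSQL, used for AI/RAG workloads.
Key features:
- Stores high-dimensional vectors (embeddings)
- Supports vector similarity search (L2 distance, inner product, cosine distance)
- Integrates seamlessly with native PostgreSQL features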
3.2 Check Whether the Plugin Is Available
-- Check the available plugins
SELECT * FROM pg_available_extensions WHERE name = 'vector';
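3.3 Install pgvector
Option 1: Install through the RDS console (recommended)
- Log in to the Alibaba Cloud RDS console
- Go to instance details → Database Management
- Click "Plugin Management"
- Search for vector and click Install
Option 2: Install with SQL
-- Requires superuser privileges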
CREATE EXTENSION IF NOT EXISTS vector;
-- Verify the installation
SELECT * FROM pg_extension WHERE extname = 'vector';
3.4 Verify pgvector
-- Test the vector functionality
CREATE TABLE test_vectors (
  id SERIAL PRIMARY KEY,
  embedding vector(3)
);
-- Insert test data
INSERT INTO test_vectors (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');
-- Test vector search (L2 distance)
SELECT * FROM test_vectors ORDER BY embedding <-> '[2,3,4]' LIMIT 1;
-- Drop the test table
DROP TABLE test_vectors;
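The `<->` operator in the query above computes L2 (Euclidean) distance; pgvector also provides `<#>` for negative inner product and `<=>` for cosine distance. As a rough cross-check outside the database, the sketch below recomputes these metrics in plain Python for the same three-dimensional test vectors. The helper names are illustrative and are not part of pgvector.

```python
import math


def l2_distance(a, b):
    """Euclidean distance, the metric behind pgvector's <-> operator."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_inner_product(a, b):
    """Negated dot product, the metric behind pgvector's <#> operator."""
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity, the metric behind pgvector's <=> operator."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


# Same data as the test table: id 1 -> [1,2,3], id 2 -> [4,5,6]
rows = {1: [1, 2, 3], 2: [4, 5, 6]}
query = [2, 3, 4]

# Equivalent of ORDER BY embedding <-> '[2,3,4]' LIMIT 1
nearest = min(rows, key=lambda row_id: l2_distance(rows[row_id], query))
print(nearest)
```

If the SQL query returns a different row id than this sketch, the extension or the inserted data is worth a second look.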
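🗃️ 4. Test/Production Database Separation
4.1 Separation Options
Option A: Create a new database on the same RDS instance (recommended)
- Pros: low cost, simple to manage
- Cons: shared resources, so the two environments can affect each other
Option B: Create a new RDS instance
- Pros: full isolation, no mutual impact
- Cons: doubles the cost
Recommended: Option A (new database on the same instance)
4.2 Create the Test Database
-- Log in as superuser
-- Create the test database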
CREATE DATABASE ai_clinical_research_test
WITH
OWNER = airesearch
ENCODING = 'UTF8'
LC_COLLATE = 'en_US.utf8'
LC_CTYPE = 'en_US.utf8'
TEMPLATE template0;
-- Grant privileges
GRANT ALL PRIVILEGES ON DATABASE ai_clinical_research_test TO airesearch;
4.3 Create the Schemas in the Test Database
-- Switch to the test database
\c ai_clinical_research_test
-- Create all schemas
CREATE SCHEMA IF NOT EXISTS platform_schema;
CREATE SCHEMA IF NOT EXISTS aia_schema;
CREATE SCHEMA IF NOT EXISTS pkb_schema;
CREATE SCHEMA IF NOT EXISTS asl_schema;
CREATE SCHEMA IF NOT EXISTS dc_schema;
CREATE SCHEMA IF NOT EXISTS iit_schema;
CREATE SCHEMA IF NOT EXISTS admin_schema;
CREATE SCHEMA IF NOT EXISTS ssa_schema;
CREATE SCHEMA IF NOT EXISTS st_schema;
CREATE SCHEMA IF NOT EXISTS rvw_schema;
CREATE SCHEMA IF NOT EXISTS common_schema;
-- Grant privileges
GRANT ALL ON SCHEMA platform_schema TO airesearch;
GRANT ALL ON SCHEMA aia_schema TO airesearch;
GRANT ALL ON SCHEMA pkb_schema TO airesearch;
GRANT ALL ON SCHEMA asl_schema TO airesearch;
GRANT ALL ON SCHEMA dc_schema TO airesearch;
GRANT ALL ON SCHEMA iit_schema TO airesearch;
GRANT ALL ON SCHEMA admin_schema TO airesearch;
GRANT ALL ON SCHEMA ssa_schema TO airesearch;
GRANT ALL ON SCHEMA st_schema TO airesearch;
GRANT ALL ON SCHEMA rvw_schema TO airesearch;
GRANT ALL ON SCHEMA common_schema TO airesearch;
-- Install the plugins (each database needs its own installation)
CREATE EXTENSION IF NOT EXISTS pg_bigm;
CREATE EXTENSION IF NOT EXISTS vector;
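Because the CREATE SCHEMA and GRANT statements above differ only in the schema name, they can be generated from a single list, which helps keep the test and production scripts in step. This is a minimal sketch, assuming the schema names and the `airesearch` role listed in this document; `schema_setup_sql` is a hypothetical helper, and the names are interpolated unquoted, so it is only suitable for a fixed, trusted list.

```python
SCHEMAS = [
    "platform_schema", "aia_schema", "pkb_schema", "asl_schema",
    "dc_schema", "iit_schema", "admin_schema", "ssa_schema",
    "st_schema", "rvw_schema", "common_schema",
]
ROLE = "airesearch"


def schema_setup_sql(schemas: list[str], role: str) -> list[str]:
    """Build the CREATE SCHEMA and GRANT statements for every schema name."""
    statements = []
    for name in schemas:
        statements.append(f"CREATE SCHEMA IF NOT EXISTS {name};")
        statements.append(f"GRANT ALL ON SCHEMA {name} TO {role};")
    return statements


print("\n".join(schema_setup_sql(SCHEMAS, ROLE)))
```

The printed script can be reviewed and then run with psql against the test database.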
4.4 Test Environment Connection String
# Test environment DATABASE_URL
DATABASE_URL=postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5.pg.rds.aliyuncs.com:5432/ai_clinical_research_test?connection_limit=18&pool_timeout=10
# Production DATABASE_URL (unchanged)
DATABASE_URL=postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5.pg.rds.aliyuncs.com:5432/ai_clinical_research?connection_limit=18&pool_timeout=10
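In these URLs `%40` is the percent-encoded form of `@`; a literal `@` in the password would otherwise be read as the separator before the host. The sketch below shows one way to assemble such a URL with Python's standard library. The password `example@password` is a placeholder and `build_database_url` is a hypothetical helper, not something the project is known to contain.

```python
from urllib.parse import quote, urlencode, urlsplit


def build_database_url(user, password, host, port, database, **params):
    """Assemble a PostgreSQL URL, percent-encoding the user name and password."""
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    query = f"?{urlencode(params)}" if params else ""
    return f"postgresql://{credentials}@{host}:{port}/{database}{query}"


url = build_database_url(
    "airesearch",
    "example@password",  # placeholder, not a real credential
    "pgm-2zex1m2y3r23hdn5.pg.rds.aliyuncs.com",
    5432,
    "ai_clinical_research_test",
    connection_limit=18,
    pool_timeout=10,
)
print(url)
print(urlsplit(url).path)
```

Only the database name at the end of the path differs between the test and production URLs, so building both from one function reduces the chance of the two drifting apart.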
📐 5. Prisma Schema Sync
5.1 Current Problem
The schemas array currently in prisma/schema.prisma:
schemas = ["platform_schema", "aia_schema", "pkb_schema", "asl_schema", "common_schema", "dc_schema", "rvw_schema", "admin_schema", "ssa_schema", "st_schema", "public"]
Problem: iit_schema is missing!
5.2 Fix
Edit backend/prisma/schema.prisma:
datasource db {
  provider = "postgresql"
  url = env("DATABASE_URL")
  schemas = ["platform_schema", "aia_schema", "pkb_schema", "asl_schema", "common_schema", "dc_schema", "rvw_schema", "admin_schema", "ssa_schema", "st_schema", "iit_schema", "public"]
}
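A mismatch like this is easy to miss by eye, so it can help to compare the schemas array against the expected list automatically. The following is a minimal sketch, assuming the target schema list from section 1.3; `declared_schemas` is a hypothetical helper that reads the array with a regular expression rather than a real Prisma parser.

```python
import re

EXPECTED_SCHEMAS = {
    "platform_schema", "aia_schema", "pkb_schema", "asl_schema",
    "dc_schema", "iit_schema", "admin_schema", "ssa_schema",
    "st_schema", "rvw_schema", "common_schema", "public",
}


def declared_schemas(prisma_source: str) -> set[str]:
    """Extract the names listed in the datasource `schemas = [...]` array."""
    match = re.search(r"schemas\s*=\s*\[(.*?)\]", prisma_source, re.DOTALL)
    if match is None:
        return set()
    return set(re.findall(r'"([^"]+)"', match.group(1)))


# The array as it stood before the fix
before_fix = (
    'schemas = ["platform_schema", "aia_schema", "pkb_schema", "asl_schema", '
    '"common_schema", "dc_schema", "rvw_schema", "admin_schema", "ssa_schema", '
    '"st_schema", "public"]'
)
missing = EXPECTED_SCHEMAS - declared_schemas(before_fix)
print(sorted(missing))
```

In practice the text would be read from backend/prisma/schema.prisma instead of an inline string.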
5.3 IIT Schema Model Definitions
Append the IIT module model definitions to the end of schema.prisma:
// ==================== IIT Manager Agent module ====================
model IitProject {
  id String @id @default(uuid())
  name String
  description String?
  status String @default("active")
  // REDCap configuration
  redcapApiUrl String? @map("redcap_api_url")
  redcapApiToken String? @map("redcap_api_token")
  redcapProjectId Int? @map("redcap_project_id")
  // Dify configuration
  difyDatasetId String? @map("dify_dataset_id")
  difyAgentUrl String? @map("dify_agent_url")
  // Notification configuration
  notificationConfig Json? @map("notification_config")
  // Sync status
  lastSyncAt DateTime? @map("last_sync_at")
  createdAt DateTime @default(now()) @map("created_at")
  updatedAt DateTime @updatedAt @map("updated_at")
  pendingActions IitPendingAction[]
  taskRuns IitTaskRun[]
  userMappings IitUserMapping[]
  auditLogs IitAuditLog[]
  @@index([status])
  @@index([redcapProjectId])
  @@map("projects")
  @@schema("iit_schema")
}
model IitPendingAction {
  id String @id @default(uuid())
  projectId String @map("project_id")
  actionType String @map("action_type")
  status String @default("pending")
  entityId String? @map("entity_id")
  entityType String? @map("entity_type")
  payload Json?
  result Json?
  createdAt DateTime @default(now()) @map("created_at")
  updatedAt DateTime @updatedAt @map("updated_at")
  project IitProject @relation(fields: [projectId], references: [id], onDelete: Cascade)
  @@index([projectId])
  @@index([status])
  @@index([actionType])
  @@map("pending_actions")
  @@schema("iit_schema")
}
model IitTaskRun {
  id String @id @default(uuid())
  projectId String @map("project_id")
  taskType String @map("task_type")
  status String @default("pending")
  startedAt DateTime? @map("started_at")
  completedAt DateTime? @map("completed_at")
  result Json?
  error String?
  createdAt DateTime @default(now()) @map("created_at")
  project IitProject @relation(fields: [projectId], references: [id], onDelete: Cascade)
  @@index([projectId])
  @@index([status])
  @@index([taskType])
  @@map("task_runs")
  @@schema("iit_schema")
}
model IitUserMapping {
  id String @id @default(uuid())
  projectId String @map("project_id")
  redcapUsername String? @map("redcap_username")
  wechatUserId String? @map("wechat_user_id")
  wechatOpenId String? @map("wechat_open_id")
  name String?
  role String?
  createdAt DateTime @default(now()) @map("created_at")
  updatedAt DateTime @updatedAt @map("updated_at")
  project IitProject @relation(fields: [projectId], references: [id], onDelete: Cascade)
  @@index([projectId])
  @@index([redcapUsername])
  @@index([wechatUserId])
  @@map("user_mappings")
  @@schema("iit_schema")
}
model IitAuditLog {
  id String @id @default(uuid())
  projectId String @map("project_id")
  actionType String @map("action_type")
  operator String?
  entityId String? @map("entity_id")
  entityType String? @map("entity_type")
  details Json?
  createdAt DateTime @default(now()) @map("created_at")
  project IitProject @relation(fields: [projectId], references: [id], onDelete: Cascade)
  @@index([projectId])
  @@index([actionType])
  @@index([createdAt])
  @@map("audit_logs")
  @@schema("iit_schema")
}
5.4 Run the Prisma Sync
cd backend
# 1. After updating schema.prisma
# 2. Generate the Prisma Client
npx prisma generate
# 3. Push the schema to the database (caution: be careful in production)
npx prisma db push
# 4. Verify
npx prisma studio
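Since `prisma db push` acts on whichever database DATABASE_URL points at, a small pre-flight check can help avoid pushing to production by accident. This is a sketch under the assumption that the production database is the one named `ai_clinical_research`; the function names are hypothetical and the check is not part of Prisma.

```python
from urllib.parse import urlsplit

# Assumption: only this database name counts as production
PRODUCTION_DATABASES = {"ai_clinical_research"}


def target_database(database_url: str) -> str:
    """Return the database name from the path component of the URL."""
    return urlsplit(database_url).path.lstrip("/")


def is_production(database_url: str) -> bool:
    """True when the URL points at a database listed as production."""
    return target_database(database_url) in PRODUCTION_DATABASES


# Placeholder URLs with a dummy password and host
test_url = "postgresql://airesearch:secret@db.example:5432/ai_clinical_research_test?connection_limit=18"
prod_url = "postgresql://airesearch:secret@db.example:5432/ai_clinical_research?connection_limit=18"

for url in (test_url, prod_url):
    action = "ask for confirmation" if is_production(url) else "safe to push"
    print(target_database(url), "->", action)
```

A wrapper script could read DATABASE_URL from the environment and stop before step 3 unless the operator explicitly confirms a production push.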
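📋 6. Step-by-Step Checklist
Step 1: Back up the database (required)
# Option 1: Create a manual backup through the RDS console
# RDS console → Backup and Restore → Manual Backup
# Option 2: Use pg_dump (requires public network access to be enabled)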
pg_dump -h pgm-2zex1m2y3r23hdn5.pg.rds.aliyuncs.com -p 5432 -U airesearch -d ai_clinical_research -F c -f backup_20260126.dump
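The file name above has the date hard-coded, which is easy to forget to update on the next run. One possible way to avoid that is to build the command with the date filled in, as in the sketch below; `pg_dump_command` is a hypothetical helper, and the flags are the same ones used in the command above.

```python
from datetime import date


def pg_dump_command(host: str, user: str, database: str, day: date, port: int = 5432) -> list[str]:
    """Build the pg_dump argument list with a date-stamped custom-format backup file."""
    backup_file = f"backup_{day:%Y%m%d}.dump"
    return [
        "pg_dump",
        "-h", host,
        "-p", str(port),
        "-U", user,
        "-d", database,
        "-F", "c",  # custom format, restorable with pg_restore
        "-f", backup_file,
    ]


command = pg_dump_command(
    "pgm-2zex1m2y3r23hdn5.pg.rds.aliyuncs.com",
    "airesearch",
    "ai_clinical_research",
    date(2026, 1, 26),
)
print(" ".join(command))
```

The list can be passed to `subprocess.run(command, check=True)`; pg_dump reads the password from the PGPASSWORD environment variable or a ~/.pgpass file, so it does not need to appear on the command line.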
Step 2: Install the pg_bigm plugin
-- Connect to the production database
\c ai_clinical_research
-- Install the plugin
CREATE EXTENSION IF NOT EXISTS pg_bigm;
-- Verify
SELECT extname, extversion FROM pg_extension WHERE extname = 'pg_bigm';
Step 3: Install the pgvector plugin
-- Install the plugin
CREATE EXTENSION IF NOT EXISTS vector;
-- Verify
SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';
Step 4: Create the test database
-- Create the database
CREATE DATABASE ai_clinical_research_test
WITH OWNER = airesearch ENCODING = 'UTF8';
-- Switch to the test database
\c ai_clinical_research_test
-- Create all schemas
CREATE SCHEMA IF NOT EXISTS platform_schema;
CREATE SCHEMA IF NOT EXISTS aia_schema;
CREATE SCHEMA IF NOT EXISTS pkb_schema;
CREATE SCHEMA IF NOT EXISTS asl_schema;
CREATE SCHEMA IF NOT EXISTS dc_schema;
CREATE SCHEMA IF NOT EXISTS iit_schema;
CREATE SCHEMA IF NOT EXISTS admin_schema;
CREATE SCHEMA IF NOT EXISTS ssa_schema;
CREATE SCHEMA IF NOT EXISTS st_schema;
CREATE SCHEMA IF NOT EXISTS rvw_schema;
CREATE SCHEMA IF NOT EXISTS common_schema;
-- Install the plugins
CREATE EXTENSION IF NOT EXISTS pg_bigm;
CREATE EXTENSION IF NOT EXISTS vector;
Step 5: Update the Prisma Schema
cd backend
# 1. Edit prisma/schema.prisma and add iit_schema to the schemas array
# 2. Add the IIT module model definitions
# 3. Generate the Client
npx prisma generate
# 4. Push to the production database
npx prisma db push
Step 6: Verify
-- Check the plugins
SELECT extname, extversion FROM pg_extension;
-- Check the schemas
SELECT schema_name FROM information_schema.schemata;
-- Check the tables in iit_schema
SELECT table_name FROM information_schema.tables WHERE table_schema = 'iit_schema';
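The results of these queries can also be compared against the expected state automatically. The sketch below assumes the query results have already been fetched into plain lists (the database access itself is left out); the expected table names come from the @@map values of the IIT models, and `missing_items` is a hypothetical helper.

```python
EXPECTED = {
    "extensions": {"pg_bigm", "vector"},
    "schemas": {"iit_schema"},
    "iit_tables": {"projects", "pending_actions", "task_runs", "user_mappings", "audit_logs"},
}


def missing_items(actual: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return, per category, the expected names that the query results lack."""
    report = {}
    for key, expected in EXPECTED.items():
        missing = sorted(expected - set(actual.get(key, [])))
        if missing:
            report[key] = missing
    return report


# Made-up query results for illustration only
report = missing_items({
    "extensions": ["plpgsql", "pg_bigm"],
    "schemas": ["public", "iit_schema"],
    "iit_tables": ["projects", "task_runs"],
})
print(report)
```

An empty report means every expected extension, schema, and table was found.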
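⚠️ 7. Notes
7.1 Risk Warnings
- Back up first: take a backup before any database change
- Test environment first: validate in the test environment before touching production
- Plugin compatibility: confirm that the RDS version supports the required plugins
- Connection monitoring: watch the connection count during schema sync
7.2 Rollback Plan
Plugin rollback:
-- Remove the plugins (caution: CASCADE also drops dependent objects, such as vector columns and indexes built with gin_bigm_ops!)
DROP EXTENSION pg_bigm CASCADE;
DROP EXTENSION vector CASCADE;
Schema rollback:
-- Restore from the backup
-- Use the restore feature in the RDS console
📞 8. Troubleshooting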
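Problem 1: Plugin installation fails
Possible causes:
- The RDS version does not support the plugin
- Insufficient privileges
Solutions:
- Check the list of plugins supported by RDS
- Install with a superuser account
Problem 2: prisma db push fails
Possible causes:
- The schema already exists
- Conflicting table structure
Solutions:
- Use the --accept-data-loss flag (with caution!)
- Manually adjust the conflicting table structure
Problem 3: Connection limit exceeded
Possible causes:
- Prisma connection pool not closed
- Multiple instances connecting at the same time
Solutions:
- Lower the connection_limit parameter
- Run the migration in batches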
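Each Prisma Client instance keeps its own pool of up to connection_limit connections, so the total load on the database grows with the number of running instances. A quick calculation like the one below can show whether the current setting of 18 fits. The instance count, the max_connections value, and the reserved headroom are hypothetical numbers; the real limit can be read with `SHOW max_connections;`.

```python
def total_connections(instances: int, connection_limit: int) -> int:
    """Worst-case number of connections opened by all application instances."""
    return instances * connection_limit


def max_limit_per_instance(instances: int, max_connections: int, reserved: int = 10) -> int:
    """Largest connection_limit that leaves `reserved` connections for admin tools."""
    return max((max_connections - reserved) // instances, 1)


instances = 4          # hypothetical: backend replicas sharing the database
max_connections = 100  # hypothetical: replace with the value from SHOW max_connections

print(total_connections(instances, 18))
print(max_limit_per_instance(instances, max_connections))
```

When the first number exceeds what the database allows, lowering connection_limit to the second number (or running fewer instances during the migration) keeps the total within budget.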
Last updated: 2026-01-26
Maintainer: Development team
Reference: Alibaba Cloud RDS PostgreSQL plugin documentation