# 📦 PostgreSQL Database Upgrade Plan
> **Document version**: v1.0
> **Created**: 2026-01-26
> **Scope**: Alibaba Cloud RDS PostgreSQL 15
> **Change type**: Extension installation + environment separation
---
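## 📋 1. Change Overview
### 1.1 Changes
| Change | Description | Priority |
|--------|------|--------|
| **pg_bigm extension** | Full-text search enhancement; 2-gram indexing that handles Chinese text without a word segmenter | 🔴 High |
| **pgvector extension** | Vector storage, supporting RAG vector retrieval | 🔴 High |
| **Test/production separation** | Create a separate test database | 🟡 Medium |
| **Prisma Schema sync** | Ensure iit_schema is configured correctly | 🔴 High |
### 1.2 Current Database State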
```yaml
Instance ID: pgm-2zex1m2y3r23hdn5
Spec: 2 vCPU, 4 GB (pg.n2.2c.1m)
Storage: 100 GB SSD
Version: PostgreSQL 15.0
Database name: ai_clinical_research
Internal endpoint: pgm-2zex1m2y3r23hdn5.pg.rds.aliyuncs.com:5432
```
### 1.3 Target State
```yaml
Extensions:
  - pg_bigm: installed
  - pgvector: installed
Databases:
  - ai_clinical_research: production
  - ai_clinical_research_test: test
Schemas:
  - platform_schema
  - aia_schema
  - pkb_schema
  - asl_schema
  - dc_schema
  - iit_schema  # make sure this one has been added
  - admin_schema
  - ssa_schema
  - st_schema
  - rvw_schema
  - common_schema
  - public
```
---
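## 🔧 2. Installing the pg_bigm Extension
### 2.1 About the Extension
**pg_bigm** is a full-text search extension for PostgreSQL that is particularly well suited to searching CJK text such as Chinese and Japanese.
**Key features**:
- Handles Chinese text through 2-gram (bigram) indexing, with no word segmenter required
- Good fuzzy-search performance
- Can accelerate LIKE queries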
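
To make the bigram idea concrete, the following Python sketch splits a string into overlapping 2-character tokens and uses them as a prefilter for substring search. It is a simplified illustration only, not pg_bigm's actual implementation: the helper names `bigrams` and `may_match` are hypothetical, and details such as padding and normalization are left out.

```python
def bigrams(text: str) -> list[str]:
    """Return the overlapping 2-character substrings of text, in order."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def may_match(document: str, keyword: str) -> bool:
    """Index-style prefilter: every bigram of the keyword must occur in the document.

    This is only a necessary condition for a substring match, so a real
    engine would still recheck the candidate rows against the LIKE pattern.
    """
    doc_grams = set(bigrams(document))
    return all(gram in doc_grams for gram in bigrams(keyword))


if __name__ == "__main__":
    print(bigrams("测试文本"))
    print(may_match("这是一个测试文本", "测试"))
```

Because the tokens are fixed-width character pairs, no dictionary or language-specific segmentation is needed, which is why this approach works for text without word boundaries.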
### 2.2 Check Whether the Extension Is Available
```sql
-- Log in to RDS PostgreSQL and check the available extensions
SELECT * FROM pg_available_extensions WHERE name = 'pg_bigm';
```
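### 2.3 Install pg_bigm
**Option 1: Install through the RDS console (recommended)**
1. Log in to the Alibaba Cloud RDS console
2. Go to instance details → Database Management
3. Click "Extension Management"
4. Search for `pg_bigm` and click Install
**Option 2: Install with SQL**
```sql
-- Requires superuser privileges
CREATE EXTENSION IF NOT EXISTS pg_bigm;
-- Verify the installation
SELECT * FROM pg_extension WHERE extname = 'pg_bigm';
```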
### 2.4 Verify pg_bigm
```sql
-- Test Chinese fuzzy search
CREATE TABLE test_bigm (content TEXT);
INSERT INTO test_bigm VALUES ('这是一个测试文本');
-- Create a pg_bigm index
CREATE INDEX idx_bigm ON test_bigm USING gin (content gin_bigm_ops);
-- Test the search
SELECT * FROM test_bigm WHERE content LIKE '%测试%';
-- Drop the test table
DROP TABLE test_bigm;
```
---
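## 🔧 3. Installing the pgvector Extension
### 3.1 About the Extension
**pgvector** is a vector storage and retrieval extension for PostgreSQL, used in AI/RAG scenarios.
**Key features**:
- Stores high-dimensional vectors (embeddings)
- Supports vector similarity search (L2 distance, inner product, cosine similarity)
- Integrates seamlessly with native PostgreSQL features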
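
The three similarity measures listed above can be sketched in plain Python. This is an illustration of the math only, under the assumption that vectors are equal-length lists of numbers; the function names are hypothetical. In pgvector the corresponding operators are `<->` (L2 distance), `<#>` (negative inner product), and `<=>` (cosine distance).

```python
import math


def l2_distance(a, b):
    """Euclidean distance, the quantity pgvector's <-> operator returns."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Plain inner product; pgvector's <#> operator returns its negative."""
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity, the quantity pgvector's <=> operator returns."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


if __name__ == "__main__":
    print(l2_distance([1, 2, 3], [4, 5, 6]))
    print(inner_product([1, 2, 3], [4, 5, 6]))
    print(cosine_distance([1, 0], [0, 1]))
```

For all three operators a smaller value means a closer match, which is why the inner product is negated: it lets `ORDER BY ... LIMIT k` work the same way for every metric.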
### 3.2 Check Whether the Extension Is Available
```sql
-- Check the available extensions
SELECT * FROM pg_available_extensions WHERE name = 'vector';
```
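### 3.3 Install pgvector
**Option 1: Install through the RDS console (recommended)**
1. Log in to the Alibaba Cloud RDS console
2. Go to instance details → Database Management
3. Click "Extension Management"
4. Search for `vector` and click Install
**Option 2: Install with SQL**
```sql
-- Requires superuser privileges
CREATE EXTENSION IF NOT EXISTS vector;
-- Verify the installation
SELECT * FROM pg_extension WHERE extname = 'vector';
```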
### 3.4 Verify pgvector
```sql
-- Test the vector functionality
CREATE TABLE test_vectors (
  id SERIAL PRIMARY KEY,
  embedding vector(3)
);
-- Insert test data
INSERT INTO test_vectors (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');
-- Test vector search (L2 distance)
SELECT * FROM test_vectors ORDER BY embedding <-> '[2,3,4]' LIMIT 1;
-- Drop the test table
DROP TABLE test_vectors;
```
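
As a cross-check for the query above, this small Python sketch performs the same nearest-neighbour lookup by brute force. It assumes the two inserted rows received the ids 1 and 2 from the `SERIAL` column (true for a freshly created table); `nearest` is a hypothetical helper.

```python
import math

# Rows as inserted by the verification script; ids 1 and 2 are assumed.
ROWS = {1: [1, 2, 3], 2: [4, 5, 6]}
QUERY = [2, 3, 4]


def nearest(rows, query):
    """Return the id of the row whose vector has the smallest L2 distance to query."""
    return min(rows, key=lambda row_id: math.dist(rows[row_id], query))


if __name__ == "__main__":
    for row_id, vector in ROWS.items():
        print(row_id, math.dist(vector, QUERY))
    print("nearest id:", nearest(ROWS, QUERY))
```

The row `[1,2,3]` is at distance √3 from the query and `[4,5,6]` at distance √12, so the SQL query is expected to return the `[1,2,3]` row.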
---
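## 🗃️ 4. Separating the Test and Production Databases
### 4.1 Options
**Option A: Create a new database on the same RDS instance (recommended)**
- Pros: low cost, simple to manage
- Cons: shared resources, so the two environments may affect each other
**Option B: Create a new RDS instance**
- Pros: full isolation, no mutual impact
- Cons: doubles the cost
**Recommendation**: Option A (a new database on the same instance)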
### 4.2 Create the Test Database
```sql
-- Log in as a superuser
-- Create the test database
CREATE DATABASE ai_clinical_research_test
  WITH
  OWNER = airesearch
  ENCODING = 'UTF8'
  LC_COLLATE = 'en_US.utf8'
  LC_CTYPE = 'en_US.utf8'
  TEMPLATE template0;
-- Grant privileges
GRANT ALL PRIVILEGES ON DATABASE ai_clinical_research_test TO airesearch;
```
### 4.3 Create the Schemas in the Test Database
```sql
-- Switch to the test database
\c ai_clinical_research_test
-- Create all schemas
CREATE SCHEMA IF NOT EXISTS platform_schema;
CREATE SCHEMA IF NOT EXISTS aia_schema;
CREATE SCHEMA IF NOT EXISTS pkb_schema;
CREATE SCHEMA IF NOT EXISTS asl_schema;
CREATE SCHEMA IF NOT EXISTS dc_schema;
CREATE SCHEMA IF NOT EXISTS iit_schema;
CREATE SCHEMA IF NOT EXISTS admin_schema;
CREATE SCHEMA IF NOT EXISTS ssa_schema;
CREATE SCHEMA IF NOT EXISTS st_schema;
CREATE SCHEMA IF NOT EXISTS rvw_schema;
CREATE SCHEMA IF NOT EXISTS common_schema;
-- Grant privileges
GRANT ALL ON SCHEMA platform_schema TO airesearch;
GRANT ALL ON SCHEMA aia_schema TO airesearch;
GRANT ALL ON SCHEMA pkb_schema TO airesearch;
GRANT ALL ON SCHEMA asl_schema TO airesearch;
GRANT ALL ON SCHEMA dc_schema TO airesearch;
GRANT ALL ON SCHEMA iit_schema TO airesearch;
GRANT ALL ON SCHEMA admin_schema TO airesearch;
GRANT ALL ON SCHEMA ssa_schema TO airesearch;
GRANT ALL ON SCHEMA st_schema TO airesearch;
GRANT ALL ON SCHEMA rvw_schema TO airesearch;
GRANT ALL ON SCHEMA common_schema TO airesearch;
-- Install the extensions (each database needs its own installation)
CREATE EXTENSION IF NOT EXISTS pg_bigm;
CREATE EXTENSION IF NOT EXISTS vector;
```
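
Since the `CREATE SCHEMA` and `GRANT` statements above repeat one pattern per schema, they can also be generated from a single list. The sketch below is one possible way to do that in Python; `schema_setup_sql` is a hypothetical helper, and the identifier check is a deliberately strict assumption (lowercase letters, digits, and underscores only).

```python
import re

# Schema names and role taken from the SQL script above.
SCHEMAS = [
    "platform_schema", "aia_schema", "pkb_schema", "asl_schema",
    "dc_schema", "iit_schema", "admin_schema", "ssa_schema",
    "st_schema", "rvw_schema", "common_schema",
]
ROLE = "airesearch"


def schema_setup_sql(schemas, role):
    """Return CREATE SCHEMA statements followed by GRANT statements."""
    for name in [*schemas, role]:
        # Reject anything that is not a simple identifier before building SQL text.
        if not re.fullmatch(r"[a-z_][a-z0-9_]*", name):
            raise ValueError(f"unsafe identifier: {name!r}")
    creates = [f"CREATE SCHEMA IF NOT EXISTS {name};" for name in schemas]
    grants = [f"GRANT ALL ON SCHEMA {name} TO {role};" for name in schemas]
    return creates + grants


if __name__ == "__main__":
    print("\n".join(schema_setup_sql(SCHEMAS, ROLE)))
```

The output can be piped into `psql` after connecting to the test database. Keeping the list in one place also makes it harder to forget a schema such as `iit_schema`.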
### 4.4 Connection Strings
```bash
# Test environment DATABASE_URL
DATABASE_URL=postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5.pg.rds.aliyuncs.com:5432/ai_clinical_research_test?connection_limit=18&pool_timeout=10
# Production environment DATABASE_URL (unchanged)
DATABASE_URL=postgresql://airesearch:Xibahe%40fengzhibo117@pgm-2zex1m2y3r23hdn5.pg.rds.aliyuncs.com:5432/ai_clinical_research?connection_limit=18&pool_timeout=10
```
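
The `%40` in the connection strings above is the percent-encoded form of `@`, which must be escaped when it appears inside a password. A minimal Python sketch of building such a URL safely follows. The user, password, host, and database names in it are hypothetical placeholders, and `build_database_url` is an illustrative helper rather than part of Prisma.

```python
from urllib.parse import quote, unquote, urlsplit


def build_database_url(user, password, host, port, database,
                       connection_limit=18, pool_timeout=10):
    """Build a PostgreSQL URL, percent-encoding the credentials."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
        f"?connection_limit={connection_limit}&pool_timeout={pool_timeout}"
    )


if __name__ == "__main__":
    # Placeholder values only; never hard-code real credentials.
    url = build_database_url("app_user", "p@ss/word", "db.example.internal", 5432, "app_test")
    print(url)
    print(unquote(urlsplit(url).password))
```

Reading the password from an environment variable or a secret store and encoding it at startup avoids both the escaping mistake and keeping credentials in documentation.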
---
## 📐 5. Prisma Schema Sync
### 5.1 Current Problem
The `schemas` array currently in `prisma/schema.prisma`:
```prisma
schemas = ["platform_schema", "aia_schema", "pkb_schema", "asl_schema", "common_schema", "dc_schema", "rvw_schema", "admin_schema", "ssa_schema", "st_schema", "public"]
```
**Problem**: `iit_schema` is missing
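
A gap like this can be caught automatically. The following Python sketch extracts the `schemas` array from the text of a `schema.prisma` file and reports which required schemas are absent. It is a rough check based on a regular expression, not a Prisma parser, and the helper names are hypothetical.

```python
import re

# The schema list from the target state, in the order used there.
REQUIRED = [
    "platform_schema", "aia_schema", "pkb_schema", "asl_schema",
    "dc_schema", "iit_schema", "admin_schema", "ssa_schema",
    "st_schema", "rvw_schema", "common_schema", "public",
]

# The array as it currently appears in schema.prisma.
CURRENT = (
    'schemas = ["platform_schema", "aia_schema", "pkb_schema", "asl_schema", '
    '"common_schema", "dc_schema", "rvw_schema", "admin_schema", "ssa_schema", '
    '"st_schema", "public"]'
)


def declared_schemas(prisma_source):
    """Return the names listed in the first `schemas = [...]` array, if any."""
    match = re.search(r"schemas\s*=\s*\[(.*?)\]", prisma_source, re.DOTALL)
    if match is None:
        return []
    return re.findall(r'"([^"]+)"', match.group(1))


def missing_schemas(prisma_source, required):
    """Return the required schema names that the source does not declare."""
    declared = set(declared_schemas(prisma_source))
    return [name for name in required if name not in declared]


if __name__ == "__main__":
    print(missing_schemas(CURRENT, REQUIRED))
```

Run against the array shown above, the check reports `iit_schema` as the only missing entry.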
### 5.2 Fix
Edit `backend/prisma/schema.prisma`:
```prisma
datasource db {
  provider = "postgresql"
  url = env("DATABASE_URL")
  schemas = ["platform_schema", "aia_schema", "pkb_schema", "asl_schema", "common_schema", "dc_schema", "rvw_schema", "admin_schema", "ssa_schema", "st_schema", "iit_schema", "public"]
}
```
### 5.3 IIT Schema Model Definitions
Add the IIT module's model definitions at the end of `schema.prisma`:
```prisma
// ==================== IIT Manager Agent module ====================
model IitProject {
  id String @id @default(uuid())
  name String
  description String?
  status String @default("active")
  // REDCap configuration
  redcapApiUrl String? @map("redcap_api_url")
  redcapApiToken String? @map("redcap_api_token")
  redcapProjectId Int? @map("redcap_project_id")
  // Dify configuration
  difyDatasetId String? @map("dify_dataset_id")
  difyAgentUrl String? @map("dify_agent_url")
  // Notification configuration
  notificationConfig Json? @map("notification_config")
  // Sync status
  lastSyncAt DateTime? @map("last_sync_at")
  createdAt DateTime @default(now()) @map("created_at")
  updatedAt DateTime @updatedAt @map("updated_at")
  pendingActions IitPendingAction[]
  taskRuns IitTaskRun[]
  userMappings IitUserMapping[]
  auditLogs IitAuditLog[]
  @@index([status])
  @@index([redcapProjectId])
  @@map("projects")
  @@schema("iit_schema")
}
model IitPendingAction {
  id String @id @default(uuid())
  projectId String @map("project_id")
  actionType String @map("action_type")
  status String @default("pending")
  entityId String? @map("entity_id")
  entityType String? @map("entity_type")
  payload Json?
  result Json?
  createdAt DateTime @default(now()) @map("created_at")
  updatedAt DateTime @updatedAt @map("updated_at")
  project IitProject @relation(fields: [projectId], references: [id], onDelete: Cascade)
  @@index([projectId])
  @@index([status])
  @@index([actionType])
  @@map("pending_actions")
  @@schema("iit_schema")
}
model IitTaskRun {
  id String @id @default(uuid())
  projectId String @map("project_id")
  taskType String @map("task_type")
  status String @default("pending")
  startedAt DateTime? @map("started_at")
  completedAt DateTime? @map("completed_at")
  result Json?
  error String?
  createdAt DateTime @default(now()) @map("created_at")
  project IitProject @relation(fields: [projectId], references: [id], onDelete: Cascade)
  @@index([projectId])
  @@index([status])
  @@index([taskType])
  @@map("task_runs")
  @@schema("iit_schema")
}
model IitUserMapping {
  id String @id @default(uuid())
  projectId String @map("project_id")
  redcapUsername String? @map("redcap_username")
  wechatUserId String? @map("wechat_user_id")
  wechatOpenId String? @map("wechat_open_id")
  name String?
  role String?
  createdAt DateTime @default(now()) @map("created_at")
  updatedAt DateTime @updatedAt @map("updated_at")
  project IitProject @relation(fields: [projectId], references: [id], onDelete: Cascade)
  @@index([projectId])
  @@index([redcapUsername])
  @@index([wechatUserId])
  @@map("user_mappings")
  @@schema("iit_schema")
}
model IitAuditLog {
  id String @id @default(uuid())
  projectId String @map("project_id")
  actionType String @map("action_type")
  operator String?
  entityId String? @map("entity_id")
  entityType String? @map("entity_type")
  details Json?
  createdAt DateTime @default(now()) @map("created_at")
  project IitProject @relation(fields: [projectId], references: [id], onDelete: Cascade)
  @@index([projectId])
  @@index([actionType])
  @@index([createdAt])
  @@map("audit_logs")
  @@schema("iit_schema")
}
```
### 5.4 Run the Prisma Sync
```bash
cd backend
# 1. After updating schema.prisma
# 2. Generate the Prisma Client
npx prisma generate
# 3. Push the schema to the database (be careful in production)
npx prisma db push
# 4. Verify
npx prisma studio
```
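
Because `prisma db push` acts on whatever `DATABASE_URL` points to, a small guard in front of it can reduce the risk of pushing to production by accident. The Python sketch below is one possible guard, assuming the naming convention used in this document, where the test database name ends in `_test`. The helper names and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit


def target_database(database_url):
    """Return the database name from a PostgreSQL connection URL."""
    return urlsplit(database_url).path.lstrip("/")


def is_safe_to_push(database_url, allow_production=False):
    """Allow a push only to *_test databases unless explicitly overridden."""
    return allow_production or target_database(database_url).endswith("_test")


if __name__ == "__main__":
    # Placeholder URLs; a wrapper script would read DATABASE_URL from the environment.
    test_url = "postgresql://user:secret@db.example.internal:5432/ai_clinical_research_test?connection_limit=18"
    prod_url = "postgresql://user:secret@db.example.internal:5432/ai_clinical_research?connection_limit=18"
    print(target_database(test_url), is_safe_to_push(test_url))
    print(target_database(prod_url), is_safe_to_push(prod_url))
```

A deployment script could call `is_safe_to_push` first and run `npx prisma db push` only when it returns true, requiring an explicit override for production.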
---
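## 📋 6. Step-by-Step Checklist
### Step 1: Back Up the Database (Required)
```bash
# Option 1: Create a manual backup through the RDS console
# RDS console → Backup and Restore → Manual Backup
# Option 2: Use pg_dump (requires public network access to be enabled)
pg_dump -h pgm-2zex1m2y3r23hdn5.pg.rds.aliyuncs.com -p 5432 -U airesearch -d ai_clinical_research -F c -f backup_20260126.dump
```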
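
If backups are taken repeatedly, the date in the file name is easy to get wrong by hand. As a sketch, the `pg_dump` invocation above can be assembled in Python with a date-stamped output file. `pg_dump_command` is a hypothetical helper, and the host, user, and database values in the usage lines are placeholders.

```python
import datetime
import shlex


def pg_dump_command(host, user, database, day, port=5432):
    """Return the pg_dump argument list for a custom-format, date-stamped backup."""
    outfile = f"backup_{day:%Y%m%d}.dump"
    return [
        "pg_dump", "-h", host, "-p", str(port), "-U", user,
        "-d", database, "-F", "c", "-f", outfile,
    ]


if __name__ == "__main__":
    command = pg_dump_command(
        "db.example.internal", "app_user", "app_db", datetime.date.today()
    )
    # Printed for review; pass the list to subprocess.run(command, check=True) to execute it.
    print(shlex.join(command))
```

Building the command as a list rather than a string avoids shell quoting problems when it is handed to `subprocess.run`.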
### Step 2: Install the pg_bigm Extension
```sql
-- Connect to the production database
\c ai_clinical_research
-- Install the extension
CREATE EXTENSION IF NOT EXISTS pg_bigm;
-- Verify
SELECT extname, extversion FROM pg_extension WHERE extname = 'pg_bigm';
```
### Step 3: Install the pgvector Extension
```sql
-- Install the extension
CREATE EXTENSION IF NOT EXISTS vector;
-- Verify
SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';
```
### Step 4: Create the Test Database
```sql
-- Create the database
CREATE DATABASE ai_clinical_research_test
  WITH OWNER = airesearch ENCODING = 'UTF8';
-- Switch to the test database
\c ai_clinical_research_test
-- Create all schemas
CREATE SCHEMA IF NOT EXISTS platform_schema;
CREATE SCHEMA IF NOT EXISTS aia_schema;
CREATE SCHEMA IF NOT EXISTS pkb_schema;
CREATE SCHEMA IF NOT EXISTS asl_schema;
CREATE SCHEMA IF NOT EXISTS dc_schema;
CREATE SCHEMA IF NOT EXISTS iit_schema;
CREATE SCHEMA IF NOT EXISTS admin_schema;
CREATE SCHEMA IF NOT EXISTS ssa_schema;
CREATE SCHEMA IF NOT EXISTS st_schema;
CREATE SCHEMA IF NOT EXISTS rvw_schema;
CREATE SCHEMA IF NOT EXISTS common_schema;
-- Install the extensions
CREATE EXTENSION IF NOT EXISTS pg_bigm;
CREATE EXTENSION IF NOT EXISTS vector;
```
### Step 5: Update the Prisma Schema
```bash
cd backend
# 1. Edit prisma/schema.prisma and add iit_schema to the schemas array
# 2. Add the model definitions for the IIT module
# 3. Generate the client
npx prisma generate
# 4. Push to the production database
npx prisma db push
```
### Step 6: Verify
```sql
-- Check the extensions
SELECT extname, extversion FROM pg_extension;
-- Check the schemas
SELECT schema_name FROM information_schema.schemata;
-- Check the tables in iit_schema
SELECT table_name FROM information_schema.tables WHERE table_schema = 'iit_schema';
```
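
The results of these verification queries can be compared against what this plan expects. The Python sketch below assumes the query results have already been collected into plain lists of names (how they are fetched is left open); `missing_items` is a hypothetical helper. The expected table names come from the `@@map` values of the IIT models.

```python
# Expected values derived from this document.
EXPECTED_EXTENSIONS = {"pg_bigm", "vector"}
EXPECTED_IIT_TABLES = {
    "projects", "pending_actions", "task_runs", "user_mappings", "audit_logs",
}


def missing_items(expected, found):
    """Return the expected names that are absent from the query result, sorted."""
    return sorted(set(expected) - set(found))


if __name__ == "__main__":
    # Sample results for illustration; plpgsql is PostgreSQL's built-in language extension.
    found_extensions = ["plpgsql", "pg_bigm"]
    found_tables = ["projects", "task_runs"]
    print("missing extensions:", missing_items(EXPECTED_EXTENSIONS, found_extensions))
    print("missing tables:", missing_items(EXPECTED_IIT_TABLES, found_tables))
```

An empty list for both checks indicates that the extensions and the IIT tables are in place.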
---
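## ⚠️ 7. Notes
### 7.1 Risks
1. **Back up first**: Back up before any database change
2. **Test environment first**: Verify in the test environment before touching production
3. **Extension compatibility**: Confirm that the RDS version supports the required extensions
4. **Connection monitoring**: Watch the connection count during schema sync
### 7.2 Rollback Plan
**Extension rollback**:
```sql
-- Drop the extensions (use with caution: CASCADE also drops dependent objects, such as vector columns and pg_bigm indexes!)
DROP EXTENSION pg_bigm CASCADE;
DROP EXTENSION vector CASCADE;
```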
**Schema rollback**:
```sql
-- Restore from the backup
-- Use the restore feature in the RDS console
```
---
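## 📞 8. Troubleshooting
### Problem 1: Extension installation fails
**Possible causes**:
- The RDS version does not support the extension
- Insufficient privileges
**Solutions**:
- Check the list of extensions supported by RDS
- Install using a superuser account
### Problem 2: `prisma db push` fails
**Possible causes**:
- The schema already exists
- Conflicting table structures
**Solutions**:
- Use the `--accept-data-loss` flag (with caution!)
- Manually adjust the conflicting table structures
### Problem 3: Connection limit exceeded
**Possible causes**:
- Prisma connection pools are not closed
- Multiple instances connect at the same time
**Solutions**:
- Lower the `connection_limit` parameter
- Run migrations in batches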
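
The connection budget behind Problem 3 is simple arithmetic: each application instance opens up to `connection_limit` connections, so the total grows with the number of instances. The Python sketch below works through it. The figures for the server's maximum connections, the reserved connections, and the instance count are hypothetical; the real maximum depends on the RDS instance specification.

```python
def total_connections(instances, connection_limit):
    """Worst-case number of connections opened by all application instances."""
    return instances * connection_limit


def per_instance_limit(max_connections, reserved, instances):
    """Largest connection_limit per instance that stays within the server budget."""
    if instances < 1:
        raise ValueError("instances must be at least 1")
    return max((max_connections - reserved) // instances, 0)


if __name__ == "__main__":
    # connection_limit=18 comes from the DATABASE_URL in this document;
    # 4 instances, 400 max connections, and 40 reserved are assumed values.
    print(total_connections(instances=4, connection_limit=18))
    print(per_instance_limit(max_connections=400, reserved=40, instances=4))
```

Keeping some connections in reserve leaves room for migrations, admin sessions, and monitoring alongside the application pools.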
---
> **Last updated**: 2026-01-26
> **Maintainer**: Development team
> **Reference**: [Alibaba Cloud RDS PostgreSQL extension documentation](https://help.aliyun.com/document_detail/142340.html)