AIclinicalresearch/docs/07-运维文档/README.md
HaHafeng bbf98c4d5c fix(backend): Resolve PgBoss infinite loop issue and cleanup unused files
Backend fixes:
- Fix PgBoss task infinite loop on SAE (root cause: missing queue table constraints)
- Add singletonKey to prevent duplicate job enqueueing
- Add idempotency check in reviewWorker (skip completed tasks)
- Add optimistic locking in reviewService (atomic status update)

Frontend fixes:
- Add isSubmitting state to prevent duplicate submissions in RVW Dashboard
- Fix API baseURL in knowledgeBaseApi (relative path)

Cleanup (removed):
- Old frontend/ directory (migrated to frontend-v2)
- python-microservice/ (unused, replaced by extraction_service)
- Root package.json and node_modules (accidentally created)
- redcap-docker-dev/ (external dependency)
- Various temporary files and outdated docs in root

New documentation:
- docs/07-运维文档/01-PgBoss队列监控与维护.md
- docs/07-运维文档/02-故障预防检查清单.md
- docs/07-运维文档/03-数据库迁移注意事项.md

Database fix applied to RDS:
- Added PRIMARY KEY to platform_schema.queue
- Added 3 missing foreign key constraints

Tested: Local build passed, RDS constraints verified
2026-01-27 18:16:22 +08:00

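# Operations Documentation
> **Purpose**: Record operations material for the system, including monitoring, troubleshooting, and preventive measures
> **Created**: 2026-01-27
> **Maintainer**: Operations team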
---
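## 📚 Document Index
| Document | Description | Priority |
|------|------|--------|
| [01 - PgBoss Queue Monitoring and Maintenance](./01-PgBoss队列监控与维护.md) | Monitoring, cleanup, and troubleshooting for the pg-boss job queue | 🔴 High |
| [02 - Failure Prevention Checklist](./02-故障预防检查清单.md) | Pre- and post-deployment checklists for preventing common failures | 🔴 High |
| [03 - Database Migration Notes](./03-数据库迁移注意事项.md) | Items to check during database migrations so that constraints are not lost | 🔴 High |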
---
## 🔧 Quick Reference
### Routine Check SQL
```sql
-- Check for duplicate queue definitions (any returned row indicates a duplicate)
SELECT name, COUNT(*) AS cnt
FROM platform_schema.queue
GROUP BY name
HAVING COUNT(*) > 1;

-- Check the distribution of job states per queue
SELECT name, state, COUNT(*)
FROM platform_schema.job_common
GROUP BY name, state
ORDER BY name, state;
```
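The output of the second query is a flat list of `(name, state, count)` rows, which is hard to scan once there are many queues. The following is a minimal TypeScript sketch that pivots those rows into one entry per queue. It assumes the rows have already been fetched by whatever database client the backend uses; `JobStateRow` and `summarizeJobStates` are hypothetical names for illustration and are not part of pg-boss, and the queue names in the example are made up.

```typescript
// Hypothetical row shape for the job state distribution query above.
// Some clients (node-postgres, for example) return COUNT(*) as a string,
// so both number and string are accepted here.
interface JobStateRow {
  name: string;
  state: string;
  count: number | string;
}

// Pivot flat (name, state, count) rows into { queueName: { state: count } }.
function summarizeJobStates(
  rows: JobStateRow[],
): Record<string, Record<string, number>> {
  const summary: Record<string, Record<string, number>> = {};
  for (const row of rows) {
    if (!summary[row.name]) {
      summary[row.name] = {};
    }
    const perQueue = summary[row.name];
    perQueue[row.state] = (perQueue[row.state] || 0) + Number(row.count);
  }
  return summary;
}

// Example with made-up rows (not real data from the system).
const exampleSummary = summarizeJobStates([
  { name: "review", state: "active", count: "3" },
  { name: "review", state: "completed", count: 120 },
  { name: "extract", state: "created", count: 7 },
]);
console.log(exampleSummary);
```

A queue whose `active` count keeps growing between checks is a candidate for the stuck-job investigation described in the PgBoss monitoring document.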
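### Emergency Handling
1. **Job stuck in an infinite loop** → see [01 - PgBoss Queue Monitoring and Maintenance](./01-PgBoss队列监控与维护.md)
2. **Database connections exhausted** → see [03 - Database Operations Manual](./03-数据库运维手册.md)
3. **Service unavailable** → restart the SAE application and check the logs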
---
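## 📈 Monitoring and Alerts
| Metric | Threshold | Action |
|--------|------|---------|
| Duplicate queue definitions | > 1 row per queue name | Remove the duplicate entries |
| Active job count | > 100 | Check whether any jobs are stuck |
| Database connections | > 80% | Check for connection leaks |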
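
The thresholds in the table above can be expressed as a small check function. This is only a sketch under stated assumptions: the three values are collected elsewhere (for example from the routine check SQL), the connection threshold is interpreted as used connections divided by the configured maximum, and the `Metrics` shape and `evaluateAlerts` name are hypothetical rather than existing code in this project.

```typescript
// Hypothetical snapshot of the three monitored values from the table above.
interface Metrics {
  maxRowsPerQueueName: number; // highest COUNT(*) per name in platform_schema.queue
  activeJobs: number;          // number of jobs currently in the active state
  dbConnectionsUsed: number;   // connections currently open
  dbConnectionsMax: number;    // configured connection limit (assumed known)
}

// Return one message per threshold that is exceeded; an empty array means healthy.
function evaluateAlerts(m: Metrics): string[] {
  const alerts: string[] = [];
  if (m.maxRowsPerQueueName > 1) {
    alerts.push("Duplicate queue definitions: remove the duplicate entries");
  }
  if (m.activeJobs > 100) {
    alerts.push("Active job count above 100: check whether any jobs are stuck");
  }
  if (m.dbConnectionsMax > 0 && m.dbConnectionsUsed / m.dbConnectionsMax > 0.8) {
    alerts.push("Database connections above 80%: check for connection leaks");
  }
  return alerts;
}

// Example with made-up numbers (not measurements from the system).
const triggered = evaluateAlerts({
  maxRowsPerQueueName: 2,
  activeJobs: 40,
  dbConnectionsUsed: 90,
  dbConnectionsMax: 100,
});
console.log(triggered);
```

All comparisons are strict, matching the `>` in the table, so a value exactly at the threshold does not raise an alert.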
---
## 📝 Related Documents
- [Deployment documentation](../05-部署文档/README.md)
- [Testing documentation](../06-测试文档/README.md)
- [Failure analysis report](../06-测试文档/故障分析报告%20(1).md)