feat(platform): Complete Postgres-Only architecture refactoring (Phase 1-7)

Major Changes:
- Implement Platform-Only architecture pattern (unified task management)
- Add PostgresCacheAdapter for unified caching (platform_schema.app_cache)
- Add PgBossQueue for job queue management (platform_schema.job)
- Implement CheckpointService using job.data (generic for all modules)
- Add intelligent threshold-based dual-mode processing (THRESHOLD=50)
- Add task splitting mechanism (auto chunk size recommendation)
- Refactor ASL screening service with smart mode selection
- Refactor DC extraction service with smart mode selection
- Register workers for ASL and DC modules
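
The threshold-based mode selection and task splitting listed above could look roughly like the sketch below. Only THRESHOLD=50 comes from this commit message. The names `selectMode`, `recommendChunkSize`, and `splitIntoChunks`, the `>=` comparison at the threshold, and the chunk-size tiers are all assumptions made for illustration, not the code in this commit.

```typescript
// Illustrative sketch only. THRESHOLD = 50 is taken from the commit message;
// every function name and the chunk-size tiers below are hypothetical.
const THRESHOLD = 50;

type ProcessingMode = "direct" | "queue";

// Small tasks run inline; larger ones are handed to the job queue.
// Assumption: a task with exactly THRESHOLD items goes to the queue.
function selectMode(itemCount: number, threshold: number = THRESHOLD): ProcessingMode {
  return itemCount >= threshold ? "queue" : "direct";
}

// Assumed tiers for the "auto chunk size recommendation".
function recommendChunkSize(itemCount: number): number {
  if (itemCount <= 500) return 50;
  if (itemCount <= 5000) return 100;
  return 200;
}

// Split a task's items into consecutive chunks of at most chunkSize items.
function splitIntoChunks<T>(items: readonly T[], chunkSize: number): T[][] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let start = 0; start < items.length; start += chunkSize) {
    chunks.push(items.slice(start, start + chunkSize));
  }
  return chunks;
}
```

Under these assumptions, a task with 120 items would select "queue" mode and be split into chunks of 50, 50, and 20 items, each of which could be enqueued as its own job.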

Technical Highlights:
- All task management data stored in platform_schema.job.data (JSONB)
- Business tables remain clean (no task management fields)
- CheckpointService is generic (shared by all modules)
- Zero code duplication (DRY principle)
- Follows 3-layer architecture principle
- Zero additional cost (no Redis needed, saves 8400 CNY/year)
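
As a rough illustration of keeping checkpoint state in job.data rather than in business tables, the sketch below models the JSONB payload as a plain object. The field names mirror the columns dropped by the rollback migration in this commit. The `Checkpoint` and `JobData` types and the helpers `withCheckpoint` and `resumeBatchIndex` are hypothetical, and persisting the payload back to platform_schema.job is deliberately left out.

```typescript
// Illustrative sketch only; types and helper names are hypothetical.
// Fields mirror the columns removed from asl_schema.screening_tasks.
interface Checkpoint {
  totalBatches: number;
  processedBatches: number;
  currentBatchIndex: number;
  currentIndex: number;
  lastCheckpoint: string; // ISO 8601 timestamp
  checkpointData?: Record<string, unknown>; // module-specific extras
}

// Shape of the JSONB payload: module-specific fields plus an optional checkpoint.
interface JobData {
  [key: string]: unknown;
  checkpoint?: Checkpoint;
}

// Return a new payload with the checkpoint attached; the input is not mutated.
function withCheckpoint(data: JobData, checkpoint: Checkpoint): JobData {
  return { ...data, checkpoint };
}

// Batch index a worker should resume from; 0 when no checkpoint was saved.
function resumeBatchIndex(data: JobData): number {
  return data.checkpoint?.currentBatchIndex ?? 0;
}

// Usage: a worker records progress after finishing batches 0 and 1 of 5.
const initial: JobData = { taskId: "example-task" };
const saved = withCheckpoint(initial, {
  totalBatches: 5,
  processedBatches: 2,
  currentBatchIndex: 2,
  currentIndex: 100,
  lastCheckpoint: new Date().toISOString(),
});
```

Because the helpers only touch the generic `checkpoint` key, the same code could serve both the ASL and DC workers, which is the sense in which the service is described as generic.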

Code Statistics:
- New code: ~1750 lines
- Modified code: ~500 lines
- Test code: ~1800 lines
- Documentation: ~3000 lines

Testing:
- Unit tests: 8/8 passed
- Integration tests: 2/2 passed
- Architecture validation: passed
- Linter errors: 0

Files:
- Platform layer: PostgresCacheAdapter, PgBossQueue, CheckpointService, utils
- ASL module: screeningService, screeningWorker
- DC module: ExtractionController, extractionWorker
- Tests: 11 test files
- Docs: Updated 4 key documents

Status: Phase 1-7 completed, Phase 8-9 pending
2025-12-13 16:10:04 +08:00
parent a3586cdf30
commit fa72beea6c
135 changed files with 17508 additions and 91 deletions


@@ -0,0 +1,20 @@
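/**
* Rollback migration: remove task-management fields from business tables
*
* Reason: task splitting and checkpoint resume should be managed centrally by platform_schema.job (pg-boss),
* not redefined in each business table, in keeping with the 3-layer architecture principle
*
* Affected tables:
* - asl_schema.screening_tasks (drop 6 fields)
* - dc_schema.dc_extraction_tasks (nothing to add)
*/
-- Drop task-management fields from the ASL table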
ALTER TABLE asl_schema.screening_tasks
    DROP COLUMN IF EXISTS total_batches,
    DROP COLUMN IF EXISTS processed_batches,
    DROP COLUMN IF EXISTS current_batch_index,
    DROP COLUMN IF EXISTS current_index,
    DROP COLUMN IF EXISTS last_checkpoint,
    DROP COLUMN IF EXISTS checkpoint_data;