AIclinicalresearch/DC模块代码恢复指南.md
HaHafeng 4c6eaaecbf feat(dc): Implement Postgres-Only async architecture and performance optimization
Summary:
- Implement async file upload processing (Platform-Only pattern)
- Add parseExcelWorker with pg-boss queue
- Implement React Query polling mechanism
- Add clean data caching (avoid duplicate parsing)
- Fix pivot single-value column tuple issue
- Optimize performance by 99 percent

Technical Details:

1. Async Architecture (Postgres-Only):
   - SessionService.createSession: Fast upload + push to queue (3s)
   - parseExcelWorker: Background parsing + save clean data (53s)
   - SessionController.getSessionStatus: Status query API for polling
   - React Query Hook: useSessionStatus (auto-serial polling)
   - Frontend progress bar with real-time feedback

2. Performance Optimization:
   - Clean data caching: Worker saves processed data to OSS
   - getPreviewData: Read from clean data cache (0.5s vs 43s, -99 percent)
   - getFullData: Read from clean data cache (0.5s vs 43s, -99 percent)
   - Intelligent cleaning: Boundary detection + ghost column/row removal
   - Safety valve: Max 3000 columns, 5M cells

3. Bug Fixes:
   - Fix pivot column name tuple issue for single value column
   - Fix queue name format (colon to underscore: asl:screening -> asl_screening)
   - Fix polling storm (15+ concurrent requests -> 1 serial request)
   - Fix QUEUE_TYPE environment variable (memory -> pgboss)
   - Fix logger import in PgBossQueue
   - Fix formatSession to return cleanDataKey
   - Fix saveProcessedData to update clean data synchronously

4. Database Changes:
   - ALTER TABLE dc_tool_c_sessions ADD COLUMN clean_data_key VARCHAR(1000)
   - ALTER TABLE dc_tool_c_sessions ALTER COLUMN total_rows DROP NOT NULL
   - ALTER TABLE dc_tool_c_sessions ALTER COLUMN total_cols DROP NOT NULL
   - ALTER TABLE dc_tool_c_sessions ALTER COLUMN columns DROP NOT NULL

5. Documentation:
   - Create Postgres-Only async task processing guide (588 lines)
   - Update Tool C status document (Day 10 summary)
   - Update DC module status document
   - Update system overview document
   - Update cloud-native development guide

Performance Improvements:
- Upload + preview: 96s -> 53.5s (-44 percent)
- Filter operation: 44s -> 2.5s (-94 percent)
- Pivot operation: 45s -> 2.5s (-94 percent)
- Concurrent requests: 15+ -> 1 (-93 percent)
- Complete workflow (upload + 7 ops): 404s -> 70.5s (-83 percent)

Files Changed:
- Backend: 15 files (Worker, Service, Controller, Schema, Config)
- Frontend: 4 files (Hook, Component, API)
- Docs: 4 files (Guide, Status, Overview, Spec)
- Database: 4 column modifications
- Total: ~1388 lines of new/modified code

Status: Fully tested and verified, production ready
2025-12-22 21:30:31 +08:00

# DC Module Code Recovery Guide
> **Goal**: Recover the lost DC module code from the Cursor cache
>
> **Database location**: `C:\Users\zhibo\AppData\Roaming\Cursor\User\workspaceStorage\d5e3431d02cbaa0109f69d72300733da\state.vscdb`
---
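## 📋 Summary of Recovery Methods
### Method 1: Use Cursor's Built-in Timeline (Simplest) ⭐⭐⭐⭐⭐
**Applies when**: the file was saved to disk at least once (even if it was deleted later)
#### Steps:
1. **Open Cursor IDE**
2. **Open the Explorer**
- Click the "Files" icon in the left sidebar
- Or press `Ctrl+Shift+E`
3. **Find the Timeline panel**
- Look for the collapsible "TIMELINE" panel at the bottom of the Explorer
- If it is not visible, right-click the Explorer title bar → check "Timeline"
4. **Browse the file history**
- In the file tree, try navigating to these paths: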
```
backend/src/modules/dc/tool-b/services/HealthCheckService.ts
backend/src/modules/dc/tool-b/services/TemplateService.ts
backend/src/modules/dc/tool-b/services/DualModelExtractionService.ts
backend/src/modules/dc/tool-b/services/ConflictDetectionService.ts
backend/src/modules/dc/tool-b/controllers/ExtractionController.ts
```
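- Click any of these files (even if the file is missing or empty)
- Check the Timeline panel, which lists the historical snapshots of that file
5. **Restore the file**
- In the Timeline, find the most recent version (each entry carries a timestamp)
- Right-click the historical version → select "Restore"
- The file content is restored to the selected version

**Repeat the steps above to restore all DC module files**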
---
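### Method 2: Restore Deleted Files via the Command Palette ⭐⭐⭐⭐
**Applies when**: the file has been deleted entirely but was saved at some point
#### Steps:
1. **Open the Command Palette**
- Windows: `Ctrl+Shift+P`
- Mac: `Cmd+Shift+P`
2. **Search for the restore command**
- Type: `Local History: Find Entry to Restore`
- Select the command
3. **Search for the file**
- Enter the file name, for example: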
```
HealthCheckService
```
- Or a more precise path:
```
backend/src/modules/dc/tool-b/services/HealthCheckService.ts
```
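4. **Select a version and restore it**
- Pick the most recent version from the search results
- Confirm the restore

**Repeat the steps above to search for and restore all DC module files**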
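
Methods 1 and 2 both read from the editor's local history store, so if the UI does not surface a file, the store can also be scanned directly. The sketch below assumes Cursor keeps local history the way VS Code does: one folder per file under `User/History`, each holding an `entries.json` index (a `resource` URI plus an `entries` list of `id`/`timestamp` pairs) next to the snapshot files. That layout, the `History` location, and the function name `find_history_entries` are assumptions and are not confirmed by this guide for Cursor.

```python
import json
from pathlib import Path
from urllib.parse import unquote, urlparse


def find_history_entries(history_root: Path, needle: str) -> list[tuple[str, Path, int]]:
    """Return (original path, snapshot file, timestamp) tuples, newest first.

    Assumes the VS Code local-history layout: <history_root>/<hash>/entries.json
    with the snapshot files stored in the same folder.
    """
    hits = []
    for index_file in history_root.glob("*/entries.json"):
        try:
            index = json.loads(index_file.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or malformed index files
        # "resource" is a file URI; decode it back to a plain path string.
        resource = unquote(urlparse(index.get("resource", "")).path)
        if needle not in resource:
            continue
        for entry in index.get("entries", []):
            entry_id = entry.get("id")
            if not entry_id:
                continue
            snapshot = index_file.parent / entry_id
            if snapshot.is_file():  # ignore index entries whose snapshot is gone
                hits.append((resource, snapshot, entry.get("timestamp", 0)))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)
```

For example, `find_history_entries(Path.home() / "AppData/Roaming/Cursor/User/History", "modules/dc/")` would list the newest DC module snapshots first, provided the assumed folder exists on the machine.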
---
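### Method 3: Extract Directly from the SQLite Database ⭐⭐⭐⭐⭐ (Last Resort)
**Applies when**: the code was never written to disk and exists only in Chat/Composer conversations
#### Preparation:
1. **Install DB Browser for SQLite**
- Download page: https://sqlitebrowser.org/dl/
- Or download directly: https://github.com/sqlitebrowser/sqlitebrowser/releases
2. **Copy the database file (important!)**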
```powershell
# Run in PowerShell
$source = "C:\Users\zhibo\AppData\Roaming\Cursor\User\workspaceStorage\d5e3431d02cbaa0109f69d72300733da\state.vscdb"
$backup = "D:\MyCursor\AIclinicalresearch\state.vscdb.backup"
Copy-Item $source $backup
Write-Host "✅ Database backed up to: $backup"
```
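
A plain file copy can capture the database mid-write if the editor is still running. As a hedged alternative, the sketch below takes the snapshot through SQLite's online backup API, exposed in Python as `sqlite3.Connection.backup`, and opens the source read-only. The function name `snapshot_database` is illustrative, and the row count assumes the `ItemTable` table that the extraction steps rely on.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def snapshot_database(source: Path, backup: Path) -> int:
    """Copy a SQLite database via the online backup API.

    Returns the number of rows in ItemTable in the copy, as a quick sanity check.
    """
    # Open the source through a file: URI so it can be marked read-only.
    source_uri = f"{source.resolve().as_uri()}?mode=ro"
    with closing(sqlite3.connect(source_uri, uri=True)) as src, \
            closing(sqlite3.connect(backup)) as dst:
        src.backup(dst)  # page-by-page copy into the destination file
        (count,) = dst.execute("SELECT COUNT(*) FROM ItemTable").fetchone()
        return count
```

A non-zero return value suggests the copy is readable and contains stored state. It does not prove that the chat history itself is intact.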
#### Extraction steps:
1. **Open DB Browser**
- Launch "DB Browser for SQLite"
- File → Open Database
- Select the backup database file: `D:\MyCursor\AIclinicalresearch\state.vscdb.backup`
2. **Query the Chat history**
- Click the "Browse Data" tab
- Select `ItemTable` from the "Table" drop-down
3. **Search for DC module records**
- Click the "Filter" button
- In the filter box of the "key" column, enter:
```
chat
```
- Or:
```
composer
```
4. **Look for keywords**
- Search the "value" column for the following keywords (use Ctrl+F):
- `HealthCheckService`
- `DualModelExtractionService`
- `ConflictDetectionService`
- `TemplateService`
- `dc_health_checks`
- `dc_extraction_tasks`
- `ExtractionController`
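5. **Export the data**
- Find the rows that contain code
- Double-click the "value" cell to view its full content
- The value is usually JSON that contains the AI-generated code blocks
- Copy the code into a text editor
6. **Extract the code blocks**
- In the JSON, the code usually sits in a structure like this: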
```json
{
"messages": [
{
"content": "```typescript\n[your code]\n```"
}
]
}
```
- Extract all the code between each opening ```` ```typescript ```` fence and its closing ```` ``` ````
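
The manual search and copy in steps 3 to 6 can be sketched as a script. It assumes `ItemTable` exposes `key` and `value` columns as shown in DB Browser and that the value decodes to JSON. Because the exact nesting may differ from the structure shown above, the sketch walks every string in the JSON instead of relying on a fixed `messages[].content` path. `extract_code_blocks`, `iter_strings`, and `KEYWORDS` are illustrative names.

```python
import json
import re
import sqlite3
from contextlib import closing
from pathlib import Path

# Keywords from step 4 of the extraction steps.
KEYWORDS = [
    "HealthCheckService", "DualModelExtractionService", "ConflictDetectionService",
    "TemplateService", "dc_health_checks", "dc_extraction_tasks", "ExtractionController",
]

TICKS = "`" * 3  # a Markdown code fence marker
FENCE = re.compile(TICKS + r"(?:typescript|ts)[^\n]*\n(.*?)" + TICKS, re.DOTALL)


def iter_strings(node):
    """Yield every string nested anywhere inside a decoded JSON value."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for child in node.values():
            yield from iter_strings(child)
    elif isinstance(node, list):
        for child in node:
            yield from iter_strings(child)


def extract_code_blocks(db_path: Path, keywords=KEYWORDS) -> list[tuple[str, str]]:
    """Return (ItemTable key, code) pairs for fenced TypeScript blocks that mention a keyword."""
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT key, value FROM ItemTable "
            "WHERE key LIKE '%chat%' OR key LIKE '%composer%'"
        ).fetchall()
    results = []
    for key, value in rows:
        if value is None:
            continue
        text = value.decode("utf-8", errors="replace") if isinstance(value, bytes) else str(value)
        try:
            strings = list(iter_strings(json.loads(text)))
        except json.JSONDecodeError:
            strings = [text]  # not JSON: fall back to scanning the raw text
        for chunk in strings:
            for match in FENCE.finditer(chunk):
                code = match.group(1)
                if any(word in code for word in keywords):
                    results.append((key, code))
    return results
```

Run it against the backup copy, never the live database. Each returned code string can then be reviewed and saved under the matching path from the file list.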
---
## 🎯 Priority Files to Look For
| File path | Purpose | Priority |
|---------|------|--------|
| `backend/src/modules/dc/tool-b/services/HealthCheckService.ts` | Health check service | ⭐⭐⭐⭐⭐ |
| `backend/src/modules/dc/tool-b/services/TemplateService.ts` | Template service | ⭐⭐⭐⭐⭐ |
| `backend/src/modules/dc/tool-b/services/DualModelExtractionService.ts` | Dual-model extraction service | ⭐⭐⭐⭐ |
| `backend/src/modules/dc/tool-b/services/ConflictDetectionService.ts` | Conflict detection service | ⭐⭐⭐⭐ |
| `backend/src/modules/dc/tool-b/controllers/ExtractionController.ts` | Extraction controller | ⭐⭐⭐⭐⭐ |
| `backend/src/modules/dc/tool-b/routes/index.ts` | Route configuration | ⭐⭐⭐ |
| `backend/prisma/schema.prisma` (DC-related models) | Database models | ⭐⭐⭐⭐⭐ |
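
To track progress against this list, a small check can report which files are still missing or empty in the working tree. This is a minimal sketch: `recovery_status` and `TARGETS` are illustrative names, and the repository root is whatever folder contains `backend/`. A result of `ok` for `schema.prisma` only means the file exists and is non-empty, not that the DC models are back in it.

```python
from pathlib import Path

# Paths taken from the priority table above.
TARGETS = [
    "backend/src/modules/dc/tool-b/services/HealthCheckService.ts",
    "backend/src/modules/dc/tool-b/services/TemplateService.ts",
    "backend/src/modules/dc/tool-b/services/DualModelExtractionService.ts",
    "backend/src/modules/dc/tool-b/services/ConflictDetectionService.ts",
    "backend/src/modules/dc/tool-b/controllers/ExtractionController.ts",
    "backend/src/modules/dc/tool-b/routes/index.ts",
    "backend/prisma/schema.prisma",
]


def recovery_status(repo_root: Path, targets=TARGETS) -> dict[str, str]:
    """Classify each target file as 'ok', 'empty', or 'missing'."""
    status = {}
    for relative in targets:
        path = repo_root / relative
        if not path.is_file():
            status[relative] = "missing"
        elif path.stat().st_size == 0:
            status[relative] = "empty"
        else:
            status[relative] = "ok"
    return status
```

Calling `recovery_status(Path("D:/MyCursor/AIclinicalresearch"))` after each recovery attempt shows which entries still need work.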
---
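## 💡 Key Tips
1. **The Timeline method is the simplest**
- If the file was ever saved, this method has the highest success rate
- Even if the file is now empty, Timeline can usually still find earlier versions
2. **The Command Palette method is the fastest**
- Well suited to quickly restoring several deleted files
- Supports searching by partial file name
3. **The SQLite method is the most thorough**
- Can recover code that was never saved
- Requires some technical skill
- The final fallback
4. **Combine the methods**
- Try Methods 1 and 2 first (simple and fast)
- If they fail, move on to Method 3 (the last resort)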
---
## 🚀 After Recovery
Once you have found the code:
1. **Save it to a file immediately**
```
backend/src/modules/dc/tool-b/services/[filename].ts
```
2. **Commit to Git immediately**
```bash
git add .
git commit -m "recover(dc): Restore DC module code from Cursor cache"
git push origin master
```
3. **Verify code integrity**
- Check for syntax errors
- Confirm that all dependencies resolve correctly
---
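## 📞 Need Help?
If you run into problems during recovery:
1. Take a screenshot of the Timeline panel or the SQLite query results
2. Tell me exactly which step you are stuck on
3. I will provide further guidance
---
**🎯 Start recovering now! Try Method 1 (Timeline) first, since it is the simplest.**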