Files
AIclinicalresearch/run_recovery.ps1
HaHafeng 40c2f8e148 feat(rag): Complete RAG engine implementation with pgvector
Major Features:
- Created ekb_schema (13th schema) with 3 tables: KB/Document/Chunk
- Implemented EmbeddingService (text-embedding-v4, 1024-dim vectors)
- Implemented ChunkService (smart Markdown chunking)
- Implemented VectorSearchService (multi-query + hybrid search)
- Implemented RerankService (qwen3-rerank)
- Integrated DeepSeek V3 QueryRewriter for cross-language search
- Python service: Added pymupdf4llm for PDF-to-Markdown conversion
- PKB: Dual-mode adapter (pgvector/dify/hybrid)

Architecture:
- Brain-Hand Model: Business layer (DeepSeek) + Engine layer (pgvector)
- Cross-language support: Chinese query matches English documents
- Small Embedding (1024) + Strong Reranker strategy

Performance:
- End-to-end latency: 2.5s
- Cost per query: 0.0025 RMB
- Accuracy improvement: +20.5% (cross-language)

Tests:
- test-embedding-service.ts: Vector embedding verified
- test-rag-e2e.ts: Full pipeline tested
- test-rerank.ts: Rerank quality validated
- test-query-rewrite.ts: Cross-language search verified
- test-pdf-ingest.ts: Real PDF document tested (Dongen 2003.pdf)

Documentation:
- Added 05-RAG-Engine-User-Guide.md
- Added 02-Document-Processing-User-Guide.md
- Updated system status documentation

Status: Production ready
2026-01-21 20:24:29 +08:00

108 lines
1.7 KiB
PowerShell
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# DC模块代码恢复脚本运行器
Write-Host "==================================================================================" -ForegroundColor Cyan
Write-Host "DC模块代码恢复工具" -ForegroundColor Cyan
Write-Host "==================================================================================" -ForegroundColor Cyan
# 检查Python是否安装
try {
$pythonVersion = python --version 2>&1
Write-Host "✅ Python版本: $pythonVersion" -ForegroundColor Green
} catch {
Write-Host "❌ 错误: Python未安装或不在PATH中" -ForegroundColor Red
exit 1
}
# 运行恢复脚本
Write-Host "`n📦 开始运行恢复脚本..." -ForegroundColor Yellow
$output = python recover_dc_code.py 2>&1 | Out-String
Write-Host $output
# 保存输出到文件
$output | Out-File "recovery_output.log" -Encoding UTF8
Write-Host "`n💾 输出已保存到: recovery_output.log" -ForegroundColor Green
# 检查恢复结果
if (Test-Path "recovered_dc_code") {
Write-Host "`n✅ 恢复目录已创建!" -ForegroundColor Green
Write-Host "📁 恢复的文件:" -ForegroundColor Cyan
Get-ChildItem "recovered_dc_code" | Format-Table Name, Length, LastWriteTime
} else {
Write-Host "`n⚠️ 未创建恢复目录可能未找到DC代码" -ForegroundColor Yellow
}
Write-Host "`n==================================================================================" -ForegroundColor Cyan
Write-Host "完成!请查看 recovered_dc_code 目录" -ForegroundColor Cyan
Write-Host "==================================================================================" -ForegroundColor Cyan