AIclinicalresearch/docs/03-业务模块/PKB-个人知识库/04-开发计划/01-Dify替换为pgvector开发计划.md
HaHafeng 40c2f8e148 feat(rag): Complete RAG engine implementation with pgvector
Major Features:
- Created ekb_schema (13th schema) with 3 tables: KB/Document/Chunk
- Implemented EmbeddingService (text-embedding-v4, 1024-dim vectors)
- Implemented ChunkService (smart Markdown chunking)
- Implemented VectorSearchService (multi-query + hybrid search)
- Implemented RerankService (qwen3-rerank)
- Integrated DeepSeek V3 QueryRewriter for cross-language search
- Python service: Added pymupdf4llm for PDF-to-Markdown conversion
- PKB: Dual-mode adapter (pgvector/dify/hybrid)

Architecture:
- Brain-Hand Model: Business layer (DeepSeek) + Engine layer (pgvector)
- Cross-language support: Chinese query matches English documents
- Small Embedding (1024) + Strong Reranker strategy

Performance:
- End-to-end latency: 2.5s
- Cost per query: 0.0025 RMB
- Accuracy improvement: +20.5% (cross-language)

Tests:
- test-embedding-service.ts: Vector embedding verified
- test-rag-e2e.ts: Full pipeline tested
- test-rerank.ts: Rerank quality validated
- test-query-rewrite.ts: Cross-language search verified
- test-pdf-ingest.ts: Real PDF document tested (Dongen 2003.pdf)

Documentation:
- Added 05-RAG-Engine-User-Guide.md
- Added 02-Document-Processing-User-Guide.md
- Updated system status documentation

Status: Production ready
2026-01-21 20:24:29 +08:00

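# ⚠️ Document Migrated
> **Migration date:** 2026-01-20
> **Reason for migration:** The knowledge base capability has been promoted to a general capability layer and is no longer confined to the PKB module
---
## 📍 New Document Location
This document has been migrated to the general capability layer: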
**[02-通用能力层/03-RAG引擎/02-pgvector替换Dify计划.md](../../../02-通用能力层/03-RAG引擎/02-pgvector替换Dify计划.md)**
---
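## 🔄 Architecture Change Notes
### Reason for the Change
The knowledge base (RAG engine) is a **general capability** and should not be confined to a single business module:
| Business module | Use cases |
|----------|----------|
| **PKB** Personal Knowledge Base | Knowledge base management, RAG Q&A |
| **AIA** AI Intelligent Q&A | Q&A via @knowledge-base mentions, attachment understanding |
| **ASL** AI Intelligent Literature | Literature library retrieval, intelligent review generation |
| **RVW** Manuscript Review | Manuscript-to-literature comparison, duplicate checking |
### New Architecture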
```
┌─────────────────────────────────────────────────────────────┐
│                    Business module layer                    │
│     ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐      │
│     │   PKB   │  │   AIA   │  │   ASL   │  │   RVW   │      │
│     └────┬────┘  └────┬────┘  └────┬────┘  └────┬────┘      │
│          └────────────┴──────┬─────┴────────────┘           │
│                              │                              │
│                              ▼                              │
├─────────────────────────────────────────────────────────────┤
│      Knowledge base engine (general capability layer)       │
│           Code location: backend/src/common/rag/            │
│        Docs location: docs/02-通用能力层/03-RAG引擎/        │
└─────────────────────────────────────────────────────────────┘
```
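
The layering in the diagram can be illustrated with a minimal TypeScript sketch. Every name in it (`RagEngine`, `InMemoryRagEngine`, `PkbModule`, `AslModule`, the `pkb:`/`asl:` knowledge base id prefixes) is a hypothetical placeholder rather than an identifier from `backend/src/common/rag/`, and the toy engine scores by word overlap only so the example runs without a database. The point is the dependency direction: business modules depend on one shared engine interface instead of each owning a retrieval stack.

```typescript
// Hypothetical sketch: none of these names come from the actual codebase.

/** A retrieved piece of text with a relevance score. */
interface SearchHit {
  documentId: string;
  text: string;
  score: number;
}

/** Contract the shared engine layer would expose to every business module. */
interface RagEngine {
  ingest(kbId: string, documentId: string, text: string): void;
  search(kbId: string, query: string, topK: number): SearchHit[];
}

/**
 * Toy in-memory engine. It scores by word overlap so the example is
 * self-contained; a real engine would run vector search instead.
 */
class InMemoryRagEngine implements RagEngine {
  private readonly store = new Map<string, { documentId: string; text: string }[]>();

  ingest(kbId: string, documentId: string, text: string): void {
    const docs = this.store.get(kbId) ?? [];
    docs.push({ documentId, text });
    this.store.set(kbId, docs);
  }

  search(kbId: string, query: string, topK: number): SearchHit[] {
    const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
    const docs = this.store.get(kbId) ?? [];
    return docs
      .map((d) => {
        const words = new Set(d.text.toLowerCase().split(/\s+/));
        const score = terms.filter((t) => words.has(t)).length;
        return { documentId: d.documentId, text: d.text, score };
      })
      .filter((hit) => hit.score > 0)
      .sort((a, b) => b.score - a.score)
      .slice(0, topK);
  }
}

/** PKB-style module: depends only on the RagEngine interface. */
class PkbModule {
  private readonly engine: RagEngine;
  constructor(engine: RagEngine) {
    this.engine = engine;
  }
  ask(userId: string, question: string): SearchHit[] {
    return this.engine.search(`pkb:${userId}`, question, 3);
  }
}

/** ASL-style module: reuses the same engine with its own knowledge base id. */
class AslModule {
  private readonly engine: RagEngine;
  constructor(engine: RagEngine) {
    this.engine = engine;
  }
  findLiterature(projectId: string, query: string): SearchHit[] {
    return this.engine.search(`asl:${projectId}`, query, 5);
  }
}

// Both modules share a single engine instance.
const engine = new InMemoryRagEngine();
engine.ingest("pkb:u1", "note-1", "sleep restriction and cognitive performance");
engine.ingest("asl:p1", "paper-1", "chronic sleep restriction dose response study");

const pkb = new PkbModule(engine);
const asl = new AslModule(engine);
const pkbHits = pkb.ask("u1", "sleep performance");
const aslHits = asl.findLiterature("p1", "sleep restriction");
console.log(pkbHits, aslHits);
```

Because the modules see only the interface, swapping the toy engine for a pgvector-backed one would not, under these assumptions, require changes in the business layer.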
---
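## 📚 Related Documents
- [Knowledge Base Engine Architecture Design](../../../02-通用能力层/03-RAG引擎/01-知识库引擎架构设计.md)
- [Development Plan: Replacing Dify with pgvector](../../../02-通用能力层/03-RAG引擎/02-pgvector替换Dify计划.md)
- [General Capability Layer: RAG Engine README](../../../02-通用能力层/03-RAG引擎/README.md)
---
**Please visit the new document location for the latest content.**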