AIclinicalresearch/backend/src/tests/verify-test1-database.sql
HaHafeng 40c2f8e148 feat(rag): Complete RAG engine implementation with pgvector
Major Features:
- Created ekb_schema (13th schema) with 3 tables: KB/Document/Chunk
- Implemented EmbeddingService (text-embedding-v4, 1024-dim vectors)
- Implemented ChunkService (smart Markdown chunking)
- Implemented VectorSearchService (multi-query + hybrid search)
- Implemented RerankService (qwen3-rerank)
- Integrated DeepSeek V3 QueryRewriter for cross-language search
- Python service: Added pymupdf4llm for PDF-to-Markdown conversion
- PKB: Dual-mode adapter (pgvector/dify/hybrid)

Architecture:
- Brain-Hand Model: Business layer (DeepSeek) + Engine layer (pgvector)
- Cross-language support: Chinese query matches English documents
- Small Embedding (1024) + Strong Reranker strategy

Performance:
- End-to-end latency: 2.5s
- Cost per query: 0.0025 RMB
- Accuracy improvement: +20.5% (cross-language)

Tests:
- test-embedding-service.ts: Vector embedding verified
- test-rag-e2e.ts: Full pipeline tested
- test-rerank.ts: Rerank quality validated
- test-query-rewrite.ts: Cross-language search verified
- test-pdf-ingest.ts: Real PDF document tested (Dongen 2003.pdf)

Documentation:
- Added 05-RAG-Engine-User-Guide.md
- Added 02-Document-Processing-User-Guide.md
- Updated system status documentation

Status: Production ready
2026-01-21 20:24:29 +08:00

137 lines
2.6 KiB
SQL
-- ============================================
-- Verify the database state for test 1
-- ============================================
\echo '=========================================='
\echo '1. Check that the app_cache table exists'
\echo '=========================================='
\dt platform_schema.app_cache
\echo ''
\echo '=========================================='
\echo '2. Show the table structure'
\echo '=========================================='
\d platform_schema.app_cache
\echo ''
\echo '=========================================='
\echo '3. Show the indexes'
\echo '=========================================='
SELECT indexname, indexdef
FROM pg_indexes
WHERE schemaname = 'platform_schema'
  AND tablename = 'app_cache';
\echo ''
\echo '=========================================='
\echo '4. Check that test data was cleaned up (should be 0 rows)'
\echo '=========================================='
SELECT COUNT(*) as test_data_count
FROM platform_schema.app_cache
WHERE key LIKE 'test:%';
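-- Optional follow-up (a sketch, not part of the original checks): if the
-- count above is not 0, list the leftover test keys so they can be inspected.
-- It uses only columns already referenced elsewhere in this script.
SELECT key, expires_at, created_at
FROM platform_schema.app_cache
WHERE key LIKE 'test:%'
ORDER BY created_at DESC;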
\echo ''
\echo '=========================================='
\echo '5. Show the 10 most recent cache entries'
\echo '=========================================='
SELECT id, key,
       LEFT(value::text, 50) as value_preview,
       expires_at,
       created_at
FROM platform_schema.app_cache
ORDER BY created_at DESC
LIMIT 10;
\echo ''
\echo '=========================================='
\echo '6. Show table statistics'
\echo '=========================================='
SELECT
COUNT(*) as total_records,
pg_size_pretty(pg_total_relation_size('platform_schema.app_cache')) as total_size,
pg_size_pretty(pg_relation_size('platform_schema.app_cache')) as table_size,
pg_size_pretty(pg_indexes_size('platform_schema.app_cache')) as indexes_size
FROM platform_schema.app_cache;
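-- Optional sketch: split the total into expired and live entries.
-- Assumption: expires_at is a timestamp column and NULL means "never expires";
-- the real schema may differ, so adjust the predicates as needed.
SELECT
    COUNT(*) FILTER (WHERE expires_at IS NOT NULL AND expires_at <= NOW()) as expired_records,
    COUNT(*) FILTER (WHERE expires_at IS NULL OR expires_at > NOW()) as live_records
FROM platform_schema.app_cache;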
\echo ''
\echo '=========================================='
\echo '7. Test insert and delete (does not affect existing data)'
\echo '=========================================='
-- Insert a test row
INSERT INTO platform_schema.app_cache (key, value, expires_at, created_at)
VALUES ('verify_test', '{"status": "ok"}', NOW() + INTERVAL '1 hour', NOW());
-- Verify the insert (returns no row if the insert did not take effect)
SELECT 'INSERT succeeded' as result
FROM platform_schema.app_cache
WHERE key = 'verify_test';
-- Delete the test row
DELETE FROM platform_schema.app_cache WHERE key = 'verify_test';
-- Verify the delete
SELECT CASE
    WHEN COUNT(*) = 0 THEN 'DELETE succeeded'
    ELSE 'DELETE failed'
END as result
FROM platform_schema.app_cache
WHERE key = 'verify_test';
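-- Optional sketch of a safer variant of the write check: run it inside a
-- transaction and roll back, so nothing is persisted even if the script is
-- interrupted between the INSERT and the DELETE.
-- Assumptions: 'verify_test_tx' is a hypothetical key chosen for illustration,
-- and the column list matches the INSERT used earlier in this script.
BEGIN;
INSERT INTO platform_schema.app_cache (key, value, expires_at, created_at)
VALUES ('verify_test_tx', '{"status": "ok"}', NOW() + INTERVAL '1 hour', NOW());
-- The row is visible only inside this transaction
SELECT COUNT(*) as rows_visible_in_tx
FROM platform_schema.app_cache
WHERE key = 'verify_test_tx';
ROLLBACK;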
\echo ''
\echo '=========================================='
\echo '✅ Database verification complete!'
\echo '=========================================='