# AI Smart Literature Module - Database Design

> **Document version:** v3.0
> **Created:** 2025-10-29
> **Maintainer:** AI Smart Literature development team
> **Last updated:** 2025-11-22 (Day 4: full-text screening database design)
> **Update notes:** Added the full-text screening tables (`AslLiterature` extension, `AslFulltextScreeningTask`, `AslFulltextScreeningResult`)

---

## 📋 About This Document

This document describes the database design of the AI Smart Literature (ASL) module, including table structures, relationships, and indexes.

**Tech stack**:
- Database: PostgreSQL 16+
- ORM: Prisma
- Schema isolation: `asl_schema`
- Referenced user table: `platform_schema.users`

---

## 🏗️ Schema Architecture

The ASL module uses a dedicated `asl_schema` for data isolation, which keeps the module independent and its data secure.

```
platform_schema
└── users (users)

asl_schema
├── screening_projects (screening projects)
├── literatures (literature entries)
├── screening_results (title screening results)
├── screening_tasks (title screening tasks)
├── fulltext_screening_tasks (full-text screening tasks)      new in Day 4
└── fulltext_screening_results (full-text screening results)  new in Day 4
```

**v3.0 update notes (2025-11-22)**:
- Extended the `literatures` table: full-text lifecycle management, PDF storage, full-text content references
- Added the `fulltext_screening_tasks` table: manages full-text screening batch jobs
- Added the `fulltext_screening_results` table: stores the 12-field assessment results
- Cloud-native compliant: full-text content is stored by reference rather than inline

---

## 🗄️ Core Tables

### 1. Screening projects table (screening_projects)

**Prisma model**: `AslScreeningProject`
**Table name**: `asl_schema.screening_projects`

```prisma
model AslScreeningProject {
  id     String @id @default(uuid())
  userId String @map("user_id")
  user   User   @relation("AslProjects", fields: [userId], references: [id], onDelete: Cascade)

  projectName String @map("project_name")

  // PICO criteria
  picoCriteria Json @map("pico_criteria")
  // ⚠️ Format compatibility note:
  // Frontend sends:  { P, I, C, O, S }
  // Backend accepts: { P, I, C, O, S } or { population, intervention, comparison, outcome, studyDesign }
  // screeningService.ts contains the field-mapping logic

  // Screening criteria
  inclusionCriteria String @map("inclusion_criteria") @db.Text
  exclusionCriteria String @map("exclusion_criteria") @db.Text

  // Status
  status String @default("draft")
  // Allowed values: draft, screening, completed

  // Screening configuration
  screeningConfig Json? @map("screening_config")
  // Structure: { models: ["DeepSeek-V3", "Qwen-Max"], style: "standard" }
  // ⚠️ Model name mapping:
  // Frontend display: DeepSeek-V3 → API name: deepseek-chat
  // Frontend display: Qwen-Max → API name: qwen-max
  // screeningService.ts contains the model-name mapping logic

  // Relations
  literatures      AslLiterature[]
  screeningTasks   AslScreeningTask[]
  screeningResults AslScreeningResult[]
  // Back-relations that Prisma requires for the v3.0 full-text models
  // (these two field names are assumed; they are not given elsewhere in this document)
  fulltextScreeningTasks   AslFulltextScreeningTask[]
  fulltextScreeningResults AslFulltextScreeningResult[]

  createdAt DateTime @default(now()) @map("created_at")
  updatedAt DateTime @updatedAt @map("updated_at")

  @@map("screening_projects")
  @@schema("asl_schema")
  @@index([userId])
  @@index([status])
}
```

**SQL table definition**:

```sql
CREATE TABLE asl_schema.screening_projects (
  id                 TEXT PRIMARY KEY,
  user_id            TEXT NOT NULL,
  project_name       TEXT NOT NULL,
  pico_criteria      JSONB NOT NULL,
  inclusion_criteria TEXT NOT NULL,
  exclusion_criteria TEXT NOT NULL,
  status             TEXT NOT NULL DEFAULT 'draft',
  screening_config   JSONB,
  created_at         TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
  updated_at         TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,

  CONSTRAINT fk_user FOREIGN KEY (user_id)
    REFERENCES platform_schema.users(id) ON DELETE CASCADE
);

CREATE INDEX idx_screening_projects_user_id ON asl_schema.screening_projects(user_id);
CREATE INDEX idx_screening_projects_status ON asl_schema.screening_projects(status);
```

---

### 2. Literature entries table (literatures) (updated in v3.0)

**Prisma model**: `AslLiterature`
**Table name**: `asl_schema.literatures`

**v3.0 update notes**:
- Added the `stage` field: tracks the literature lifecycle (imported → title_screened → pdf_acquired → fulltext_screened → data_extracted)
- Added PDF storage fields: support both Dify and OSS adapters (`pdfStorageType`, `pdfStorageRef`, `pdfStatus`)
- Added full-text storage fields: **cloud-native compliant, storing a reference rather than the content** (`fullTextStorageRef`, `fullTextUrl`)
- Added indexes on `stage`, `hasPdf`, and `pdfStatus` to speed up queries

```prisma
model AslLiterature {
  id        String              @id @default(uuid())
  projectId String              @map("project_id")
  project   AslScreeningProject @relation(fields: [projectId], references: [id], onDelete: Cascade)

  // Basic bibliographic information
  pmid            String?
  title           String  @db.Text
  abstract        String  @db.Text
  authors         String?
  journal         String?
  publicationYear Int?    @map("publication_year")
  doi             String?

  // v3.0: literature stage (lifecycle management)
  stage String @default("imported") @map("stage")
  // imported | title_screened | title_included | pdf_acquired | fulltext_screened | data_extracted

  // Cloud-native storage fields (used in the V1.0 phase; reserved during the MVP phase)
  pdfUrl      String? @map("pdf_url")       // PDF access URL
  pdfOssKey   String? @map("pdf_oss_key")   // OSS storage key (used for deletion)
  pdfFileSize Int?    @map("pdf_file_size") // File size in bytes

  // v3.0: PDF storage (Dify/OSS dual adapter)
  hasPdf         Boolean   @default(false) @map("has_pdf")
  pdfStorageType String?   @map("pdf_storage_type") // "dify" | "oss"
  pdfStorageRef  String?   @map("pdf_storage_ref")  // Dify: document_id, OSS: object_key
  pdfStatus      String?   @map("pdf_status")       // "uploading" | "ready" | "failed"
  pdfUploadedAt  DateTime? @map("pdf_uploaded_at")

  // v3.0: full-text content storage (cloud-native: store a reference, not the content)
  fullTextStorageType String?   @map("full_text_storage_type") // "dify" | "oss"
  fullTextStorageRef  String?   @map("full_text_storage_ref")  // document_id or object_key
  fullTextUrl         String?   @map("full_text_url")          // Access URL
  fullTextFormat      String?   @map("full_text_format")       // "markdown" | "plaintext"
  fullTextSource      String?   @map("full_text_source")       // "nougat" | "pymupdf"
  fullTextTokenCount  Int?      @map("full_text_token_count")
  fullTextExtractedAt DateTime? @map("full_text_extracted_at")

  // Relations
  screeningResults         AslScreeningResult[]
  fulltextScreeningResults AslFulltextScreeningResult[] // v3.0

  createdAt DateTime @default(now()) @map("created_at")
  updatedAt DateTime @updatedAt @map("updated_at")

  @@map("literatures")
  @@schema("asl_schema")
  @@index([projectId])
  @@index([doi])
  @@index([stage])     // v3.0
  @@index([hasPdf])    // v3.0
  @@index([pdfStatus]) // v3.0
  @@unique([projectId, pmid])
}
```

**SQL table definition** (v3.0):

```sql
CREATE TABLE asl_schema.literatures (
  id         TEXT PRIMARY KEY,
  project_id TEXT NOT NULL,

  -- Basic bibliographic information
  pmid             TEXT,
  title            TEXT NOT NULL,
  abstract         TEXT NOT NULL,
  authors          TEXT,
  journal          TEXT,
  publication_year INTEGER,
  doi              TEXT,

  -- Literature stage
  stage TEXT NOT NULL DEFAULT 'imported',

  -- PDF storage (legacy fields, reserved for V1.0)
  pdf_url       TEXT,
  pdf_oss_key   TEXT,
  pdf_file_size INTEGER,

  -- PDF storage (new fields, Dify/OSS dual adapter)
  has_pdf          BOOLEAN NOT NULL DEFAULT false,
  pdf_storage_type TEXT,
  pdf_storage_ref  TEXT,
  pdf_status       TEXT,
  pdf_uploaded_at  TIMESTAMP(3),

  -- Full-text content storage (by reference)
  full_text_storage_type TEXT,
  full_text_storage_ref  TEXT,
  full_text_url          TEXT,
  full_text_format       TEXT,
  full_text_source       TEXT,
  full_text_token_count  INTEGER,
  full_text_extracted_at TIMESTAMP(3),

  created_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,

  CONSTRAINT fk_project FOREIGN KEY (project_id)
    REFERENCES asl_schema.screening_projects(id) ON DELETE CASCADE,
  CONSTRAINT unique_project_pmid UNIQUE (project_id, pmid)
);

CREATE INDEX idx_literatures_project_id ON asl_schema.literatures(project_id);
CREATE INDEX idx_literatures_doi ON asl_schema.literatures(doi);
CREATE INDEX idx_literatures_stage ON asl_schema.literatures(stage);
CREATE INDEX idx_literatures_has_pdf ON asl_schema.literatures(has_pdf);
CREATE INDEX idx_literatures_pdf_status ON asl_schema.literatures(pdf_status);
```

**Field notes**:

| Field | Type | Description | Design rationale |
|------|------|------|----------|
| `stage` | String | Literature stage | Tracks where the entry is in the overall workflow |
| `pdfStorageType` | String | PDF storage type | `"dify"` or `"oss"`; supports both adapters |
| `pdfStorageRef` | String | PDF storage reference | Dify `document_id` or OSS `object_key` |
| `fullTextStorageType` | String | Full-text storage type | Cloud-native: store a reference, not the full text |
| `fullTextStorageRef` | String | Full-text storage reference | Points to the full-text document in Dify or OSS |
| `fullTextUrl` | String | Full-text access URL | URL for direct access to the full text |
| `fullTextTokenCount` | Int | Token count | Used for cost estimation and LLM call optimization |

**Cloud-native design highlights** ⭐:
- Full-text content lives in OSS/Dify; the database stores only a reference (cloud-native compliant)
- Seamless migration from Dify to OSS (only `storageType` needs to change)
- The database stays lightweight and avoids large TEXT columns

---

### 3. Screening results table (screening_results)

**Prisma model**: `AslScreeningResult`
**Table name**: `asl_schema.screening_results`

**Design highlight**: supports parallel validation by two models (DeepSeek + Qwen), with complete judgments, evidence, and conflict detection.
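
The dual-model conflict detection described above can be illustrated with a minimal TypeScript sketch. It assumes a conflict means the two models disagree on any of the P/I/C/S judgments or on the conclusion; the names `ModelVerdict`, `ConflictSummary`, and `detectConflict` are hypothetical and are not taken from `screeningService.ts`.

```typescript
// Hypothetical types; the allowed values mirror the comments in the Prisma schema.
type Judgment = "match" | "partial" | "mismatch";
type Conclusion = "include" | "exclude" | "uncertain";

interface ModelVerdict {
  P: Judgment | null;
  I: Judgment | null;
  C: Judgment | null;
  S: Judgment | null;
  conclusion: Conclusion | null;
}

interface ConflictSummary {
  conflictStatus: "none" | "conflict";
  conflictFields: string[];
}

// Keys compared between the two models, in the order used by the conflictFields examples.
const COMPARED_KEYS = ["P", "I", "C", "S", "conclusion"] as const;

// Assumption: any difference between the two verdicts counts as a conflict.
function detectConflict(ds: ModelVerdict, qwen: ModelVerdict): ConflictSummary {
  const conflictFields: string[] = COMPARED_KEYS.filter((key) => ds[key] !== qwen[key]);
  return {
    conflictStatus: conflictFields.length > 0 ? "conflict" : "none",
    conflictFields,
  };
}

// Usage: the models disagree on I and on the conclusion.
const dsVerdict: ModelVerdict = { P: "match", I: "match", C: "match", S: "match", conclusion: "include" };
const qwenVerdict: ModelVerdict = { P: "match", I: "partial", C: "match", S: "match", conclusion: "exclude" };
const summary = detectConflict(dsVerdict, qwenVerdict);
console.log(summary.conflictStatus, summary.conflictFields);
```

The resulting `conflictFields` array has the same shape as the `conflict_fields` JSON column, so it could be persisted directly; the `resolved` status would be set later by manual review.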
```prisma
model AslScreeningResult {
  id           String              @id @default(uuid())
  projectId    String              @map("project_id")
  project      AslScreeningProject @relation(fields: [projectId], references: [id], onDelete: Cascade)
  literatureId String              @map("literature_id")
  literature   AslLiterature       @relation(fields: [literatureId], references: [id], onDelete: Cascade)

  // DeepSeek model judgments
  dsModelName  String  @map("ds_model_name")  // "deepseek-chat"
  dsPJudgment  String? @map("ds_p_judgment")  // "match" | "partial" | "mismatch"
  dsIJudgment  String? @map("ds_i_judgment")
  dsCJudgment  String? @map("ds_c_judgment")
  dsSJudgment  String? @map("ds_s_judgment")
  dsConclusion String? @map("ds_conclusion")  // "include" | "exclude" | "uncertain"
  dsConfidence Float?  @map("ds_confidence")  // 0-1

  // DeepSeek model evidence
  dsPEvidence String? @map("ds_p_evidence") @db.Text
  dsIEvidence String? @map("ds_i_evidence") @db.Text
  dsCEvidence String? @map("ds_c_evidence") @db.Text
  dsSEvidence String? @map("ds_s_evidence") @db.Text
  dsReason    String? @map("ds_reason") @db.Text

  // Qwen model judgments
  qwenModelName  String  @map("qwen_model_name") // "qwen-max"
  qwenPJudgment  String? @map("qwen_p_judgment")
  qwenIJudgment  String? @map("qwen_i_judgment")
  qwenCJudgment  String? @map("qwen_c_judgment")
  qwenSJudgment  String? @map("qwen_s_judgment")
  qwenConclusion String? @map("qwen_conclusion")
  qwenConfidence Float?  @map("qwen_confidence")

  // Qwen model evidence
  qwenPEvidence String? @map("qwen_p_evidence") @db.Text
  qwenIEvidence String? @map("qwen_i_evidence") @db.Text
  qwenCEvidence String? @map("qwen_c_evidence") @db.Text
  qwenSEvidence String? @map("qwen_s_evidence") @db.Text
  qwenReason    String? @map("qwen_reason") @db.Text

  // Conflict status
  conflictStatus String @default("none") @map("conflict_status")
  // Allowed values: none, conflict, resolved
  conflictFields Json?  @map("conflict_fields")
  // Example: ["P", "I", "conclusion"]

  // Final decision (used by the Week 4 hybrid approach)
  finalDecision String? @map("final_decision") // "include" | "exclude" | null
  // Week 4 note: set after manual review and treated as the final decision
  // - include: reviewer decided to include (may overrule the AI suggestion)
  // - exclude: reviewer decided to exclude (may overrule the AI suggestion)
  // - null: not reviewed yet; the AI decision is used
  finalDecisionBy String?   @map("final_decision_by") // userId
  finalDecisionAt DateTime? @map("final_decision_at")
  exclusionReason String?   @map("exclusion_reason") @db.Text
  // Week 4 note: exclusion reason entered by the reviewer (takes precedence over the AI-extracted one)
  // - If finalDecision = exclude, this field stores the reviewer's reason
  // - If null, the frontend derives the reason from the AI judgments (dsPJudgment/dsIJudgment, etc.)
  // - The Week 4 title screening results page displays the exclusion reason from this field

  // AI processing status
  aiProcessingStatus String    @default("pending") @map("ai_processing_status")
  // Allowed values: pending, processing, completed, failed
  aiProcessedAt      DateTime? @map("ai_processed_at")
  aiErrorMessage     String?   @map("ai_error_message") @db.Text

  // Traceability
  promptVersion String @default("v1.0.0") @map("prompt_version")
  rawOutput     Json?  @map("raw_output") // Raw LLM output (backup)

  createdAt DateTime @default(now()) @map("created_at")
  updatedAt DateTime @updatedAt @map("updated_at")

  @@map("screening_results")
  @@schema("asl_schema")
  @@index([projectId])
  @@index([literatureId])
  @@index([conflictStatus])
  @@index([finalDecision])
  @@unique([projectId, literatureId]) // One screening result per literature entry per project
}
```

**SQL table definition** (simplified):

```sql
CREATE TABLE asl_schema.screening_results (
  id            TEXT PRIMARY KEY,
  project_id    TEXT NOT NULL,
  literature_id TEXT NOT NULL,

  -- DeepSeek judgments
  ds_model_name TEXT NOT NULL,
  ds_p_judgment TEXT,
  ds_i_judgment TEXT,
  ds_c_judgment TEXT,
  ds_s_judgment TEXT,
  ds_conclusion TEXT,
  ds_confidence DOUBLE PRECISION,
  ds_p_evidence TEXT,
  ds_i_evidence TEXT,
  ds_c_evidence TEXT,
  ds_s_evidence TEXT,
  ds_reason     TEXT,

  -- Qwen judgments
  qwen_model_name TEXT NOT NULL,
  qwen_p_judgment TEXT,
  qwen_i_judgment TEXT,
  qwen_c_judgment TEXT,
  qwen_s_judgment TEXT,
  qwen_conclusion TEXT,
  qwen_confidence DOUBLE PRECISION,
  qwen_p_evidence TEXT,
  qwen_i_evidence TEXT,
  qwen_c_evidence TEXT,
  qwen_s_evidence TEXT,
  qwen_reason     TEXT,

  -- Conflict status
  conflict_status TEXT NOT NULL DEFAULT 'none',
  conflict_fields JSONB,

  -- Final decision
  final_decision    TEXT,
  final_decision_by TEXT,
  final_decision_at TIMESTAMP(3),
  exclusion_reason  TEXT,

  -- AI processing status
  ai_processing_status TEXT NOT NULL DEFAULT 'pending',
  ai_processed_at      TIMESTAMP(3),
  ai_error_message     TEXT,

  -- Traceability
  prompt_version TEXT NOT NULL DEFAULT 'v1.0.0',
  raw_output     JSONB,

  created_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,

  CONSTRAINT fk_project_result FOREIGN KEY (project_id)
    REFERENCES asl_schema.screening_projects(id) ON DELETE CASCADE,
  CONSTRAINT fk_literature FOREIGN KEY (literature_id)
    REFERENCES asl_schema.literatures(id) ON DELETE CASCADE,
  CONSTRAINT unique_project_literature UNIQUE (project_id, literature_id)
);

CREATE INDEX idx_screening_results_project_id ON asl_schema.screening_results(project_id);
CREATE INDEX idx_screening_results_literature_id ON asl_schema.screening_results(literature_id);
CREATE INDEX idx_screening_results_conflict_status ON asl_schema.screening_results(conflict_status);
CREATE INDEX idx_screening_results_final_decision ON asl_schema.screening_results(final_decision);
```

---

### 4. Screening tasks table (screening_tasks)

**Prisma model**: `AslScreeningTask`
**Table name**: `asl_schema.screening_tasks`

```prisma
model AslScreeningTask {
  id        String              @id @default(uuid())
  projectId String              @map("project_id")
  project   AslScreeningProject @relation(fields: [projectId], references: [id], onDelete: Cascade)

  taskType String @map("task_type") // "title_abstract" | "full_text"
  status   String @default("pending")
  // Allowed values: pending, running, completed, failed

  // Progress counters
  totalItems     Int @map("total_items")
  processedItems Int @default(0) @map("processed_items")
  successItems   Int @default(0) @map("success_items")
  failedItems    Int @default(0) @map("failed_items")
  conflictItems  Int @default(0) @map("conflict_items")

  // Timing
  startedAt      DateTime? @map("started_at")
  completedAt    DateTime? @map("completed_at")
  estimatedEndAt DateTime? @map("estimated_end_at")

  // Error information
  errorMessage String? @map("error_message") @db.Text

  createdAt DateTime @default(now()) @map("created_at")
  updatedAt DateTime @updatedAt @map("updated_at")

  @@map("screening_tasks")
  @@schema("asl_schema")
  @@index([projectId])
  @@index([status])
}
```

**SQL table definition**:

```sql
CREATE TABLE asl_schema.screening_tasks (
  id               TEXT PRIMARY KEY,
  project_id       TEXT NOT NULL,
  task_type        TEXT NOT NULL,
  status           TEXT NOT NULL DEFAULT 'pending',
  total_items      INTEGER NOT NULL,
  processed_items  INTEGER NOT NULL DEFAULT 0,
  success_items    INTEGER NOT NULL DEFAULT 0,
  failed_items     INTEGER NOT NULL DEFAULT 0,
  conflict_items   INTEGER NOT NULL DEFAULT 0,
  started_at       TIMESTAMP(3),
  completed_at     TIMESTAMP(3),
  estimated_end_at TIMESTAMP(3),
  error_message    TEXT,
  created_at       TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
  updated_at       TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,

  CONSTRAINT fk_project_task FOREIGN KEY (project_id)
    REFERENCES asl_schema.screening_projects(id) ON DELETE CASCADE
);

CREATE INDEX idx_screening_tasks_project_id ON asl_schema.screening_tasks(project_id);
CREATE INDEX idx_screening_tasks_status ON asl_schema.screening_tasks(status);
```

---

### 5. Full-text screening tasks table (fulltext_screening_tasks) (new in v3.0)

**Prisma model**: `AslFulltextScreeningTask`
**Table name**: `asl_schema.fulltext_screening_tasks`

**Design goal**: manage full-text screening batch jobs, with support for parallel dual-model calls, cost tracking, and degraded mode.

```prisma
model AslFulltextScreeningTask {
  id        String              @id @default(uuid())
  projectId String              @map("project_id")
  project   AslScreeningProject @relation(fields: [projectId], references: [id], onDelete: Cascade)

  // Task configuration
  modelA        String @map("model_a") // "deepseek-v3"
  modelB        String @map("model_b") // "qwen-max"
  promptVersion String @default("v1.0.0") @map("prompt_version")

  // Task status
  status String @default("pending")
  // "pending" | "running" | "completed" | "failed" | "cancelled"

  // Progress counters
  totalCount     Int @map("total_count")
  processedCount Int @default(0) @map("processed_count")
  successCount   Int @default(0) @map("success_count")
  failedCount    Int @default(0) @map("failed_count")
  degradedCount  Int @default(0) @map("degraded_count") // Only one model succeeded

  // Cost tracking
  totalTokens Int   @default(0) @map("total_tokens")
  totalCost   Float @default(0) @map("total_cost")

  // Timing
  startedAt      DateTime? @map("started_at")
  completedAt    DateTime? @map("completed_at")
  estimatedEndAt DateTime? @map("estimated_end_at")

  // Error information
  errorMessage String? @map("error_message") @db.Text
  errorStack   String? @map("error_stack") @db.Text

  // Relations
  results AslFulltextScreeningResult[]

  createdAt DateTime @default(now()) @map("created_at")
  updatedAt DateTime @updatedAt @map("updated_at")

  @@map("fulltext_screening_tasks")
  @@schema("asl_schema")
  @@index([projectId])
  @@index([status])
  @@index([createdAt])
}
```

**SQL table definition**:

```sql
CREATE TABLE asl_schema.fulltext_screening_tasks (
  id         TEXT PRIMARY KEY,
  project_id TEXT NOT NULL,

  -- Task configuration
  model_a        TEXT NOT NULL,
  model_b        TEXT NOT NULL,
  prompt_version TEXT NOT NULL DEFAULT 'v1.0.0',

  -- Task status
  status TEXT NOT NULL DEFAULT 'pending',

  -- Progress counters
  total_count     INTEGER NOT NULL,
  processed_count INTEGER NOT NULL DEFAULT 0,
  success_count   INTEGER NOT NULL DEFAULT 0,
  failed_count    INTEGER NOT NULL DEFAULT 0,
  degraded_count  INTEGER NOT NULL DEFAULT 0,

  -- Cost tracking
  total_tokens INTEGER NOT NULL DEFAULT 0,
  total_cost   DOUBLE PRECISION NOT NULL DEFAULT 0,

  -- Timing
  started_at       TIMESTAMP(3),
  completed_at     TIMESTAMP(3),
  estimated_end_at TIMESTAMP(3),

  -- Error information
  error_message TEXT,
  error_stack   TEXT,

  created_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,

  CONSTRAINT fk_project_fulltext_task FOREIGN KEY (project_id)
    REFERENCES asl_schema.screening_projects(id) ON DELETE CASCADE
);

CREATE INDEX idx_fulltext_screening_tasks_project_id ON asl_schema.fulltext_screening_tasks(project_id);
CREATE INDEX idx_fulltext_screening_tasks_status ON asl_schema.fulltext_screening_tasks(status);
CREATE INDEX idx_fulltext_screening_tasks_created_at ON asl_schema.fulltext_screening_tasks(created_at);
```

**Field notes**:

| Field | Type | Description |
|------|------|------|
| `modelA / modelB` | String | Names of the two models (deepseek-v3 + qwen-max) |
| `degradedCount` | Int | Number of items where only one model succeeded (fault tolerance) |
| `totalTokens` | Int | Cumulative token usage |
| `totalCost` | Float | Cumulative cost (CNY) |
| `promptVersion` | String | Prompt version (for traceability) |

---

### 6. Full-text screening results table (fulltext_screening_results) (new in v3.0)

**Prisma model**: `AslFulltextScreeningResult`
**Table name**: `asl_schema.fulltext_screening_results`

**Design goal**: store the detailed 12-field assessment results, with support for dual-model comparison, validation results, and conflict detection.

**Design highlights**:
- Complete results from both models (fields + overall + logs)
- Medical-logic validation and evidence-chain validation results
- Conflict detection and review priority
- Degraded mode support (only one model succeeded)
- The 12-field assessment is stored as JSON (cloud-native compliant)
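
As an illustration of the degraded mode listed above, the following TypeScript sketch derives `processingStatus`, `isDegraded`, and `degradedModel` from the two per-model statuses. This document does not state whether `degradedModel` names the model that failed or the one that survived; the sketch assumes it names the model that failed. The function name `resolveDegradation` and the `DegradationState` type are hypothetical.

```typescript
type ModelStatus = "success" | "failed";

// Hypothetical result shape mirroring the processing-status columns of the results table.
interface DegradationState {
  processingStatus: "completed" | "degraded" | "failed";
  isDegraded: boolean;
  degradedModel: "modelA" | "modelB" | null;
}

function resolveDegradation(modelAStatus: ModelStatus, modelBStatus: ModelStatus): DegradationState {
  // Both models succeeded: a normal dual-model result.
  if (modelAStatus === "success" && modelBStatus === "success") {
    return { processingStatus: "completed", isDegraded: false, degradedModel: null };
  }
  // Both models failed: nothing usable was produced.
  if (modelAStatus === "failed" && modelBStatus === "failed") {
    return { processingStatus: "failed", isDegraded: false, degradedModel: null };
  }
  // Exactly one model failed: degraded mode.
  // Assumption: degradedModel records the model that failed.
  return {
    processingStatus: "degraded",
    isDegraded: true,
    degradedModel: modelAStatus === "failed" ? "modelA" : "modelB",
  };
}

// Usage: model B failed, so the row would be stored as a degraded result.
console.log(resolveDegradation("success", "failed"));
```

Under the same assumption, each degraded row would also increment `degradedCount` on the parent `fulltext_screening_tasks` record.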
```prisma
model AslFulltextScreeningResult {
  id           String                   @id @default(uuid())
  taskId       String                   @map("task_id")
  task         AslFulltextScreeningTask @relation(fields: [taskId], references: [id], onDelete: Cascade)
  projectId    String                   @map("project_id")
  project      AslScreeningProject      @relation(fields: [projectId], references: [id], onDelete: Cascade)
  literatureId String                   @map("literature_id")
  literature   AslLiterature            @relation(fields: [literatureId], references: [id], onDelete: Cascade)

  // ====== Model A results (DeepSeek-V3) ======
  modelAName          String  @map("model_a_name")
  modelAStatus        String  @map("model_a_status")  // "success" | "failed"
  modelAFields        Json    @map("model_a_fields")  // 12-field assessment { field1: {...}, field2: {...}, ... }
  modelAOverall       Json    @map("model_a_overall") // Overall assessment { decision, confidence, keyIssues }
  modelAProcessingLog Json?   @map("model_a_processing_log")
  modelAVerification  Json?   @map("model_a_verification")
  modelATokens        Int?    @map("model_a_tokens")
  modelACost          Float?  @map("model_a_cost")
  modelAError         String? @map("model_a_error") @db.Text

  // ====== Model B results (Qwen-Max) ======
  modelBName          String  @map("model_b_name")
  modelBStatus        String  @map("model_b_status")
  modelBFields        Json    @map("model_b_fields")
  modelBOverall       Json    @map("model_b_overall")
  modelBProcessingLog Json?   @map("model_b_processing_log")
  modelBVerification  Json?   @map("model_b_verification")
  modelBTokens        Int?    @map("model_b_tokens")
  modelBCost          Float?  @map("model_b_cost")
  modelBError         String? @map("model_b_error") @db.Text

  // ====== Validation results ======
  medicalLogicIssues  Json? @map("medical_logic_issues")  // MedicalLogicValidator output
  evidenceChainIssues Json? @map("evidence_chain_issues") // EvidenceChainValidator output

  // ====== Conflict detection ======
  isConflict       Boolean   @default(false) @map("is_conflict")
  conflictSeverity String?   @map("conflict_severity") // "high" | "medium" | "low"
  conflictFields   String[]  @map("conflict_fields")   // ["field1", "field9", "overall"]
  conflictDetails  Json?     @map("conflict_details")
  reviewPriority   Int?      @map("review_priority")   // Review priority, 0-100
  reviewDeadline   DateTime? @map("review_deadline")

  // ====== Final decision ======
  finalDecision   String?   @map("final_decision") // "include" | "exclude" | null
  finalDecisionBy String?   @map("final_decision_by")
  finalDecisionAt DateTime? @map("final_decision_at")
  exclusionReason String?   @map("exclusion_reason") @db.Text
  reviewNotes     String?   @map("review_notes") @db.Text

  // ====== Processing status ======
  processingStatus String    @default("pending") @map("processing_status")
  // "pending" | "processing" | "completed" | "failed" | "degraded"
  isDegraded       Boolean   @default(false) @map("is_degraded")
  degradedModel    String?   @map("degraded_model") // "modelA" | "modelB"
  processedAt      DateTime? @map("processed_at")

  // ====== Traceability ======
  promptVersion String @default("v1.0.0") @map("prompt_version")
  rawOutputA    Json?  @map("raw_output_a")
  rawOutputB    Json?  @map("raw_output_b")

  createdAt DateTime @default(now()) @map("created_at")
  updatedAt DateTime @updatedAt @map("updated_at")

  @@map("fulltext_screening_results")
  @@schema("asl_schema")
  @@index([taskId])
  @@index([projectId])
  @@index([literatureId])
  @@index([isConflict])
  @@index([finalDecision])
  @@index([reviewPriority])
  @@unique([projectId, literatureId]) // One full-text screening result per literature entry
}
```

**SQL table definition**:

```sql
CREATE TABLE asl_schema.fulltext_screening_results (
  id            TEXT PRIMARY KEY,
  task_id       TEXT NOT NULL,
  project_id    TEXT NOT NULL,
  literature_id TEXT NOT NULL,

  -- Model A results
  model_a_name           TEXT NOT NULL,
  model_a_status         TEXT NOT NULL,
  model_a_fields         JSONB NOT NULL,
  model_a_overall        JSONB NOT NULL,
  model_a_processing_log JSONB,
  model_a_verification   JSONB,
  model_a_tokens         INTEGER,
  model_a_cost           DOUBLE PRECISION,
  model_a_error          TEXT,

  -- Model B results (same layout as model A)
  model_b_name           TEXT NOT NULL,
  model_b_status         TEXT NOT NULL,
  model_b_fields         JSONB NOT NULL,
  model_b_overall        JSONB NOT NULL,
  model_b_processing_log JSONB,
  model_b_verification   JSONB,
  model_b_tokens         INTEGER,
  model_b_cost           DOUBLE PRECISION,
  model_b_error          TEXT,

  -- Validation results
  medical_logic_issues  JSONB,
  evidence_chain_issues JSONB,

  -- Conflict detection
  is_conflict       BOOLEAN NOT NULL DEFAULT false,
  conflict_severity TEXT,
  conflict_fields   TEXT[],
  conflict_details  JSONB,
  review_priority   INTEGER,
  review_deadline   TIMESTAMP(3),

  -- Final decision
  final_decision    TEXT,
  final_decision_by TEXT,
  final_decision_at TIMESTAMP(3),
  exclusion_reason  TEXT,
  review_notes      TEXT,

  -- Processing status
  processing_status TEXT NOT NULL DEFAULT 'pending',
  is_degraded       BOOLEAN NOT NULL DEFAULT false,
  degraded_model    TEXT,
  processed_at      TIMESTAMP(3),

  -- Traceability
  prompt_version TEXT NOT NULL DEFAULT 'v1.0.0',
  raw_output_a   JSONB,
  raw_output_b   JSONB,

  created_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,

  CONSTRAINT fk_task FOREIGN KEY (task_id)
    REFERENCES asl_schema.fulltext_screening_tasks(id) ON DELETE CASCADE,
  CONSTRAINT fk_project_fulltext_result FOREIGN KEY (project_id)
    REFERENCES asl_schema.screening_projects(id) ON DELETE CASCADE,
  CONSTRAINT fk_literature_fulltext FOREIGN KEY (literature_id)
    REFERENCES asl_schema.literatures(id) ON DELETE CASCADE,
  CONSTRAINT unique_project_literature_fulltext UNIQUE (project_id, literature_id)
);

CREATE INDEX idx_fulltext_screening_results_task_id ON asl_schema.fulltext_screening_results(task_id);
CREATE INDEX idx_fulltext_screening_results_project_id ON asl_schema.fulltext_screening_results(project_id);
CREATE INDEX idx_fulltext_screening_results_literature_id ON asl_schema.fulltext_screening_results(literature_id);
CREATE INDEX idx_fulltext_screening_results_is_conflict ON asl_schema.fulltext_screening_results(is_conflict);
CREATE INDEX idx_fulltext_screening_results_final_decision ON asl_schema.fulltext_screening_results(final_decision);
CREATE INDEX idx_fulltext_screening_results_review_priority ON asl_schema.fulltext_screening_results(review_priority);
```

**JSON field examples**:
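**modelAFields (12-field assessment)**, where the `completeness` value `完整` means "complete":

```json
{
  "field1": {
    "present": true,
    "completeness": "完整",
    "extractable": true,
    "quote": "First author: Zhang et al., published in JAMA 2023...",
    "location": "Title page, Methods section",
    "note": "Source information is complete"
  },
  "field2": { ... },
  // ... field3-field12
}
```

**modelAOverall (overall assessment)**:

```json
{
  "decision": "include",
  "confidence": 0.92,
  "keyIssues": [
    "Randomization method fully described",
    "Blinding clearly implemented",
    "Outcome measures are extractable"
  ]
}
```

**medicalLogicIssues (medical-logic validation)**:

```json
{
  "hasIssues": false,
  "issues": []
}
```

**conflictDetails (conflict details)**, where `完整` / `不完整` mean "complete" / "incomplete":

```json
{
  "field9": {
    "modelA": "完整",
    "modelB": "不完整",
    "severity": "high"
  }
}
```

---

## 📊 Data Relationship Diagram (updated in v3.0)

```
screening_projects (1) ──< (N) literatures
screening_projects (1) ──< (N) screening_results
literatures (1) ──< (1) screening_results
screening_projects (1) ──< (N) screening_tasks
screening_projects (1) ──< (N) fulltext_screening_tasks
screening_projects (1) ──< (N) fulltext_screening_results
fulltext_screening_tasks (1) ──< (N) fulltext_screening_results
literatures (1) ──< (1) fulltext_screening_results
```

---

## 🔍 Index Summary (updated in v3.0)

| Table | Indexed column(s) | Index type | Purpose |
|------|---------|---------|------|
| screening_projects | user_id | B-tree | Look up a user's projects |
| screening_projects | status | B-tree | Filter by status |
| literatures | project_id | B-tree | Look up a project's literature |
| literatures | doi | B-tree | DOI duplicate check |
| literatures | stage | B-tree | Query by literature stage (v3.0) |
| literatures | has_pdf | B-tree | PDF acquisition status (v3.0) |
| literatures | pdf_status | B-tree | PDF upload status (v3.0) |
| literatures | (project_id, pmid) | Unique | Prevent duplicate imports |
| screening_results | project_id | B-tree | Look up a project's results |
| screening_results | literature_id | B-tree | Look up a literature entry's result |
| screening_results | conflict_status | B-tree | Filter by conflict status |
| screening_results | final_decision | B-tree | Filter by decision |
| screening_results | (project_id, literature_id) | Unique | Uniqueness constraint |
| screening_tasks | project_id | B-tree | Look up a project's tasks |
| screening_tasks | status | B-tree | Filter by task status |
| fulltext_screening_tasks | project_id | B-tree | Look up full-text tasks (v3.0) |
| fulltext_screening_tasks | status | B-tree | Filter by task status (v3.0) |
| fulltext_screening_tasks | created_at | B-tree | Sort by time (v3.0) |
| fulltext_screening_results | task_id | B-tree | Look up a task's results (v3.0) |
| fulltext_screening_results | project_id | B-tree | Look up a project's results (v3.0) |
| fulltext_screening_results | literature_id | B-tree | Look up a literature entry's result (v3.0) |
| fulltext_screening_results | is_conflict | B-tree | Filter by conflict (v3.0) |
| fulltext_screening_results | final_decision | B-tree | Filter by decision (v3.0) |
| fulltext_screening_results | review_priority | B-tree | Review priority (v3.0) |
| fulltext_screening_results | (project_id, literature_id) | Unique | Uniqueness constraint (v3.0) |

**Total indexes**: 25 (13 new in v3.0), counting the unique constraints below
**Unique constraints**: 3 (1 new in v3.0)

**v3.0 index optimization notes**:
- `literatures.stage`: quickly finds literature in a given stage (for example, `"pdf_acquired"` entries awaiting full-text screening)
- `fulltext_screening_results.review_priority`: speeds up sorting of the manual review queue
- `fulltext_screening_tasks.created_at`: speeds up task history queries

---

## 💾 Data Dictionary

### PICO criteria (picoCriteria JSON)

```json
{
  "population": "Study population, e.g. adults with type 2 diabetes",
  "intervention": "Intervention, e.g. SGLT2 inhibitors",
  "comparison": "Comparator, e.g. placebo or standard therapy",
  "outcome": "Outcome measures, e.g. cardiovascular outcomes",
  "studyDesign": "Study design, e.g. randomized controlled trial (RCT)"
}
```

### Screening configuration (screeningConfig JSON)

```json
{
  "models": ["deepseek-chat", "qwen-max"],
  "temperature": 0,
  "maxRetries": 3
}
```

### Conflict fields (conflictFields JSON)

```json
["P", "I", "C", "S", "conclusion"]
```

### Raw output (rawOutput JSON)

The keys `判断` and `证据` mean "judgment" and "evidence".

```json
{
  "deepseek": {
    "判断": {...},
    "证据": {...}
  },
  "qwen": {
    "判断": {...},
    "证据": {...}
  }
}
```

---

## 🔒 Data Security

### Schema isolation
- `asl_schema` isolates this module's data from other modules
- The user table lives in `platform_schema` and is managed centrally

### Cascading deletes
- Deleting a user → automatically deletes all of that user's screening projects and related data
- Deleting a project → automatically deletes its literature, results, and tasks
- Deleting a literature entry → automatically deletes its screening results

### Uniqueness constraints
- PMID is unique within a project (entries without a PMID are allowed)
- Each literature entry has exactly one screening result within a project

---

## 📈 Data Volume Estimates

| Project size | Literature entries | Screening results | Storage |
|---------|--------|---------|----------|
| Small | 100-500 | 100-500 | < 10 MB |
| Medium | 500-2000 | 500-2000 | 10-50 MB |
| Large | 2000-5000 | 2000-5000 | 50-200 MB |
| Very large | 5000+ | 5000+ | 200 MB+ |

**Estimated size per record**:
- Literature entry: ~2-5 KB
- Screening result: ~5-10 KB (including both models' judgments and evidence)

---

## Roadmap

### Phase 2 (full-text screening) - completed in v3.0
- [x] Extend the `literatures` table (lifecycle management)
- [x] Add the `fulltext_screening_tasks` table
- [x] Add the `fulltext_screening_results` table (12 fields)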
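
### Phase 3 (data extraction) - pending development
- [ ] Reuse the `fulltext_screening_tasks` table (switching its mode)
- [ ] Reuse the `fulltext_screening_results` table (to store extracted data)
- [ ] Or add a `data_extraction_results` table (if it needs to be separate)

### Phase 4 (quality assessment) - pending planning
- [ ] Quality assessment results table
- [ ] Risk-of-bias assessment table
- [ ] GRADE evidence quality table

---

## 📝 v3.0 Design Decision Log

### Decision 1: Store full-text content by reference rather than inline

**Question**: Should the full-text content be stored in the database?

**Options compared**:

| Option | Pros | Cons |
|------|------|------|
| Store as TEXT | Faster LLM calls | Violates cloud-native guidelines; bloats the database |
| Store a reference | Compliant and lightweight | Adds 100-200 ms to each LLM call |

**Decision**: ✅ Option 2 (store a reference)
- Follows the cloud-native principle of separating storage from compute
- Supports very large documents (>1 MB)
- RDS storage costs up to 10× that of OSS

### Decision 2: Store the 12 fields as JSON

**Question**: Should the 12 fields be split into columns or stored as JSON?

**Decision**: ✅ Use PostgreSQL JSONB
- There is no need to query the content of an individual field
- The field structure is complex (6 sub-fields per field)
- JSONB performs well and supports GIN indexes

### Decision 3: Separate full-text screening results table

**Question**: Should the `screening_results` table be reused?

**Decision**: ✅ Add a separate `fulltext_screening_results` table
- The data structures are entirely different (PICOS vs. 12 fields)
- Avoids redundant columns and coupled logic
- Easier to maintain and optimize independently

---

**Version history**:
- v3.0 (2025-11-22): Full-text screening database design; extends `literatures` and adds 2 new tables with related fields
- v2.2 (2025-11-21): Week 4 statistics feature completed
- v2.0 (2025-11-18): Title screening database design
- v1.0 (2025-10-29): Initial version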