feat(rvw): Implement RVW V2.0 Data Forensics Module - Day 6 StatValidator

Summary:
- Implement L2 Statistical Validator (CI-P consistency, T-test reverse)
- Implement L2.5 Consistency Forensics (SE Triangle, SD>Mean check)
- Add error/warning severity classification with tolerance thresholds
- Support 5+ CI formats parsing (parentheses, brackets, 95% CI prefix)
- Complete Python forensics service (types, config, validator, extractor)

V2.0 Development Progress (Week 2 Day 6):
- Day 1-5: Python service setup, Word table extraction, L1 arithmetic validator
- Day 6: L2 StatValidator + L2.5 consistency forensics (promoted from V2.1)

Test Results:
- Unit tests: 4/4 passed (CI-P, SE Triangle, SD>Mean, T-test)
- Real document tests: 5/5 successful, 2 reasonable WARNINGs

Status: Day 6 completed, ready for Day 7 (Skills Framework)

Co-authored-by: Cursor <cursoragent@cursor.com>
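
The CI parsing and CI-P consistency check named in the summary can be sketched roughly as below. This is a minimal illustration, not the code in this commit: the function names (`parse_ci`, `p_from_ci`, `check_ci_p`), the regex, the `alpha`/`tol` defaults, and the reading of "SE Triangle" as the estimate / CI / p-value relation under a normal approximation are all assumptions made for the example.

```python
import math
import re
from typing import Optional, Tuple

# Hypothetical pattern covering a few CI spellings: "(a-b)", "[a, b]",
# "95% CI a-b", "95% CI: a to b", "a–b". Not taken from the commit.
_NUM = r"[-+]?\d+(?:\.\d+)?"
_CI_RE = re.compile(
    rf"(?:95\s*%\s*CI\s*[:=]?\s*)?[\(\[]?\s*({_NUM})\s*(?:-|–|,|to)\s*({_NUM})\s*[\)\]]?",
    re.IGNORECASE,
)


def parse_ci(text: str) -> Optional[Tuple[float, float]]:
    """Return (lower, upper) if the whole string looks like a CI, else None."""
    m = _CI_RE.fullmatch(text.strip())
    if m is None:
        return None
    lower, upper = float(m.group(1)), float(m.group(2))
    return (lower, upper) if lower <= upper else None


def p_from_ci(estimate: float, lower: float, upper: float,
              ratio: bool = True, z_crit: float = 1.96) -> float:
    """Recompute a two-sided p-value from an estimate and its 95% CI.

    Assumes a normal approximation; ratio measures (OR/HR/RR) are
    handled on the log scale.
    """
    if ratio:
        estimate, lower, upper = (math.log(v) for v in (estimate, lower, upper))
    se = (upper - lower) / (2 * z_crit)   # CI width -> standard error
    z = estimate / se                     # distance from the null in SE units
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))


def check_ci_p(lower: float, upper: float, p: float, null: float = 1.0,
               alpha: float = 0.05, tol: float = 0.01) -> str:
    """Classify agreement between a CI and a reported p-value.

    'ok'      : CI and p agree on significance
    'warning' : they disagree, but p is within `tol` of alpha (rounding is plausible)
    'error'   : they disagree clearly
    The alpha and tol defaults are illustrative assumptions.
    """
    ci_significant = not (lower <= null <= upper)
    p_significant = p < alpha
    if ci_significant == p_significant:
        return "ok"
    return "warning" if abs(p - alpha) <= tol else "error"


if __name__ == "__main__":
    print(parse_ci("95% CI: 0.85 to 1.42"))
    print(check_ci_p(0.85, 1.42, p=0.001))
    print(round(p_from_ci(1.59, 1.10, 2.30), 4))
```

For difference measures (mean difference, risk difference), the same check would use `null=0.0` and `ratio=False`. Keeping the warning band separate from the error case mirrors the severity split mentioned in the summary, since borderline disagreements often come from rounding in the reported values.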
@@ -52,6 +52,9 @@ app.add_middleware(
 TEMP_DIR = Path(os.getenv("TEMP_DIR", "/tmp/extraction_service"))
 TEMP_DIR.mkdir(parents=True, exist_ok=True)
 
+# Register the RVW V2.0 data forensics router
+app.include_router(forensics_router)
+
 # Import service modules
 from services.pdf_extractor import extract_pdf_pymupdf
 from services.pdf_processor import extract_pdf, get_pdf_processing_strategy
@@ -66,6 +69,9 @@ from services.pdf_markdown_processor import PdfMarkdownProcessor, extract_pdf_to
 # New: document export service (Markdown → Word)
 from services.doc_export_service import check_pandoc_available, convert_markdown_to_docx, create_protocol_docx
 
+# New: RVW V2.0 data forensics module
+from forensics.api import router as forensics_router
+
 # Compatibility: nougat-related stubs (deprecated; empty implementations kept to avoid errors)
 def check_nougat_available(): return False
 def get_nougat_info(): return {"available": False, "reason": "Deprecated; use pymupdf4llm instead"}