feat(rvw): Implement RVW V2.0 Data Forensics Module - Day 6 StatValidator

Summary:
- Implement L2 Statistical Validator (CI-P consistency, reverse T-test check)
- Implement L2.5 Consistency Forensics (SE Triangle, SD>Mean check)
- Add error/warning severity classification with tolerance thresholds
- Support parsing of 5+ CI formats (parentheses, brackets, 95% CI prefix)
- Complete Python forensics service (types, config, validator, extractor)
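
Illustrative sketch of the checks listed above (not the code in this commit).
All function names (parse_ci, p_from_ratio_ci, ci_p_consistent, sd_exceeds_mean),
the regex, the 0.05 threshold, and the normal approximation are assumptions made
for illustration; the actual StatValidator may work differently.

```python
import math
import re

# Hypothetical pattern covering a few common CI spellings, e.g.
# "(1.2, 3.4)", "[1.10-2.35]", "95% CI: 0.8 to 1.5".
_CI_RE = re.compile(
    r"(?:95\s*%\s*CI\s*[:=]?\s*)?[\(\[]?\s*"
    r"(-?\d+(?:\.\d+)?)\s*(?:,|;|to|-|\u2013|~)\s*(-?\d+(?:\.\d+)?)\s*[\)\]]?",
    re.IGNORECASE,
)


def parse_ci(text):
    """Return (lower, upper) as floats, or None if no interval is found."""
    match = _CI_RE.search(text)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def p_from_ratio_ci(estimate, lower, upper, z_crit=1.959964):
    """Recompute a two-sided p-value for a ratio (OR/HR/RR) from its 95% CI.

    Assumes the CI is symmetric on the log scale and that the test
    statistic is approximately standard normal.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)).
    return math.erfc(abs(z) / math.sqrt(2))


def ci_p_consistent(lower, upper, p_value, null_value=0.0, alpha=0.05):
    """True if the CI and the p-value agree on statistical significance."""
    ci_excludes_null = not (lower <= null_value <= upper)
    return ci_excludes_null == (p_value < alpha)


def sd_exceeds_mean(mean, sd):
    """Flag a positive mean whose SD is larger, a hint of skewed data."""
    return mean > 0 and sd > mean
```

For a ratio measure such as an odds ratio, the null value would be 1.0 rather
than 0.0, so a check might look like ci_p_consistent(0.8, 1.5, 0.01, null_value=1.0),
which returns False because the interval contains 1 while p < 0.05.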

V2.0 Development Progress (Week 2 Day 6):
- Day 1-5: Python service setup, Word table extraction, L1 arithmetic validator
- Day 6: L2 StatValidator + L2.5 consistency forensics (promoted from V2.1)

Test Results:
- Unit tests: 4/4 passed (CI-P, SE Triangle, SD>Mean, T-test)
- Real document tests: 5/5 processed successfully, with 2 WARNINGs judged reasonable

Status: Day 6 completed, ready for Day 7 (Skills Framework)
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-17 22:15:27 +08:00
parent 7a299e8562
commit e785969e54
31 changed files with 5925 additions and 15 deletions


@@ -0,0 +1,48 @@
"""
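RVW V2.0 Data Forensics Module

Provides Word document table extraction and data validation:
- Precise table extraction (python-docx)
- L1 arithmetic self-consistency validation
- L2 statistical review (t-test, chi-square test)
- HTML fragment generation (with R1C1 coordinates)

Author: AIclinicalresearch Team
Version: 2.0.0
Date: 2026-02-17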
"""
from .types import (
    ForensicsConfig,
    TableData,
    Issue,
    ForensicsResult,
    ExtractionError,
    Severity,
    IssueType,
    CellLocation,
)
from .extractor import DocxTableExtractor
from .validator import ArithmeticValidator, StatValidator
from .api import router as forensics_router
__all__ = [
    # Types
    "ForensicsConfig",
    "TableData",
    "Issue",
    "ForensicsResult",
    "ExtractionError",
    "Severity",
    "IssueType",
    "CellLocation",
    # Core classes
    "DocxTableExtractor",
    "ArithmeticValidator",
    "StatValidator",
    # Router
    "forensics_router",
]
__version__ = "2.0.0"