AIclinicalresearch/extraction_service/forensics/__init__.py
HaHafeng e785969e54 feat(rvw): Implement RVW V2.0 Data Forensics Module - Day 6 StatValidator
Summary:
- Implement L2 Statistical Validator (CI-P consistency, T-test reverse)
- Implement L2.5 Consistency Forensics (SE Triangle, SD>Mean check)
- Add error/warning severity classification with tolerance thresholds
- Support parsing of 5+ CI formats (parentheses, brackets, "95% CI" prefix)
- Complete Python forensics service (types, config, validator, extractor)
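
The checks listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `validator.py`: every function name, the normal approximation, and the tolerance values are assumptions made for the example, and "SE Triangle" is read here as the relation SE = SD / sqrt(n), which the commit message does not spell out.

```python
import math


def p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Two-sided p-value implied by a symmetric 95% CI (normal approximation).

    Hypothetical helper: assumes the null value is 0 and the CI is
    estimate +/- z_crit * SE on the scale the estimate is reported on.
    """
    se = (upper - lower) / (2 * z_crit)
    if se <= 0:
        raise ValueError("upper bound must exceed lower bound")
    z = abs(estimate) / se
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) for the standard normal CDF Phi.
    return math.erfc(z / math.sqrt(2))


def ci_p_consistent(lower, upper, reported_p, alpha=0.05):
    """CI-P consistency: the CI excludes 0 exactly when p < alpha."""
    ci_excludes_null = lower > 0 or upper < 0
    return ci_excludes_null == (reported_p < alpha)


def se_triangle_ok(sd, n, reported_se, rel_tol=0.05):
    """Check the reported SE against SD / sqrt(n) within an assumed tolerance."""
    expected = sd / math.sqrt(n)
    return math.isclose(expected, reported_se, rel_tol=rel_tol)


def sd_exceeds_mean(mean, sd):
    """Flag a positive mean whose SD is larger than the mean itself.

    For non-negative measurements this hints at a skewed distribution,
    so it is better treated as a warning than as an error.
    """
    return mean > 0 and sd > mean
```

A caller could map a failed `ci_p_consistent` to an error and `sd_exceeds_mean` to a warning, in line with the severity classification mentioned in the summary; that mapping is also an assumption of this sketch.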

V2.0 Development Progress (Week 2 Day 6):
- Day 1-5: Python service setup, Word table extraction, L1 arithmetic validator
- Day 6: L2 StatValidator + L2.5 consistency forensics (promoted from V2.1)

Test Results:
- Unit tests: 4/4 passed (CI-P, SE Triangle, SD>Mean, T-test)
- Real document tests: 5/5 successful, 2 reasonable WARNINGs

Status: Day 6 completed, ready for Day 7 (Skills Framework)
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-17 22:15:27 +08:00

"""
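RVW V2.0 Data Forensics module
Provides table extraction from Word documents and data validation:
- Precise table extraction (python-docx)
- L1 arithmetic self-consistency validation
- L2 statistical re-checks (t-test, chi-square test)
- HTML fragment generation (with R1C1 coordinates)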
Author: AIclinicalresearch Team
Version: 2.0.0
Date: 2026-02-17
"""
from .types import (
    ForensicsConfig,
    TableData,
    Issue,
    ForensicsResult,
    ExtractionError,
    Severity,
    IssueType,
    CellLocation,
)
from .extractor import DocxTableExtractor
from .validator import ArithmeticValidator, StatValidator
from .api import router as forensics_router
__all__ = [
    # Types
    "ForensicsConfig",
    "TableData",
    "Issue",
    "ForensicsResult",
    "ExtractionError",
    "Severity",
    "IssueType",
    "CellLocation",
    # Core classes
    "DocxTableExtractor",
    "ArithmeticValidator",
    "StatValidator",
    # Router
    "forensics_router",
]
__version__ = "2.0.0"