Summary:
- Implement L2 Statistical Validator (CI-P consistency, T-test reverse)
- Implement L2.5 Consistency Forensics (SE Triangle, SD>Mean check)
- Add error/warning severity classification with tolerance thresholds
- Support 5+ CI formats parsing (parentheses, brackets, 95% CI prefix)
- Complete Python forensics service (types, config, validator, extractor)

V2.0 Development Progress (Week 2 Day 6):
- Day 1-5: Python service setup, Word table extraction, L1 arithmetic validator
- Day 6: L2 StatValidator + L2.5 consistency forensics (promoted from V2.1)

Test Results:
- Unit tests: 4/4 passed (CI-P, SE Triangle, SD>Mean, T-test)
- Real document tests: 5/5 successful, 2 reasonable WARNINGs

Status: Day 6 completed, ready for Day 7 (Skills Framework)

Co-authored-by: Cursor <cursoragent@cursor.com>
57 lines
1.4 KiB
Plaintext
# ========================================
# Production dependencies (updated 2026-01-26)
# Nougat removed; pymupdf4llm used instead
# ========================================

# Web framework
fastapi==0.104.1
uvicorn[standard]==0.24.0
python-multipart==0.0.6

# Data processing (required by the DC tools)
pandas>=2.0.0
numpy>=1.24.0
polars>=0.19.0
scipy>=1.11.0 # Statistical validation (RVW V2.0 data detective: t-test, chi-square test)

# PDF processing - uses pymupdf4llm (replaces nougat, more lightweight)
PyMuPDF>=1.24.0 # Core PDF library (imported in code as fitz)
pymupdf4llm>=0.0.17 # PDF → Markdown
pdfplumber==0.10.3 # Fallback PDF processing

# Word processing
mammoth==1.6.0 # Docx → Markdown
python-docx==1.1.0 # Docx reading
pypandoc>=1.13 # Markdown → Docx (requires pandoc installed on the system)

# Excel/CSV processing
openpyxl>=3.1.2 # Excel reading
tabulate>=0.9.0 # DataFrame → Markdown

# PPT processing
python-pptx>=0.6.23 # PPT reading

# Language detection
langdetect==1.0.9

# Encoding detection
chardet==5.2.0

# Utilities
python-dotenv==1.0.0
pydantic>=2.10.0

# Logging
loguru==0.7.2

# Testing tools
requests==2.31.0

# ========================================
# Note: the following heavyweight dependencies have been removed from production
# - nougat-ocr==0.1.17 (about 1.5 GB)
# - albumentations==1.3.1 (Nougat dependency)
#
# Replaced by pymupdf4llm, which offers similar functionality but is lighter
# ========================================
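
The commit summary mentions a CI-P consistency check with tolerance thresholds but does not show it. The sketch below is one plausible form of such a check, not the project's actual validator: it assumes a difference-type estimate with a symmetric, normal-based confidence interval, recovers the standard error from the interval width, and compares the implied two-sided p-value against the reported one. The function names and the `tolerance` default are hypothetical, and the standard library's `NormalDist` is used here in place of scipy to keep the example self-contained.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def p_from_ci(estimate: float, lower: float, upper: float, level: float = 0.95) -> float:
    """Two-sided p-value implied by a symmetric normal-based CI (assumed model)."""
    if not lower < upper:
        raise ValueError("lower bound must be below upper bound")
    # Critical value for the given confidence level (about 1.96 for 95%).
    z_crit = _STD_NORMAL.inv_cdf(1 - (1 - level) / 2)
    # The interval spans 2 * z_crit standard errors.
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    return 2 * (1 - _STD_NORMAL.cdf(abs(z)))


def ci_p_consistent(estimate: float, lower: float, upper: float,
                    reported_p: float, tolerance: float = 0.01) -> bool:
    """True when the reported p-value is within `tolerance` of the implied one.

    `tolerance` is a hypothetical threshold chosen for illustration.
    """
    return abs(p_from_ci(estimate, lower, upper) - reported_p) <= tolerance


if __name__ == "__main__":
    # An interval that touches zero implies p close to 0.05.
    print(ci_p_consistent(1.96, 0.0, 3.92, reported_p=0.05))
    # A reported p = 0.50 would not match the same interval.
    print(ci_p_consistent(1.96, 0.0, 3.92, reported_p=0.50))
```

Ratio measures such as odds ratios or hazard ratios would need the same calculation on the log scale, which this sketch does not cover.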