# [AI Integration] LLM Gateway Quick Context

> **Reading time:** 5 minutes | **Token cost:** ~2,000 tokens
> **Layer:** L2 | **Priority:** P0 ⭐⭐⭐⭐⭐
> **Prerequisite reading:** 02-通用能力层/[AI对接] 通用能力快速上下文.md
---
## 📋 Capability Positioning

**The LLM gateway is the platform-wide hub for AI calls and the technical foundation of the business model.**

**Why it is P0 priority:**

- 71% of business modules depend on it (5 modules: AIA, ASL, PKB, DC, RVW)
- It is a **prerequisite** for developing the ASL module
- It is the **technical foundation** of the business model (feature flags + cost control)

**Status:** ❌ Not yet implemented

**Suggested schedule:** develop in parallel with ASL Week 1 (Day 1-3)
---
## 🎯 Core Features

### 1. Model Selection by Subscription Tier ⭐⭐⭐⭐⭐

**Business value:**

```
Professional (¥99/month) → DeepSeek-V3 (¥1 per million tokens)
Premium (¥299/month)     → DeepSeek + Qwen3-72B (¥5 per million tokens)
Flagship (¥999/month)    → All models (including Claude/GPT)
```

**Implementation approach:**

```typescript
// Look up the user's feature flags
const userFlags = await featureFlagService.getUserFlags(userId);

// Gate model access on the feature flags
if (requestModel === 'claude-3.5' && !userFlags.includes('claude_access')) {
  throw new Error('Your plan does not include Claude models; please upgrade to the Flagship tier');
}

// ...or downgrade automatically instead of rejecting
if (!userFlags.includes('claude_access')) {
  model = 'deepseek-v3'; // fall back to DeepSeek
}
```
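The gating logic above can be isolated as a pure function, which makes both the reject and auto-downgrade policies easy to unit-test. This is a minimal sketch: `resolveModel` and the `Policy` type are hypothetical helpers, not part of the actual gateway code; only the flag name `claude_access` and the fallback model come from the snippet above.

```typescript
// Hypothetical helper sketching the tier-gating logic; not the real gateway code.
type Policy = 'reject' | 'downgrade';

function resolveModel(
  requested: string,
  userFlags: string[],
  policy: Policy = 'downgrade'
): string {
  // Claude access is gated behind the 'claude_access' feature flag
  if (requested === 'claude-3.5' && !userFlags.includes('claude_access')) {
    if (policy === 'reject') {
      throw new Error('Your plan does not include Claude models; please upgrade');
    }
    return 'deepseek-v3'; // automatic downgrade to the base-tier model
  }
  return requested; // ungated models pass through unchanged
}
```

Keeping the decision pure (flags in, model out) also lets the same rule run on the frontend to grey out unavailable models before a request is ever sent.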
---
### 2. Unified Call Interface ⭐⭐⭐⭐⭐

**Problem:** every LLM vendor exposes a different API format:

- OpenAI format
- Anthropic format
- Domestic model formats (DeepSeek, Qwen)

**Solution:** a unified interface + the adapter pattern

```typescript
// Business modules all call the gateway the same way
const response = await llmGateway.chat({
  userId: 'user123',
  modelType: 'deepseek-v3', // or 'qwen3', 'claude-3.5'
  messages: [
    { role: 'user', content: 'Please analyze this paper...' }
  ],
  stream: false
});

// Inside the LLM gateway:
// 1. Check user permissions (feature flags)
// 2. Check quota
// 3. Pick the matching adapter
// 4. Call the vendor API
// 5. Record the cost
// 6. Return a unified response format
```
---
### 3. Cost Control ⭐⭐⭐⭐

**Core requirements:**

- Each user has a monthly quota
- Requests are throttled automatically once the quota is exhausted
- Cost statistics are available in real time

**Implementation:**

```typescript
// Check the quota before each call
async function checkQuota(userId: string): Promise<boolean> {
  const usage = await getMonthlyUsage(userId);
  const quota = await getUserQuota(userId);

  if (usage.tokenCount >= quota.maxTokens) {
    throw new QuotaExceededError('Your monthly quota is exhausted; please upgrade your plan');
  }

  return true;
}

// Record the cost after each call
async function recordUsage(userId: string, usage: {
  modelType: string;
  tokenCount: number;
  cost: number;
}) {
  await db.llmUsage.create({
    userId,
    modelType: usage.modelType,
    inputTokens: usage.tokenCount,
    cost: usage.cost,
    timestamp: new Date()
  });
}
```
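The comparison at the heart of `checkQuota` can be factored out as side-effect-free functions, so the arithmetic is testable without a database. This is an illustrative sketch; the `MonthlyUsage`/`UserQuota` shapes mirror the snippet above but are not the real service types.

```typescript
// Pure quota arithmetic (hypothetical helpers, split out for testability).
interface MonthlyUsage { tokenCount: number; }
interface UserQuota { maxTokens: number; }

// Mirrors the `usage.tokenCount >= quota.maxTokens` rejection condition above
function isWithinQuota(usage: MonthlyUsage, quota: UserQuota): boolean {
  return usage.tokenCount < quota.maxTokens;
}

// Remaining allowance, clamped at zero for over-quota users
function remainingTokens(usage: MonthlyUsage, quota: UserQuota): number {
  return Math.max(0, quota.maxTokens - usage.tokenCount);
}
```

The same `remainingTokens` calculation could back the `remaining` field of the quota query endpoint.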
---
### 4. Unified Streaming / Non-Streaming Handling ⭐⭐⭐

**Scenarios:**

- AIA intelligent Q&A → needs streaming output (displayed in real time)
- ASL literature screening → non-streaming (batch processing)

**Unified interface:**

```typescript
interface ChatOptions {
  userId: string;
  modelType: ModelType;
  messages: Message[];
  stream: boolean; // whether to stream the output
  temperature?: number;
  maxTokens?: number;
}

// Streaming
const stream = await llmGateway.chat({ ...options, stream: true });
for await (const chunk of stream) {
  console.log(chunk.content);
}

// Non-streaming
const response = await llmGateway.chat({ ...options, stream: false });
console.log(response.content);
```
---
## 🏗️ Technical Architecture

### Directory Structure

```
backend/src/modules/llm-gateway/
├── controllers/
│   └── llmController.ts         # HTTP endpoints
├── services/
│   ├── llmGatewayService.ts     # Core service ⭐
│   ├── featureFlagService.ts    # Feature flag lookups
│   ├── quotaService.ts          # Quota management
│   └── usageService.ts          # Usage statistics
├── adapters/                    # Adapter pattern ⭐
│   ├── baseAdapter.ts
│   ├── deepseekAdapter.ts
│   ├── qwenAdapter.ts
│   ├── claudeAdapter.ts
│   └── openaiAdapter.ts
├── types/
│   └── llm.types.ts
└── routes/
    └── llmRoutes.ts
```
---
### Core Class Design

#### 1. LLMGatewayService (core)

```typescript
class LLMGatewayService {
  private adapters: Map<ModelType, BaseLLMAdapter>;

  async chat(options: ChatOptions): Promise<ChatResponse | AsyncIterable<ChatChunk>> {
    // 1. Verify user permissions (feature flags)
    await this.checkAccess(options.userId, options.modelType);

    // 2. Check quota
    await quotaService.checkQuota(options.userId);

    // 3. Pick the adapter (Map.get may return undefined, so fail fast)
    const adapter = this.adapters.get(options.modelType);
    if (!adapter) {
      throw new Error(`Unsupported model type: ${options.modelType}`);
    }

    // 4. Call the LLM API
    const response = await adapter.chat(options);

    // 5. Record usage
    await usageService.record({
      userId: options.userId,
      modelType: options.modelType,
      tokenCount: response.tokenUsage,
      cost: this.calculateCost(options.modelType, response.tokenUsage)
    });

    // 6. Return the result
    return response;
  }

  private calculateCost(modelType: ModelType, tokens: number): number {
    // Price per token in CNY
    const prices = {
      'deepseek-v3': 0.000001, // ¥1 per million tokens
      'qwen3-72b': 0.000005,   // ¥5 per million tokens
      'claude-3.5': 0.00003    // ¥30 per million tokens
    };
    return tokens * prices[modelType];
  }
}
```
#### 2. BaseLLMAdapter (adapter base class)

```typescript
abstract class BaseLLMAdapter {
  abstract chat(options: ChatOptions): Promise<ChatResponse>;
  abstract chatStream(options: ChatOptions): AsyncIterable<ChatChunk>;

  protected abstract buildRequest(options: ChatOptions): any;
  protected abstract parseResponse(response: any): ChatResponse;
}
```
#### 3. DeepSeekAdapter (example implementation)

```typescript
class DeepSeekAdapter extends BaseLLMAdapter {
  private apiKey: string;
  private baseUrl = 'https://api.deepseek.com/v1';

  async chat(options: ChatOptions): Promise<ChatResponse> {
    const request = this.buildRequest(options);

    const response = await fetch(`${this.baseUrl}/chat/completions`, {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${this.apiKey}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify(request)
    });

    if (!response.ok) {
      throw new Error(`DeepSeek API error: HTTP ${response.status}`);
    }

    const data = await response.json();
    return this.parseResponse(data);
  }

  protected buildRequest(options: ChatOptions) {
    return {
      model: 'deepseek-chat',
      messages: options.messages,
      // `??` rather than `||` so explicit 0 values are not silently replaced
      temperature: options.temperature ?? 0.7,
      max_tokens: options.maxTokens ?? 4096,
      stream: options.stream || false
    };
  }

  protected parseResponse(response: any): ChatResponse {
    return {
      content: response.choices[0].message.content,
      tokenUsage: response.usage.total_tokens,
      finishReason: response.choices[0].finish_reason
    };
  }
}
```
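Wiring the adapters into the gateway's `adapters` map can be sketched with a small registry. Everything below is illustrative (`AdapterRegistry`, `StubAdapter`, and the narrowed `Adapter` interface are hypothetical stand-ins, not the real classes); the point is the lookup-with-fail-fast pattern the gateway needs, since `Map.get` returns `undefined` for unregistered models.

```typescript
// Hypothetical registry sketch for the adapter map; not the actual implementation.
type ModelType = 'deepseek-v3' | 'qwen3-72b' | 'claude-3.5';

interface Adapter {
  chat(prompt: string): Promise<string>;
}

// Stand-in for DeepSeekAdapter / QwenAdapter etc.
class StubAdapter implements Adapter {
  constructor(private name: string) {}
  async chat(prompt: string): Promise<string> {
    return `[${this.name}] ${prompt}`;
  }
}

class AdapterRegistry {
  private adapters = new Map<ModelType, Adapter>();

  register(model: ModelType, adapter: Adapter): void {
    this.adapters.set(model, adapter);
  }

  // Fail fast on unknown models instead of passing `undefined` downstream
  get(model: ModelType): Adapter {
    const adapter = this.adapters.get(model);
    if (!adapter) throw new Error(`No adapter registered for ${model}`);
    return adapter;
  }
}
```

Registering adapters at startup keeps the `chat` hot path to a single map lookup, and adding a new vendor means adding one `register` call rather than touching the dispatch logic.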
---
## 📊 Database Design

### platform_schema.llm_usage

```sql
CREATE TABLE platform_schema.llm_usage (
  id SERIAL PRIMARY KEY,
  user_id INTEGER REFERENCES platform_schema.users(id),
  model_type VARCHAR(50) NOT NULL,      -- 'deepseek-v3', 'qwen3', 'claude-3.5'
  input_tokens INTEGER NOT NULL,
  output_tokens INTEGER NOT NULL,
  total_tokens INTEGER NOT NULL,
  cost DECIMAL(10, 6) NOT NULL,         -- actual cost (CNY)
  request_id VARCHAR(100),              -- request_id returned by the LLM API
  module VARCHAR(50),                   -- calling module: 'AIA', 'ASL', 'PKB', etc.
  created_at TIMESTAMP DEFAULT NOW()
);

-- PostgreSQL does not support inline INDEX clauses inside CREATE TABLE,
-- so the indexes are created separately:
CREATE INDEX idx_user_created ON platform_schema.llm_usage (user_id, created_at);
CREATE INDEX idx_module ON platform_schema.llm_usage (module);
```

### platform_schema.llm_quotas

```sql
CREATE TABLE platform_schema.llm_quotas (
  id SERIAL PRIMARY KEY,
  user_id INTEGER REFERENCES platform_schema.users(id) UNIQUE,
  monthly_token_limit INTEGER NOT NULL, -- monthly token quota
  monthly_cost_limit DECIMAL(10, 2),    -- monthly cost ceiling (optional)
  reset_day INTEGER DEFAULT 1,          -- day of month the quota resets (1-28)
  created_at TIMESTAMP DEFAULT NOW(),
  updated_at TIMESTAMP DEFAULT NOW()
);
```
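Turning `reset_day` (1-28) into the concrete date a quota next resets is a small but easy-to-get-wrong calculation. A possible sketch, using UTC to avoid timezone drift; `nextResetDate` is a hypothetical helper, not part of the schema above:

```typescript
// Hypothetical helper: compute the next quota reset date from reset_day (1-28).
// Capping reset_day at 28 in the schema sidesteps short-month edge cases.
function nextResetDate(now: Date, resetDay: number): Date {
  // Candidate reset in the current month
  const candidate = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), resetDay));
  if (candidate > now) return candidate;
  // Already passed this month's reset day, so roll to next month
  // (Date.UTC handles month 12 by rolling the year forward)
  return new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + 1, resetDay));
}
```

For example, with `reset_day = 1`, a user queried on 2025-11-06 gets a reset date of 2025-12-01, matching the `resetDate` field in the quota API response below.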
---
## 📋 API Endpoints

### 1. Chat (non-streaming)

```
POST /api/v1/llm/chat

Request:
{
  "modelType": "deepseek-v3",
  "messages": [
    { "role": "user", "content": "Analyze this paper..." }
  ],
  "temperature": 0.7,
  "maxTokens": 4096
}

Response:
{
  "content": "Based on the paper's content...",
  "tokenUsage": {
    "input": 150,
    "output": 500,
    "total": 650
  },
  "cost": 0.00065,
  "modelType": "deepseek-v3"
}
```
### 2. Chat (streaming)

```
POST /api/v1/llm/chat/stream

Request: same as above, plus "stream": true

Response: Server-Sent Events (SSE)
data: {"chunk": "Based", "tokenUsage": 1}
data: {"chunk": " on", "tokenUsage": 1}
...
data: {"done": true, "totalTokens": 650, "cost": 0.00065}
```
### 3. Quota Query

```
GET /api/v1/llm/quota

Response:
{
  "monthlyLimit": 1000000,
  "used": 245000,
  "remaining": 755000,
  "resetDate": "2025-12-01"
}
```
### 4. Usage Statistics

```
GET /api/v1/llm/usage?startDate=2025-11-01&endDate=2025-11-30

Response:
{
  "totalTokens": 245000,
  "totalCost": 0.43,
  "byModel": {
    "deepseek-v3": { "tokens": 200000, "cost": 0.20 },
    "qwen3-72b": { "tokens": 45000, "cost": 0.23 }
  },
  "byModule": {
    "AIA": 100000,
    "ASL": 120000,
    "PKB": 25000
  }
}
```
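The `byModel` rollup in this response is a straightforward aggregation over `llm_usage` rows. A minimal sketch of that grouping, with an illustrative row shape (the real query would more likely use SQL `GROUP BY` or Prisma's `groupBy`):

```typescript
// Illustrative aggregation producing the byModel section of the usage response.
interface UsageRow {
  modelType: string;
  totalTokens: number;
  cost: number;
}

function rollupByModel(
  rows: UsageRow[]
): Record<string, { tokens: number; cost: number }> {
  const out: Record<string, { tokens: number; cost: number }> = {};
  for (const row of rows) {
    // Lazily create the per-model bucket, then accumulate
    const bucket = (out[row.modelType] ??= { tokens: 0, cost: 0 });
    bucket.tokens += row.totalTokens;
    bucket.cost += row.cost;
  }
  return out;
}
```

Note that summing `DECIMAL` costs as JavaScript floats can accumulate rounding error; rounding the final sums (or aggregating in SQL) keeps the reported totals stable.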
---
## ⚠️ Key Technical Challenges

### 1. Implementing Streaming Output

**Approach:** Server-Sent Events (SSE)

```typescript
// Backend (Fastify)
app.post('/api/v1/llm/chat/stream', async (req, reply) => {
  reply.raw.setHeader('Content-Type', 'text/event-stream');
  reply.raw.setHeader('Cache-Control', 'no-cache');
  reply.raw.setHeader('Connection', 'keep-alive');

  const stream = await llmGateway.chatStream(req.body);

  for await (const chunk of stream) {
    reply.raw.write(`data: ${JSON.stringify(chunk)}\n\n`);
  }

  reply.raw.end();
});

// Frontend (React). Note: EventSource only supports GET requests, so a
// POST streaming endpoint must be consumed with fetch + ReadableStream:
const response = await fetch('/api/v1/llm/chat/stream', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(chatRequest)
});

const reader = response.body!.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // Naive frame handling; a production parser should buffer partial frames
  for (const line of decoder.decode(value, { stream: true }).split('\n')) {
    if (line.startsWith('data: ')) {
      const data = JSON.parse(line.slice(6));
      if (data.chunk) setMessages(prev => [...prev, data.chunk]);
    }
  }
}
```
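Because network reads can split an SSE event across chunks, the client needs to buffer incomplete frames. A minimal sketch of that parser, assuming the one-`data:`-line-per-event format the backend above emits (`parseSSEFrames` is a hypothetical helper name):

```typescript
// Minimal SSE frame splitter: frames are terminated by a blank line ("\n\n").
// Call repeatedly, feeding `rest` + the next chunk back in.
function parseSSEFrames(buffer: string): { payloads: string[]; rest: string } {
  const parts = buffer.split('\n\n');
  const rest = parts.pop() ?? ''; // last piece may be an incomplete frame
  const payloads: string[] = [];
  for (const frame of parts) {
    for (const line of frame.split('\n')) {
      if (line.startsWith('data: ')) payloads.push(line.slice(6));
    }
  }
  return { payloads, rest };
}
```

Each returned payload is a complete JSON string ready for `JSON.parse`; the `rest` value carries any partial frame forward into the next read.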
---
### 2. Error Handling and Retries

```typescript
async function chatWithRetry(options: ChatOptions, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await llmGateway.chat(options);
    } catch (error) {
      if (error.code === 'RATE_LIMIT' && i < maxRetries - 1) {
        await sleep(2000 * 2 ** i); // exponential backoff: 2s, 4s, 8s...
        continue;
      }
      throw error;
    }
  }
}
```
---
### 3. Token Counting (for accurate billing)

**Problem:** each model uses a different tokenizer.

**Solution:**

- Use the token counts returned by each vendor's API (most accurate)
- Fallback: estimate with the tiktoken library (OpenAI's tokenizer)

```typescript
import { encoding_for_model } from 'tiktoken';

function estimateTokens(text: string, model: string): number {
  const encoder = encoding_for_model(model);
  const tokens = encoder.encode(text);
  encoder.free(); // release the encoder's native resources
  return tokens.length;
}
```

Since tiktoken implements OpenAI's tokenizers, treat its counts for non-OpenAI models as rough estimates, not billing-grade numbers.
---
## 📅 Development Plan (3 days)

### Day 1: Foundation (6-8 hours)

- [ ] Create the directory structure
- [ ] Implement the BaseLLMAdapter abstract class
- [ ] Implement DeepSeekAdapter
- [ ] Create the database tables (llm_usage, llm_quotas)
- [ ] Basic API endpoint (non-streaming)

### Day 2: Core Features (6-8 hours)

- [ ] Feature flag integration
- [ ] Quota checks and usage recording
- [ ] Implement QwenAdapter
- [ ] Error handling and retry mechanism
- [ ] Unit tests

### Day 3: Streaming + Polish (6-8 hours)

- [ ] Implement streaming output (SSE)
- [ ] Frontend SSE handling
- [ ] Cost statistics API
- [ ] Quota query API
- [ ] Integration tests
- [ ] Documentation
---
## ✅ Development Checklist

**Before starting, confirm:**

- [ ] The feature flag table exists (platform_schema.feature_flags)
- [ ] The users table has a version field (professional/premium/enterprise)
- [ ] API keys for each LLM vendor are configured
- [ ] The Prisma schema is up to date

**During development:**

- [ ] Every adapter has complete error handling
- [ ] Every LLM call is recorded in the llm_usage table
- [ ] The quota check runs before every call
- [ ] Both streaming and non-streaming paths are tested

**When finished:**

- [ ] The ASL module can call the LLM gateway successfully
- [ ] ADMIN can view per-user LLM usage statistics
- [ ] Requests are correctly rejected once the quota is exceeded
---
## 🔗 Related Documents

**Depends on:**

- [用户与权限中心(UAM)](../../01-平台基础层/01-用户与权限中心(UAM)/README.md) - feature flags
- [运营管理端](../../03-业务模块/ADMIN-运营管理端/README.md) - LLM model management

**Depended on by:**

- [ASL-AI智能文献](../../03-业务模块/ASL-AI智能文献/README.md) ⭐ P0
- [AIA-AI智能问答](../../03-业务模块/AIA-AI智能问答/README.md)
- [PKB-个人知识库](../../03-业务模块/PKB-个人知识库/README.md)
---
**Last updated:** 2025-11-06
**Maintainer:** Technical Architect
**Priority:** P0 ⭐⭐⭐⭐⭐