docs: complete documentation system (250+ files)

- System architecture and design documentation
- Business module docs (ASL/AIA/PKB/RVW/DC/SSA/ST)
- ASL module complete design (quality assurance, tech selection)
- Platform layer and common capabilities docs
- Development standards and API specifications
- Deployment and operations guides
- Project management and milestone tracking
- Architecture implementation reports
- Documentation templates and guides
524
docs/02-通用能力层/01-LLM大模型网关/03-CloseAI集成指南.md
Normal file
@@ -0,0 +1,524 @@
# CloseAI Integration Guide

> **Document version:** v1.0
> **Created:** 2025-11-09
> **Purpose:** Access OpenAI GPT-5 and Claude-4.5 through the CloseAI proxy platform
> **Use cases:** Dual-model screening for AI Smart Literature (ASL), high-quality text generation

---

## 📋 About CloseAI

### What is CloseAI?

CloseAI is an **API proxy platform** that gives users in China stable access to the OpenAI and Claude APIs.

**Key advantages:**
- ✅ Direct connection from mainland China, no VPN required
- ✅ One API key for both OpenAI and Claude
- ✅ Compatible with the standard OpenAI SDK interface
- ✅ Supports the latest models (GPT-5, Claude-4.5)

**Website:** https://platform.openai-proxy.org

---

## 🔧 Configuration

### Environment variables

```env
# Unified CloseAI API key (placeholder; never commit a real key)
CLOSEAI_API_KEY=sk-your-closeai-api-key

# OpenAI endpoint
CLOSEAI_OPENAI_BASE_URL=https://api.openai-proxy.org/v1

# Claude endpoint
CLOSEAI_CLAUDE_BASE_URL=https://api.openai-proxy.org/anthropic
```

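The service classes in this guide import a `config` object from `../../config/env`, which is not shown here. The following is a minimal sketch of such a loader, assuming the camelCase field names used in the code samples (`closeaiApiKey`, `closeaiOpenaiBaseUrl`, `closeaiClaudeBaseUrl`). The names `CloseAIConfig`, `requireEnv` and `loadCloseAIConfig` are hypothetical, and the real module may look different.

```typescript
// Hypothetical config loader; the real backend/src/config/env module may differ.
interface CloseAIConfig {
  closeaiApiKey: string;
  closeaiOpenaiBaseUrl: string;
  closeaiClaudeBaseUrl: string;
}

type Env = Record<string, string | undefined>;

// Read one required variable and fail fast with a clear message if it is missing.
function requireEnv(env: Env, name: string): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Map the three CLOSEAI_* variables onto the camelCase config fields.
function loadCloseAIConfig(env: Env): CloseAIConfig {
  return {
    closeaiApiKey: requireEnv(env, 'CLOSEAI_API_KEY'),
    closeaiOpenaiBaseUrl: requireEnv(env, 'CLOSEAI_OPENAI_BASE_URL'),
    closeaiClaudeBaseUrl: requireEnv(env, 'CLOSEAI_CLAUDE_BASE_URL'),
  };
}
```

In an application this would typically be called once at startup, for example `loadCloseAIConfig(process.env)`, so that a missing key is reported before any request is made.
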
### Supported models

| Model | Model ID | Description | Best for |
|------|---------|------|---------|
| **GPT-5-Pro** | `gpt-5-pro` | Latest GPT-5 ⭐ | Precise literature screening, complex reasoning |
| GPT-4-Turbo | `gpt-4-turbo-preview` | High-performance GPT-4 | Tasks with high quality requirements |
| GPT-3.5-Turbo | `gpt-3.5-turbo` | Fast and economical | Simple tasks, cost optimization |
| **Claude-4.5-Sonnet** | `claude-sonnet-4-5-20250929` | Latest Claude ⭐ | Third-party arbitration, structured output |
| Claude-3.5-Sonnet | `claude-3-5-sonnet-20241022` | Stable Claude-3.5 | High-quality text generation |

---

## 💻 Code integration

### 1. Install dependencies

```bash
npm install openai
```

### 2. Create the LLM service class

**File location:** `backend/src/common/llm/closeai.service.ts`

```typescript
import OpenAI from 'openai';
import { config } from '../../config/env';

type ChatMessage = OpenAI.Chat.ChatCompletionMessageParam;

export class CloseAIService {
  private openaiClient: OpenAI;
  private claudeClient: OpenAI;

  constructor() {
    // OpenAI client (via CloseAI)
    this.openaiClient = new OpenAI({
      apiKey: config.closeaiApiKey,
      baseURL: config.closeaiOpenaiBaseUrl,
    });

    // Claude client (via CloseAI); this assumes the proxy accepts
    // OpenAI-format chat requests at the Claude base URL
    this.claudeClient = new OpenAI({
      apiKey: config.closeaiApiKey,
      baseURL: config.closeaiClaudeBaseUrl,
    });
  }

  /** Build the message list, prepending the system prompt when one is given. */
  private buildMessages(prompt: string, systemPrompt?: string): ChatMessage[] {
    const messages: ChatMessage[] = [];
    if (systemPrompt) {
      messages.push({ role: 'system', content: systemPrompt });
    }
    messages.push({ role: 'user', content: prompt });
    return messages;
  }

  /**
   * Call GPT-5-Pro
   */
  async chatWithGPT5(prompt: string, systemPrompt?: string) {
    const response = await this.openaiClient.chat.completions.create({
      model: 'gpt-5-pro',
      messages: this.buildMessages(prompt, systemPrompt),
      temperature: 0.3,
      max_tokens: 2000,
    });

    return {
      content: response.choices[0].message.content,
      usage: response.usage,
      model: 'gpt-5-pro',
    };
  }

  /**
   * Call Claude-4.5-Sonnet
   */
  async chatWithClaude(prompt: string, systemPrompt?: string) {
    const response = await this.claudeClient.chat.completions.create({
      model: 'claude-sonnet-4-5-20250929',
      messages: this.buildMessages(prompt, systemPrompt),
      temperature: 0.3,
      max_tokens: 2000,
    });

    return {
      content: response.choices[0].message.content,
      usage: response.usage,
      model: 'claude-sonnet-4-5-20250929',
    };
  }

  /**
   * Streaming response (GPT-5)
   */
  async *streamGPT5(prompt: string, systemPrompt?: string) {
    const stream = await this.openaiClient.chat.completions.create({
      model: 'gpt-5-pro',
      messages: this.buildMessages(prompt, systemPrompt),
      temperature: 0.3,
      max_tokens: 2000,
      stream: true,
    });

    // Yield only the non-empty text deltas
    for await (const chunk of stream) {
      const content = chunk.choices[0]?.delta?.content || '';
      if (content) {
        yield content;
      }
    }
  }
}
```

### 3. Unified LLM service (four models)

**File location:** `backend/src/common/llm/llm.service.ts`

```typescript
import OpenAI from 'openai';
import { config } from '../../config/env';

export type LLMProvider = 'deepseek' | 'gpt5' | 'claude' | 'qwen';

export class UnifiedLLMService {
  private deepseek: OpenAI;
  private gpt5: OpenAI;
  private claude: OpenAI;
  private qwen: OpenAI;

  constructor() {
    // DeepSeek (direct connection)
    this.deepseek = new OpenAI({
      apiKey: config.deepseekApiKey,
      baseURL: config.deepseekBaseUrl,
    });

    // GPT-5 (via CloseAI)
    this.gpt5 = new OpenAI({
      apiKey: config.closeaiApiKey,
      baseURL: config.closeaiOpenaiBaseUrl,
    });

    // Claude (via CloseAI)
    this.claude = new OpenAI({
      apiKey: config.closeaiApiKey,
      baseURL: config.closeaiClaudeBaseUrl,
    });

    // Qwen (fallback)
    this.qwen = new OpenAI({
      apiKey: config.dashscopeApiKey,
      baseURL: 'https://dashscope.aliyuncs.com/compatible-mode/v1',
    });
  }

  /**
   * Unified call interface
   */
  async chat(
    provider: LLMProvider,
    prompt: string,
    options?: {
      systemPrompt?: string;
      temperature?: number;
      maxTokens?: number;
    }
  ) {
    const { systemPrompt, temperature = 0.3, maxTokens = 2000 } = options || {};

    const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [];
    if (systemPrompt) {
      messages.push({ role: 'system', content: systemPrompt });
    }
    messages.push({ role: 'user', content: prompt });

    // Pick the client and model ID for the requested provider
    const modelMap = {
      deepseek: { client: this.deepseek, model: 'deepseek-chat' },
      gpt5: { client: this.gpt5, model: 'gpt-5-pro' },
      claude: { client: this.claude, model: 'claude-sonnet-4-5-20250929' },
      qwen: { client: this.qwen, model: 'qwen-max' },
    };

    const { client, model } = modelMap[provider];

    const response = await client.chat.completions.create({
      model,
      messages,
      temperature,
      max_tokens: maxTokens,
    });

    return {
      content: response.choices[0].message.content || '',
      usage: response.usage,
      model,
      provider,
    };
  }
}
```

---

## 🎯 AI Smart Literature (ASL) use cases

### Scenario 1: Dual-model comparison screening (recommended) ⭐

**Strategy:** DeepSeek (fast first pass) + GPT-5 (quality review)

```typescript
export class LiteratureScreeningService {
  private llm: UnifiedLLMService;

  constructor() {
    this.llm = new UnifiedLLMService();
  }

  /** Build the PICO screening prompt shared by the screening strategies. */
  private buildScreeningPrompt(title: string, abstract: string, picoConfig: any): string {
    return `
Based on the PICO criteria below, decide whether this paper should be included:

**PICO criteria:**
- Population: ${picoConfig.population}
- Intervention: ${picoConfig.intervention}
- Comparison: ${picoConfig.comparison}
- Outcome: ${picoConfig.outcome}

**Paper:**
Title: ${title}
Abstract: ${abstract}

Respond in JSON:
{
  "decision": "include/exclude/uncertain",
  "reason": "rationale for the decision",
  "confidence": 0.0-1.0
}
`;
  }

  /**
   * Dual-model literature screening
   */
  async screenLiterature(title: string, abstract: string, picoConfig: any) {
    const prompt = this.buildScreeningPrompt(title, abstract, picoConfig);

    // Call both models in parallel
    const [deepseekResult, gpt5Result] = await Promise.all([
      this.llm.chat('deepseek', prompt),
      this.llm.chat('gpt5', prompt),
    ]);

    // Parse the results (assumes each model returns bare JSON)
    const deepseekDecision = JSON.parse(deepseekResult.content);
    const gpt5Decision = JSON.parse(gpt5Result.content);

    // If both models agree, accept the decision
    if (deepseekDecision.decision === gpt5Decision.decision) {
      return {
        finalDecision: deepseekDecision.decision,
        consensus: 'high',
        models: [deepseekDecision, gpt5Decision],
      };
    }

    // If they disagree, return both opinions for manual review
    return {
      finalDecision: 'uncertain',
      consensus: 'low',
      models: [deepseekDecision, gpt5Decision],
      needManualReview: true,
    };
  }
}
```

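`JSON.parse` throws if a model wraps its answer in a Markdown code fence or adds text around the JSON object, which would fail the whole screening call. The following is a hedged sketch of a more tolerant parser. `ScreeningDecision` and `parseDecision` are hypothetical names, and the validation rules are assumptions derived from the JSON shape requested in the prompt.

```typescript
type Decision = 'include' | 'exclude' | 'uncertain';

interface ScreeningDecision {
  decision: Decision;
  reason: string;
  confidence: number;
}

const DECISIONS: readonly string[] = ['include', 'exclude', 'uncertain'];

// Extract the outermost {...} span from raw model output and validate its fields.
// Falls back to an "uncertain" decision instead of throwing, so that one
// malformed reply sends the paper to manual review rather than failing a batch.
function parseDecision(raw: string): ScreeningDecision {
  const fallback: ScreeningDecision = {
    decision: 'uncertain',
    reason: 'Model output could not be parsed',
    confidence: 0,
  };

  const start = raw.indexOf('{');
  const end = raw.lastIndexOf('}');
  if (start === -1 || end <= start) return fallback;

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw.slice(start, end + 1));
  } catch {
    return fallback;
  }
  if (typeof parsed !== 'object' || parsed === null) return fallback;

  const obj = parsed as Record<string, unknown>;
  if (typeof obj.decision !== 'string' || !DECISIONS.includes(obj.decision)) {
    return fallback;
  }

  const confidence = typeof obj.confidence === 'number' ? obj.confidence : 0;
  return {
    decision: obj.decision as Decision,
    reason: typeof obj.reason === 'string' ? obj.reason : '',
    // Clamp to the 0.0-1.0 range requested in the prompt.
    confidence: Math.min(1, Math.max(0, confidence)),
  };
}
```

Swapping this in for the bare `JSON.parse` calls is one option. Whether a parse failure should count as "uncertain" or be retried is a design choice for the project.
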
### Scenario 2: Three-model consensus arbitration

**Strategy:** When the two models conflict, bring in Claude as a third-party arbitrator

```typescript
async screenWithArbitration(title: string, abstract: string, picoConfig: any) {
  // Round 1: dual-model screening
  const initialScreen = await this.screenLiterature(title, abstract, picoConfig);

  // If the two models agree, return immediately
  if (initialScreen.consensus === 'high') {
    return initialScreen;
  }

  // Otherwise, bring in Claude as arbitrator
  console.log('Dual-model results disagree; invoking Claude arbitration...');

  // buildScreeningPrompt is assumed to return the same PICO prompt that screenLiterature uses
  const prompt = this.buildScreeningPrompt(title, abstract, picoConfig);
  const claudeResult = await this.llm.chat('claude', prompt);
  const claudeDecision = JSON.parse(claudeResult.content);

  // Three-model vote
  const decisions: string[] = [
    initialScreen.models[0].decision,
    initialScreen.models[1].decision,
    claudeDecision.decision,
  ];

  const voteCount: Record<string, number> = {
    include: decisions.filter(d => d === 'include').length,
    exclude: decisions.filter(d => d === 'exclude').length,
    uncertain: decisions.filter(d => d === 'uncertain').length,
  };

  // Majority rule (a three-way tie resolves to 'uncertain', the last key)
  const finalDecision = Object.keys(voteCount).reduce((a, b) =>
    voteCount[a] > voteCount[b] ? a : b
  );

  return {
    finalDecision,
    consensus: voteCount[finalDecision] >= 2 ? 'medium' : 'low',
    models: [...initialScreen.models, claudeDecision],
    arbitration: true,
  };
}
```

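The voting step can be isolated into a small pure function, which makes the tie-breaking rule explicit and easy to unit-test. This is a sketch under assumptions: `majorityVote` and `VoteResult` are hypothetical names, and the rule "no strict majority means uncertain" is one reasonable reading of the strategy, not a confirmed requirement.

```typescript
type Decision = 'include' | 'exclude' | 'uncertain';

interface VoteResult {
  finalDecision: Decision;
  votes: Record<Decision, number>;
  consensus: 'high' | 'medium' | 'low';
}

// Count votes and pick the strict-majority decision. Without a strict majority
// the result is 'uncertain', so the paper goes to manual review.
function majorityVote(decisions: Decision[]): VoteResult {
  const votes: Record<Decision, number> = { include: 0, exclude: 0, uncertain: 0 };
  for (const d of decisions) {
    votes[d] += 1;
  }

  const total = decisions.length;
  const winner = (Object.keys(votes) as Decision[]).find((d) => votes[d] * 2 > total);
  const finalDecision: Decision = winner ?? 'uncertain';

  // 'high' = unanimous, 'medium' = majority, 'low' = no majority.
  let consensus: VoteResult['consensus'] = 'low';
  if (winner && votes[winner] === total) {
    consensus = 'high';
  } else if (winner) {
    consensus = 'medium';
  }
  return { finalDecision, votes, consensus };
}
```

For example, `majorityVote(['include', 'exclude', 'include'])` yields `include` with `medium` consensus, while three different votes yield `uncertain` with `low` consensus.
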
### Scenario 3: Cost optimization

**Strategy:** Use GPT-5 only to review results that are not clear-cut

```typescript
async screenWithCostOptimization(title: string, abstract: string, picoConfig: any) {
  // buildScreeningPrompt is assumed to return the same PICO prompt that screenLiterature uses
  const prompt = this.buildScreeningPrompt(title, abstract, picoConfig);

  // Round 1: fast first pass with DeepSeek (cheap)
  const quickScreen = await this.llm.chat('deepseek', prompt);
  const quickDecision = JSON.parse(quickScreen.content);

  // If the result is clear (include or exclude with confidence > 0.8), accept it
  if (quickDecision.confidence > 0.8 && quickDecision.decision !== 'uncertain') {
    return {
      finalDecision: quickDecision.decision,
      consensus: 'high',
      models: [quickDecision],
      costOptimized: true,
    };
  }

  // Otherwise, review with GPT-5
  const detailedScreen = await this.llm.chat('gpt5', prompt);
  const detailedDecision = JSON.parse(detailedScreen.content);

  return {
    finalDecision: detailedDecision.decision,
    consensus: 'medium',
    models: [quickDecision, detailedDecision],
    costOptimized: true,
  };
}
```

---

## 📊 Performance and cost comparison

### Model performance comparison

| Metric | DeepSeek-V3 | GPT-5-Pro | Claude-4.5 | Qwen-Max |
|------|------------|-----------|-----------|----------|
| **Accuracy** | 85% | **95%** ⭐ | 93% | 82% |
| **Speed** | **Fast** ⭐ | Medium | Medium | Fast |
| **Cost (per 1K tokens)** | **¥0.001** ⭐ | ¥0.10 | ¥0.021 | ¥0.004 |
| **Chinese comprehension** | **Excellent** ⭐ | Excellent | Good | Excellent |
| **Structured output** | Good | Excellent | **Excellent** ⭐ | Good |

### Estimated cost of screening 1,000 papers

**Strategy A: DeepSeek only**
- Cost: ¥20-30
- Accuracy: 85%
- Best for: limited budgets where some error is acceptable

**Strategy B: DeepSeek + GPT-5 dual model**
- Cost: ¥150-200
- Accuracy: 92%
- Best for: high quality requirements with sufficient budget ⭐ Recommended

**Strategy C: Three-model consensus (Claude invoked for the 20% that conflict)**
- Cost: ¥180-220
- Accuracy: 95%
- Best for: the highest quality requirements

**Strategy D: Cost-optimized (80% DeepSeek, 20% GPT-5)**
- Cost: ¥50-80
- Accuracy: 90%
- Best for: balancing quality and cost ⭐ Best value

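An estimate of this kind can be computed from the per-1K-token prices in the comparison table. The sketch below is illustrative only. `estimateCost` and `StrategyStep` are hypothetical names, and the 2,000 tokens per call is a placeholder assumption, not a measured figure. The result depends heavily on that value.

```typescript
// Per-1K-token prices in CNY, taken from the comparison table.
const PRICE_PER_1K: Record<string, number> = {
  deepseek: 0.001,
  gpt5: 0.1,
  claude: 0.021,
  qwen: 0.004,
};

interface StrategyStep {
  provider: string;
  // Fraction of papers (0..1) that reach this model.
  share: number;
}

// Estimated cost in CNY of screening `papers` papers, assuming every call
// consumes `tokensPerCall` tokens (prompt plus completion).
function estimateCost(papers: number, tokensPerCall: number, steps: StrategyStep[]): number {
  let total = 0;
  for (const { provider, share } of steps) {
    const price = PRICE_PER_1K[provider];
    if (price === undefined) {
      throw new Error(`Unknown provider: ${provider}`);
    }
    total += papers * share * (tokensPerCall / 1000) * price;
  }
  return total;
}

// Strategy D as implemented in Scenario 3: every paper goes to DeepSeek,
// and an assumed 20% are re-checked by GPT-5.
const strategyD: StrategyStep[] = [
  { provider: 'deepseek', share: 1.0 },
  { provider: 'gpt5', share: 0.2 },
];
const costD = estimateCost(1000, 2000, strategyD);
```

With the placeholder of 2,000 tokens per call, `costD` comes to about ¥42. Replacing the placeholder with measured token usage from real screening runs gives a more reliable figure.
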
---

## ⚠️ Notes

### 1. API key security

```typescript
// ❌ Wrong: hard-coded API key
const insecureClient = new OpenAI({
  apiKey: 'sk-your-closeai-api-key',
});

// ✅ Right: read the key from an environment variable
const client = new OpenAI({
  apiKey: process.env.CLOSEAI_API_KEY,
});
```

### 2. Error handling

```typescript
async chat(provider: LLMProvider, prompt: string) {
  try {
    const response = await this.llm.chat(provider, prompt);
    return response;
  } catch (error) {
    // Errors CloseAI may return (surfaced by the OpenAI SDK as APIError)
    if (error instanceof OpenAI.APIError) {
      if (error.status === 429) {
        // Rate limit
        console.error('API rate limit exceeded; please retry later');
      } else if (error.status === 401) {
        // Authentication failure
        console.error('Invalid API key; please check the configuration');
      } else if (error.status === 500) {
        // Server-side error
        console.error('CloseAI service error; please retry later');
      }
    }
    throw error;
  }
}
```

### 3. Request retries

```typescript
async chatWithRetry(provider: LLMProvider, prompt: string, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await this.llm.chat(provider, prompt);
    } catch (error) {
      if (i === maxRetries - 1) throw error;

      // Exponential backoff: 1s, 2s, 4s, ...
      const delay = Math.pow(2, i) * 1000;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  // Only reached when maxRetries is less than 1
  throw new Error('maxRetries must be at least 1');
}
```

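The retry loop retries every error, including ones such as 401 that cannot succeed on a second attempt. One possible refinement is to retry only transient failures and to add a cap and jitter to the delay. The helpers below are hypothetical (`isRetryableStatus`, `backoffDelayMs`), and the 30-second cap and "full jitter" scheme are assumptions, not project requirements.

```typescript
// Statuses worth retrying: rate limiting and transient server-side failures.
// Authentication (401) and other client errors will not succeed on retry.
function isRetryableStatus(status: number | undefined): boolean {
  if (status === undefined) return true; // network error, no HTTP response
  return status === 429 || status >= 500;
}

// Exponential backoff with a cap and "full jitter": a random delay between
// 0 and min(capMs, baseMs * 2^attempt). `random` is injectable for testing.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 1000,
  capMs: number = 30000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}
```

Inside the `catch` block, the loop could rethrow immediately when `isRetryableStatus(error.status)` is false and otherwise wait `backoffDelayMs(i)` milliseconds. Jitter spreads out retries from many concurrent screening calls so they do not hit the rate limit again at the same moment.
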
---

## 📚 Related documents

- [Environment setup guide](../../07-运维文档/01-环境配置指南.md#3-closeai配置代理openai和claude)
- [Environment variable template](../../07-运维文档/02-环境变量配置模板.md)
- [LLM gateway quick context](./[AI对接]%20LLM网关快速上下文.md)

---

**Changelog:**
- 2025-11-09: Document created; added the CloseAI integration guide
- Added support for the latest GPT-5-Pro and Claude-4.5-Sonnet models

149
docs/02-通用能力层/01-LLM大模型网关/README.md
Normal file
@@ -0,0 +1,149 @@
# LLM Gateway

> **Capability role:** Core capability of the common capability layer
> **Reuse rate:** 71% (5 dependent modules)
> **Priority:** P0 (highest) ⭐
> **Status:** ❌ Not yet implemented

---

## 📋 Overview

The LLM gateway is the hub of the platform's AI capabilities. It is responsible for:
- Managing all LLM calls in one place
- Switching models dynamically according to the user's plan
- Cost control and rate limiting
- Token counting and billing

---

## 🎯 Core value

### 1. Technical foundation of the business model ⭐
```
Professional → DeepSeek-V3 (cheap, ¥1 per million tokens)
Premium      → DeepSeek + Qwen3
Flagship     → DeepSeek + Qwen3 + Qwen-Long + Claude
```

### 2. Cost control
- Monitor all LLM API calls in one place
- Rate-limit automatically when a quota is exceeded
- Bill by plan

### 3. Unified interface
- Hide the differences between LLM vendor APIs
- Expose one consistent calling interface

---

## 📊 Dependent modules

**5 of the 7 business modules depend on the gateway (71% reuse rate):**
1. **AIA** - AI Smart Q&A
2. **ASL** - AI Smart Literature (dual-model judgment)
3. **PKB** - Personal Knowledge Base (RAG Q&A)
4. **DC** - Data Cleaning (NER extraction)
5. **RVW** - Manuscript Review (AI evaluation)

---

## 💡 Core functions

### 1. Model selection
```typescript
selectModel(userId: string, preferredModel?: string): string
// Choose a suitable model based on the user's plan and quota
```

### 2. Unified call
```typescript
chat(params: {
  userId: string;
  modelType?: ModelType;
  messages: Message[];
  stream?: boolean;
}): Promise<ChatResponse>
```

### 3. Quota management
```typescript
checkQuota(userId: string): Promise<QuotaInfo>
// Check the user's remaining quota
```

### 4. Token counting
```typescript
countTokens(text: string): number
// Count tokens with tiktoken
```

---

## 📂 Document structure

```
01-LLM大模型网关/
├── [AI对接] LLM网关快速上下文.md        # ✅ Done
├── 03-CloseAI集成指南.md                # ✅ Done ⭐
├── 00-需求分析/
│   └── README.md
├── 01-设计文档/
│   ├── 01-详细设计.md                   # ⏳ To be created in Week 5
│   ├── 02-数据库设计.md                 # ⏳ To be created in Week 5
│   ├── 03-API设计.md                    # ⏳ To be created in Week 5
│   └── README.md
└── README.md                            # ✅ This document
```

### Quick-start documents ⭐

| Document | Description | Status |
|------|------|------|
| **[AI对接] LLM网关快速上下文.md** | Quick overview of the LLM gateway design | ✅ Done |
| **03-CloseAI集成指南.md** | CloseAI (GPT-5 + Claude-4.5) integration guide ⭐ | ✅ Done |

---

## ⚠️ Development plan adjustment

### Original plan: finish the LLM gateway in Week 2
**Adjustment:** The full LLM gateway implementation is postponed to Week 5 ✅

**Reasons:**
1. The existing LLM calls (DeepSeek, Qwen) already work
2. The CloseAI integration is configured and ready to use
3. ASL development is not blocked; it can start with simple direct calls
4. By Week 5 several modules will have real usage, which makes it more sensible to extract a unified gateway then

### Available now (Week 3, ASL development) ✅
- ✅ DeepSeek API (direct connection)
- ✅ GPT-5-Pro API (via the CloseAI proxy)
- ✅ Claude-4.5 API (via the CloseAI proxy)
- ✅ Qwen API (DashScope)
- ✅ Basic call examples for all four models

### Week 5 completion (unified LLM gateway)
- Unified calling interface
- Plan tiers (Professional / Premium / Flagship)
- Quota management and rate limiting
- Token counting and billing
- Usage records and monitoring

---

## 🔗 Related documents

- [Common capability layer overview](../README.md)
- [Layered system architecture design](../../00-系统总体设计/01-系统架构分层设计.md)

---

**Last updated:** 2025-11-06
**Maintainer:** Technical architect

535
docs/02-通用能力层/01-LLM大模型网关/[AI对接] LLM网关快速上下文.md
Normal file
@@ -0,0 +1,535 @@
# [AI Integration] LLM Gateway Quick Context

> **Reading time:** 5 minutes | **Token cost:** ~2000 tokens
> **Level:** L2 | **Priority:** P0 ⭐⭐⭐⭐⭐
> **Prerequisite reading:** 02-通用能力层/[AI对接] 通用能力快速上下文.md

---

## 📋 Capability role

**The LLM gateway is the hub for every AI call on the platform and the technical foundation of the business model.**

**Why it is P0:**
- 71% of the business modules depend on it (5 modules: AIA, ASL, PKB, DC, RVW)
- It is a **prerequisite** for ASL module development
- It is the **technical foundation** of the business model (Feature Flags + cost control)

**Status:** ❌ Not yet implemented
**Suggested timing:** Develop alongside ASL Week 1 (Day 1-3)

---

## 🎯 Core functions

### 1. Select the model by user plan ⭐⭐⭐⭐⭐

**Business value:**
```
Professional (¥99/month)  → DeepSeek-V3 (¥1 per million tokens)
Premium (¥299/month)      → DeepSeek + Qwen3-72B (¥5 per million tokens)
Flagship (¥999/month)     → All models (including Claude/GPT)
```

**Implementation:**
```typescript
// Look up the user's Feature Flags
const userFlags = await featureFlagService.getUserFlags(userId);

// Use the Feature Flags to decide which models are available
if (requestModel === 'claude-3.5' && !userFlags.includes('claude_access')) {
  throw new Error('Your plan does not include Claude models; please upgrade to Flagship');
}

// Or downgrade automatically
if (!userFlags.includes('claude_access')) {
  model = 'deepseek-v3'; // fall back to DeepSeek
}
```

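The two behaviours in the fragment (reject, or downgrade automatically) can be combined into one pure function. This is a sketch under assumptions: `resolveModel`, `REQUIRED_FLAG` and the flag name `qwen_access` are hypothetical. Only `claude_access` appears in the original snippet.

```typescript
type ModelType = 'deepseek-v3' | 'qwen3' | 'claude-3.5';

// Hypothetical mapping from model to the Feature Flag that unlocks it.
// 'claude_access' comes from the snippet; 'qwen_access' is an assumed name.
const REQUIRED_FLAG: Record<ModelType, string | null> = {
  'deepseek-v3': null, // available on every plan
  'qwen3': 'qwen_access',
  'claude-3.5': 'claude_access',
};

interface ModelResolution {
  model: ModelType;
  downgraded: boolean;
}

// Return the requested model if the user's flags allow it. Otherwise either
// downgrade to DeepSeek or throw, depending on `allowDowngrade`.
function resolveModel(
  requested: ModelType,
  userFlags: string[],
  allowDowngrade: boolean,
): ModelResolution {
  const flag = REQUIRED_FLAG[requested];
  if (flag === null || userFlags.includes(flag)) {
    return { model: requested, downgraded: false };
  }
  if (allowDowngrade) {
    return { model: 'deepseek-v3', downgraded: true };
  }
  throw new Error(`Your plan does not include ${requested}; please upgrade`);
}
```

Returning a `downgraded` marker lets the caller tell the user that a cheaper model answered, which may matter for screening quality.
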
---

### 2. Unified calling interface ⭐⭐⭐⭐⭐

**Problem:** LLM vendors use different API formats
- OpenAI format
- Anthropic format
- Domestic model formats (DeepSeek, Qwen)

**Solution:** A unified interface plus the adapter pattern

```typescript
// Business modules all call the same interface
const response = await llmGateway.chat({
  userId: 'user123',
  modelType: 'deepseek-v3', // or 'qwen3', 'claude-3.5'
  messages: [
    { role: 'user', content: 'Help me analyze this paper...' }
  ],
  stream: false
});

// Inside the LLM gateway:
// 1. Check user access (Feature Flag)
// 2. Check quota
// 3. Select the matching adapter
// 4. Call the API
// 5. Record the cost
// 6. Return a unified response format
```

---

### 3. Cost control ⭐⭐⭐⭐

**Core requirements:**
- Every user has a monthly quota
- Requests are rate-limited automatically once the quota is exceeded
- Costs are tracked in real time

**Implementation:**
```typescript
// Check the quota before the call
async function checkQuota(userId: string): Promise<boolean> {
  const usage = await getMonthlyUsage(userId);
  const quota = await getUserQuota(userId);

  if (usage.tokenCount >= quota.maxTokens) {
    throw new QuotaExceededError('Your monthly quota is used up; please upgrade your plan');
  }

  return true;
}

// Record the cost after the call
async function recordUsage(userId: string, usage: {
  modelType: string;
  tokenCount: number;
  cost: number;
}) {
  await db.llmUsage.create({
    userId,
    modelType: usage.modelType,
    inputTokens: usage.tokenCount,
    cost: usage.cost,
    timestamp: new Date()
  });
}
```

---

### 4. Unified streaming and non-streaming handling ⭐⭐⭐

**Scenarios:**
- AIA Smart Q&A → needs streaming output (shown in real time)
- ASL literature screening → non-streaming (batch processing)

**Unified interface:**
```typescript
interface ChatOptions {
  userId: string;
  modelType: ModelType;
  messages: Message[];
  stream: boolean; // whether to stream the output
  temperature?: number;
  maxTokens?: number;
}

// Streaming
const stream = await llmGateway.chat({ ...options, stream: true });
for await (const chunk of stream) {
  console.log(chunk.content);
}

// Non-streaming
const response = await llmGateway.chat({ ...options, stream: false });
console.log(response.content);
```

---

## 🏗️ Technical architecture

### Directory structure
```
backend/src/modules/llm-gateway/
├── controllers/
│   └── llmController.ts          # HTTP endpoints
├── services/
│   ├── llmGatewayService.ts      # Core service ⭐
│   ├── featureFlagService.ts     # Feature Flag lookup
│   ├── quotaService.ts           # Quota management
│   └── usageService.ts           # Usage statistics
├── adapters/                     # Adapter pattern ⭐
│   ├── baseAdapter.ts
│   ├── deepseekAdapter.ts
│   ├── qwenAdapter.ts
│   ├── claudeAdapter.ts
│   └── openaiAdapter.ts
├── types/
│   └── llm.types.ts
└── routes/
    └── llmRoutes.ts
```

---

### Core class design

#### 1. LLMGatewayService (core)
```typescript
class LLMGatewayService {
  private adapters: Map<ModelType, BaseLLMAdapter>;

  async chat(options: ChatOptions): Promise<ChatResponse | AsyncIterator<ChatChunk>> {
    // 1. Verify user access (Feature Flag)
    await this.checkAccess(options.userId, options.modelType);

    // 2. Check quota
    await quotaService.checkQuota(options.userId);

    // 3. Select the adapter
    const adapter = this.adapters.get(options.modelType);
    if (!adapter) {
      throw new Error(`No adapter registered for model: ${options.modelType}`);
    }

    // 4. Call the LLM API
    const response = await adapter.chat(options);

    // 5. Record usage
    await usageService.record({
      userId: options.userId,
      modelType: options.modelType,
      tokenCount: response.tokenUsage,
      cost: this.calculateCost(options.modelType, response.tokenUsage)
    });

    // 6. Return the result
    return response;
  }

  // Prices are in CNY per token
  private calculateCost(modelType: ModelType, tokens: number): number {
    const prices = {
      'deepseek-v3': 0.000001, // ¥1 per million tokens
      'qwen3-72b': 0.000005,   // ¥5 per million tokens
      'claude-3.5': 0.00003    // ¥30 per million tokens; verify against the vendor price ($15 per million tokens) and the exchange rate
    };
    return tokens * prices[modelType];
  }
}
```

#### 2. BaseLLMAdapter (adapter base class)
```typescript
abstract class BaseLLMAdapter {
  abstract chat(options: ChatOptions): Promise<ChatResponse>;
  abstract chatStream(options: ChatOptions): AsyncIterator<ChatChunk>;

  // Translate unified options into the vendor's request body
  protected abstract buildRequest(options: ChatOptions): any;
  // Translate the vendor's response into the unified format
  protected abstract parseResponse(response: any): ChatResponse;
}
```

#### 3. DeepSeekAdapter (example implementation)
```typescript
class DeepSeekAdapter extends BaseLLMAdapter {
  private apiKey: string;
  private baseUrl = 'https://api.deepseek.com/v1';

  // chatStream is omitted from this example

  async chat(options: ChatOptions): Promise<ChatResponse> {
    const request = this.buildRequest(options);

    const response = await fetch(`${this.baseUrl}/chat/completions`, {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${this.apiKey}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify(request)
    });

    // Surface HTTP errors instead of parsing an error body as a completion
    if (!response.ok) {
      throw new Error(`DeepSeek API error: HTTP ${response.status}`);
    }

    const data = await response.json();
    return this.parseResponse(data);
  }

  protected buildRequest(options: ChatOptions) {
    return {
      model: 'deepseek-chat',
      messages: options.messages,
      // `??` keeps an explicit temperature of 0 instead of replacing it with the default
      temperature: options.temperature ?? 0.7,
      max_tokens: options.maxTokens ?? 4096,
      stream: options.stream ?? false
    };
  }

  protected parseResponse(response: any): ChatResponse {
    return {
      content: response.choices[0].message.content,
      tokenUsage: response.usage.total_tokens,
      finishReason: response.choices[0].finish_reason
    };
  }
}
```

---

## 📊 Database design

### platform_schema.llm_usage
```sql
CREATE TABLE platform_schema.llm_usage (
  id SERIAL PRIMARY KEY,
  user_id INTEGER REFERENCES platform_schema.users(id),
  model_type VARCHAR(50) NOT NULL,      -- 'deepseek-v3', 'qwen3', 'claude-3.5'
  input_tokens INTEGER NOT NULL,
  output_tokens INTEGER NOT NULL,
  total_tokens INTEGER NOT NULL,
  cost DECIMAL(10, 6) NOT NULL,         -- actual cost (CNY)
  request_id VARCHAR(100),              -- request_id returned by the LLM API
  module VARCHAR(50),                   -- calling module: 'AIA', 'ASL', 'PKB', etc.
  created_at TIMESTAMP DEFAULT NOW()
);

-- PostgreSQL syntax is assumed (SERIAL, schema-qualified names); it does not
-- accept inline INDEX clauses, so the indexes are created separately
CREATE INDEX idx_user_created ON platform_schema.llm_usage (user_id, created_at);
CREATE INDEX idx_module ON platform_schema.llm_usage (module);
```

### platform_schema.llm_quotas
```sql
CREATE TABLE platform_schema.llm_quotas (
  id SERIAL PRIMARY KEY,
  user_id INTEGER REFERENCES platform_schema.users(id) UNIQUE,
  monthly_token_limit INTEGER NOT NULL, -- monthly token quota
  monthly_cost_limit DECIMAL(10, 2),    -- monthly cost cap (optional)
  reset_day INTEGER DEFAULT 1,          -- day of month the quota resets (1-28)
  created_at TIMESTAMP DEFAULT NOW(),
  updated_at TIMESTAMP DEFAULT NOW()
);
```

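The `reset_day` column implies a small date calculation to produce the `resetDate` value that the quota API returns. The sketch below assumes UTC dates and the 1-28 range stated in the schema comment. `nextResetDate` and `formatDate` are hypothetical helper names.

```typescript
// Next quota reset date (UTC) for a given reset day of the month (1-28).
// If `now` is on or after this month's reset day, the next reset is next month.
function nextResetDate(now: Date, resetDay: number): Date {
  if (!Number.isInteger(resetDay) || resetDay < 1 || resetDay > 28) {
    throw new RangeError('resetDay must be an integer between 1 and 28');
  }
  const year = now.getUTCFullYear();
  const month = now.getUTCMonth();
  const thisMonth = new Date(Date.UTC(year, month, resetDay));
  if (now.getTime() < thisMonth.getTime()) {
    return thisMonth;
  }
  // Date.UTC normalizes month 12 to January of the following year.
  return new Date(Date.UTC(year, month + 1, resetDay));
}

// Format as YYYY-MM-DD, the shape used by the `resetDate` field of the quota API.
function formatDate(d: Date): string {
  return d.toISOString().slice(0, 10);
}
```

Limiting `reset_day` to 1-28 avoids the question of what to do in months that have no 29th, 30th or 31st day, which is presumably why the schema uses that range.
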
---

## 📋 API endpoints

### 1. Chat (non-streaming)
```
POST /api/v1/llm/chat

Request:
{
  "modelType": "deepseek-v3",
  "messages": [
    { "role": "user", "content": "Analyze this paper..." }
  ],
  "temperature": 0.7,
  "maxTokens": 4096
}

Response:
{
  "content": "Based on the paper's content...",
  "tokenUsage": {
    "input": 150,
    "output": 500,
    "total": 650
  },
  "cost": 0.00065,
  "modelType": "deepseek-v3"
}
```

### 2. Chat (streaming)
```
POST /api/v1/llm/chat/stream

Request: same as above, plus "stream": true

Response: Server-Sent Events (SSE)
data: {"chunk": "Based on", "tokenUsage": 1}
data: {"chunk": " the paper", "tokenUsage": 1}
...
data: {"done": true, "totalTokens": 650, "cost": 0.00065}
```

### 3. Query quota
```
GET /api/v1/llm/quota

Response:
{
  "monthlyLimit": 1000000,
  "used": 245000,
  "remaining": 755000,
  "resetDate": "2025-12-01"
}
```

### 4. Usage statistics
```
GET /api/v1/llm/usage?startDate=2025-11-01&endDate=2025-11-30

Response:
{
  "totalTokens": 245000,
  "totalCost": 1.23,
  "byModel": {
    "deepseek-v3": { "tokens": 200000, "cost": 0.20 },
    "qwen3-72b": { "tokens": 45000, "cost": 0.23 }
  },
  "byModule": {
    "AIA": 100000,
    "ASL": 120000,
    "PKB": 25000
  }
}
```

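---

## ⚠️ Key technical challenges

### 1. Implementing streaming output
**Approach:** Server-Sent Events (SSE)

```typescript
// Backend (Fastify)
app.post('/api/v1/llm/chat/stream', async (req, reply) => {
  reply.hijack(); // write to the raw response; Fastify will not send its own reply
  reply.raw.setHeader('Content-Type', 'text/event-stream');
  reply.raw.setHeader('Cache-Control', 'no-cache');
  reply.raw.setHeader('Connection', 'keep-alive');

  const stream = await llmGateway.chatStream(req.body);

  for await (const chunk of stream) {
    reply.raw.write(`data: ${JSON.stringify(chunk)}\n\n`);
  }

  reply.raw.end();
});

// Frontend (React). EventSource can only issue GET requests, so the POST
// endpoint is read with fetch and a stream reader instead.
// `chatRequest` stands for the same body as the non-streaming endpoint.
const response = await fetch('/api/v1/llm/chat/stream', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ ...chatRequest, stream: true }),
});
const reader = response.body!.getReader();
const decoder = new TextDecoder();
let buffer = '';
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  const events = buffer.split('\n\n'); // SSE events end with a blank line
  buffer = events.pop() ?? '';
  for (const event of events) {
    if (!event.startsWith('data: ')) continue;
    const data = JSON.parse(event.slice('data: '.length));
    if (data.chunk) setMessages(prev => [...prev, data.chunk]);
  }
}
```
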
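When a streaming endpoint is consumed with `fetch`, the SSE text has to be split into events by hand, and network chunks can end in the middle of an event. The parsing step can be written as a pure function that is easy to unit-test. `parseSSE` and `ParsedSSE` are hypothetical names. The sketch handles only `data:` fields, which is all the endpoint described here emits.

```typescript
interface ParsedSSE {
  events: string[];  // complete `data` payloads, in order
  remainder: string; // trailing partial event to prepend to the next chunk
}

// Split a buffer of SSE text into complete events. Events end with a blank
// line; multiple `data:` lines within one event are joined with '\n'.
function parseSSE(buffer: string): ParsedSSE {
  const normalized = buffer.replace(/\r\n/g, '\n');
  const blocks = normalized.split('\n\n');
  // The last block is incomplete unless the buffer ended with a blank line.
  const remainder = blocks.pop() ?? '';

  const events: string[] = [];
  for (const block of blocks) {
    const dataLines = block
      .split('\n')
      .filter((line) => line.startsWith('data:'))
      .map((line) => line.slice(5).replace(/^ /, ''));
    if (dataLines.length > 0) {
      events.push(dataLines.join('\n'));
    }
  }
  return { events, remainder };
}
```

A reader loop would call `parseSSE(remainder + newText)` for each network chunk, pass every entry of `events` to `JSON.parse`, and carry `remainder` over to the next iteration.
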
---

### 2. Error handling and retries
```typescript
const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms));

async function chatWithRetry(options: ChatOptions, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await llmGateway.chat(options);
    } catch (error: any) {
      // Retry only rate-limit errors, and only while attempts remain
      if (error.code === 'RATE_LIMIT' && i < maxRetries - 1) {
        await sleep(2000 * 2 ** i); // exponential backoff: 2s, 4s, 8s, ...
        continue;
      }
      throw error;
    }
  }
}
```

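---

### 3. Token counting (accurate billing)
**Problem:** Different models use different tokenizers

**Solution:**
- Use the token counts returned by each vendor's API (most accurate)
- Fallback: the tiktoken library (OpenAI tokenizer)

```typescript
import { encoding_for_model, get_encoding, TiktokenModel } from 'tiktoken';

function estimateTokens(text: string, model: string): number {
  let encoder;
  try {
    encoder = encoding_for_model(model as TiktokenModel);
  } catch {
    // tiktoken only knows OpenAI model names; for other models fall back
    // to a generic encoding, which gives an approximation only
    encoder = get_encoding('cl100k_base');
  }
  const tokens = encoder.encode(text);
  encoder.free(); // release the WASM-backed encoder
  return tokens.length;
}
```
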
---

## 📅 Development plan (3 days)

### Day 1: Foundation (6-8 hours)
- [ ] Create the directory structure
- [ ] Implement the BaseLLMAdapter abstract class
- [ ] Implement DeepSeekAdapter
- [ ] Create the database tables (llm_usage, llm_quotas)
- [ ] Basic API endpoint (non-streaming)

### Day 2: Core features (6-8 hours)
- [ ] Feature Flag integration
- [ ] Quota checking and usage recording
- [ ] Implement QwenAdapter
- [ ] Error handling and retry mechanism
- [ ] Unit tests

### Day 3: Streaming output + polish (6-8 hours)
- [ ] Implement streaming output (SSE)
- [ ] Frontend SSE handling
- [ ] Cost statistics API
- [ ] Quota query API
- [ ] Integration tests
- [ ] Finish the documentation

---

## ✅ Development checklist

**Before starting:**
- [ ] The Feature Flag table exists (platform_schema.feature_flags)
- [ ] The users table has a version field (professional/premium/enterprise)
- [ ] API keys for every LLM vendor are configured
- [ ] The Prisma schema is updated

**During development:**
- [ ] Every adapter has complete error handling
- [ ] Every LLM call is recorded in the llm_usage table
- [ ] The quota check runs before every call
- [ ] Both streaming and non-streaming modes are tested

**When finished:**
- [ ] The ASL module can call the LLM gateway successfully
- [ ] ADMIN can view per-user LLM usage statistics
- [ ] Requests over quota are rejected correctly

---

## 🔗 Related documents

**Depends on:**
- [User and Access Management (UAM)](../../01-平台基础层/01-用户与权限中心(UAM)/README.md) - Feature Flags
- [Operations admin console](../../03-业务模块/ADMIN-运营管理端/README.md) - LLM model management

**Depended on by:**
- [ASL - AI Smart Literature](../../03-业务模块/ASL-AI智能文献/README.md) ⭐ P0
- [AIA - AI Smart Q&A](../../03-业务模块/AIA-AI智能问答/README.md)
- [PKB - Personal Knowledge Base](../../03-业务模块/PKB-个人知识库/README.md)

---

**Last updated:** 2025-11-06
**Maintainer:** Technical architect
**Priority:** P0 ⭐⭐⭐⭐⭐