
# PKB Personal Knowledge Base Feature Review Report - Phase 0
> **Review date:** 2026-01-06
> **Reviewer:** AI Assistant
> **Review goal:** Understand PKB's existing functionality in depth in preparation for a safe migration
> **Status:** ✅ In progress
---
## 📋 Executive Summary
### Key Findings
**🎯 The PKB system is actually two closely related functional modules:**
```
Part 1: PKB Knowledge Base Management Module
├─ Location: backend/src/legacy/controllers/knowledgeBaseController.ts
├─ Functions: create, edit, and delete knowledge bases; upload and manage documents
└─ Database: pkb_schema (standalone schema, no migration needed)
Part 2: PKB usage inside the AIA Intelligent Q&A Module
├─ Location: backend/src/legacy/controllers/chatController.ts
├─ Functions: knowledge-base-backed intelligent Q&A (3 working modes)
└─ Working modes:
   ├─ Full Text mode (?5-50 documents, comprehensive analysis)
   ├─ Deep Read mode (1-5 documents, in-depth analysis)
   └─ Batch mode (3-50 documents, bulk extraction)
```
---
## 📊 Part 1: PKB Knowledge Base Management Module
### 1.1 File Structure
```
backend/src/legacy/
├─ controllers/
│  ├─ knowledgeBaseController.ts # API controller (342 lines)
│  └─ documentController.ts # Document upload controller
├─ services/
│  ├─ knowledgeBaseService.ts # Business logic (?65 lines)
│  ├─ documentService.ts # Document processing service
│  └─ tokenService.ts # Token counting and document selection
└─ routes/
   └─ knowledgeBases.ts # Route definitions
```
### 1.2 Core API Endpoints
#### Knowledge Base Management API
```typescript
// 1. Create a knowledge base
POST /api/v1/knowledge/create
Body: { name: string, description?: string }
// Checks the user's quota (kbQuota vs kbUsed) and creates a Dataset in Dify
// 2. List knowledge bases
GET /api/v1/knowledge/list
// 3. Get knowledge base details
GET /api/v1/knowledge/:id
// 4. Update a knowledge base
PUT /api/v1/knowledge/:id
Body: { name?: string, description?: string }
// 5. Delete a knowledge base
DELETE /api/v1/knowledge/:id
// Also cleans up the Dify Dataset and the knowledge base's documents
// 6. Search a knowledge base (RAG)
GET /api/v1/knowledge/:id/search?query=xxx&top_k=15
// Calls the Dify retrieveKnowledge API; top_k defaults to 15
// 7. Get knowledge base statistics
GET /api/v1/knowledge/:id/stats
// Includes token counts
// 8. Get the document selection (Full Text mode)
GET /api/v1/knowledge/:id/document-selection?max_files=7&max_tokens=750000
// Selects documents within the token limit
```
#### Document Management API
```typescript
// 9. Upload a document
POST /api/v1/documents/upload
Multipart: { file, kbId }
// Stores the file in OSS; supported formats: PDF/Word/TXT/Markdown
// Dify then processes the document
// Status flow: uploading → parsing → indexing → completed
// 10. Get document details
GET /api/v1/documents/:id
// 11. Delete a document
DELETE /api/v1/documents/:id
// Deletes the Document from Dify and the file from OSS
```
### 1.3 Database Schema
#### Table Structure (in pkb_schema)
```sql
-- Knowledge bases table
knowledge_bases
id (UUID, PK)
userId (String)
name (String)
description (String?)
difyDatasetId (String, UNIQUE) -- Dataset ID in Dify
fileCount (Int, default: 0)
totalSizeBytes (BigInt, default: 0)
createdAt (DateTime)
updatedAt (DateTime)
-- Documents table
documents
id (UUID, PK)
kbId (String, FK → knowledge_bases.id)
userId (String)
filename (String)
fileType (String) -- pdf/docx/txt/md
fileSizeBytes (BigInt)
fileUrl (String) -- OSS URL
difyDocumentId (String) -- Document ID in Dify
status (String) -- uploading/parsing/indexing/completed/error
progress (Int, 0-100)
errorMessage (String?)
segmentsCount (Int?) -- Number of segments indexed by Dify
tokensCount (Int?) -- Total token count
charCount (Int?) -- Character count
language (String?)
extractedText (String?) -- Extracted full text (used by Full Text mode)
extractionMethod (String?) -- marker/pymupdf/docx
extractionQuality (Float?)
uploadedAt (DateTime)
processedAt (DateTime?)
-- Batch tasks table
batch_tasks
id (UUID, PK)
userId (String)
kbId (String, FK → knowledge_bases.id)
name (String)
templateType (String)
templateId (String?)
prompt (String)
status (String) -- pending/running/completed/failed
totalDocuments (Int)
completedCount (Int, default: 0)
failedCount (Int, default: 0)
modelType (String)
concurrency (Int, default: 3)
startedAt (DateTime?)
completedAt (DateTime?)
durationSeconds (Int?)
createdAt (DateTime)
updatedAt (DateTime)
-- Batch results table
batch_results
id (UUID, PK)
taskId (String, FK → batch_tasks.id)
documentId (String, FK → documents.id)
status (String) -- success/failed
data (Json?) -- Extracted structured data
rawOutput (String?) -- Raw LLM output
errorMessage (String?)
processingTimeMs (Int?)
tokensUsed (Int?)
createdAt (DateTime)
-- Task templates table
task_templates
id (UUID, PK)
userId (String)
name (String)
description (String?)
prompt (String)
isPublic (Boolean, default: false)
outputFields (Json) -- Expected output fields
createdAt (DateTime)
updatedAt (DateTime)
```
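The `documents.status` column takes the values `uploading/parsing/indexing/completed/error`. As a minimal sketch, the allowed transitions could be encoded as a lookup table. The names `DocStatus`, `NEXT_STATUS`, and `canTransition` are hypothetical, and the rule that any in-progress state may move to `error` is an assumption that the report does not state.

```typescript
// Status values taken from the documents.status comment in the schema.
type DocStatus = 'uploading' | 'parsing' | 'indexing' | 'completed' | 'error';

// Assumed transition table: linear happy path, plus `error` from any in-progress state.
const NEXT_STATUS: Record<DocStatus, DocStatus[]> = {
  uploading: ['parsing', 'error'],
  parsing: ['indexing', 'error'],
  indexing: ['completed', 'error'],
  completed: [],
  error: [],
};

// Returns true when moving from `from` to `to` is allowed by the table above.
function canTransition(from: DocStatus, to: DocStatus): boolean {
  return NEXT_STATUS[from].indexOf(to) !== -1;
}
```

A guard like this could run before each status update so that a late callback cannot move a `completed` document back to `parsing`.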
#### Indexes
```sql
-- knowledge_bases
idx_pkb_knowledge_bases_user_id (userId)
idx_pkb_knowledge_bases_dify_dataset_id (difyDatasetId)
-- documents
idx_pkb_documents_kb_id (kbId)
idx_pkb_documents_user_id (userId)
idx_pkb_documents_status (status)
idx_pkb_documents_dify_document_id (difyDocumentId)
idx_pkb_documents_extraction_method (extractionMethod)
-- batch_tasks
idx_pkb_batch_tasks_kb_id (kbId)
idx_pkb_batch_tasks_user_id (userId)
idx_pkb_batch_tasks_status (status)
idx_pkb_batch_tasks_created_at (createdAt)
-- batch_results
idx_pkb_batch_results_task_id (taskId)
idx_pkb_batch_results_document_id (documentId)
idx_pkb_batch_results_status (status)
```
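### 1.4 Key Business Logic
#### Quota Management
```typescript
// Fields on the user table (platform_schema.users)
kbQuota: Int @default(3) // knowledge base quota
kbUsed: Int @default(0) // number already used
// Check when creating a knowledge base
if (user.kbUsed >= user.kbQuota) {
  throw new Error('Quota exhausted');
}
// Increment the counter after a successful creation
await prisma.user.update({
  where: { id: userId }, // `where` is required by Prisma; `userId` is assumed to be in scope
  data: { kbUsed: { increment: 1 } }
});
// Decrement the counter when a knowledge base is deleted
await prisma.user.update({
  where: { id: userId },
  data: { kbUsed: { decrement: 1 } }
});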
```
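The quota guard can be isolated as a pure function, which makes it easy to unit test apart from the database. This is a sketch only: `QuotaUser`, `remainingKbQuota`, and `assertCanCreateKb` are hypothetical names, and only the `kbQuota` and `kbUsed` fields come from the report.

```typescript
// Hypothetical shape: only kbQuota and kbUsed are taken from the report.
interface QuotaUser {
  kbQuota: number;
  kbUsed: number;
}

// Returns how many more knowledge bases the user may create (never negative).
function remainingKbQuota(user: QuotaUser): number {
  return Math.max(0, user.kbQuota - user.kbUsed);
}

// Mirrors the `kbUsed >= kbQuota` guard: throws when nothing is left.
function assertCanCreateKb(user: QuotaUser): void {
  if (remainingKbQuota(user) === 0) {
    throw new Error('Quota exhausted');
  }
}
```

With the default quota of 3, a user who has created one knowledge base has two left, and a user at 3 of 3 is rejected.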
#### Dify Integration
```typescript
// Create knowledge base → create a Dify Dataset
const difyDataset = await difyClient.createDataset({
  name: `${userId}_${name}_${Date.now()}`,
  description,
  indexing_technique: 'high_quality',
});
// Search knowledge base → call Dify RAG
const results = await difyClient.retrieveKnowledge(
difyDatasetId,
query,
{
retrieval_model: {
search_method: 'semantic_search',
top_k: 15,
},
}
);
```
#### Document Token Calculation (tokenService.ts)
```typescript
// Token limits
const TOKEN_LIMITS = {
  MAX_FILES: 7, // at most 7 documents
  MAX_TOTAL_TOKENS: 750000, // total token limit (Qwen-Long: 1M context - 250K conversation space)
  MAX_SINGLE_DOC_TOKENS: 200000, // maximum tokens for a single document
};
// Smart selection algorithm
function selectDocumentsForFullText(
  documentTokens,
  maxFiles,
  maxTokens
) {
  // Sort by token count, ascending (copy first so the caller's array is not mutated)
  const sorted = [...documentTokens].sort((a, b) => a.tokens - b.tokens);
  // Greedy selection
  let totalTokens = 0;
  const selected = [];
  for (const doc of sorted) {
    if (selected.length >= maxFiles) break;
    if (totalTokens + doc.tokens > maxTokens) break;
    if (doc.tokens > TOKEN_LIMITS.MAX_SINGLE_DOC_TOKENS) continue; // skip oversized documents
    selected.push(doc);
    totalTokens += doc.tokens;
  }
  // Everything that was not selected is reported as excluded
  const excludedDocs = sorted.filter((doc) => !selected.includes(doc));
  return { selected, totalTokens, excludedDocs };
}
```
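A typed, worked variant of the greedy selection may help when checking edge cases. It is a sketch under assumptions: `DocTokens`, `Selection`, and `selectForFullText` are hypothetical names, and the limits are the ones quoted in this report (7 files, 750,000 total tokens, 200,000 tokens per document). Because the input is sorted in ascending order, testing each document individually gives the same result as breaking out of the loop early.

```typescript
interface DocTokens {
  id: string;
  tokens: number;
}

interface Selection {
  selected: DocTokens[];
  excluded: DocTokens[];
  totalTokens: number;
}

// Per-document cap quoted in the report.
const MAX_SINGLE_DOC_TOKENS = 200_000;

// Greedy selection: smallest documents first, until a limit is hit.
function selectForFullText(
  docs: DocTokens[],
  maxFiles = 7,
  maxTokens = 750_000
): Selection {
  const sorted = [...docs].sort((a, b) => a.tokens - b.tokens);
  const selected: DocTokens[] = [];
  const excluded: DocTokens[] = [];
  let totalTokens = 0;
  for (const doc of sorted) {
    const fits =
      selected.length < maxFiles &&
      doc.tokens <= MAX_SINGLE_DOC_TOKENS &&
      totalTokens + doc.tokens <= maxTokens;
    if (fits) {
      selected.push(doc);
      totalTokens += doc.tokens;
    } else {
      excluded.push(doc);
    }
  }
  return { selected, excluded, totalTokens };
}

// Worked example: document "b" exceeds the per-document cap and is excluded.
const result = selectForFullText([
  { id: 'a', tokens: 120_000 },
  { id: 'b', tokens: 250_000 },
  { id: 'c', tokens: 40_000 },
]);
```

Here `result.selected` holds `c` then `a` (160,000 tokens in total) and `result.excluded` holds `b`.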
---
## 📊 Part 2: PKB Usage in the AIA Module
### 2.1 File Structure
```
backend/src/legacy/controllers/
└─ chatController.ts # General chat controller (contains the 3 modes)
frontend/src/
├─ pages/ChatPage.tsx # Main chat page
└─ components/
   ├─ FullTextMode.tsx # Full Text mode component
   ├─ DeepReadMode.tsx # Deep Read mode component
   └─ BatchMode.tsx # Batch mode component
```
### 2.2 The Three Working Modes in Detail
#### Mode 1: Full Text Mode
**Purpose:** comprehensive analysis of ?5-50 documents
**How it works:**
```typescript
// 1. Frontend: the user enters knowledge base mode → selects "Full Text"
const modeState = {
  baseMode: 'knowledge_base',
  kbMode: 'full_text',
  selectedKbId: 'xxx',
};
// 2. Frontend: smart-load the documents
const selection = await knowledgeBaseApi.getDocumentSelection(kbId, {
  max_files: 7,
  max_tokens: 750000,
});
// Returns: { selectedDocuments[], excludedDocuments[], totalTokens }
// 3. Frontend: switch to the Qwen-Long model automatically
if (modeState.kbMode === 'full_text') {
  setSelectedModel('qwen-long'); // 1M context
  showToast('Switched to the Qwen-Long model automatically (supports 1M context)');
}
// 4. Frontend: pass the document ID list when sending a message
await chatApi.sendMessageStream({
  content: userQuestion,
  modelType: 'qwen-long',
  fullTextDocumentIds: loadedDocs.map(d => d.id), // key parameter
  conversationId,
});
// 5. Backend: load the complete full text
if (fullTextDocumentIds && fullTextDocumentIds.length > 0) {
  const documents = await prisma.document.findMany({
    where: { id: { in: fullTextDocumentIds } },
    select: { id: true, filename: true, extractedText: true, tokensCount: true },
  });
  // 6. Assemble the full-text context
  const fullTextParts = [];
  for (let i = 0; i < documents.length; i++) {
    const doc = documents[i];
    const docNumber = i + 1;
    // Format: [Document N: filename]\nfull text
    fullTextParts.push(
      `[Document ${docNumber}: ${doc.filename}]\n\n${doc.extractedText}`
    );
    // Add citation info
    allCitations.push({
      id: docNumber,
      fileName: doc.filename,
      score: 1.0, // full text, so relevance is 100%
      content: doc.extractedText.substring(0, 200),
    });
  }
  knowledgeBaseContext = fullTextParts.join('\n\n---\n\n');
}
// 7. Pass to the LLM
const systemPrompt = 'You are a professional literature analysis assistant. Each document is marked with [Document N: filename]. Read all documents carefully and provide an in-depth, comprehensive analysis. When answering, cite specific documents using the [Document N] format.';
const userContent = `${userQuestion}\n\n## Reference material (full document text)\n\n${knowledgeBaseContext}`;
const messages = [
  { role: 'system', content: systemPrompt },
  ...historyMessages, // conversation history
  { role: 'user', content: userContent },
];
// 8. Call Qwen-Long
const response = await LLMFactory.getAdapter('qwen-long').chatStream(messages, {
  temperature: 0.7,
  maxTokens: 6000, // Full Text mode needs more room for the answer
});
```
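The document numbers in the assembled context depend on the order in which the rows come back. A query filtered with `in` and no `orderBy` does not promise to return rows in the order of the ID list, so one option is to reorder them before numbering. The sketch below assumes that approach; `FullTextDoc`, `orderByRequestedIds`, and `buildFullTextContext` are hypothetical names, and the `[Document N: filename]` header is an English rendering of the marker used in the report.

```typescript
interface FullTextDoc {
  id: string;
  filename: string;
  extractedText: string;
}

// Reorders database rows to match the order of the requested IDs; unknown IDs are dropped.
function orderByRequestedIds(ids: string[], rows: FullTextDoc[]): FullTextDoc[] {
  const byId = new Map<string, FullTextDoc>();
  for (const row of rows) byId.set(row.id, row);
  const ordered: FullTextDoc[] = [];
  for (const id of ids) {
    const row = byId.get(id);
    if (row !== undefined) ordered.push(row);
  }
  return ordered;
}

// Joins documents into one context string with numbered headers and `---` separators.
function buildFullTextContext(docs: FullTextDoc[]): string {
  return docs
    .map((doc, i) => `[Document ${i + 1}: ${doc.filename}]\n\n${doc.extractedText}`)
    .join('\n\n---\n\n');
}
```

Calling `buildFullTextContext(orderByRequestedIds(fullTextDocumentIds, documents))` would keep the numbering stable across requests for the same ID list.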
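**Key characteristics:**
- ✅ Passes the complete full text (not RAG chunks)
- ✅ Smart document selection (based on the token limit)
- ✅ Source markers: [Document N: filename]
- ✅ Automatically switches to the Qwen-Long model (1M context)
- ✅ 100% relevance (because the full text is supplied)
- ✅ Suited to cross-document comparison, trend analysis, and summarizing research methods
**Token usage:**
```
Context: ~750K tokens (full text of 7 documents)
Conversation space: ~250K tokens
Output length: 6000 tokens (comprehensive analysis needs longer answers)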
```
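The budget figures follow from simple arithmetic on the numbers quoted in this report. The constant names below are hypothetical and the snippet is only a worked check.

```typescript
// Figures quoted in this report for Full Text mode on Qwen-Long.
const CONTEXT_WINDOW_TOKENS = 1_000_000;
const CONVERSATION_RESERVE_TOKENS = 250_000;
const MAX_FILES = 7;
const MAX_SINGLE_DOC_TOKENS = 200_000;

// Budget left for document text after reserving conversation space.
const documentBudgetTokens = CONTEXT_WINDOW_TOKENS - CONVERSATION_RESERVE_TOKENS;

// Average size each document may have if all MAX_FILES slots are filled.
const averagePerDocumentTokens = Math.floor(documentBudgetTokens / MAX_FILES);

// True when seven maximum-size documents would not fit the budget together.
const totalLimitBindsFirst = MAX_FILES * MAX_SINGLE_DOC_TOKENS > documentBudgetTokens;
```

Since 7 × 200,000 exceeds 750,000, the total limit rather than the per-document cap is what binds when all seven documents are large.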
---
#### Mode 2: Deep Read Mode
**Purpose:** in-depth analysis of 1-5 documents
**How it works:**
```typescript
// 1. Frontend: the user selects "Deep Read"
const modeState = {
  baseMode: 'knowledge_base',
  kbMode: 'deep_read',
  selectedKbId: 'xxx',
};
// 2. Frontend: the user picks the documents to read closely
const selectedDocs = [doc1, doc2, doc3]; // chosen manually by the user
// 3. Frontend: switch to one of the documents
const currentDoc = selectedDocs[0];
// 4. Frontend: pass the current document ID when sending a message (used as a RAG filter)
await chatApi.sendMessageStream({
  content: userQuestion,
  modelType: selectedModel,
  knowledgeBaseIds: [kbId], // knowledge base ID
  documentIds: [currentDoc.id], // key: retrieve from the current document only
  conversationId: currentDocConversationId, // separate conversation per document
});
// 5. Backend: RAG retrieval (restricted to specific documents)
if (documentIds && documentIds.length > 0) {
  // Call Dify RAG, restricted to the specified documents
  const results = await difyClient.retrieveKnowledge(
    difyDatasetId,
    query,
    {
      retrieval_model: {
        search_method: 'semantic_search',
        top_k: 15,
        document_ids: documentIds, // Dify searches only these documents
      },
    }
  );
}
```
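Keeping one conversation per document means the frontend needs a mapping from document ID to conversation ID. A minimal sketch of such a mapping follows; the `DeepReadSessions` class and its ID factory are hypothetical and only illustrate the idea, not how `ChatPage.tsx` actually stores this state.

```typescript
// Hypothetical helper: maps each document to its own conversation ID.
class DeepReadSessions {
  private readonly conversationByDoc = new Map<string, string>();
  private readonly createId: () => string;

  constructor(createId: () => string) {
    this.createId = createId;
  }

  // Returns the existing conversation for a document, or creates one on first use.
  conversationFor(documentId: string): string {
    let conversationId = this.conversationByDoc.get(documentId);
    if (conversationId === undefined) {
      conversationId = this.createId();
      this.conversationByDoc.set(documentId, conversationId);
    }
    return conversationId;
  }
}

// Usage with a counter-based ID factory (a stand-in for real ID generation).
let counter = 0;
const sessions = new DeepReadSessions(() => `conv-${++counter}`);
const first = sessions.conversationFor('doc1');
const second = sessions.conversationFor('doc2');
const again = sessions.conversationFor('doc1');
```

Switching back to `doc1` returns the same ID as the first call, so its history is resumed rather than restarted.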
**Key characteristics:**
- ✅ Based on RAG retrieval (not full text)
- ✅ Restricted to the current document
- ✅ Each document has its own conversation history
- ✅ The user can switch between documents
- ✅ Suited to understanding a single document in depth
---
#### Mode 3: Batch Mode
**Purpose:** bulk information extraction from 3-50 documents
**How it works:**
```typescript
// 1. The user creates a batch task
POST /api/v1/batch-tasks/create
Body: {
  kbId: 'xxx',
  name: 'Extract research methods',
  prompt: 'Extract from this document: study design, sample size, statistical methods',
  templateType: 'custom' | 'preset',
  modelType: 'deepseek-v3',
  concurrency: 3, // concurrency level
}
// 2. Backend: create the task
const task = await prisma.batchTask.create({
data: {
userId,
kbId,
name,
prompt,
templateType,
modelType,
status: 'pending',
totalDocuments: documentsCount,
concurrency,
},
});
// 3. Backend: start the batch worker
async function processBatchTask(taskId) {
  // 3.1 Fetch the task and its document list
  const task = await prisma.batchTask.findUnique({
    where: { id: taskId },
    include: { knowledgeBase: { include: { documents: true } } },
  });
  const documents = task.knowledgeBase.documents.filter(d => d.status === 'completed');
  // 3.2 Update the task status
  const startedAt = new Date(); // kept locally because `task` was loaded before this update
  await prisma.batchTask.update({
    where: { id: taskId },
    data: { status: 'running', startedAt },
  });
  // 3.3 Process documents concurrently, one chunk at a time
  const concurrency = task.concurrency || 3;
  const chunks = chunkArray(documents, concurrency);
  for (const chunk of chunks) {
    await Promise.all(chunk.map(async (doc) => {
      const startTime = Date.now(); // per-document timer for processingTimeMs
      try {
        // 3.3.1 For each document, call the LLM with its extractedText + prompt
        const llmPrompt = `${task.prompt}\n\nDocument content:\n${doc.extractedText}`;
        const response = await LLMFactory.getAdapter(task.modelType).chat([
          { role: 'user', content: llmPrompt },
        ]);
        // 3.3.2 Parse the LLM output (JSON expected)
        const data = parseJSONResponse(response.content);
        // 3.3.3 Save the result
        await prisma.batchResult.create({
          data: {
            taskId: task.id,
            documentId: doc.id,
            status: 'success',
            data,
            rawOutput: response.content,
            tokensUsed: response.usage.totalTokens,
            processingTimeMs: Date.now() - startTime,
          },
        });
        // 3.3.4 Update task progress
        await prisma.batchTask.update({
          where: { id: taskId },
          data: { completedCount: { increment: 1 } },
        });
      } catch (error) {
        // 3.3.5 Handle the failure
        await prisma.batchResult.create({
          data: {
            taskId: task.id,
            documentId: doc.id,
            status: 'failed',
            errorMessage: error.message,
          },
        });
        await prisma.batchTask.update({
          where: { id: taskId },
          data: { failedCount: { increment: 1 } },
        });
      }
    }));
  }
  // 3.4 Mark the task as completed
  await prisma.batchTask.update({
    where: { id: taskId },
    data: {
      status: 'completed',
      completedAt: new Date(),
      durationSeconds: Math.floor((Date.now() - startedAt.getTime()) / 1000),
    },
  });
}
// 4. Frontend: view the batch results
GET /api/v1/batch-tasks/:id/results
Returns:
{
  task: { /* task info */ },
  results: [
    {
      documentId: 'xxx',
      filename: 'paper1.pdf',
      status: 'success',
      data: {
        'Study design': '',
        'Sample size': '300',
        'Statistical method': 't-test',
      },
    },
    // ...
  ],
}
// 5. Frontend: export the results (Excel/CSV)
```
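The worker relies on two helpers, `chunkArray` and `parseJSONResponse`, that the report references but does not show. The versions below are hypothetical sketches of what they might look like, not the project's actual code. In particular, extracting the text between the first `{` and the last `}` is an assumed strategy for tolerating prose around the model's JSON.

```typescript
// Splits `items` into consecutive chunks of at most `size` elements.
function chunkArray<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Parses the first-to-last brace span of an LLM reply as a JSON object.
// Throws SyntaxError when no object is present or the span is not valid JSON.
function parseJSONResponse(raw: string): unknown {
  const start = raw.indexOf('{');
  const end = raw.lastIndexOf('}');
  if (start === -1 || end <= start) {
    throw new SyntaxError('No JSON object found in LLM output');
  }
  return JSON.parse(raw.slice(start, end + 1));
}
```

Note that chunking followed by `Promise.all` processes documents in waves: each wave waits for its slowest document before the next wave starts, which is simpler than a sliding worker pool but can leave capacity idle.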
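**Key characteristics:**
- ✅ Processes multiple documents in bulk
- ✅ Concurrency control (3 concurrent by default)
- ✅ Structured information extraction
- ✅ Real-time progress updates
- ✅ Supports custom templates
- ✅ Results can be exported (Excel/CSV)
- ✅ Error handling and retry
---
### 2.3 Comparison of the Three Modes
| Dimension | Full Text | Deep Read | Batch |
|------|---------|---------|--------|
| **Document count** | About 7 | 1-5 | 3-50 |
| **Data source** | Complete full text | RAG-retrieved chunks | Complete full text |
| **LLM calls** | Conversational (multi-turn) | Conversational (multi-turn) | Batch (single call) |
| **Context** | ~750K tokens | ~15K tokens | Full text of one document |
| **Output** | Streaming (SSE) | Streaming (SSE) | Saved in bulk |
| **Use cases** | Comprehensive analysis, cross-document comparison | Deep understanding of one document | Information extraction, data tables |
| **User interaction** | Real-time Q&A | Real-time Q&A | Background processing |
| **Conversation history** | Shared globally | Separate per document | No conversation |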
---
## 📋 Complete API Endpoint List
### PKB Management Module API
```
POST /api/v1/knowledge/create # Create a knowledge base
GET /api/v1/knowledge/list # List knowledge bases
GET /api/v1/knowledge/:id # Get knowledge base details
PUT /api/v1/knowledge/:id # Update a knowledge base
DELETE /api/v1/knowledge/:id # Delete a knowledge base
GET /api/v1/knowledge/:id/search # RAG retrieval
GET /api/v1/knowledge/:id/stats # Statistics
GET /api/v1/knowledge/:id/document-selection # Document selection (Full Text mode)
POST /api/v1/documents/upload # Upload a document
GET /api/v1/documents/:id # Get document details
DELETE /api/v1/documents/:id # Delete a document
GET /api/v1/documents/:id/content # Get document content (full text)
POST /api/v1/batch-tasks/create # Create a batch task
GET /api/v1/batch-tasks/list # List batch tasks
GET /api/v1/batch-tasks/:id # Get task details
GET /api/v1/batch-tasks/:id/results # Get task results
DELETE /api/v1/batch-tasks/:id # Delete a task
GET /api/v1/task-templates/list # List templates
POST /api/v1/task-templates/create # Create a template
DELETE /api/v1/task-templates/:id # Delete a template
```
### AIA Chat Module API (with PKB integration)
```
POST /api/v1/chat/send-message-stream # Send a message (streaming)
Parameters:
- content: string
- modelType: 'deepseek-v3' | 'qwen3-72b' | 'qwen-long'
- knowledgeBaseIds?: string[] # RAG mode
- documentIds?: string[] # Deep Read mode (restricts retrieval to these documents)
- fullTextDocumentIds?: string[] # Full Text mode (passes the full text)
- conversationId?: string
GET /api/v1/chat/conversations # List conversations
GET /api/v1/chat/conversations/:id # Get conversation history
DELETE /api/v1/chat/conversations/:id # Delete a conversation
```
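The three optional ID parameters of `send-message-stream` effectively select the mode. The sketch below shows one way a handler could resolve it. The `SendMessageParams` and `ChatMode` types and the `resolveChatMode` function are hypothetical, and the precedence (full text first, then per-document retrieval, then plain RAG) is an assumption rather than something the report specifies.

```typescript
type ChatMode = 'full_text' | 'deep_read' | 'rag' | 'plain';

// Request shape taken from the parameter list above.
interface SendMessageParams {
  content: string;
  modelType: 'deepseek-v3' | 'qwen3-72b' | 'qwen-long';
  knowledgeBaseIds?: string[];
  documentIds?: string[];
  fullTextDocumentIds?: string[];
  conversationId?: string;
}

// True for a defined, non-empty ID list.
function hasIds(ids: string[] | undefined): boolean {
  return ids !== undefined && ids.length > 0;
}

// Assumed precedence: fullTextDocumentIds > documentIds > knowledgeBaseIds.
function resolveChatMode(params: SendMessageParams): ChatMode {
  if (hasIds(params.fullTextDocumentIds)) return 'full_text';
  if (hasIds(params.documentIds)) return 'deep_read';
  if (hasIds(params.knowledgeBaseIds)) return 'rag';
  return 'plain';
}
```

A Deep Read request carries both `knowledgeBaseIds` and `documentIds`, which is why `documentIds` is checked before `knowledgeBaseIds`.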
---
## 🔗 Inter-Module Dependencies
```
AIA Intelligent Q&A Module
│
├─ Depends on the PKB Knowledge Base Management Module
│  ├─ List knowledge bases (choose a knowledge base)
│  ├─ List documents (choose documents)
│  ├─ Get document full text (Full Text mode)
│  ├─ RAG retrieval (Deep Read mode)
│  └─ Smart document selection (Full Text mode)
│
├─ Depends on the LLM gateway
│  ├─ DeepSeek V3
│  ├─ Qwen3-72B
│  └─ Qwen-Long
│
└─ Depends on the Dify RAG engine
   └─ retrieveKnowledge API
```
---
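## 🎯 Key Migration Points
### 1. PKB Module Migration
```
✅ Straightforward:
- The database already lives in pkb_schema; no migration needed
- API endpoints are clear and easy to replicate
- Business logic is self-contained
⚠️ Watch out:
- The Dify integration must be preserved
- OSS file upload must be preserved
- Quota management must be preserved
```
### 2. Migrating the PKB Integration in the AIA Module
```
✅ Straightforward:
- Clear interface (fullTextDocumentIds/documentIds)
- The logic of the three modes is independent
⚠️ Watch out:
- chatController.ts must be migrated at the same time
- The 3 frontend mode components must be migrated
- Conversation history management must be preserved
```
### 3. Testing Checklist
```
Must test:
✅ PKB CRUD
✅ Document upload and extraction
✅ RAG retrieval
✅ Full Text mode (7 documents)
✅ Deep Read mode (document switching)
✅ Batch mode (concurrent processing)
✅ Quota management
✅ Conversation history management
✅ Model switching
```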
---
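## ✅ Phase 0 Completion Criteria
- [x] Understand the two parts of PKB in depth
- [x] List all API endpoints
- [x] Understand the database schema
- [x] Understand the three working modes
- [x] Understand inter-module dependencies
- [ ] Create the test case checklist
- [ ] Prepare test data
---
## 📊 Next Step: Create Test Cases
A detailed test case checklist covering every feature will be created next...
---
**Review status:** 🟡 In progress (90% complete)
**Next step:** Create the test case checklist and the test data preparation plan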