PKB Personal Knowledge Base Feature Review Report - Phase 0

Review date: 2026-01-06
Reviewer: AI assistant
Review goal: Understand the existing PKB functionality in depth, in preparation for a safe migration
Status: ✅ In progress


📋 Executive Summary

Key Findings

🎯 The PKB system is in practice two tightly coupled functional modules:

Part 1: PKB knowledge base management module
├─ Location: backend/src/legacy/controllers/knowledgeBaseController.ts
├─ Functions: create, edit, and delete knowledge bases; upload and manage documents
└─ Database: pkb_schema (a standalone schema, no migration needed)

Part 2: PKB usage inside the AIA intelligent Q&A module
├─ Location: backend/src/legacy/controllers/chatController.ts
├─ Functions: question answering over a knowledge base (3 working modes)
└─ Working modes:
    ├─ Full-text reading mode (?5-50 papers, combined analysis)
    ├─ Deep-read mode (1-5 papers, in-depth analysis of each paper)
    └─ Batch mode (3-50 papers, bulk extraction)
📊 Part 1: PKB Knowledge Base Management Module

1.1 File structure

backend/src/legacy/
├─ controllers/
│  ├─ knowledgeBaseController.ts    # API controller (342 lines)
│  └─ documentController.ts         # Document upload controller
├─ services/
│  ├─ knowledgeBaseService.ts       # Business logic (?65 lines)
│  ├─ documentService.ts            # Document processing service
│  └─ tokenService.ts               # Token counting and document selection
└─ routes/
   └─ knowledgeBases.ts             # Route definitions

1.2 Core API endpoints

Knowledge base management API

// 1. Create a knowledge base
POST /api/v1/knowledge/create
Body: { name: string, description?: string }
Logic:
  ├─ Check the user's quota (kbQuota vs kbUsed)
  ├─ Create a Dataset in Dify
  ├─ Create the database record
  └─ Update the user's quota counter

// 2. List knowledge bases
GET /api/v1/knowledge/list
Returns: all of the user's knowledge bases + document counts

// 3. Get knowledge base details
GET /api/v1/knowledge/:id
Returns: knowledge base info + the list of all its documents

// 4. Update a knowledge base
PUT /api/v1/knowledge/:id
Body: { name?: string, description?: string }

// 5. Delete a knowledge base
DELETE /api/v1/knowledge/:id
Logic:
  ├─ Delete the Dify Dataset
  ├─ Cascade-delete the database records (documents are deleted automatically)
  └─ Decrement the user's quota counter

// 6. Search a knowledge base (RAG)
GET /api/v1/knowledge/:id/search?query=xxx&top_k=15
Logic:
  ├─ Verify permissions
  ├─ Call the Dify retrieveKnowledge API
  └─ Return the retrieval results (15 segments by default)

// 7. Get knowledge base statistics
GET /api/v1/knowledge/:id/stats
Returns: document count, completed count, processing count, error count, total token count

// 8. Get the document selection (full-text reading mode)
GET /api/v1/knowledge/:id/document-selection?max_files=7&max_tokens=750000
Returns: the automatically selected document list (based on the token limits)

Document management API

// 9. Upload a document
POST /api/v1/documents/upload
Multipart: { file, kbId }
Logic:
  ├─ Upload the file to OSS
  ├─ Extract the text (PDF/Word/TXT/Markdown)
  ├─ Upload to Dify for indexing
  └─ Create the database record (status: uploading → parsing → indexing → completed)

// 10. Get document details
GET /api/v1/documents/:id

// 11. Delete a document
DELETE /api/v1/documents/:id
Logic:
  ├─ Delete the Document from Dify
  ├─ Delete the file from OSS
  └─ Delete the database record
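
The upload flow above moves a document through a fixed sequence of states. A minimal sketch of a transition guard for that sequence follows. It is not the project's code: `DocumentStatus`, `ALLOWED_TRANSITIONS`, and `canTransition` are hypothetical names, and the rule that any in-progress state may fall into `error` is an assumption based on the `error` value listed for the documents status column.

```typescript
// Hypothetical status type mirroring the values listed for documents.status.
type DocumentStatus = 'uploading' | 'parsing' | 'indexing' | 'completed' | 'error';

// Assumption: each in-progress state may advance one step or fail with 'error';
// 'completed' and 'error' are terminal.
const ALLOWED_TRANSITIONS: Record<DocumentStatus, DocumentStatus[]> = {
  uploading: ['parsing', 'error'],
  parsing: ['indexing', 'error'],
  indexing: ['completed', 'error'],
  completed: [],
  error: [],
};

// Returns true only when moving from `from` to `to` is listed as allowed.
function canTransition(from: DocumentStatus, to: DocumentStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}

const stepForward = canTransition('uploading', 'parsing'); // true
const skipIndexing = canTransition('parsing', 'completed'); // false
```

A guard like this would let the upload handler reject out-of-order status writes instead of silently overwriting the column.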

1.3 Database schema

Table structure (in pkb_schema)

-- Knowledge base table
knowledge_bases
├─ id (UUID, PK)
├─ userId (String)
├─ name (String)
├─ description (String?)
├─ difyDatasetId (String, UNIQUE) -- Dataset ID in Dify
├─ fileCount (Int, default: 0)
├─ totalSizeBytes (BigInt, default: 0)
├─ createdAt (DateTime)
└─ updatedAt (DateTime)

-- Documents table
documents
├─ id (UUID, PK)
├─ kbId (String, FK → knowledge_bases.id)
├─ userId (String)
├─ filename (String)
├─ fileType (String) -- pdf/docx/txt/md
├─ fileSizeBytes (BigInt)
├─ fileUrl (String) -- OSS URL
├─ difyDocumentId (String) -- Document ID in Dify
├─ status (String) -- uploading/parsing/indexing/completed/error
├─ progress (Int, 0-100)
├─ errorMessage (String?)
├─ segmentsCount (Int?) -- number of segments indexed by Dify
├─ tokensCount (Int?) -- total token count
├─ charCount (Int?) -- character count
├─ language (String?)
├─ extractedText (String?) -- extracted full text (used by full-text reading mode)
├─ extractionMethod (String?) -- marker/pymupdf/docx
├─ extractionQuality (Float?)
├─ uploadedAt (DateTime)
└─ processedAt (DateTime?)

-- Batch tasks table
batch_tasks
├─ id (UUID, PK)
├─ userId (String)
├─ kbId (String, FK → knowledge_bases.id)
├─ name (String)
├─ templateType (String)
├─ templateId (String?)
├─ prompt (String)
├─ status (String) -- pending/running/completed/failed
├─ totalDocuments (Int)
├─ completedCount (Int, default: 0)
├─ failedCount (Int, default: 0)
├─ modelType (String)
├─ concurrency (Int, default: 3)
├─ startedAt (DateTime?)
├─ completedAt (DateTime?)
├─ durationSeconds (Int?)
├─ createdAt (DateTime)
└─ updatedAt (DateTime)

-- Batch results table
batch_results
├─ id (UUID, PK)
├─ taskId (String, FK → batch_tasks.id)
├─ documentId (String, FK → documents.id)
├─ status (String) -- success/failed
├─ data (Json?) -- extracted structured data
├─ rawOutput (String?) -- raw LLM output
├─ errorMessage (String?)
├─ processingTimeMs (Int?)
├─ tokensUsed (Int?)
└─ createdAt (DateTime)

-- Task templates table
task_templates
├─ id (UUID, PK)
├─ userId (String)
├─ name (String)
├─ description (String?)
├─ prompt (String)
├─ isPublic (Boolean, default: false)
├─ outputFields (Json) -- expected output fields
├─ createdAt (DateTime)
└─ updatedAt (DateTime)

Indexes

-- knowledge_bases
idx_pkb_knowledge_bases_user_id (userId)
idx_pkb_knowledge_bases_dify_dataset_id (difyDatasetId)

-- documents
idx_pkb_documents_kb_id (kbId)
idx_pkb_documents_user_id (userId)
idx_pkb_documents_status (status)
idx_pkb_documents_dify_document_id (difyDocumentId)
idx_pkb_documents_extraction_method (extractionMethod)

-- batch_tasks
idx_pkb_batch_tasks_kb_id (kbId)
idx_pkb_batch_tasks_user_id (userId)
idx_pkb_batch_tasks_status (status)
idx_pkb_batch_tasks_created_at (createdAt)

-- batch_results
idx_pkb_batch_results_task_id (taskId)
idx_pkb_batch_results_document_id (documentId)
idx_pkb_batch_results_status (status)
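
The statistics endpoint listed earlier returns per-state document counts plus a total token count. Given the documents columns above, that aggregation could look roughly like the pure function below. This is a sketch under assumptions: `DocumentRow`, `KbStats`, and `computeKbStats` are hypothetical names, and counting every status other than `completed` and `error` as "processing" is a guess at how the real service groups states.

```typescript
// Hypothetical row shape using only the columns needed for the statistics.
interface DocumentRow {
  id: string;
  status: 'uploading' | 'parsing' | 'indexing' | 'completed' | 'error';
  tokensCount: number | null; // nullable in the schema (Int?)
}

interface KbStats {
  total: number;
  completed: number;
  processing: number;
  error: number;
  totalTokens: number;
}

function computeKbStats(docs: DocumentRow[]): KbStats {
  const stats: KbStats = { total: docs.length, completed: 0, processing: 0, error: 0, totalTokens: 0 };
  for (const doc of docs) {
    if (doc.status === 'completed') {
      stats.completed += 1;
    } else if (doc.status === 'error') {
      stats.error += 1;
    } else {
      // Assumption: uploading, parsing, and indexing all count as "processing".
      stats.processing += 1;
    }
    // Documents that have not been tokenized yet contribute zero tokens.
    stats.totalTokens += doc.tokensCount ?? 0;
  }
  return stats;
}
```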

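1.4 Key business logic

Quota management

// Fields on the users table (in platform_schema.users)
kbQuota: Int @default(3)   // knowledge base quota
kbUsed: Int @default(0)    // number already used

// Check when creating a knowledge base
if (user.kbUsed >= user.kbQuota) {
  throw new Error('Quota exhausted');
}

// Increment the counter after a successful create
await prisma.user.update({
  where: { id: userId }, // update() needs a unique selector
  data: { kbUsed: { increment: 1 } }
});

// Decrement the counter when a knowledge base is deleted
await prisma.user.update({
  where: { id: userId },
  data: { kbUsed: { decrement: 1 } }
});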
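
The quota rule above can also be expressed as two pure functions, which makes the boundary cases (quota already reached, decrement at zero) easy to test without a database. This is only an illustrative sketch: `UserQuota`, `reserveKbSlot`, and `releaseKbSlot` are hypothetical names, and clamping the counter at zero on release is an assumption rather than documented behavior.

```typescript
// Hypothetical in-memory view of the two quota fields on the user record.
interface UserQuota {
  kbQuota: number;
  kbUsed: number;
}

// Returns a new quota state with one more slot used, or throws when the quota is reached.
function reserveKbSlot(user: UserQuota): UserQuota {
  if (user.kbUsed >= user.kbQuota) {
    throw new Error('Quota exhausted');
  }
  return { kbQuota: user.kbQuota, kbUsed: user.kbUsed + 1 };
}

// Returns a new quota state with one slot freed.
// Assumption: clamp at zero so a repeated delete cannot produce a negative count.
function releaseKbSlot(user: UserQuota): UserQuota {
  return { kbQuota: user.kbQuota, kbUsed: Math.max(0, user.kbUsed - 1) };
}
```

In a real database the check and the increment would have to happen in one transaction or conditional update to stay correct under concurrent requests; the source does not show how that is handled.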

Dify integration

// Create a knowledge base → create a Dify Dataset
const difyDataset = await difyClient.createDataset({
  name: `${userId}_${name}_${Date.now()}`,
  description,
  indexing_technique: 'high_quality',
});

// Search a knowledge base → call Dify RAG
const results = await difyClient.retrieveKnowledge(
  difyDatasetId,
  query,
  {
    retrieval_model: {
      search_method: 'semantic_search',
      top_k: 15,
    },
  }
);

Document token counting (tokenService.ts)

// Token limits
const TOKEN_LIMITS = {
  MAX_FILES: 7,           // at most 7 papers
  MAX_TOTAL_TOKENS: 750000, // total token limit (Qwen-Long: 1M context - 250K reserved for the conversation)
  MAX_SINGLE_DOC_TOKENS: 200000, // maximum tokens for a single paper
};

// Selection algorithm
function selectDocumentsForFullText(
  documentTokens,
  maxFiles,
  maxTokens
) {
  // Sort a copy by token count, ascending, so the caller's array is not mutated
  const sorted = [...documentTokens].sort((a, b) => a.tokens - b.tokens);

  // Greedy selection: take the smallest documents first until a limit is hit
  let totalTokens = 0;
  const selected = [];
  const excludedDocs = []; // everything that was not selected

  for (const doc of sorted) {
    const fits =
      selected.length < maxFiles &&
      doc.tokens <= TOKEN_LIMITS.MAX_SINGLE_DOC_TOKENS && // skip oversized documents
      totalTokens + doc.tokens <= maxTokens;

    if (fits) {
      selected.push(doc);
      totalTokens += doc.tokens;
    } else {
      excludedDocs.push(doc);
    }
  }

  return { selected, totalTokens, excludedDocs };
}
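
To make the greedy selection concrete, here is a typed, self-contained version run on a small sample. It is a sketch rather than the code in tokenService.ts: `DocTokens`, `Selection`, and `selectDocs` are hypothetical names, the single-document cap is passed as a parameter instead of read from a constant, and the limits are scaled down so the arithmetic is easy to follow.

```typescript
interface DocTokens {
  id: string;
  tokens: number;
}

interface Selection {
  selected: DocTokens[];
  excluded: DocTokens[];
  totalTokens: number;
}

// Greedy selection: smallest documents first, stopping at the file, per-document, and total limits.
function selectDocs(
  docs: DocTokens[],
  maxFiles: number,
  maxTokens: number,
  maxSingleDocTokens: number
): Selection {
  const sorted = [...docs].sort((a, b) => a.tokens - b.tokens);
  const selected: DocTokens[] = [];
  const excluded: DocTokens[] = [];
  let totalTokens = 0;

  for (const doc of sorted) {
    const fits =
      selected.length < maxFiles &&
      doc.tokens <= maxSingleDocTokens &&
      totalTokens + doc.tokens <= maxTokens;
    if (fits) {
      selected.push(doc);
      totalTokens += doc.tokens;
    } else {
      excluded.push(doc);
    }
  }
  return { selected, excluded, totalTokens };
}

// Illustrative data: at most 3 files, 1000 tokens in total, 500 tokens per document.
const sampleDocs: DocTokens[] = [
  { id: 'a', tokens: 400 },
  { id: 'b', tokens: 100 },
  { id: 'c', tokens: 900 },
  { id: 'd', tokens: 300 },
];
const selection = selectDocs(sampleDocs, 3, 1000, 500);
// selected: b (100), d (300), a (400); excluded: c (900); totalTokens: 800
```

Sorting ascending favors fitting as many papers as possible into the budget, at the cost of always dropping the longest ones first.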

📊 Part 2: PKB Usage in the AIA Module

2.1 File structure

backend/src/legacy/controllers/
└─ chatController.ts    # General chat controller (contains the 3 modes)

frontend/src/
├─ pages/ChatPage.tsx   # Main chat page
└─ components/
   ├─ FullTextMode.tsx    # Full-text reading mode component
   ├─ DeepReadMode.tsx    # Deep-read mode component
   └─ BatchMode.tsx       # Batch mode component

2.2 The three working modes in detail

Mode 1: Full-text reading mode (Full Text Mode)

Purpose: combined analysis of ?5-50 papers

How it works:

// 1. Frontend: the user enters knowledge base mode and selects "full-text reading"
const modeState = {
  baseMode: 'knowledge_base',
  kbMode: 'full_text',
  selectedKbId: 'xxx',
};

// 2. Frontend: load the papers automatically
const selection = await knowledgeBaseApi.getDocumentSelection(kbId, {
  max_files: 7,
  max_tokens: 750000,
});
// Returns: { selectedDocuments[], excludedDocuments[], totalTokens }

// 3. Frontend: switch to the Qwen-Long model automatically
if (modeState.kbMode === 'full_text') {
  setSelectedModel('qwen-long'); // 1M context
  showToast('Switched automatically to the Qwen-Long model (supports 1M context)');
}

// 4. Frontend: pass the document ID list when sending a message
await chatApi.sendMessageStream({
  content: userQuestion,
  modelType: 'qwen-long',
  fullTextDocumentIds: loadedDocs.map(d => d.id), // ✅ key parameter
  conversationId,
});

// 5. Backend: load the complete full text
if (fullTextDocumentIds && fullTextDocumentIds.length > 0) {
  const documents = await prisma.document.findMany({
    where: { id: { in: fullTextDocumentIds } },
    // Prisma's select expects `field: true` entries
    select: { id: true, filename: true, extractedText: true, tokensCount: true },
  });

  // 6. Assemble the full-text context
  const fullTextParts = [];
  for (let i = 0; i < documents.length; i++) {
    const doc = documents[i];
    const docNumber = i + 1;
    const text = doc.extractedText ?? ''; // extractedText is nullable in the schema

    // Format: [Document N: filename]\nfull text
    fullTextParts.push(
      `[Document ${docNumber}: ${doc.filename}]\n\n${text}`
    );

    // Add the citation entry
    allCitations.push({
      id: docNumber,
      fileName: doc.filename,
      score: 1.0, // full text, so relevance is 100%
      content: text.substring(0, 200),
    });
  }

  knowledgeBaseContext = fullTextParts.join('\n\n---\n\n');
}

// 7. Pass to the LLM
const systemPrompt = 'You are a professional academic literature analysis assistant. Each paper is marked with [Document N: filename]. Read all of the papers carefully and produce an in-depth combined analysis. Cite specific papers in your answer using the [Document N] format.';

const userContent = `${userQuestion}\n\n## Reference material (full text of the papers)\n\n${knowledgeBaseContext}`;

const messages = [
  { role: 'system', content: systemPrompt },
  ...historyMessages, // conversation history
  { role: 'user', content: userContent },
];

// 8. Call Qwen-Long
const response = await LLMFactory.getAdapter('qwen-long').chatStream(messages, {
  temperature: 0.7,
  maxTokens: 6000, // full-text mode needs more room for the answer
});

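Key characteristics:

  • ✅ Passes the complete full text (not RAG segments)
  • ✅ Selects papers automatically (based on the token limits)
  • ✅ Marks the source of each paper: [Document N: filename]
  • ✅ Switches to the Qwen-Long model automatically (1M context)
  • ✅ 100% relevance (because the full text is used)
  • ✅ Suited to cross-paper comparison, trend analysis, and summarizing research methods

Token usage:

Context: ~750K tokens (full text of 7 papers)
Conversation space: ~250K tokens
Output length: 6000 tokens (combined analysis needs longer answers)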
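
The context-assembly step of full-text mode can be isolated as a pure function that returns both the prompt context and the citation list. The sketch below makes several assumptions: `FullTextDoc`, `Citation`, and `buildFullTextContext` are hypothetical names, an English marker of the form `[Document N: filename]` is used for illustration, and documents with no extracted text are skipped rather than sent as empty entries.

```typescript
interface FullTextDoc {
  id: string;
  filename: string;
  extractedText: string | null; // nullable in the schema
}

interface Citation {
  id: number;
  fileName: string;
  score: number;
  content: string;
}

function buildFullTextContext(docs: FullTextDoc[]): { context: string; citations: Citation[] } {
  const parts: string[] = [];
  const citations: Citation[] = [];

  for (const doc of docs) {
    const text = doc.extractedText ?? '';
    // Assumption: a document without extracted text adds nothing useful to the prompt.
    if (text.length === 0) continue;

    // Number documents by their position among the ones actually included.
    const docNumber = citations.length + 1;
    parts.push(`[Document ${docNumber}: ${doc.filename}]\n\n${text}`);
    citations.push({
      id: docNumber,
      fileName: doc.filename,
      score: 1.0, // the whole document is supplied, so relevance is treated as 100%
      content: text.substring(0, 200), // short preview for the citation panel
    });
  }

  return { context: parts.join('\n\n---\n\n'), citations };
}
```

Numbering by inclusion order keeps the `[Document N]` references in the model's answer aligned with the citation IDs even when some documents are skipped.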

Mode 2: Deep-read mode (Deep Read Mode)

Purpose: in-depth analysis of 1-5 papers

How it works:

// 1. Frontend: the user selects "deep read"
const modeState = {
  baseMode: 'knowledge_base',
  kbMode: 'deep_read',
  selectedKbId: 'xxx',
};

// 2. Frontend: the user picks the documents to read closely
const selectedDocs = [doc1, doc2, doc3]; // chosen manually by the user

// 3. Frontend: switch to one of the documents
const currentDoc = selectedDocs[0];

// 4. Frontend: pass the current document ID when sending a message (used to filter RAG)
await chatApi.sendMessageStream({
  content: userQuestion,
  modelType: selectedModel,
  knowledgeBaseIds: [kbId], // knowledge base ID
  documentIds: [currentDoc.id], // ✅ key: retrieve from the current document only
  conversationId: currentDocConversationId, // each document has its own conversation
});

// 5. Backend: RAG retrieval (restricted to specific documents)
if (documentIds && documentIds.length > 0) {
  // Call Dify RAG, restricted to the given documents
  const results = await difyClient.retrieveKnowledge(
    difyDatasetId,
    query,
    {
      retrieval_model: {
        search_method: 'semantic_search',
        top_k: 15,
        document_ids: documentIds, // ✅ Dify retrieves from these documents only
      },
    }
  );
}

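Key characteristics:

  • ✅ Based on RAG retrieval (not the full text)
  • ✅ Restricted to the current document
  • ✅ Each document has its own conversation history
  • ✅ The user can switch between documents
  • ✅ Suited to understanding a single paper in depth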
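
Keeping one conversation history per document is what lets the user switch between papers without mixing contexts. A minimal in-memory sketch of that bookkeeping follows. The class name `DeepReadSessions`, its methods, and the idea of keying by knowledge base ID plus document ID are all assumptions; the source only states that each document has an independent conversation.

```typescript
// Hypothetical in-memory store: one message list per (knowledge base, document) pair.
class DeepReadSessions {
  private readonly conversations = new Map<string, string[]>();

  private key(kbId: string, documentId: string): string {
    return `${kbId}:${documentId}`;
  }

  // Appends a message to the history of one document only.
  append(kbId: string, documentId: string, message: string): void {
    const k = this.key(kbId, documentId);
    const existing = this.conversations.get(k) ?? [];
    existing.push(message);
    this.conversations.set(k, existing);
  }

  // Returns a copy so callers cannot modify the stored history by accident.
  history(kbId: string, documentId: string): string[] {
    return [...(this.conversations.get(this.key(kbId, documentId)) ?? [])];
  }
}

const sessions = new DeepReadSessions();
sessions.append('kb1', 'docA', 'What is the sample size?');
sessions.append('kb1', 'docB', 'Which statistical tests were used?');
sessions.append('kb1', 'docA', 'How were participants randomized?');
// sessions.history('kb1', 'docA') has 2 messages; sessions.history('kb1', 'docB') has 1.
```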

Mode 3: Batch mode (Batch Mode)

Purpose: bulk information extraction from 3-50 papers

How it works:

// 1. The user creates a batch task
POST /api/v1/batch-tasks/create
Body: {
  kbId: 'xxx',
  name: 'Extract research methods',
  prompt: 'Extract the following from this paper: study design, sample size, statistical methods',
  templateType: 'custom' | 'preset',
  modelType: 'deepseek-v3',
  concurrency: 3, // number of concurrent requests
}

// 2. Backend: create the task
const task = await prisma.batchTask.create({
  data: {
    userId,
    kbId,
    name,
    prompt,
    templateType,
    modelType,
    status: 'pending',
    totalDocuments: documentsCount,
    concurrency,
  },
});

// 3. Backend: start the batch worker
async function processBatchTask(taskId) {
  // 3.1 Load the task and its document list
  const task = await prisma.batchTask.findUnique({
    where: { id: taskId },
    include: { knowledgeBase: { include: { documents: true } } },
  });

  const documents = task.knowledgeBase.documents.filter(d => d.status === 'completed');

  // 3.2 Update the task status; keep startedAt locally because `task` was loaded before it was set
  const startedAt = new Date();
  await prisma.batchTask.update({
    where: { id: taskId },
    data: { status: 'running', startedAt },
  });

  // 3.3 Process the documents concurrently, one chunk at a time
  const concurrency = task.concurrency || 3;
  const chunks = chunkArray(documents, concurrency);

  for (const chunk of chunks) {
    await Promise.all(chunk.map(async (doc) => {
      const startTime = Date.now(); // per-document timer for processingTimeMs
      try {
        // 3.3.1 For each document, call the LLM with its extractedText + the prompt
        const llmPrompt = `${task.prompt}\n\nPaper content:\n${doc.extractedText}`;

        const response = await LLMFactory.getAdapter(task.modelType).chat([
          { role: 'user', content: llmPrompt },
        ]);

        // 3.3.2 Parse the LLM output (JSON expected)
        const data = parseJSONResponse(response.content);

        // 3.3.3 Save the result
        await prisma.batchResult.create({
          data: {
            taskId: task.id,
            documentId: doc.id,
            status: 'success',
            data,
            rawOutput: response.content,
            tokensUsed: response.usage.totalTokens,
            processingTimeMs: Date.now() - startTime,
          },
        });

        // 3.3.4 Update the task progress
        await prisma.batchTask.update({
          where: { id: taskId },
          data: { completedCount: { increment: 1 } },
        });

      } catch (error) {
        // 3.3.5 Handle the failure
        await prisma.batchResult.create({
          data: {
            taskId: task.id,
            documentId: doc.id,
            status: 'failed',
            errorMessage: error.message,
          },
        });

        await prisma.batchTask.update({
          where: { id: taskId },
          data: { failedCount: { increment: 1 } },
        });
      }
    }));
  }

  // 3.4 Mark the task as completed
  await prisma.batchTask.update({
    where: { id: taskId },
    data: {
      status: 'completed',
      completedAt: new Date(),
      durationSeconds: Math.floor((Date.now() - startedAt.getTime()) / 1000),
    },
  });
}

// 4. Frontend: view the batch results
GET /api/v1/batch-tasks/:id/results
Returns:
{
  task: { /* task info */ },
  results: [
    {
      documentId: 'xxx',
      filename: 'paper1.pdf',
      status: 'success',
      data: {
        studyDesign: 'Randomized controlled trial',
        sampleSize: '300',
        statisticalMethods: 't-test, chi-square test',
      },
    },
    // ...
  ],
}

// 5. Frontend: export the results (Excel/CSV)
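
The worker above calls `chunkArray` and `parseJSONResponse` without showing them. The versions below are hypothetical stand-ins, not the project's helpers: the chunking is a plain fixed-size split, and the parser assumes that a model may wrap its JSON in a Markdown code fence, which is stripped before parsing.

```typescript
// Splits `items` into consecutive groups of at most `size` elements.
function chunkArray<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error('size must be a positive integer');
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Three backticks, built at runtime so the marker does not appear literally in this listing.
const FENCE = '`'.repeat(3);
const FENCED_JSON = new RegExp(FENCE + '(?:json)?\\s*([\\s\\S]*?)' + FENCE);

// Parses JSON from a model reply; throws (like JSON.parse) when the content is not valid JSON.
function parseJSONResponse(raw: string): unknown {
  // Assumption: if a fenced block is present, the JSON payload is inside the first one.
  const fenced = raw.match(FENCED_JSON);
  const candidate = (fenced ? fenced[1] : raw).trim();
  return JSON.parse(candidate);
}

const groups = chunkArray([1, 2, 3, 4, 5], 2);
// groups: [[1, 2], [3, 4], [5]]
```

Because each chunk is awaited with `Promise.all`, a slow document holds up its whole chunk; a worker pool would keep all slots busy, at the cost of more code.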

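Key characteristics:

  • ✅ Processes multiple documents in bulk
  • ✅ Concurrency control (3 concurrent requests by default)
  • ✅ Structured information extraction
  • ✅ Real-time progress updates
  • ✅ Supports custom templates
  • ✅ Results can be exported (Excel/CSV)
  • ✅ Error handling and retries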

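2.3 Comparison of the three modes

| Dimension            | Full-text reading                        | Deep read                         | Batch                               |
|----------------------|------------------------------------------|-----------------------------------|-------------------------------------|
| Document count       | About 7 papers                           | 1-5 papers                        | 3-50 papers                         |
| Data source          | Complete full text                       | RAG-retrieved segments            | Complete full text                  |
| LLM calls            | Conversational (multi-turn)              | Conversational (multi-turn)       | Batch (single call per document)    |
| Context              | ~750K tokens                             | ~15K tokens                       | Full text of one paper              |
| Output               | Streaming (SSE)                          | Streaming (SSE)                   | Saved in bulk                       |
| Use cases            | Combined analysis, cross-paper comparison | In-depth reading of one paper     | Information extraction, data tables |
| User interaction     | Real-time Q&A                            | Real-time Q&A                     | Background processing               |
| Conversation history | Shared globally                          | Separate per paper                | No conversation                     |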

📋 Complete API Endpoint List

PKB management module API

POST   /api/v1/knowledge/create              # Create a knowledge base
GET    /api/v1/knowledge/list                # List knowledge bases
GET    /api/v1/knowledge/:id                 # Get knowledge base details
PUT    /api/v1/knowledge/:id                 # Update a knowledge base
DELETE /api/v1/knowledge/:id                 # Delete a knowledge base
GET    /api/v1/knowledge/:id/search          # RAG retrieval
GET    /api/v1/knowledge/:id/stats           # Statistics
GET    /api/v1/knowledge/:id/document-selection  # Document selection (full-text mode)

POST   /api/v1/documents/upload              # Upload a document
GET    /api/v1/documents/:id                 # Get document details
DELETE /api/v1/documents/:id                 # Delete a document
GET    /api/v1/documents/:id/content         # Get document content (full text)

POST   /api/v1/batch-tasks/create            # Create a batch task
GET    /api/v1/batch-tasks/list              # List batch tasks
GET    /api/v1/batch-tasks/:id               # Get task details
GET    /api/v1/batch-tasks/:id/results       # Get task results
DELETE /api/v1/batch-tasks/:id               # Delete a task

GET    /api/v1/task-templates/list           # List templates
POST   /api/v1/task-templates/create         # Create a template
DELETE /api/v1/task-templates/:id            # Delete a template

AIA chat module API (with PKB integration)

POST   /api/v1/chat/send-message-stream      # Send a message (streaming)
Parameters:
  - content: string
  - modelType: 'deepseek-v3' | 'qwen3-72b' | 'qwen-long'
  - knowledgeBaseIds?: string[]          # RAG mode
  - documentIds?: string[]               # Deep-read mode (restricts retrieval to these documents)
  - fullTextDocumentIds?: string[]       # Full-text reading mode (passes the full text)
  - conversationId?: string

GET    /api/v1/chat/conversations            # List conversations
GET    /api/v1/chat/conversations/:id        # Get conversation history
DELETE /api/v1/chat/conversations/:id        # Delete a conversation
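
The optional parameters of the send-message endpoint effectively select the working mode. One way to make that explicit is a small resolver over the request body, sketched below. `SendMessageRequest`, `ChatMode`, and `resolveChatMode` are hypothetical names, the `plain` mode label is invented for requests without any knowledge base parameters, and the precedence order (full text, then deep read, then RAG) is an assumption about how the controller behaves.

```typescript
// Request shape taken from the parameter list of the send-message endpoint.
interface SendMessageRequest {
  content: string;
  modelType: 'deepseek-v3' | 'qwen3-72b' | 'qwen-long';
  knowledgeBaseIds?: string[];
  documentIds?: string[];
  fullTextDocumentIds?: string[];
  conversationId?: string;
}

type ChatMode = 'full_text' | 'deep_read' | 'rag' | 'plain';

function hasIds(ids?: string[]): boolean {
  return ids !== undefined && ids.length > 0;
}

// Assumed precedence: the most specific parameter wins.
function resolveChatMode(req: SendMessageRequest): ChatMode {
  if (hasIds(req.fullTextDocumentIds)) return 'full_text';
  if (hasIds(req.documentIds)) return 'deep_read';
  if (hasIds(req.knowledgeBaseIds)) return 'rag';
  return 'plain';
}

const deepReadMode = resolveChatMode({
  content: 'Summarize the methods section',
  modelType: 'deepseek-v3',
  knowledgeBaseIds: ['kb1'],
  documentIds: ['doc1'],
}); // 'deep_read'
```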

🔗 Module Dependencies

AIA intelligent Q&A module
│
├─ Depends on the PKB knowledge base management module
│  ├─ List knowledge bases (to choose a knowledge base)
│  ├─ List documents (to choose documents)
│  ├─ Get document full text (full-text reading)
│  ├─ RAG retrieval (deep read)
│  └─ Automatic document selection (full-text reading)
│
├─ Depends on the LLM gateway
│  ├─ DeepSeek V3
│  ├─ Qwen3-72B
│  └─ Qwen-Long
│
└─ Depends on the Dify RAG engine
   └─ retrieveKnowledge API

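🎯 Key Migration Points

1. PKB module migration

✅ Straightforward:
  - The database is already in pkb_schema, so no migration is needed
  - The API endpoints are clearly defined and easy to copy
  - The business logic is self-contained

⚠️ Watch out:
  - The Dify integration must be preserved
  - OSS file upload must be preserved
  - Quota management must be preserved

2. Migrating the PKB integration in the AIA module

✅ Straightforward:
  - The interface is clear (fullTextDocumentIds/documentIds)
  - The logic of the three modes is independent

⚠️ Watch out:
  - chatController.ts must be migrated at the same time
  - The 3 frontend mode components must be migrated
  - Conversation history management must be preserved

3. Testing focus

Must be tested:
  ✅ PKB CRUD functionality
  ✅ Document upload and text extraction
  ✅ RAG retrieval
  ✅ Full-text reading mode (7 papers)
  ✅ Deep-read mode (switching between documents)
  ✅ Batch mode (concurrent processing)
  ✅ Quota management
  ✅ Conversation history management
  ✅ Model switching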

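✅ Phase 0 Completion Criteria

  • Understand the two parts of PKB in depth
  • List all API endpoints
  • Understand the database schema
  • Understand the three working modes
  • Understand the dependencies between modules
  • Create the test case checklist
  • Prepare the test data

📊 Next Step: Create Test Cases

A detailed test case checklist covering every feature will be created next...


Review status: 🟡 In progress (90% complete)
Next step: Create the test case checklist and the test data preparation plan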