agentscope.rag.llama_index_knowledge module

This module integrates the LlamaIndex RAG framework into the AgentScope package.

class agentscope.rag.llama_index_knowledge.LlamaIndexKnowledge(knowledge_id: str, emb_model: ModelWrapperBase | BaseEmbedding | None = None, knowledge_config: dict | None = None, model: ModelWrapperBase | None = None, persist_root: str | None = None, overwrite_index: bool | None = False, showprogress: bool | None = True, **kwargs: Any)[source]

Bases: Knowledge

This class is a wrapper around LlamaIndex for RAG.

__init__(knowledge_id: str, emb_model: ModelWrapperBase | BaseEmbedding | None = None, knowledge_config: dict | None = None, model: ModelWrapperBase | None = None, persist_root: str | None = None, overwrite_index: bool | None = False, showprogress: bool | None = True, **kwargs: Any) None[source]

Initialize the knowledge component based on the llama-index framework: https://github.com/run-llama/llama_index

Note

In LlamaIndex, one of the most important concepts is the index, a data structure composed of Document objects that enables querying by an LLM. The core workflow of initializing RAG is to convert data into an index and then retrieve information from that index. For example:
1) preprocess documents with data loaders;
2) generate embeddings by configuring a pipeline with embedding models;
3) store the embedding-content pairs in a vector database.

The default persist directory is "./rag_storage/knowledge_id".
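The three-step workflow in the note above can be sketched in plain Python. The `toy_embed` function and the in-memory list below are illustrative stand-ins, not AgentScope or LlamaIndex APIs:

```python
# Illustrative sketch of the index-building workflow: load -> embed -> store.
# toy_embed is a hypothetical stand-in for a real embedding model.

def toy_embed(text: str) -> list[float]:
    """Map text to a small fixed-size vector by counting character buckets."""
    vec = [0.0] * 4
    for ch in text.lower():
        vec[ord(ch) % 4] += 1.0
    return vec

def build_index(documents: list[str]) -> list[dict]:
    """Preprocess documents, embed them, and store embedding-content pairs."""
    index = []
    for doc_id, text in enumerate(documents):
        chunk = text.strip()            # 1) minimal preprocessing
        embedding = toy_embed(chunk)    # 2) embedding generation
        index.append({"id": doc_id, "text": chunk, "embedding": embedding})  # 3) storage
    return index

docs = ["AgentScope supports RAG.", "LlamaIndex builds indices over documents."]
index = build_index(docs)
print(len(index))  # one record per document
```

In a real `LlamaIndexKnowledge` instance, these steps are performed by the configured data loaders, the `emb_model`, and the persisted vector index, respectively.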

Parameters:
  • knowledge_id (str) – The id of the RAG knowledge unit.

  • emb_model (ModelWrapperBase) – The embedding model used to generate embeddings.

  • knowledge_config (dict) – The configuration for llama-index to generate or load the index.

  • model (ModelWrapperBase) – The language model used for final synthesis.

  • persist_root (str) – The root directory for index persistence.

  • overwrite_index (Optional[bool]) – Whether to overwrite the index while refreshing.

  • showprogress (Optional[bool]) – Whether to show the indexing progress.
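A `knowledge_config` typically describes how data is loaded and preprocessed. The exact schema is defined by AgentScope; the dict below is only an illustrative shape (key names are assumptions and should be checked against the AgentScope documentation for the installed version):

```python
# Hypothetical knowledge_config sketch -- verify key names against the
# AgentScope documentation for your installed version.
knowledge_config = {
    "knowledge_id": "agentscope_tutorial_rag",
    "data_processing": [
        {
            "load_data": {
                "loader": {
                    "create_object": True,
                    "module": "llama_index.core",
                    "class": "SimpleDirectoryReader",
                    "init_args": {"input_dir": "./docs"},
                },
            },
        },
    ],
}
```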

retrieve(query: str, similarity_top_k: int | None = None, to_list_strs: bool = False, retriever: BaseRetriever | None = None, **kwargs: Any) list[Any][source]

This is a basic retrieval function for knowledge. It builds a retriever on the fly and returns the result of the query.

Parameters:
  • query (str) – The query, expected to be a question as a string.

  • similarity_top_k (int) – The number of most similar results returned by the retriever.

  • to_list_strs (bool) – Whether to return a list of strings; if False, return a list of NodeWithScore.

  • retriever (BaseRetriever) – For advanced usage, users can pass in their own retriever.

Returns:

A list of strings or NodeWithScore objects.

Return type:

list[Any]

For more advanced query processing, refer to https://docs.llamaindex.ai/en/stable/examples/query_transformations/query_transform_cookbook.html
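To make the `similarity_top_k` and `to_list_strs` semantics concrete, here is a self-contained sketch of what top-k retrieval does, using cosine similarity over toy vectors. None of the helper names or data below come from AgentScope or LlamaIndex:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# A toy "index" of (text, embedding) pairs; vectors are made up for illustration.
store = [
    ("retrieval augmented generation", [1.0, 0.0, 0.5]),
    ("llamaindex quickstart", [0.0, 1.0, 0.0]),
    ("vector databases", [0.9, 0.1, 0.4]),
]

def toy_retrieve(query_emb, similarity_top_k=2, to_list_strs=False):
    """Score every stored item against the query and keep the top k."""
    scored = [(text, cosine(query_emb, emb)) for text, emb in store]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    top = scored[:similarity_top_k]
    if to_list_strs:
        return [text for text, _ in top]   # plain strings, like to_list_strs=True
    return top                             # (text, score) pairs, akin to NodeWithScore

print(toy_retrieve([1.0, 0.0, 0.5], similarity_top_k=2, to_list_strs=True))
# -> ['retrieval augmented generation', 'vector databases']
```

The real `retrieve` delegates the scoring to a LlamaIndex retriever over the persisted vector index, but the top-k selection and the string-vs-node return shapes follow the same pattern.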

refresh_index() None[source]

Refresh the index when needed.