agentscope.rag.llama_index_knowledge module
This module integrates LlamaIndex RAG into the AgentScope package.
- class LlamaIndexKnowledge(knowledge_id: str, emb_model: ModelWrapperBase | BaseEmbedding | None = None, knowledge_config: dict | None = None, model: ModelWrapperBase | None = None, persist_root: str | None = None, additional_sparse_retrieval: bool | None = False, overwrite_index: bool | None = False, showprogress: bool | None = True, **kwargs: Any)[source]
Bases:
Knowledge
This class is a wrapper around LlamaIndex RAG.
- classmethod build_knowledge_instance(knowledge_id: str, knowledge_config: dict | None = None, data_dirs_and_types: dict[str, list[str]] | None = None, emb_model_config_name: str | None = None, model_config_name: str | None = None, **kwargs: Any) Knowledge [source]
Build an instance of the LlamaIndex knowledge.
- Parameters:
knowledge_id (str) – User-defined unique id for the knowledge
knowledge_config (optional[dict]) – Complete indexing configuration, used for more advanced applications. Users can customize the loader, transformations, etc. Examples can be found in ../examples/conversation_with_RAG_agents/
data_dirs_and_types (dict[str, list[str]]) – Dictionary mapping data paths (keys) to the data types (file extensions) for the knowledge base (e.g., [".md", ".py", ".html"])
emb_model_config_name (Optional[str]) – Name of the embedding model. This should be specified either here or in the knowledge_config dict. If specified in both places, the input parameter takes priority over the one in knowledge_config.
model_config_name (Optional[str]) – Name of the language model. Optional; can be None and left unspecified in knowledge_config. If specified in both places, the input parameter takes priority over the one in knowledge_config.
- Returns:
A knowledge instance
A simple example of importing data to a Knowledge object:

    knowledge_bank.add_data_as_knowledge(
        knowledge_id="agentscope_tutorial_rag",
        emb_model_config_name="qwen_emb_config",
        data_dirs_and_types={
            "../../docs/sphinx_doc/en/source/tutorial": [".md"],
        },
        persist_dir="./rag_storage/tutorial_assist",
    )
- Return type:
Knowledge
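A minimal sketch of calling build_knowledge_instance directly. The knowledge id, config name, and paths below are illustrative placeholders, and "qwen_emb_config" is assumed to be a model config already registered with AgentScope:

```python
# Illustrative arguments for build_knowledge_instance; the id, config name,
# and directory path are placeholders, not values fixed by the library.
build_args = {
    "knowledge_id": "agentscope_tutorial_rag",      # user-defined unique id
    "emb_model_config_name": "qwen_emb_config",     # must match a registered model config
    "data_dirs_and_types": {
        # directory -> file extensions to load from it
        "../../docs/sphinx_doc/en/source/tutorial": [".md"],
    },
}

# With AgentScope installed, the instance would be built like this:
# from agentscope.rag import LlamaIndexKnowledge
# knowledge = LlamaIndexKnowledge.build_knowledge_instance(**build_args)
```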
- classmethod default_config(knowledge_id: str, data_dirs_and_types: dict[str, list[str]] | None = None, knowledge_config: dict | None = None) dict [source]
Generate a default config for loading data from directories and using the default operations to preprocess the data for RAG usage.
- Parameters:
knowledge_id (str) – User-defined unique id for the knowledge
data_dirs_and_types (dict[str, list[str]]) – Dictionary mapping data paths (keys) to the data types (file extensions) for the knowledge base (e.g., [".md", ".py", ".html"])
knowledge_config (optional[dict]) – Complete indexing configuration, used for more advanced applications. Users can customize the loader, transformations, etc. Examples can be found in ../examples/conversation_with_RAG_agents/
- Returns:
A default config of LlamaIndexKnowledge
- Return type:
dict
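To make the shape of the returned config concrete, here is a sketch of what such a dict might look like, modeled loosely on the configs in ../examples/conversation_with_RAG_agents/. The nested field names ("data_processing", "load_data", "loader", "init_args") are assumptions drawn from those examples, not guaranteed by the API:

```python
# Sketch of a knowledge config of the kind default_config might return.
# Field names below are assumptions based on the examples directory.
default_cfg = {
    "knowledge_id": "agentscope_tutorial_rag",
    "data_processing": [
        {
            "load_data": {
                "loader": {
                    # Instructs AgentScope to instantiate a LlamaIndex loader
                    "create_object": True,
                    "module": "llama_index.core",
                    "class": "SimpleDirectoryReader",
                    "init_args": {
                        "input_dir": "../../docs/sphinx_doc/en/source/tutorial",
                        "required_exts": [".md"],
                    },
                },
            },
        },
    ],
}
```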
- retrieve(query: str, similarity_top_k: int | None = None, to_list_strs: bool = False, retriever: BaseRetriever | None = None, **kwargs: Any) list[RetrievedChunk | str] [source]
This is a basic retrieve function for knowledge. It builds a retriever on the fly and returns the result of the query.
- Parameters:
query (str) – Query, expected to be a question as a string
similarity_top_k (int) – The number of most similar data chunks returned by the retriever
to_list_strs (bool) – Whether to return a list of strings; if False, return a list of RetrievedChunk
retriever (BaseRetriever) – For advanced usage, users can pass their own retriever
- Returns:
List of retrieved content
- Return type:
list[Union[RetrievedChunk, str]]
For more advanced query processing, refer to https://docs.llamaindex.ai/en/stable/examples/query_transformations/query_transform_cookbook.html
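A sketch of consuming retrieve results: with to_list_strs=True the method returns plain strings, which can be joined into a context block for a prompt. The format_retrieved helper below is illustrative, not part of the API, and the stand-in chunks replace what a real knowledge.retrieve call would return:

```python
def format_retrieved(chunks: list[str], max_chunks: int = 3) -> str:
    """Join retrieved text chunks into a single context block for a prompt.

    Illustrative helper only; retrieve(query, to_list_strs=True) is assumed
    to return a list of strings like `chunks`.
    """
    selected = chunks[:max_chunks]  # keep only the top-ranked chunks
    return "\n---\n".join(selected)

# Stand-in chunks; real ones would come from something like:
# knowledge.retrieve("how to build an agent?", similarity_top_k=3, to_list_strs=True)
context = format_retrieved(["chunk one", "chunk two", "chunk three", "chunk four"])
```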
- knowledge_type: str = 'llamaindex_knowledge'
A string to identify a knowledge base class