Version: DEV

Construct knowledge graph

Generate a knowledge graph for your dataset.


To enhance multi-hop question-answering, RAGFlow adds a knowledge graph construction step between data extraction and indexing, as illustrated below. This step creates additional chunks from existing ones generated by your specified chunking method.

[Figure: the knowledge graph construction step between data extraction and indexing]

From v0.16.0 onward, RAGFlow supports constructing a knowledge graph at the dataset level, giving you a unified graph across multiple files within your dataset. When a newly uploaded file starts parsing, the generated graph updates automatically.

WARNING

Constructing a knowledge graph requires significant memory, computational resources, and tokens.

Scenarios

Knowledge graphs are especially useful for multi-hop question-answering involving nested logic. They outperform traditional extraction approaches when you are performing question answering on books or works with complex entities and relationships.
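
To make "multi-hop" concrete, here is a minimal sketch in Python. It is not RAGFlow code: the entities, relationships, and the `find_path` helper are invented for illustration. The question "In which city is Alice's employer headquartered?" cannot be answered from a single relationship; it requires chaining two of them, which is what a graph makes explicit.

```python
from collections import deque

# Toy graph: each edge is (source entity, relationship, target entity).
# All names here are made up for illustration.
EDGES = [
    ("Alice", "works_at", "Acme"),
    ("Acme", "headquartered_in", "Berlin"),
    ("Bob", "works_at", "Globex"),
]

def find_path(edges, start, goal):
    """Breadth-first search: return the edges linking start to goal, or None."""
    adjacency = {}
    for source, relation, target in edges:
        adjacency.setdefault(source, []).append((relation, target))
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, target in adjacency.get(node, []):
            if target not in visited:
                visited.add(target)
                queue.append((target, path + [(node, relation, target)]))
    return None

# Two hops are needed: Alice -> Acme -> Berlin.
path = find_path(EDGES, "Alice", "Berlin")
print(path)
```

Each hop in the returned path corresponds to one relationship, so a question that spans several facts becomes a short walk over the graph.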

NOTE

RAPTOR (Recursive Abstractive Processing for Tree Organized Retrieval) can also be used for multi-hop question-answering tasks. See Enable RAPTOR for details. You may use either approach or both, but ensure you understand the memory, computational, and token costs involved.

Prerequisites

The system's default chat model is used to generate the knowledge graph. Before proceeding, ensure that you have a chat model properly configured:

Set default models

Configurations

Entity types (Required)

The types of the entities to extract from your dataset. The default types are: organization, person, event, and category. Add or remove types to suit your specific dataset.

Method

The method used to construct the knowledge graph:

  • General: Use prompts provided by GraphRAG to extract entities and relationships.
  • Light: (Default) Use prompts provided by LightRAG to extract entities and relationships. This option consumes fewer tokens, less memory, and fewer computational resources.

Entity resolution

Whether to enable entity resolution. You can think of this as an entity deduplication switch. When enabled, the LLM will combine similar entities - e.g., '2025' and 'the year of 2025', or 'IT' and 'Information Technology' - to construct a more effective graph.

  • (Default) Disable entity resolution.
  • Enable entity resolution. This option consumes more tokens.
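
The effect of entity resolution can be illustrated with a toy Python sketch. This is not how RAGFlow implements it: in RAGFlow the LLM decides which entities are similar, whereas here a hand-written alias table (`ALIASES`, a hypothetical name) stands in for that judgment, using the example pairs mentioned above.

```python
# Hypothetical alias table standing in for the LLM's judgment about which
# entity names refer to the same thing.
ALIASES = {
    "the year of 2025": "2025",
    "Information Technology": "IT",
}

def resolve_entities(edges, aliases):
    """Rewrite every edge to use canonical names, then drop duplicate edges."""
    resolved = []
    seen = set()
    for source, relation, target in edges:
        edge = (aliases.get(source, source), relation, aliases.get(target, target))
        if edge not in seen:
            seen.add(edge)
            resolved.append(edge)
    return resolved

# Illustrative edges; "IT" and "Information Technology" name the same entity.
edges = [
    ("Acme", "has_department", "IT"),
    ("Acme", "has_department", "Information Technology"),
    ("Acme", "founded_in", "the year of 2025"),
]
resolved_edges = resolve_entities(edges, ALIASES)
print(resolved_edges)
```

After resolution the two department edges collapse into one, which is why a resolved graph tends to be smaller and better connected.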

Community reports

In a knowledge graph, a community is a cluster of entities linked by relationships. You can have the LLM generate an abstract for each community, known as a community report. See here for more information. This setting controls whether community reports are generated:

  • Generate community reports. This option consumes more tokens.
  • (Default) Do not generate community reports.
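
As a rough illustration of what "a cluster of entities linked by relationships" means, the following Python sketch groups entities into connected components. This is a deliberate simplification and an assumption on our part: it is not RAGFlow's community detection, and the `find_communities` helper and sample data are hypothetical.

```python
def find_communities(edges):
    """Group entities into connected components of an undirected graph."""
    neighbors = {}
    for source, target in edges:
        neighbors.setdefault(source, set()).add(target)
        neighbors.setdefault(target, set()).add(source)
    communities = []
    visited = set()
    for entity in neighbors:
        if entity in visited:
            continue
        # Depth-first walk collecting everything reachable from this entity.
        stack = [entity]
        visited.add(entity)
        component = set()
        while stack:
            current = stack.pop()
            component.add(current)
            for neighbor in neighbors[current]:
                if neighbor not in visited:
                    visited.add(neighbor)
                    stack.append(neighbor)
        communities.append(sorted(component))
    return communities

# Illustrative relationships between made-up entities.
relationships = [("Alice", "Acme"), ("Acme", "Berlin"), ("Bob", "Globex")]
communities = find_communities(relationships)
print(communities)
```

A community report would then be one LLM-written summary per group, which explains the extra token cost: it grows with the number of communities.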

Quickstart

  1. Navigate to the Configuration page of your dataset and update:

    • Entity types: Required - Specifies the entity types to extract for the knowledge graph. You don't have to stick with the defaults; tailor them to your documents.
    • Method: Optional
    • Entity resolution: Optional
    • Community reports: Optional

    The default knowledge graph configurations for your dataset are now set.
  2. Navigate to the Files page of your dataset, click the Generate button in the top-right corner of the page, then select Knowledge graph from the dropdown to start generating the knowledge graph.

    You can click the pause button in the dropdown to halt the build process when necessary.

  3. Go back to the Configuration page:

    Once a knowledge graph is generated, the Knowledge graph field changes from Not generated to Generated at a specific timestamp. You can delete it by clicking the recycle bin button to the right of the field.

  4. To use the created knowledge graph, do either of the following:

    • In the Chat setting panel of your chat app, switch on the Use knowledge graph toggle.
    • If you are using an agent, click the Retrieval agent component to specify the dataset(s) and switch on the Use knowledge graph toggle.
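
The four settings chosen in step 1 can be summarized in a small Python sketch. The `KnowledgeGraphSettings` class and its field names are hypothetical and are not part of RAGFlow's SDK or API. Only the defaults and the allowed values are taken from the Configurations section above.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeGraphSettings:
    """Hypothetical container for the four settings; not a RAGFlow class."""
    entity_types: list = field(
        default_factory=lambda: ["organization", "person", "event", "category"]
    )
    method: str = "light"            # "light" (default) or "general"
    entity_resolution: bool = False  # disabled by default
    community_reports: bool = False  # disabled by default

    def validate(self):
        """Raise ValueError if a required or constrained setting is invalid."""
        if not self.entity_types:
            raise ValueError("Entity types is required and cannot be empty")
        if self.method not in ("general", "light"):
            raise ValueError(f"Unknown method: {self.method!r}")

    def token_heavy_options(self):
        """List the enabled choices described above as consuming more tokens."""
        options = []
        if self.method == "general":
            options.append("method=general")
        if self.entity_resolution:
            options.append("entity_resolution")
        if self.community_reports:
            options.append("community_reports")
        return options

# Example: custom entity types with entity resolution switched on.
settings = KnowledgeGraphSettings(
    entity_types=["person", "organization", "product"],
    entity_resolution=True,
)
settings.validate()
print(settings.token_heavy_options())
```

Listing the token-heavy options before clicking Generate is one way to sanity-check the cost of a build, given the warning about memory, compute, and token usage.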

Frequently asked questions

Does the knowledge graph automatically update when I remove a related file?

Nope. The knowledge graph does not update until you regenerate a knowledge graph for your dataset.

How to remove a generated knowledge graph?

On the Configuration page of your dataset, find the Knowledge graph field and click the recycle bin button to the right of the field.

Where is the created knowledge graph stored?

All chunks of the created knowledge graph are stored in RAGFlow's document engine: either Elasticsearch or Infinity.

How to export a created knowledge graph?

Exporting a created knowledge graph is not supported. If you consider this feature essential, please raise an issue explaining your use case and its importance.