· 3 min read

The release of GitHub’s 2025 Octoverse report marks a pivotal moment for the open source ecosystem—and for projects like RAGFlow, which has emerged as one of the fastest-growing open source projects by contributors this year. With a remarkable 2,596% year-over-year growth in contributor engagement, RAGFlow isn’t just gaining traction—it’s defining the next wave of AI-powered development.

The Rise of Retrieval-Augmented Generation in Production

As the Octoverse report highlights, AI is no longer experimental—it’s foundational. More than 4.3 million AI-related repositories now exist on GitHub, and over 1.1 million public repos import LLM SDKs, a 178% YoY increase. In this context, RAGFlow’s rapid adoption signals a clear shift: developers are moving beyond prototyping and into production-grade AI workflows.

RAGFlow—an end-to-end retrieval-augmented generation engine with built-in agent capabilities—is perfectly positioned to meet this demand. It enables developers to build scalable, context-aware AI applications that are both powerful and practical. As the report notes, “AI infrastructure is emerging as a major magnet” for open source contributions, and RAGFlow sits squarely at the intersection of AI infrastructure and real-world usability.

Why RAGFlow Resonates in the AI Era

Several trends highlighted in the Octoverse report align closely with RAGFlow’s design and mission:

  • From Notebooks to Production: The report notes a shift from Jupyter Notebooks (+75% YoY) to Python codebases, signaling that AI projects are maturing. RAGFlow supports this transition by offering a structured, reproducible framework for deploying RAG systems in production.
  • Agentic Workflows Are Going Mainstream: With the launch of GitHub Copilot coding agent and the rise of AI-assisted development, developers are increasingly relying on tools that automate complex tasks. RAGFlow’s built-in agent capabilities allow teams to automate retrieval, reasoning, and response generation—key components of modern AI apps.
  • Security and Scalability Are Top of Mind: The report also highlights a 172% YoY increase in Broken Access Control vulnerabilities, underscoring the need for secure-by-design AI systems. RAGFlow’s focus on enterprise-ready deployment helps teams address these challenges head-on.

A Project in Active Development

RAGFlow's evolution mirrors a deliberate journey—from solving foundational RAG challenges to shaping the next generation of enterprise AI infrastructure.

The project first made its mark by systematically addressing core RAG limitations through integrated technological innovation. With features such as deep document understanding for parsing complex formats, hybrid retrieval that blends multiple search strategies, and built-in advanced tools like GraphRAG and RAPTOR, RAGFlow established itself as an end-to-end solution that dramatically enhances retrieval accuracy and reasoning performance.

Now, building on this robust technical foundation, RAGFlow is steering toward a bolder vision: to become the superior context engine for enterprise-grade Agents. Evolving from a specialized RAG engine into a unified, resilient context layer, RAGFlow is positioning itself as the essential data foundation for LLMs in the enterprise—enabling Agents of any kind to access rich, precise, and secure context, ensuring reliable and effective operation across all tasks.


RAGFlow is an open source retrieval-augmented generation engine designed for building production-ready AI applications. To learn more or contribute, visit the RAGFlow GitHub repository.

This post was inspired by insights from the GitHub Octoverse 2025 Report. Special thanks to the GitHub team for amplifying the voices of open source builders everywhere.


· 13 min read

To meet growing requirements for system operations monitoring and user account management, and to address practical issues RAGFlow users currently face, such as "the inability of users to recover lost passwords on their own" and "difficulty in effectively controlling account statuses," RAGFlow has officially launched a professional back-end management command-line tool.

This tool is based on a clear client-server architecture. By separating functional concerns, it provides an efficient and reliable system management channel that lets administrators perform system recovery and permission control at the most fundamental level.

The specific architectural design is illustrated in the following diagram:

Through this tool, RAGFlow users can gain a one-stop overview of the operational status of the RAGFlow Server as well as components such as MySQL, Elasticsearch, Redis, MinIO, and Infinity. It also supports comprehensive user lifecycle management, including account creation, status control, password reset, and data cleanup.

Start the service

If you deploy RAGFlow via Docker, modify the docker/docker-compose.yml file and add the following parameter to the service startup command:

command:
  - --enable-adminserver

If you are deploying RAGFlow from the source code, you can directly execute the following command to start the management server:

python3 admin/server/admin_server.py

After the server starts, it listens on port 9381 by default, waiting for client connections. If you need to use a different port, please modify the ADMIN_SVR_HTTP_PORT configuration item in the docker/.env file.
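For instance, assuming the standard KEY=VALUE dotenv format, the relevant entry in docker/.env would look like this (9381 is the documented default; any free port works):

```shell
# docker/.env
ADMIN_SVR_HTTP_PORT=9381
```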

Install the client and connect to the management service

It is recommended to use pip to install the specified version of the client:

pip install ragflow-cli==0.21.0

Version 0.21.0 is the latest version at the time of writing. Please ensure that the ragflow-cli version matches your RAGFlow server version to avoid compatibility issues. Then connect to the management server:

ragflow-cli -h 127.0.0.1 -p 9381

Here, -h specifies the server IP, and -p specifies the server port. If the server is deployed at a different address or port, please adjust the parameters accordingly.

For the first login, enter the default administrator password admin. After successfully logging in, it is recommended to promptly change the default password in the command-line tool to enhance security.

Command Usage Guide

By entering the following commands through the client, you can conveniently manage users and monitor the operational status of the service.

Interactive Feature Description

  • Supports using arrow keys to move the cursor and review historical commands.
  • Pressing Ctrl+C allows you to terminate the current interaction at any time.
  • If you need to copy content, please avoid using Ctrl+C to prevent accidentally interrupting the process.

Command Format Specifications

  • All commands are case-insensitive and must end with a semicolon ;.
  • Text parameters such as usernames and passwords need to be enclosed in single quotes ' or double quotes ".
  • Special characters like \, ', and " are prohibited in passwords.

Service Management Commands

LIST SERVICES;

  • List the operational status of RAGFlow backend services and all associated middleware.

Usage Example

admin> list services;
Listing all services
+-------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+---------+
| extra | host | id | name | port | service_type | status |
+-------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+---------+
| {} | 0.0.0.0 | 0 | ragflow_0 | 9380 | ragflow_server | Timeout |
| {'meta_type': 'mysql', 'password': 'infini_rag_flow', 'username': 'root'} | localhost | 1 | mysql | 5455 | meta_data | Alive |
| {'password': 'infini_rag_flow', 'store_type': 'minio', 'user': 'rag_flow'} | localhost | 2 | minio | 9000 | file_store | Alive |
| {'password': 'infini_rag_flow', 'retrieval_type': 'elasticsearch', 'username': 'elastic'} | localhost | 3 | elasticsearch | 1200 | retrieval | Alive |
| {'db_name': 'default_db', 'retrieval_type': 'infinity'} | localhost | 4 | infinity | 23817 | retrieval | Timeout |
| {'database': 1, 'mq_type': 'redis', 'password': 'infini_rag_flow'} | localhost | 5 | redis | 6379 | message_queue | Alive |
+-------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+---------+


SHOW SERVICE <id>;

  • Query the detailed operational status of the specified service.
  • <id>: The service ID shown in the results of the LIST SERVICES; command.

Usage Example:

  1. Query the RAGFlow backend service:
admin> show service 0;
Showing service: 0
Service ragflow_0 is alive. Detail:
Confirm elapsed: 4.1 ms.

The response indicates that the RAGFlow backend service is online, with a response time of 4.1 milliseconds.

  2. Query the MySQL service:
admin> show service 1;
Showing service: 1
Service mysql is alive. Detail:
+---------+----------+------------------+------+------------------+------------------------+-------+-----------------+
| command | db | host | id | info | state | time | user |
+---------+----------+------------------+------+------------------+------------------------+-------+-----------------+
| Daemon | None | localhost | 5 | None | Waiting on empty queue | 86356 | event_scheduler |
| Sleep | rag_flow | 172.18.0.6:56788 | 8790 | None | | 2 | root |
| Sleep | rag_flow | 172.18.0.6:53482 | 8791 | None | | 73 | root |
| Query | rag_flow | 172.18.0.6:56794 | 8795 | SHOW PROCESSLIST | init | 0 | root |
+---------+----------+------------------+------+------------------+------------------------+-------+-----------------+

The response indicates that the MySQL service is online, with the current connection and execution status shown in the table above.

  3. Query the MinIO service:
admin> show service 2;
Showing service: 2
Service minio is alive. Detail:
Confirm elapsed: 2.3 ms.

The response indicates that the MinIO service is online, with a response time of 2.3 milliseconds.

  4. Query the Elasticsearch service:
admin> show service 3;
Showing service: 3
Service elasticsearch is alive. Detail:
+----------------+------+--------------+---------+----------------+--------------+---------------+--------------+------------------------------+----------------------------+-----------------+-------+---------------+---------+-------------+---------------------+--------+------------+--------------------+
| cluster_name | docs | docs_deleted | indices | indices_shards | jvm_heap_max | jvm_heap_used | jvm_versions | mappings_deduplicated_fields | mappings_deduplicated_size | mappings_fields | nodes | nodes_version | os_mem | os_mem_used | os_mem_used_percent | status | store_size | total_dataset_size |
+----------------+------+--------------+---------+----------------+--------------+---------------+--------------+------------------------------+----------------------------+-----------------+-------+---------------+---------+-------------+---------------------+--------+------------+--------------------+
| docker-cluster | 8 | 0 | 1 | 2 | 3.76 GB | 2.15 GB | 21.0.1+12-29 | 17 | 757 B | 17 | 1 | ['8.11.3'] | 7.52 GB | 2.30 GB | 31 | green | 226 KB | 226 KB |
+----------------+------+--------------+---------+----------------+--------------+---------------+--------------+------------------------------+----------------------------+-----------------+-------+---------------+---------+-------------+---------------------+--------+------------+--------------------+

The response indicates that the Elasticsearch cluster is operating normally, with specific metrics including document count, index status, memory usage, etc.

  5. Query the Infinity service:
admin> show service 4;
Showing service: 4
Fail to show service, code: 500, message: Infinity is not in use.

The response indicates that Infinity is not currently in use by the RAGFlow system.

admin> show service 4;
Showing service: 4
Service infinity is alive. Detail:
+-------+--------+----------+
| error | status | type     |
+-------+--------+----------+
|       | green  | infinity |
+-------+--------+----------+

After enabling Infinity and querying again, the response indicates that the Infinity service is online and in good condition.

  6. Query the Redis service:
admin> show service 5;
Showing service: 5
Service redis is alive. Detail:
+-----------------+-------------------+---------------------------+-------------------------+---------------+-------------+--------------------------+---------------------+-------------+
| blocked_clients | connected_clients | instantaneous_ops_per_sec | mem_fragmentation_ratio | redis_version | server_mode | total_commands_processed | total_system_memory | used_memory |
+-----------------+-------------------+---------------------------+-------------------------+---------------+-------------+--------------------------+---------------------+-------------+
| 0               | 3                 | 10                        | 3.03                    | 7.2.4         | standalone  | 404098                   | 30.84G              | 1.29M       |
+-----------------+-------------------+---------------------------+-------------------------+---------------+-------------+--------------------------+---------------------+-------------+

The response indicates that the Redis service is online, with the version number, deployment mode, and resource usage shown in the table above.

User Management Commands

LIST USERS;

  • List all users in the RAGFlow system.

Usage Example:

admin> list users;
Listing all users
+-------------------------------+----------------------+-----------+----------+
| create_date                   | email                | is_active | nickname |
+-------------------------------+----------------------+-----------+----------+
| Mon, 13 Oct 2025 15:58:42 GMT | admin@ragflow.io     | 1         | admin    |
| Mon, 13 Oct 2025 15:54:34 GMT | lynn_inf@hotmail.com | 1         | Lynn     |
+-------------------------------+----------------------+-----------+----------+

The response indicates that there are currently two users in the system, both of whom are enabled.

Among them, admin@ragflow.io is the administrator account, which is automatically created during the initial system startup.

SHOW USER <username>;

  • Query detailed user information by email.
  • <username>: The user's email address, which must be enclosed in single quotes ' or double quotes ".

Usage Example:

  1. Query the administrator user
admin> show user "admin@ragflow.io";
Showing user: admin@ragflow.io
+-------------------------------+------------------+-----------+--------------+------------------+--------------+----------+-----------------+---------------+--------+-------------------------------+
| create_date | email | is_active | is_anonymous | is_authenticated | is_superuser | language | last_login_time | login_channel | status | update_date |
+-------------------------------+------------------+-----------+--------------+------------------+--------------+----------+-----------------+---------------+--------+-------------------------------+
| Mon, 13 Oct 2025 15:58:42 GMT | admin@ragflow.io | 1 | 0 | 1 | True | English | None | None | 1 | Mon, 13 Oct 2025 15:58:42 GMT |
+-------------------------------+------------------+-----------+--------------+------------------+--------------+----------+-----------------+---------------+--------+-------------------------------+

The response indicates that admin@ragflow.io is a super administrator and is currently enabled.

  2. Query a regular user
admin> show user "lynn_inf@hotmail.com";
Showing user: lynn_inf@hotmail.com
+-------------------------------+----------------------+-----------+--------------+------------------+--------------+----------+-------------------------------+---------------+--------+-------------------------------+
| create_date | email | is_active | is_anonymous | is_authenticated | is_superuser | language | last_login_time | login_channel | status | update_date |
+-------------------------------+----------------------+-----------+--------------+------------------+--------------+----------+-------------------------------+---------------+--------+-------------------------------+
| Mon, 13 Oct 2025 15:54:34 GMT | lynn_inf@hotmail.com | 1 | 0 | 1 | False | English | Mon, 13 Oct 2025 15:54:33 GMT | password | 1 | Mon, 13 Oct 2025 17:24:09 GMT |
+-------------------------------+----------------------+-----------+--------------+------------------+--------------+----------+-------------------------------+---------------+--------+-------------------------------+

The response indicates that lynn_inf@hotmail.com is a regular user who logs in via password, with the last login time shown as the provided timestamp.

CREATE USER <username> <password>;

  • Create a new user.
  • <username>: The user's email address, which must comply with the standard email format.
  • <password>: The user's password, which must not contain special characters such as \, ', or ".

Usage Example:

admin> create user "example@ragflow.io" "psw";
Create user: example@ragflow.io, password: psw, role: user
+----------------------------------+--------------------+----------------------------------+--------------+---------------+----------+
| access_token                     | email              | id                               | is_superuser | login_channel | nickname |
+----------------------------------+--------------------+----------------------------------+--------------+---------------+----------+
| be74d786a9b911f0a726d68c95a0776b | example@ragflow.io | be74d6b4a9b911f0a726d68c95a0776b | False        | password      |          |
+----------------------------------+--------------------+----------------------------------+--------------+---------------+----------+

A regular user has been successfully created. Personal information such as nickname and avatar can be set by the user themselves after logging in and accessing the profile page.

ALTER USER PASSWORD <username> <new_password>;

  • Change the user's password.
  • <username>: User email address
  • <new_password>: New password (must differ from the old password and must not contain special characters)

Usage Example:

admin> alter user password "example@ragflow.io" "psw";
Alter user: example@ragflow.io, password: psw
Same password, no need to update!

When the new password is the same as the old password, the system prompts that no change is needed.

admin> alter user password "example@ragflow.io" "new psw";
Alter user: example@ragflow.io, password: new psw
Password updated successfully!

The password has been updated successfully. The user can log in with the new password thereafter.

ALTER USER ACTIVE <username> <on/off>;

  • Enable or disable a user.
  • <username>: User email address
  • <on/off>: Enabled or disabled status

Usage Example:

admin> alter user active "example@ragflow.io" off;
Alter user example@ragflow.io activate status, turn off.
Turn off user activate status successfully!

The user has been successfully disabled. Only users in a disabled state can be deleted.

DROP USER <username>;

  • Delete the user and all their associated data
  • <username>: User email address

Important Notes:

  • Only disabled users can be deleted.
  • Before proceeding, ensure that all necessary data such as knowledge bases and files that need to be retained have been transferred to other users.
  • This operation will permanently delete the following user data:

All knowledge bases created by the user, uploaded files, and configured agents, as well as files uploaded by the user in others' knowledge bases, will be permanently deleted. This operation is irreversible, so please proceed with extreme caution.

  • The deletion command is idempotent. If the system fails or the operation is interrupted during the deletion process, the command can be re-executed after troubleshooting to continue deleting the remaining data.

Usage Example:

  1. User Successfully Deleted
admin> drop user "example@ragflow.io";
Drop user: example@ragflow.io
Successfully deleted user. Details:
Start to delete owned tenant.
- Deleted 2 tenant-LLM records.
- Deleted 0 langfuse records.
- Deleted 1 tenant.
- Deleted 1 user-tenant records.
- Deleted 1 user.
Delete done!

The response indicates that the user has been successfully deleted, and it lists detailed steps for data cleanup.

  2. Deleting Super Administrator (Prohibited Operation)
admin> drop user "admin@ragflow.io";
Drop user: admin@ragflow.io
Fail to drop user, code: -1, message: Can't delete the super user.

The response indicates that the deletion failed. The super administrator account is protected and cannot be deleted, even if it is in a disabled state.

Data and Agent Management Commands

LIST DATASETS OF <username>;

  • List all knowledge bases of the specified user
  • <username>: User email address

Usage Example:

admin> list datasets of "lynn_inf@hotmail.com";
Listing all datasets of user: lynn_inf@hotmail.com
+-----------+-------------------------------+---------+----------+-----------------+------------+--------+-----------+-------------------------------+
| chunk_num | create_date                   | doc_num | language | name            | permission | status | token_num | update_date                   |
+-----------+-------------------------------+---------+----------+-----------------+------------+--------+-----------+-------------------------------+
| 8         | Mon, 13 Oct 2025 15:56:43 GMT | 1       | English  | primary_dataset | me         | 1      | 3296      | Mon, 13 Oct 2025 15:57:54 GMT |
+-----------+-------------------------------+---------+----------+-----------------+------------+--------+-----------+-------------------------------+

The response shows that the user has one private knowledge base, with detailed information such as the number of documents and segments displayed in the table above.

LIST AGENTS OF <username>;

  • List all Agents of the specified user
  • <username>: User email address

Usage Example:

admin> list agents of "lynn_inf@hotmail.com";
Listing all agents of user: lynn_inf@hotmail.com
+-----------------+-------------+------------+----------------+
| canvas_category | canvas_type | permission | title          |
+-----------------+-------------+------------+----------------+
| agent_canvas    | None        | me         | finance_helper |
+-----------------+-------------+------------+----------------+

The response indicates that the user has one private Agent, with detailed information shown in the table above.

Other commands

  • ? or \help

Display help information.

  • \q or \quit

Exit the client.

Follow-up plan

We are always committed to enhancing the system management experience and overall security. Building on its existing robust features, the RAGFlow back-end management tool will continue to evolve. In addition to the current efficient and flexible command-line interface, we are soon launching a professional system management UI, enabling administrators to perform all operational and maintenance tasks in a more secure and intuitive graphical environment.

To strengthen permission control, the system status information currently visible in the ordinary user interface will be removed. Once the professional management UI launches, access to the core operational status of the system will be restricted to administrators only. This will address the current issue of excessive permission exposure and further reinforce the system's security boundaries.

In addition, we will also roll out more fine-grained management features sequentially, including:

  • Fine-grained control over Datasets and Agents
  • User Team collaboration management mechanisms
  • Enhanced system monitoring and auditing capabilities

These improvements will establish a more comprehensive enterprise-level management ecosystem, providing administrators with a more all-encompassing and convenient system control experience.

· 10 min read

Introduction

In RAG (Retrieval-Augmented Generation) and LLM (Large Language Model) memory, vector retrieval is widely employed. Among the various options, graph-based indexing has become the most common choice due to its high accuracy and performance, with the HNSW (Hierarchical Navigable Small World) index being the most representative [1][2].

However, during our practical application of HNSW in RAGFlow, we encountered the following two major bottlenecks:

  1. As the data scale continues to grow, the memory consumption of vector data becomes highly significant. For instance, one billion 1024-dimensional floating-point vectors would require approximately 4TB of memory space.

  2. When constructing an HNSW index on complex datasets, there is a bottleneck in retrieval accuracy. After reaching a certain threshold, it becomes difficult to further improve accuracy solely by adjusting parameters.
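The first bottleneck follows from simple arithmetic, which a few lines make explicit:

```python
# One billion 1024-dimensional float32 vectors, 4 bytes per component
n_vectors = 1_000_000_000
dim = 1024
bytes_per_float = 4

total_bytes = n_vectors * dim * bytes_per_float
print(total_bytes / 10**12)  # ≈ 4.1 TB, before any index overhead
```

And this counts only the raw vectors; the graph's adjacency lists add further memory on top.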

To address this, Infinity has implemented a variety of improved algorithms based on HNSW. Users can select different indexing schemes by adjusting the index parameters of HNSW. Each HNSW index variant possesses distinct characteristics and is suitable for different scenarios, allowing users to construct corresponding index structures based on their actual needs.

Introduction to Indexing Schemes

The original HNSW, as a commonly used graph-based index, exhibits excellent performance.

Its structure consists of two parts: a set of original vector data, and a graph structure jointly built from a skip list and adjacency lists. Taking the Python interface as an example, the index can be constructed and used as follows:

## Create index
table_obj.create_index(
    "hnsw_index",
    index.IndexInfo(
        "embedding",
        index.IndexType.Hnsw,
        {
            "m": 16,
            "ef_construction": 200,
            "metric": "l2"
        },
    )
)
## Vector retrieval
query_builder.match_dense('embedding', [1.0, 2.0, 3.0], 'float', 'l2', 10, {'ef': 200})

To address the issues of high memory consumption and accuracy bottlenecks, Infinity provides the following solutions:

  1. Introduce LVQ and RaBitQ quantization methods to reduce the memory overhead of original vectors during graph search processes.
  2. Introduce the LSG strategy to optimize the graph index structure of HNSW, enhancing its accuracy threshold and query efficiency.

To facilitate users in testing the performance of different indexes locally, Infinity provides a benchmark script. You can follow the tutorial provided by Infinity on GitHub to set up the environment and prepare the dataset, and then test different indexing schemes using the benchmark.

######################   compile benchmark  ######################
cmake --build cmake-build-release --target hnsw_benchmark

###################### build index & execute query ######################
# mode : build, query
# benchmark_type : sift, gist, msmarco
# build_type : plain, lvq, crabitq, lsg, lvq_lsg, crabitq_lsg
##############################################################
benchmark_type=sift
build_type=plain
./cmake-build-release/benchmark/local_infinity/hnsw_benchmark --mode=build --benchmark_type=$benchmark_type --build_type=$build_type --thread_n=8
./cmake-build-release/benchmark/local_infinity/hnsw_benchmark --mode=query --benchmark_type=$benchmark_type --build_type=$build_type --thread_n=8 --topk=10

Among them, the original HNSW corresponds to the parameter build_type=plain. This post runs a unified test of the query performance of all index variants, using the following experimental environment:

  1. OS: Ubuntu 24.04 LTS (Noble Numbat)
  2. CPU: 13th Gen Intel(R) Core(TM) i5-13400
  3. RAM: 64G

The CPU provides 16 hardware threads. To align with the actual device environments of most users, the benchmark's parallel computing parameter is uniformly set to 8 threads.

Solution 1: Original HNSW + LVQ Quantizer (HnswLvq)

LVQ is a scalar quantization method that compresses each 32-bit floating-point number in the original vectors into an 8-bit integer [3], thereby reducing memory usage to one-quarter of that of the original vectors.

Compared to simple scalar quantization methods (such as mean scalar quantization), LVQ reduces errors by statistically analyzing the residuals of each vector, effectively minimizing information loss in distance calculations for quantized vectors. Consequently, LVQ can accurately estimate the distances between original vectors with only approximately 30% of the original memory footprint.
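To make the idea concrete, here is a toy NumPy sketch of per-vector 8-bit scalar quantization. This illustrates the principle only and is not Infinity's actual LVQ implementation; the function names are invented for the example:

```python
import numpy as np

def lvq_encode(vectors):
    """Per-vector 8-bit scalar quantization in the spirit of LVQ (toy sketch).

    Vectors are centered on the dataset mean; each residual then gets its own
    min/max range, so the 256 levels adapt locally to every vector.
    """
    mean = vectors.mean(axis=0)
    residuals = vectors - mean
    lo = residuals.min(axis=1, keepdims=True)   # per-vector lower bound
    hi = residuals.max(axis=1, keepdims=True)   # per-vector upper bound
    scale = (hi - lo) / 255.0                   # one 8-bit step, per vector
    codes = np.round((residuals - lo) / scale).astype(np.uint8)
    return codes, lo, scale, mean

def lvq_decode(codes, lo, scale, mean):
    """Reconstruct approximate float vectors from the 8-bit codes."""
    return codes.astype(np.float32) * scale + lo + mean

rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 128)).astype(np.float32)

codes, lo, scale, mean = lvq_encode(data)
approx = lvq_decode(codes, lo, scale, mean)

# Distances computed on quantized vectors closely track the exact ones,
# which is what lets graph search run on the compressed representation.
d_exact = np.linalg.norm(data[0] - data[1])
d_quant = np.linalg.norm(approx[0] - approx[1])
print(f"exact: {d_exact:.3f}  quantized: {d_quant:.3f}")
```

Storing the uint8 codes plus a small per-vector (lo, scale) pair is what brings memory down to roughly a quarter of the float32 original, as described above.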

In the HNSW algorithm, original vectors are utilized for distance calculations, which enables LVQ to be directly integrated with HNSW. We refer to this combined approach as HnswLvq. In Infinity, users can enable LVQ encoding by setting the parameter "encode": "lvq":

## Create index
table_obj.create_index(
    "hnsw_index",
    index.IndexInfo(
        "embedding",
        index.IndexType.Hnsw,
        {
            "m": 16,
            "ef_construction": 200,
            "metric": "l2",
            "encode": "lvq"
        },
    )
)
## Vector retrieval
query_builder.match_dense('embedding', [1.0, 2.0, 3.0], 'float', 'l2', 10, {'ef': 200})

The graph structure of HnswLvq remains consistent with that of the original HNSW, with the key difference being that it uses quantized vectors to perform all distance calculations within the graph. Through this improvement, HnswLvq outperforms the original HNSW index in terms of both index construction and query efficiency.

The improvement in construction efficiency stems from the shorter length of quantized vectors, which results in reduced time for distance calculations using SIMD instructions. The enhancement in query efficiency is attributed to the computational acceleration achieved through quantization, which outweighs the negative impact caused by the loss of precision.

In summary, HnswLvq significantly reduces memory usage while maintaining excellent query performance, and we recommend it as the primary index in most scenarios. To replicate this experiment, set the benchmark parameter build_type=lvq. The specific experimental results are compared alongside the RaBitQ quantizer scheme in Solution 2.

Solution 2: Original HNSW + RaBitQ Quantizer (HnswRabitq)

RaBitQ is a binary scalar quantization method that shares a similar core idea with LVQ, both aiming to replace the 32-bit floating-point numbers in original vectors with fewer encoded bits. The difference lies in that RaBitQ employs binary scalar quantization, representing each floating-point number with just 1 bit, thereby achieving an extremely high compression ratio.

However, this extreme compression also leads to more significant information loss in the vectors, resulting in a decline in the accuracy of distance estimation. To mitigate this issue, RaBitQ preprocesses the dataset with a rotation matrix and retains additional residual information, thereby reducing errors in distance calculations to a certain extent [4].

Nevertheless, binary quantization has obvious limitations in terms of precision, showing a substantial gap compared to LVQ. Indexes built directly using RaBitQ encoding exhibit poor query performance.
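For contrast with the 8-bit approach, the core binary-quantization idea can be sketched as sign bits taken after a random rotation. This is a toy illustration of the principle only; RaBitQ's real estimator additionally keeps per-vector correction factors:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 128
data = rng.normal(size=(1000, dim)).astype(np.float32)

# Preprocessing: a random orthogonal rotation (QR of a Gaussian matrix)
q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
rotated = data @ q.astype(np.float32)

# 1 bit per dimension: keep only the sign of each component
codes = rotated > 0
packed = np.packbits(codes, axis=1)  # 16 bytes per 128-d vector vs 512 bytes of float32

# The Hamming distance between two sign codes gives a coarse angle estimate:
# under a random rotation, P(signs differ) ≈ angle / pi per dimension
ham = np.count_nonzero(codes[0] != codes[1])
est_angle = np.pi * ham / dim
true_angle = np.arccos(
    data[0] @ data[1] / (np.linalg.norm(data[0]) * np.linalg.norm(data[1]))
)
print(f"estimated angle: {est_angle:.2f}  true angle: {true_angle:.2f}")
```

The 32x compression is why the precision gap appears: a single bit per dimension simply cannot separate near neighbors well, which is why re-ranking with the original vectors (described below) is needed.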

Therefore, the HnswRabitq scheme implemented in Infinity involves first constructing an original HNSW index for the dataset and then converting it into an HnswRabitq index through the compress_to_rabitq parameter in the optimize method.

During the query process, the system initially uses quantized vectors for preliminary retrieval and then re-ranks the ef candidate results specified by the user using the original vectors.

## Create index
table_obj.create_index(
    "hnsw_index",
    index.IndexInfo(
        "embedding",
        index.IndexType.Hnsw,
        {
            "m": 16,
            "ef_construction": 200,
            "metric": "l2"
        },
    )
)
## Construct RaBitQ encoding
table_obj.optimize("hnsw_index", {"compress_to_rabitq": "true"})
## Vector retrieval
query_builder.match_dense('embedding', [1.0, 2.0, 3.0], 'float', 'l2', 10, {'ef': 200})

Compared to LVQ, RaBitQ can further reduce the memory footprint of encoded vectors by nearly 70%. On some datasets, the query efficiency of HnswRabitq even surpasses that of HnswLvq due to the higher efficiency of distance calculations after binary quantization.

However, it should be noted that on certain datasets (such as sift1M), the quantization process may lead to significant precision loss, making such datasets unsuitable for using HnswRabitq.

In summary, if a user's dataset is not sensitive to quantization errors, adopting the HnswRabitq index can significantly reduce memory overhead while still maintaining relatively good query performance.

In such scenarios, it is recommended to prioritize the use of the HnswRabitq index. Users can replicate the aforementioned experiments by setting the benchmark parameter build_type=crabitq.

Solution 3: LSG Graph Construction Strategy

LSG (Local Scaling Graph) is an improved graph construction strategy for graph indexing algorithms such as HNSW and DiskANN [5].

This strategy scales the distance (e.g., L2 distance, inner product distance, etc.) between any two vectors by statistically analyzing the local information—neighborhood radius—of each vector in the dataset. The scaled distance is referred to as the LS distance.

During the graph indexing construction process, LSG uniformly replaces the original distance metric with the LS distance, effectively performing a "local scaling" of the original metric space. Through theoretical proofs and experiments, the paper demonstrates that constructing a graph index in this scaled space can achieve superior query performance in the original space.
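As an illustration of the idea (the exact scaling formula is defined in the paper; the normalization below, dividing by the geometric mean of the two neighborhood radii, is a simplified stand-in):

```python
import math

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def neighborhood_radius(i, points, k=2):
    # local information: mean distance to the k nearest neighbours
    dists = sorted(l2(points[i], p) for j, p in enumerate(points) if j != i)
    return sum(dists[:k]) / k

def ls_distance(i, j, points, radii):
    # illustrative local scaling: normalize the raw distance by the
    # geometric mean of the two points' neighborhood radii
    return l2(points[i], points[j]) / math.sqrt(radii[i] * radii[j])

# a dense cluster (first three points) and a sparse pair (last two)
points = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.1], [5.0, 5.0], [6.0, 5.0]]
radii = [neighborhood_radius(i, points) for i in range(len(points))]

# The sparse pair's LS distance comes out smaller than the dense pair's,
# even though its raw L2 distance is 10x larger: the metric adapts to
# the local density of each region.
print(ls_distance(0, 1, points, radii), ls_distance(3, 4, points, radii))
```

Building the graph with such a density-adapted metric changes which edges survive neighbour selection, which is how LSG improves graph quality without touching the query-time distance.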

LSG optimizes the HNSW index in multiple ways. When the accuracy requirement is relatively lenient (< 99%), LSG exhibits higher QPS (Queries Per Second) compared to the original HNSW index.

In high-precision scenarios (> 99%), LSG enhances the quality of the graph index, enabling HNSW to surpass its original accuracy limit and achieve retrieval accuracy that is difficult for the original HNSW index to attain. These improvements translate into faster response times and more precise query results for users in real-world applications of RAGFlow.

In Infinity, LSG is provided as an optional parameter for HNSW. Users can enable this graph construction strategy by setting build_type=lsg, and we refer to the corresponding index as HnswLsg.

```python
# Create the HNSW index with the LSG build strategy
table_obj.create_index(
    "hnsw_index",
    index.IndexInfo(
        "embedding",
        index.IndexType.Hnsw,
        {
            "m": 16,
            "ef_construction": 200,
            "metric": "l2",
            "build_type": "lsg",
        },
    )
)
# Vector retrieval
query_builder.match_dense('embedding', [1.0, 2.0, 3.0], 'float', 'l2', 10, {'ef': 200})
```

LSG essentially alters the metric space during the index construction process. Therefore, it can not only be applied to the original HNSW but also be combined with quantization methods (such as LVQ or RaBitQ) to form variant indexes like HnswLvqLsg or HnswRabitqLsg. The usage of the user interface remains consistent with that of HnswLvq and HnswRabitq.

LSG can enhance the performance of the vast majority of graph indexes and datasets, but at the cost of additional computation of local information—neighborhood radius—during the graph construction phase, which thus increases the construction time to a certain extent. For example, on the sift1M dataset, the construction time of HnswLsg is approximately 1.2 times that of the original HNSW.

In summary, if users are not sensitive to index construction time, they can confidently enable the LSG option, as it can steadily improve query performance. Users can replicate the aforementioned experiments by setting the benchmark parameter to build_type=[lsg/lvq_lsg/crabitq_lsg].

Index Performance Evaluation

To evaluate the performance of various indexes in Infinity, we selected three representative datasets as benchmarks, including the widely used sift and gist datasets in vector index evaluations.

Given that Infinity is frequently used in conjunction with RAGFlow in current scenarios, the retrieval effectiveness on RAG-type datasets is particularly crucial for users assessing index performance.

Therefore, we also incorporated the msmarco dataset. This dataset was generated by encoding the TREC-RAG 2024 corpus using the Cohere Embed English v3 model, comprising embedded vectors for 113.5 million text passages, as well as embedded vectors corresponding to 1,677 query instructions from TREC-Deep Learning 2021-2023.

From the test results of each dataset, it can be observed that in most cases, HnswRabitqLsg achieves the best overall performance. For instance, on the msmarco dataset in RAG scenarios, RaBitQ achieves a 90% reduction in memory usage while delivering query performance that is 5 times that of the original HNSW at a 99% recall rate.

Based on the above experimental results, we offer the following practical recommendations for Infinity users:

  1. The original HNSW can attain a higher accuracy ceiling compared to HnswLvq and HnswRabitq. If users have extremely high accuracy requirements, this strategy should be prioritized.
  2. Within the allowable accuracy range, HnswLvq can be confidently selected for most datasets. For datasets that are less susceptible to quantization effects, HnswRabitq is generally a better choice.
  3. The LSG strategy enhances performance across all index variants. If users are not sensitive to index construction time, it is recommended to enable this option in all scenarios to improve query efficiency. Additionally, due to its algorithmic characteristics, LSG can significantly raise the accuracy ceiling. Therefore, if the usage scenario demands extremely high accuracy (>99.9%), enabling LSG is strongly recommended to optimize index performance.

Infinity continues to iterate and improve. We welcome ongoing attention and valuable feedback and suggestions from everyone.

· 10 min read

RAGFlow 0.21.0 officially released

This release shifts focus from enhancing online Agent capabilities to strengthening the data foundation, prioritising usability and dialogue quality from the ground up. Directly addressing common RAG pain points—from data preparation to long-document understanding—version 0.21.0 brings crucial upgrades: a flexible, orchestratable Ingestion Pipeline, long-context RAG to close semantic gaps in complex files, and a new admin CLI for smoother operations. Taken together, these elements establish RAGFlow’s refreshed data-pipeline core, providing a more solid foundation for building robust and effective RAG applications.

Orchestratable Ingestion Pipeline

If earlier Agents primarily tackled the orchestration of online data—as seen in Workflow and Agentic Workflow—the Ingestion Pipeline mirrors this capability by applying the same technical architecture to orchestrate offline data ingestion. Its introduction enables users to construct highly customized RAG data pipelines within a unified framework. This not only streamlines bespoke development but also more fully embodies the "Flow" in RAGFlow.

A typical RAG ingestion process involves key stages such as document parsing, text chunking, vectorization, and index building. When RAGFlow first launched in April 2024, it already incorporated an advanced toolchain, including the DeepDoc-based parsing engine and a templated chunking mechanism. These state-of-the-art solutions were foundational to its early adoption.

However, with rapid industry evolution and deeper practical application, we have observed new trends and demands:

  • The rise of Vision Language Models (VLMs): Increasingly mature VLMs have driven a wave of fine-tuned document parsing models. These offer significantly improved accuracy for unstructured documents with complex layouts or mixed text and images.
  • Demand for flexible chunking: Users now seek more customized chunking strategies. Faced with diverse knowledge-base scenarios, RAGFlow's original built-in chunking templates have proved insufficient for covering all niche cases, which can impact the accuracy of final Q&A outcomes.

To this end, RAGFlow 0.21.0 formally introduces the Ingestion Pipeline, featuring core capabilities including:

  • Orchestratable Data Ingestion: Building on the underlying Agent framework, users can create varied data ingestion pipelines. Each pipeline may apply different strategies to connect a data source to the final index, turning the previous built-in data-writing process into a user-customizable workflow. This provides more flexible ingestion aligned with specific business logic.
  • Decoupling of Upload and Cleansing: The architecture separates data upload from cleansing, establishing standard interfaces for future batch data sources and a solid foundation for expanding data preprocessing workflows.
  • Refactored Parser: The Parser component has been redesigned for extensibility, laying groundwork for integrating advanced document-parsing models beyond DeepDoc.
  • Customizable Chunking Interface: By decoupling the chunking step, users can plug in custom chunkers to better suit the segmentation needs of different knowledge structures.
  • Optimized Efficiency for Complex RAG: The execution of IO/compute-intensive tasks, such as GraphRAG and RAPTOR, has been overhauled. In the pre-pipeline architecture, processing each new document triggered a full compute cycle, resulting in slow performance. The new pipeline enables batch execution, significantly improving data throughput and overall efficiency.

If ETL/ELT represents the standard pipeline for processing structured data in the modern data stack—with tools like dbt and Fivetran providing unified and flexible data integration solutions for data warehouses and data lakes—then RAGFlow's Ingestion Pipeline is positioned to become the equivalent infrastructure for unstructured data. The following diagram illustrates this architectural analogy:

Image

Specifically, while the Extract phase in ETL/ELT is responsible for pulling data from diverse sources, the RAGFlow Ingestion Pipeline augments this with a dedicated Parsing stage to extract information from unstructured data. This stage integrates multiple parsing models, led by DeepDoc, to convert multimodal documents (for example, text and images) into a unimodal representation suitable for processing.

In the Transform phase, where traditional ETL/ELT focuses on data cleansing and business logic, RAGFlow instead constructs a series of LLM-centric Agent components. These are optimized to address semantic gaps in retrieval, with a core mission that can be summarized as: to enhance recall and ranking accuracy.

For data loading, ETL/ELT writes results to a data warehouse or data lake, while RAGFlow uses an Indexer component to build the processed content into a retrieval-optimised index format. This reflects the RAG engine’s hybrid retrieval architecture, which must support full-text, vector, and future tensor-based retrieval to ensure optimal recall.

Thus, the modern data stack serves business analytics for structured data, whereas a RAG engine with an Ingestion Pipeline specializes in the intelligent retrieval of unstructured data—providing high-quality context for LLMs. Each occupies an equivalent ecological niche in its domain.

Processing structured data, by contrast, is not the RAG engine’s core duty. It is handled by a Context Layer built atop the engine. This layer leverages the MCP (Model Context Protocol)—described as “TCP/IP for the AI era”—and accompanying Context Engineering to automate the population of all context types. This is a key focus area for RAGFlow’s next development phase.

Below is a preliminary look at the Ingestion Pipeline in v0.21.0; a more detailed guide will follow. We have introduced components for parsing, chunking, and other unstructured data processing tasks into the Agent Canvas, enabling users to freely orchestrate their parsing workflows.

Orchestrating an Ingestion Pipeline automates the process of parsing files and chunking them by length. It then leverages a large language model to generate summaries, keywords, questions, and even metadata. Previously, this metadata had to be entered manually. Now, a single configuration dramatically reduces maintenance overhead.
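Conceptually, such a pipeline is just a chain of composable stages. The sketch below uses hypothetical stage names (parse, chunk, enrich, index) and a stubbed keyword extractor in place of the LLM; it is not the RAGFlow component API:

```python
def parse(state):
    # stand-in for a Parser component (e.g. DeepDoc): raw bytes -> text
    state["text"] = state["raw"].strip()
    return state

def chunk(state, size=20):
    # fixed-length chunking by character count, for illustration
    text = state["text"]
    state["chunks"] = [text[i:i + size] for i in range(0, len(text), size)]
    return state

def enrich(state, llm=lambda c: c.split()[:2]):
    # stand-in for the LLM step that attaches keywords/summaries to each chunk
    state["enriched"] = [{"text": c, "keywords": llm(c)} for c in state["chunks"]]
    return state

def index(state):
    # stand-in Indexer: a simple keyword -> chunk inverted map
    inv = {}
    for e in state["enriched"]:
        for kw in e["keywords"]:
            inv.setdefault(kw, []).append(e["text"])
    state["index"] = inv
    return state

def run_pipeline(doc, stages):
    state = doc
    for stage in stages:
        state = stage(state)
    return state

out = run_pipeline(
    {"raw": "  RAGFlow ingestion pipelines orchestrate parsing chunking and indexing  "},
    [parse, chunk, enrich, index],
)
print(sorted(out["index"]))
```

Because each stage takes and returns the same state object, stages can be reordered, swapped, or extended, which is the property that makes the pipeline orchestratable on a canvas.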

Furthermore, the pipeline process is fully observable, recording and displaying complete processing logs for each file.

Image

The implementation of the Ingestion Pipeline in version 0.21.0 is a foundational step. In the next release, we plan to significantly enhance it by:

  • Adding support for more data sources.
  • Providing a wider selection of Parsers.
  • Introducing more flexible Transformer components to facilitate orchestration of a richer set of semantic-enhancement templates.

Long-context RAG

As we enter 2025, Retrieval-Augmented Generation (RAG) faces notable challenges driven by two main factors.

Fundamental limitations of traditional RAG

Traditional RAG architectures often fail to guarantee strong dialogue performance because they rely on a retrieval mechanism built around text chunks as the primary unit. This makes them highly sensitive to chunk quality and can yield degraded results due to insufficient context. For example:

  • If a coherent semantic unit is split across chunks, retrieval can be incomplete.
  • If a chunk lacks global context, the information presented to the LLM is weakened.

While strategies such as automatically detecting section headers and attaching them to chunks can help with global semantics, they are constrained by header-identification accuracy and the header’s own completeness.

Cost-efficiency concerns with advanced pre-processing techniques

Modern pre-processing methods—GraphRAG, RAPTOR, and Context Retrieval—aim to inject additional semantic information into raw data to boost search hit rates and accuracy for complex queries. They, however, share issues of high cost and unpredictable effectiveness.

  • GraphRAG: This approach often consumes many times more tokens than the original text, and the automatically generated knowledge graphs are frequently unsatisfactory. Its effectiveness in complex multi-hop reasoning is limited by uncontrollable reasoning paths. As a supplementary retrieval outside the original chunks, the knowledge graph also loses some granular context from the source.
  • RAPTOR: This technique produces clustered summaries that are recalled as independent chunks but naturally lack the detail of the source text, reintroducing the problem of insufficient context.

  • Context Retrieval: This method enriches original chunks with extra semantics such as keywords or potential questions. It presents a clear trade-off:

  • The more effective option queries the LLM multiple times per chunk, using both full text and the current chunk for context, improving performance but driving token costs several times higher than the original text.
  • The cheaper option generates semantic information based only on the current chunk, saving costs but providing limited global context and modest performance gains.

The last few years have also seen the emergence of new RAG schemes:
  • Complete abandonment of retrieval: some approaches have the LLM read documents directly, splitting them into chunks according to the context window and performing multi-stage searches. First, the LLM decides which global document is relevant, then which chunks, and finally loads those chunks to answer. While this avoids recall inaccuracies, it harms response latency, concurrency, and large-scale data handling, making practical deployment difficult.
  • Abandoning embedding or indexing in favour of tools like grep: this evolves RAG into Agentic RAG. As applications grow more complex and user queries diversify, combining RAG with agents is increasingly inevitable, since only LLMs can translate raw inquiries into structured retrieval commands. In RAGFlow, this capability has long been realized. Abandoning indexing to use grep is a compromise for simplifying agent development in personal or small-scale contexts; in enterprise settings, a powerful retrieval engine remains essential.
  • Long-Context RAG: introduced in version 0.21.0 as part of the same family as GraphRAG, RAPTOR and Context Retrieval, this approach uses LLMs to enrich raw text semantics to boost recall while retaining indexing and search. Retrieval remains central. Long-context RAG mirrors how people consult information: identify relevant chapters via the table of contents, then locate exact pages for detail. During indexing, the LLM extracts and attaches chapter information to each chunk to provide global context; during retrieval, it finds matching chunks and uses the table-of-contents structure to fill in gaps from chunk fragmentation.
  • Current experience and future direction: users can try Long-context RAG via the “TOC extraction” (Table of Contents) feature, though it is still in beta; the next release will integrate it into the Ingestion Pipeline. A key path to improving RAG lies in using LLMs to enrich content semantics without discarding retrieval altogether. Consequently, a flexible pipeline that lets users assemble LLM-based content-transformation components is an important direction for enhancing RAG retrieval quality.
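The TOC idea can be illustrated with a toy retriever (hypothetical data structures, not RAGFlow's implementation): each chunk carries its table-of-contents path, and a hit is expanded with the sibling chunks of the same section to repair chunk fragmentation:

```python
# Each chunk records the chapter path the LLM extracted during indexing.
chunks = [
    {"toc": "1. Setup > 1.1 Install", "text": "Download the package."},
    {"toc": "1. Setup > 1.1 Install", "text": "Run the installer."},
    {"toc": "2. Usage > 2.1 Query", "text": "Type a question."},
]

def retrieve(query, chunks):
    # toy matcher standing in for hybrid retrieval
    hits = [c for c in chunks if query.lower() in c["text"].lower()]
    # expand every hit with its TOC siblings so a split section is
    # returned whole, not as an isolated fragment
    sections = {h["toc"] for h in hits}
    return [c for c in chunks if c["toc"] in sections]

# Both chunks of section 1.1 come back, not just the one that matched.
print([c["text"] for c in retrieve("installer", chunks)])
```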

Backend management CLI

RAGFlow’s progression has shifted from core module development to strengthening administrative and operational capabilities.

  • In earlier versions, while parsing and retrieval-augmented generation improved, system administration lagged. Administrators could not modify passwords or delete accounts, complicating deployment and maintenance.
  • With RAGFlow 0.21.0, fundamental system management is markedly improved. A new command-line administration tool provides a central, convenient interface for administrators. Core capabilities include:
    • Service lifecycle management: monitoring built-in RAGFlow services for greater operational flexibility.
    • Comprehensive user management:
      • Create new registered users.
      • Directly modify login passwords.
      • Delete user accounts.
      • Enable or disable accounts.
      • View details of all registered users.
    • Resource overview: listing knowledge bases and Agents created under registered users for system-wide monitoring.

This upgrade underlines RAGFlow’s commitment to robust functionality and foundational administrative strength essential for enterprise use. Looking ahead, the team plans an enterprise-grade web administration panel and accompanying user interface to streamline management, boost efficiency, and enhance the end-user experience, supporting greater maturity and stability.

Finale

RAGFlow 0.21.0 marks a significant milestone, building on prior progress and outlining future developments. It introduces the first integration of Retrieval (RAG) with orchestration (Flow), forming an intelligent engine to support the LLM context layer, underpinned by unstructured data ELT and a robust RAG capability set.

From the user-empowered Ingestion Pipeline to long-context RAG that mitigates semantic fragmentation, and the management backend that ensures reliable operation, every new feature is designed to make the RAG system smarter, more flexible, and enterprise-ready. This is not merely a feature tally but an architectural evolution, establishing a solid foundation for future growth.

Our ongoing focus remains the LLM context layer: building a powerful, reliable data foundation for LLMs and effectively serving all Agents. This remains RAGFlow’s core aim.

We invite you to continue following and starring our project as we grow together.

GitHub: https://github.com/infiniflow/ragflow

· 7 min read

Currently, e-commerce retail platforms extensively use intelligent customer service systems to manage a wide range of user enquiries. However, traditional intelligent customer service often struggles to meet users’ increasingly complex and varied needs. For example, customers may require detailed comparisons of functionalities between different product models before making a purchase; they might be unable to use certain features due to losing the instruction manual; or, in the case of home products, they may need to arrange an on-site installation appointment through customer service.

To address these challenges, we have identified several common demand scenarios, including queries about functional differences between product models, requests for usage assistance, and scheduling of on-site installation services. Building on the recently launched Agent framework of RAGFlow, this blog presents an approach for the automatic identification and branch-specific handling of user enquiries, achieved by integrating workflow orchestration with large language models.

The workflow is orchestrated as follows:

Image

The following sections offer a detailed explanation of the implementation process for this solution.

1. Prepare datasets

1.1 Create datasets

You can download the sample datasets from Hugging Face Datasets.

Create the "Product Information" and "User Guide" knowledge bases and upload the relevant dataset documents.

1.2 Parse documents

For documents in the 'Product Information' and 'User Guide' knowledge bases, we choose to use Manual chunking.

Image

Product manuals are often richly illustrated with a combination of text and images, containing extensive information and complex structures. Relying solely on text length for segmentation risks compromising the integrity of the content. RAGFlow assumes such documents follow a hierarchical structure and therefore uses the "smallest heading" as the basic unit of segmentation, ensuring each section of text and its accompanying graphics remain intact within a single chunk. A preview of the user manual following segmentation is shown below:

Image

2. Build workflow

2.1 Create an app

Upon successful creation, the system will automatically generate a Begin component on the canvas.

Image

In the Begin component, the opening greeting message for customer service can be configured, for example:

Hi! I'm your assistant. 

Image

2.2 Add a Categorize component

The Categorize component uses a Large Language Model (LLM) for intent recognition. It classifies user inputs and routes them to the appropriate processing workflows based on the category’s name, description, and provided examples.
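The routing behaviour can be sketched with a stubbed classifier standing in for the LLM. The category names and cue phrases below are illustrative; the real Categorize component prompts an LLM with each category's name, description, and examples rather than matching keywords:

```python
# Hypothetical cue phrases per branch, for illustration only.
CATEGORIES = {
    "feature_comparison": ["compare", "difference", "vs"],
    "usage_guide": ["how do i", "how to", "manual"],
    "installation_booking": ["install", "appointment", "schedule"],
}

def categorize(query):
    # stand-in for the LLM call: route on the first matching cue
    q = query.lower()
    for name, cues in CATEGORIES.items():
        if any(cue in q for cue in cues):
            return name
    return "usage_guide"  # fallback branch when nothing matches

print(categorize("What is the difference between model A and model B?"))
```

Whatever the classifier returns selects the downstream workflow branch, which is exactly the wiring the canvas expresses visually.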

Image

2.3 Build a product feature comparison workflow

The Retrieval component connects to the "Product Information" knowledge base to fetch content relevant to the user’s query, which is then passed to the Agent component to generate a response.

Image

Add a Retrieval component named "Feature Comparison Knowledge Base" and link it to the "Product Information" knowledge base.

Image

Add an Agent component after the Retrieval component, name it "Feature Comparison Agent," and configure the System Prompt as follows:

## Role
You are a product specification comparison assistant.
## Goal
Help the user compare two or more products based on their features and specifications. Provide clear, accurate, and concise comparisons to assist the user in making an informed decision.
---
## Instructions
- Start by confirming the product models or options the user wants to compare.
- If the user has not specified the models, politely ask for them.
- Present the comparison in a structured way (e.g., bullet points or a table format if supported).
- Highlight key differences such as size, capacity, performance, energy efficiency, and price if available.
- Maintain a neutral and professional tone without suggesting unnecessary upselling.
---

Configure the User prompt:

User's query is /(Begin Input) sys.query 

Schema is /(Feature Comparison Knowledge Base) formalized_content

After configuring the Agent component, the result is as follows:

Image

2.4 Build a product user guide workflow

The Retrieval component queries the "User Guide" knowledge base for content relevant to the user’s question, then passes the results to the Agent component to formulate a response.

Image

Add a Retrieval component named "Usage Guide Knowledge Base" and link it to the "User Guide" knowledge base.

Image

Add an Agent component after the Retrieval component, name it "Usage Guide Agent," and configure its System Prompt as follows:

## Role
You are a product usage guide assistant.
## Goal
Provide clear, step-by-step instructions to help the user set up, operate, and maintain their product. Answer questions about functions, settings, and troubleshooting.
---
## Instructions
- If the user asks about setup, provide easy-to-follow installation or configuration steps.
- If the user asks about a feature, explain its purpose and how to activate it.
- For troubleshooting, suggest common solutions first, then guide through advanced checks if needed.
- Keep the response simple, clear, and actionable for a non-technical user.
---

Configure the User prompt:

User's query is /(Begin Input) sys.query 

Schema is / (Usage Guide Knowledge Base) formalized_content

After configuring the Agent component, the result is as follows:

Image

2.5 Build an installation booking assistant

The Agent engages in a multi-turn dialogue with the user to collect three key pieces of information: contact number, installation time, and installation address. Create an Agent component named "Installation Booking Agent" and configure its System Prompt as follows:

# Role
You are an Installation Booking Assistant.
## Goal
Collect the following three pieces of information from the user
1. Contact Number
2. Preferred Installation Time
3. Installation Address
Once all three are collected, confirm the information and inform the user that a technician will contact them later by phone.
## Instructions
1. **Check if all three details** (Contact Number, Preferred Installation Time, Installation Address) have been provided.
2. **If some details are missing**, acknowledge the ones provided and only ask for the missing information.
3. Do **not repeat** the full request once some details are already known.
4. Once all three details are collected, summarize and confirm them with the user.

Configure the User prompt:

User's query is /(Begin Input) sys.query 

After configuring the Agent component, the result is as follows:

Image
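The slot-filling behaviour the prompt asks for can be mimicked in plain code for testing. The field names and regex patterns below are hypothetical; in the workflow itself the LLM tracks which of the three details are still missing:

```python
import re

REQUIRED = ("contact_number", "install_time", "address")

def extract_slots(text, slots):
    # hypothetical regex-based extraction, for illustration only
    if m := re.search(r"\b\d{3}[- ]?\d{3,4}[- ]?\d{4}\b", text):
        slots["contact_number"] = m.group()
    if m := re.search(r"\b(tomorrow|monday|tuesday|\d{1,2}(am|pm))\b", text, re.I):
        slots["install_time"] = m.group()
    if m := re.search(r"\bat (.+)$", text):
        slots["address"] = m.group(1)
    return slots

def missing(slots):
    # the agent only asks for what has not been provided yet
    return [s for s in REQUIRED if s not in slots]

slots = extract_slots("You can reach me on 555-123-4567 tomorrow", {})
print(missing(slots))  # only the address is still missing
```

Once `missing(slots)` is empty, the agent summarizes and confirms, mirroring step 4 of the System Prompt.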

If user information needs to be registered, an HTTP Request component can be connected after the Agent component to transmit the data to platforms such as Google Sheets or Notion. Developers may implement this according to their specific requirements; this blog article does not cover implementation details.

Image

2.6 Add a reply message component

For these three workflows, a single Message component is used to receive the output from the Agent components, which then displays the processed results to the user.

Image

2.7 Save and test

Click Save → Run → View Execution Result. When inquiring about product models and features, the system correctly returns a comparison:

Image

When asked about usage instructions, the system provides accurate guidance:

Image

When scheduling an installation, the system collects and confirms all necessary information:

Image

Summary

This use case can also be implemented using an Agent-based workflow, which offers the advantage of flexibly handling complex problems. However, since Agents actively engage in planning and reflection, they often significantly increase response times, leading to a diminished customer experience. As such, this approach is not well suited to scenarios like e-commerce after-sales customer service, where high responsiveness and relatively straightforward tasks are required. For applications involving complex issues, we have previously shared the Deep Research multi-agent framework. Related templates are available in our template library.

Image

The customer service workflow presented in this article is designed for e-commerce, yet this domain offers many more scenarios suitable for workflow automation—such as user review analysis and personalized email campaigns—which have not been covered here. By following the practical guidelines provided, you can also easily adapt this approach to other customer service contexts. We encourage you to build such applications using RAGFlow. Reinventing customer service with large language models moves support beyond “mechanical responses,” elevating capabilities from mere “retrieval and matching” to “cognitive reasoning.” Through deep understanding and real-time knowledge generation, it delivers an unprecedented experience that truly “understands human language,” thereby redefining the upper limits of intelligent service and transforming support into a core value engine for businesses.

· 7 min read

Workflow overview

This tutorial shows how to create a SQL Assistant workflow that enables natural language queries for SQL databases. Non-technical users like marketers and product managers can use this tool to query business data independently, reducing the need for data analysts. It can also serve as a teaching aid for SQL in schools and coding courses. The finished workflow operates as follows:

Image

The database schema, field descriptions, and SQL examples are stored as knowledge bases in RAGFlow. Upon user queries, the system retrieves relevant information from these sources and passes it to an Agent, which generates SQL statements. These statements are then executed by a SQL Executor component to return the query results.
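The overall loop can be sketched as glue code. Component and function names here are illustrative, not RAGFlow APIs, and the LLM call is stubbed with a fixed statement; a real deployment would retrieve from the three knowledge bases and prompt an actual model:

```python
import sqlite3

def retrieve_context(query, knowledge_bases):
    # stand-in for the Retrieval components: concatenate relevant chunks
    return "\n".join(knowledge_bases)

def llm_generate_sql(query, context):
    # stand-in for the Agent component; a real LLM would condition on
    # the schema, field descriptions, and example question-SQL pairs
    return "SELECT username FROM users WHERE id = 1;"

def sql_executor(sql, conn):
    # stand-in for the SQL Executor component
    return conn.execute(sql).fetchall()

# toy database matching the Schema.txt excerpt
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

context = retrieve_context("who is user 1?", ["CREATE TABLE users (id, username);"])
rows = sql_executor(llm_generate_sql("who is user 1?", context), conn)
print(rows)  # -> [('alice',)]
```

The value of the three knowledge bases is entirely in `retrieve_context`: the better the schema, descriptions, and examples surfaced there, the more reliable the generated SQL.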

Procedure

1. Create three knowledge bases

1.1 Prepare dataset files

You can download the sample datasets from Hugging Face Datasets.

The following are the predefined example files:

  1. Schema.txt
```sql
CREATE TABLE `users` (
  `id` INT NOT NULL AUTO_INCREMENT,
  `username` VARCHAR(50) NOT NULL,
  `password` VARCHAR(50) NOT NULL,
  `email` VARCHAR(100),
  `mobile` VARCHAR(20),
  `create_time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `update_time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  UNIQUE KEY `uk_username` (`username`),
  UNIQUE KEY `uk_email` (`email`),
  UNIQUE KEY `uk_mobile` (`mobile`)
);
```

...

Note: When defining schema fields, avoid special characters such as underscores, as they can cause errors in SQL statements generated by the LLM.

  2. Question to SQL.csv

What are the names of all the Cities in Canada?
SELECT geo_name, id FROM data_commons_public_data.cybersyn.geo_index WHERE iso_name ilike '%can%';

What is average Fertility Rate measure of Canada in 2002 ?
SELECT variable_name, avg(value) as average_fertility_rate FROM data_commons_public_data.cybersyn.timeseries WHERE variable_name = 'Fertility Rate' and geo_id = 'country/CAN' and date >= '2002-01-01' and date < '2003-01-01' GROUP BY 1;

What 5 countries have the highest life expectancy ?
SELECT geo_name, value FROM data_commons_public_data.cybersyn.timeseries join data_commons_public_data.cybersyn.geo_index ON timeseries.geo_id = geo_index.id WHERE variable_name = 'Life Expectancy' and date = '2020-01-01' ORDER BY value desc limit 5;


...
  3. Database Description EN.txt
### Users Table (users)
The users table stores user information for the website or application. Below are the definitions of each column in this table:
- `id`: INTEGER, an auto-incrementing field that uniquely identifies each user (primary key). It automatically increases with every new user added, guaranteeing a distinct ID for every user.
- `username`: VARCHAR, stores the user’s login name; this value is typically the unique identifier used during authentication.
- `password`: VARCHAR, holds the user’s password; for security, the value must be encrypted (hashed) before persistence.
- `email`: VARCHAR, stores the user’s e-mail address; it can serve as an alternate login credential and is used for notifications or password-reset flows.
- `mobile`: VARCHAR, stores the user’s mobile phone number; it can be used for login, receiving SMS notifications, or identity verification.
- `create_time`: TIMESTAMP, records the timestamp when the user account was created; defaults to the current timestamp.
- `update_time`: TIMESTAMP, records the timestamp of the last update to the user’s information; automatically refreshed to the current timestamp on every update.

...

1.2 Create knowledge bases in RAGFlow

Schema knowledge base

Create a knowledge base titled "Schema" and upload the file Schema.txt.

Tables in the database vary in length, each ending with a semicolon (;).

CREATE TABLE `users` (
`id` INT NOT NULL AUTO_INCREMENT,
`username` VARCHAR(50) NOT NULL,
`password` VARCHAR(50) NOT NULL,
...
UNIQUE KEY `uk_mobile` (`mobile`)
);

CREATE TABLE `products` (
`id` INT NOT NULL AUTO_INCREMENT,
`name` VARCHAR(100) NOT NULL,
`description` TEXT,
`price` DECIMAL(10, 2) NOT NULL,
`stock` INT NOT NULL,
...
FOREIGN KEY (`merchant_id`) REFERENCES `merchants` (`id`)
);

CREATE TABLE `merchants` (
`id` INT NOT NULL AUTO_INCREMENT,
`name` VARCHAR(100) NOT NULL,
`description` TEXT,
`email` VARCHAR(100),
...
UNIQUE KEY `uk_mobile` (`mobile`)
);

To isolate each table as a standalone chunk without overlapping content, configure the knowledge base parameters as follows:

  • Chunking Method: General
  • Chunk Size: 2 tokens (minimum size for isolation)
  • Delimiter: Semicolon (;)

RAGFlow will then parse and generate chunks according to this workflow:
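The effect of this configuration can be approximated in a few lines: splitting the file on the semicolon delimiter yields one chunk per CREATE TABLE statement. This is a simplified sketch of the behavior, not RAGFlow's actual parser:

```python
def chunk_by_delimiter(text: str, delimiter: str = ";") -> list[str]:
    """Split text on the delimiter so each DDL statement becomes one chunk."""
    chunks = [part.strip() for part in text.split(delimiter)]
    return [c + delimiter for c in chunks if c]  # drop empties, restore ';'

schema = """
CREATE TABLE `users` (`id` INT NOT NULL AUTO_INCREMENT);
CREATE TABLE `products` (`id` INT NOT NULL AUTO_INCREMENT);
"""
chunks = chunk_by_delimiter(schema)
# Each table definition is now an isolated, non-overlapping chunk.
```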

Below is a preview of the parsed results from Schema.txt:

We now validate the retrieved results through retrieval testing:

Question to SQL knowledge base

Create a new knowledge base titled "Question to SQL" and upload the file "Question to SQL.csv".

Set the chunking method to Q&A, then parse "Question to SQL.csv" to preview the results.

We now validate the retrieved results through retrieval testing:

Database Description knowledge base

Create a new knowledge base titled "Database Description" and upload the file "Database_Description_EN.txt".

Configuration (Same as Schema Knowledge Base):

  • Chunking Method: General
  • Chunk Size: 2 tokens (minimum size for isolation)
  • Delimiter: Semicolon (;)

Below is a preview of the parsed Database_Description_EN.txt following this configuration:

We now validate the retrieved results through retrieval testing:

Note: The three knowledge bases are maintained and queried separately. The Agent component consolidates results from all sources before producing outputs.

2. Orchestrate the workflow

2.1 Create a workflow application

Once created successfully, the Begin component automatically appears on the canvas.

You can configure a welcome message in the Begin component. For example:

Hi! I'm your SQL assistant, what can I do for you?

2.2 Configure three Retrieval components

Add three parallel Retrieval components after the Begin component, named as follows:

  • Schema
  • Question to SQL
  • Database Description

Configure each Retrieval component:
  1. Query variable: sys.query
  2. Knowledge base selection: Select the knowledge base whose name matches the current component's name.
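Conceptually, the three parallel Retrieval components fan the same sys.query out to all three knowledge bases, and their results are later consolidated. A rough stand-alone sketch with a stubbed retriever (illustrative only, not the RAGFlow API):

```python
from concurrent.futures import ThreadPoolExecutor

def retrieve(kb_name: str, query: str) -> list[str]:
    """Stub retriever: in RAGFlow this is a Retrieval component bound to
    one knowledge base; here we fake the stores for illustration."""
    fake_store = {
        "Schema": ["CREATE TABLE `users` (...);"],
        "Question to SQL": ["Q: cities in Canada / SQL: SELECT ..."],
        "Database Description": ["The users table stores user information."],
    }
    return fake_store.get(kb_name, [])

def fan_out(query: str) -> dict[str, list[str]]:
    """Query the three knowledge bases in parallel, as the canvas does."""
    kbs = ["Schema", "Question to SQL", "Database Description"]
    with ThreadPoolExecutor(max_workers=3) as pool:
        results = pool.map(lambda kb: (kb, retrieve(kb, query)), kbs)
    return dict(results)

context = fan_out("Which users registered in 2024?")
```

The consolidated `context` dict is what the downstream Agent component sees from its three upstream connections.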

2.3 Configure the Agent component

Add an Agent component named 'SQL Generator' after the Retrieval components, connecting all three to it.

Write System Prompt:

### ROLE
You are a Text-to-SQL assistant.
Given a relational database schema and a natural-language request, you must produce a **single, syntactically-correct MySQL query** that answers the request.
Return **nothing except the SQL statement itself**—no code fences, no commentary, no explanations, no comments, no trailing semicolon if not required.

### EXAMPLES
-- Example 1
User: List every product name and its unit price.
SQL:
SELECT name, unit_price FROM Products;

-- Example 2
User: Show the names and emails of customers who placed orders in January 2025.
SQL:
SELECT DISTINCT c.name, c.email
FROM Customers c
JOIN Orders o ON o.customer_id = c.id
WHERE o.order_date BETWEEN '2025-01-01' AND '2025-01-31';

-- Example 3
User: How many orders have a status of "Completed" for each month in 2024?
SQL:
SELECT DATE_FORMAT(order_date, '%Y-%m') AS month,
COUNT(*) AS completed_orders
FROM Orders
WHERE status = 'Completed'
AND YEAR(order_date) = 2024
GROUP BY month
ORDER BY month;

-- Example 4
User: Which products generated at least $10,000 in total revenue?
SQL:
SELECT p.id, p.name, SUM(oi.quantity * oi.unit_price) AS revenue
FROM Products p
JOIN OrderItems oi ON oi.product_id = p.id
GROUP BY p.id, p.name
HAVING revenue >= 10000
ORDER BY revenue DESC;

### OUTPUT GUIDELINES
1. Think through the schema and the request.
2. Write **only** the final MySQL query.
3. Do **not** wrap the query in back-ticks or markdown fences.
4. Do **not** add explanations, comments, or additional text—just the SQL.
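Even with these guidelines, models occasionally wrap output in fences or prepend a label, so a defensive post-processing step before handing the SQL to the executor is prudent. A minimal sketch (a hypothetical helper, not part of RAGFlow):

```python
import re

def clean_llm_sql(raw: str) -> str:
    """Strip triple-backtick fences and a leading 'SQL:' label the model
    may have added despite the output guidelines."""
    text = raw.strip()
    fence = re.match(r"^```(?:sql)?\s*(.*?)\s*```$", text,
                     re.DOTALL | re.IGNORECASE)
    if fence:
        text = fence.group(1).strip()
    if text.lower().startswith("sql:"):
        text = text[4:].strip()
    return text
```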

Write User Prompt:

User's query: /(Begin Input) sys.query  
Schema: /(Schema) formalized_content
Samples about question to SQL: /(Question to SQL) formalized_content
Description about meanings of tables and files: /(Database Description) formalized_content

After inserting variables, the populated result appears as follows:

2.4 Configure the ExeSQL component

Append an ExeSQL component named "SQL Executor" after the SQL Generator.

Configure the database for the SQL Executor component, specifying that its Query input comes from the output of the SQL Generator.

2.5 Configure the Message component

Append a Message component to the SQL Executor.

Insert variables into the Messages field to enable the message component to display the output of the SQL Executor (formalized_content):

2.6 Save and test

Click Save → Run → Enter a natural language question → View execution results.

Finale

Finally, like current Copilot technologies, NL2SQL cannot achieve complete accuracy. For standardized processing of structured data, we recommend consolidating its operations into specific APIs, then encapsulating these APIs as MCP (Model Context Protocol) servers for RAGFlow. We will demonstrate this approach in a forthcoming blog.

· 13 min read

Deep Research: The defining capability of the Agent era

The year 2025 is hailed as the dawn of Agent adoption. Among the many possibilities this unlocks, Deep Research—and the applications built upon it—stands out as especially significant. Why? Using Agentic RAG and its reflection mechanism, Deep Research enables large language models to reason deeply over users’ proprietary data, which is key to letting Agents tackle more advanced tasks: whether assisting creative writing or supporting decision-making across industries, such tasks rely on Deep Research. Deep Research is a natural step forward from RAG, enhanced by Agents. Unlike basic RAG, it explores data at a deeper level; like RAG, it can work on its own or serve as the base for industry-specific Agents. A Deep Research workflow typically follows this sequence:

  1. Decomposition & Planning: The large language model breaks down the user’s query and devises a plan.
  2. Multi-Source Retrieval: The query is sent to multiple data sources, including internal RAG and external web searches.
  3. Reflection & Refinement: The model reviews the retrieved information, reflects on it, summarizes key points, and adjusts the plan as needed.
  4. Iteration & Output: After several iterations, a data-specific chain of thought is formed to generate the final answer or report.
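The four stages above amount to a plan–retrieve–reflect loop. A schematic sketch with stubbed LLM and retrieval calls (all function bodies are placeholders standing in for model and search calls):

```python
def plan(query):          # Stage 1: decomposition & planning (LLM call stub)
    return [f"sub-question about: {query}"]

def retrieve_all(task):   # Stage 2: multi-source retrieval (RAG + web stub)
    return [f"evidence for {task}"]

def reflect(evidence, remaining):  # Stage 3: reflection; may extend the plan
    return [], evidence            # no new tasks in this stub

def deep_research(query, max_iters=3):
    tasks, findings = plan(query), []
    for _ in range(max_iters):     # Stage 4: iterate until the plan is done
        if not tasks:
            break
        task = tasks.pop(0)
        evidence = retrieve_all(task)
        new_tasks, notes = reflect(evidence, tasks)
        tasks.extend(new_tasks)    # the plan is adjusted as needed
        findings.extend(notes)
    return findings                # fed to final answer/report generation

report_input = deep_research("Why has company performance declined?")
```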

Image

Built-in Deep Research was implemented as early as RAGFlow v0.18.0, although at that stage it primarily served as a demo to validate and explore deep reasoning’s potential in enterprise settings. Across the industry, Deep Research implementations generally fall into two categories:

  • No-Code Workflow Orchestration: Using a visual workflow interface with predefined workflows to implement Deep Research.
  • Dedicated Agent Libraries or Frameworks: Many current solutions follow this approach (see [Reference 1] for details).

RAGFlow 0.20.0, a unified RAG and Agent engine supporting both Workflow and Agentic modes with seamless orchestration on a single canvas, can implement Deep Research through either approach. However, implementing Deep Research as a Workflow is merely functional; its interface looks like this:

Image

This gives rise to two main problems:

  1. Overly Complex and Unintuitive Orchestration: If the basic Deep Research template is already this complicated, a full application would become a tangled web of workflows that are hard to maintain and expand.
  2. Better Suited to Agentic Methods: Deep Research naturally relies on dynamic problem breakdown and decision-making, with control steps driven by algorithms. Workflow drag-and-drop interfaces only handle simple loops, lacking the flexibility needed for advanced control.

RAGFlow 0.20.0 offers a comprehensive Agent engine designed to help enterprises build production-ready Agents using no-code tools. As the key Agent template, Deep Research strikes a careful balance between simplicity and flexibility. Compared to other solutions, RAGFlow highlights these strengths:

  • Agentic Execution, No-Code Driven: A fully customisable application template—not just an SDK or runtime—that uses Agentic mechanisms while remaining easy to access through a no-code platform.
  • Human Intervention (Coming Soon): Although Deep Research depends on LLM-generated plans, which can feel like a black box, RAGFlow’s template will support manual oversight to introduce certainty—a vital feature for enterprise-grade Agents.
  • Business-Focused and Outcome-Oriented:

Developers can customise the Deep Research Agent’s structure and tools, such as configuring internal knowledge bases for enterprise data retrieval, to meet specific business needs. Full transparency in the Agent’s workflow—including its plans and execution results—allows timely optimisation based on clear, actionable insights.

Practical Guide to Setting Up Deep Research

We use a Multi-Agent architecture [Reference 2], carefully defining each Agent’s role and responsibilities through thorough Prompt Engineering [Reference 4] and task decomposition principles [Reference 6]:

  • Lead Agent: Coordinates the Deep Research Agent, handling task planning, reflection, and delegating to Subagents, while keeping track of all workflow progress.
  • Web Search Specialist Subagent: Acts as the information retrieval expert, querying search engines, assessing results, and returning URLs of the best-quality sources.
  • Deep Content Reader Subagent: Extracts and organises web content from the URLs provided by the Search Specialist, preparing refined material for report drafting.
  • Research Synthesizer Subagent: Generates professional, consultancy-grade deep-dive reports according to the Lead Agent’s instructions.

Model selection

  • Lead Agent: Prioritizes models with strong reasoning capabilities, such as DeepSeek-R1, Qwen-3, Kimi-2, ChatGPT-o3, Claude-4, or Gemini-2.5 Pro.
  • Subagents: Optimized for execution efficiency and quality, balancing reasoning speed with output reliability. Context window length is also a key criterion based on their specific roles.

Temperature setting

Given the fact-driven nature of this application, we set the temperature parameter to 0.1 across all models to ensure deterministic, grounded outputs.

Deep research lead agent

Model Selection: Qwen-Max

Excerpts from Core System Prompt:

  1. The prompt directs the Deep Research Agent's workflow and task delegation to Subagents, greatly enhancing efficiency and flexibility compared to traditional workflow orchestration.
<execution_framework>
**Stage 1: URL Discovery**
- Deploy Web Search Specialist to identify 5 premium sources
- Ensure comprehensive coverage across authoritative domains
- Validate search strategy matches research scope

**Stage 2: Content Extraction**
- Deploy Content Deep Reader to process 5 premium URLs
- Focus on structured extraction with quality assessment
- Ensure 80%+ extraction success rate

**Stage 3: Strategic Report Generation**
- Deploy Research Synthesizer with detailed strategic analysis instructions
- Provide specific analysis framework and business focus requirements
- Generate comprehensive McKinsey-style strategic report (~2000 words)
- Ensure multi-source validation and C-suite ready insights
</execution_framework>
  2. Dynamically create task execution plans and carry out BFS or DFS searches [Reference 3]. While traditional workflows struggle to orchestrate BFS/DFS logic, RAGFlow’s Agent achieves this effortlessly through prompt engineering.
<research_process>
...
**Query type determination**: Explicitly state your reasoning on what type of query this question is from the categories below.
...
**Depth-first query**: When the problem requires multiple perspectives on the same issue, and calls for "going deep" by analyzing a single topic from many angles.
...
**Breadth-first query**: When the problem can be broken into distinct, independent sub-questions, and calls for "going wide" by gathering information about each sub-question.
...
**Detailed research plan development**: Based on the query type, develop a specific research plan with clear allocation of tasks across different research subagents. Ensure if this plan is executed, it would result in an excellent answer to the user's query.
</research_process>
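The depth-first vs breadth-first distinction maps onto how sub-questions are expanded. A toy sketch of the two traversal orders over a hypothetical research-task tree (illustrative only; in RAGFlow the plan itself is produced by the LLM):

```python
from collections import deque

# A toy task tree: each node is a research question with sub-questions.
tree = {
    "market outlook": ["competitors", "regulation"],
    "competitors": ["pricing"],
    "regulation": [],
    "pricing": [],
}

def bfs_plan(root):
    """Breadth-first: cover distinct, independent sub-questions first."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(tree[node])
    return order

def dfs_plan(root):
    """Depth-first: drill into one angle fully before moving on."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(reversed(tree[node]))
    return order
```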

Web search specialist subagent

Model Selection: Qwen-Plus

Excerpts from Core System Prompt Design:

  1. Role Definition:
You are a Web Search Specialist working as part of a research team. Your expertise is in using web search tools and Model Context Protocol (MCP) to discover high-quality sources.

**CRITICAL: YOU MUST USE WEB SEARCH TOOLS TO EXECUTE YOUR MISSION**

<core_mission>
Use web search tools (including MCP connections) to discover and evaluate premium sources for research. Your success depends entirely on your ability to execute web searches effectively using available search tools.

**CRITICAL OUTPUT CONSTRAINT**: You MUST provide exactly 5 premium URLs - no more, no less. This prevents attention fragmentation in downstream analysis.
</core_mission>
  2. Design the search strategy:
<process>
1. **Plan**: Analyze the research task and design search strategy
2. **Search**: Execute web searches using search tools and MCP connections
3. **Evaluate**: Assess source quality, credibility, and relevance
4. **Prioritize**: Rank URLs by research value (High/Medium/Low) - **SELECT TOP 5 ONLY**
5. **Deliver**: Provide structured URL list with exactly 5 premium URLs for Content Deep Reader

**MANDATORY**: Use web search tools for every search operation. Do NOT attempt to search without using the available search tools.
**MANDATORY**: Output exactly 5 URLs to prevent attention dilution in Lead Agent processing.
</process>
  3. Search Strategies and How to Use Tools Like Tavily
<search_strategy>
**MANDATORY TOOL USAGE**: All searches must be executed using web search tools and MCP connections. Never attempt to search without tools.
**MANDATORY URL LIMIT**: Your final output must contain exactly 5 premium URLs to prevent Lead Agent attention fragmentation.

- Use web search tools with 3-5 word queries for optimal results
- Execute multiple search tool calls with different keyword combinations
- Leverage MCP connections for specialized search capabilities
- Balance broad vs specific searches based on search tool results
- Diversify sources: academic (30%), official (25%), industry (25%), news (20%)
- Execute parallel searches when possible using available search tools
- Stop when diminishing returns occur (typically 8-12 tool calls)
- **CRITICAL**: After searching, ruthlessly prioritize to select only the TOP 5 most valuable URLs

**Search Tool Strategy Examples:**
* **Broad exploration**: Use search tools → "AI finance regulation" → "financial AI compliance" → "automated trading rules"
* **Specific targeting**: Use search tools → "SEC AI guidelines 2024" → "Basel III algorithmic trading" → "CFTC machine learning"
* **Geographic variation**: Use search tools → "EU AI Act finance" → "UK AI financial services" → "Singapore fintech AI"
* **Temporal focus**: Use search tools → "recent AI banking regulations" → "2024 financial AI updates" → "emerging AI compliance"
</search_strategy>
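The ranking-and-capping step at the end of this strategy can be sketched as follows, with `scored_results` standing in for the (url, score) pairs accumulated across search tool calls (a hypothetical helper, not the actual tool interface):

```python
def select_top_urls(scored_results, limit=5):
    """Rank candidate URLs by research value and keep exactly `limit`.

    The hard cap mirrors the prompt's 5-URL constraint, which prevents
    attention fragmentation in downstream Lead Agent processing.
    """
    seen, ranked = set(), []
    for url, score in sorted(scored_results, key=lambda r: r[1], reverse=True):
        if url not in seen:            # deduplicate across search batches
            seen.add(url)
            ranked.append(url)
        if len(ranked) == limit:
            break
    return ranked

urls = select_top_urls([
    ("https://example.org/a", 0.9), ("https://example.org/b", 0.7),
    ("https://example.org/a", 0.6), ("https://example.org/c", 0.8),
])
```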

Deep content reader subagent

Model Selection: Moonshot-v1-128k

Core System Prompt Design Excerpts:

  1. Role Definition Framework
You are a Content Deep Reader working as part of a research team. Your expertise is in using web extracting tools and Model Context Protocol (MCP) to extract structured information from web content.

**CRITICAL: YOU MUST USE WEB EXTRACTING TOOLS TO EXECUTE YOUR MISSION**

<core_mission>
Use web extracting tools (including MCP connections) to extract comprehensive, structured content from URLs for research synthesis. Your success depends entirely on your ability to execute web extractions effectively using available tools.
</core_mission>
  2. Agent Planning and Web Extraction Tool Utilization
<process>
1. **Receive**: Process `RESEARCH_URLS` (5 premium URLs with extraction guidance)
2. **Extract**: Use web extracting tools and MCP connections to get complete webpage content and full text
3. **Structure**: Parse key information using defined schema while preserving full context
4. **Validate**: Cross-check facts and assess credibility across sources
5. **Organize**: Compile comprehensive `EXTRACTED_CONTENT` with full text for Research Synthesizer

**MANDATORY**: Use web extracting tools for every extraction operation. Do NOT attempt to extract content without using the available extraction tools.

**TIMEOUT OPTIMIZATION**: Always check extraction tools for timeout parameters and set generous values:
- **Single URL**: Set timeout=45-60 seconds
- **Multiple URLs (batch)**: Set timeout=90-180 seconds
- **Example**: `extract_tool(url="https://example.com", timeout=60)` for single URL
- **Example**: `extract_tool(urls=["url1", "url2", "url3"], timeout=180)` for multiple URLs
</process>

<processing_strategy>
**MANDATORY TOOL USAGE**: All content extraction must be executed using web extracting tools and MCP connections. Never attempt to extract content without tools.

- **Priority Order**: Process all 5 URLs based on extraction focus provided
- **Target Volume**: 5 premium URLs (quality over quantity)
- **Processing Method**: Extract complete webpage content using web extracting tools and MCP
- **Content Priority**: Full text extraction first using extraction tools, then structured parsing
- **Tool Budget**: 5-8 tool calls maximum for efficient processing using web extracting tools
- **Quality Gates**: 80% extraction success rate for all sources using available tools
</processing_strategy>
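The timeout guidance above can be wrapped in a small helper that picks a budget based on batch size. The `extract_tool` stub below stands in for the real MCP extraction tool, whose exact interface is assumed here:

```python
def extract_tool(urls, timeout):
    """Stub for the MCP web-extraction tool (assumed interface)."""
    return {u: f"<content of {u}, fetched within {timeout}s>" for u in urls}

def extract_with_budget(urls):
    """Pick a generous timeout per the prompt's guidance:
    ~60s for a single URL, up to 180s for a batch."""
    timeout = 60 if len(urls) == 1 else 180
    return extract_tool(urls, timeout=timeout)

extracted = extract_with_budget(["https://example.com/report"])
```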

Research synthesizer subagent

Model Selection: Moonshot-v1-128k

Special Note: The Subagent handling final report generation must use a model with a very long context window. This is essential for processing extensive context and producing thorough reports; models with limited context risk truncating information, resulting in shorter reports. Other potential model options include, but are not limited to:

  • Qwen-Long (10M tokens)
  • Claude 4 Sonnet (200K tokens)
  • Gemini 2.5 Flash (1M tokens)

Excerpts from Core Prompt Design:
  1. Role Definition Design
You are a Research Synthesizer working as part of a research team. Your expertise is in creating McKinsey-style strategic reports based on detailed instructions from the Lead Agent.

**YOUR ROLE IS THE FINAL STAGE**: You receive extracted content from websites AND detailed analysis instructions from Lead Agent to create executive-grade strategic reports.

**CRITICAL: FOLLOW LEAD AGENT'S ANALYSIS FRAMEWORK**: Your report must strictly adhere to the `ANALYSIS_INSTRUCTIONS` provided by the Lead Agent, including analysis type, target audience, business focus, and deliverable style.

**ABSOLUTELY FORBIDDEN**:
- Never output raw URL lists or extraction summaries
- Never output intermediate processing steps or data collection methods
- Always output a complete strategic report in the specified format

<core_mission>
**FINAL STAGE**: Transform structured research outputs into strategic reports following Lead Agent's detailed instructions.

**IMPORTANT**: You receive raw extraction data and intermediate content - your job is to TRANSFORM this into executive-grade strategic reports. Never output intermediate data formats, processing logs, or raw content summaries in any language.
</core_mission>
  2. Autonomous Task Execution
<process>
1. **Receive Instructions**: Process `ANALYSIS_INSTRUCTIONS` from Lead Agent for strategic framework
2. **Integrate Content**: Access `EXTRACTED_CONTENT` with FULL_TEXT from 5 premium sources
- **TRANSFORM**: Convert raw extraction data into strategic insights (never output processing details)
- **SYNTHESIZE**: Create executive-grade analysis from intermediate data
3. **Strategic Analysis**: Apply Lead Agent's analysis framework to extracted content
4. **Business Synthesis**: Generate strategic insights aligned with target audience and business focus
5. **Report Generation**: Create executive-grade report following specified deliverable style

**IMPORTANT**: Follow Lead Agent's detailed analysis instructions. The report style, depth, and focus should match the provided framework.
</process>
  3. Structure of the generated report
<report_structure>
**Executive Summary** (400 words)
- 5-6 core findings with strategic implications
- Key data highlights and their meaning
- Primary conclusions and recommended actions

**Analysis** (1200 words)
- Context & Drivers (300w): Market scale, growth factors, trends
- Key Findings (300w): Primary discoveries and insights
- Stakeholder Landscape (300w): Players, dynamics, relationships
- Opportunities & Challenges (300w): Prospects, barriers, risks

**Recommendations** (400 words)
- 3-4 concrete, actionable recommendations
- Implementation roadmap with priorities
- Success factors and risk mitigation
- Resource allocation guidance

**Examples:**

**Executive Summary Format:**

Key Finding 1: [FACT] 73% of major banks now use AI for fraud detection, representing 40% growth from 2023

  • Strategic Implication: AI adoption has reached critical mass in security applications
  • Recommendation: Financial institutions should prioritize AI compliance frameworks now

...

</report_structure>

Upcoming versions

The current RAGFlow 0.20.0 release does not yet support human intervention in Deep Research execution, but this feature is planned for future updates. It is crucial for making Deep Research production-ready, as it adds certainty and improves accuracy [Reference 5] in what would otherwise be uncertain automated processes. Manual oversight will be vital for enterprise-grade Deep Research applications.

Image

We warmly invite everyone to stay tuned and star RAGFlow at https://github.com/infiniflow/ragflow

Bibliography

  1. Awesome Deep Research https://github.com/DavidZWZ/Awesome-Deep-Research
  2. How we built our multi-agent research system https://www.anthropic.com/engineering/built-multi-agent-research-system
  3. Anthropic Cookbook https://github.com/anthropics/anthropic-cookbook
  4. State-Of-The-Art Prompting For AI Agents https://youtu.be/DL82mGde6wo?si=KQtOEiOkmKTpC_1E
  5. From Language Models to Language Agents https://ysymyth.github.io/papers/from_language_models_to_language_agents.pdf
  6. Agentic Design Patterns Part 5, Multi-Agent Collaboration https://www.deeplearning.ai/the-batch/agentic-design-patterns-part-5-multi-agent-collaboration/

· 8 min read

1. From Workflow to Agentic Workflow

After a long wait, RAGFlow 0.20.0 has finally been released—a milestone update that completes the bigger picture of RAG/Agent. A year ago, RAGFlow introduced Agent functionality, but it only supported manually managed Workflows and lacked Agentic Workflow capabilities. By RAGFlow's definition, a true Agent requires both: Workflows for human-defined tasks and Agentic Workflows powered by LLM-driven automation. Anthropic’s 2024 article "Building Effective Agents" emphasized this distinction, noting that Workflows continue to be the primary way Agents are used. Now, in 2025, as LLMs become more advanced, Agentic Workflows are enabling more impressive use cases.

Image

Ideally, fully LLM-driven Agentic Workflows are the ultimate goal for most Agent applications. However, due to current limitations of LLMs, they introduce unpredictability and a lack of control—issues that are especially problematic in enterprise settings. On the other hand, traditional Workflows take the opposite approach: using low-code platforms where every variable, condition, and loop is explicitly defined. This allows non-technical business users to effectively “program” based on their understanding of the logic. While this ensures predictability, it often leads to overly complex, tangled workflows that are hard to maintain and prone to misuse. More importantly, it makes it difficult to properly break down and manage tasks. Therefore, the long-term solution requires both Agentic and manual Workflows working together in harmony. Only this unified approach can truly meet the demands of enterprise-level Agents.

With full Agent capabilities now in place, RAGFlow has become a more enterprise-ready, platform-level LLM engine. In the enterprise ecosystem, RAG occupies a role similar to traditional databases, while Agents serve as the specific applications—yet there are important differences. First, RAG focuses on leveraging unstructured data rather than structured datasets. Second, the interaction between RAG and Agents is much more frequent and intensive compared to typical application-database relationships, because Agents need real-time, precise context to ensure their actions align closely with user intent. RAG plays a vital role in providing this context. For these reasons, completing the Agent capabilities is key to RAGFlow’s evolution.

Image

Let’s take a look at the key features of RAGFlow 0.20.0.

2. Key Updates in RAGFlow 0.20.0

The key features of this release include:

  • Unified orchestration of both Agents and Workflows.
  • A complete refactor of the Agent, significantly improving its capabilities and usability, with support for Multi-Agent configurations, planning and reflection, and visual features.
  • Full MCP functionality, enabling MCP Server import, Agents to act as MCP Clients, and RAGFlow itself to function as an MCP Server.
  • Access to runtime logs for Agents.
  • Chat histories with Agents available via the management panel.

How does the updated Agent-building experience differ for developers?

Take the 0.19.1 customer service template as an example: previously, building this Workflow required seven types of components (Begin, Interact, Refine Question, Categorize, Knowledge Retrieval, Generate, and Message), with the longest chain for a category 4 question involving seven steps.

Image

In the new version, building in pure Workflow mode requires only five types of components (Begin, Categorize, Knowledge Retrieval, Agent, and Message), with the workflow for a category 4 question (product related) reduced to five steps.

Image

With Agentic mode, just three types of components are needed - the original workflow logic can now be handled entirely through prompt engineering.

Image

Developers can inspect Agents' execution paths and verify their inputs/outputs.

Image

Business users can view Agents' reasoning processes through either the embedded page or chat interface.

Image

This comparison quantitatively demonstrates reduced complexity and improved efficiency. Further details follow below - we encourage you to try it yourself.

3. A unified orchestration engine for Agents

RAGFlow 0.20.0 introduces unified orchestration of both Workflow and Agentic Workflow. As mentioned earlier, these represent two extremes, but enterprise scenarios demand their collaboration. The platform now supports co-orchestration on one, inherently Multi-Agent canvas—users can designate uncertain inputs as Agentic Workflows and deterministic ones as Workflows. To align with common practice, Agentic Workflows are represented as separate Agent components on the canvas.

This release redesigns the orchestration interface and component functions around this goal, while also improving usability issues from earlier Workflow versions. Key improvements include reducing core Components from 12 to 10, with the main changes as follows:

Image

Begin component

It now supports a task-based Agent mode that does not require a conversation to be triggered. Developers can build both conversational and task-based Agents on the same canvas.

Image

Retrieval component

Retrieval can function as a component within a workflow and also be used as a Tool by an Agent component, enabling the Agent to determine when and how to invoke retrieval queries.

Image

Agent component

Image

An Agent capable of independently replacing your work needs to have the following abilities:

  • Autonomous reasoning [1], with the capacity to reflect and adjust based on environmental feedback
  • The ability to use tools to complete tasks [3]

With the new Agent component, developers only need to configure the Prompt and Tools to quickly build an Agent, as RAGFlow has already handled the underlying technical implementation.

Image

Besides the single-agent mode, the new agent component also supports adding subagents that can be called during runtime.

Image

You can freely add agents to build your own unlimited agent team.

Image

Add and bulk import your already deployed MCP Servers.

Image

Tools from added MCP Servers can be used within the agent.

Image

Await Response component

The original Interact component has been refactored into an await-response component, allowing developers to actively pause the process to initiate preset conversations and collect key information via forms.

Image

Switch component

Improved the usability of the switch component.

Image

Iteration component

The input parameter type for the Iteration component has been changed to an array; during iteration, both the index and value are accessible to internal components, whose outputs can be formed into an array and passed downstream.
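The Iteration component's semantics resemble a map over an array in which the body sees both index and value, and the collected outputs form a new array. A rough Python analogue (not the component's implementation):

```python
def iterate(items, body):
    """Run `body(index, value)` for each element; collect outputs as an array."""
    outputs = []
    for index, value in enumerate(items):
        outputs.append(body(index, value))  # inner components see both
    return outputs  # passed downstream as an array

labeled = iterate(["alpha", "beta"], lambda i, v: f"{i}:{v}")
```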

Image

Reply Message component

Messages can now be replied to directly via Reply Message, eliminating the need for the Interact component.

Image

Text Processing component

Developers can easily concatenate strings through Text Processing.

Image

You can also split strings into arrays.

Image

Summary

RAGFlow 0.20 enables simultaneous orchestration of both Agentic and Workflow modes, with built-in Multi-Agent support allowing multiple agents to coexist on a single canvas. For open-ended queries like “Why has company performance declined?” where the approach isn’t predetermined, users can define the process in Agentic style. In contrast, scenarios with clear, fixed steps use Workflow style. This lets developers build Agent applications with minimal configuration. The seamless integration of Agentic and Workflow approaches is the best practice for deploying enterprise-grade intelligent agents.

4. Application Ecosystem and Future Development

With a complete, unified no-code Agent framework in place, RAGFlow naturally supports a wide range of scenario-specific Agent applications—this is a major focus for its long-term development. In other words, a vast number of Agent templates will be built on top of RAGFlow, backed by our new ecosystem co-creation initiative.

RAGFlow 0.20.0 introduces Deep Research as a built-in template—an exceptional Agent that can serve both as a standalone application and as a foundation for building other intelligent agents. We will explain how to build the Deep Research template in detail in a forthcoming article.

The example below shows that on the RAGFlow platform, Deep Research can be created using either Agentic Workflow or traditional Workflow methods, with the former offering far greater flexibility and simplicity. Deep Research built via a traditional Workflow:

Image

Deep Research built with an Agentic Workflow shows significantly reduced complexity compared to the Workflow implementation above:

Image

RAGFlow’s ecosystem initiative aims to provide enterprise-ready Agent templates embedded with industry know-how, which developers can easily customize to fit their own business needs. Among these, Deep Research is the most important: it is the most common form of Agentic RAG and the essential path for Agents to unlock deeper value from enterprise data. Starting from RAGFlow’s built-in Deep Research template, developers can quickly adapt it into specialized assistants—such as legal or medical advisors—significantly narrowing the gap between business systems and underlying infrastructure. This ecosystem approach is made possible by the close collaboration between RAG and Agents.

The 0.20.0 release marks a major step in integrating RAG and Agent capabilities within RAGFlow, with rapid updates planned ahead, including memory management and manual adjustment of Agent Plans. While unifying Workflow and Agentic Workflow greatly lowers the barriers to building enterprise Agents, and the ecosystem expands their application scope, the data foundation that merges structured and unstructured data around RAG remains the cornerstone of Agent capabilities. This approach is now known as “context engineering,” with traditional RAG representing its 1.0 version. Our future articles will explore these advancements in greater detail.

Image

We welcome your continued support - star RAGFlow on GitHub: https://github.com/infiniflow/ragflow

Bibliography

  1. ReAct: Synergizing Reasoning and Acting in Language Models https://arxiv.org/abs/2210.03629
  2. Reflexion: Language Agents with Verbal Reinforcement Learning https://arxiv.org/abs/2303.11366
  3. A Practical Guide to Building Agents https://cdn.openai.com/business-guides-and-resources/a-practical-guide-to-building-agents.pdf

· 13 min read

Six months have passed since our last year-end review. As the initial wave of excitement sparked by DeepSeek earlier this year begins to wane, AI seems to have entered a phase of stagnation. This pattern is evident in Retrieval-Augmented Generation (RAG) as well: although academic papers on RAG continue to be plentiful, significant breakthroughs have been few and far between in recent months. Likewise, recent iterations of RAGFlow have focused on incremental improvements rather than major feature releases. Is this the start of future leaps forward, or the beginning of a period of steady, incremental growth? A mid-year assessment is therefore both timely and necessary.