Skip to main content

2 posts tagged with "CLI"

View All Tags

· 13 min read

To meet the refined requirements for system operation and maintenance monitoring as well as user account management, especially addressing practical issues currently faced by RAGFlow users, such as "the inability of users to recover lost passwords on their own" and "difficulty in effectively controlling account statuses," RAGFlow has officially launched a professional back-end management command-line tool.

This tool is based on a clear client-server architecture. By adopting the design philosophy of functional separation, it constructs an efficient and reliable system management channel, enabling administrators to achieve system recovery and permission control at the fundamental level.

The specific architectural design is illustrated in the following diagram:

Through this tool, RAGFlow users can gain a one-stop overview of the operational status of the RAGFlow Server as well as components such as MySQL, Elasticsearch, Redis, MinIO, and Infinity. It also supports comprehensive user lifecycle management, including account creation, status control, password reset, and data cleanup.

Start the service

If you deploy RAGFlow via Docker, please modify the docker/docker_compose.yml file and add parameters to the service startup command.

command:
- --enable-adminserver

If you are deploying RAGFlow from the source code, you can directly execute the following command to start the management server:

python3 admin/server/admin_server.py

After the server starts, it listens on port 9381 by default, waiting for client connections. If you need to use a different port, please modify the ADMIN_SVR_HTTP_PORT configuration item in the docker/.env file.

Install the client and connect to the management service

It is recommended to use pip to install the specified version of the client:

pip install ragflow-cli==0.21.0

Version 0.21.0 is the current latest version. Please ensure that the version of ragflow-cli matches the version of the RAGFlow server to avoid compatibility issues:

ragflow-cli -h 127.0.0.1 -p 9381

Here, -h specifies the server IP, and -p specifies the server port. If the server is deployed at a different address or port, please adjust the parameters accordingly.

For the first login, enter the default administrator password admin. After successfully logging in, it is recommended to promptly change the default password in the command-line tool to enhance security.

Command Usage Guide

By entering the following commands through the client, you can conveniently manage users and monitor the operational status of the service.

Interactive Feature Description

  • Supports using arrow keys to move the cursor and review historical commands.
  • Pressing Ctrl+C allows you to terminate the current interaction at any time.
  • If you need to copy content, please avoid using Ctrl+C to prevent accidentally interrupting the process.

Command Format Specifications

  • All commands are case-insensitive and must end with a semicolon ;.
  • Text parameters such as usernames and passwords need to be enclosed in single quotes ' or double quotes ".
  • Special characters like \, ', and " are prohibited in passwords.

Service Management Commands

LIST SERVICES;

  • List the operational status of RAGFlow backend services and all associated middleware.

Usage Example

admin> list services;
Listing all services
+-------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+---------+
| extra | host | id | name | port | service_type | status |
+-------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+---------+
| {} | 0.0.0.0 | 0 | ragflow_0 | 9380 | ragflow_server | Timeout |
| {'meta_type': 'mysql', 'password': 'infini_rag_flow', 'username': 'root'} | localhost | 1 | mysql | 5455 | meta_data | Alive |
| {'password': 'infini_rag_flow', 'store_type': 'minio', 'user': 'rag_flow'} | localhost | 2 | minio | 9000 | file_store | Alive |
| {'password': 'infini_rag_flow', 'retrieval_type': 'elasticsearch', 'username': 'elastic'} | localhost | 3 | elasticsearch | 1200 | retrieval | Alive |
| {'db_name': 'default_db', 'retrieval_type': 'infinity'} | localhost | 4 | infinity | 23817 | retrieval | Timeout |
| {'database': 1, 'mq_type': 'redis', 'password': 'infini_rag_flow'} | localhost | 5 | redis | 6379 | message_queue | Alive |
+-------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+---------+

List the operational status of RAGFlow backend services and all associated middleware.

SHOW SERVICE <id>;

  • Query the detailed operational status of the specified service.
  • <id>: Command: LIST SERVICES; The service identifier ID in the displayed results.

Usage Example:

  1. Query the RAGFlow backend service:
admin> show service 0;
Showing service: 0
Service ragflow_0 is alive. Detail:
Confirm elapsed: 4.1 ms.

The response indicates that the RAGFlow backend service is online, with a response time of 4.1 milliseconds.

  1. Query the MySQL service:
admin> show service 1;
Showing service: 1
Service mysql is alive. Detail:
+---------+----------+------------------+------+------------------+------------------------+-------+-----------------+
| command | db | host | id | info | state | time | user |
+---------+----------+------------------+------+------------------+------------------------+-------+-----------------+
| Daemon | None | localhost | 5 | None | Waiting on empty queue | 86356 | event_scheduler |
| Sleep | rag_flow | 172.18.0.6:56788 | 8790 | None | | 2 | root |
| Sleep | rag_flow | 172.18.0.6:53482 | 8791 | None | | 73 | root |
| Query | rag_flow | 172.18.0.6:56794 | 8795 | SHOW PROCESSLIST | init | 0 | root |
+---------+----------+------------------+------+------------------+------------------------+-------+-----------------+

The response indicates that the MySQL service is online, with the current connection and execution status shown in the table above.

  1. Query the MinIO service:
admin> show service 2;
Showing service: 2
Service minio is alive. Detail:
Confirm elapsed: 2.3 ms.

The response indicates that the MinIO service is online, with a response time of 2.3 milliseconds.

  1. Query the Elasticsearch service:
admin> show service 3;
Showing service: 3
Service elasticsearch is alive. Detail:
+----------------+------+--------------+---------+----------------+--------------+---------------+--------------+------------------------------+----------------------------+-----------------+-------+---------------+---------+-------------+---------------------+--------+------------+--------------------+
| cluster_name | docs | docs_deleted | indices | indices_shards | jvm_heap_max | jvm_heap_used | jvm_versions | mappings_deduplicated_fields | mappings_deduplicated_size | mappings_fields | nodes | nodes_version | os_mem | os_mem_used | os_mem_used_percent | status | store_size | total_dataset_size |
+----------------+------+--------------+---------+----------------+--------------+---------------+--------------+------------------------------+----------------------------+-----------------+-------+---------------+---------+-------------+---------------------+--------+------------+--------------------+
| docker-cluster | 8 | 0 | 1 | 2 | 3.76 GB | 2.15 GB | 21.0.1+12-29 | 17 | 757 B | 17 | 1 | ['8.11.3'] | 7.52 GB | 2.30 GB | 31 | green | 226 KB | 226 KB |
+----------------+------+--------------+---------+----------------+--------------+---------------+--------------+------------------------------+----------------------------+-----------------+-------+---------------+---------+-------------+---------------------+--------+------------+--------------------+

The response indicates that the Elasticsearch cluster is operating normally, with specific metrics including document count, index status, memory usage, etc.

  1. Query the Infinity service:
admin> show service 4;
Showing service: 4
Fail to show service, code: 500, message: Infinity is not in use.

The response indicates that Infinity is not currently in use by the RAGFlow system.

admin> show service 4;
Showing service: 4
Service infinity is alive. Detail:
+-------+--------+----------+
| error | status | type |
+-------+--------+----------+
| | green | infinity |
+-------+--------+----------+

After enabling Infinity and querying again, the response indicates that the Infinity service is online and in good condition.

  1. Query the Redis service:
admin> show service 5;
Showing service: 5
Service redis is alive. Detail:
+-----------------+-------------------+---------------------------+-------------------------+---------------+-------------+--------------------------+---------------------+-------------+
| blocked_clients | connected_clients | instantaneous_ops_per_sec | mem_fragmentation_ratio | redis_version | server_mode | total_commands_processed | total_system_memory | used_memory |
+-----------------+-------------------+---------------------------+-------------------------+---------------+-------------+--------------------------+---------------------+-------------+
| 0 | 3 | 10 | 3.03 | 7.2.4 | standalone | 404098 | 30.84G | 1.29M |
+-----------------+-------------------+---------------------------+-------------------------+---------------+-------------+--------------------------+---------------------+-------------+

The response indicates that the Redis service is online, with the version number, deployment mode, and resource usage shown in the table above.

User Management Commands

LIST USERS;

  • List all users in the RAGFlow system.

Usage Example:

admin> list users;
Listing all users
+-------------------------------+----------------------+-----------+----------+
| create_date | email | is_active | nickname |
+-------------------------------+----------------------+-----------+----------+
| Mon, 13 Oct 2025 15:58:42 GMT | admin@ragflow.io | 1 | admin |
| Mon, 13 Oct 2025 15:54:34 GMT | lynn_inf@hotmail.com | 1 | Lynn |
+-------------------------------+----------------------+-----------+----------+

The response indicates that there are currently two users in the system, both of whom are enabled.

Among them, admin@ragflow.io is the administrator account, which is automatically created during the initial system startup.

SHOW USER <username>;

  • Query detailed user information by email.
  • <username>: The user's email address, which must be enclosed in single quotes ' or double quotes ".

Usage Example:

  1. Query the administrator user
admin> show user "admin@ragflow.io";
Showing user: admin@ragflow.io
+-------------------------------+------------------+-----------+--------------+------------------+--------------+----------+-----------------+---------------+--------+-------------------------------+
| create_date | email | is_active | is_anonymous | is_authenticated | is_superuser | language | last_login_time | login_channel | status | update_date |
+-------------------------------+------------------+-----------+--------------+------------------+--------------+----------+-----------------+---------------+--------+-------------------------------+
| Mon, 13 Oct 2025 15:58:42 GMT | admin@ragflow.io | 1 | 0 | 1 | True | English | None | None | 1 | Mon, 13 Oct 2025 15:58:42 GMT |
+-------------------------------+------------------+-----------+--------------+------------------+--------------+----------+-----------------+---------------+--------+-------------------------------+

The response indicates that admin@ragflow.io is a super administrator and is currently enabled.

  1. Query a regular user
admin> show user "lynn_inf@hotmail.com";
Showing user: lynn_inf@hotmail.com
+-------------------------------+----------------------+-----------+--------------+------------------+--------------+----------+-------------------------------+---------------+--------+-------------------------------+
| create_date | email | is_active | is_anonymous | is_authenticated | is_superuser | language | last_login_time | login_channel | status | update_date |
+-------------------------------+----------------------+-----------+--------------+------------------+--------------+----------+-------------------------------+---------------+--------+-------------------------------+
| Mon, 13 Oct 2025 15:54:34 GMT | lynn_inf@hotmail.com | 1 | 0 | 1 | False | English | Mon, 13 Oct 2025 15:54:33 GMT | password | 1 | Mon, 13 Oct 2025 17:24:09 GMT |
+-------------------------------+----------------------+-----------+--------------+------------------+--------------+----------+-------------------------------+---------------+--------+-------------------------------+

The response indicates that lynn_inf@hotmail.com is a regular user who logs in via password, with the last login time shown as the provided timestamp.

CREATE USER <username> <password>;

  • Create a new user.
  • <username>: The user's email address must comply with standard email format specifications.
  • <password>: The user's password must not contain special characters such as \, ', or ".

Usage Example:

admin> create user "example@ragflow.io" "psw";
Create user: example@ragflow.io, password: psw, role: user
+----------------------------------+--------------------+----------------------------------+--------------+---------------+----------+
| access_token | email | id | is_superuser | login_channel | nickname |
+----------------------------------+--------------------+----------------------------------+--------------+---------------+----------+
| be74d786a9b911f0a726d68c95a0776b | example@ragflow.io | be74d6b4a9b911f0a726d68c95a0776b | False | password | |
+----------------------------------+--------------------+----------------------------------+--------------+---------------+----------+

A regular user has been successfully created. Personal information such as nickname and avatar can be set by the user themselves after logging in and accessing the profile page.

ALTER USER PASSWORD <username> <new_password>;

  • Change the user's password.
  • <username>: User email address
  • <password>: New password (must not be the same as the old password and must not contain special characters)

Usage Example:

admin> alter user password "example@ragflow.io" "psw";
Alter user: example@ragflow.io, password: psw
Same password, no need to update!

When the new password is the same as the old password, the system prompts that no change is needed.

admin> alter user password "example@ragflow.io" "new psw";
Alter user: example@ragflow.io, password: new psw
Password updated successfully!

The password has been updated successfully. The user can log in with the new password thereafter.

ALTER USER ACTIVE <username> <on/off>;

  • Enable or disable a user.
  • <username>: User email address
  • <on/off>: Enabled or disabled status

Usage Example:

admin> alter user active "example@ragflow.io" off;
Alter user example@ragflow.io activate status, turn off.
Turn off user activate status successfully!

The user has been successfully disabled. Only users in a disabled state can be deleted.

DROP USER <username>;

  • Delete the user and all their associated data
  • <username>: User email address

Important Notes:

  • Only disabled users can be deleted.
  • Before proceeding, ensure that all necessary data such as knowledge bases and files that need to be retained have been transferred to other users.
  • This operation will permanently delete the following user data:

All knowledge bases created by the user, uploaded files, and configured agents, as well as files uploaded by the user in others' knowledge bases, will be permanently deleted. This operation is irreversible, so please proceed with extreme caution.

  • The deletion command is idempotent. If the system fails or the operation is interrupted during the deletion process, the command can be re-executed after troubleshooting to continue deleting the remaining data.

Usage Example:

  1. User Successfully Deleted
admin> drop user "example@ragflow.io";
Drop user: example@ragflow.io
Successfully deleted user. Details:
Start to delete owned tenant.
- Deleted 2 tenant-LLM records.
- Deleted 0 langfuse records.
- Deleted 1 tenant.
- Deleted 1 user-tenant records.
- Deleted 1 user.
Delete done!

The response indicates that the user has been successfully deleted, and it lists detailed steps for data cleanup.

  1. Deleting Super Administrator (Prohibited Operation)
admin> drop user "admin@ragflow.io";
Drop user: admin@ragflow.io
Fail to drop user, code: -1, message: Can't delete the super user.

The response indicates that the deletion failed. The super administrator account is protected and cannot be deleted, even if it is in a disabled state.

Data and Agent Management Commands

LIST DATASETS OF <username>;

  • List all knowledge bases of the specified user
  • <username>: User email address

Usage Example:

admin> list datasets of "lynn_inf@hotmail.com";
Listing all datasets of user: lynn_inf@hotmail.com
+-----------+-------------------------------+---------+----------+-----------------+------------+--------+-----------+-------------------------------+
| chunk_num | create_date | doc_num | language | name | permission | status | token_num | update_date |
+-----------+-------------------------------+---------+----------+-----------------+------------+--------+-----------+-------------------------------+
| 8 | Mon, 13 Oct 2025 15:56:43 GMT | 1 | English | primary_dataset | me | 1 | 3296 | Mon, 13 Oct 2025 15:57:54 GMT |
+-----------+-------------------------------+---------+----------+-----------------+------------+--------+-----------+-------------------------------+

The response shows that the user has one private knowledge base, with detailed information such as the number of documents and segments displayed in the table above.

LIST AGENTS OF <username>;

  • List all Agents of the specified user
  • <username>: User email address

Usage Example:

admin> list agents of "lynn_inf@hotmail.com";
Listing all agents of user: lynn_inf@hotmail.com
+-----------------+-------------+------------+----------------+
| canvas_category | canvas_type | permission | title |
+-----------------+-------------+------------+----------------+
| agent_canvas | None | me | finance_helper |
+-----------------+-------------+------------+----------------+

The response indicates that the user has one private Agent, with detailed information shown in the table above.

Other commands

  • ? or \help

Display help information.

  • \q or \quit

Follow-up plan

We are always committed to enhancing the system management experience and overall security. Building on its existing robust features, the RAGFlow back-end management tool will continue to evolve. In addition to the current efficient and flexible command-line interface, we are soon launching a professional system management UI, enabling administrators to perform all operational and maintenance tasks in a more secure and intuitive graphical environment.

To strengthen permission control, the system status information currently visible in the ordinary user interface will be revoked. After the future launch of the professional management UI, access to the core operational status of the system will be restricted to administrators only. This will effectively address the current issue of excessive permission exposure and further reinforce the system's security boundaries.

In addition, we will also roll out more fine-grained management features sequentially, including:

  • Fine-grained control over Datasets and Agents
  • User Team collaboration management mechanisms
  • Enhanced system monitoring and auditing capabilities

These improvements will establish a more comprehensive enterprise-level management ecosystem, providing administrators with a more all-encompassing and convenient system control experience.

· 10 min read

RAGFlow 0.21.0 officially released

This release shifts focus from enhancing online Agent capabilities to strengthening the data foundation, prioritising usability and dialogue quality from the ground up. Directly addressing common RAG pain points—from data preparation to long-document understanding—version 0.21.0 brings crucial upgrades: a flexible, orchestratable Ingestion Pipeline, long-context RAG to close semantic gaps in complex files, and a new admin CLI for smoother operations. Taken together, these elements establish RAGFlow’s refreshed data-pipeline core, providing a more solid foundation for building robust and effective RAG applications.

Orchestratable Ingestion Pipeline

If earlier Agents primarily tackled the orchestration of online data—as seen in Workflow and Agentic Workflow—the Ingestion Pipeline mirrors this capability by applying the same technical architecture to orchestrate offline data ingestion. Its introduction enables users to construct highly customized RAG data pipelines within a unified framework. This not only streamlines bespoke development but also more fully embodies the "Flow" in RAGFlow.

A typical RAG ingestion process involves key stages such as document parsing, text chunking, vectorization, and index building. When RAGFlow first launched in April 2024, it already incorporated an advanced toolchain, including the DeepDoc-based parsing engine and a templated chunking mechanism. These state-of-the-art solutions were foundational to its early adoption.

However, with rapid industry evolution and deeper practical application, we have observed new trends and demands:

  • The rise of Vision Language Models (VLMs): Increasingly mature VLMs have driven a wave of fine-tuned document parsing models. These offer significantly improved accuracy for unstructured documents with complex layouts or mixed text and images.
  • Demand for flexible chunking: Users now seek more customized chunking strategies. Faced with diverse knowledge-base scenarios, RAGFlow's original built-in chunking templates have proved insufficient for covering all niche cases, which can impact the accuracy of final Q&A outcomes.

To this end, RAGFlow 0.21.0 formally introduces the Ingestion Pipeline, featuring core capabilities including:

  • Orchestratable Data Ingestion: Building on the underlying Agent framework, users can create varied data ingestion pipelines. Each pipeline may apply different strategies to connect a data source to the final index, turning the previous built-in data-writing process into a user-customizable workflow. This provides more flexible ingestion aligned with specific business logic.
  • Decoupling of Upload and Cleansing: The architecture separates data upload from cleansing, establishing standard interfaces for future batch data sources and a solid foundation for expanding data preprocessing workflows.
  • Refactored Parser: The Parser component has been redesigned for extensibility, laying groundwork for integrating advanced document-parsing models beyond DeepDoc.
  • Customizable Chunking Interface: By decoupling the chunking step, users can plug in custom chunkers to better suit the segmentation needs of different knowledge structures.
  • Optimized Efficiency for Complex RAG: The execution of IO/compute-intensive tasks, such as GraphRAG and RAPTOR, has been overhauled. In the pre-pipeline architecture, processing each new document triggered a full compute cycle, resulting in slow performance. The new pipeline enables batch execution, significantly improving data throughput and overall efficiency.

If ETL/ELT represents the standard pipeline for processing structured data in the modern data stack—with tools like dbt and Fivetran providing unified and flexible data integration solutions for data warehouses and data lakes—then RAGFlow's Ingestion Pipeline is positioned to become the equivalent infrastructure for unstructured data. The following diagram illustrates this architectural analogy:

Image

Specifically, while the Extract phase in ETL/ELT is responsible for pulling data from diverse sources, the RAGFlow Ingestion Pipeline augments this with a dedicated Parsing stage to extract information from unstructured data. This stage integrates multiple parsing models, led by DeepDoc, to convert multimodal documents (for example, text and images) into a unimodal representation suitable for processing.

In the Transform phase, where traditional ETL/ELT focuses on data cleansing and business logic, RAGFlow instead constructs a series of LLM-centric Agent components. These are optimized to address semantic gaps in retrieval, with a core mission that can be summarized as: to enhance recall and ranking accuracy.

For data loading, ETL/ELT writes results to a data warehouse or data lake, while RAGFlow uses an Indexer component to build the processed content into a retrieval-optimised index format. This reflects the RAG engine’s hybrid retrieval architecture, which must support full-text, vector, and future tensor-based retrieval to ensure optimal recall.

Thus, the modern data stack serves business analytics for structured data, whereas a RAG engine with an Ingestion Pipeline specializes in the intelligent retrieval of unstructured data—providing high-quality context for LLMs. Each occupies an equivalent ecological niche in its domain.

Regarding processing structured data, this is not the RAG engine’s core duty. It is handled by a Context Layer built atop the engine. This layer leverages the MCP (Model Context Protocol)—described as “TCP/IP for the AI era” —and accompanying Context Engineering to automate the population of all context types. This is a key focus area for RAGFlow’s next development phase.

Below is a preliminary look at the Ingestion Pipeline in v0.21.0; a more detailed guide will follow. We have introduced components for parsing, chunking, and other unstructured data processing tasks into the Agent Canvas, enabling users to freely orchestrate their parsing workflows.

Orchestrating an Ingestion Pipeline automates the process of parsing files and chunking them by length. It then leverages a large language model to generate summaries, keywords, questions, and even metadata. Previously, this metadata had to be entered manually. Now, a single configuration dramatically reduces maintenance overhead.

Furthermore, the pipeline process is fully observable, recording and displaying complete processing logs for each file.

Image

The implementation of the Ingestion Pipeline in version 0.21.0 is a foundational step. In the next release, we plan to significantly enhance it by:

  • Adding support for more data sources.
  • Providing a wider selection of Parsers.
  • Introducing more flexible Transformer components to facilitate orchestration of a richer set of semantic-enhancement templates.

Long-context RAG

As we enter 2025, Retrieval-Augmented Generation (RAG) faces notable challenges driven by two main factors.

Fundamental limitations of traditional RAG

Traditional RAG architectures often fail to guarantee strong dialogue performance because they rely on a retrieval mechanism built around text chunks as the primary unit. This makes them highly sensitive to chunk quality and can yield degraded results due to insufficient context. For example:

  • If a coherent semantic unit is split across chunks, retrieval can be incomplete.
  • If a chunk lacks global context, the information presented to the LLM is weakened.

While strategies such as automatically detecting section headers and attaching them to chunks can help with global semantics, they are constrained by header-identification accuracy and the header’s own completeness.

Cost-efficiency concerns with advanced pre-processing techniques

Modern pre-processing methods—GraphRAG, RAPTOR, and Context Retrieval—aim to inject additional semantic information into raw data to boost search hit rates and accuracy for complex queries. They, however, share issues of high cost and unpredictable effectiveness.

  • GraphRAG: This approach often consumes many times more tokens than the original text, and the automatically generated knowledge graphs are frequently unsatisfactory. Its effectiveness in complex multi-hop reasoning is limited by uncontrollable reasoning paths. As a supplementary retrieval outside the original chunks, the knowledge graph also loses some granular context from the source.
  • RAPTOR: This technique produces clustered summaries that are recalled as independent chunks but naturally lack the detail of the source text, reintroducing the problem of insufficient context.

Context Retrieval: This method enriches original chunks with extra semantics such as keywords or potential questions. It presents a clear trade-off:

  • The more effective option queries the LLM multiple times per chunk, using both full text and the current chunk for context, improving performance but driving token costs several times higher than the original text.
  • The cheaper option generates semantic information based only on the current chunk, saving costs but providing limited global context and modest performance gains. The last few years have seen the emergence of new RAG schemes.
  • Complete abandonment of retrieval: some approaches have the LLM read documents directly, splitting them into chunks according to the context window and performing multi-stage searches. First, the LLM decides which global document is relevant, then which chunks, and finally loads those chunks to answer. While this avoids recall inaccuracies, it harms response latency, concurrency, and large-scale data handling, making practical deployment difficult.
  • Abandoning embedding or indexing in favour of tools like grep: this evolves RAG into Agentic RAG. As applications grow more complex and user queries diversify, combining RAG with agents is increasingly inevitable, since only LLMs can translate raw inquiries into structured retrieval commands. In RAGFlow, this capability has long been realized. Abandoning indexing to use grep is a compromise for simplifying agent development in personal or small-scale contexts; in enterprise settings, a powerful retrieval engine remains essential.
  • Long-Context RAG: introduced in version 0.21.0 as part of the same family as GraphRAG, RAPTOR and Context Retrieval, this approach uses LLMs to enrich raw text semantics to boost recall while retaining indexing and search. Retrieval remains central. Long-context RAG mirrors how people consult information: identify relevant chapters via the table of contents, then locate exact pages for detail. During indexing, the LLM extracts and attaches chapter information to each chunk to provide global context; during retrieval, it finds matching chunks and uses the table-of-contents structure to fill in gaps from chunk fragmentation.
  • Current experience and future direction: users can try Long-Context RAG via the “TOC extraction” (Table of Contents) feature, though it is in beta. The next release will add an Ingestion Pipeline. A key path to improving RAG lies in using LLMs to enrich content semantics without discarding retrieval altogether. Consequently, a flexible pipeline that lets users assemble LLM-based content-transformation components is an important direction for enhancing RAG retrieval quality.

Backend management CLI

RAGFlow’s progression has shifted from core module development to strengthening administrative and operational capabilities.

  • In earlier versions, while parsing and retrieval-augmented generation improved, system administration lagged. Administrators could not modify passwords or delete accounts, complicating deployment and maintenance.
  • With RAGFlow 0.21.0, fundamental system management is markedly improved. A new command-line administration tool provides a central, convenient interface for administrators. Core capabilities include:
    • Service lifecycle management: monitoring built-in RAGFlow services for greater operational flexibility.
    • Comprehensive user management:
      • Create new registered users.
      • Directly modify login passwords.
      • Delete user accounts.
      • Enable or disable accounts.
      • View details of all registered users.
    • Resource overview: listing knowledge bases and Agents created under registered users for system-wide monitoring.

This upgrade underlines RAGFlow’s commitment to robust functionality and foundational administrative strength essential for enterprise use. Looking ahead, the team plans an enterprise-grade web administration panel and accompanying user interface to streamline management, boost efficiency, and enhance the end-user experience, supporting greater maturity and stability.

Finale

RAGFlow 0.21.0 marks a significant milestone, building on prior progress and outlining future developments. It introduces the first integration of Retrieval (RAG) with orchestration (Flow), forming an intelligent engine to support the LLM context layer, underpinned by unstructured data ELT and a robust RAG capability set.

From the user-empowered Ingestion Pipeline to long-context RAG that mitigates semantic fragmentation, and the management backend that ensures reliable operation, every new feature is designed to make the RAG system smarter, more flexible, and enterprise-ready. This is not merely a feature tally but an architectural evolution, establishing a solid foundation for future growth.

Our ongoing focus remains the LLM context layer: building a powerful, reliable data foundation for LLMs and effectively serving all Agents. This remains RAGFlow’s core aim.

We invite you to continue following and starring our project as we grow together.

GitHub: https://github.com/infiniflow/ragflow