Version: v0.14.1

Frequently asked questions

Queries regarding general features, troubleshooting, usage, and more.

General features

What sets RAGFlow apart from other RAG products?

The "garbage in garbage out" status quo remains unchanged despite the fact that LLMs have advanced Natural Language Processing (NLP) significantly. In response, RAGFlow introduces two unique features compared to other Retrieval-Augmented Generation (RAG) products.

Fine-grained document parsing: Document parsing involves images and tables, with the flexibility for you to intervene as needed.
Traceable answers with reduced hallucinations: You can trust RAGFlow's responses as you can view the citations and references supporting them.

Why does it take longer for RAGFlow to parse a document than LangChain?

We put painstaking effort into document pre-processing tasks like layout analysis, table structure recognition, and OCR (Optical Character Recognition) using our vision models. This contributes to the additional time required.

Why does RAGFlow require more resources than other projects?

RAGFlow has a number of built-in models for document structure parsing, which account for the additional computational resources.

Which architectures or devices does RAGFlow support?

We officially support x86 CPUs and NVIDIA GPUs. While we also test RAGFlow on ARM64 platforms, we do not plan to maintain RAGFlow Docker images for ARM.

Which embedding models can be deployed locally?

RAGFlow offers two Docker image editions, dev-slim and dev:

infiniflow/ragflow:dev-slim (default): The RAGFlow Docker image without embedding models.
infiniflow/ragflow:dev: The RAGFlow Docker image with embedding models including:
- Built-in embedding models:
  - BAAI/bge-large-zh-v1.5
  - BAAI/bge-reranker-v2-m3
  - maidalun1020/bce-embedding-base_v1
  - maidalun1020/bce-reranker-base_v1
- Embedding models that will be downloaded once you select them in the RAGFlow UI:
  - BAAI/bge-base-en-v1.5
  - BAAI/bge-large-en-v1.5
  - BAAI/bge-small-en-v1.5
  - BAAI/bge-small-zh-v1.5
  - jinaai/jina-embeddings-v2-base-en
  - jinaai/jina-embeddings-v2-small-en
  - nomic-ai/nomic-embed-text-v1.5
  - sentence-transformers/all-MiniLM-L6-v2

Do you offer an API for integration with third-party applications?

The corresponding APIs are now available. See the RAGFlow HTTP API Reference or the RAGFlow Python API Reference for more information.

Do you support stream output?

Yes, we do.

Is it possible to share dialogue through URL?

No, this feature is not supported.

Do you support multiple rounds of dialogues, i.e., referencing previous dialogues as context for the current dialogue?

This feature and the related APIs are still in development. Contributions are welcome.

Troubleshooting

Issues with Docker images

How to build the RAGFlow image from scratch?

See Build a RAGFlow Docker image.

Issues with huggingface models

Cannot access https://huggingface.co

A locally deployed RAGFlow downloads OCR and embedding modules from the Hugging Face website by default. If your machine is unable to access this site, the following error occurs and PDF parsing fails:

FileNotFoundError: [Errno 2] No such file or directory: '/root/.cache/huggingface/hub/models--InfiniFlow--deepdoc/snapshots/be0c1e50eef6047b412d1800aa89aba4d275f997/ocr.res'

To fix this issue, use https://hf-mirror.com instead:

Stop all containers and remove all related resources:
```
cd ragflow/docker/
docker compose down
```
Uncomment the following line in ragflow/docker/.env:
```
# HF_ENDPOINT=https://hf-mirror.com
```
Start up the server:
```
docker compose up -d 
```
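
If you prefer to script the second step, the following is a minimal sketch. The helper name uncomment_env_var is hypothetical (it is not part of RAGFlow), and it assumes the line in .env has the form `# HF_ENDPOINT=...` shown above.

```shell
# Uncomment a "# KEY=value" line in an env file.
# uncomment_env_var is a hypothetical helper, not a RAGFlow command.
uncomment_env_var() {
  env_file="$1"
  key="$2"
  tmp_file="$(mktemp)"
  # Strip the leading "#" and any spaces only from the line that defines KEY.
  sed "s/^#[[:space:]]*\(${key}=\)/\1/" "$env_file" > "$tmp_file"
  # Write back through cat so the original file keeps its permissions.
  cat "$tmp_file" > "$env_file"
  rm -f "$tmp_file"
}

# Example (run from the ragflow/docker/ directory):
#   uncomment_env_var .env HF_ENDPOINT
#   grep '^HF_ENDPOINT=' .env
```

All other lines in the file are left untouched.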

`MaxRetryError: HTTPSConnectionPool(host='hf-mirror.com', port=443)`

This error suggests that you do not have Internet access or are unable to connect to hf-mirror.com. Try the following:

Manually download the resource files from huggingface.co/InfiniFlow/deepdoc to your local folder ~/deepdoc.
Add a volume to docker-compose.yml, for example:
```
- ~/deepdoc:/ragflow/rag/res/deepdoc
```

Issues with RAGFlow servers

`WARNING: can't find /raglof/rag/res/borker.tm`

Ignore this warning and continue. All system warnings can be ignored.

`network anomaly There is an abnormality in your network and you cannot connect to the server.`

You cannot log in to RAGFlow until the server is fully initialized. Run `docker logs -f ragflow-server` to check its progress.

The server is successfully initialized if your system displays the following:

     ____   ___    ______ ______ __               
    / __ \ /   |  / ____// ____// /____  _      __
   / /_/ // /| | / / __ / /_   / // __ \| | /| / /
  / _, _// ___ |/ /_/ // __/  / // /_/ /| |/ |/ / 
 /_/ |_|/_/  |_|\____//_/    /_/ \____/ |__/|__/  

 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:9380
 * Running on http://x.x.x.x:9380
 INFO:werkzeug:Press CTRL+C to quit
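
If you would rather wait for initialization in a script than watch the log, the following sketch polls the container log for the `Running on http://` line shown above. The function names, attempt count, and delay are illustrative assumptions; only the container name ragflow-server and the log text come from this page.

```shell
# Succeeds if the log text on stdin contains the startup line.
server_is_ready() {
  grep -q 'Running on http://'
}

# Poll the container log until the server reports that it is running.
# Arguments: container name, maximum attempts, seconds between attempts.
wait_for_server() {
  container="$1"
  attempts="$2"
  delay="$3"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if docker logs "$container" 2>&1 | server_is_ready; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example:
#   wait_for_server ragflow-server 30 10 && echo "RAGFlow is ready"
```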

Issues with RAGFlow backend services

`Realtime synonym is disabled, since no redis connection`

Ignore this warning and continue. All system warnings can be ignored.

Why does my document parsing stall at under one percent?

Click the red cross beside the 'parsing status' bar, then restart the parsing process to see if the issue remains. If the issue persists and your RAGFlow is deployed locally, try the following:

Check the log of your RAGFlow server to see if it is running properly:
```
docker logs -f ragflow-server
```
Check if the task_executor.py process exists.
Check if your RAGFlow server can access hf-mirror.com or huggingface.co.

Why does my PDF parsing stall near completion, while the log does not show any error?

Click the red cross beside the 'parsing status' bar, then restart the parsing process to see if the issue remains. If the issue persists and your RAGFlow is deployed locally, the parsing process is likely killed due to insufficient RAM. Try increasing your memory allocation by increasing the MEM_LIMIT value in docker/.env.

NOTE

Ensure that you restart your RAGFlow server for your changes to take effect:

docker compose stop

docker compose up -d
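
When choosing a new MEM_LIMIT value, a small conversion helper can avoid arithmetic slips. This sketch assumes that MEM_LIMIT in docker/.env is expressed in bytes; check the comment next to the variable in your own .env before relying on it. The helper name gib_to_bytes is hypothetical.

```shell
# Convert a size in GiB to bytes (assumes MEM_LIMIT is given in bytes).
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Print the line to put in docker/.env for a 16 GiB limit.
echo "MEM_LIMIT=$(gib_to_bytes 16)"   # prints MEM_LIMIT=17179869184
```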

`Index failure`

An index failure usually indicates an unavailable Elasticsearch service.

How to check the log of RAGFlow?

tail -f ragflow/docker/ragflow-logs/*.log

How to check the status of each component in RAGFlow?

Check the status of the Docker containers:

$ docker ps

The following is an example result:

5bc45806b680   infiniflow/ragflow:latest     "./entrypoint.sh"        11 hours ago   Up 11 hours               0.0.0.0:80->80/tcp, :::80->80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp, 0.0.0.0:9380->9380/tcp, :::9380->9380/tcp   ragflow-server
91220e3285dd   docker.elastic.co/elasticsearch/elasticsearch:8.11.3   "/bin/tini -- /usr/l…"   11 hours ago   Up 11 hours (healthy)     9300/tcp, 0.0.0.0:9200->9200/tcp, :::9200->9200/tcp           ragflow-es-01
d8c86f06c56b   mysql:5.7.18        "docker-entrypoint.s…"   7 days ago     Up 16 seconds (healthy)   0.0.0.0:3306->3306/tcp, :::3306->3306/tcp     ragflow-mysql
cd29bcb254bc   quay.io/minio/minio:RELEASE.2023-12-20T01-00-02Z       "/usr/bin/docker-ent…"   2 weeks ago    Up 11 hours      0.0.0.0:9001->9001/tcp, :::9001->9001/tcp, 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp     ragflow-minio

Follow this document to check the health status of the Elasticsearch service.

IMPORTANT

The status of a Docker container does not necessarily reflect the status of the service. You may find that your services are unhealthy even when the corresponding Docker containers are up and running. Possible reasons for this include network failures, incorrect port numbers, or DNS issues.
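
One way to query the service itself rather than the container is Elasticsearch's `_cluster/health` endpoint, which reports a status of green, yellow, or red. The sketch below only extracts that field from the JSON response. Port 9200 comes from the docker ps output above; the es_status helper name, the use of localhost, and the credentials (user elastic with an ELASTIC_PASSWORD variable) are assumptions that depend on your deployment.

```shell
# Extract the "status" field from a _cluster/health JSON response on stdin.
es_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Example (credentials are an assumption; omit -u if security is disabled):
#   curl -s -u "elastic:${ELASTIC_PASSWORD}" \
#     http://localhost:9200/_cluster/health | es_status
```

A result of red, or no response at all while the container shows as up, points to a problem with the service rather than with Docker.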

`Exception: Can't connect to ES cluster`

Check the status of the Elasticsearch Docker container:

$ docker ps

The status of a healthy Elasticsearch component should look as follows:

91220e3285dd   docker.elastic.co/elasticsearch/elasticsearch:8.11.3   "/bin/tini -- /usr/l…"   11 hours ago   Up 11 hours (healthy)     9300/tcp, 0.0.0.0:9200->9200/tcp, :::9200->9200/tcp           ragflow-es-01

Follow this document to check the health status of the Elasticsearch service.

IMPORTANT

If your container keeps restarting, ensure vm.max_map_count >= 262144 as per this README. To make the change permanent, you must also update the vm.max_map_count value in /etc/sysctl.conf. Note that this configuration applies only to Linux.

Can't start ES container and get `Elasticsearch did not exit normally`

This usually happens because the vm.max_map_count value was not updated in /etc/sysctl.conf, so your change was reset after a system reboot.
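
The following sketch shows one way to make the setting survive a reboot. The persist_map_count helper is hypothetical; it takes the file path as an argument so that you can try it on a copy before touching /etc/sysctl.conf. The sysctl commands in the comments are standard on Linux.

```shell
REQUIRED_MAP_COUNT=262144

# Ensure a sysctl-style config file contains vm.max_map_count=262144.
# Any existing vm.max_map_count entry is dropped and the required one appended.
persist_map_count() {
  conf_file="$1"
  tmp_file="$(mktemp)"
  # grep -v exits non-zero when nothing is left to print, hence "|| true".
  grep -v '^[[:space:]]*vm\.max_map_count[[:space:]]*=' "$conf_file" > "$tmp_file" || true
  echo "vm.max_map_count=${REQUIRED_MAP_COUNT}" >> "$tmp_file"
  cat "$tmp_file" > "$conf_file"
  rm -f "$tmp_file"
}

# Example (run as root):
#   sysctl vm.max_map_count                  # show the current value
#   sysctl -w vm.max_map_count=262144        # apply until the next reboot
#   persist_map_count /etc/sysctl.conf       # keep the value across reboots
```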

`{"data":null,"code":100,"message":"<NotFound '404: Not Found'>"}`

Your IP address or port number may be incorrect. If you are using the default configurations, enter http://<IP_OF_YOUR_MACHINE> (NOT 9380, AND NO PORT NUMBER REQUIRED!) in your browser. This should work.

`Ollama - Mistral instance running at 127.0.0.1:11434 but cannot add Ollama as model in RagFlow`

A correct Ollama IP address and port are crucial to adding Ollama models to RAGFlow:

If you are on demo.ragflow.io, ensure that the server hosting Ollama has a publicly accessible IP address. Note that 127.0.0.1 is not a publicly accessible IP address.
If you deploy RAGFlow locally, ensure that Ollama and RAGFlow are in the same LAN and can communicate with each other.

See Deploy a local LLM for more information.
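
Before entering the Ollama address in RAGFlow, it can help to confirm that the address is not a loopback address and that the server answers. This is a sketch only: the is_loopback helper and the LAN address 192.168.1.20 are hypothetical, and the reasoning assumes that RAGFlow runs in a Docker container, where 127.0.0.1 typically refers to the container itself rather than to the host running Ollama.

```shell
# Succeeds if the host is a loopback name or address.
is_loopback() {
  case "$1" in
    localhost|127.*|::1) return 0 ;;
    *) return 1 ;;
  esac
}

# Example with a hypothetical LAN address:
#   is_loopback 192.168.1.20 || curl -s http://192.168.1.20:11434/
# A reachable Ollama server answers the root URL with "Ollama is running".
```

Running the curl check from the machine (or container) that hosts RAGFlow gives the most meaningful result, because that is where the connection originates.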

Do you offer examples of using deepdoc to parse PDF or other files?

Yes, we do. See the Python files under the rag/app folder.

Why did I fail to upload a 128MB+ file to my locally deployed RAGFlow?

Ensure that you update the MAX_CONTENT_LENGTH environment variable:

In ragflow/docker/.env, uncomment environment variable MAX_CONTENT_LENGTH:
```
MAX_CONTENT_LENGTH=128000000
```

Update docker-compose.yml:

environment:
  - MAX_CONTENT_LENGTH=${MAX_CONTENT_LENGTH}

Restart the RAGFlow server:
```
docker compose up ragflow -d
```
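
To check in advance whether a file fits under the configured limit, you can compare its size with the MAX_CONTENT_LENGTH value. This sketch assumes the value is a byte count, as the example value above suggests; the helper name fits_upload_limit and the file name manual.pdf are hypothetical.

```shell
# Succeeds if the file is no larger than the limit (both in bytes).
fits_upload_limit() {
  file="$1"
  limit="$2"
  # wc -c may pad its output with spaces on some systems, so strip them.
  size="$(wc -c < "$file" | tr -d '[:space:]')"
  [ "$size" -le "$limit" ]
}

# Example:
#   fits_upload_limit ./manual.pdf 128000000 || echo "Raise MAX_CONTENT_LENGTH"
```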

`FileNotFoundError: [Errno 2] No such file or directory`

Check the status of the MinIO Docker container:

$ docker ps

The status of a healthy MinIO component should look as follows:

cd29bcb254bc   quay.io/minio/minio:RELEASE.2023-12-20T01-00-02Z       "/usr/bin/docker-ent…"   2 weeks ago    Up 11 hours      0.0.0.0:9001->9001/tcp, :::9001->9001/tcp, 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp     ragflow-minio

Follow this document to check the health status of the MinIO service.

Usage

How to increase the length of RAGFlow responses?

Right-click the desired dialog to display the Chat Configuration window.
Switch to the Model Setting tab and adjust the Max Tokens slider to get the desired length.
Click OK to confirm your change.

How to run RAGFlow with a locally deployed LLM?

You can use Ollama or Xinference to deploy a local LLM. See here for more information.

How to interconnect RAGFlow with Ollama?

If RAGFlow is locally deployed, ensure that your RAGFlow and Ollama are in the same LAN.
If you are using our online demo, ensure that the IP address of your Ollama server is public and accessible.

See here for more information.

`Error: Range of input length should be [1, 30000]`

This error occurs because too many chunks match your search criteria. To fix this issue, try reducing TopN and increasing the Similarity threshold:

Click Chat at the top center of the page.
Right-click the desired conversation > Edit > Prompt Engine.
Reduce TopN and/or raise the Similarity threshold.
Click OK to confirm your changes.

How to get an API key for integration with third-party applications?

See Acquire a RAGFlow API key.

How to upgrade RAGFlow?

See Upgrade RAGFlow for more information.
