Version: DEV

Accelerate answering

A checklist to speed up question answering.


Some chat settings add noticeable latency. If question answering is often slow, work through this checklist:

  • In the Prompt Engine tab of your Chat Configuration dialogue, disabling Multi-turn optimization will reduce the time required to get an answer from the LLM.
  • In the Prompt Engine tab of your Chat Configuration dialogue, leaving the Rerank model field empty will significantly decrease retrieval time.
  • In the Assistant Setting tab of your Chat Configuration dialogue, disabling Keyword analysis will reduce the time to receive an answer from the LLM.
  • When chatting with your chat assistant, click the light bulb icon above the current dialogue and scroll down the popup window to view the time taken for each task:
Item name           Description
Total               Total time spent on this conversation round, including chunk retrieval and answer generation.
Check LLM           Time to validate the specified LLM.
Create retriever    Time to create a chunk retriever.
Bind embedding      Time to initialize an embedding model instance.
Bind LLM            Time to initialize an LLM instance.
Tune question       Time to optimize the user query using the context of the multi-turn conversation.
Bind reranker       Time to initialize a reranker model instance for chunk retrieval.
Generate keywords   Time to extract keywords from the user query.
Retrieval           Time to retrieve the chunks.
Generate answer     Time to generate the answer.
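
One way to act on the per-task timings is to rank them and pair the slowest tasks with the matching checklist item. The sketch below is only an illustration: `rank_stages` and `HINTS` are hypothetical names, not part of the product, and the timings are made-up numbers that you would replace with values copied by hand from the popup window.

```python
# Hypothetical helper, not part of the product. Timings are typed in by hand
# from the light bulb popup; the numbers below are made up for illustration.

# Checklist hints from this page, keyed by the popup item they target.
HINTS = {
    "Tune question": "disable Multi-turn optimization",
    "Bind reranker": "leave the Rerank model field empty",
    "Retrieval": "leave the Rerank model field empty",
    "Generate keywords": "disable Keyword analysis",
}


def rank_stages(timings):
    """Return (stage, seconds, hint) tuples, slowest first, excluding Total."""
    stages = {name: secs for name, secs in timings.items() if name != "Total"}
    ordered = sorted(stages.items(), key=lambda item: item[1], reverse=True)
    return [(name, secs, HINTS.get(name)) for name, secs in ordered]


# Made-up example values, in seconds.
example = {
    "Total": 9.0,
    "Tune question": 2.5,
    "Bind reranker": 0.5,
    "Generate keywords": 1.5,
    "Retrieval": 3.0,
    "Generate answer": 1.5,
}

for name, secs, hint in rank_stages(example):
    note = f" -> consider: {hint}" if hint else ""
    print(f"{name}: {secs:.1f}s{note}")
```

Tasks without a hint, such as Generate answer, have no corresponding setting in the checklist above, so the sketch only reports their time.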