Version: DEV

Transformer component

A component that uses an LLM to extract insights from the chunks.


A Transformer component uses an LLM to derive additional information from chunks. It typically precedes the Indexer in the ingestion pipeline, but you can also chain multiple Transformer components in sequence.

Scenario

A Transformer component is essential when you need the LLM to extract new information, such as keywords, questions, metadata, and summaries, from the original chunks.

Configurations

Model

Click the dropdown menu of Model to show the model configuration window.

  • Model: The chat model to use.
    • Ensure you set the chat model correctly on the Model providers page.
    • You can use different models for different components to increase flexibility or improve overall performance.
  • Creativity: A shortcut to the Temperature, Top P, Presence penalty, and Frequency penalty settings, indicating the freedom level of the model. Each preset corresponds to a unique combination of these four settings.
    This parameter has three options:
    • Improvise: Produces more creative responses.
    • Precise: (Default) Produces more conservative responses.
    • Balance: A middle ground between Improvise and Precise.
  • Temperature: The randomness level of the model's output.
    Defaults to 0.1.
    • Lower values lead to more deterministic and predictable outputs.
    • Higher values lead to more creative and varied outputs.
    • A temperature of zero makes the output effectively deterministic: the same prompt generally yields the same output.
  • Top P: Nucleus sampling.
    • Reduces the likelihood of generating repetitive or unnatural text by setting a threshold P and restricting sampling to the smallest set of most probable tokens whose cumulative probability reaches P.
    • Defaults to 0.3.
  • Presence penalty: Encourages the model to include a more diverse range of tokens in the response.
    • A higher presence penalty value makes the model more likely to generate tokens that have not yet been included in the generated text.
    • Defaults to 0.4.
  • Frequency penalty: Discourages the model from repeating the same words or phrases too frequently in the generated text.
    • A higher frequency penalty value results in the model being more conservative in its use of repeated tokens.
    • Defaults to 0.7.
  • Max tokens:
    This sets the maximum length of the model's output, measured in the number of tokens (words or pieces of words). It is disabled by default, allowing the model to determine the number of tokens in its responses.
NOTE
  • It is not necessary to stick with the same model for all components. If a specific model is not performing well for a particular task, consider using a different one.
  • If you are uncertain about the mechanism behind Temperature, Top P, Presence penalty, and Frequency penalty, simply choose one of the three options of Creativity.
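
To make the four sampling settings above more concrete, here is a minimal Python sketch of how they can act on a model's next-token scores. It is an illustration only: the function names and the toy logits are hypothetical, the penalty formula follows one common formulation (subtracting the penalties from the logits of tokens already generated), and the exact behavior depends on the model provider you select.

```python
import math
from collections import Counter


def apply_penalties(logits, generated, presence_penalty=0.4, frequency_penalty=0.7):
    """Lower the logits of tokens that already appear in `generated`.

    Assumed formulation: the frequency penalty scales with how often a token
    has appeared; the presence penalty is applied once if it appeared at all.
    """
    counts = Counter(generated)
    adjusted = {}
    for token, logit in logits.items():
        count = counts.get(token, 0)
        seen = 1 if count > 0 else 0
        adjusted[token] = logit - frequency_penalty * count - presence_penalty * seen
    return adjusted


def softmax_with_temperature(logits, temperature=0.1):
    """Turn logits into probabilities; lower temperatures sharpen the distribution."""
    if temperature <= 0:
        # Treat zero as greedy decoding: all probability on the top token.
        best = max(logits, key=logits.get)
        return {token: (1.0 if token == best else 0.0) for token in logits}
    scaled = {token: logit / temperature for token, logit in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {token: math.exp(value - peak) for token, value in scaled.items()}
    total = sum(exps.values())
    return {token: value / total for token, value in exps.items()}


def top_p_filter(probs, top_p=0.3):
    """Keep the smallest set of most probable tokens whose cumulative
    probability reaches `top_p`, then renormalize."""
    kept = {}
    cumulative = 0.0
    for token, p in sorted(probs.items(), key=lambda item: item[1], reverse=True):
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}


# Toy example: "cat" starts with the highest score but was already generated twice.
logits = {"cat": 2.0, "dog": 1.5, "fish": 0.5}
generated = ["cat", "cat"]

penalised = apply_penalties(logits, generated)
probs = softmax_with_temperature(penalised, temperature=0.1)
candidates = top_p_filter(probs, top_p=0.3)
print(candidates)
```

With the default values listed above, the repeated token "cat" is penalized below "dog", the low temperature concentrates almost all probability on "dog", and the Top P filter leaves "dog" as the only candidate to sample from.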

Result destination

Select the type of output to be generated by the LLM:

  • Summary
  • Keywords
  • Questions
  • Metadata

System prompt

Typically, you use the system prompt to describe the task for the LLM, specify how it should respond, and outline other miscellaneous requirements. This topic is not covered further here, as it is as broad as prompt engineering itself.

NOTE

The system prompt here automatically updates to match your selected Result destination.

User prompt

The user-defined prompt. For example, you can type / or click (x) to insert variables from preceding components in the ingestion pipeline as the LLM's input.

Output

The global variable name for the output of the Transformer component, which can be referenced by subsequent Transformer components in the ingestion pipeline.

  • Default: chunks
  • Type: Array<Object>
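
As a rough mental model of an Array<Object> output passed between chained Transformer components, the Python sketch below treats each chunk as a dictionary and each Transformer as a step that adds one field. Everything here is hypothetical: the field names (text, keywords, summary), the run_transformer helper, and the stand-in functions that replace the LLM call are assumptions made for illustration, not the component's actual schema or API.

```python
from typing import Any, Callable

Chunk = dict[str, Any]  # hypothetical: one object in the output array


def run_transformer(
    chunks: list[Chunk],
    destination: str,
    llm: Callable[[str], Any],
) -> list[Chunk]:
    """Return new chunk objects with the LLM result stored under `destination`."""
    return [{**chunk, destination: llm(chunk["text"])} for chunk in chunks]


def fake_keywords(text: str) -> list[str]:
    """Stand-in for an LLM call: pick words longer than six characters."""
    words = [word.strip(".,").lower() for word in text.split()]
    return sorted({word for word in words if len(word) > 6})


def fake_summary(text: str) -> str:
    """Stand-in for an LLM call: keep only the first sentence."""
    return text.split(".")[0] + "."


chunks = [{"text": "Transformers enrich chunks. They run before the Indexer."}]

# Two Transformer steps in sequence; the second consumes the first one's output.
chunks = run_transformer(chunks, "keywords", fake_keywords)
chunks = run_transformer(chunks, "summary", fake_summary)
print(chunks[0])
```

Each step returns new objects rather than mutating its input, which keeps the output of an earlier step available unchanged to any other component that references it.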