Launch an Evaluation Round

An Evaluation is a background process that evaluates an Application using a set of prompts and reference answers. It checks the quality of the generated predictions using one or more metrics. The following metrics are supported:

  • equivalence
  • groundedness

You can provide the evaluation data for an Evaluation in two ways:

  1. CSV File
    Upload an evalset_file as a CSV with the following columns (a sample file appears after this list):

    • prompt: The question.
    • reference: The correct (gold) answer.

    To evaluate your own predictions, also provide these extra columns:

    • response: Include this column to measure equivalence.
    • knowledge: Include this column to evaluate groundedness.
  2. Dataset Name
    Use an evalset_name that refers to an evaluation_set Dataset created through the Dataset API (see the request sketch after this list).
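
For reference, a minimal evalset_file could look like the following. The prompt and reference columns are required; response and knowledge are only needed when you evaluate your own predictions. The row contents are illustrative.

```csv
prompt,reference,response,knowledge
"What is the capital of France?","Paris","Paris","Paris is the capital and largest city of France."
"Who wrote Hamlet?","William Shakespeare","Shakespeare","Hamlet is a tragedy written by William Shakespeare."
```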
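
To show how the two options fit together, here is a minimal Python sketch of launching an Evaluation Round. The base URL, endpoint path, and metrics field are hypothetical placeholders, and Bearer authentication is assumed; only evalset_file, evalset_name, and the metric names come from this page.

```python
import requests

BASE_URL = "https://api.example.com/v1"  # hypothetical base URL
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}  # assumed Bearer auth

# Option 1: upload the evaluation data as a CSV file (evalset_file).
with open("evalset.csv", "rb") as f:
    response = requests.post(
        f"{BASE_URL}/evaluations",  # hypothetical endpoint
        headers=HEADERS,
        files={"evalset_file": ("evalset.csv", f, "text/csv")},
        data={"metrics": "equivalence,groundedness"},  # hypothetical field
    )
print(response.json())

# Option 2: reference an evaluation_set Dataset by name (evalset_name).
response = requests.post(
    f"{BASE_URL}/evaluations",  # hypothetical endpoint
    headers=HEADERS,
    json={"evalset_name": "my-evaluation-set"},
)
print(response.json())
```

Because an Evaluation runs in the background, the response would typically include an identifier you can poll for status; the exact shape depends on the actual API.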
