Launch an Evaluation round.

An Evaluation is an asynchronous operation that evaluates an Application on a set of test questions and reference answers. When launching an Evaluation, you can select one or more metrics to assess the quality of the generated answers; these metrics include equivalence, factuality, and unit_test.
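As a rough sketch of what a launch request might carry, the helper below assembles a request body selecting one or more metrics. The field names, application ID, and validation logic here are illustrative assumptions, not the documented request schema.

```python
import json

# Metrics named in the description above; treated here as the allowed set.
ALLOWED_METRICS = {"equivalence", "factuality", "unit_test"}

def build_evaluation_request(application_id, metrics):
    """Assemble a hypothetical Evaluation request body.

    application_id and the field names are assumptions for illustration.
    """
    unknown = set(metrics) - ALLOWED_METRICS
    if unknown:
        raise ValueError(f"unsupported metrics: {sorted(unknown)}")
    return {"application_id": application_id, "metrics": list(metrics)}

body = build_evaluation_request("app_123", ["equivalence", "factuality"])
print(json.dumps(body))
```

Because the operation is asynchronous, a real client would submit this body and then poll for the Evaluation's status rather than block on a result.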

Evaluation test set data can be provided in one of two forms:
- A CSV evalset_file containing the columns prompt and reference (i.e. gold answers)
- A dataset_name that refers to an evaluation_set Dataset created through the Dataset API
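To illustrate the first form, an evalset_file is simply a two-column CSV. The sketch below builds one in memory using the standard library; the example rows are hypothetical. The second form would instead pass a dataset_name string in place of a file.

```python
import csv
import io

def make_evalset_csv(rows):
    """Write (prompt, reference) pairs to an in-memory CSV evalset_file
    with the required prompt and reference (gold-answer) columns."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["prompt", "reference"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# Hypothetical test rows for illustration only.
csv_text = make_evalset_csv([
    {"prompt": "What is the capital of France?", "reference": "Paris"},
])
print(csv_text)
```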
