post https://api.contextual.ai/v1/applications/{application_id}/evaluate
Launch an Evaluation round.
An `Evaluation` is an asynchronous operation that evaluates an `Application` on a set of test questions and reference answers. An `Evaluation` can select one or more metrics to assess the quality of generated answers. These metrics include `equivalence`, `factuality`, and `unit_test`.
`Evaluation` test set data can be provided in one of two forms:

- A CSV `evalset_file` containing the columns `prompt` and `reference` (i.e. gold answers)
- A `dataset_name` that refers to an `evaluation_set` `Dataset` created through the `Dataset` API
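As a minimal sketch of the CSV form, the snippet below builds an in-memory evalset with the required `prompt` and `reference` columns and posts it to the endpoint. The header name, multipart field names, and response shape are assumptions for illustration; consult the API reference for the exact request format.

```python
import csv
import io


def build_evalset_csv(rows):
    """Build an in-memory CSV with the required `prompt` and `reference` columns."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["prompt", "reference"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def launch_evaluation(application_id, api_key, evalset_csv, metrics):
    """Sketch of launching an Evaluation round via a multipart upload.

    Uses the `requests` library; field names (`evalset_file`, `metrics`)
    and the bearer-token header are assumptions, not confirmed API details.
    """
    import requests

    url = f"https://api.contextual.ai/v1/applications/{application_id}/evaluate"
    resp = requests.post(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        files={"evalset_file": ("evalset.csv", evalset_csv, "text/csv")},
        data={"metrics": metrics},  # e.g. ["equivalence", "factuality"]
    )
    resp.raise_for_status()
    # The operation is asynchronous, so the response is expected to
    # identify the launched Evaluation rather than contain results.
    return resp.json()


csv_text = build_evalset_csv(
    [{"prompt": "What is the capital of France?", "reference": "Paris"}]
)
```

Because the operation is asynchronous, a client would typically store the identifier from the response and poll a corresponding status endpoint until the metrics are available.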