Evaluating with human feedback

This guide demonstrates how to run a batch generation and collect manual human feedback.

Prerequisites

  1. You need access to Evaluations.
  2. You also need to have created a project; if not, first follow our project creation guides.
  3. Finally, you need at least a few Logs in your project. Use the Editor to generate some if you don't have any yet, or create them through the API as in the sketch below.
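
If you would rather create Logs programmatically than through the Editor, a minimal sketch follows. It assumes a generic REST-style logging endpoint; the base URL, endpoint path, authentication header, and field names are placeholders, so check your project's API reference for the exact schema.

```python
import os
import requests

# Hypothetical sketch: create a Log via the API instead of the Editor.
# The base URL, endpoint path, auth header, and payload fields are
# placeholders, not taken from a real API reference.
API_BASE = "https://api.example.com/v1"   # placeholder base URL
API_KEY = os.environ["API_KEY"]           # placeholder auth scheme

log = {
    "project": "my-project",  # the project the Log belongs to
    "inputs": {"question": "What is human feedback?"},
    "output": "Human feedback is a rating a reviewer applies to a generated output.",
}

response = requests.post(
    f"{API_BASE}/logs",
    headers={"X-API-KEY": API_KEY},
    json=log,
    timeout=30,
)
response.raise_for_status()
print(response.json())
```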

Set up an evaluator to collect human feedback

  1. From the Evaluations page, click New Evaluator and select Human.
  2. Give the evaluator a name and description, then click Create in the top-right.
  3. Return to the Evaluations page and select Run Evaluation.
  4. Choose the model config you are evaluating, the dataset you would like to evaluate against, and then select the new Human evaluator.
  5. Click Batch generate and follow the link in the bottom-right corner to see the evaluation run.
  6. As the rows populate with the generated output from the model, review those outputs and apply feedback in the rating column. Click a row to see the full details of the Log in a drawer.
  7. Apply your feedback either directly in the table or from the drawer.
  8. Once you've finished providing feedback for all the Logs in the run, click Mark as complete in the top right of the page.
  9. You can review the aggregated feedback results in the Stats section on this page; a sketch of this kind of aggregation follows these steps.
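
The Stats view summarizes how often each rating was applied across the run. The snippet below is a minimal sketch of that kind of aggregation over a hypothetical list of ratings; it is illustrative only and does not read data from a real run.

```python
from collections import Counter

# Illustrative ratings for one evaluation run; in practice these would be
# the values reviewers applied in the rating column.
ratings = ["good", "good", "bad", "good", "unsure"]

# Count how often each category was applied and report its share of the run.
counts = Counter(ratings)
total = sum(counts.values())

for category, count in counts.most_common():
    print(f"{category}: {count} ({count / total:.0%})")
```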

Configuring the feedback schema

If you need a more complex feedback schema, visit the Settings page in your project and follow the link to Feedbacks. Here, you can add more categories to the default feedback types. If you need more control over feedback types, you can create new ones via the API.
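
As a rough illustration of the API route, the sketch below creates a new categorical feedback type. The endpoint path, authentication header, and payload fields are assumptions rather than a documented schema, so confirm the exact request shape against the API reference before using it.

```python
import os
import requests

# Hypothetical sketch: register a custom feedback type via the API.
# The endpoint, auth header, and payload fields are placeholders and
# should be replaced with the values from your API reference.
API_BASE = "https://api.example.com/v1"   # placeholder base URL
API_KEY = os.environ["API_KEY"]           # placeholder auth scheme

feedback_type = {
    "project": "my-project",
    "type": "tone",                                 # name of the new feedback type
    "class": "select",                              # categorical feedback
    "categories": ["formal", "neutral", "casual"],  # allowed rating values
}

response = requests.post(
    f"{API_BASE}/feedback-types",
    headers={"X-API-KEY": API_KEY},
    json=feedback_type,
    timeout=30,
)
response.raise_for_status()
print(response.json())
```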