Filterable and sortable evaluations overview
We've improved the evaluation runs overview page to make it easier for your team to find interesting or important runs.
Projects rename and file creation flow
We've renamed Projects to Prompts and Tools as part of our move towards managing Prompts, Tools, Evaluators, and Datasets as special-cased and strictly versioned files in your Humanloop directories.
Logging token usage
We're now computing and storing the number of tokens used in both the requests to and responses from the model.
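As a rough illustration of what this involves, the sketch below extracts token counts from a provider response that follows OpenAI's chat completion `usage` schema (`prompt_tokens`, `completion_tokens`, `total_tokens`). This is an assumption for illustration, not Humanloop's internal implementation; other providers may use different field names.

```python
# Sketch: reading token counts from an OpenAI-style response payload.
# Field names follow OpenAI's chat completion `usage` object; this is an
# illustrative assumption, not Humanloop's actual implementation.

def token_usage(response: dict) -> dict:
    """Return prompt, completion, and total token counts from a response."""
    usage = response.get("usage", {})
    prompt = usage.get("prompt_tokens", 0)
    completion = usage.get("completion_tokens", 0)
    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        # Fall back to summing if the provider omits the total.
        "total_tokens": usage.get("total_tokens", prompt + completion),
    }

# Example with a mocked provider response:
resp = {"usage": {"prompt_tokens": 12, "completion_tokens": 30, "total_tokens": 42}}
counts = token_usage(resp)
print(counts["total_tokens"])  # → 42
```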
Control logging level
We've added a `save` flag to all of our endpoints that generate logs on Humanloop, so you can control whether request and response payloads that may contain sensitive information are persisted on our servers.
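In practice this means including `save` in the request body when calling a log-generating endpoint. The sketch below builds such a payload; the field names other than `save` (here `inputs`) are illustrative assumptions rather than the exact API schema.

```python
# Sketch: opting out of log persistence with the `save` flag.
# Only the `save` field is taken from the changelog; the surrounding
# payload shape is an illustrative assumption, not the exact API schema.
import json

def build_log_request(inputs: dict, save: bool = True) -> dict:
    """Build a request body; save=False asks Humanloop not to persist
    the request/response payloads on its servers."""
    return {"inputs": inputs, "save": save}

# Example: a request whose payloads should not be stored.
payload = build_log_request({"question": "What is our refund policy?"}, save=False)
print(json.dumps(payload))
```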
Logging provider request
We're now capturing the raw provider request body alongside the existing provider response for all logs generated from our deployed endpoints.
Add Evaluators to existing runs
You can now add an evaluator to any existing evaluation run. This is helpful when you have no need to regenerate logs across a dataset, but simply want to run new evaluators over the existing run. By doing this instead of launching a fresh run, you can save the significant time and cost of unnecessarily regenerating logs, especially when working with large datasets.
Improved Evaluation Debug Console
We've enhanced the usability of the debug console when creating and modifying evaluators. Now you can more easily inspect the data you are working with, and understand the root causes of errors to make debugging quicker and more intuitive.
Tool projects
We have upgraded projects to also work for tools. Tool projects are automatically created for tools you define as part of your model config in the Editor, as well as for tools managed at the organization level.
Support for new OpenAI Models
Following OpenAI's latest model releases, you will find support for all the latest models in our Playground and Editor.