Run experiments with your own model provider

How to set up an experiment on Humanloop with your own models.

Experiments can be used to compare different prompt templates, different hyperparameter combinations (such as temperature and presence penalties) and even different base models.

This guide focuses on the case where you wish to manage your own model provider calls.

Prerequisites

  1. You have already created a project - if not, first follow our project creation guides.
  2. You have integrated humanloop.generate() and humanloop.feedback(), using either the API or the Python SDK.

📘

Using other model providers

This guide assumes you are managing your own model provider calls. If you are using an OpenAI model, we recommend instead following our run an experiment guide, which results in a simpler integration.

Support for other model providers on Humanloop is coming soon.

Create an experiment

  1. Navigate to the Experiments tab of your project.
  2. Click the Create new experiment button:
    1. Give your experiment a descriptive name.
    2. Select the feedback labels that should count as positive actions - these are used to calculate the performance of each of your model configs during the experiment.
    3. Select which of your project’s model configs you wish to compare.
  3. Click the Create button.

Log to your experiment

To log data to your experiment without using hl.generate(), you first need to determine which model config to use for your LLM provider calls. This is where the hl.get_model_config() function comes in.

  1. First navigate to the Experiments tab of your project and select your Experiment card.
  2. Copy the experiment_id from the experiment summary:

  3. Alter your existing logging code to first sample a model_config from your experiment to use when making your call to OpenAI:

    import humanloop as hl
    import openai
    
    # Initialize the SDK with your Humanloop API key
    hl.init(api_key="<YOUR HUMANLOOP API KEY>")
    
    # Authenticate with OpenAI directly, as you are managing the provider call yourself.
    openai.api_key = "<YOUR OPENAI API KEY>"
    
    # The experiment ID copied from the experiment summary above.
    experiment_id = "<YOUR EXPERIMENT ID>"
    
    # Sample a model_config from your experiment.
    model_config = hl.get_model_config(experiment_id=experiment_id)
    
    # Make a generation with OpenAI using the parameters from the sampled model_config.
    response = openai.Completion.create(
        prompt="Answer the following question like Paul Graham from YCombinator:\n"
        "How should I think about competition for my startup?",
        model=model_config.model,
        temperature=model_config.temperature,
    )
    
    # Parse the output from the OpenAI response.
    output = response.choices[0].text
    
    # Log the inputs and output to the experiment trial associated with the sampled model_config.
    log_response = hl.log(
        project="<YOUR UNIQUE PROJECT NAME>",
        inputs={"question": "How should I think about competition for my startup?"},
        output=output,
        trial_id=model_config.trial_id,
    )
    
    # Use this ID to associate feedback received later with this datapoint.
    data_id = log_response.id
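
  4. Record feedback against the logged datapoint using the data_id, so the experiment can credit the sampled model_config. Below is a minimal sketch using the humanloop.feedback() call from the prerequisites - the type and value parameters shown here are assumptions, so check the feedback guide for the exact signature:

    # Record end-user feedback against the datapoint logged above.
    # NOTE: the "type" and "value" parameter names are assumptions;
    # consult the feedback guide for the exact signature. The value should
    # match one of the feedback labels you selected as positive actions.
    hl.feedback(
        type="rating",
        value="good",
        data_id=data_id,
    )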