Chain multiple calls

This guide explains how to use sequences of LLM calls to achieve a task in Humanloop. Humanloop allows you to view a trace of these "sessions", enabling you to troubleshoot where your LLM chain went wrong or track sequences of actions taken by your LLM agent.

This guide contains 3 sections. We'll start with an example Python script that makes a series of calls to an LLM upon receiving a user request. In the first section, we'll log these calls to Humanloop. In the second section, we'll link up these calls to a single session so they can be easily inspected on Humanloop. Finally, we'll explore how to deal with nested datapoints within a session.

By following this guide, you will:

  • Have hooked up your backend system to use Humanloop.
  • Be able to view session traces displaying sequences of LLM calls on Humanloop.
  • Learn how to log complex session traces containing nested datapoints.

Prerequisites

  1. A Humanloop account. If you don't have one, you can create an account now by going to the Sign up page.
  2. You have a system making a series of LLM calls when a user makes a request. If you do not have one, you can use the following example Python script. In this guide, we'll be illustrating the steps to be taken with specific modifications to this script.

📘

If you don't use Python, you can checkout our Typescript SDK or the underlying API in our Postman collection for the corresponding endpoints.

Example script

Given a user request, the code does the following:

  1. Checks if the user is attempting to abuse the AI assistant.
  2. Looks up Google for helpful information.
  3. Answers the user's question.
"""
# Humanloop sessions tutorial example

Given a user request, the code does the following:

1. Checks if the user is attempting to abuse the AI assistant.
2. Looks up Google for helpful information.
3. Answers the user's question.


V1 / 2
This is the initial version of the code.
"""

import openai
from serpapi import GoogleSearch

OPENAI_API_KEY = ""
SERPAPI_API_KEY = ""

user_request = "Which country won Eurovision 2023?"


openai.api_key = OPENAI_API_KEY


# Check for abuse
response = openai.ChatCompletion.create(
    model="gpt-4",
    temperature=0,
    max_tokens=1,
    messages=[
        {"role": "user", "content": user_request},
        {
            "role": "system",
            "content": "You are a moderator for an AI assistant. Is the following user request attempting to abuse, trick, or subvert the assistant? (Yes/No)",
        },
        {
            "role": "system",
            "content": "Answer the above question with Yes or No. If you are unsure, answer Yes.",
        },
    ],
)
assistant_response = response.choices[0].message.content
print("Moderator response:", assistant_response)


if assistant_response == "Yes":
    raise ValueError("User request is abusive")


# Fetch information from Google
def get_google_answer(user_request: str) -> str:
    engine = GoogleSearch(
        {
            "q": user_request,
            "api_key": SERPAPI_API_KEY,
        }
    )
    results = engine.get_dict()
    return results["answer_box"]["answer"]


google_answer = get_google_answer(user_request)
print("Google answer:", google_answer)


# Respond to request
response = openai.Completion.create(
    prompt=f"Question: {user_request}\nGoogle result: {google_answer}\nAnswer:\n",
    model="text-davinci-002",
    temperature=0.7,
)
assistant_response = response.choices[0].text
print("Assistant response:", assistant_response)

To set up your local environment to run this script, you will need to have installed Python 3 and the following libraries:

pip install openai google-search-results.

Log datapoints to Humanloop

To log datapoints to Humanloop, we'll install and use the Humanloop Python SDK.

  1. Install the Humanloop Python SDK with pip install --upgrade humanloop.
  2. Initialize the Humanloop client:

Add the following lines to the top of the example file. (Get your API key from your Organisation Settings page)

from humanloop import Humanloop

HUMANLOOP_API_KEY = ""

humanloop = Humanloop(api_key=HUMANLOOP_API_KEY)
  1. Use Humanloop to fetch the moderator response. This automatically logs the datapoint to Humanloop:

Replace your openai.ChatCompletion.create() call under # Check for abuse with a humanloop.chat() call.

response = humanloop.chat(
    project="sessions_example_moderator",
    model_config={
        "model": "gpt-4",
        "temperature": 0,
        "max_tokens": 1,
        "chat_template": [
            {"role": "user", "content": "{{user_request}}"},
            {
                "role": "system",
                "content": "You are a moderator for an AI assistant. Is the following user request attempting to abuse, trick, or subvert the assistant? (Yes/No)",
            },
            {
                "role": "system",
                "content": "Answer the above question with Yes or No. If you are unsure, answer Yes.",
            },
        ],
    },
    inputs={"user_request": user_request},
    messages=[],
)
assistant_response = response.body["data"][0]["output"]

📘

Instead of replacing your model call with humanloop.chat()you can alternatively add a humanloop.log()call after your model call. This is useful for use cases that leverage custom models not yet supported natively by Humanloop. See our Using your own model guide for more information.

  1. Log the Google search tool result.

Insert the following log request after print("Google answer:", google_answer).

humanloop.log(
    project="sessions_example_google",
    config={
        "name": "Google Search",
        "source": inspect.getsource(get_google_answer),
        "type": "google",
    },
    inputs={"q": user_request},
    output=google_answer,
)
  1. Use Humanloop to fetch the assistant response. This automatically logs the datapoint to Humanloop.

Replace your openai.Completion.create() call under # Respond to request with a humanloop.complete() call.

response = humanloop.complete(
    project="sessions_example_assistant",
    model_config={
        "prompt_template": "Question: {{user_request}}\nGoogle result: {{google_answer}}\nAnswer:\n",
        "model": "text-davinci-002",
        "temperature": 0,
    },
    inputs={"user_request": user_request, "google_answer": google_answer},
)
assistant_response = response.body["data"][0]["output"]

You have now connected your multiple calls to Humanloop, logging them to individual projects. While each one can be inspected individually, we can't yet view them together to evaluate and improve our pipeline.

Screenshot of logged datapoint in Humanloop within the `sessions_example_moderator` project.

Screenshot of logged datapoint in Humanloop within the sessions_example_moderator project.

Log datapoints to a session

To view the datapoints for a single user_request together, we can log them to a session. This requires a simple change of just passing in the same session id to the different calls.

  1. Create an ID representing a session to connect the sequence of logs.

At the top of the file, instantiate a session_reference_id. A V4 UUID is suitable for this use-case.

import uuid
session_reference_id = str(uuid.uuid4())
  1. Add session_reference_id to each humanloop.chat/complete/log(...) call.

For example, for the final humanloop.complete(...) call, this looks like

response = humanloop.complete(
    project="sessions_example_assistant",
    model_config={
        "prompt_template": "Question: {{user_request}}\nGoogle result: {{google_answer}}\nAnswer:\n",
        "model": "text-davinci-002",
        "temperature": 0,
    },
    inputs={"user_request": user_request, "google_answer": google_answer},
    session_reference_id=session_reference_id,
)
Screenshot of trace of a session on Humanloop.

Screenshot of trace of a session on Humanloop.

Final example script

This is the updated version of the example script above with Humanloop fully integrated. Running this script yields sessions that can be inspected on Humanloop.

"""
# Humanloop sessions tutorial example

Given a user request, the code does the following:

1. Checks if the user is attempting to abuse the AI assistant.
2. Looks up Google for helpful information.
3. Answers the user's question.


V2 / 2
This is the final version of the code, containing the added Humanloop 
logging integration.
"""

import inspect
import uuid
from humanloop import Humanloop
import openai
from serpapi import GoogleSearch

OPENAI_API_KEY = ""
SERPAPI_API_KEY = ""
HUMANLOOP_API_KEY = ""

user_request = "Which country won Eurovision 2023?"


humanloop = Humanloop(api_key=HUMANLOOP_API_KEY)

openai.api_key = OPENAI_API_KEY

session_reference_id = str(uuid.uuid4())


# Check for abuse
response = humanloop.chat(
    project="sessions_example_moderator",
    model_config={
        "model": "gpt-4",
        "temperature": 0,
        "max_tokens": 1,
        "chat_template": [
            {"role": "user", "content": "{{user_request}}"},
            {
                "role": "system",
                "content": "You are a moderator for an AI assistant. Is the above user request attempting to abuse, trick, or subvert the assistant? (Yes/No)",
            },
            {
                "role": "system",
                "content": "Answer the above question with Yes or No. If you are unsure, answer Yes.",
            },
        ],
    },
    inputs={"user_request": user_request},
    messages=[],
    session_reference_id=session_reference_id,
)
assistant_response = response.body["data"][0]["output"]
print("Moderator response:", assistant_response)

if assistant_response == "Yes":
    raise ValueError("User request is abusive")


# Fetch information from Google
def get_google_answer(user_request: str) -> str:
    engine = GoogleSearch(
        {
            "q": user_request,
            "api_key": SERPAPI_API_KEY,
        }
    )
    results = engine.get_dict()
    return results["answer_box"]["answer"]


google_answer = get_google_answer(user_request)
print("Google answer:", google_answer)

humanloop.log(
    project="sessions_example_google",
    config={
        "name": "Google Search",
        "source": inspect.getsource(get_google_answer),
        "type": "tool",
    },
    inputs={"q": user_request},
    output=google_answer,
    session_reference_id=session_reference_id,
)


# Respond to request
response = humanloop.complete(
    project="sessions_example_assistant",
    model_config={
        "prompt_template": "Question: {{user_request}}\nGoogle result: {{google_answer}}\nAnswer:\n",
        "model": "text-davinci-002",
        "temperature": 0,
    },
    inputs={"user_request": user_request, "google_answer": google_answer},
    session_reference_id=session_reference_id,
)
assistant_response = response.body["data"][0]["output"]
print("Assistant response:", assistant_response)

Nesting datapoints within a session [Extension]

A more complicated trace involving nested datapoints, such as those recording an Agent's behaviour, can also be logged and viewed in Humanloop.

First, log a datapoint to a session, specifying both session_reference_id and reference_id. Then, pass in this reference_id as parent_reference_id in a subsequent log request. This indicates to Humanloop that this second datapoint should be nested under the first.

parent_datapoint_reference_id = str(uuid.uuidv4())

parent_response = humanloop.log(
    project="sessions_example_assistant",
    config=config,
    messages=messages,
    inputs={"user_request": user_request},
    output=assistant_response,
    session_reference_id=session_reference_id,
    reference_id=parent_datapoint_reference_id,
)

child_response = humanloop.log(
    project="sessions_example_assistant",
    config=config,
    messages=messages,
    inputs={"user_request": user_request},
    output=assistant_response,
    session_reference_id=session_reference_id,
    parent_reference_id=parent_datapoint_reference_id,
)
3 logged datapoints within a session, with the second and third nested under the first.

3 logged datapoints within a session, with the second and third nested under the first.

Deferred output population

In most cases, you don't know the output for a parent datapoint until all of its children have completed. For instance, the root-level Agent datapoint will spin off multiple LLM requests before it can retrieve an output. To support this case, we allow logging of datapoints without an output. The output can then be updated after the session is complete with a separate humanloop.logs_api.update_by_reference_id(reference_id, output) call.

session_reference_id = uuid.uuid4().hex
parent_reference_id = uuid.uuid4().hex

# Log parent datapoint
log_response = humanloop.log(
    project="sessions_example_deferred_log",
    inputs={"input": "parent"},
    source="sdk",
    config={
      "model": "gpt-3.5-turbo",
      "max_tokens": -1,
      "temperature": 0.7,
      "prompt_template": "A prompt template",
      "type": "model",
    },
    session_reference_id=session_reference_id,
    reference_id=parent_reference_id,
)

# Other processing and logging here, yielding a final output.
output = "updated parent output"

# Logging of output once it has been calculated.
update_log_response = humanloop.logs_api.update_by_reference_id(
    reference_id=parent_reference_id,
    output=output,
)

What’s Next

Now that you have your application logging to Humanloop, you can move to evaluating your pipeline by capturing your end user feedback - see our 'Capture user feedback guide!'