Human-in-the-Loop for LLMs in Production
Capturing human feedback is crucial for understanding how your application performs in real-world scenarios. Confident AI enables you to incorporate human input into your evaluation cycle through `deepeval`, which makes it easy to send feedback for a specific LLM response by specifying the `response_id` returned from the `monitor()` function.
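For context, here is a minimal sketch of how a `response_id` is obtained in the first place (the `monitor()` argument values shown are illustrative placeholders, not a complete list):

```python
import deepeval

# Monitor a single LLM response in production; monitor() returns a
# response_id that can later be used to attach human feedback.
response_id = deepeval.monitor(
    event_name="Chatbot",  # illustrative: name of the monitored event
    model="gpt-4",         # illustrative: the model that generated the response
    input="Who's a good boy?",
    response="Me, of course!",
)
```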
There are two types of feedback:
- User provided feedback. User feedback is feedback provided by your end users. If you're building a Discord bot, for example, this will likely come from the users interacting with your bot on a daily basis.
- Reviewer provided feedback. Reviewer feedback is feedback provided by your LLM quality assurance personnel, whoever they might be. These might be data annotators and labellers, or even domain experts checking the quality of LLM responses in production.
User provided feedback can be sent via `deepeval`, while reviewers can provide feedback from Confident AI's observatory.
Incorporating simple and effortless feedback mechanisms, such as thumbs-up/thumbs-down or star-rating buttons in your user interface, may encourage users to leave feedback.
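For example, here is a minimal sketch of a handler that maps a binary thumbs-up/thumbs-down signal onto the 1-5 rating scale used by `send_feedback()` (the handler name and the rating mapping are assumptions, not part of the API; `send_feedback()` itself is covered in the next section):

```python
import deepeval

# Hypothetical UI handler: maps a thumbs-up/down click onto the
# 1-5 rating scale that deepeval.send_feedback() expects.
def on_thumb_clicked(response_id: str, thumbs_up: bool) -> None:
    deepeval.send_feedback(
        response_id=response_id,
        rating=5 if thumbs_up else 1,  # assumed mapping: 5 = up, 1 = down
    )
```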
Collecting User Feedback
Using the `response_id` returned from `monitor()`, here's how you can send feedback to Confident:
```python
import deepeval

response_id = deepeval.monitor(...)

deepeval.send_feedback(
    response_id=response_id,
    rating=5,
    explanation="...",
    expected_response="..."
)
```
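In practice, feedback usually arrives some time after the response was generated, so you will typically need to keep the `response_id` available (for example, attached to the rendered message in your UI) so the feedback handler can reference it later.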
There are two mandatory and four optional parameters when using the `send_feedback()` function:

- `response_id`: a string representing the `response_id` returned from the `deepeval.monitor()` function.
- `rating`: an integer ranging from 1 to 5, inclusive.
- [Optional] `explanation`: a string that serves as the explanation for the given rating.
- [Optional] `expected_response`: a string representing what the ideal response is.
- [Optional] `fail_silently`: a boolean which when set to `True` will neither print nor raise any exceptions on error. Defaulted to `False`.
- [Optional] `raise_exception`: a boolean which when set to `False` will not raise any exceptions on error. Defaulted to `True`.
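For instance, in a production code path you may prefer that a failed feedback call never interrupts the user-facing request; a minimal sketch using the optional flags above:

```python
import deepeval

# Sketch: suppress feedback errors in a production request path so a
# failing feedback call never breaks the user experience.
deepeval.send_feedback(
    response_id=response_id,  # obtained earlier from deepeval.monitor()
    rating=4,
    fail_silently=True,  # neither print nor raise on error
)
```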
The `send_feedback()` method blocks the main thread. To send feedback asynchronously, use the `a_send_feedback()` method instead, which has the exact same function signature:
```python
import asyncio
import deepeval

async def send_feedback_concurrently():
    # send multiple feedbacks at once without blocking the main thread
    await asyncio.gather(
        deepeval.a_send_feedback(...),
        deepeval.a_send_feedback(...)
    )

asyncio.run(send_feedback_concurrently())
```
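This pattern is particularly useful inside async web frameworks, where calling the blocking `send_feedback()` would stall the event loop and delay other requests.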