Human-in-the-Loop for LLMs in Production
Capturing human feedback is crucial for understanding how your application performs in real-world scenarios. Confident AI enables you to incorporate human input into your evaluation cycle through `deepeval`, which makes it easy to send feedback for a specific LLM response by specifying the `response_id` returned from the `monitor()` function.
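For context, here is a minimal sketch of how a `response_id` is obtained in the first place (the `monitor()` argument values shown are illustrative placeholders, not a complete list):

```python
import deepeval

# Monitor a single LLM response in production; monitor() returns a
# response_id that can later be used to attach human feedback.
response_id = deepeval.monitor(
    event_name="Chatbot",  # illustrative: name of the monitored event
    model="gpt-4",         # illustrative: the model that generated the response
    input="Who's a good boy?",
    response="Me, of course!",
)
```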
There are two types of feedback:
- User provided feedback. User feedback is feedback provided by your end users. If you're building a Discord bot, for example, this will likely come from the users interacting with your bot on a daily basis.
- Reviewer provided feedback. Reviewer feedback is feedback provided by your LLM quality assurance personnel, whoever they might be. These might be data annotators and labellers, or even domain experts checking the quality of LLM responses in production.
User provided feedback can be sent via `deepeval`, while reviewers can provide feedback from Confident AI's observatory.
Incorporating simple and effortless feedback mechanisms, such as thumbs-up/thumbs-down or star-rating buttons in your user interface, may encourage users to leave feedback.
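For example, here is a minimal sketch of a handler that maps a binary thumbs-up/thumbs-down signal onto the 1-5 rating scale used by `send_feedback()` (the handler name and the rating mapping are assumptions, not part of the API; `send_feedback()` itself is covered in the next section):

```python
import deepeval

# Hypothetical UI handler: maps a thumbs-up/down click onto the
# 1-5 rating scale that deepeval.send_feedback() expects.
def on_thumb_clicked(response_id: str, thumbs_up: bool) -> None:
    deepeval.send_feedback(
        response_id=response_id,
        rating=5 if thumbs_up else 1,  # assumed mapping: 5 = up, 1 = down
    )
```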
Collecting User Feedback
Using the `response_id` returned from `monitor()`, here's how you can send feedback to Confident:
```python
import deepeval

response_id = deepeval.monitor(...)

deepeval.send_feedback(
    response_id=response_id,
    rating=5,
    explanation="...",
    expected_response="..."
)
```
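In practice, feedback usually arrives some time after the response was generated, so you will typically need to keep the `response_id` available (for example, attached to the rendered message in your UI) so the feedback handler can reference it later.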
There are two mandatory and four optional parameters when using the `send_feedback()` function:

- `response_id`: a string representing the `response_id` returned from the `deepeval.monitor()` function.
- `rating`: an integer ranging from 1 to 5, inclusive.
- [Optional] `explanation`: a string that serves as the explanation for the given rating.
- [Optional] `expected_response`: a string representing what the ideal response is.
- [Optional] `fail_silently`: a boolean which when set to `True` will neither print nor raise any exceptions on error. Defaulted to `False`.
- [Optional] `raise_exception`: a boolean which when set to `False` will not raise any exceptions on error. Defaulted to `True`.
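For instance, in a production code path you may prefer that a failed feedback call never interrupts the user-facing request; a minimal sketch using the optional flags above:

```python
import deepeval

# Sketch: suppress feedback errors in a production request path so a
# failing feedback call never breaks the user experience.
deepeval.send_feedback(
    response_id=response_id,  # obtained earlier from deepeval.monitor()
    rating=4,
    fail_silently=True,  # neither print nor raise on error
)
```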
The `send_feedback()` method blocks the main thread. To send feedback asynchronously, use the `a_send_feedback()` method instead, which has the exact same function signature:
```python
import asyncio
import deepeval

async def send_feedback_concurrently():
    # send multiple feedbacks at once without blocking the main thread
    await asyncio.gather(
        deepeval.a_send_feedback(...),
        deepeval.a_send_feedback(...)
    )

asyncio.run(send_feedback_concurrently())
```
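This pattern is particularly useful inside async web frameworks, where calling the blocking `send_feedback()` would stall the event loop and delay other requests.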