Skip to main content

Iterating on Hyperparameters

In this section, we'll iterate on our summarizer's hyperparameters to enhance summary quality. Our focus is improving the completeness factor, as one of the summaries in the previous section failed to meet the threshold for the Completeness GEval metric.

Revisiting the Summarizer's Hyperparameters

Let's first revisit the hyperparameters (prompt template and model) we used to generate these summaries, which we first defined in the introduction section of this tutorial.

prompt_template = """
You are an AI assistant tasked with summarizing legal documents
concisely and accurately. Given the following legal text, generate
a summary that captures the key points while avoiding unnecessary
details. Ensure neutrality and refrain from interpreting beyond the
provided text.
"""
model = "gpt-3.5"

Improving the Prompt Template

In our original prompt template, we instructed our legal summarizer to:

"Generate a summary that captures the key points while avoiding unnecessary details."

note

This phrasing may have inadvertently led the LLM to exclude details it considered unnecessary (even though it wasn't), potentially omitting important context.

To address the issue above, we can redefine it to ensure that the summary remains concise without omitting details from the document, refining the prompt as follows:

prompt_template = """
You are an AI assistant tasked with summarizing legal documents
concisely and accurately. Given the following legal text, generate
a summary that captures the key points. Be concise as possible, but
DO NOT omit any details from the document. Ensure neutrality and refrain
from interpreting beyond the provided text.
"""

Testing the Prompt Template Change

In order to test our prompt template improvement, we'll need to re-run our evaluations on the same dataset from the previous section and look out for improvements. To do so, we'll generate summaries for the same set of input documents using our improved models and pass the new test cases into an EvaluationDataset for evaluation.

new_dataset = ...
evaluate(
new_dataset, metrics=[concision_metric, completeness_metric]
hyperparameters={
"model"=model,
"prompt_template"=prompt_template
}
)
tip

You can see that we're now logging hyperparameters with the evaluate function, which makes it easy to compare evaluations across runs on Confident AI and attribute score changes to specific hyperparameter changes. Learn more about the evaluate function here.

New Evaluation Report

Our prompt change was effective, improving the previously failing summary so that it now passes. However, the new prompt template has introduced an entirely new issue with the third test case, which is now failing the concision metric.

Click to see new failing test case

failing_llm_test_case = LLMTestCase(
input="""
SOFTWARE MAINTENANCE AND SUPPORT AGREEMENT

This Software Maintenance and Support Agreement ("Agreement") is made as of March 1, 2025, by and between OmegaTech Solutions, Inc., a corporation organized under the laws of the State of Texas with its principal place of business at 789 Innovation Way, Austin, TX 73301 ("Provider"), and DeltaCorp Ltd., a corporation organized under the laws of the State of Florida with its principal place of business at 321 Enterprise Blvd, Miami, FL 33101 ("Client").

WHEREAS, Client owns and operates proprietary software applications and seeks ongoing maintenance and support services;
WHEREAS, Provider specializes in software support, maintenance, and troubleshooting;
NOW, THEREFORE, in consideration of the mutual promises herein, the parties agree as follows:

### **1. SERVICES**
Provider agrees to provide maintenance, troubleshooting, bug fixes, performance optimization, and security updates for Client’s software. Support shall be provided via email, phone, and an online ticketing system. Provider shall respond to critical issues within four (4) hours and non-critical issues within twenty-four (24) hours.

### **2. TERM & TERMINATION**
This Agreement shall commence on March 1, 2025, and continue for an initial period of twelve (12) months, renewing automatically unless terminated with thirty (30) days' written notice by either party. Early termination requires payment for all services rendered up to the termination date.

### **3. COMPENSATION**
Client agrees to pay Provider a fixed monthly fee of Fifteen Thousand Dollars ($15,000), due on the first day of each month. Additional services outside the agreed scope will be billed at an hourly rate of Two Hundred Dollars ($200). Payments overdue by more than ten (10) days will incur a 2% late fee per month.

### **4. CONFIDENTIALITY**
Both parties shall maintain the confidentiality of proprietary software, technical information, and business data. Confidentiality obligations remain in effect for three (3) years post-termination.

### **5. WARRANTIES & LIABILITY**
Provider warrants that maintenance services will be performed in a professional manner. Provider is not liable for indirect or consequential damages. Liability is limited to the total fees paid under this Agreement.

### **6. DISPUTE RESOLUTION**
Disputes shall be resolved through mediation. If unresolved, disputes shall be settled via binding arbitration in Texas under the rules of the American Arbitration Association.

### **7. GENERAL PROVISIONS**
This Agreement constitutes the entire understanding between the parties and supersedes prior agreements. It shall be governed by the laws of the State of Texas.

IN WITNESS WHEREOF, the parties execute this Agreement as of the Effective Date.

Signed,

**OmegaTech Solutions, Inc.**
By: __________________________
Name: Michael Johnson
Title: CEO

**DeltaCorp Ltd.**
By: __________________________
Name: Sarah Reynolds
Title: CTO
""",
actual_output="""
This agreement, a legally binding contract between **OmegaTech Solutions, Inc.** (Provider) and **DeltaCorp Ltd.** (Client), establishes the terms under which the Provider will perform software maintenance and support services for the Client’s proprietary software. The contract outlines the services to be provided, which include **bug fixes, security updates, system troubleshooting, and performance optimization**. The same services, namely **bug fixes, security updates, system troubleshooting, and performance optimization**, will be carried out as agreed.

The contract begins on **March 1, 2025**, and remains in effect for **twelve months**, automatically renewing unless one party provides at least **thirty (30) days' written notice** prior to the renewal date. Without such notice, the agreement will continue for another **twelve months** under the same conditions.

The Client is required to pay a **fixed monthly fee of $15,000**, which must be paid in full on the **first day of each month**. Payments must be made on time, as overdue payments beyond **ten (10) days** will result in a **2% penalty per month**. In the event of late payments, a **2% penalty per month** will be applied until the outstanding balance is settled. Additional services outside the agreed scope are billed at **$200 per hour**, and work exceeding the initial agreement will be charged at **$200 per hour**.

Both parties agree to strict **confidentiality obligations**, ensuring that all proprietary, sensitive, and business-critical information remains protected during the contract period. These confidentiality obligations remain in effect for **three (3) years** after termination. After termination, the confidentiality obligations will continue to apply for **three (3) years** to prevent any unauthorized disclosure.

The Provider’s **liability is limited to the total fees paid**, meaning the maximum financial responsibility of the Provider cannot exceed the total payments made by the Client. Under no circumstances will the Provider be responsible for **indirect, incidental, special, or consequential damages**. Furthermore, the Provider **is not liable for indirect, incidental, special, or consequential damages**, ensuring that its responsibility does not extend beyond the total fees paid under this agreement.

In the event of a **dispute**, both parties agree to first attempt **mediation** to resolve the matter. If mediation is unsuccessful, the dispute will proceed to **binding arbitration in Texas**. If mediation fails, **binding arbitration in Texas** will be the final resolution process.

This agreement represents the **entire understanding** between the parties, replacing and superseding all previous discussions, negotiations, and prior agreements. No previous discussions, negotiations, or prior agreements will have any effect, as this agreement is the final, complete, and binding understanding between both parties.

The contract is governed by **Texas law** and will be interpreted in accordance with **the laws of the State of Texas**.
"""
)

note

In the next section, we'll learn how to compare evaluation test runs and catch these regressions.