Score cards in PromptLayer automatically calculate and track evaluation scores for your pipelines. A score is computed as soon as an evaluation completes, so you can see how your prompts perform without any extra steps.

How Scoring Works

When an evaluation pipeline finishes running, PromptLayer automatically calculates a score based on the results. There are two types of scoring methods available:

Simple Scores

Simple scores are the default scoring method. They automatically aggregate results from selected columns in your pipeline.

How Simple Scoring Works:
  1. Column Selection: By default, the last column in your pipeline is used for scoring. You can select specific columns to include in the score calculation.
[Screenshot: configuring score columns]
  2. Value Aggregation: For each scoring column, the system:
    • Collects all completed cell values
    • Converts values to booleans (for true/false assertions) or numbers
    • Calculates the mean of all values
  3. Score Types:
    • Boolean scores: Displayed as a percentage (0-100%) representing the ratio of true values
    • Numeric scores: Displayed as the average of all numeric values
  4. Final Score: If multiple columns are selected for scoring, the final score is the mean of all column scores. You will see a breakdown of each column’s contribution to the overall score.
[Screenshot: score breakdown by column]
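The aggregation steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not PromptLayer's actual implementation: the function names, the column names, and the use of `None` to mark a cell that has not completed are all assumptions made for the example.

```python
from statistics import mean

def column_score(values):
    """Mean of the completed cells in one column.

    Assumption: an incomplete cell is represented as None.
    Booleans are converted to 1.0 / 0.0, so a boolean column
    yields the ratio of true values (0.0 to 1.0).
    """
    completed = [v for v in values if v is not None]
    if not completed:
        return None  # nothing to score in this column
    return mean(float(v) for v in completed)

def simple_score(columns):
    """Final score: the mean of the per-column scores."""
    per_column = {name: column_score(vals) for name, vals in columns.items()}
    scored = [s for s in per_column.values() if s is not None]
    overall = mean(scored) if scored else None
    return overall, per_column

# Hypothetical pipeline with two boolean assertion columns.
columns = {
    "exact_match": [True, True, False, True],
    "contains_keyword": [True, False, None, True],  # one cell not completed
}

overall, per_column = simple_score(columns)
for name, score in per_column.items():
    print(f"{name}: {score:.1%}")   # boolean scores shown as percentages
print(f"overall: {overall:.1%}")
```

Here the incomplete cell is simply left out of its column's mean, which matches the "completed cell values" wording above; how PromptLayer treats failed or skipped cells in practice is not covered by this sketch.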

Matrix Scores

Matrix scores provide advanced scoring capabilities using custom code. This allows you to implement complex scoring logic, weighted averages, or custom business rules.

How Matrix Scoring Works:
  1. Custom Code Execution: You provide Python (or JavaScript) code that receives all evaluation data
  2. Data Access: Your code receives a data variable containing all row results
  3. Score Calculation: Your code must return:
    • A score key with a numeric value (required)
    • Optionally, a score_matrix for detailed scoring breakdowns. You can provide multiple matrices if needed
[Screenshot: matrix score]
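As a rough illustration of the contract described above, the following sketch computes a weighted average and returns it under a `score` key along with a `score_matrix`. Several details are assumptions rather than documented behavior: that `data` is a list of dictionaries keyed by column name, that the column names `accuracy` and `tone` exist, and that a `score_matrix` can be a list of rows. The logic is wrapped in a hypothetical `compute_score` function only so the example runs on its own; check the code editor in PromptLayer for the exact shape your code receives and must return.

```python
# Hypothetical column names and weights; replace with your own.
WEIGHTS = {"accuracy": 0.7, "tone": 0.3}

def compute_score(data):
    """Weighted average of per-column means.

    Assumption: `data` is a list of row dicts keyed by column name,
    with None for cells that produced no value.
    """
    column_means = {}
    for column in WEIGHTS:
        values = [float(row[column]) for row in data
                  if row.get(column) is not None]
        column_means[column] = sum(values) / len(values) if values else 0.0

    weighted = sum(WEIGHTS[c] * column_means[c] for c in WEIGHTS)

    # Assumed matrix layout: a header row followed by one row per column.
    score_matrix = [["Column", "Weight", "Mean"]]
    for c in WEIGHTS:
        score_matrix.append([c, WEIGHTS[c], round(column_means[c], 3)])

    return {
        "score": round(weighted * 100, 2),  # required numeric score
        "score_matrix": score_matrix,       # optional breakdown
    }

# Sample rows standing in for the `data` variable.
sample = [
    {"accuracy": True, "tone": 0.8},
    {"accuracy": False, "tone": 0.6},
    {"accuracy": True, "tone": None},
    {"accuracy": True, "tone": 1.0},
]
result = compute_score(sample)
print(result["score"])
```

Weighting lets a column that matters more to you (here, `accuracy`) dominate the final number, which the simple mean across columns cannot express.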

Updating a Score on a Report

By default, a report inherits its score configuration from the pipeline. However, you can update the score calculation on an existing report by editing the report settings. Changing the scoring configuration automatically recalculates the score.