To read about PromptLayer’s view on Evaluations, see the Why PromptLayer? Evaluations page.

Introduction

Evaluations are a core feature of PromptLayer, designed to let teams test their prompt templates and agents at scale. An evaluation is a repeatable pipeline of steps (columns) that you run against a series of data (rows), so you can systematically assess how your prompts or agents perform across a variety of scenarios. You can create evaluations directly through the PromptLayer UI, which lets both technical and non-technical team members collaborate on prompt testing and refinement, and you can define a score to track progress as you iterate. There are a few core concepts (illustrated in the sketch after this list):
  • A dataset: all evaluations start with a dataset
  • An eval pipeline: the definition of the evaluation steps and an optional score, built against a sample of 4 rows from the dataset
  • An evaluation report: the results of running the eval pipeline on the entire dataset
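
To make the rows-and-columns model concrete, here is a minimal conceptual sketch in plain Python. It is not the PromptLayer SDK or API: the dataset, the `run_prompt` and `exact_match` steps, and the scoring logic are hypothetical stand-ins for what the UI builds for you when you define a pipeline and run it over a dataset.

```python
# Conceptual sketch only -- not the PromptLayer SDK. It illustrates how an
# eval pipeline (columns) runs over a dataset (rows) to produce a report.
from typing import Callable

# A dataset is a collection of rows; each row holds the inputs for one test case.
dataset = [
    {"question": "What is the capital of France?", "expected": "Paris"},
    {"question": "What is 2 + 2?", "expected": "4"},
]

# Each pipeline step ("column") reads the row (including earlier column outputs)
# and produces a new value. Both functions below are hypothetical stand-ins.
def run_prompt(row: dict) -> str:
    return f"Answer to: {row['question']}"  # stand-in for an LLM call

def exact_match(row: dict) -> bool:
    return row["response"].strip() == row["expected"].strip()

pipeline: list[tuple[str, Callable[[dict], object]]] = [
    ("response", run_prompt),  # column 1: generate a response
    ("correct", exact_match),  # column 2: score it against the expected answer
]

# The evaluation report is the full grid of rows x columns, plus an overall score.
report = []
for row in dataset:
    row = dict(row)
    for column_name, step in pipeline:
        row[column_name] = step(row)
    report.append(row)

score = sum(r["correct"] for r in report) / len(report)
print(f"Overall score: {score:.0%}")
```

In PromptLayer, the pipeline is defined and previewed on a small sample of rows, and the evaluation report is produced by running that same pipeline across the entire dataset.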