Getting Started

This section covers everything you need to go from zero to a working evaluation: how to install Karenina, hands-on quickstarts for all three evaluation modes, and how to set up your workspace.


In This Section

Page | What You'll Learn
Installation | Requirements, install commands, optional dependencies
Quick Start: Q/A Benchmark | Hands-on walkthrough from zero to a working benchmark
Quick Start: Scenarios | Build a multi-turn evaluation with branching and outcome criteria
Quick Start: TaskEval | Evaluate pre-recorded outputs (agent traces, external text)
Workspace Init | Set up a project directory with karenina init

If you're new to Karenina, read these pages in order:

  1. Installation: Install Karenina and set up API keys
  2. Quick Start: Q/A Benchmark: Run your first single-turn benchmark end-to-end
  3. Workspace Init: Set up a project directory with karenina init

If your goal is multi-turn evaluation (sycophancy testing, error correction, progressive disclosure), start with Quick Start: Scenarios after installation.

If your goal is evaluating existing outputs (agent traces, external text) rather than creating benchmarks, start with Quick Start: TaskEval after installation.

Once you're comfortable, move on to Core Concepts for a deeper understanding of checkpoints, templates, rubrics, and adapters.