# Getting Started
This section covers everything you need to go from zero to a working evaluation: how to install Karenina, hands-on quickstarts for all three evaluation modes, and how to set up your workspace.
## In This Section
| Page | What You'll Learn |
|---|---|
| Installation | Requirements, install commands, optional dependencies |
| Quick Start: Q/A Benchmark | Hands-on walkthrough from zero to a working benchmark |
| Quick Start: Scenarios | Build a multi-turn evaluation with branching and outcome criteria |
| Quick Start: TaskEval | Evaluate pre-recorded outputs (agent traces, external text) |
| Workspace Init | Set up a project directory with `karenina init` |
## Recommended Reading Order
If you're new to Karenina, read these pages in order:
- Installation: Install Karenina and set up API keys
- Quick Start: Q/A Benchmark: Run your first single-turn benchmark end-to-end
- Workspace Init: Set up a project directory with `karenina init`
If your goal is multi-turn evaluation (sycophancy testing, error correction, progressive disclosure), start with Quick Start: Scenarios after installation.
If your goal is evaluating existing outputs (agent traces, external text) rather than creating benchmarks, start with Quick Start: TaskEval after installation.
Once you're comfortable, move on to Core Concepts for a deeper understanding of checkpoints, templates, rubrics, and adapters.