The evaluation result to export.
Aggregate scores averaged across all samples.
Per-sample detailed results.
Optional stats?: Record of per-metric score distribution statistics (min, max, stddev, count).
Keys are metric names (same as keys in scores, minus overall).
Useful for understanding score variance and identifying which questions
score poorly. overall is excluded — compute it from individual metric stats.
Metadata about the evaluation run.
Total number of samples evaluated.
Names of the metrics that were evaluated.
LLM provider used (e.g. 'anthropic', 'openai').
LLM model used (e.g. 'claude-opus-4-6').
ISO 8601 timestamp when evaluation started.
ISO 8601 timestamp when evaluation completed.
Wall-clock duration of the evaluation in milliseconds.
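As an illustration of the optional stats field described above, here is a minimal sketch of how min, max, stddev, and count could be derived from one metric's per-sample scores. The `MetricStats` interface and `computeStats` helper are hypothetical names, not part of rageval, and the population form of the standard deviation (dividing by N) is an assumption; the library may compute it differently.

```typescript
// Hypothetical shape for illustration; rageval's actual type may differ.
interface MetricStats {
  min: number
  max: number
  stddev: number
  count: number
}

// Compute distribution statistics for one metric's scores.
// Assumption: population standard deviation (divide by N, not N - 1).
function computeStats(values: number[]): MetricStats {
  const count = values.length
  if (count === 0) {
    // No samples: nothing meaningful to report.
    return { min: NaN, max: NaN, stddev: NaN, count: 0 }
  }
  const mean = values.reduce((sum, v) => sum + v, 0) / count
  const variance = values.reduce((acc, v) => acc + (v - mean) ** 2, 0) / count
  return {
    min: Math.min(...values),
    max: Math.max(...values),
    stddev: Math.sqrt(variance),
    count,
  }
}

// Example: a low min next to a high max points at individual weak samples.
const faithfulnessStats = computeStats([0.95, 0.4, 0.9, 0.85])
console.log(faithfulnessStats.min, faithfulnessStats.max, faithfulnessStats.count)
// prints: 0.4 0.95 4
```

A large stddev or a min far below the mean suggests inspecting the per-sample results to find the questions that score poorly.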
CSV string with header row. Returns empty string if dataset is empty.
import { evaluate, toCsv } from 'rageval'
import { writeFileSync } from 'node:fs'
const result = await evaluate({ ... })
writeFileSync('eval-scores.csv', toCsv(result))
// Columns: id,question,faithfulness,answerRelevance,overall
// Example row: q1,What is...,0.9500,0.8750,0.9125
// With reasoning (for audit logs):
const resultWithReasoning = await evaluate({ ..., includeReasoning: true })
writeFileSync('eval-audit.csv', toCsv(resultWithReasoning))
// Columns: id,question,faithfulness,answerRelevance,overall,faithfulness_reasoning,answerRelevance_reasoning
Serializes an EvaluationResult to a CSV string.
Each row represents one sample. Columns:
id -- sample identifier (empty string if not set)
question -- the question text
metric columns (e.g. faithfulness, answerRelevance) -- one column per evaluated metric
overall -- per-sample mean of all metric scores
{metric}_reasoning columns -- included automatically when includeReasoning: true was passed to evaluate() and reasoning text is present. Useful for audit logs in healthcare, legal, or compliance contexts.

Scores are formatted to 4 decimal places. The CSV follows RFC 4180 escaping -- fields containing commas, double quotes, or newlines are wrapped in double quotes with internal quotes doubled. Safe to open in Excel, Google Sheets, or pandas.
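The formatting and escaping rules above can be sketched as follows. This is an illustrative approximation, not rageval's actual implementation: `escapeCsvField`, `formatScore`, and `buildRow` are hypothetical helpers written only to show the RFC 4180 quoting rule, the 4-decimal score format, and the per-sample mean used for overall.

```typescript
// Illustrative helpers; names and structure are assumptions, not rageval internals.

// Quote a field per RFC 4180 when it contains a comma, double quote, CR, or LF.
// Internal double quotes are doubled.
function escapeCsvField(value: string): string {
  if (/[",\r\n]/.test(value)) {
    return `"${value.replace(/"/g, '""')}"`
  }
  return value
}

// Format a score to 4 decimal places.
function formatScore(score: number): string {
  return score.toFixed(4)
}

// Build one CSV row: id, question, one column per metric score, then overall
// (the mean of the metric scores for this sample).
function buildRow(id: string | undefined, question: string, scores: number[]): string {
  const overall = scores.reduce((sum, s) => sum + s, 0) / scores.length
  return [
    escapeCsvField(id ?? ''), // empty string when no id is set
    escapeCsvField(question),
    ...scores.map(formatScore),
    formatScore(overall),
  ].join(',')
}

console.log(buildRow('q1', 'What is "RAG", exactly?', [0.95, 0.875]))
// prints: q1,"What is ""RAG"", exactly?",0.9500,0.8750,0.9125
```

Because the question contains both a comma and double quotes, it is wrapped in quotes with the inner quotes doubled, which is what lets spreadsheet tools and pandas parse the row back into five fields.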