rageval - v0.1.1

    Interface MetricOutput

    Output from a single metric evaluation on one sample.

    Returned by every metric's score() method and collected by evaluate() into the final EvaluationResult.

    interface MetricOutput {
        score: number;
        reasoning?: string;
        skipped?: boolean;
    }

    Properties

    score: number

    The metric score for this sample, in the range [0.0, 1.0]. Higher is always better. Scores are clamped to [0, 1] even if the LLM returns values outside that range.
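The clamping rule described above can be sketched as follows. This is a hypothetical helper for illustration, not the rageval implementation:

```typescript
// Hypothetical helper illustrating the documented clamping behavior:
// any raw value from the LLM judge is forced into [0, 1].
function clampScore(raw: number): number {
    return Math.min(1, Math.max(0, raw));
}
```

For example, a judge that returns 1.4 yields a stored score of 1, and a negative value yields 0.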

    reasoning?: string

    The LLM judge's explanation of why it assigned this score. Only populated when includeReasoning: true is passed to evaluate(). Useful for debugging unexpectedly low or high scores.
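A minimal debugging sketch, assuming you have collected an array of MetricOutput values from an evaluation run with includeReasoning: true (the interface is mirrored locally here to keep the snippet self-contained; lowScoreReasons is a hypothetical helper, not part of rageval):

```typescript
// Local mirror of the MetricOutput interface for a self-contained sketch.
interface MetricOutput {
    score: number;
    reasoning?: string;
    skipped?: boolean;
}

// Hypothetical helper: collect the judge's explanations for samples that
// scored below a threshold, to inspect unexpectedly low scores.
function lowScoreReasons(outputs: MetricOutput[], threshold = 0.5): string[] {
    return outputs
        .filter((o) => !o.skipped && o.score < threshold && o.reasoning !== undefined)
        .map((o) => o.reasoning as string);
}
```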

    skipped?: boolean

    When true, this metric could not be computed for this sample and should be excluded from all aggregates. The most common cause is contextRecall being evaluated on a sample without a groundTruth field.


    evaluate() detects skipped: true and omits the score from both the per-sample scores and the per-metric aggregate; it is never counted as a 0. This prevents silent score distortion.

    The score field is still set to 0 for backward compatibility with code that reads raw MetricOutput without checking skipped.
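The aggregation rule above can be sketched as follows, using a hypothetical aggregate helper (not the rageval implementation) over a locally mirrored MetricOutput:

```typescript
// Local mirror of the MetricOutput interface for a self-contained sketch.
interface MetricOutput {
    score: number;
    reasoning?: string;
    skipped?: boolean;
}

// Hypothetical aggregation mirroring the documented rule: skipped outputs
// are excluded entirely, never counted as 0. Returns undefined when every
// output was skipped, since there is nothing to average.
function aggregate(outputs: MetricOutput[]): number | undefined {
    const scored = outputs.filter((o) => !o.skipped);
    if (scored.length === 0) return undefined;
    return scored.reduce((sum, o) => sum + o.score, 0) / scored.length;
}
```

With outputs of 0.8, 0.6, and a skipped sample (score 0), the aggregate is 0.7 rather than the distorted 0.4667 that counting the skip as 0 would produce.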