rageval - v0.1.1

    Function toMarkdown

    • Serializes an EvaluationResult to a Markdown string.

      Produces a GitHub-compatible Markdown report with score tables, a per-sample breakdown, and a metric legend. Ideal for:

      • Posting evaluation results as a GitHub PR comment (see the sketch after this list)
      • Including in documentation or wikis
      • Committing evaluation snapshots to a repo
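
      For the PR-comment use case, here is a minimal sketch using the Octokit REST client (Octokit is not part of rageval, and the owner, repo, and PR number below are placeholders):

      import { Octokit } from '@octokit/rest'
      import { evaluate, toMarkdown } from 'rageval'

      const result = await evaluate({ ... })

      // PR comments go through the issues API; issue_number is the PR number.
      const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN })
      await octokit.rest.issues.createComment({
        owner: 'my-org', // placeholder
        repo: 'my-repo', // placeholder
        issue_number: 123, // placeholder
        body: toMarkdown(result),
      })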

      Parameters

      • result: {
            scores: {
                faithfulness?: number;
                contextRelevance?: number;
                answerRelevance?: number;
                contextRecall?: number;
                contextPrecision?: number;
                overall: number;
                [key: string]: unknown;
            };
            samples: {
                id?: string;
                question: string;
                scores: Record<string, number>;
                reasoning?: Record<string, string>;
                tenantId?: string;
                metadata?: Record<string, unknown>;
            }[];
            stats?: Record<
                string,
                { mean: number; min: number; max: number; stddev: number; count: number },
            >;
            meta: {
                totalSamples: number;
                metrics: string[];
                provider: string;
                model: string;
                startedAt: string;
                completedAt: string;
                durationMs: number;
            };
        }

        The evaluation result from evaluate().

        • scores: {
              faithfulness?: number;
              contextRelevance?: number;
              answerRelevance?: number;
              contextRecall?: number;
              contextPrecision?: number;
              overall: number;
              [key: string]: unknown;
          }

          Aggregate scores averaged across all samples.
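
          For example, a common pattern is to gate CI on the aggregate score (the 0.8 threshold here is an arbitrary illustration, not a rageval default):

          const result = await evaluate({ ... })
          if (result.scores.overall < 0.8) {
            console.error(`Overall score ${result.scores.overall} is below threshold`)
            process.exit(1)
          }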

        • samples: {
              id?: string;
              question: string;
              scores: Record<string, number>;
              reasoning?: Record<string, string>;
              tenantId?: string;
              metadata?: Record<string, unknown>;
          }[]

          Per-sample detailed results.
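
          A sketch of drilling into the per-sample breakdown, e.g. to surface questions whose faithfulness score falls below a cutoff (the 0.5 cutoff is illustrative):

          const weak = result.samples.filter((s) => (s.scores.faithfulness ?? 1) < 0.5)
          for (const s of weak) {
            console.log(`${s.id ?? '?'}: ${s.question}`)
            console.log(s.reasoning?.faithfulness)
          }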

        • Optional stats?: Record<
              string,
              { mean: number; min: number; max: number; stddev: number; count: number },
          >

          Per-metric score distribution statistics (mean, min, max, stddev, and count).

          Keys are metric names (the same keys as in scores, minus overall). Useful for understanding score variance; pair with samples to identify which questions score poorly. overall is excluded; compute it from the individual metric stats if needed.

          const { stats } = await evaluate({ ... })
          // High stddev indicates inconsistent pipeline behaviour:
          if ((stats?.faithfulness?.stddev ?? 0) > 0.15) {
            console.warn('Faithfulness varies widely across samples; review your retrieval.')
          }

        • meta: {
              totalSamples: number;
              metrics: string[];
              provider: string;
              model: string;
              startedAt: string;
              completedAt: string;
              durationMs: number;
          }

          Metadata about the evaluation run.

          • totalSamples: number

            Total number of samples evaluated.

          • metrics: string[]

            Names of the metrics that were evaluated.

          • provider: string

            LLM provider used (e.g. 'anthropic', 'openai').

          • model: string

            LLM model used (e.g. 'claude-opus-4-6').

          • startedAt: string

            ISO 8601 timestamp when evaluation started.

          • completedAt: string

            ISO 8601 timestamp when evaluation completed.

          • durationMs: number

            Wall-clock duration of the evaluation in milliseconds.
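
          As a quick illustration, meta can be used to log run throughput:

          const { meta } = result
          const perSampleMs = meta.durationMs / meta.totalSamples
          console.log(`${meta.provider}/${meta.model}: ${meta.totalSamples} samples in ${meta.durationMs} ms (~${perSampleMs.toFixed(0)} ms/sample)`)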

      • title: string = 'rageval Evaluation Report'

        Optional report title; defaults to 'rageval Evaluation Report'.
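
        Passing a custom title (the value shown is illustrative):

        const md = toMarkdown(result, 'Nightly RAG Evaluation')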

      Returns string

      The rendered Markdown report as a string.

      import { evaluate, toMarkdown } from 'rageval'
      import { writeFileSync } from 'node:fs'

      const result = await evaluate({ ... })
      writeFileSync('eval-report.md', toMarkdown(result))