rageval - v0.1.1

    Variable faithfulness (Const)

    faithfulness: Metric = ...

    Faithfulness — measures whether the generated answer is factually grounded in the provided context (hallucination detection).

    Score 1.0 = every claim in the answer is directly supported by the retrieved context. Score 0.0 = the answer is largely fabricated — major hallucinations not in the context.

    When to use: Run faithfulness on every evaluation. It is the single most critical metric for production RAG pipelines: a high score indicates that your LLM is staying within the boundaries of the retrieved evidence rather than inventing facts.

    Common-knowledge exemption: Claims any reasonable person already knows (e.g. "Water boils at 100°C") do not need to appear in the context and are not penalised. The metric focuses on domain-specific factual claims.

    Score interpretation (5-point scale):

    • 1.0: All claims are explicitly supported — excellent, no hallucination
    • 0.75: One or two minor claims slightly exceed the context, but overall grounded
    • 0.5: Some claims supported; others clearly go beyond what the context states
    • 0.25: Most claims are unsupported or contradicted by the context
    • 0.0: Answer is substantially fabricated — severe hallucination
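To make the scale above concrete, here is a minimal sketch that maps a score to its nearest band. `interpretFaithfulness` and `FAITHFULNESS_BANDS` are hypothetical names used for illustration only; they are not part of rageval, and the sketch assumes only that the metric yields a number in [0, 1].

```typescript
// Hypothetical helper, NOT part of rageval: maps a faithfulness score to the
// nearest anchor of the 5-point scale described above.
const FAITHFULNESS_BANDS: ReadonlyArray<{ score: number; meaning: string }> = [
  { score: 1.0, meaning: "All claims explicitly supported" },
  { score: 0.75, meaning: "Minor claims slightly exceed the context" },
  { score: 0.5, meaning: "Some claims supported, others go beyond the context" },
  { score: 0.25, meaning: "Most claims unsupported or contradicted" },
  { score: 0.0, meaning: "Substantially fabricated" },
];

function interpretFaithfulness(score: number): { score: number; meaning: string } {
  // Reject NaN, infinities, and anything outside the documented range.
  if (!isFinite(score) || score < 0 || score > 1) {
    throw new RangeError(`faithfulness score must be in [0, 1], got ${score}`);
  }
  // Pick the band whose anchor value is closest to the score.
  // On an exact tie, the higher band wins because it appears first.
  return FAITHFULNESS_BANDS.reduce((best, band) =>
    Math.abs(band.score - score) < Math.abs(best.score - score) ? band : best
  );
}
```

A helper like this is mainly useful in reports or CI logs, where a label is easier to scan than a raw number.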

    Uses the LLM-as-judge pattern; see arXiv:2306.05685 (Zheng et al., "Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena").
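As a rough illustration of the LLM-as-judge pattern, the sketch below shows the two deterministic halves of a judge call: building the prompt and parsing the reply. It is not rageval's implementation. The prompt wording and the names `buildFaithfulnessPrompt`, `parseJudgeScore`, and `ALLOWED_SCORES` are assumptions made for this example. The model call itself is omitted because it depends on the provider.

```typescript
// Illustrative sketch only, NOT rageval's internals.
// Assumption: the judge model is asked to reply with one of the five scale values.
const ALLOWED_SCORES: ReadonlyArray<number> = [0, 0.25, 0.5, 0.75, 1];

// Builds a judge prompt from the generated answer and the retrieved contexts.
function buildFaithfulnessPrompt(answer: string, contexts: string[]): string {
  const numbered = contexts.map((c, i) => `[${i + 1}] ${c}`).join("\n");
  return [
    "Rate how well the ANSWER is supported by the CONTEXT.",
    "Common knowledge does not need to appear in the context.",
    `Reply with exactly one of: ${ALLOWED_SCORES.join(", ")}.`,
    "",
    "CONTEXT:",
    numbered,
    "",
    "ANSWER:",
    answer,
  ].join("\n");
}

// Extracts the first number from the judge's reply and checks that it is
// one of the allowed scale values.
function parseJudgeScore(reply: string): number {
  const match = reply.match(/\d+(?:\.\d+)?/);
  if (match === null) {
    throw new Error(`no score found in judge reply: ${reply}`);
  }
  const value = Number(match[0]);
  if (ALLOWED_SCORES.indexOf(value) === -1) {
    throw new Error(`score ${value} is not on the 5-point scale`);
  }
  return value;
}
```

Validating the reply against the fixed set of values keeps a malformed or off-scale judge response from silently entering the results.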