Context Recall — measures whether the retrieved context contains the
information needed to produce the ground truth answer.
Score 1.0 = the context contains all information needed for the ground truth.
Score 0.0 = the context is missing the key information from the ground truth.
Requires groundTruth in the sample. Samples without groundTruth are
automatically skipped (skipped: true) and excluded from the aggregate —
they do not contribute a score of 0. A warning is printed to stderr when
contextRecall is in your metrics but no samples have groundTruth.
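The skipping and aggregation behavior described above can be sketched as follows. This is a minimal illustration, not the tool's actual implementation: the shapes and function names (`Sample`, `scoreSamples`, `aggregate`) are assumptions; only `groundTruth` and `skipped: true` come from the documentation.

```typescript
// Hypothetical shapes -- only groundTruth and skipped are documented fields.
interface Sample { question: string; contexts: string[]; groundTruth?: string; }
interface Result { score?: number; skipped: boolean; }

function scoreSamples(samples: Sample[], judge: (s: Sample) => number): Result[] {
  return samples.map((s) => {
    if (s.groundTruth === undefined) {
      // No ground truth: mark the sample skipped; it contributes nothing below.
      return { skipped: true };
    }
    return { score: judge(s), skipped: false };
  });
}

function aggregate(results: Result[]): number | undefined {
  // Skipped samples are excluded entirely -- they are not counted as 0.
  const scored = results.filter((r) => !r.skipped);
  if (scored.length === 0) {
    console.error("warning: contextRecall requested but no samples have groundTruth");
    return undefined;
  }
  return scored.reduce((sum, r) => sum + (r.score ?? 0), 0) / scored.length;
}
```

Note that a batch where half the samples lack `groundTruth` still averages only over the scored half, so the aggregate is not dragged down by missing labels.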
When to use: Use contextRecall to verify that your retriever is actually
returning the documents needed to construct the correct answer. A low recall
score means your retriever is missing key source material — the LLM cannot
generate a correct answer even if it tries.
Difference from contextRelevance/contextPrecision: Those ask "is the
retrieved content relevant to the question?" — recall asks a harder, more
specific question: "does the retrieved content contain what is needed for
the specific correct answer?" It requires knowing the expected answer.
Score interpretation (5-point scale):
1.0: Context contains all key facts needed for the ground truth — complete recall
0.75: Context contains most needed information; one minor fact may be missing
0.5: Context contains some needed facts but is missing important components
0.25: Context contains only minor signals relevant to the ground truth
0.0: Context does not contain the information needed to derive the ground truth
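One way to arrive at this 5-point scale, following the common RAGAS-style recipe, is to have the judge decide per ground-truth fact whether the retrieved context supports it, then snap the supported fraction to the nearest quarter. This is a sketch under that assumption; the actual judge prompt and statement splitting are not specified here.

```typescript
// Derive a 5-point recall score from per-fact judge verdicts.
// `supported` = facts in the ground truth the judge found in the context,
// `total` = all facts extracted from the ground truth (both are assumptions
// about how the judge's output is structured).
function recallScore(supported: number, total: number): number {
  if (total === 0) return 0;
  const ratio = supported / total;
  // Snap to the documented scale: 0.0, 0.25, 0.5, 0.75, 1.0.
  return Math.round(ratio * 4) / 4;
}
```

For example, a ground truth with four key facts of which three appear in the context would score 0.75, matching the "one minor fact may be missing" rung above.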
Uses the LLM-as-judge pattern — see arXiv:2306.05685 (the MT-Bench LLM-as-a-judge paper; the metric design follows RAGAS, arXiv:2309.15217).