Minimum acceptable score (0.0–1.0) per metric.
If any aggregate score falls below its threshold, evaluate() throws a ThresholdError. Use this to enforce quality gates in CI pipelines.
evaluate()
await evaluate({ provider: { type: 'anthropic', client }, dataset, thresholds: { faithfulness: 0.8, answerRelevance: 0.75 },}) Copy
await evaluate({ provider: { type: 'anthropic', client }, dataset, thresholds: { faithfulness: 0.8, answerRelevance: 0.75 },})
Minimum acceptable score (0.0–1.0) per metric.
If any aggregate score falls below its threshold,
evaluate()throws a ThresholdError. Use this to enforce quality gates in CI pipelines.