TEMPLATE · PYTHON · MARKDOWN · 11 KB

RAG eval harness starter · Python + Markdown

The eval framework scaffold we drop into RAG projects in week one. Giskard + promptfoo + custom metrics.

Starter scaffold for RAG system evaluation: 5 metric classes (faithfulness, context precision, answer relevance, bias, injection resistance), CI integration, diff reporting. MIT-licensed.
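As a rough illustration of what one of these metric classes could look like, here is a minimal sketch of a context precision metric. It is not the template's actual code: the names `MetricResult`, `ContextPrecision`, and `score` are hypothetical, and it assumes the simplest definition (the share of retrieved chunks that appear in a hand-labeled set of relevant chunk IDs) rather than a rank-weighted or LLM-judged variant.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class MetricResult:
    """Hypothetical result container: a metric name and a score in [0.0, 1.0]."""
    name: str
    score: float


class ContextPrecision:
    """Share of retrieved chunks that are in a labeled set of relevant chunk IDs.

    Assumption: relevance labels are supplied by the caller, for example from a
    hand-annotated evaluation set.
    """

    name = "context_precision"

    def score(self, retrieved_ids: Sequence[str], relevant_ids: set[str]) -> MetricResult:
        # An empty retrieval contributes nothing useful, so score it as 0.0.
        if not retrieved_ids:
            return MetricResult(self.name, 0.0)
        hits = sum(1 for chunk_id in retrieved_ids if chunk_id in relevant_ids)
        return MetricResult(self.name, hits / len(retrieved_ids))
```

With four retrieved chunks of which two are labeled relevant, `ContextPrecision().score(["a", "b", "c", "d"], {"a", "c"})` yields a score of 0.5. The other four metrics could plausibly share the same `score(...) -> MetricResult` shape so that a runner can treat them uniformly.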

WHAT'S INSIDE

  • 5 metric classes with Python code
  • Giskard + promptfoo integration
  • Injection-resistance eval suite (80+ prompts)
  • CI step YAML (GitHub Actions + GitLab CI)
  • Diff-report generator per build
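To make the per-build diff report concrete, the sketch below compares metric scores from a baseline build against the current build and flags regressions. It is an assumption about how such a generator might work, not the one shipped in the template: the function name `diff_report`, the plain `dict` input format, and the `tolerance` parameter are all hypothetical.

```python
def diff_report(
    baseline: dict[str, float],
    current: dict[str, float],
    tolerance: float = 0.02,  # hypothetical threshold: drops larger than this are flagged
) -> list[str]:
    """Return one human-readable line per metric, comparing two builds."""
    lines = []
    for name in sorted(set(baseline) | set(current)):
        old, new = baseline.get(name), current.get(name)
        if old is None:
            lines.append(f"{name}: new metric, {new:.3f}")
        elif new is None:
            lines.append(f"{name}: missing in this build (was {old:.3f})")
        else:
            delta = new - old
            status = "REGRESSION" if delta < -tolerance else "ok"
            lines.append(f"{name}: {old:.3f} -> {new:.3f} ({delta:+.3f}) {status}")
    return lines


if __name__ == "__main__":
    previous_build = {"faithfulness": 0.90, "bias": 0.80}
    this_build = {"faithfulness": 0.85, "answer_relevance": 0.70}
    print("\n".join(diff_report(previous_build, this_build)))
```

A CI step could fail the build when any line ends in `REGRESSION`; whether to gate on every metric or only a subset is a project-level choice.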

Want a custom version?

A tailored audit or template delivered in 2 weeks · DField Solutions, Budapest.
