🤖

Chatbot Evaluation Platform

AI-Powered Quality Analysis

✓ Live
1. Questions
2. Rubrics
3. Review & Grading
4. Bot URL
5. Tools Config
6. Review & Run

Step 1: Add Test Questions

Enter questions to test your chatbot

The CSV can contain questions only, or questions and answers; the questions are picked up automatically.
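As a rough illustration of how questions might be picked out of either CSV layout, here is a minimal sketch. The function name `extract_questions`, the accepted header names, and the fallback to the first column are all assumptions made for this example; the platform's actual detection rule is not documented here.

```python
import csv
import io

# Hypothetical header names that mark the question column (an assumption).
QUESTION_HEADERS = {"question", "questions", "q"}


def extract_questions(csv_text: str) -> list[str]:
    """Return the non-empty question strings found in CSV text."""
    # Skip completely blank lines, which csv.reader yields as empty lists.
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if not rows:
        return []
    header = [cell.strip().lower() for cell in rows[0]]
    for index, name in enumerate(header):
        if name in QUESTION_HEADERS:
            # A header row names the question column: skip that row.
            column, body = index, rows[1:]
            break
    else:
        # No recognizable header: assume questions sit in the first column.
        column, body = 0, rows
    return [
        row[column].strip()
        for row in body
        if len(row) > column and row[column].strip()
    ]
```

With this sketch, a file with `question,answer` columns and a file with one question per line both yield the same list of questions; any answer column is ignored at this step.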

Step 2: Select Evaluation Rubrics

Choose how to evaluate responses

📋 Core Quality

🎯 Quality & Style

🔒 Safety & Compliance

💬 Conversation Flow

0 rubrics selected
Need something specific?

Are the rubrics above not sufficient? Define your own criteria below.
💡 Tip: You can combine predefined rubrics with custom ones

No custom rubrics yet.

Step 3: Review & Customize Rubrics

Fine-tune your evaluation criteria and select grading strictness

Step 4: Chatbot URL

Enter your chatbot's URL

Step 5: Configure Tools & Settings

Select enabled tools and preferences

Step 6: Review & Run Evaluation

Review your configuration before running

Questions

Rubrics

Bot URL

Tools

πŸ” Preview Evaluation Prompt

See exactly what criteria and examples will be sent to the AI judge

📊 No Results Yet

Run an evaluation to see results here

🔬 Hallucination Inspector

Evaluate your chatbot for hallucinations by comparing its answers against the expected answers

This will auto-fill if you entered a URL in the Eval Workshop
Strict: Minor = -15 pts, Major = -35 pts, Critical = -65 pts
Balanced: Minor = -10 pts, Major = -25 pts, Critical = -50 pts
Lenient: Minor = -5 pts, Major = -15 pts, Critical = -30 pts
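To make the three strictness levels concrete, the sketch below applies the listed deductions to a set of detected issues. The penalty values come from the lines above; the 100-point starting score, the floor at 0, and the name `score_answer` are assumptions for illustration, not the platform's documented behavior.

```python
# Per-issue deductions copied from the strictness descriptions above.
PENALTIES = {
    "strict": {"minor": 15, "major": 35, "critical": 65},
    "balanced": {"minor": 10, "major": 25, "critical": 50},
    "lenient": {"minor": 5, "major": 15, "critical": 30},
}


def score_answer(issues: list[str], strictness: str = "balanced",
                 start: int = 100) -> int:
    """Subtract each issue's penalty from `start`, never going below 0.

    The starting score of 100 and the floor at 0 are assumptions.
    """
    table = PENALTIES[strictness]
    total_deduction = sum(table[severity] for severity in issues)
    return max(0, start - total_deduction)
```

Under these assumptions, an answer with one minor and one major issue would score 50 on Strict (100 - 15 - 35) and 80 on Lenient (100 - 5 - 15), which shows how much the strictness choice alone can move a result.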
Question | Expected Answer
No Q/A pairs yet. Upload CSV or add manually.

⚡ GEPA Optimizer

Optimize prompts for smaller models using GEPA

These questions will be used to generate a dataset for optimization (minimum 3 recommended)
Select the small/efficient model you want to optimize the prompt for
Maximum number of optimization iterations (default: 5)
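The three optimizer fields above could be captured in a small settings object like the one below. This is only a sketch: the class name `OptimizerSettings`, its field names, and the check that iterations be at least 1 are hypothetical, and it does not call any real GEPA API. Only the recommended minimum of 3 questions and the default of 5 iterations come from the form text.

```python
from dataclasses import dataclass


@dataclass
class OptimizerSettings:
    """Hypothetical container for the optimizer form fields."""

    questions: list[str]
    target_model: str          # the small/efficient model to optimize for
    max_iterations: int = 5    # default stated in the form

    def warnings(self) -> list[str]:
        """Return human-readable notes about settings that look weak."""
        notes = []
        if len(self.questions) < 3:
            notes.append("fewer than 3 questions; at least 3 are recommended")
        if self.max_iterations < 1:
            # Assumed rule: zero iterations would mean no optimization runs.
            notes.append("max_iterations should be at least 1")
        return notes
```

Returning warnings rather than raising errors mirrors the form's wording, which recommends a minimum of 3 questions but does not say it is enforced.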