Evals

Notes on evals
| Title | Description |
| --- | --- |
| Frequently Asked Questions (And Answers) About AI Evals | FAQ from our course on AI Evals. |
| Intro To Error Analysis With Just Spreadsheets | In this lesson, Shreya Shankar and Hamel Husain walk through the process of error analysis from… |
| Your AI Product Needs Evals | How to construct domain-specific LLM evaluation systems. |
| Creating a LLM-as-a-Judge That Drives Business Results | A step-by-step guide with my learnings from 30+ AI implementations. |
| A Field Guide to Rapidly Improving AI Products | Evaluation methods, data-driven improvement, and experimentation techniques from 30+ production… |
| How to Construct Domain Specific LLM Evaluation Systems | AI Engineering World’s Fair talk on building evaluation systems for LLMs. |
| How Engineers and PMs should collaborate on Evals | How to align AI evaluations with business metrics, communicate value to stakeholders, and build a… |
| Inspect AI, An OSS Python Library For LLM Evals | A look at Inspect AI with its creator, JJ Allaire. |
| Evals Flashcards | I created these flashcards to help students learn about evals in our AI Evals course. |
| Evals Memes | Evals can be a dry subject: data pipelines, metrics, LLM-as-a-judge calibration. But that doesn’t… |