Evals
Notes on evals
| Title | Description |
|---|---|
| Frequently Asked Questions (And Answers) About AI Evals | FAQ from our course on AI Evals. |
| Intro To Error Analysis With Just Spreadsheets | In this lesson, Shreya Shankar and Hamel Husain walk through the process of error analysis from… |
| Your AI Product Needs Evals | How to construct domain-specific LLM evaluation systems. |
| Creating a LLM-as-a-Judge That Drives Business Results | A step-by-step guide with my learnings from 30+ AI implementations. |
| A Field Guide to Rapidly Improving AI Products | Evaluation methods, data-driven improvement, and experimentation techniques from 30+ production… |
| How to Construct Domain Specific LLM Evaluation Systems | AI Engineering World’s Fair talk on building evaluation systems for LLMs. |
| How Engineers and PMs should collaborate on Evals | How to align AI evaluations with business metrics, communicate value to stakeholders, and build a… |
| Inspect AI, An OSS Python Library For LLM Evals | A look at Inspect AI with its creator, JJ Allaire. |
| Evals Flashcards | I created these flashcards to help students learn about evals in our AI Evals course. |
| Evals Memes | Evals can be a dry subject: data pipelines, metrics, LLM-as-a-judge calibration. But that doesn’t… |