MLflow Evaluation

Creator

Seonglae Cho

Created

2024 Mar 25 3:55

Editor

Seonglae Cho

Edited

2024 Mar 25 3:55

Refs

Question answering AI

LLM Evaluation with MLflow Example Notebook — MLflow 2.11.3 documentation

This notebook demonstrates how to evaluate various LLMs and RAG systems with MLflow, using simple metrics such as toxicity, LLM-judged metrics such as relevance, and custom LLM-judged metrics such as professionalism.

https://mlflow.org/docs/latest/llms/llm-evaluate/notebooks/question-answering-evaluation.html
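As a rough illustration of the workflow the notebook describes, here is a minimal sketch that scores precomputed answers with `mlflow.evaluate` on a static dataset. The rows and the helper name `run_static_evaluation` are hypothetical and not taken from the notebook. It assumes MLflow 2.x with pandas installed, and that the `question-answering` model type includes toxicity among its default metrics, which likely needs extra packages such as `evaluate`, `transformers`, and `torch`. LLM-judged metrics such as relevance or a custom professionalism metric also need a judge model and API credentials, so they are left out here.

```python
# Toy evaluation rows: illustrative only, not taken from the MLflow notebook.
EVAL_ROWS = [
    {
        "inputs": "What is MLflow?",
        "ground_truth": "MLflow is an open-source platform for managing the ML lifecycle.",
        "predictions": "MLflow is an open-source platform for the machine learning lifecycle.",
    },
    {
        "inputs": "What is Spark?",
        "ground_truth": "Apache Spark is an open-source distributed computing engine.",
        "predictions": "Spark is a distributed engine for large-scale data processing.",
    },
]


def run_static_evaluation(rows=EVAL_ROWS):
    """Hypothetical helper: evaluate precomputed predictions with MLflow.

    Assumes MLflow 2.x and pandas are installed. Imports are kept local so
    the toy data above can be inspected without those packages.
    """
    import mlflow
    import pandas as pd

    eval_df = pd.DataFrame(rows)

    with mlflow.start_run():
        results = mlflow.evaluate(
            data=eval_df,
            targets="ground_truth",           # column holding reference answers
            predictions="predictions",        # column holding model outputs
            model_type="question-answering",  # selects the default QA metrics
        )

    # results.metrics is a dict of aggregated metric values.
    return results.metrics


if __name__ == "__main__":
    print(run_static_evaluation())
```

Evaluating a static table of predictions keeps the sketch independent of any particular model or serving endpoint. The notebook itself evaluates live models, so treat this as a simplified variant and not as its method.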
