Kaggle Launches LLM Evals

9 points | by antgoldbloom 13 hours ago

5 comments

antgoldbloom 13 hours ago
Here’s the announcement https://www.kaggle.com/blog/announcing-kaggle-benchmarks
I was founder and CEO of Kaggle. I’ve been out of Kaggle for 2.5 years. Super excited to see this announcement. It could solve the biggest problem in the LLM ecosystem.
art82135 13 hours ago
Curious how this compares to Chatbot Arena?
- meganrisdal 13 hours ago
  We love what Chatbot Arena is doing to innovate on evaluation paradigms. The challenge of evaluating GenAI warrants diverse approaches. What we're excited to do is: 1) give anyone access to infra to make evaluation more accessible to more developers and researchers; 2) drive more novel, diverse evals. https://arxiv.org/abs/2505.00612v2
benhamner 13 hours ago
Can we add our own models or benchmarks?
jaimiehwang88 13 hours ago
[dead]