Here’s the announcement: https://www.kaggle.com/blog/announcing-kaggle-benchmarks
I was the founder and CEO of Kaggle. I’ve been out of Kaggle for 2.5 years. Super excited to see this announcement. It could solve the biggest problem in the LLM ecosystem.
Curious how it compares to Chatbot Arena?
We love what Chatbot Arena is doing to innovate on evaluation paradigms. The challenge of evaluating GenAI warrants diverse approaches. What we're excited to do is: 1) give anyone access to infra to make evaluation more accessible to more developers and researchers; 2) drive more novel, diverse evals. https://arxiv.org/abs/2505.00612v2
Can we add our own models or benchmarks?