Study identifies weaknesses in how AI systems are evaluated

413 points | by pseudolus 4 days ago

203 comments