©RillNews
SWE-bench Verified no longer measures frontier coding capabilities
openai.com
207 points by kmdupree 8 hours ago | 122 comments