MaimedUbermensch
2024-10-08 20:10:54
Introducing ScienceAgentBench: A new benchmark to rigorously evaluate language agents on 102 tasks from 44 peer-reviewed publications across 4 scientific disciplines